This Helm chart deploys the TensorFusion Remote/Local vGPU Benchmark application, which includes a Deployment for running the benchmark tests and a CronJob for automated, scheduled runs.
To run the TorchBenchmark tests:
cd benchmark
python3 test.py -k "test_${model_name}_eval_cuda" -t ${eval_times}
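For example, with hf_Bert as the model and an eval_times of 10 (both values are illustrative):

# Evaluate the hf_Bert model on CUDA, repeating the eval 10 times
python3 test.py -k "test_hf_Bert_eval_cuda" -t 10

Results (the Loss columns are relative to the Native run):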
Model | Native | NGPU Mode | Loss(NGPU) | Local | Loss(Local) | Same AZ | Loss(Same AZ) | Cross AZ | Loss(Cross AZ) |
---|---|---|---|---|---|---|---|---|---|
basic_gnn_edgecnn | 41.15 s | 40.95 s | -0.49% | 43.48 s | 5.66% | 46.07 s | 11.96% | 54.97 s | 33.58% |
BERT_pytorch | 249.02 s | 248.84 s | -0.07% | 251.26 s | 0.90% | 253.71 s | 1.88% | 261.62 s | 5.06% |
basic_gnn_gcn | 15.05 s | 15.24 s | 1.26% | 19.63 s | 30.43% | 29.70 s | 97.34% | 64.39 s | 327.84% |
basic_gnn_gin | 9.47 s | 9.53 s | 0.63% | 9.78 s | 3.27% | 12.66 s | 33.69% | 21.83 s | 130.52% |
hf_Albert | 24.73 s | 24.00 s | -2.95% | 29.19 s | 18.03% | 39.19 s | 58.47% | 73.00 s | 195.19% |
hf_Bart | 39.88 s | 38.68 s | -3.01% | 54.96 s | 37.81% | 94.17 s | 136.13% | 211.68 s | 430.79% |
hf_Bert | 24.15 s | 24.35 s | 0.83% | 29.55 s | 22.36% | 42.00 s | 73.91% | 75.86 s | 214.12% |
llama | 39.91 s | 41.20 s | 3.23% | 42.90 s | 7.49% | 45.80 s | 14.76% | 52.55 s | 31.67% |
hf_distil_whisper | 170.61 s | 170.87 s | 0.15% | 172.16 s | 0.91% | 178.75 s | 4.77% | 189.45 s | 11.04% |
hf_clip | 191.60 s | 191.70 s | 0.05% | 194.52 s | 1.52% | 197.51 s | 3.08% | 208.90 s | 9.03% |
hf_Whisper | 58.98 s | 59.18 s | 0.34% | 63.50 s | 7.66% | 66.66 s | 13.02% | 72.63 s | 23.14% |
Average Loss | - | - | 0.00% | - | 12.37% | - | 40.82% | - | 128.36% |
To run the MLPerf benchmark:
mlcr run-mlperf,inference,_full,_r5.0-dev \
--model=bert-99 \
--implementation=reference \
--framework=pytorch \
--category=edge \
--scenario=SingleStream \
--execution_mode=valid \
--device=cuda \
--quiet --rerun
Mode | Time | Loss |
---|---|---|
Native | 27.008 s | - |
Local | 29.930 s | 10.82% |
Same AZ | 33.341 s | 23.45% |
Cross AZ | 41.597 s | 54.02% |
To simulate the network conditions of different Availability Zones (AZs), you can use the Linux Traffic Control (tc) tool to inject artificial network latency:
- Inject network latency:
# For Same AZ simulation (0.3ms latency)
tc qdisc add dev lo root netem delay 0.3ms
# For Cross AZ simulation (1ms latency)
tc qdisc add dev lo root netem delay 1ms
- Verify the latency:
ping target_host
- Remove the artificial latency when done:
tc qdisc del dev lo root
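The three steps above can also be wrapped in a small script. This is a minimal sketch; the 1ms delay and the benchmark command are illustrative:

#!/bin/bash
# Sketch: inject Cross AZ-like latency, run one benchmark, then clean up.
# Run as root (tc/netem requires it).
set -e
tc qdisc add dev lo root netem delay 1ms            # simulate Cross AZ latency
trap 'tc qdisc del dev lo root' EXIT                # always remove the delay on exit
cd benchmark
python3 test.py -k "test_hf_Bert_eval_cuda" -t 10   # one illustrative benchmark run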
- Kubernetes 1.19+
- Helm 3.2.0+
- PV provisioner support in the underlying infrastructure
- A GPU node with NVIDIA drivers installed
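You can verify the prerequisites before installing; `gpu-2` below is only the node name used by the chart's default nodeSelector, and the last check assumes the NVIDIA device plugin is deployed:

kubectl version                                # expect server version 1.19 or newer
helm version                                   # expect Helm 3.2.0 or newer
kubectl describe node gpu-2 | grep -i nvidia   # GPU node should expose NVIDIA resources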
To install the chart with the release name `my-release`:
helm install my-release ./helm/torchbench
The command deploys the benchmark application on the Kubernetes cluster with default configuration.
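After installation you can confirm the release and its benchmark pod, for example:

helm status my-release                          # release status and deployed resources
kubectl get pods -l app=my-release-torchbench   # benchmark pod should reach Running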
The following table lists the configurable parameters of the chart and their default values.
Parameter | Description | Default |
---|---|---|
`replicaCount` | Number of replicas | `1` |
`image.repository` | Image repository | `crpi-wpzfqfci37r0ad3n.cn-hangzhou.personal.cr.aliyuncs.com/tensorfusionrobin/tensorfusionrobin` |
`image.tag` | Image tag | `latest` |
`image.pullPolicy` | Image pull policy | `Always` |
`serviceAccount.create` | Create service account | `true` |
`serviceAccount.name` | Service account name | `cronjob-sa` |
`podAnnotations` | Pod annotations | See `values.yaml` |
`podLabels` | Pod labels | See `values.yaml` |
`resources` | Pod resource requests and limits | See `values.yaml` |
`nodeSelector` | Node selector | `kubernetes.io/hostname: gpu-2` |
`cronjob.schedule` | Cronjob schedule | `0 0 * * *` |
`cronjob.concurrencyPolicy` | Cronjob concurrency policy | `Allow` |
`cronjob.successfulJobsHistoryLimit` | Number of successful jobs to keep | `3` |
`cronjob.failedJobsHistoryLimit` | Number of failed jobs to keep | `1` |
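Individual parameters from the table can also be overridden on the command line with `--set`; the values below are only examples:

helm install my-release ./helm/torchbench \
  --set image.tag=latest \
  --set cronjob.schedule="0 */6 * * *"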
The benchmark will run automatically according to the cronjob schedule. You can also manually trigger a benchmark run by:
- Finding the cronjob:
kubectl get cronjob
- Creating a job from the cronjob:
kubectl create job --from=cronjob/my-release-torchbench-test-runner manual-run
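You can then wait for the manually created job to finish and read its output directly (the timeout is illustrative):

kubectl wait --for=condition=complete --timeout=2h job/manual-run
kubectl logs job/manual-run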
To view the benchmark results:
kubectl logs -l app=my-release-torchbench
To customize the configuration, create a custom values file:
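For example, the file might override the node selector and the cronjob schedule from the parameter table above (the values shown are purely illustrative):

# Illustrative custom-values.yaml overriding two chart parameters
cat > custom-values.yaml <<'EOF'
nodeSelector:
  kubernetes.io/hostname: gpu-2   # pin the benchmark to a specific GPU node
cronjob:
  schedule: "0 */6 * * *"         # run every six hours instead of daily
EOF

Then install the chart with the file: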
helm install my-release ./helm/torchbench -f custom-values.yaml
To uninstall/delete the deployment:
helm uninstall my-release
If you encounter any issues:
- Check the pod status:
kubectl get pods -l app=my-release-torchbench
- Check the pod logs:
kubectl logs -l app=my-release-torchbench
- Check the cronjob status:
kubectl get cronjob
kubectl get jobs
- Check the service account:
kubectl get serviceaccount
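If a pod is stuck in Pending (for example, no schedulable GPU node) or cannot pull its image, describing it usually shows the reason in its events:

kubectl describe pod -l app=my-release-torchbench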