How does Salad handle nodes with an out-of-date CUDA version?
Most nodes on the Salad network run recent GPU drivers. However, driver and CUDA versions can differ slightly from node to node. To check them, log into a running instance with the interactive shell and run the nvidia-smi command, which reports both the driver version and the CUDA version.
The driver version is the version of the NVIDIA driver installed on the node that the instance runs on. The CUDA version is the newest CUDA Toolkit version that this driver supports. NVIDIA drivers are backward compatible, so applications built on older CUDA Toolkits continue to run.
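If you want to read these two values from a script rather than by eye, the following is a minimal sketch. It assumes the nvidia-smi banner contains the labels "Driver Version:" and "CUDA Version:"; the function names parse_nvidia_smi and read_node_versions are hypothetical and are not part of any Salad or NVIDIA API.

```python
import re
import subprocess

# Assumption: the nvidia-smi banner includes the labels
# "Driver Version: <x.y.z>" and "CUDA Version: <x.y>".
_DRIVER_RE = re.compile(r"Driver Version:\s*([\d.]+)")
_CUDA_RE = re.compile(r"CUDA Version:\s*([\d.]+)")


def parse_nvidia_smi(text):
    """Return (driver_version, cuda_version) as strings; None where a value is missing."""
    driver = _DRIVER_RE.search(text)
    cuda = _CUDA_RE.search(text)
    return (
        driver.group(1) if driver else None,
        cuda.group(1) if cuda else None,
    )


def read_node_versions():
    """Run nvidia-smi and parse its output. Only works where nvidia-smi is installed."""
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=True
    )
    return parse_nvidia_smi(result.stdout)
```

Calling read_node_versions() inside a running instance returns the pair of version strings. Keeping parse_nvidia_smi separate lets you check the parsing logic on a machine without a GPU.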
AI frameworks such as TensorFlow and PyTorch lag behind each new CUDA release, because supporting it takes time for integration, testing, and validation. As a result, even the most recent framework releases usually target a CUDA version that the drivers on Salad nodes already support, so container images built on them can generally run on Salad Cloud without any problems.
Salad Cloud doesn't allocate nodes based on NVIDIA driver versions at this time. To avoid edge cases, we suggest building your images on recent, stable versions of these frameworks or their images rather than on the very latest versions.
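Because nodes are not matched to your image by driver version, one option is to have the container check compatibility itself at startup. The sketch below is an illustration under stated assumptions, not a Salad feature: it assumes you know the CUDA version your image was built against and have already read the node's CUDA version (for example, from nvidia-smi). The helper names parse_version, driver_supports, and check_or_exit are hypothetical. The comparison is deliberately conservative and does not model NVIDIA's minor-version compatibility rules.

```python
import sys


def parse_version(version):
    """Turn a string such as "12.1" into a tuple of ints, here (12, 1)."""
    return tuple(int(part) for part in version.strip().split("."))


def driver_supports(required_cuda, node_cuda):
    """True if the node's reported CUDA version is at least the required one."""
    return parse_version(node_cuda) >= parse_version(required_cuda)


def check_or_exit(required_cuda, node_cuda):
    """Exit with status 1 when the node's driver is too old for the image."""
    if not driver_supports(required_cuda, node_cuda):
        print(
            f"Node supports CUDA {node_cuda}, but the image needs {required_cuda}",
            file=sys.stderr,
        )
        sys.exit(1)
```

A check like this makes a version mismatch visible in the logs as soon as the container starts, instead of as a less obvious framework error later.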