Error Reference Guide

Here are some common errors you may encounter while running container deployments on Salad Cloud, along with steps to resolve them.


Run Failures and Start Failures

These errors can occur when a container group is started. They can be found in the Salad Portal under Recent Errors.


Runfailure:137

    • Indicates the container ran out of memory.
  • Resolution
        • Check the minimum RAM requirement for your model and edit your Container Group Deployment to allocate enough memory.
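
The number 137 follows the common convention in which a process killed by a signal exits with 128 plus the signal number, so 137 corresponds to signal 9 (SIGKILL), which is what the Linux out-of-memory killer sends. A minimal sketch of that decoding follows; the helper name describe_exit_code is hypothetical and not part of any Salad tooling.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit code using the 128 + signal convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name, e.g. 9 -> SIGKILL on Linux.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with application error {code}"


print(describe_exit_code(137))
```

On Linux this reports SIGKILL for 137. SIGKILL can have other causes, so treat memory exhaustion as the most likely explanation rather than the only one.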

Networking Failures

These errors occur when you attempt to access your container through the Access Domain Name. Any of the errors below indicates an issue with your container networking.


503 Error

    • If the container is unable to respond to incoming requests, you will receive a 503 error.
  • Resolution
        • Ensure your container has IPv6 configured correctly.
        • It is best practice to have a readiness probe, so that the load balancer knows when a particular instance is ready.
        • Salad automatically reallocates nodes with isolated networking issues that we are able to detect. However, if a problem persists on a single node, reallocate that node.
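
As an illustration of the first two points, here is a minimal sketch of a server that binds to an IPv6 address and exposes a readiness endpoint. It assumes that "IPv6 configured correctly" means the server listens on "::" rather than an IPv4-only address. The "/ready" path, port 8080, and the MODEL_READY flag are hypothetical choices for this example, not values required by Salad.

```python
import socket
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical flag; the application sets it to True once the model is loaded.
MODEL_READY = {"value": False}


def readiness_status(ready: bool) -> int:
    """Return the HTTP status the readiness endpoint should report."""
    return 200 if ready else 503


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # "/ready" is an assumed path; use whatever path your probe targets.
        if self.path == "/ready":
            self.send_response(readiness_status(MODEL_READY["value"]))
        else:
            self.send_response(404)
        self.end_headers()


class IPv6Server(ThreadingHTTPServer):
    # Bind an IPv6 socket instead of the default IPv4 one.
    address_family = socket.AF_INET6


def serve(port: int = 8080) -> None:
    """Listen on all IPv6 interfaces ("::") and answer probe requests."""
    with IPv6Server(("::", port), ProbeHandler) as server:
        server.serve_forever()
```

Calling serve() starts the listener. Until MODEL_READY["value"] is set to True, the probe receives a 503, so a probe pointed at this path would not mark the instance as ready.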

524 Error

    • You may experience this error when some containers in a container group with the Container Gateway enabled fail to reply with an HTTP response for a long time after the initial TCP connection is established.
  • Resolution
        • The Readiness Probe should be configured to ensure the Load Balancer forwards requests to a container only when it is ready.

  • Health Probes assess the inference time (seconds, minutes, or longer) of your AI model on the selected resource type (CPU/GPU/RAM/VRAM). If the inference time is significantly longer than expected, please check your code:
      1. Whether the code is explicitly using the GPU.
      2. Whether the correct resource type is selected to run the model inference.
      3. Whether some requests could exhaust VRAM during inference.
  • Avoid multiprocessing- or multithreading-based concurrent inference on a single GPU, because it can limit GPU cache utilization and hurt performance. If inference takes a very long time (tens of minutes or longer), please consider re-architecting your application from push mode using the Container Gateway to pull mode using a job queue. With a job queue, a container fetches and processes a new job only when the current one is done, which significantly simplifies your application.
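
The pull mode described above can be sketched as a worker loop that takes one job at a time. This is only an illustration: Python's in-process queue.Queue stands in for a managed job queue, and run_inference is a hypothetical placeholder for your model call. Neither represents the Salad job queue API.

```python
import queue


def run_inference(job: str) -> str:
    """Hypothetical placeholder for the real model call."""
    return job.upper()


def worker(jobs: queue.Queue) -> list:
    """Pull jobs one at a time; the next job starts only after the current one ends."""
    results = []
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            break  # No work left; a real worker would wait and poll again.
        results.append(run_inference(job))
        jobs.task_done()
    return results


pending = queue.Queue()
for prompt in ["first", "second"]:
    pending.put(prompt)
print(worker(pending))
```

Because the worker asks for work only when it is idle, no request sits on an open connection waiting for the GPU, which is the situation that leads to a 524.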

Container Error Codes

These errors are associated with the container image and are not typically related to Salad. They usually result in the container exiting with a distinct exit code.


Type Error

    • You may experience this error when a request made through the Salad layer returns HTML instead of the expected output from your API.
  • Resolution
    • Check response status and response content type before parsing.
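
A minimal sketch of that check, assuming the API is expected to return JSON. The function name parse_api_response is hypothetical; the status, content type, and body would come from whichever HTTP client you use.

```python
import json


def parse_api_response(status: int, content_type: str, body: bytes) -> dict:
    """Parse a JSON body only after checking the status and content type."""
    if not 200 <= status < 300:
        raise RuntimeError(f"request failed with HTTP {status}")
    # Compare only the media type; ignore parameters such as "; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "application/json":
        raise RuntimeError(f"expected JSON but received {media_type or 'no content type'}")
    return json.loads(body)


print(parse_api_response(200, "application/json; charset=utf-8", b'{"ok": true}'))
```

An HTML error page then fails with a clear message instead of surfacing later as a type error in the parsing code.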

Still need help? Contact Us