Provisions a model after billing checks and forwards the request to ModelService.
Bearer authentication using an access token. Format: Bearer <access_token>
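The Bearer format above can be illustrated with a minimal sketch. The helper name `bearer_header` is hypothetical and not part of this API; it only shows how the documented `Bearer <access_token>` value might be assembled into an `Authorization` header.

```python
def bearer_header(access_token: str) -> dict[str, str]:
    """Build an Authorization header in the documented Bearer format.

    Hypothetical helper for illustration; not part of the API itself.
    """
    token = access_token.strip()
    if not token:
        raise ValueError("access token must not be empty")
    # Documented format: "Bearer <access_token>"
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client is in use.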
Model nickname. Must be unique per user.
ID of the user who owns this model. Must be a valid UUID.
Optional description of the model.
Infra specifies the infrastructure configuration for deploying and running models.
This message defines where a model deployment should run by specifying both the cloud provider and the specific region. It uses a oneof for region selection to ensure type-safe region specification based on the chosen provider.
See also: tensormesh/common/v1/cloud_provider.proto for provider and region enum definitions
Model path (e.g., HuggingFace model ID).
Number of GPUs to allocate for this model.
GPUType specifies the type of GPU to use for a model deployment.
This enum defines the supported GPU types for model deployments. It allows clients to specify the exact GPU hardware they need for their models.
Enum values: GPU_TYPE_UNSPECIFIED, GPU_TYPE_A100, GPU_TYPE_H100, GPU_TYPE_H200, GPU_TYPE_B200
Additional model-specific configuration.
Enable KV cache.
Enable CPU offloading.
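As a rough sketch of how a client might assemble and pre-validate the request fields described above, consider the following. The field names (`name`, `user_id`, `model_path`, `gpu_count`, `gpu_type`, `kv_cache`, `cpu_offload`) and the function `build_model_request` are assumptions made for illustration, since the reference lists descriptions but not the wire names. Rejecting `GPU_TYPE_UNSPECIFIED` and requiring at least one GPU are also assumptions, not documented server behavior. The Infra message is omitted because its provider and region fields are defined in a separate file.

```python
import uuid

# GPU type names as listed in the reference. GPU_TYPE_UNSPECIFIED is left out
# on the assumption that a concrete type is required (not confirmed by the source).
CONCRETE_GPU_TYPES = {
    "GPU_TYPE_A100",
    "GPU_TYPE_H100",
    "GPU_TYPE_H200",
    "GPU_TYPE_B200",
}


def build_model_request(
    name: str,
    user_id: str,
    model_path: str,
    gpu_count: int,
    gpu_type: str,
    description: str = "",
    kv_cache: bool = False,
    cpu_offload: bool = False,
) -> dict:
    """Assemble a request payload using hypothetical field names."""
    if not name:
        raise ValueError("name must not be empty")
    # uuid.UUID raises ValueError when the string is not a valid UUID.
    parsed_user_id = uuid.UUID(user_id)
    if gpu_count < 1:  # assumed lower bound
        raise ValueError("gpu_count must be at least 1")
    if gpu_type not in CONCRETE_GPU_TYPES:
        raise ValueError(f"unsupported gpu_type: {gpu_type}")
    return {
        "name": name,
        "user_id": str(parsed_user_id),
        "description": description,
        "model_path": model_path,
        "gpu_count": gpu_count,
        "gpu_type": gpu_type,
        "kv_cache": kv_cache,
        "cpu_offload": cpu_offload,
    }
```

Validating on the client side only catches obvious mistakes early; uniqueness of the nickname per user can only be checked by the service.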
A successful response.
Model represents a model instance created by a user.