curl --request POST \
  --url https://api.tensormesh.ai/v1/models \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "modelName": "<string>",
  "userId": "<string>",
  "description": "<string>",
  "infra": {
    "cloudProvider": "CLOUD_PROVIDER_UNSPECIFIED",
    "nebiusRegion": "NEBIUS_REGION_UNSPECIFIED",
    "lambdaRegion": "LAMBDA_REGION_UNSPECIFIED",
    "onpremRegion": "<string>"
  },
  "modelPath": "<string>",
  "gpuCount": 123,
  "gpuType": "GPU_TYPE_UNSPECIFIED",
  "modelSpec": {},
  "apiKey": "<string>",
  "hfToken": "<string>",
  "kvCacheEnabled": true,
  "cpuOffloadingEnabled": true,
  "nodeId": "<string>"
}
'

{
  "model": {
    "modelId": "<string>",
    "deploymentId": "<string>",
    "userId": "<string>",
    "description": "<string>",
    "modelPath": "<string>",
    "modelName": "<string>",
    "status": "MODEL_STATUS_UNSPECIFIED",
    "events": [
      {
        "createdAt": "2023-11-07T05:31:56Z",
        "log": "<string>",
        "eventType": "EVENT_TYPE_UNSPECIFIED"
      }
    ],
    "createdAt": "2023-11-07T05:31:56Z",
    "updatedAt": "2023-11-07T05:31:56Z",
    "modelSpec": {},
    "infra": {
      "cloudProvider": "CLOUD_PROVIDER_UNSPECIFIED",
      "nebiusRegion": "NEBIUS_REGION_UNSPECIFIED",
      "lambdaRegion": "LAMBDA_REGION_UNSPECIFIED",
      "onpremRegion": "<string>"
    },
    "gpuCount": 123,
    "gpuType": "GPU_TYPE_UNSPECIFIED",
    "replicas": 123,
    "endpoint": "<string>",
    "apiKey": "<string>"
  }
}

Provisions a model after billing checks and forwards to ModelService.
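For callers who prefer Python over curl, the same request can be sketched with the standard library alone. The helper names build_create_model_request and create_model are illustrative and not part of any Tensormesh SDK; the URL, headers, and HTTP method are taken from the curl example above, and the response body is assumed to be JSON.

```python
import json
import urllib.request

API_URL = "https://api.tensormesh.ai/v1/models"  # endpoint from the curl example


def build_create_model_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_model(token: str, payload: dict) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_create_model_request(token, payload)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect or log the request without touching the network.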
Authorization (header): Bearer authentication using an access token. Format: Bearer <access_token>
modelName: Model nickname. Must be unique per user.
userId: ID of the user who owns this model. Must be a valid UUID.
description: Optional description of the model.
infra: Specifies the infrastructure configuration for deploying and running models.
This message defines where a model deployment runs by specifying both the cloud provider and the specific region. The region is a oneof, so only the region field that matches the chosen provider is set, which keeps region selection type-safe.
See also: tensormesh/common/v1/cloud_provider.proto for provider and region enum definitions.
Child attributes: cloudProvider, nebiusRegion, lambdaRegion, onpremRegion.
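The oneof rule for the region can be mirrored on the client. The sketch below is a hypothetical helper, build_infra, that requires exactly one region field (an assumption of this sketch: a protobuf oneof only guarantees that at most one is set). The provider value CLOUD_PROVIDER_ONPREM and the region dc-1 in the usage are made-up placeholders; the real enum names are defined in cloud_provider.proto.

```python
from typing import Optional


def build_infra(
    cloud_provider: str,
    *,
    nebius_region: Optional[str] = None,
    lambda_region: Optional[str] = None,
    onprem_region: Optional[str] = None,
) -> dict:
    """Return an `infra` object with exactly one region field set."""
    regions = {
        "nebiusRegion": nebius_region,
        "lambdaRegion": lambda_region,
        "onpremRegion": onprem_region,
    }
    # Keep only the region fields the caller actually provided.
    chosen = {name: value for name, value in regions.items() if value is not None}
    if len(chosen) != 1:
        raise ValueError(
            f"expected exactly one region field, got {sorted(chosen) or 'none'}"
        )
    return {"cloudProvider": cloud_provider, **chosen}


# Hypothetical values: check cloud_provider.proto for the real enum names.
infra = build_infra("CLOUD_PROVIDER_ONPREM", onprem_region="dc-1")
```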
modelPath: Model path (e.g., a Hugging Face model ID).
gpuCount: Number of GPUs to allocate for this model.
gpuType: Specifies the type of GPU to use for a model deployment.
This enum defines the supported GPU types, so clients can request the exact GPU hardware their models need.
Enum values: GPU_TYPE_UNSPECIFIED, GPU_TYPE_A100, GPU_TYPE_H100, GPU_TYPE_H200, GPU_TYPE_B200
modelSpec: Additional model-specific configuration.
kvCacheEnabled: Enable KV cache.
cpuOffloadingEnabled: Enable CPU offloading.
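Some of the constraints described above (a UUID userId, a gpuType drawn from the listed values) can be checked on the client before the request is sent. The following is a minimal sketch, not documented server behavior: build_model_payload is an illustrative name, the positive-gpuCount check and the False defaults for the two flags are assumptions of this sketch, and the values in the usage are placeholders. Fields not shown (infra, modelSpec, apiKey, hfToken, nodeId) can be added to the returned dict.

```python
import uuid

# GPU types listed in this reference.
GPU_TYPES = {
    "GPU_TYPE_UNSPECIFIED",
    "GPU_TYPE_A100",
    "GPU_TYPE_H100",
    "GPU_TYPE_H200",
    "GPU_TYPE_B200",
}


def build_model_payload(
    model_name: str,
    user_id: str,
    model_path: str,
    gpu_count: int,
    gpu_type: str,
    *,
    description: str = "",
    kv_cache_enabled: bool = False,
    cpu_offloading_enabled: bool = False,
) -> dict:
    """Validate a few constraints and return a request body."""
    uuid.UUID(user_id)  # raises ValueError if user_id is not a valid UUID
    if gpu_type not in GPU_TYPES:
        raise ValueError(f"unknown gpuType: {gpu_type!r}")
    if gpu_count < 1:  # assumption: at least one GPU is required
        raise ValueError("gpuCount must be a positive integer")
    return {
        "modelName": model_name,
        "userId": user_id,
        "description": description,
        "modelPath": model_path,
        "gpuCount": gpu_count,
        "gpuType": gpu_type,
        "kvCacheEnabled": kv_cache_enabled,
        "cpuOffloadingEnabled": cpu_offloading_enabled,
    }


payload = build_model_payload(
    model_name="my-model",
    user_id="123e4567-e89b-12d3-a456-426614174000",  # placeholder UUID
    model_path="org/model-name",  # placeholder model ID
    gpu_count=1,
    gpu_type="GPU_TYPE_H100",
)
```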
A successful response.
model: Represents a model instance created by a user.
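A successful response wraps the created model in a top-level model object. As a rough sketch, the fields a client is likely to need first can be pulled into a small dataclass. ModelSummary and parse_model_response are hypothetical names, the field names come from the sample response on this page, and the timestamp is assumed to always be UTC with a trailing Z, as in the sample.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ModelSummary:
    model_id: str
    status: str
    endpoint: str
    created_at: datetime


def parse_model_response(body: dict) -> ModelSummary:
    """Extract a few fields from the `model` object of a response body."""
    model = body["model"]
    # Assumes second-precision UTC timestamps such as 2023-11-07T05:31:56Z.
    created = datetime.strptime(model["createdAt"], "%Y-%m-%dT%H:%M:%SZ")
    return ModelSummary(
        model_id=model["modelId"],
        status=model["status"],
        endpoint=model["endpoint"],
        created_at=created.replace(tzinfo=timezone.utc),
    )


# Sample values copied from the example response on this page.
sample = {
    "model": {
        "modelId": "<string>",
        "status": "MODEL_STATUS_UNSPECIFIED",
        "endpoint": "<string>",
        "createdAt": "2023-11-07T05:31:56Z",
    }
}
summary = parse_model_response(sample)
```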