vastai update endpoint

Update an existing endpoint group

Usage

vastai update endpoint ID [OPTIONS]

Arguments

ID

integer

required

ID of the endpoint group to update

Options

--min_load

number

minimum floor load in perf units/s (tokens/s for LLMs)

--min_cold_load

number

minimum floor load in perf units/s (tokens/s for LLMs), but allow handling with cold workers

--endpoint_state

string

active, suspended, or stopped

--target_util

number

target capacity utilization (fraction, max 1.0, default 0.9)

--cold_mult

number

cold/stopped instance capacity target as multiple of hot capacity target (default 2.5)

--cold_workers

integer

minimum number of workers to keep ‘cold’ when there is no load (default 5)

--max_workers

integer

maximum number of workers your endpoint group can have (default 20)

--endpoint_name

string

deployment endpoint name (allows multiple workergroups to share same deployment endpoint)

--max_queue_time

number

maximum seconds requests may be queued on each worker (default 30.0)

--target_queue_time

number

target seconds for the queue to be cleared (default 10.0)

--inactivity_timeout

integer

seconds of no traffic before the endpoint can scale to zero active workers
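As a rough illustration of how `--cold_mult` reads from its description (cold capacity target as a multiple of the hot capacity target), the arithmetic can be sketched as below. The hot target of 1000 perf units/s and the helper name `cold_capacity_target` are assumptions made for illustration only; how the autoscaler actually derives the hot target is not described here.

```shell
#!/bin/sh
# Sketch of the --cold_mult arithmetic as read from the option description.
# hot_target is an assumed example value, not a documented default.
hot_target=1000   # hypothetical hot capacity target, in perf units/s
cold_mult=2.5     # documented default for --cold_mult

# Hypothetical helper: multiplies a hot capacity target by a cold multiplier.
cold_capacity_target() {
    awk -v hot="$1" -v mult="$2" 'BEGIN { printf "%.0f\n", hot * mult }'
}

cold_capacity_target "$hot_target" "$cold_mult"   # prints 2500
```

Under these assumed numbers, lowering `--cold_mult` to 2.0 would bring the cold capacity target down to 2000 perf units/s.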

Description

Example: vastai update endpoint 4242 --min_load 100 --target_util 0.9 --cold_mult 2.0 --endpoint_name "LLama"

Examples

vastai update endpoint <ID>
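A slightly fuller sketch, using only flags listed under Options, might look like the following. The wrapper function `update_endpoint` and the `EXECUTE` variable are hypothetical conveniences, not part of the CLI: by default the script only prints the commands so they can be reviewed, and it calls `vastai` only when `EXECUTE=1` is set. The ID 4242 is a placeholder, and the value 30 for `--max_workers` is an arbitrary example.

```shell
#!/bin/sh
# Sketch: print (default) or run `vastai update endpoint` commands.
# 4242 is a placeholder endpoint group ID, not a real endpoint.
ENDPOINT_ID="${ENDPOINT_ID:-4242}"

# Hypothetical helper: echoes the command unless EXECUTE=1 is set.
update_endpoint() {
    if [ "${EXECUTE:-0}" = "1" ]; then
        vastai update endpoint "$@"
    else
        echo "vastai update endpoint $*"
    fi
}

# Set the endpoint group state to suspended.
update_endpoint "$ENDPOINT_ID" --endpoint_state suspended

# Set it back to active and raise the worker ceiling (30 is an example value).
update_endpoint "$ENDPOINT_ID" --endpoint_state active --max_workers 30
```

Running the script as-is prints the two commands; running it with `EXECUTE=1` sends them to the API, assuming `vastai` is installed and an API key is configured.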

Global Options

The following options are available for all commands:

| Option | Description |
| --- | --- |
| `--url URL` | Server REST API URL |
| `--retry N` | Retry limit |
| `--raw` | Output machine-readable JSON |
| `--explain` | Verbose explanation of API calls |
| `--api-key KEY` | API key (defaults to `~/.config/vastai/vast_api_key`) |
