Managed Inference
Set up and interact with CosmicAC inference services for chat completions. These commands let you configure your API credentials, select models, and send prompts directly from the terminal. Use them to initialize your connection, test your setup, and run inference requests.
Command
npx cosmicac inference <subcommand> [options]

Subcommands
| Subcommand | Description |
|---|---|
| init | Set up API key and default model. |
| list-models | Display all available models. |
| chat | Send chat completion requests. |
inference init
Sets up your API key and default model for inference operations with an interactive setup.
Usage
npx cosmicac inference init

Example
$ npx cosmicac inference init
? Enter your API key: ttr-proj-xxxxxxxxxxxx
Validating API key...
Project ID: proj_abc123
Key prefix: ttr-proj-xxx
? Select default model:
❯ TinyLlama/TinyLlama-1.1B-Chat-v1.0
meta-llama/Llama-2-7b-chat
API key and default model saved successfully.

inference list-models
Displays all available models for inference operations.
Usage
npx cosmicac inference list-models

Example
$ npx cosmicac inference list-models
Available models:
──────────────────────────────────────────────────
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
ID: TinyLlama/TinyLlama-1.1B-Chat-v1.0 (default)
Context length: 2,048 tokens
──────────────────────────────────────────────────
Total models: 1
Default model: TinyLlama/TinyLlama-1.1B-Chat-v1.0

Note: You must run npx cosmicac inference init first to set up your API key.

inference chat
Sends chat completion requests.
Usage
npx cosmicac inference chat [options]
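A chat request presumably serializes the options below into a JSON request body. The sketch here is illustrative only: the field names (model, messages, max_tokens, temperature, stream) mirror common chat-completion APIs and are assumptions, not cosmicac's documented wire format. The defaults match the documented option defaults.

```typescript
// Hypothetical request-body builder for a chat completion call.
// Field names are assumptions modeled on common chat-completion APIs.
interface ChatOptions {
  message: string;
  model: string;
  maxTokens?: number;
  temperature?: number;
  stream?: boolean;
}

function buildChatBody(opts: ChatOptions): string {
  return JSON.stringify({
    model: opts.model,
    messages: [{ role: "user", content: opts.message }],
    max_tokens: opts.maxTokens ?? 1000,   // documented default: 1000
    temperature: opts.temperature ?? 1.0, // documented default: 1.0
    stream: opts.stream ?? false,         // documented default: false
  });
}
```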
Options
| Option | Description |
|---|---|
| --message | Message to send (required in non-interactive mode). |
| --model | Model to use (optional if a default is set). |
| --api-key | API key for authentication. |
| --max-tokens | Maximum tokens to generate (default: 1000). |
| --temperature | Temperature for sampling (default: 1.0). |
| --stream | Enable streaming responses (default: false). |
| --interactive | Enable interactive chat mode (default: false). |
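The --temperature option controls sampling randomness: lower values sharpen the token distribution toward the highest-scoring token, while higher values flatten it. The sketch below shows generic temperature-scaled softmax sampling; it is an illustration of the concept, not cosmicac's implementation.

```typescript
// Temperature-scaled softmax over raw model logits.
// Lower temperature -> sharper distribution; higher temperature -> flatter.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract max for numerical stability
  const exps = scaled.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
```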
Examples
# Basic chat completion
$ npx cosmicac inference chat --message "Explain quantum computing"
# Interactive chat mode with streaming
$ npx cosmicac inference chat --interactive --stream
# Chat with streaming response
$ npx cosmicac inference chat --message "Say Hello" --stream
# Chat with specific model and parameters
$ npx cosmicac inference chat --message "Write a Python function" --model "gpt-4" --max-tokens 500 --temperature 0.7
# Using API key from the command line
$ npx cosmicac inference chat --api-key "ttr-proj-xxxx" --message "Hello"

API Key Resolution
The API key is resolved in the following order:
1. --api-key flag
2. COSMICAC_API_KEY environment variable
3. Stored key from the inference init command
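The precedence above can be sketched as a small helper. resolveApiKey is a hypothetical function for illustration, not part of the cosmicac source; only the order (flag, then environment variable, then stored key) comes from the documentation.

```typescript
// Illustrative sketch of the documented precedence: flag > env var > stored key.
function resolveApiKey(
  flagValue: string | undefined,
  storedKey: string | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  return flagValue ?? env.COSMICAC_API_KEY ?? storedKey;
}
```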
Model Resolution
The model is resolved in the following order:
1. --model flag
2. Default model from the inference init command
Note: If no model is provided and no default is set, you will be prompted to run npx cosmicac inference init.
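Model resolution follows the same pattern; when neither source yields a model, the CLI points you at init. resolveModel is a hypothetical helper sketching that behavior, not cosmicac's actual code.

```typescript
// Illustrative sketch of the documented model precedence: --model flag, then
// the default stored by `inference init`; otherwise the user must run init.
function resolveModel(
  flagValue: string | undefined,
  storedDefault: string | undefined,
): string {
  const model = flagValue ?? storedDefault;
  if (model === undefined) {
    throw new Error("No model set. Run: npx cosmicac inference init");
  }
  return model;
}
```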