Managed Inference
Set up and interact with CosmicAC inference services for chat completions. These commands let you configure your API credentials, select models, and send prompts directly from the terminal. Use them to initialize your connection, test your setup, and run inference requests.
Command
npx cosmicac inference <subcommand> [options]

Subcommands
| Subcommand | Description |
|---|---|
| init | Set up API key and default model. |
| list-models | Display all available models. |
| chat | Send chat completion requests. |
inference init
Sets up your API key and default model for inference operations with an interactive setup.
Usage
npx cosmicac inference init

Example
$ npx cosmicac inference init
? Enter your API key: ttr-proj-xxxxxxxxxxxx
Validating API key...
Project ID: proj_abc123
Key prefix: ttr-proj-xxx
? Select default model:
❯ TinyLlama/TinyLlama-1.1B-Chat-v1.0
meta-llama/Llama-2-7b-chat
API key and default model saved successfully.

inference list-models
Displays all available models for inference operations.
Usage
npx cosmicac inference list-models

Example
$ npx cosmicac inference list-models
Available models:
──────────────────────────────────────────────────
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
ID: TinyLlama/TinyLlama-1.1B-Chat-v1.0 (default)
Context length: 2,048 tokens
──────────────────────────────────────────────────
Total models: 1
Default model: TinyLlama/TinyLlama-1.1B-Chat-v1.0

Note: You must run npx cosmicac inference init first to set up your API key.

inference chat
Sends chat completion requests.
Usage
npx cosmicac inference chat [options]
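A chat request presumably serializes the options below into a JSON request body. The sketch here is illustrative only: the field names (model, messages, max_tokens, temperature, stream) mirror common chat-completion APIs and are assumptions, not cosmicac's documented wire format. The defaults match the documented option defaults.

```typescript
// Hypothetical request-body builder for a chat completion call.
// Field names are assumptions modeled on common chat-completion APIs.
interface ChatOptions {
  message: string;
  model: string;
  maxTokens?: number;
  temperature?: number;
  stream?: boolean;
}

function buildChatBody(opts: ChatOptions): string {
  return JSON.stringify({
    model: opts.model,
    messages: [{ role: "user", content: opts.message }],
    max_tokens: opts.maxTokens ?? 1000,   // documented default: 1000
    temperature: opts.temperature ?? 1.0, // documented default: 1.0
    stream: opts.stream ?? false,         // documented default: false
  });
}
```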
Options
| Option | Description |
|---|---|
| --message | Message to send (required in non-interactive mode). |
| --model | Model to use (optional if a default is set). |
| --api-key | API key for authentication. |
| --max-tokens | Maximum tokens to generate (default: 1000). |
| --temperature | Temperature for sampling (default: 1.0). |
| --stream | Enable streaming responses (default: false). |
| --interactive | Enable interactive chat mode (default: false). |
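The --temperature option controls sampling randomness: lower values sharpen the token distribution toward the highest-scoring token, while higher values flatten it. The sketch below shows generic temperature-scaled softmax sampling; it is an illustration of the concept, not cosmicac's implementation.

```typescript
// Temperature-scaled softmax over raw model logits.
// Lower temperature -> sharper distribution; higher temperature -> flatter.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract max for numerical stability
  const exps = scaled.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
```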
Examples
# Basic chat completion
$ npx cosmicac inference chat --message "Explain quantum computing"
# Interactive chat mode with streaming
$ npx cosmicac inference chat --interactive --stream
# Chat with streaming response
$ npx cosmicac inference chat --message "Say Hello" --stream
# Chat with specific model and parameters
$ npx cosmicac inference chat --message "Write a Python function" --model "gpt-4" --max-tokens 500 --temperature 0.7
# Using API key from the command line
$ npx cosmicac inference chat --api-key "ttr-proj-xxxx" --message "Hello"

API Key Resolution
The API key is resolved in the following order:
1. --api-key flag
2. COSMICAC_API_KEY environment variable
3. Stored key from the inference init command
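The precedence above can be sketched as a small helper. resolveApiKey is a hypothetical function for illustration, not part of the cosmicac source; only the order (flag, then environment variable, then stored key) comes from the documentation.

```typescript
// Illustrative sketch of the documented precedence: flag > env var > stored key.
function resolveApiKey(
  flagValue: string | undefined,
  storedKey: string | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  return flagValue ?? env.COSMICAC_API_KEY ?? storedKey;
}
```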
Model Resolution
The model is resolved in the following order:
1. --model flag
2. Default model from the inference init command
Note: If no model is provided and no default is set, you will be prompted to run npx cosmicac inference init.
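Model resolution follows the same pattern; when neither source yields a model, the CLI points you at init. resolveModel is a hypothetical helper sketching that behavior, not cosmicac's actual code.

```typescript
// Illustrative sketch of the documented model precedence: --model flag, then
// the default stored by `inference init`; otherwise the user must run init.
function resolveModel(
  flagValue: string | undefined,
  storedDefault: string | undefined,
): string {
  const model = flagValue ?? storedDefault;
  if (model === undefined) {
    throw new Error("No model set. Run: npx cosmicac inference init");
  }
  return model;
}
```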