Radiance provides an OpenAI-compatible API for running high-performance inference on distributed edge nodes. You can use the standard OpenAI client libraries by changing only the `base_url`:

```
https://api.radiance.cloud/v1
```
All API requests (except `/models`) require an API key, passed in the `Authorization` header:

```
Authorization: Bearer YOUR_API_KEY
```
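As a minimal sketch of attaching this header from Python with only the standard library (the helper names here are our own, not part of the API):

```python
import json
import urllib.request

def auth_headers(api_key: str) -> dict:
    """Build the headers Radiance expects on authenticated requests."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

def list_models(api_key: str) -> dict:
    """GET /v1/models with the auth header attached (requires network access)."""
    req = urllib.request.Request(
        "https://api.radiance.cloud/v1/models",
        headers=auth_headers(api_key),
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```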
Retrieve a list of available models, their specific capabilities, and pricing.

```bash
curl https://api.radiance.cloud/v1/models
```
```json
{
  "object": "list",
  "data": [
    {
      "id": "DeepSeek-V3",
      "object": "model",
      "name": "DeepSeek V3",
      "context_length": 131072,
      "pricing": { "prompt": "0.3", "completion": "1.0" },
      "supported_sampling_parameters": ["temperature", "top_p", "max_tokens"]
    },
    {
      "id": "Llama-3.3-70B-Instruct",
      "pricing": { "prompt": "0.15", "completion": "0.50" }
    }
  ]
}
```
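Client code can index this listing by model `id`, for example to check a model's context length or supported sampling parameters before sending a request. A sketch over the sample payload above (the lookup helper is our own; field names are as shown, and no extra API call is involved):

```python
# Sample /models response, as documented above.
models_response = {
    "object": "list",
    "data": [
        {
            "id": "DeepSeek-V3",
            "context_length": 131072,
            "pricing": {"prompt": "0.3", "completion": "1.0"},
            "supported_sampling_parameters": ["temperature", "top_p", "max_tokens"],
        },
        {
            "id": "Llama-3.3-70B-Instruct",
            "pricing": {"prompt": "0.15", "completion": "0.50"},
        },
    ],
}

def model_by_id(response: dict, model_id: str) -> dict:
    """Find a model entry in the /models listing by its id."""
    return next(m for m in response["data"] if m["id"] == model_id)

ds = model_by_id(models_response, "DeepSeek-V3")
print(ds["context_length"])  # 131072
```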
Create a model response for a chat conversation. Fully compatible with OpenAI's Chat API.
```bash
curl https://api.radiance.cloud/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "DeepSeek-V3",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum entanglement."}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
```
An example using the penalty parameters supported by the Llama 3.3 family:

```bash
curl https://api.radiance.cloud/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "Llama-3.3-70B-Instruct",
    "messages": [{"role": "user", "content": "Write a creative story."}],
    "temperature": 0.8,
    "frequency_penalty": 0.5,
    "presence_penalty": 0.2
  }'
```
| Name | Type | Description |
|---|---|---|
| model | string | The ID of the model (e.g., `DeepSeek-V3`, `Llama-3.3-70B-Instruct`). |
| messages | array | A list of messages comprising the conversation so far. |
| temperature | number | Sampling temperature (0.0 to 2.0). Higher = more random. |
| top_p | number | Nucleus sampling. Consider tokens with top_p probability mass. |
| max_tokens | integer | The maximum number of tokens to generate. |
| stream | boolean | If true, partial message deltas will be sent as SSE. |
| frequency_penalty | number | -2.0 to 2.0. Penalizes new tokens based on their existing frequency. (Supported by Llama 3.3/3.2) |
| presence_penalty | number | -2.0 to 2.0. Penalizes new tokens based on whether they appear in text. (Supported by Llama 3.3/3.2) |
| repetition_penalty | number | > 1.0. Penalizes repetition. (Supported by Qwen 2.5) |
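Because support for the penalty parameters varies by model family, a client can drop parameters the target model does not advertise before sending the request. A sketch (the filtering helper is our own; it relies on the `supported_sampling_parameters` field returned by `/models`):

```python
def build_body(model: str, messages: list, supported: list, **sampling) -> dict:
    """Build a /chat/completions body, keeping only sampling parameters
    the target model advertises in supported_sampling_parameters."""
    body = {"model": model, "messages": messages}
    body.update({k: v for k, v in sampling.items() if k in supported})
    return body

body = build_body(
    "DeepSeek-V3",
    [{"role": "user", "content": "Hello"}],
    supported=["temperature", "top_p", "max_tokens"],
    temperature=0.7,
    frequency_penalty=0.5,  # dropped: not advertised by this model
)
print(sorted(body))  # ['messages', 'model', 'temperature']
```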
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "DeepSeek-V3",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nQuantum entanglement is a phenomenon..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
```
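If the pricing strings from `/models` are USD per million tokens — an assumption on our part, since the unit is not stated above — the cost of a completion can be estimated from the `usage` block:

```python
def estimate_cost(usage: dict, pricing: dict) -> float:
    """Estimate request cost from a usage block, assuming the pricing
    strings from /models are USD per 1M tokens (unverified assumption)."""
    return (usage["prompt_tokens"] * float(pricing["prompt"])
            + usage["completion_tokens"] * float(pricing["completion"])) / 1_000_000

# Usage block from the sample response, DeepSeek-V3 pricing from /models.
usage = {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
cost = estimate_cost(usage, {"prompt": "0.3", "completion": "1.0"})
print(f"${cost:.8f}")  # $0.00001470
```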
You can use the official OpenAI Python or Node.js SDKs.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.radiance.cloud/v1",
    api_key="your-api-key",
)

response = client.chat.completions.create(
    model="Qwen-2.5-72B-Instruct",
    messages=[
        {"role": "system", "content": "You are a coding expert."},
        {"role": "user", "content": "Write a Rust function to reverse a string."},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```