Skip to navigationSkip to loginSkip to main contentSkip to footer section

Generative APIs

Serve the latest AI models via API, pay by million token

OpenAI-compatible APIs

Easily integrate with existing tools like OpenAI libraries and LangChain SDKs. Our APIs are designed to work out-of-the-box with your existing workflows, including adapters for Retrieval-Augmented Generation (RAG).

Cost-effective usage

Optimize your budget with a pay-per-use model, billed per million tokens. Benefit from additional discount for non-realtime use cases through Batches APIs.

Quick model testing

Start serving and testing AI models in just a few minutes. Our streamlined onboarding process and serverless architecture let you deploy endpoints instantly, enabling rapid iteration and minimal setup time.

Towards a sovereign AI where your data remains yours, and only in Europe.

Security and privacy for your data and applications

Security and privacy for your data and applications

We do not collect, read, reuse, or analyse the content of your inputs, prompts or outputs generated by the APIs. Your business is yours and has nothing to do with Scaleway’s

Everything you need to create apps with Generative AI

Models' prices

Enjoy a free tier of 1,000,000 tokens. Every new customer gets 1,000,000 free tokens—start paying only from the 1,000,001st token.

qwen3.5-397b-a17bChat, Code and Vision€0.60 /million tokens€3.60 /million tokens
qwen3-235b-a22b-instruct-2507Chat€0.75 /million tokens€2.25 /million tokens
gpt-oss-120bChat€0.15 /million tokens€0.60 /million tokens
gemma-3-27b-itChat and Vision€0.25 /million tokens€0.50 /million tokens
whisper-large-v3Audio transcription€0.003 /Audio minuteFree
holo2-30b-a3bChat and Vision€0.30 /million tokens€0.70 /million tokens
voxtral-small-24b-2507Audio transcription and Chat€0.15 /million tokens€0.35 /million tokens
mistral-small-3.2-24b-instruct-2506Chat and Vision€0.15 /million tokens€0.35 /million tokens
llama-3.3-70b-instructChat€0.90 /million tokens€0.90 /million tokens
deepseek-r1-distill-llama-70bChat€0.90 /million tokens€0.90 /million tokens
qwen3-embedding-8bEmbeddings€0.10 /million tokensFree
qwen3-coder-30b-a3b-instructChat€0.20 /million tokens€0.80 /million tokens
pixtral-12b-2409Chat and Vision€0.20 /million tokens€0.20 /million tokens
mistral-nemo-instruct-2407Chat€0.20 /million tokens€0.20 /million tokens
bge-multilingual-gemma2Embeddings€0.10 /million tokensFree
llama-3.1-8b-instructChat€0.20 /million tokens€0.20 /million tokens

Evaluate the cost of your Generative API

Compare the Generative API with Managed Inference based on your request volume and token usage.