MAI-Image-2
Version: 2026-02-20
Direct from Azure models
Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:
- Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
- Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure, all as part of one Microsoft Foundry platform.
- Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry, reducing integration effort.
- Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.
Key capabilities
About this model
MAI-Image-2 is a text-to-image generation model designed to create high-quality, visually rich images from natural language prompts. The model is optimized to produce diverse and coherent images across a wide range of creative and design scenarios, making it well suited for tasks such as concept visualization, creative content generation, and image design workflows.
Key model capabilities
- Text-to-image generation: Generates high-quality images from natural language prompts, enabling users to translate textual descriptions into visually coherent outputs suitable for a wide range of creative and design use cases.
- Photorealistic image synthesis: Generates realistic imagery with consistent visual structure, making it suitable for concept visualization and content creation scenarios.
- Creative range: Generates visually diverse outputs across styles and compositions, reducing repetitive or templated results.
- Professional-grade outputs: Trained with carefully curated datasets and evaluated against real creative use cases to support production-quality design workflows.
Use cases
See Responsible AI for additional considerations for responsible use.
Key use cases
MAI-Image-2 is a general-purpose text-to-image generative model intended for creative generation and design tasks. The model is particularly capable at generating photorealistic imagery.
Out of scope use cases
The provider has not supplied this information.
Pricing
Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.
Technical specs
Training cut-off date
The provider has not supplied this information.
Training time
The provider has not supplied this information.
Input formats
Preferred input is structured text prompts.
Output formats
Image
Supported languages
English
Sample JSON response
The provider has not supplied this information.
Model architecture
The provider has not supplied this information.
Long context
The provider has not supplied this information.
Optimizing model performance
The provider has not supplied this information.
Additional assets
The provider has not supplied this information.
Training disclosure
Training, testing and validation
The provider has not supplied this information.
Distribution
Distribution channels
The provider has not supplied this information.
More information
Model developer: Microsoft AI
Model Release Date: April 2, 2026
Responsible AI considerations
Safety techniques
Despite technical mitigations such as data filtering, image generation models are known to produce harmful or unexpected content in response to user requests. In addition to this model-level work, mitigations such as content classifiers are applied at the system level to further enhance end-user safety. Common risk areas associated with image generation models include violent or gory content, sexual content or nudity, depictions of public figures, and replication of trademarked or other protected material.
Safety evaluations
The provider has not supplied this information.
Known limitations
The provider has not supplied this information.
Acceptable use
Acceptable use policy
The provider has not supplied this information.
Quality and performance evaluations
Source: Microsoft AI
The model was evaluated by human raters alongside comparable models across a range of capability areas. Raters were asked to select a preferred model output in different topic areas based on real user intents (for example, “product/branding,” “cartoon,” “photorealistic”), judging both the output’s alignment with the prompt intent and its visual appeal. These preferences were used to calculate an Elo score. In general, the model was found to perform as well as or better than the comparable models, and it performed particularly well when generating photorealistic imagery.

| Category | MAI-Image-2 | MAI-Image-1 |
|---|---|---|
| Photorealistic & Cinematic Imagery | 1201 ± 12 | 1104 ± 5 |
| Product, Branding, commercial design | 1191 ± 11 | 1085 ± 5 |
| 3D Imaging & Modeling | 1184 ± 22 | 1096 ± 8 |
| Cartoon, Anime & Fantasy | 1186 ± 14 | 1100 ± 5 |
| Art | 1191 ± 18 | 1104 ± 7 |
| Portraits | 1201 ± 17 | 1095 ± 6 |
| Text rendering | 1186 ± 12 | 1069 ± 5 |
| Overall | 1190 ± 8 | 1093 ± 4 |
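
To give a rough sense of what the Elo gaps in the table mean, the sketch below converts each pair of scores into an expected preference rate. It assumes the conventional Elo logistic formula with a 400-point scale; the source does not state which Elo variant or scale was used, so the results are illustrative only. The function names are hypothetical, and the ± intervals from the table are ignored.

```python
# Scores copied from the table above as (MAI-Image-2, MAI-Image-1).
ELO_SCORES = {
    "Photorealistic & Cinematic Imagery": (1201, 1104),
    "Product, Branding, commercial design": (1191, 1085),
    "3D Imaging & Modeling": (1184, 1096),
    "Cartoon, Anime & Fantasy": (1186, 1100),
    "Art": (1191, 1104),
    "Portraits": (1201, 1095),
    "Text rendering": (1186, 1069),
    "Overall": (1190, 1093),
}


def expected_preference(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A is preferred over B under the standard Elo model.

    ASSUMPTION: base-10 logistic with a 400-point scale. The model card
    does not say which formula or scale the evaluation used.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def preference_table(scores: dict) -> dict:
    """Map each category to the expected preference rate for the first model."""
    return {
        category: round(expected_preference(new, old), 3)
        for category, (new, old) in scores.items()
    }


if __name__ == "__main__":
    for category, prob in preference_table(ELO_SCORES).items():
        print(f"{category}: {prob:.1%}")
```

Under that assumption, the 97-point overall gap corresponds to raters preferring the MAI-Image-2 output in a little under two thirds of head-to-head comparisons, with the largest gap in text rendering.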
Benchmarking methodology
Source: Microsoft AI
The provider has not supplied this information.
Public data summary
Source: Microsoft AI
The provider has not supplied this information.
Model Specifications
Context Length: 131072
License: Custom
Last Updated: April 2026
Input Type: Text
Output Type: Image
Provider: Microsoft
Languages: 1 Language