MAI-Image-2

Version: 2026-02-20

Microsoft • Last updated April 2026

Built for creatives, delivering enhanced photorealism at scale.

Vision

Direct from Azure models

Direct from Azure models are a select portfolio curated for their market-differentiated capabilities:

Secure and managed by Microsoft: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
Streamlined operations: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure - all as part of one Microsoft Foundry platform.
Future-ready flexibility: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry, reducing integration effort.
Cost control and optimization: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.

Learn more about Direct from Azure models.

Key capabilities

About this model

MAI‑Image‑2 is a text‑to‑image generation model designed to create high‑quality, visually rich images from natural language prompts.
The model is optimized to produce diverse and coherent images across a wide range of creative and design scenarios, making it well suited for tasks such as concept visualization, creative content generation, and image design workflows.

Key model capabilities

Text‑to‑image generation: Generates high‑quality images from natural language prompts, enabling users to translate textual descriptions into visually coherent outputs suitable for a wide range of creative and design use cases.
Photorealistic image synthesis: Capable of generating realistic imagery with consistent visual structure, making it suitable for concept visualization and content creation scenarios.
Creative range: Generates visually diverse outputs across styles and compositions, reducing repetitive or templated results.
Professional-grade outputs: Trained with carefully curated datasets and evaluated against real creative use cases to support production-quality design workflows.

Use cases

See Responsible AI for additional considerations for responsible use.

Key use cases

MAI-Image-2 is a general-purpose text-to-image generative model, intended for creative generation and design tasks. The model is particularly capable at generating photorealistic imagery.

Out of scope use cases

The provider has not supplied this information.

Pricing

Pricing is based on a number of factors, including deployment type and tokens used. See pricing details here.

Technical specs

Training cut-off date

The provider has not supplied this information.

Training time

The provider has not supplied this information.

Input formats

Preferred input is structured text prompts.
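The provider does not define what "structured" means here. As one possible reading, the sketch below assembles a prompt from labeled parts so that subject, style, and lighting stay distinct. The `ImagePrompt` class, its field names, and the "Label: value" layout are all hypothetical illustrations, not a documented input schema for MAI-Image-2.

```python
from dataclasses import dataclass


@dataclass
class ImagePrompt:
    """Hypothetical container for the parts of a text-to-image prompt."""

    subject: str
    style: str = ""
    composition: str = ""
    lighting: str = ""

    def render(self) -> str:
        # Emit one "Label: value" line per non-empty field, in a fixed order.
        parts = [
            ("Subject", self.subject),
            ("Style", self.style),
            ("Composition", self.composition),
            ("Lighting", self.lighting),
        ]
        return "\n".join(f"{label}: {value}" for label, value in parts if value)


prompt = ImagePrompt(
    subject="a ceramic coffee mug on a wooden desk",
    style="photorealistic product shot",
    lighting="soft morning window light",
)
print(prompt.render())
```

Keeping each aspect in its own field makes it easy to vary one part, such as the lighting, while holding the rest of the prompt fixed when comparing outputs.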

Output formats

Image

Supported languages

English

Sample JSON response

The provider has not supplied this information.

Model architecture

The provider has not supplied this information.

Long context

The provider has not supplied this information.

Optimizing model performance

The provider has not supplied this information.

Additional assets

The provider has not supplied this information.

Training disclosure

Training, testing and validation

The provider has not supplied this information.

Distribution

Distribution channels

The provider has not supplied this information.

More information

Model developer: Microsoft AI
Model release date: April 2, 2026

Responsible AI considerations

Safety techniques

Despite technical mitigations such as data filtering, image generation models are known to produce harmful or unexpected content in response to user requests. Beyond this model-level work, additional mitigations such as content classifiers are applied at the system level to further enhance end-user safety. Common risk areas associated with image generation models include violent or gory content, sexual content or nudity, depictions of public figures, and replication of trademarked or other protected material.

Safety evaluations

The provider has not supplied this information.

Known limitations

The provider has not supplied this information.

Acceptable use

Acceptable use policy

The provider has not supplied this information.

Quality and performance evaluations

Source: Microsoft AI

The model was evaluated by human raters alongside comparable models across a range of capability areas. Raters selected a preferred model output in topic areas based on real user intents (for example, “product/branding,” “cartoon,” “photorealistic”), judging each output’s alignment with the prompt intent and its visual appeal. These preferences were used to calculate an Elo score. Generally, the model was found to perform as well as or better than the comparable models, performing particularly well when generating photorealistic imagery.

Category	MAI-Image-2	MAI-Image-1
Photorealistic & Cinematic Imagery	1201 ± 12	1104 ± 5
Product, Branding, commercial design	1191 ± 11	1085 ± 5
3D Imaging & Modeling	1184 ± 22	1096 ± 8
Cartoon, Anime & Fantasy	1186 ± 14	1100 ± 5
Art	1191 ± 18	1104 ± 7
Portraits	1201 ± 17	1095 ± 6
Text rendering	1186 ± 12	1069 ± 5
Overall	1190 ± 8	1093 ± 4
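
To give a feel for what the Elo gaps in the table above mean, the sketch below converts a score difference into an expected preference rate. It assumes the conventional logistic Elo model with a scale of 400 and that both models were rated in the same comparison pool; the provider does not state the exact formula or pool, so treat the results as a rough illustration only. The function name `expected_preference_rate` is introduced here for illustration.

```python
def expected_preference_rate(elo_a: float, elo_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head comparisons in which A is preferred over B.

    Uses the conventional logistic Elo formula; the scale of 400 is an
    assumption, not something the provider has confirmed.
    """
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / scale))


# Point estimates copied from the table above: (MAI-Image-2, MAI-Image-1).
SCORES = {
    "Photorealistic & Cinematic Imagery": (1201, 1104),
    "Product, Branding, commercial design": (1191, 1085),
    "3D Imaging & Modeling": (1184, 1096),
    "Cartoon, Anime & Fantasy": (1186, 1100),
    "Art": (1191, 1104),
    "Portraits": (1201, 1095),
    "Text rendering": (1186, 1069),
    "Overall": (1190, 1093),
}

for category, (new_score, old_score) in SCORES.items():
    rate = expected_preference_rate(new_score, old_score)
    print(f"{category}: {rate:.1%}")
```

Under these assumptions, the overall gap of 97 points corresponds to raters preferring MAI-Image-2 in roughly 64% of direct comparisons. The calculation uses point estimates only and ignores the ± ranges reported in the table.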

Benchmarking methodology

Source: Microsoft AI

The provider has not supplied this information.

Public data summary

Source: Microsoft AI

The provider has not supplied this information.

Model Specifications

Context Length: 131072
License: Custom
Last Updated: April 2026
Input Type: Text
Output Type: Image
Provider: Microsoft
Languages: 1 language
