VidMachine

WAN 2.7 Image

Alibaba · AI image
WAN 2.7 Image is Alibaba's Wan 2.7 image model for text-to-image, image editing, and multi-reference fusion. It supports resolutions up to about 2K (for example 2048×2048), flexible aspect ratios via preset sizes or custom dimensions, and optional coherent image-set generation for the same character or product across multiple shots. A thinking mode is available for stronger reasoning on text-to-image requests (as described on the model provider page).
On VidMachine, WAN 2.7 Image is available as an image option in your project's image model priority. We run it through Replicate as documented at replicate.com/wan-video/wan-2.7-image. Each generated image uses a fixed credit cost (see Pricing). You can use it for start frames, scene edits, thumbnails, and AI Images workflows alongside other models in your fallback chain.
Alibaba also documents a separate WAN 2.7 Image Pro variant on Replicate, with higher-quality output and 4K support; on VidMachine the integrated model is the standard WAN 2.7 Image variant.

Key features and benefits

Text-to-image

Describe the scene in natural language and get a high-quality image. Prompts can be long (the upstream model supports up to thousands of characters), so you can specify composition, lighting, style, and subject detail in one go. Size can be set with presets such as 1K or 2K, or with explicit width-by-height strings for exact aspect ratios—useful for vertical shorts, landscape thumbnails, or square assets.
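As a rough sketch of what a direct call looks like, the snippet below uses Replicate's Python client with the model slug from this page. The input field names (prompt, size) are assumptions for illustration; check the actual schema on replicate.com/wan-video/wan-2.7-image before relying on them.

    import replicate  # requires REPLICATE_API_TOKEN in the environment

    # Text-to-image: one prompt, one explicit width-by-height size string.
    # "prompt" and "size" are assumed field names -- verify against the model schema.
    output = replicate.run(
        "wan-video/wan-2.7-image",
        input={
            "prompt": "Golden-hour portrait of a hiker on a ridge, soft rim light, 35mm look",
            "size": "1080*1920",  # assumed format for a vertical (9:16) output
        },
    )
    print(output)  # typically a URL, or a list of URLs, for the generated image(s)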

Image editing and multi-reference fusion

You can supply reference images together with a text instruction to edit, restyle, or fuse content. The model accepts multiple inputs for style transfer, element swaps, and blending several references into one output. This fits workflows where you already have a product shot, a character reference, or a mood board and want a new composite or variant without starting from a blank canvas.
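A hedged sketch of the same client used for an edit or fusion call would pass reference URLs alongside the instruction. The "images" field name is an assumption, and the documented cap is nine references.

    import replicate

    # Multi-reference fusion: reference images plus a text instruction.
    # "images" is an assumed parameter name for reference inputs (up to 9 per the docs).
    output = replicate.run(
        "wan-video/wan-2.7-image",
        input={
            "prompt": "Place the product from the first image into the cafe scene "
                      "from the second image, matching its warm window light",
            "images": [
                "https://example.com/product.png",     # hypothetical reference URLs
                "https://example.com/cafe-scene.png",
            ],
        },
    )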

Image sets and consistency

Image-set mode is designed for coherent batches from one prompt—for example the same character in different seasons, product angles, or storyboard beats. The provider documents generating multiple related images in one request. On VidMachine we currently call the model for a single output per generation step, but you can still rely on the model's strength for consistency when you use it repeatedly with clear prompts and references.
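One way to approximate image-set consistency under VidMachine's single-output-per-call integration is to loop over story beats while reusing the same reference. This is a sketch under the same assumed field names as above, not a documented pattern.

    import replicate

    CHARACTER_REF = "https://example.com/character.png"  # hypothetical reference URL
    beats = ["walking through autumn leaves", "standing in fresh snow", "crossing a spring meadow"]

    frames = []
    for beat in beats:
        # One call per output; the shared reference and stable prompt prefix
        # lean on the model's consistency rather than image-set mode.
        frames.append(replicate.run(
            "wan-video/wan-2.7-image",
            input={
                "prompt": f"Same character as the reference, {beat}, consistent face and outfit",
                "images": [CHARACTER_REF],  # assumed field name, as above
            },
        ))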

Thinking mode and reproducibility

Thinking mode is aimed at improved quality for text-to-image by doing more internal reasoning before rendering. When the API exposes a seed, you can set it for reproducible runs. Together, these help when you need to iterate on small prompt tweaks or lock down a look for a series of images.
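For example, pinning a seed lets a small prompt tweak be compared against an otherwise identical render. "seed" as a field name is a guess to verify on the Replicate schema.

    import replicate

    base = {
        "prompt": "Neon-lit alley in the rain, cinematic wide shot",
        "seed": 42,  # assumed field; same seed + same inputs should reproduce the image
    }
    original = replicate.run("wan-video/wan-2.7-image", input=base)
    variant = replicate.run(
        "wan-video/wan-2.7-image",
        input={**base, "prompt": base["prompt"] + ", red umbrella in the foreground"},
    )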

Technical specifications

Provider: Alibaba Wan (via Replicate)
Modes: Text-to-image; image edit / fusion; optional image sets
Reference images: Up to 9 (when using image inputs)
Resolution: Up to ~2K; 1K / 2K presets or custom WxH
Outputs per call (upstream): 1–4 (or more in image-set mode)
VidMachine integration: Single image per call; portrait-friendly defaults

Use cases and applications

WAN 2.7 Image suits social and marketing content where you need sharp 2K-class imagery, vertical or custom aspect ratios, and optional reference-driven edits. Use it for Shorts/TikTok start frames, stylized thumbnails, storyboard stills, and e-commerce or brand visuals when you want to fuse or restyle existing shots.
When you maintain a reference image in the project or edit an existing scene frame, the model can treat that image as input for edits—similar to other multi-modal image APIs. Multi-reference fusion helps campaigns that need the same product in several environments or the same character with consistent identity across frames.
On VidMachine, add WAN 2.7 Image to your image model priority as primary or fallback. It sits at a moderate per-image credit cost relative to some premium options, so it is a practical choice when you want Alibaba's latest image stack without always paying the top tier.

Why this model

Choose WAN 2.7 Image when you want Alibaba's current general-purpose image stack with strong support for both pure text-to-image and reference-heavy edits, up to roughly 2K resolution. It is a good fit if you already use WAN 2.5 for video on VidMachine and want a related image option in the same ecosystem.
If you need maximum text-in-image fidelity or 4K stills, compare with Seedream 4.5 or Flux 2 Pro on VidMachine. If you want the lowest cost per image for photorealistic stills, Grok Image or Nano Banana may be enough. WAN 2.7 Image targets the middle ground: flexible sizing, multi-image workflows, and modern Wan 2.7 quality at a clear per-image price.

How VidMachine uses it

Select WAN 2.7 Image in your project's image model priority on VidMachine. We call the Replicate model wan-video/wan-2.7-image with your prompt, optional reference URLs (up to nine when the pipeline supplies them), and a size appropriate to your aspect ratio (for example vertical shorts). Each successful image generation deducts credits according to Pricing.
For AI Video, generated images feed start frames and scene imagery; for AI Images projects, images are produced the same way. You can stack WAN 2.7 Image with other models in priority order so that if one provider is slow or errors, the next model is tried automatically.
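Illustratively (this is not VidMachine's actual code), a priority chain reduces to trying each model in order and returning the first success. The fallback slug below is hypothetical.

    import replicate

    PRIORITY = [
        "wan-video/wan-2.7-image",       # primary, from this page
        "some-org/fallback-image-model", # hypothetical fallback slug
    ]

    def generate(prompt: str):
        last_error = None
        for model in PRIORITY:
            try:
                return replicate.run(model, input={"prompt": prompt})
            except Exception as exc:  # slow or failing provider: fall through to next
                last_error = exc
        raise RuntimeError("all models in the priority chain failed") from last_error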

What you should know

What is the difference between WAN 2.7 Image and WAN 2.7 Image Pro?
The Pro variant on Replicate emphasizes higher quality, 4K support, and additional options. VidMachine integrates the standard WAN 2.7 Image model at replicate.com/wan-video/wan-2.7-image.
How many reference images can I use?
The upstream model documents up to nine images for editing, style transfer, or fusion. VidMachine passes through references when your workflow provides them, capped to that limit.
How are WAN 2.7 Image credits charged on VidMachine?
Each generated image uses a fixed number of credits per image. See the Pricing page for the current rate (WAN 2.7 Image is priced per image, not per second).
Does VidMachine use image-set mode for multiple images at once?
We currently request a single output per generation call for pipeline simplicity. Image-set mode remains a capability of the underlying model for future product use.