Modality

Also known as: media modality, output modality, generative modality

A modality is the type of media a model produces: image, video, audio, 3D, or text. Generative-media platforms can route a single brief across multiple modalities, for example a still hero image and a 9:16 reel from the same brand kit, without the prompt being rewritten for each one.

Example

A virtualTryOn run produces output in two modalities from the same garment and body inputs: a still try-on (image) and a turnaround reel (video). The inputs are routed through a still-image model and a video model in parallel.
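The parallel routing in this example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any platform's actual API: the names Brief, Asset, ROUTES, still_image_model, video_model, run_brief, and virtual_try_on are all hypothetical, and the two model functions are stand-ins that return a text description instead of calling a real model.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    IMAGE = "image"
    VIDEO = "video"
    AUDIO = "audio"
    THREE_D = "3d"
    TEXT = "text"


@dataclass(frozen=True)
class Brief:
    # Hypothetical shared inputs for a try-on run.
    garment: str
    body: str


@dataclass(frozen=True)
class Asset:
    modality: Modality
    description: str


async def still_image_model(brief: Brief) -> Asset:
    # Stand-in for a real still-image model call (assumption).
    await asyncio.sleep(0)
    return Asset(Modality.IMAGE, f"still try-on of {brief.garment} on {brief.body}")


async def video_model(brief: Brief) -> Asset:
    # Stand-in for a real video model call (assumption).
    await asyncio.sleep(0)
    return Asset(Modality.VIDEO, f"turnaround reel of {brief.garment} on {brief.body}")


# Hypothetical routing table: one model per requested modality.
ROUTES = {
    Modality.IMAGE: still_image_model,
    Modality.VIDEO: video_model,
}


async def run_brief(brief: Brief, modalities: list[Modality]) -> list[Asset]:
    # The same brief goes to every model; the calls run concurrently,
    # and gather returns results in the order the modalities were requested.
    return await asyncio.gather(*(ROUTES[m](brief) for m in modalities))


def virtual_try_on(brief: Brief) -> list[Asset]:
    return asyncio.run(run_brief(brief, [Modality.IMAGE, Modality.VIDEO]))
```

Calling virtual_try_on(Brief("linen blazer", "model-01")) returns one Asset per modality. The point of the routing table is that the brief stays unchanged; only the list of requested modalities decides which models run.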

The four modalities that matter for brands

Image is the most mature modality, covering product shots, lifestyle imagery, banners, and try-on. Video made its leap in 2024-2025, with Sora, Veo, Runway Gen-4, and Kling delivering temporally coherent short-form clips. Audio (voiceovers, music beds, ambient sound) is now production-ready through token-based generation. 3D matured into a production modality in 2025, compressing modeling timelines from weeks to minutes. Most brand workflows mix at least two of these per campaign.
