Multimodal Model

Also: vision-language model

A model that can process, and in some cases generate, more than one type of data: typically text and images, and sometimes also audio, video, or code.
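
To make "multiple types of data" concrete, here is a minimal sketch of how a mixed text-and-image input might be represented before it reaches such a model. The names `TextPart`, `ImagePart`, and `modalities` are hypothetical and chosen for illustration only; they do not correspond to any particular library or vendor API.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class TextPart:
    """A piece of text in the input (hypothetical type)."""
    text: str


@dataclass
class ImagePart:
    """An image in the input (hypothetical type)."""
    data: bytes       # raw image bytes
    media_type: str   # e.g. "image/png"


# A multimodal input is an ordered mix of parts of different types.
Part = Union[TextPart, ImagePart]


def modalities(parts: List[Part]) -> List[str]:
    """Return the distinct modalities present, in order of first appearance."""
    seen: List[str] = []
    for part in parts:
        name = "text" if isinstance(part, TextPart) else "image"
        if name not in seen:
            seen.append(name)
    return seen


# Placeholder bytes stand in for a real image file.
prompt = [
    TextPart("What is shown in this picture?"),
    ImagePart(data=b"\x89PNG...", media_type="image/png"),
]
```

A text-only model would accept only the `TextPart` entries; a multimodal model is one that can take the whole list as a single input.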