AI Codex

Quantization

Reducing the numerical precision of model weights (e.g., from 32-bit floats to 8-bit integers) to decrease memory usage and speed up inference — typically trading a small accuracy loss for a large efficiency gain.
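
A minimal sketch of the idea using symmetric per-tensor int8 quantization; the function names and the per-tensor scaling scheme here are illustrative, not a specific library's API:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    # Symmetric per-tensor scheme: map the float32 range to int8 in [-127, 127].
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover approximate float32 weights from the stored int8 values.
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# int8 storage is 4x smaller than float32; per-weight rounding
# error is bounded by scale / 2.
```

Real deployments often use finer-grained schemes (per-channel or per-group scales, 4-bit formats) to keep the rounding error small on outlier-heavy weight distributions.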