Evaluation & SafetyDevelopersCTOs
Red Teaming
Deliberately attempting to elicit harmful, incorrect, or unintended behavior from an AI system — an adversarial evaluation practice borrowed from cybersecurity.
Deliberately attempting to elicit harmful, incorrect, or unintended behavior from an AI system — an adversarial evaluation practice borrowed from cybersecurity.