Glossary · AI
Adversarial Machine Learning
Techniques attackers use to manipulate AI models through crafted inputs that cause incorrect predictions or behaviors.
Full definition
Adversarial machine learning exploits vulnerabilities in AI systems by introducing carefully designed perturbations to input data that humans cannot detect but cause models to fail. Attacks include evasion (fooling deployed models), poisoning (corrupting training data), and model extraction (stealing proprietary algorithms). Defense strategies involve adversarial training, input validation, and model robustness testing. Researchers demonstrated that adding imperceptible noise to stop sign images caused autonomous vehicles to misclassify them as speed limit signs, highlighting critical safety implications.
AIsecuritymachine learningvulnerability