• Mattia Gualandi
• Alex Presepi
• Alessandro Sciarrillo
Abstract
The growing integration of deep learning models into critical domains has brought urgent
attention to their vulnerabilities. Among the most concerning threats are adversarial attacks
on image-based AI systems: subtle perturbations to input images that can mislead even
highly accurate models into making incorrect predictions, often while evading detection.
These attacks not only challenge the technical robustness of AI models but also raise
significant ethical concerns, especially when human safety, fairness, and accountability
are at stake.
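For concreteness, the mechanism can be illustrated with a minimal PyTorch sketch of one canonical attack, the Fast Gradient Sign Method (FGSM): each pixel is nudged by a small step in the direction of the loss gradient. The function name and parameters below are illustrative, not drawn from a specific implementation, and pixel values are assumed to lie in [0, 1].

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.03):
    """Craft an adversarial example with one signed-gradient step (FGSM).

    image: batched input tensor with values in [0, 1]
    label: tensor of true class indices
    """
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Move every pixel by epsilon in the direction that increases the loss.
    adv = image + epsilon * image.grad.sign()
    # Keep the result a valid image.
    return adv.clamp(0.0, 1.0).detach()
```

For small values of epsilon, the perturbation is typically imperceptible to a human observer, yet a classifier's prediction on the perturbed image often flips.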
In this project, we explore the nature of adversarial attacks on image data, aiming
to deepen our understanding of their mechanisms and implications. We review and
categorize popular attack techniques, evaluate their impact on standard vision models,
and assess the effectiveness of models we designed and trained to counteract
the considered attacks. Moreover, we aim to build a module capable of recognizing
whether an image has been adversarially corrupted. By framing the problem from both a
technical and an ethical standpoint, we highlight the importance of designing AI systems
that are not only performant but also resilient and trustworthy in adversarial settings.
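The abstract does not fix an architecture for the detection module; one plausible shape, sketched below purely as an assumption, is a small binary CNN trained to separate clean images from attacked ones. All layer sizes and the class convention (0 = clean, 1 = perturbed) are hypothetical choices for illustration.

```python
import torch.nn as nn

class PerturbationDetector(nn.Module):
    """Minimal binary classifier: clean (0) vs. adversarially perturbed (1)."""

    def __init__(self):
        super().__init__()
        # Small convolutional feature extractor; sizes are illustrative.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Two-way logits over {clean, perturbed}.
        self.head = nn.Linear(64, 2)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))
```

Such a detector would be trained on pairs of clean images and their attacked counterparts generated by the attacks under study.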
Outcomes