Ensuring Trustworthy Medical AI: Enhancing Robustness and Ethical Integrity of Image Classifiers Against Adversarial Attacks

Matteo Fasulo  •  Luca Babboni  •  Maksim Omelchenko
abstract

Deep learning models for medical image classification are critically vulnerable to adversarial attacks: imperceptible input manipulations that cause misclassifications with potentially severe clinical consequences [1]. This project addresses the threat by systematically evaluating the impact of attacks such as the Fast Gradient Sign Method (FGSM) on a ResNet [2] model trained on public medical datasets (e.g., PatchCamelyon). We will develop and benchmark robust countermeasures, including adversarial training and advanced input purification techniques. Our comprehensive benchmarking framework will measure how defense techniques affect model performance across multiple dimensions: classification accuracy on clean and adversarial samples, computational overhead during training and inference, memory requirements, and processing latency. These metrics will provide a holistic assessment of the trade-offs between security, performance, and computational efficiency in clinical deployment scenarios. Our aim is to provide practical strategies for fortifying medical image classifiers, enhancing their safety, reliability, and ethical integrity in clinical settings.
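
As a concrete illustration of the attack family we will evaluate first, the sketch below shows a minimal single-step FGSM perturbation in PyTorch. It is illustrative only: the model, inputs x, labels y, and budget epsilon are placeholder assumptions, not components of our final framework.

import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon):
    # Single-step FGSM: x_adv = x + epsilon * sign(grad_x loss(model(x), y)).
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Take one gradient-sign step, then clamp back to the valid pixel range.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

A defense such as adversarial training would then fit the classifier on a mix of clean samples and perturbed samples generated on the fly by a routine like the one above.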

outcomes