Robustness of Image-based Malware Detection against Adversarial Attacks

In this project, we assess the robustness of image-based malware detection against adversarial attacks. To that end, we design and implement a lightweight CNN image-based model that detects Windows PE malware and classifies each sample according to the family it belongs to. It is worth noting that adversarial attacks, which are relatively easy to apply to images in the computer vision domain, are extremely difficult to apply to image representations of malware samples, because converting a perturbed image back to a binary risks breaking its functionality. We therefore select only attacks that preserve malware functionality and compare our approach against the state-of-the-art MalConv classifier under both white-box and black-box settings, performing four distinct adversarial attacks that maintain binary integrity.

Project Abstract: We present a reproducible framework that transforms Windows PE binaries into grayscale images, trains a compact CNN for family‑level malware classification, and systematically evaluates its resilience to four functionality‑preserving adversarial attacks, benchmarking against MalConv to demonstrate improved robustness with minimal overhead.

 



Project Description

Motivation

Machine and deep learning models have achieved outstanding performance in malware detection, but their security‑oriented deployments demand trustworthiness against adversarial manipulations. Prior work shows that carefully crafted perturbations can drastically degrade classifier accuracy, and simple transferability means even black‑box attacks can succeed. We are motivated to explore whether an image‑based representation of PE binaries can intrinsically resist these adversarial threats while preserving malware functionality.

Problem Scope

  • Domain constraints: Byte perturbations must preserve PE functionality (≤10% size increase).
  • Adversarial settings: Both white‑box (full model knowledge) and black‑box (query‑only) scenarios.
  • Evaluation target: Compare resilience of our CNN image‑based classifier against MalConv under four attacks.

Key Contributions

  • Design and implement a lightweight CNN for grayscale‑image malware classification with high accuracy and low overhead.
  • Perform four functionality‑preserving adversarial attacks (Random/Benign Byte Append and Random/Benign Byte FGSM) under both attack models.
  • Benchmark robustness and performance against MalConv, showing that the image‑based classifier suffers substantially lower evasion rates than MalConv under most attacks.

Background

PE File Format & Visualization: Windows PE binaries (with sections such as .text, .rdata, .data, and .rsrc) are read as uint8 byte arrays and reshaped into grayscale images for classification, revealing consistent family‑level visual patterns.
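To make the conversion concrete, below is a minimal Python sketch of such a byte-to-image transformation. The pad-to-square-then-resize strategy and the helper name pe_to_grayscale are illustrative assumptions; the text above only states that binaries are read as uint8 arrays and turned into normalized 100×100 grayscale images.

```python
import numpy as np
from PIL import Image

def pe_to_grayscale(path, size=(100, 100)):
    """Read a PE binary as a uint8 byte stream and render it as a normalized grayscale image."""
    data = np.fromfile(path, dtype=np.uint8)
    # Pad the byte stream to the nearest square so it can be reshaped into a 2-D array.
    side = int(np.ceil(np.sqrt(len(data))))
    padded = np.zeros(side * side, dtype=np.uint8)
    padded[:len(data)] = data
    # Resize the square byte image to the fixed CNN input resolution (an assumed strategy).
    img = Image.fromarray(padded.reshape(side, side), mode="L").resize(size)
    return np.asarray(img, dtype=np.float32) / 255.0  # pixel values in [0, 1]
```

The resulting array can then be fed to the classifier described under Proposed Approach.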

Adversarial ML in Malware: Attacks must craft byte perturbations that leave the program's functionality intact; prior image‑based attacks often break executables because pixel‑level changes do not map back to safe, localized byte modifications.
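As an illustration of why append-style perturbations are considered safe, the sketch below adds a payload after the end of the file (the PE overlay), where the bytes are never mapped into executable sections. This is a generic sketch under that assumption rather than the project's exact implementation, and append_payload is a hypothetical helper.

```python
import os

def append_payload(pe_path, out_path, payload: bytes, max_increase=0.10):
    """Append a payload after the end of a PE file (the overlay).

    Appended bytes are not loaded into the executable's mapped sections,
    so the program's original functionality is preserved.
    """
    with open(pe_path, "rb") as f:
        original = f.read()
    budget = int(len(original) * max_increase)  # respect the <=10% size-increase budget
    with open(out_path, "wb") as f:
        f.write(original + payload[:budget])
    return min(len(payload), budget)

# Random-byte variant: append_payload(src, dst, os.urandom(1 << 16))
# Benign-byte variant: draw the payload from the bytes of a known-benign executable
```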

Threat Model

An adversary aims to evade ML‑based malware classifiers by adding ≤10% byte perturbations to Windows PE samples, under two knowledge regimes:

  • White‑box: Full access to model architecture, parameters, and training data.
  • Black‑box: Only input‑output query access; no internal details.

Goal: Force misclassification of malicious samples as benign while ensuring binary functionality is preserved.

Proposed Approach

  1. Data Preprocessing: Convert PE binaries to normalized 100×100 grayscale images.
  2. CNN Training: Train a 3‑block convolutional network (16→32→64 filters, 3×3 kernels, 2×2 max‑pooling) followed by two dense layers, for 100 epochs with batch size 32; a model sketch is given after this list.
  3. Adversarial Generation: Apply four functionality‑preserving attacks (a masked FGSM sketch also follows the list):
    • Brute‑Force Random Byte Append
    • Brute‑Force Benign Byte Append
    • Random Byte FGSM
    • Benign Byte FGSM
  4. Robustness Evaluation: Measure evasion rates of our classifier and MalConv on a held‑out 20% validation split.
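A minimal Keras sketch of the classifier described in step 2 is shown below; the hidden dense width, padding mode, optimizer, and loss are assumptions not stated above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_image_malware_cnn(num_classes: int, input_shape=(100, 100, 1)):
    """Three convolutional blocks (16 -> 32 -> 64 filters) followed by two dense layers."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),  # hidden width is an assumption
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Training regime from the description above:
# model.fit(x_train, y_train, epochs=100, batch_size=32, validation_split=0.2)
```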
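For the FGSM variants in step 3, the sketch below illustrates a single gradient-sign step restricted, via a mask, to the image region that corresponds to appended bytes, so the original binary content stays untouched. The masking scheme and epsilon are illustrative assumptions, and the formulation used against MalConv (which operates on byte embeddings) differs.

```python
import tensorflow as tf

def fgsm_on_appended_region(model, image, label, append_mask, epsilon=0.1):
    """One FGSM step restricted to pixels that correspond to appended bytes.

    image:       (1, 100, 100, 1) normalized grayscale representation
    label:       integer class index of the sample, shape (1,)
    append_mask: same shape as image, 1.0 where pixels map to appended bytes
    """
    x = tf.convert_to_tensor(image, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        pred = model(x, training=False)
        loss = tf.keras.losses.sparse_categorical_crossentropy(label, pred)
    grad = tape.gradient(loss, x)
    # Only the appended region is perturbed; the original bytes (and functionality) are untouched.
    perturbed = x + epsilon * tf.sign(grad) * append_mask
    return tf.clip_by_value(perturbed, 0.0, 1.0)
```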

Key Findings

High‑Level Insight: Image‑based malware classification demonstrates strong intrinsic robustness to byte‑append attacks, outperforming MalConv in most adversarial scenarios.

Table 1: Classification accuracy and overhead comparison (image‑based vs. MalConv).
Table 2: Evasion rates of the four adversarial attacks on both classifiers.

  • Classification Accuracy: Our CNN achieved 96.30% accuracy vs 95.29% for MalConv, with similar training time (~12 min 50 s) and lower RAM usage (36.8% vs 43.6%).
  • Random Append Attack: 54.66% evasion against MalConv vs. 5.66% against the image‑based classifier.
  • Benign Append Attack: 44.22% (MalConv) vs. 5.11% (image‑based).
  • Random FGSM: 55.18% (MalConv) vs. 100% (image‑based), reflecting vulnerability to gradient‑based attacks crafted in the embedding space.
  • Benign FGSM: 55.19% (MalConv) vs. 46.69% (image‑based).

Conclusion

Our lightweight CNN image‑based classifier not only matches MalConv in detection performance but also dramatically reduces evasion rates under byte‑append attacks. Gradient‑based FGSM remains a challenge due to end‑to‑end differentiability in the embedding space. Future work will explore appending or perturbing bytes in non‑terminal PE sections, as well as defense mechanisms tailored to image representations.

Project Team Members

Yassine Mekdad
Graduate Research Assistant
Ahmet Aris
Postdoctoral Associate
Abbas Acar
Postdoctoral Associate
Leonardo Babun
Adjunct Professor
Güliz Seray Tuncay
Senior Research Scientist
Nasir Ghani
Full Professor
Selcuk Uluagac
Eminent Scholar Chaired Professor

Publications:

  • Yassine Mekdad, Faraz Naseem, Ahmet Aris, Harun Oz, Abbas Acar, Leonardo Babun, Selcuk Uluagac, Güliz Seray Tuncay, and Nasir Ghani. “On the robustness of image-based malware detection against adversarial attacks.” In Network Security Empowered by Artificial Intelligence, pp. 355-375. Cham: Springer Nature Switzerland, 2024. [pdf] [bibtex]

Presentations and Talks:

  • Harun Oz, Faraz Naseem, Ahmet Aris, Abbas Acar, Guliz Seray Tuncay, and A. Selcuk Uluagac. “Poster: feasibility of malware visualization techniques against adversarial machine learning attacks.” In 43rd IEEE Symposium on Security and Privacy (S&P). 2022. [poster] [bibtex]

 

Project Funding

We gratefully acknowledge the partial funding provided by our sponsors. The views expressed are those of the authors and do not necessarily reflect those of the supporting organizations.