Malware Detection in PDF Files Using Machine Learning - LAAS - Laboratoire d'Analyse et d'Architecture des Systèmes
Contract Report (Research Report) Year: 2018

Malware Detection in PDF Files Using Machine Learning

Abstract

In this report we present how we used machine learning techniques to detect malicious behaviours in PDF files. To this end, we first set up an SVM (Support Vector Machine) classifier that was able to detect 99.7% of malware. However, this classifier was easy to fool with malicious PDFs that we forged to look like clean ones. We first proposed a very naive attack, which was easily stopped by establishing a threshold. We also implemented a gradient-descent attack to evade this SVM. This attack was almost 100% successful. To fix this problem, we provided counter-measures to the latter attack. A more elaborate feature selection, together with the use of a threshold, allowed us to stop up to 99.99% of these attacks. Finally, using adversarial learning techniques, we were able to prevent gradient-descent attacks by iteratively feeding the SVM with forged malicious PDFs. We found that after 3 iterations, every PDF forged by gradient descent was detected, completely preventing the attack.
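
To make the gradient-descent attack and the threshold counter-measure mentioned in the abstract more concrete, here is a minimal Python sketch. It is not the report's implementation: the feature names, weights, bias, step size, and threshold value are all hypothetical, and a fixed linear model stands in for a trained SVM. It assumes an attacker who can only add content to a PDF (feature counts may grow but never shrink), and a defence that flags any file whose feature counts exceed a fixed cap.

```python
from typing import List

# Hypothetical linear SVM: score(x) = W . x + B, and score > 0 means "malicious".
# Feature names, weights, and bias are illustrative, not taken from the report.
FEATURES = ["/JavaScript", "/OpenAction", "/Page", "/Font"]
W = [2.0, 1.5, -0.5, -0.25]
B = -1.0


def score(x: List[float]) -> float:
    """Decision value of the (assumed) linear classifier."""
    return sum(wi * xi for wi, xi in zip(W, x)) + B


def gradient_descent_evasion(x0: List[float], step: float = 1.0,
                             max_iter: int = 100) -> List[float]:
    """Lower the score with gradient steps while only ever adding content.

    For a linear model the gradient of the score with respect to x is W.
    Each step is projected back onto x >= x0, modelling an attacker who
    adds objects to the PDF but keeps the original payload.
    """
    x = list(x0)
    for _ in range(max_iter):
        if score(x) < 0:
            break  # the classifier now labels the file as clean
        x = [max(x0i, xi - step * wi) for x0i, xi, wi in zip(x0, x, W)]
    return x


def exceeds_threshold(x: List[float], limit: float) -> bool:
    """Assumed counter-measure: flag files with abnormally large feature counts."""
    return any(xi > limit for xi in x)


malicious = [1.0, 1.0, 1.0, 2.0]
forged = gradient_descent_evasion(malicious)

print("original score:", score(malicious))
print("forged features:", dict(zip(FEATURES, forged)))
print("forged score:", score(forged))
print("flagged by threshold:", exceeds_threshold(forged, limit=3.0))
```

In this toy setting the attack only inflates the features with negative weights, which is why a simple cap on feature values can catch the forged file even though its score has dropped below zero.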
Main file
Rapport_REDOCS_PDF_Malware.pdf (327.56 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01704766, version 1 (08-02-2018)
hal-01704766, version 2 (20-08-2018)

Identifiers

  • HAL Id: hal-01704766, version 1

Cite

Bonan Cuan, Aliénor Damien, Claire Delaplace, Mathieu Valois. Malware Detection in PDF Files Using Machine Learning. [Research Report] Rapport LAAS n° 18030, REDOCS. 2018, 16p. ⟨hal-01704766v1⟩

Collections

LARA
1929 Views
5268 Downloads
