Experience Report: Log Mining using Natural Language Processing and Application to Anomaly Detection

Abstract: Event logging is a key source of information on a system's state. Reading logs provides insight into the system's activity, helps assess whether it is in a correct state, and makes it possible to diagnose problems. However, reading does not scale: as the number of machines keeps rising and systems grow more complex, auditing system health from logfiles is becoming overwhelming for system administrators. This observation has led to many proposals for automating the processing of logs. However, most of these proposals still require some human intervention, for instance tagging logs or parsing the source files that generate the logs. In this work, we target minimal human intervention for logfile processing and propose a new approach that treats logs as regular text (as opposed to related work that seeks to make the best of the little structure imposed by log formatting). This approach makes it possible to leverage modern techniques from natural language processing. More specifically, we first apply a word embedding technique based on Google's word2vec algorithm: the words of the logfiles are mapped to a high-dimensional metric space, which we then exploit as a feature space using standard classifiers. The resulting pipeline is very generic, computationally efficient, and requires very little intervention. We validate our approach by seeking stress patterns on an experimental platform. Results show strong predictive performance (≈ 90% accuracy) using three out-of-the-box classifiers.
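To make the shape of the pipeline described in the abstract concrete, here is a minimal standard-library sketch. It is not the authors' implementation. The embedding is a simple co-occurrence count that stands in for word2vec, averaging word vectors into one feature per log is an assumption (the abstract does not say how logs are aggregated), and a cosine nearest-centroid rule stands in for the "standard classifiers". All function names and the toy log lines are hypothetical.

```python
import math
from collections import defaultdict


def tokenize(line):
    # Treat a log line as regular text: lowercase, split on whitespace.
    return line.lower().split()


def build_embedding(corpus_lines, window=2):
    # Stand-in for word2vec (NOT the real algorithm): each word is mapped to
    # the counts of the words seen within `window` positions of it.
    vocab = sorted({w for line in corpus_lines for w in tokenize(line)})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0.0] * len(vocab) for w in vocab}
    for line in corpus_lines:
        tokens = tokenize(line)
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][index[tokens[j]]] += 1.0
    return vectors


def embed_log(log_lines, vectors):
    # Assumed aggregation: a log is the average of its known words' vectors.
    dim = len(next(iter(vectors.values())))
    total, count = [0.0] * dim, 0
    for line in log_lines:
        for word in tokenize(line):
            if word in vectors:
                count += 1
                for k, x in enumerate(vectors[word]):
                    total[k] += x
    return [x / count for x in total] if count else total


def cosine(u, v):
    # Cosine similarity; defined as 0.0 when either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fit_centroids(features, labels):
    # Nearest-centroid classifier: one mean feature vector per label.
    groups = defaultdict(list)
    for feature, label in zip(features, labels):
        groups[label].append(feature)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()}


def predict(feature, centroids):
    # Pick the label whose centroid is most similar to the feature vector.
    return max(centroids, key=lambda label: cosine(feature, centroids[label]))


# Hypothetical toy logs, invented for illustration only.
train_logs = [
    (["service started ok", "request served ok"], "normal"),
    (["service request served ok"], "normal"),
    (["cpu overload warning", "memory overload warning"], "stress"),
    (["cpu memory overload warning"], "stress"),
]
corpus = [line for log, _ in train_logs for line in log]
vectors = build_embedding(corpus)
centroids = fit_centroids([embed_log(log, vectors) for log, _ in train_logs],
                          [label for _, label in train_logs])
print(predict(embed_log(["memory overload warning"], vectors), centroids))  # prints "stress"
```

The point of the sketch is only the data flow: no log parsing or tagging, words mapped to vectors, logs mapped to points in that vector space, and an off-the-shelf classifier on top. In practice the embedding step would be replaced by an actual word2vec implementation and the classifier by a library one.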
Document type:
Conference paper
28th International Symposium on Software Reliability Engineering (ISSRE 2017), Oct 2017, Toulouse, France. 10p., 2017

Cited literature: 25 references

https://hal.laas.fr/hal-01576291
Contributor: Carla Sauvanaud
Submitted on: Tuesday, 22 August 2017 - 17:56:11
Last modified on: Thursday, 29 March 2018 - 10:46:02

File

PID4955309.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01576291, version 1

Citation

Christophe Bertero, Matthieu Roy, Carla Sauvanaud, Gilles Trédan. Experience Report: Log Mining using Natural Language Processing and Application to Anomaly Detection. 28th International Symposium on Software Reliability Engineering (ISSRE 2017), Oct 2017, Toulouse, France. 10p., 2017. 〈hal-01576291〉

Metrics

Record views: 416

File downloads: 1507