Anomaly Detection and Diagnosis for Cloud services: Practical experiments and lessons learned - LAAS - Laboratoire d'Analyse et d'Architecture des Systèmes
Journal article. Journal of Systems and Software. Year: 2018

Anomaly Detection and Diagnosis for Cloud services: Practical experiments and lessons learned

Abstract

The dependability of cloud computing services is a major concern of cloud providers. In particular, anomaly detection techniques are crucial for detecting anomalous service behaviors that may lead to violations of the service level agreements (SLAs) drawn up with users. This paper describes an anomaly detection system (ADS) designed to detect errors related to erroneous service behavior, as well as SLA violations, in cloud services. One major objective is to help providers diagnose the anomalous virtual machines (VMs) on which a service is deployed, as well as the type of error associated with the anomaly. Our ADS includes a system monitoring entity that collects software counters characterizing the cloud service, as well as a detection entity based on machine learning models. Additionally, a fault injection entity is integrated into the ADS for training the machine learning models. This entity is also used to validate the ADS and to assess its anomaly detection and diagnosis performance. We validated our ADS with two case study deployments: a NoSQL database, and a virtual IP Multimedia Subsystem implementing a virtual network function. Experimental results show that our ADS can achieve high detection and diagnosis performance.
Main file
SauvanaudJSS-Vauteur.pdf (791.28 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01864357 , version 1 (29-08-2018)

Identifiers

Cite

Carla Sauvanaud, Mohamed Kaâniche, Karama Kanoun, Kahina Lazri, Guthemberg Silvestre. Anomaly Detection and Diagnosis for Cloud services: Practical experiments and lessons learned. Journal of Systems and Software, 2018, 139, pp.84-106. ⟨10.1016/j.jss.2018.01.039⟩. ⟨hal-01864357⟩