Repository for the 4th-year project in Applied Mathematics at INSA Toulouse on the subject "Evaluation of the performance of the new statistical tool".
Paul-Corbalan 07f1cac62a Add dependencies to readme 2022-05-21 17:14:03 +02:00
data Import rainfall data 2022-02-08 10:00:12 +01:00
.gitignore Start code with Poisson 2022-02-01 10:48:52 +01:00
Comparaison_of_methods.pdf Create Comparaison_of_methods.pdf 2022-04-18 20:13:29 +02:00
Comparaison_of_methods.rmd Update Comparaison_of_methods.rmd 2022-05-16 10:30:03 +02:00
Dataset_study.rmd Update Dataset_study.rmd 2022-04-19 08:09:37 +02:00
README.md Add dependencies to readme 2022-05-21 17:14:03 +02:00


Detection of anomalies using the Scan Statistic and Local Score methods

Abstract

Researchers in mathematics are regularly looking for new scan methods to detect anomalies in datasets. Scan methods have wide applications in medicine, actuarial science and finance. Previous work has sought to reveal irregularities within a discrete sequence of events using the local score method, defined by the log-likelihood ratio. However, these methods do not provide a continuous-time scan on an interval from 0 to T. We propose a new way of applying the scan statistic and the Local Score to continuous-time data in order to detect anomalies more accurately, and we compare the performance of the two algorithms. By simulating multiple samples of Poisson processes, we compute the scan statistic, that is, the maximal number of occurrences of an event within a window of a given size. Then we compute the times between events in these sequences, compute the local score, and compare the two results in terms of the significance of the detected occurrences. Results show that both methods have good prediction quality. The Scan statistic method is more accurate when the window size is known. Local Score outperforms the Scan statistic method when random values are used for the window size. Applications to real datasets, such as records of epileptic seizures, open the way for further use of these methods.
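The pipeline described in the abstract can be illustrated with a minimal base-R sketch. It is not the code of Comparaison_of_methods.rmd: the function names, the rates `lambda0` and `lambda1`, the anomalous segment [40, 50] and the window size `w` are all assumptions chosen for illustration, and the local score is computed with a plain Lindley recursion rather than with the localScore package.

```r
# Illustrative sketch only: names and parameter values are assumptions,
# not taken from the project's code.

# Event times of a homogeneous Poisson process of rate `lambda` on [a, b].
simulate_poisson <- function(lambda, a, b) {
  n <- rpois(1, lambda * (b - a))
  sort(runif(n, min = a, max = b))
}

# Scan statistic: maximal number of events in any window of length w.
# The maximum is reached by a window that starts at an event time,
# so it is enough to count the events in [t_i, t_i + w] for each t_i.
scan_statistic <- function(times, w) {
  if (length(times) == 0) return(0L)
  times <- sort(times)
  counts <- findInterval(times + w, times) - seq_along(times) + 1L
  max(counts)
}

# Log-likelihood ratio score of an inter-event time x:
# Exp(lambda1) (anomalous) against Exp(lambda0) (background).
llr_score <- function(x, lambda0, lambda1) {
  log(lambda1 / lambda0) - (lambda1 - lambda0) * x
}

# Local score: maximal sum of scores over all contiguous segments,
# computed with the Lindley recursion H_k = max(0, H_{k-1} + s_k).
local_score <- function(scores) {
  h <- 0
  best <- 0
  for (s in scores) {
    h <- max(0, h + s)
    best <- max(best, h)
  }
  best
}

set.seed(1)
T_end <- 100   # observation interval [0, T_end] (assumed)
lambda0 <- 1   # background rate (assumed)
lambda1 <- 5   # rate inside the anomalous segment (assumed)
w <- 10        # scan window size (assumed)

# Background process plus extra events on [40, 50], which gives a total
# rate of lambda1 on that segment.
times <- sort(c(simulate_poisson(lambda0, 0, T_end),
                simulate_poisson(lambda1 - lambda0, 40, 50)))

scan_value <- scan_statistic(times, w)
ls_value <- local_score(llr_score(diff(times), lambda0, lambda1))
```

In this sketch `scan_value` depends on the chosen window size `w`, whereas `ls_value` only needs the two rates, which is the trade-off the abstract discusses. Assessing significance (p-values) is left out here.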

Keywords: scan statistic, Poisson process, local score, log-likelihood ratio, detection of atypical regions, window size

Dependencies

R libraries

To run Comparaison_of_methods.rmd and compile it to PDF, first install the required packages by running the following command in an R console:

install.packages(c("localScore", "latex2exp", "Rcpp", "caret", "ROCR", "pROC"))
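As a possible alternative, the sketch below installs only the packages that are missing and then renders the report. The helper names `missing_packages` and `build_report` are hypothetical, and the sketch assumes that the rmarkdown package and a LaTeX distribution are available for PDF output, neither of which is listed in the dependencies above.

```r
# Packages listed in this README.
required <- c("localScore", "latex2exp", "Rcpp", "caret", "ROCR", "pROC")

# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install what is missing, then render the report to PDF.
# Assumes rmarkdown and a LaTeX distribution are installed.
build_report <- function(input = "Comparaison_of_methods.rmd") {
  to_install <- missing_packages(required)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  rmarkdown::render(input, output_format = "pdf_document")
}

# Usage (run from the repository root):
# build_report()
```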