From cf39a3c6f327a057ae03cefced4d69b225023835 Mon Sep 17 00:00:00 2001
From: Paul-Corbalan <58653590+Paul-Corbalan@users.noreply.github.com>
Date: Mon, 23 May 2022 22:54:25 +0200
Subject: [PATCH] Update Abstract in readme

---
 README.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index cefd253..a10d067 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,9 @@
 # Detection of anomalies using the Scan Statistic and Local Score methods
 
 ## Abstract
-Mathematical researchers are regularly looking for new scan methods to detect anomalies in datasets. Scan methods have wide applications in medicine, actuary science or finance. Previous work has tried to find methods to reveal irregularities within a discrete sequence of events using the local score method, defined by the log-likelihood ratio. However, these methods have not proposed a time-continuous scan on an interval of $0$ to $T$. We propose a new method of scan statistic and the Local Score to apply on continuous data and to detect anomalies more accurately. We compare the performance of both algorithms. By simulating multiple samples of Poisson processes, we compute the scan statistic on a window of a given size that represents the maximal number of occurrences of an event. Then, we compute the time between events on these sequences, compute the local score and compare the two results, in terms of significance of occurrences. Results show that both methods have a good prediction quality. The Scan statistic method performs more accurately when knowing the window-size. Local Score outperforms the Scan statistic method when using random values for the window size. Applications on real datasets, such as epileptic seizures are an opening for the use of these methods.
+Mathematical researchers are regularly looking for new scan methods to detect atypical segments in datasets. Scan methods have wide applications in medicine or actuary science. Previous work has tried to find methods to reveal irregularities within a discrete sequence of events using the local score method, defined by the log-likelihood ratio. However, these methods have not proposed a time-continuous scan on an interval from $0$ to $T$. We propose a new method using scan statistic and Local Score to apply on continuous data and to detect atypical segments. We compare the performance of both algorithms. We simulate three sets of $10^4$ Poisson processes of a certain parameter $\lambda$. We use one set to estimate the empirical distribution of the scan statistic, using a Monte Carlo estimator. On the same set, we compute the sequence of times between events to estimate the score distribution. With the two other sets, we associate to each scan or score value, its p-value, that states if the detected segment is atypical. We conduct this experiment, knowing the value of the scan window, and then by choosing it at random. Results show that both methods have a good prediction quality. The Scan statistic method performs more accurately when knowing the window-size. Local Score outperforms the Scan statistic method when using random values for the window size. Applications on real datasets, such as epileptic seizures, are an opening for the use of these methods.
 
-**Keywords:** scan statistic, Poisson process, local score, log-likelihood ratio, detection of atypical regions, window size
+
+**Keywords:** scan statistic, Poisson process, local score, detection of atypical regions, window size
 
 ## Dependencies