--- title: "Comparaison of methods" output: pdf_document --- # Scan statistique - Méthode de Monte Carlo et calcul de p-value ## Import libraries ```{r} library("localScore") library("latex2exp") library("Rcpp") library("caret") library("ROCR") library("pROC") ``` ## 1. Proposition for simulations under $\mathcal{H}_1$ In this part, we propose a method that simulates a Poisson process under the hypothesis $\mathcal{H}_1$. The idea is to simulate a sample under $\mathcal{H}_0$, and add randomly a subsequence under the alternative hypothesis in this sequence. The function `PoissonProcess` creates a sequence of Poisson process of a parameter lambda ```{r} PoissonProcess <- function(lambda,T) { return(sort(runif(rpois(1,lambda*T),0,T))) } ``` The following function creates a sequence under H0 and add a sequence under H1. ```{r} SimulationH1 <- function(lambda0, lambda1,T,tau){ ppH0=PoissonProcess(lambda0,T) ppH1.segt=PoissonProcess(lambda1,tau) dbt=runif(1,0,T-tau) ppH0bis=PoissonProcess(lambda0,T) ppH1.repo=dbt+ppH1.segt ppH0_avant=ppH0bis[which(ppH0bisppH1.repo[length(ppH1.repo)])] ppH1=c(ppH0_avant,ppH1.repo,ppH0_apres) return (ppH1) } ``` `TimeBetweenEvent` compute Time Between Event for a `pp` interval. ```{r} TimeBetweenEvent <- function(pp){ n=length(pp) tbe=pp[2:n]-pp[1:n1-1] tbe=c(0,tbe) return (tbe) } DataFrame <- function(pp,tbe){ list=data.frame(ProcessusPoisson=pp, TimeBetweenEvent=tbe) } ``` ## 2. Simulation of the sequences under $\mathcal{H}_0$ via a Monte Carlo Method In this part, we will try to simulate, using a Monte Carlo method, a set of $10^5$ independant samples, under the assumption that $\lambda=\lambda_0$, hence, that we are under the null hypothesis $\mathcal{H}_0$. The function `ScanStat` compute the scan statistic for a sequence, given some parameters $T$ and $\tau$. This function returns the value of the scan stat, and the index of the sequence where it happens ```{r} ScanStat <- function(pp, T, tau){ n=length(pp) stop=n-length(which(pp>(T-tau))) ScanStat=0 for (i in (1:stop)) { x=which((pp>=pp[i])&(pp<=(pp[i]+tau))) scan=length(x) if (scan>ScanStat) {ScanStat=scan max=i} } return (c(max,ScanStat)) } ``` We test the scan statistic method for different values of $\lambda_0$. The method of scan statistic we implemented will allow us to have access to the scan test statistic and where it happens in the sequence. This function `EmpDistrib` compute the empirical distribution using a Monte Carlo estimator for the scan statistic method. It returns a Data Frame, containing the value of the scan, its probability and the value of its cumulative distribution function. ```{r} EmpDistrib <- function(lambda, n_sample,T,tau){ pp=PoissonProcess(lambda,T) scan=c(ScanStat(pp,T, tau)[2]) index=c(ScanStat(pp,T, tau)[1]) for (i in 2:(n_sample)){ pp=PoissonProcess(lambda,T) scan=rbind(scan,ScanStat(pp,T, tau)[2]) index=rbind(index,ScanStat(pp,T, tau)[1]) } scan=unlist(scan) min_scan=min(scan)-1 max_scan=max(scan) table1=table(factor(scan, levels = min_scan:max_scan)) EmpDis=data.frame(cdf=cumsum(table1)/sum(table1), proba=table1/sum(table1), index_scan=min_scan:max_scan) EmpDis<-EmpDis[,-2] return(EmpDis) } ``` This function plot the cumulative distribution function associated to an empirical distribution function ```{r} Plot_CDF <- function(lambda,n_sample,T,tau){ Emp=EmpDistrib(lambda,n_sample,T,tau) title=TeX(paste(r'(Cumulative distribution function for $\lambda=$)', lambda)) plot(Emp$index_scan, Emp$cdf,type="s",xlab="Number of occurrences",ylab="Probability", main=title, col="red") return(Emp) } ``` ### 2.1. Test of $\mathcal{H}_0: \lambda=\lambda_0$ against $\mathcal{H}_0: \lambda=\lambda_1$, where $\lambda_1 > \lambda_0$ In this part, we will test different values for $\lambda_0$ and $\lambda_1$, and compute the probability of occurrence of a certain scan statistic. ```{r} #Empiricial distribution under H0 n_sample=10**4 lambda0=3 T=10 tau=1 ppH0=PoissonProcess(lambda0,T) CDF=Plot_CDF(lambda0,n_sample,T,tau) ``` Compute $p$-value for scan statistic of `ppH1` with `Emp`: ```{r} PValue <- function(Emp,ppH, T, tau){ SS = ScanStat(ppH,T,tau) scanH = SS[2] index_scanH = SS[1] index = Emp$index_scan n=length(index) if (scanH< min(Emp$index_scan)){ return (c(scanH,1,index_scanH)) } else{ if(min(Emp$index_scan)