---
title: "Comparison of methods"
output: pdf_document
---

# Scan statistic - Monte Carlo method and p-value computation

## Import libraries
```{r}
library("localScore")
library("latex2exp")
library("Rcpp")
library("caret")
```

## 1. Proposition for simulations under $\mathcal{H}_1$

In this part, we propose a method to simulate a Poisson process under the hypothesis $\mathcal{H}_1$. The idea is to simulate a sample of rate $\lambda_0$ under $\mathcal{H}_0$ and to insert, at a random position, a segment of length $\tau$ simulated with the rate $\lambda_1$ of the alternative hypothesis.
```{r}
PoissonProcess <- function(lambda,T) {
  return(sort(runif(rpois(1,lambda*T),0,T)))
}

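SimulationH1 <- function(lambda0, lambda1,T,tau){
    # Segment of length tau simulated under H1, shifted to a random start dbt
    ppH1.segt=PoissonProcess(lambda1,tau)
    dbt=runif(1,0,T-tau)
    ppH1.repo=dbt+ppH1.segt
    # Background sequence simulated under H0
    ppH0bis=PoissonProcess(lambda0,T)
    # If the H1 segment contains no event, keep the H0 sequence unchanged
    if (length(ppH1.repo)==0) {return (ppH0bis)}
    # Keep the H0 events located before the first and after the last H1 event
    ppH0_avant=ppH0bis[which(ppH0bis<ppH1.repo[1])]
    ppH0_apres=ppH0bis[which(ppH0bis>ppH1.repo[length(ppH1.repo)])]
    ppH1=c(ppH0_avant,ppH1.repo,ppH0_apres)
    return (ppH1)
}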
```
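
As a quick sanity check of the simulation idea above, the sketch below verifies two properties of a homogeneous Poisson process on $[0,T]$: the event times are sorted and lie in $[0,T]$, and the mean number of events is close to $\lambda T$. The helper `simulate_pp` is a hypothetical stand-alone copy of the `PoissonProcess` construction, and the seed, rate and number of replications are arbitrary choices made for illustration.

```{r}
set.seed(1)  # arbitrary seed, only for reproducibility of this illustration

# Hypothetical stand-alone helper: Poisson number of events, uniform positions on [0, T_end]
simulate_pp <- function(lambda, T_end) {
  sort(runif(rpois(1, lambda * T_end), 0, T_end))
}

lambda_demo <- 3
T_demo <- 10
n_rep <- 2000

# Number of events in each simulated sequence
counts <- vapply(seq_len(n_rep), function(i) length(simulate_pp(lambda_demo, T_demo)), integer(1))

# Checks on a single sequence: sorted event times, all inside [0, T_demo]
one_seq <- simulate_pp(lambda_demo, T_demo)
sorted_ok <- !is.unsorted(one_seq)
range_ok <- all(one_seq >= 0 & one_seq <= T_demo)

# The empirical mean should be close to lambda * T = 30
mean(counts)
```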


```{r}
TimeBetweenEvent <- function(pp){
    # Inter-event times; the first event is given a waiting time of 0
    tbe=c(0,diff(pp))
    return (tbe)
}

DataFrame <- function(pp,tbe){
    # Event times and inter-event times side by side
    return (data.frame(ProcessusPoisson=pp, TimeBetweenEvent=tbe))
}
```

## 2. Simulation of the sequences under $\mathcal{H}_0$ via a Monte Carlo Method
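In this part, we use a Monte Carlo method to simulate a large set of independent samples (the code below uses $10^4$) under the assumption that $\lambda=\lambda_0$, that is, under the null hypothesis $\mathcal{H}_0$. The scan statistic of a sequence is the largest number of events observed in a window of length $\tau$.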
```{r}
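ScanStat <- function(pp, T, tau){
    n=length(pp)
    # Only events at or before T-tau can start a complete window of length tau
    stop=n-length(which(pp>(T-tau)))
    best_scan=0
    best_index=0
    for (i in seq_len(stop)) {
        # Number of events in the window [pp[i], pp[i]+tau]
        scan=length(which((pp>=pp[i])&(pp<=(pp[i]+tau))))
        if (scan>best_scan) {
            best_scan=scan
            best_index=i
        }
    }
    # Index of the first window reaching the maximum, and the maximum itself
    return (c(best_index,best_scan))
}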
```
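
To make the definition concrete, here is a small worked example on a hand-made sequence. It is a self-contained sketch: `window_counts` and the toy values are hypothetical names and data chosen for illustration. For simplicity every event is allowed to start a window; all toy events lie before $T-\tau$ for $T=10$ and $\tau=1$, so the restriction used in `ScanStat` would not change the result.

```{r}
# Toy event times on [0, 10] (made-up values)
pp_demo <- c(0.5, 1.2, 1.4, 1.9, 4.0, 7.3)
tau_demo <- 1

# For each event, count the events falling in the window [t, t + tau]
window_counts <- vapply(pp_demo, function(t) sum(pp_demo >= t & pp_demo <= t + tau_demo), integer(1))

window_counts                             # 3 3 2 1 1 1
scan_demo <- max(window_counts)           # scan statistic: 3
first_window <- which.max(window_counts)  # first window reaching the maximum: 1
```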

We test the scan statistic for different values of $\lambda_0$. The implementation above returns both the value of the scan statistic and the index of the event that starts the window where it is reached.
```{r}
EmpDistrib <- function(lambda, n_sample,T,tau){
    # Scan statistic of n_sample independent sequences simulated under H0
    scan=vapply(seq_len(n_sample), function(i) ScanStat(PoissonProcess(lambda,T),T,tau)[2], numeric(1))
    # Start one level below the observed minimum so that the CDF starts at 0
    min_scan=min(scan)-1
    max_scan=max(scan)
    table1=table(factor(scan, levels = min_scan:max_scan))
    # Empirical CDF and probabilities, as plain numeric columns
    EmpDis=data.frame(cdf=as.numeric(cumsum(table1))/sum(table1), proba=as.numeric(table1)/sum(table1), index_scan=min_scan:max_scan)
    return(EmpDis)
}
```
```{r}
Plot_CDF <- function(lambda,n_sample,T,tau){
    Emp=EmpDistrib(lambda,n_sample,T,tau)
    title=TeX(paste(r'(Cumulative distribution function for $\lambda=$)', lambda))
    plot(Emp$index_scan, Emp$cdf,type="s",xlab="Number of occurrences",ylab="Probability", main=title, col="red")
    return(Emp)
}
```
### 2.1. Test of $\mathcal{H}_0: \lambda=\lambda_0$ against $\mathcal{H}_1: \lambda=\lambda_1$, where $\lambda_1 > \lambda_0$
In this part, we try different values of $\lambda_0$ and $\lambda_1$, and compute the probability of observing a given value of the scan statistic.

```{r}
#Empirical distribution under H0
n_sample=10**4
lambda0=3
T=10
tau=1
ppH0=PoissonProcess(lambda0,T)
CDF=Plot_CDF(lambda0,n_sample,T,tau)
```
```{r}
n_sample=10**4
lambda1=4
T=10
tau=1
ppH0=PoissonProcess(lambda1,T)
CDF=Plot_CDF(lambda1,n_sample,T,tau)
```

```{r}
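PValue <- function(Emp,ppH1, T, tau){
    # Scan statistic of the observed sequence and index of its window
    result=ScanStat(ppH1,T,tau)
    scanH1=result[2]
    index_scanH1=result[1]
    min_index=min(Emp$index_scan)
    max_index=max(Emp$index_scan)
    if (scanH1<=min_index){
        # Below every simulated value (index_scan starts at min(scan)-1): the p-value is 1
        return (c(scanH1,1,index_scanH1))
    } else if (scanH1<=max_index){
        # p-value read from the empirical CDF: 1 - P(S <= scanH1)
        return (c(scanH1,1-Emp$cdf[scanH1-min_index+1],index_scanH1))
    } else {
        # Above every simulated value: the p-value is 0
        return (c(scanH1,0,index_scanH1))
    }
}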
```
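
The following sketch illustrates, on made-up numbers, how an empirical p-value is read from simulated null values of the scan statistic. The vector `null_scan` and the observed value are hypothetical. Two conventions are shown: the strict tail $P(S > s)$, which matches the `1 - cdf` expression used in `PValue`, and the non-strict tail $P(S \geq s)$, which is the more conservative convention.

```{r}
# Hypothetical scan statistics simulated under H0, and a hypothetical observed value
null_scan <- c(3, 4, 4, 5, 6)
observed <- 5

# Strict convention, 1 - F(s): proportion of simulated values above the observed one
p_strict <- mean(null_scan > observed)    # 1/5 = 0.2

# Non-strict convention: proportion of simulated values at least as large
p_geq <- mean(null_scan >= observed)      # 2/5 = 0.4
```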

### 2.2. Simulation under $\mathcal{H}_0$ and computation of p-values
We simulate sequences under $\mathcal{H}_0$ and store them. For each sequence, we compute the scan statistic and its p-value, which we also store. This gives a sequence of scan p-values and a sequence of local scores.
```{r}
NbSeqH0=10000
NbSeqH1=NbSeqH0
DataH0=vector("list")
DataH1=vector("list")
lambda0=4
lambda1=10
T=10
tau=1

#List of the sequences simulated under the null hypothesis
for (i in 1:NbSeqH0) {
    ppi=PoissonProcess(lambda0,T)
    DataH0[[i]]=ppi
}

#List of the sequences simulated under the alternative hypothesis
for (i in 1:NbSeqH1) {
    pphi=SimulationH1(lambda0, lambda1,T,tau)
    DataH1[[i]]=pphi
}

#Computation of the time between events for every sequence of a list
TimeBetweenEventList <- function(list,n_list){
    TBE=vector("list",length=n_list)
    for (i in (1:n_list)) {
        TBE[[i]]=diff(list[[i]])
    }
    return (TBE)
}
tbe0=TimeBetweenEventList(DataH0,NbSeqH0)
```
We compute the p-value associated with each of the `NbSeqH0` sequences and store them in a vector.

```{r}
#We start by computing the empirical distribution for lambda0
Emp = EmpDistrib(lambda0,n_sample,T,tau)
scan = c()
pvalue = c()
index_scan = c()

#Then, we store the scan statistic, its p-value and the index of its window
for (i in 1:NbSeqH0){
    ppi = DataH0[[i]]
    result = PValue(Emp,ppi,T,tau)
    scan = c(scan,result[1])
    pvalue = c(pvalue,result[2])
    index_scan = c(index_scan,result[3])
}

ScS_H0=data.frame(num=(1:NbSeqH0), scan_stat=scan, pvalue_scan=pvalue,class=c(pvalue<0.05)) 
#Proportion of H0 sequences rejected at the 5% level (empirical type I error)
mean(ScS_H0$class)
```

```{r}
#We reuse the empirical distribution Emp computed under H0 with lambda0
scan=c()
pvalue=c()
index_scan=c()

#Then, we store the scan statistic, its p-value and the index of its window
for (i in 1:NbSeqH1){
    ppi=DataH1[[i]]
    result=PValue(Emp,ppi,T,tau)
    scan=c(scan,result[1])
    pvalue=c(pvalue,result[2])
    index_scan=c(index_scan,result[3])
}
ScS_H1=data.frame(num=1:NbSeqH1, scan_stat=scan, pvalue_scan=pvalue, class=(pvalue<0.05), begin_scan=index_scan)
#Proportion of H1 sequences rejected at the 5% level (empirical power)
mean(ScS_H1$class)

```

```{r}
ScanStatMC <- function(NbSeq, T, tau, Emp, pp0){
    scan=c()
    pvalue=c()
    index_scan=c()

    for (i in 1:NbSeq){
        ppi=pp0[[i]]
        result=PValue(Emp,ppi,T,tau)
        scan=c(scan,result[1])
        pvalue=c(pvalue,result[2])
        index_scan=c(index_scan,result[3])
    }

    ScS_H0=data.frame(num=(1:NbSeq), scan_stat=scan, pvalue_scan=pvalue,class=c(pvalue<0.05))
    return(ScS_H0)
}
```

## 3. Local score
### 3.1. Distribution of scores via Monte Carlo
```{r}
ComputeE <- function(lambda0, lambda1){
    E = 1
    maxXk = floor(E*(log(lambda1/lambda0)))
    while (maxXk < 3) {
        E = E+1
        maxXk = floor(E*(log(lambda1/lambda0)))
    }

    return (E)
}
```
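
The loop in `ComputeE` searches for the smallest integer $E$ such that the largest possible score, $\lfloor E\log(\lambda_1/\lambda_0)\rfloor$, is at least 3. As a hedged illustration, the sketch below re-implements that search under the hypothetical name `smallest_scale` and compares it with the closed form $\lceil 3/\log(\lambda_1/\lambda_0)\rceil$ for $\lambda_0=2$ and $\lambda_1=3$. The closed form is an observation added here, not part of the original code.

```{r}
# Hypothetical re-implementation of the search performed by ComputeE
smallest_scale <- function(lambda0, lambda1, target = 3) {
  E <- 1
  while (floor(E * log(lambda1 / lambda0)) < target) {
    E <- E + 1
  }
  E
}

E_loop <- smallest_scale(2, 3)             # log(3/2) is about 0.405, so E = 8
E_closed <- ceiling(3 / log(3 / 2))        # same value from the closed form
max_score <- floor(E_loop * log(3 / 2))    # largest possible score: 3
```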

```{r}
ScoreDistribEmpiric <- function(lambda0, lambda1, n_sample, T){
    E = ComputeE(lambda0, lambda1)
    Score = c()
    
    for (i in 1:n_sample){
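        ppH0 = PoissonProcess(lambda0,T)
        # Inter-event times of the simulated sequence
        tbe0 = diff(ppH0)
        # Scaled and floored log-likelihood ratio of each inter-event time
        X = floor(E*(log(lambda1/lambda0)+(lambda0-lambda1)*tbe0))
        Score=c(Score,X)
    }
    min_X = min(Score)
    max_X = max(Score)

    # Empirical probability of every integer score between min_X and max_X,
    # stored as a plain numeric column so that Score$P_X matches exactly
    P_X = as.numeric(table(factor(Score, levels = min_X:max_X)))/length(Score)
    df = data.frame("Score_X" = min_X:max_X, "P_X" = P_X)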

    return (df)
}
```

```{r}
ScoreDistribElisa <- function(lambda0, lambda1, T){
    E = ComputeE(lambda0, lambda1)

    score_max = floor(E*log(lambda1/lambda0))

    # Smallest possible score, reached when the inter-event time equals T
    score_min_c = floor(E*log(lambda1/lambda0)+E*(lambda0-lambda1)*T)

    l = seq(score_min_c,score_max,1)
    borne_inf = (l-E*log(lambda1/lambda0))/(E*(lambda0-lambda1))
    borne_sup = (l+1-E*log(lambda1/lambda0))/(E*(lambda0-lambda1))
    proba.l = pexp(rate=lambda0,borne_inf)-pexp(rate=lambda0,borne_sup)
    S = sum(proba.l)
    new.proba.s = proba.l/S
    df = data.frame("Score_X" = l, "P_X" = new.proba.s)

    return (df)
}
```
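
The probabilities computed in `ScoreDistribElisa` can be derived as follows, assuming, as the code does, that an inter-event time $t$ follows an exponential distribution of rate $\lambda_0$ under $\mathcal{H}_0$ and that the score is the scaled, floored log-likelihood ratio. The symbols $t$, $\ell$, $b_{\inf}$, $b_{\sup}$ and $F_{\lambda_0}$ are notation introduced here for the quantities `l`, `borne_inf`, `borne_sup` and `pexp`.

$$
X = \left\lfloor E\left(\log\frac{\lambda_1}{\lambda_0} + (\lambda_0-\lambda_1)\,t\right)\right\rfloor,
\qquad t \sim \mathcal{E}(\lambda_0).
$$

Since $\lambda_0-\lambda_1<0$, dividing by $E(\lambda_0-\lambda_1)$ reverses the inequalities:

$$
\{X=\ell\}
\iff \ell \le E\left(\log\frac{\lambda_1}{\lambda_0} + (\lambda_0-\lambda_1)\,t\right) < \ell+1
\iff b_{\sup} < t \le b_{\inf},
$$

$$
b_{\inf} = \frac{\ell - E\log(\lambda_1/\lambda_0)}{E(\lambda_0-\lambda_1)},
\qquad
b_{\sup} = \frac{\ell + 1 - E\log(\lambda_1/\lambda_0)}{E(\lambda_0-\lambda_1)},
$$

$$
P(X=\ell) = F_{\lambda_0}(b_{\inf}) - F_{\lambda_0}(b_{\sup}),
\qquad
F_{\lambda_0}(u) = \left(1-e^{-\lambda_0 u}\right)\mathbf{1}_{u \ge 0}.
$$

The code then divides these probabilities by their sum over the scores between `score_min_c` and `score_max`, presumably so that they add up to 1 once inter-event times beyond the range covered by $T$ are left out.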

```{r}
distrib_score_mc = ScoreDistribEmpiric(2,3,10000,T)
distrib_score_theo = ScoreDistribElisa(2,3,T)

plot_graph_distrib_score <- function(distrib_score_theo, distrib_score_mc){
    # length(distrib_score_mc[,2])
    # length(distrib_score_theo[,2])

    #diff_distrib_score=abs(distrib_score_mc[,2]-distrib_score_theo[,2])

    #par(mfrow = c(1,2))
    barplot(distrib_score_mc[,2],col="blue",axes=FALSE)
    mtext("Distribution of scores via Monte Carlo",side=1,line=2.5,col="blue")
    axis(2, ylim=c(0,10))
    # TRUE is spelled out because T is reassigned to the time horizon in this document
    par(new = TRUE)
    barplot(distrib_score_theo[,2],col="red",axes=FALSE)
    mtext("Distribution of scores using the theoretical method",side=1,line=4,col="red")
}

plot_graph_distrib_score(distrib_score_theo, distrib_score_mc)
```


### 3.2. Local score calculation
```{r}
LocalScoreMC <- function(lambda0, lambda1, NbSeq, T, X_seq, P_X, tbe0){
  E = ComputeE(lambda0, lambda1)
  
  pvalue = c()
  X = c()
  
  min_X = min(X_seq)
  max_X = max(X_seq)
  
  for (i in 1:NbSeq){
      x = floor(E*log(dexp(tbe0[[i]], rate = lambda1)/dexp(tbe0[[i]], rate = lambda0)))
      X = c(X,x)
      LS = localScoreC(x)$localScore[1]
      
      daudin_result = daudin(localScore = LS, score_probabilities = P_X, sequence_length = length(x), sequence_min = min_X, sequence_max = max_X)
      options(warn = -1) # Disable warnings print
      
      pvalue = c(pvalue, daudin_result)
  }
  LS_H0=data.frame(num=1:NbSeq, pvalue_scan=pvalue, class=(pvalue<0.05))
  return(LS_H0)
}
```
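
For intuition, the local score of a sequence of scores is the largest sum over all contiguous segments. It can be computed with the recursion $W_k=\max(0, W_{k-1}+X_k)$, $W_0=0$, the local score being $\max_k W_k$. The sketch below is a plain base-R illustration on made-up scores; `local_score_demo` is a hypothetical helper and is not a substitute for `localScoreC` from the `localScore` package, which the chunk above relies on.

```{r}
# Hypothetical helper: local score via the recursion W_k = max(0, W_{k-1} + X_k)
local_score_demo <- function(x) {
  w <- 0
  best <- 0
  for (xk in x) {
    w <- max(0, w + xk)
    best <- max(best, w)
  }
  best
}

x_demo <- c(-2, 3, -1, 2, -4, 1)     # made-up integer scores
ls_demo <- local_score_demo(x_demo)  # W = 0, 3, 2, 4, 0, 1, so the local score is 4
```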

## 4. Experimental design for the comparison
```{r}
NbSeq = 10**3
T = 10
for (lambda0 in (2:5)){
  Sensitivity = c()
  Specificity = c()
  accepted_lambda = c()

  for (lambda1 in c(3:8)){
    if (lambda0 < lambda1){
      accepted_lambda=c(accepted_lambda,lambda1)
      cat("For T = ", T, ", Nb = ", NbSeq, ", lambda0 = ", lambda0, " and lambda1 = ", lambda1, ":\n", sep = "")
      tbe0=vector("list",length=NbSeq)
      pp0 =  vector("list", length = NbSeq)
      for (i in (1:NbSeq)) {
        ppi = PoissonProcess(lambda0,T)
        ni=length(ppi)
        pp0[[i]] = ppi
        tbei=diff(ppi)
        tbe0[[i]]=tbei
        }
            
      #cat("- Empiric version:\n")
      Score = ScoreDistribEmpiric(lambda0, lambda1, NbSeq, T)
      Emp = EmpDistrib(lambda0,n_sample,T,tau)
      
      X_seq = Score$Score_X
      P_X = Score$P_X
      
      LS_H0 = LocalScoreMC(lambda0, lambda1, NbSeq, T, X_seq, P_X, tbe0)
      options(warn = -1) # Disable warnings print
      SS_H0 = ScanStatMC(NbSeq, T, tau, Emp, pp0)
            
      #cat("Local Score:\n")
      #print(summary(LS_H0))
      #cat("Scan Statistics:\n")
      #print(summary(SS_H0))
      #cat("Confusion Matrix:\n")
      #print(confusionMatrix(factor(LS_H0$class), factor(SS_H0$class)))
        
      #cat("- Elisa version:\n")
      Score = ScoreDistribElisa(lambda0, lambda1, T)
      Emp = EmpDistrib(lambda0,n_sample,T,tau)

      X_seq = Score$Score_X
      P_X = Score$P_X
            
      LS_H0 = LocalScoreMC(lambda0, lambda1, NbSeq, T, X_seq, P_X, tbe0)
      options(warn = -1) # Disable warnings print

      SS_H0 = ScanStatMC(NbSeq, T, tau, Emp, pp0)
        
      #cat("Local Score:\n")
      #print(summary(LS_H0))
      #cat("Scan Statistics:\n")
      #print(summary(SS_H0))
      #cat("Confusion Matrix:\n")
      # Confusion matrix of the two classifications, computed once and reused;
      # both levels are declared so that it is defined even if a class is empty
      cm = confusionMatrix(factor(LS_H0$class, levels = c(FALSE, TRUE)), factor(SS_H0$class, levels = c(FALSE, TRUE)))
      print(cm$table)
      Sensitivity = c(Sensitivity, cm$byClass["Sensitivity"])
      Specificity = c(Specificity, cm$byClass["Specificity"])

      cat("---\n")
      
    }
  }
  titleSens=TeX(paste(r'(Sensitivity for $\lambda_0=$)', lambda0))
  plot(x=accepted_lambda,y=Sensitivity, type='l', main = titleSens)
  
  titleSpec=TeX(paste(r'(Specificity for $\lambda_0=$)', lambda0))
  plot(x=accepted_lambda,y=Specificity, type='l', main = titleSpec)

}
```
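
To clarify what the sensitivity and specificity plotted above measure, the sketch below computes both by hand from two made-up logical classifications. The vectors are hypothetical, and here `TRUE` (a rejection of $\mathcal{H}_0$) is taken as the positive class. By default, `confusionMatrix` from `caret` appears to take the first factor level as the positive class, which would be `FALSE` for a factor built from a logical vector, so its two measures may be swapped relative to this convention.

```{r}
# Made-up classifications: TRUE means that the method rejects H0
pred_demo <- c(TRUE, TRUE, FALSE, FALSE, TRUE)   # method being evaluated
ref_demo  <- c(TRUE, FALSE, FALSE, FALSE, TRUE)  # method used as reference

tp <- sum(pred_demo & ref_demo)     # both reject: 2
tn <- sum(!pred_demo & !ref_demo)   # neither rejects: 2
fp <- sum(pred_demo & !ref_demo)    # only the evaluated method rejects: 1
fn <- sum(!pred_demo & ref_demo)    # only the reference rejects: 0

sensitivity_demo <- tp / (tp + fn)  # 2/2 = 1
specificity_demo <- tn / (tn + fp)  # 2/3
```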