Friday, March 2, 2012

What is R-index

R index is developed in interpreting signal detection data for human perception. In sensory research it is used to interpret ranking data. The value one gets out of an R-index calculation is interpreted as a confusion between samples tested. It has been popularized among others by Bi and O'Mahony.
The value of R-index ranges between 0% and 100%, with 50% representing equality.
The computation of the R index is not difficult and can be done using a few short functions.

Simulated data of ranking for 3 persons and 4 products is:

   person prod rank
1       1    1    1
2       1    2    3
3       1    3    2
4       1    4    4
5       2    1    1
6       2    2    4
7       2    3    3
8       2    4    2
9       3    1    1
10      3    2    4
11      3    3    3
12      3    4    2


These data has as R indices:
  prod1 prod2    Rindex
1     1     2 100.00000
2     1     3 100.00000
3     1     4 100.00000
4     2     3  11.11111
5     2     4  22.22222
6     3     4  44.44444 


It is not obvious at first sight, but the R index is in all actuality not a continuous function. There are only so many permutations of rankings. To demonstrate this, we repeat a simulation 1000 times. With 25 simulated persons with each 4 products the frequency of R-indices is shown in the figure. 


It is clear that R index exactly 50% is the most probable outcome. This represents that the products are all equal. It is also clear that some R-indices which are really close to 50 are much less probable. Finally, values under 20 and over 80 hardly occur. This is well know. Critical values of R-indices are given by the red and blue lines (Bi and O'Mahony 1995 respectively 2007, Journal of Sensory Studies)


R code for these results

# A simple Ranking
makeRanksNodiff <- function(nprod,nrep) {
   inList <- lapply(1:nrep,function(x) sample(1:nprod,nprod)   )
   data.frame(person=factor(rep(1:nrep,each=nprod)),
  prod=factor(rep(1:nprod,times=nrep)),
rank=unlist(inList))
}


#  R index for two products out of a cross table  
tab2Rindex <- function(t1,t2) {
   Rindex <- crossprod(rev(t1)[-1],cumsum(rev(t2[-1]))) + 0.5*crossprod(t1,t2)
   100*Rindex/(sum(t1)*sum(t2))
}


# R index for all products out of an experiment
allRindex <- function(rankExperiment) {
 crst <- xtabs(~ prod + rank,data=rankExperiment)
 nprod <- nlevels(rankExperiment$prod)
 Rindices <- lapply(1:(nprod-1),function(p1) {
   lap <- lapply((p1+1):nprod,function(p2) {
index <- tab2Rindex(crst[p1,],crst[p2,])
prod1 <- levels(rankExperiment$prod)[p1]
prod2 <- levels(rankExperiment$prod)[p2]
data.frame(prod1=prod1,prod2=prod2,Rindex=index)
})
    return(do.call(rbind,lap))
    })
do.call(rbind,Rindices)
}


# small data set
set.seed(12)
rankExperiment <- makeRanksNodiff(nprod=4,nrep=3)
rankExperiment
allRindex(rankExperiment)


# larger simulation
set.seed(12)
last <- lapply(1:1000,function(x) {
   re <- makeRanksNodiff(nprod=4,nrep=25)
   ar <- allRindex(re)  
   ar$Rindex
} )


rankmatrix <- do.call(rbind,last)


RindicesTable <- as.data.frame(table(rankmatrix))
RindicesTable$Rindex <- as.numeric(as.character(RindicesTable$rankmatrix))


library(ggplot2)


g1 <- ggplot(RindicesTable, aes( Rindex, Freq))  + geom_point(alpha=.5) 
g1 <- g1 + geom_vline(xintercept=50 + 18.57*c(-1,1),colour='red')
g1 <- g1 + geom_vline(xintercept=50 + 15.21*c(-1,1),colour='blue')
g1





No comments:

Post a Comment