Sunday, March 4, 2012

Interpretation of R-index

Having introduced the R-index, it is time to look how it works. For this a simple example is sufficient. What happens if a product is different from another product. To make this at least slightly realistic, three products are needed. Two products will be equal, and one, the odd product, will have a varying distance. It is assumed, that the observations which a panelist does are normally distributed (sd=1) and based on these observations. Violating these assumptions is for a future time.
Again, critical values of R-indices are given by the red and blue lines (Bi and O'Mahony 1995 respectively 2007, Journal of Sensory Studies)

In the plot is is clear that R-index is more different from 50 as the odd product is more different from the others. At a distance of about 0.7 one has a fair chance to declare the odd product as different. At a distance 1.3 the odd product will almost certainly be declared different.

It is also clear that R-index is non-linear in terms of distance. With the current number of products and panelists, under 15 to 20 (or over 80 to 85) increasing the distance has less effect than closer to 50. R-index is therefore not a sensible distance measure.

The R code

library(plyr)
library(ggplot2)


makeRanksDiff <- function(prods,nrep) {
   nprod <- length(prods)
   inList <- lapply(1:nrep,function(x)  rank(rnorm(n=nprod,mean=prods)))
   data.frame(person=factor(rep(1:nrep,each=nprod)),
      prod=factor(rep(1:nprod,times=nrep)),
      rank=unlist(inList))
}


#  R index for two products out of a cross table  
tab2Rindex <- function(t1,t2) {
   Rindex <- crossprod(rev(t1)[-1],cumsum(rev(t2[-1]))) + 0.5*crossprod(t1,t2)
100*Rindex/(sum(t1)*sum(t2))
}


# faster version - no labels 
FastAllRindex <- function(rankExperiment) {
   crst <- xtabs(~ prod + rank,data=rankExperiment)
   nprod <- nlevels(rankExperiment$prod)
   Rindices <- unlist(   lapply(1:(nprod-1),function(p1) {
      lapply((p1+1):nprod,function(p2) tab2Rindex(crst[p1,],crst[p2,])) }) )
   Rindices  
}




set.seed(12)
location <- seq(0,5,by=.1)
last <- lapply(location,function(xo) {
   li <- lapply(1:250,function(xi) {
      re <- makeRanksDiff(prod=c(0,xo,0),nrep=25)
      FastAllRindex(re)  
      })
   li2 <- as.data.frame(do.call(rbind,li))
   li2$location <- xo
   li2
} )




nprod <- 3
compare <- unlist(lapply(1:(nprod-1),function(p1) {
   lap <- lapply((p1+1):nprod,function(p2) {
   paste('Prod',p2,'vs','Prod',p1) })}) )




rankmatrix <- as.data.frame(do.call(rbind,last))
summy <- ddply(rankmatrix,~location,function(xo) {
   what <- grep('location',names(xo) ,value=TRUE,invert=TRUE)
      rb <-  sapply(what,function(xi) quantile(xo[,xi],c(.025,.5,.975)))
      rb <- as.data.frame(t(rb))
      rb$what <- compare
      rb
})




g1 <- ggplot(summy,aes(location,`50%`) )
g1 <- g1 + geom_point(aes(colour= what))
g1 <- g1+ geom_errorbar(aes(ymax = `97.5%`, ymin=`2.5%`,colour=what))
g1 <- g1 + scale_y_continuous(name='R-index' )
g1 <- g1 + scale_x_continuous(name='Location of odd product')
g1 <- g1 + geom_hline(yintercept=50 + 18.57*c(-1,1),colour='red')
g1 <- g1 + geom_hline(yintercept=50 + 15.21*c(-1,1),colour='blue')


g1

No comments:

Post a Comment