Saturday, March 30, 2013

More ordinal data display

The past two weeks I made a post regarding analyzing ordinal data with R and JAGS. The calculations in the second part made me realize I could actually get top two box intervals out of R. This demonstrated here. For that I needed the inverse logistic transformation. This I pulled out the ARM package, and I now realize the latter also contains a function to analyse ordinal data, which is the second part of this post.

Data

The data is as before the cheese data. I will pick up at calculating the model and printing the model summary. The subsequent calculation is actually fairly simple. 
The first parameters in the summary refer to the the difference between cheese A and cheeses B, C and D. The second part are the thresholds between categories for cheese A. So, when I take the threshold of category 7|8 for cheese A (1.54) and add the differences, I get the 7|8 thresholds for the other three cheeses. Parameter values and their variances can be readily obtained. Associated standard deviations follow from the vcov matrix. Take the inverse logit and the desired result is there. It is almost too simple.
Res.clm <- clm(FResponse ~Cheese,data=cheese2)
summary(Res.clm)
formula: FResponse ~ Cheese
data:    cheese2
 link  threshold nobs logLik  AIC    niter max.grad cond.H 
 logit flexible  208  -355.67 733.35 6(0)  3.14e-11 8.8e+01

Coefficients:
        Estimate Std. Error z value Pr(>|z|)    
CheeseB  -3.3518     0.4287  -7.819 5.34e-15 ***
CheeseC  -1.7099     0.3715  -4.603 4.16e-06 ***
CheeseD   1.6128     0.3805   4.238 2.25e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Threshold coefficients:
    Estimate Std. Error z value
1|2 -5.46738    0.52363 -10.441
2|3 -4.41219    0.42776 -10.315
3|4 -3.31262    0.37004  -8.952
4|5 -2.24401    0.32674  -6.868
5|6 -0.90776    0.28335  -3.204
6|7  0.04425    0.26457   0.167
7|8  1.54592    0.30168   5.124
8|9  3.10577    0.40573   7.655
co <- coef(Res.clm)[c('7|8','CheeseB','CheeseC','CheeseD')]
vc <- vcov(Res.clm)[c('7|8','CheeseB','CheeseC','CheeseD'),
                    c('7|8','CheeseB','CheeseC','CheeseD')]
names(co) <- levels(cheese2$Cheese)
sd <- sqrt(c(vc[1,1],diag(vc)[-1]+vc[1,1]-2*vc[1,-1]))
data.frame(
    `top 2 box`=arm::invlogit(c(-co[1],-co[1]+co[-1])),
    `2.5% limit`=arm::invlogit(c(-co[1],-co[1]+co[-1])+qnorm(.025)*sd),
    `97.5% limit`=arm::invlogit(c(-co[1],-co[1]+co[-1])+qnorm(.975)*sd),
    check.names=FALSE
)
    top 2 box    2.5% limit    97.5% limit
A 0.175676950   0.105533878     0.27795336
B 0.007407959   0.003235069     0.01687231
C 0.037118760   0.018617885     0.07264338
D 0.516712572   0.384663489     0.64646825

ARM 

ARM (Data Analysis Using Regression and Multilevel/Hierarchical Models) is the package associated with the similar titled book (Gelman & Hill) which is certainly value for money. Having said that, I did not remember this being in the book. The package also contains a simulation function, which I used to get some posterior results for further processing.
library(arm)
Res.arm <- bayespolr(FResponse ~Cheese,data=cheese2)
Res.arm
bayespolr(formula = FResponse ~ Cheese, data = cheese2)
        coef.est coef.se
CheeseB -3.25     0.42  
CheeseC -1.63     0.37  
CheeseD  1.59     0.38  
1|2     -5.36     0.52  
2|3     -4.32     0.42  
3|4     -3.23     0.36  
4|5     -2.18     0.32  
5|6     -0.86     0.28  
6|7      0.07     0.26  
7|8      1.55     0.30  
8|9      3.09     0.40  
---
n = 208, k = 11 (including 8 intercepts)
residual deviance = 727.1, null deviance is not computed by polr
sims <- sim(Res.arm,n.sims=1000)
cosims <- coef(sims)
mycoef <- cbind(-cosims[,'7|8'],
    -cosims[,'7|8']+cosims[,grep('Cheese',colnames(cosims),value=TRUE)])
mycoef <- invlogit(mycoef)
coefplot(rev(apply(mycoef,2,mean)),rev(apply(mycoef,2,sd)),
    varnames=rev(levels(cheese2$Cheese)),
    main='Estimated Proportion Top 2 Box')

No comments:

Post a Comment