r - Use lapply() to find percentages of factor variables


I have a data frame that consists of 4 columns representing questions, and each column has 4 levels representing responses.

  q1 q2
1  a  a
2  a  b
3  b  b
4  c  c
5  d  d

and I'd like to derive a data.frame such as this:

  question response percent
1       q2        a     0.2
2       q2        b     0.4
3       q2        c     0.2
4       q2        d     0.2
5       q1        a     0.4
6       q1        b     0.2
7       q1        c     0.2
8       q1        d     0.2

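So far, I've been achieving this with a for loop, but my scripts are riddled with for loops, and I'd like to achieve this using functions in reshape2 or lapply instead. For instance, this code is a lot cleaner than a for loop, but it's still not quite what I'm looking for. Any help is appreciated!

Here's what I've got so far: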

lapply(lapply(df, summary), function(x) x/sum(x)) 
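For reference, here is a small sketch of what that call returns on the two-column example above. The toy `df` is an assumption (the real data frame has 4 columns), and `props` is just an illustrative name. The result is a list of named proportion vectors, one per question, rather than a single long data frame:

```r
# Toy data matching the two-column example (assumed; the real df has 4 columns)
df <- data.frame(q1 = factor(c("a", "a", "b", "c", "d")),
                 q2 = factor(c("a", "b", "b", "c", "d")))

# summary() on a factor returns named counts; dividing by the total gives proportions
props <- lapply(lapply(df, summary), function(x) x / sum(x))

# A list with one named numeric vector per column
str(props)
```

The numbers are right, but they still need to be reshaped into question/response/percent rows.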

Edit: Including an example of the data frame per request. I was afraid it would take up too much space since the level labels are long, so I shortened them.

dput(df[1:4,])
structure(list(q1 = structure(c(4L, 4L, 1L, 4L), .Label = c("1.a",
    "1.b", "1.c", "1.d"), class = "factor"),
    q2 = structure(c(4L, 4L, 4L, 1L), .Label = c("2.a", "2.b",
    "2.c", "2.d"), class = "factor"),
    q3 = structure(c(4L, 3L, 4L, 4L), .Label = c("3.a", "3.b",
    "3.c", "3.d"), class = "factor"),
    q4 = structure(c(3L, 1L, 3L, 3L), .Label = c("4.a", "4.b",
    "4.c", "4.d"), class = "factor")),
    .Names = c("q1.pre", "q2.pre", "q3.pre", "q4.pre"), row.names = c(NA, 4L),
    class = "data.frame")

I've found that a combination of lafortune's and user20650's responses has given me what I've been looking for:

melt(sapply(df, function(x) prop.table(table(x)))) 

However, there's one problem. At the sapply level, the dimnames are the same as the label names of the levels of q1, and after performing melt on the output of sapply, the Var1 column is a repetition of q1's levels, whereas I'd like Var1 to have q1's levels in the q1 rows, q2's levels in the q2 rows, etc. I found a workaround by pulling the levels of all of the columns into a separate variable, qnames, before performing the operations on df, like so:

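library(reshape2)  # provides melt()
# Collect each column's own level labels (third column of the melted matrix)
qnames = melt(sapply(df, levels))
qnames = qnames[ ,3]
# Proportion of each response per question, reshaped to long format
df = melt(sapply(df, function(x) prop.table(table(x))))
df = cbind(qnames, df)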

which gives the result I need. I'm interested to see if there is a way to achieve this without sapply and cbind, so I'll leave the question open a little longer. Thanks for the help!
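One possible base R route that avoids both sapply and cbind is to build one small data frame per question and stack them. This is only a sketch on toy two-column data. The names `long`, `q`, and `p` are illustrative and not from the original post. Because `table()` is called on each column separately, every question keeps its own level labels:

```r
# Toy data (assumed); the real df has 4 factor columns with their own level labels
df <- data.frame(q1 = factor(c("a", "a", "b", "c", "d")),
                 q2 = factor(c("a", "b", "b", "c", "d")))

# For each column: tabulate, convert counts to proportions, and return a long data frame
long <- do.call(rbind, lapply(names(df), function(q) {
  p <- prop.table(table(df[[q]]))
  data.frame(question = q,
             response = names(p),      # this column's own levels
             percent  = as.vector(p),  # drop the table class, keep the numbers
             stringsAsFactors = FALSE)
}))

long
```

Levels with zero counts are kept with a percent of 0, since `table()` on a factor reports every level.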

A one-liner using data.table:

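library(data.table) # 1.9.5+
dt <- data.table(q1 = c("a", "a", "b", "c", "d"),
                 q2 = c("a", "b", "b", "c", "d"))

# For each column: count rows per response (.N), divide by the total,
# then rename the result to question/response/percent and stack the pieces
rbindlist(lapply(
  names(dt),
  function(x) dt[, .N/nrow(dt), by = x
                 ][, .(question = x, response = get(x), percent = V1)]))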
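A variation on the same idea, offered as a sketch rather than as part of the original answer: melt the table to long format first, then count per question and response. It assumes the data.table package is installed. The names `long` and `res` are illustrative:

```r
library(data.table)

dt <- data.table(q1 = c("a", "a", "b", "c", "d"),
                 q2 = c("a", "b", "b", "c", "d"))

# Stack all columns into question/response pairs
long <- melt(dt, measure.vars = names(dt),
             variable.name = "question", value.name = "response")

# Count each response within each question, then convert counts to proportions
res <- long[, .N, by = .(question, response)
            ][, .(response, percent = N / sum(N)), by = question]

res
```

Unlike the per-column `table()` approach, this only reports responses that actually occur, so unused factor levels would not appear in the output.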
