R data.table: dynamically generate column-parsing code for a dynamically named column in a data.table


I am trying to move some old code from a data frame implementation to data.table. I obtain the data from a .csv file, and some cells contain arrays that fread converts to character strings, like so:

    > mydata$sport[1]
    [1] "[24, 18, 24, 18]"

I want to parse these strings into numeric arrays. Here's what I've got partly working as a first step (to get rid of the brackets; step 2, not shown here, will convert to a numeric array):

    > name = "ascent"
    > paste0(name, ":=strsplit(gsub('^\\[|\\]$','',", name, "),',')")
    [1] "ascent:=strsplit(gsub('^\\[|\\]$','',ascent),',')"
    # here I manually copy the result of paste0 into the data.table command
    # I want to automate this setup, so I can put it in a loop
    # for many names
    > mydata[, ascent:=strsplit(gsub('^\\[|\\]$','',ascent),',')]
    > mydata$ascent[10]
    [[1]]
    [1] "-999"  " -999"
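For reference, the two parsing steps can be sketched in base R on a plain character vector, independent of data.table. The sample strings are made up, and the `as.numeric` conversion for step 2 is an assumption, since step 2 is not shown above.

```r
# Hypothetical sample values, shaped like the strings fread produces.
x <- c("[24, 18, 24, 18]", "[-999, 140.0]")

# Step 1: strip the leading "[" and the trailing "]".
stripped <- gsub("^\\[|\\]$", "", x)

# Step 2 (assumed): split on commas and convert each piece to numeric.
# as.numeric() ignores the leading space left after each comma.
parsed <- lapply(strsplit(stripped, ",", fixed = TRUE), as.numeric)

print(parsed)
```

The result is a list with one numeric vector per input string, which is the shape a data.table list column would hold.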

So the command I generate to make the modification is good, but I have many names I want to do this for, and I don't want to copy and paste by hand, as was necessary above. I tried using the eval trick discussed here: Dynamic column names in data.table, R

But once I introduce eval, the code doesn't work:

    > name = "ascent"
    > mydata[, eval(paste0(name, ":=strsplit(gsub('^\\[|\\]$','',", name, "),',')"))]
    [1] "ascent:=strsplit(gsub('^\\[|\\]$','',ascent),',')"

So how can I implement this to work for an arbitrary name, without having to create the command by hand for each desired name via paste0? I have an entire vector of names that need this modification.

Here's the data table right after fread and before making any modifications:

    > mydata[1:10, .(sport, ascent)]
                           sport                     ascent
     1:         [24, 18, 24, 18] [-999, 140.0, -999, 140.0]
     2:            [2, 2, 2, 22]   [-999, -999, -999, -999]
     3: [-999, -999, -999, -999]   [-999, -999, -999, -999]
     4:             [-999, -999]             [173.0, 173.0]
     5:                 [18, 18]               [-999, -999]
     6:                   [-999]                     [-999]
     7:                   [-999]                     [-999]
     8:                   [-999]                     [-999]
     9:             [-999, -999]               [-999, -999]
    10:             [-999, -999]               [-999, -999]

Don't use names at all; loop over the column positions with set() instead:

    # `names` is the character vector of column names to modify
    for (j in which(names(mydata) %in% names))
      set(mydata, i = NULL, j = j, value = strsplit(gsub('^\\[|\\]$', '', mydata[[j]]), ','))
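A self-contained sketch of that loop on a tiny table may make it easier to try out. The table contents, the `id` column, and the variable name `cols` are hypothetical, and the `as.numeric` conversion is an assumed version of the "step 2" that the question leaves out.

```r
library(data.table)

# Hypothetical three-row table standing in for the fread result.
mydata <- data.table(
  id     = 1:3,
  sport  = c("[24, 18, 24, 18]", "[-999]", "[18, 18]"),
  ascent = c("[-999, 140.0]", "[173.0, 173.0]", "[-999]")
)

# Columns to parse; called `cols` here rather than `names` so it is
# not confused with the names() function.
cols <- c("sport", "ascent")

for (j in which(names(mydata) %in% cols)) {
  stripped <- gsub("^\\[|\\]$", "", mydata[[j]])
  # Each cell becomes a numeric vector, so the column turns into a list column.
  set(mydata, i = NULL, j = j,
      value = lapply(strsplit(stripped, ","), as.numeric))
}

str(mydata$ascent)
```

Because set() addresses columns by position, no command string has to be built or evaluated, and columns not listed in `cols` are left untouched.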

As an aside, eval needs parse to work the way you are trying to use it, for example eval(parse(text=paste0(name,":=1+1")))
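The difference can be seen in base R without any data.table involved. This is a minimal sketch; the variable names `cmd` and `regex_cmd` are made up for illustration.

```r
cmd <- paste0("1", "+", "1")

# eval() on a character string returns the string unchanged,
# because a string is a constant rather than a call.
as_string <- eval(cmd)

# parse() turns the text into an expression that eval() can run.
as_value <- eval(parse(text = cmd))

# When the generated text contains a regex, the backslashes must survive
# into the parsed source, so they are doubled once more in the literal.
regex_cmd <- "gsub('^\\\\[|\\\\]$', '', '[1, 2]')"
stripped <- eval(parse(text = regex_cmd))

print(list(as_string, as_value, stripped))
```

One consequence is that the paste0 string from the question would likely need its backslashes doubled again before being passed to parse(), since the text it produces contains `\[`, which is not a valid escape inside an R string literal.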

