data.table - R: dynamically generate column-parsing code for a dynamically named column in a data table
I am trying to move some old code from a data frame implementation to data.table. I obtain the data from a .csv file, and some cells contain arrays that fread converts to character strings, like so:
> mydata$sport[1]
[1] "[24, 18, 24, 18]"

I want to parse these strings into numeric arrays. Here's what I've got partly working as a first step (to get rid of the brackets; step 2, not shown here, is to convert to a numeric array):
> name = "ascent"
> paste0(name, ":=strsplit(gsub('^\\[|\\]$','',", name, "),',')")
[1] "ascent:=strsplit(gsub('^\\[|\\]$','',ascent),',')"
# here I manually copy the result of paste0 into the data.table command
# I want to automate this setup so I can put it in a loop
# for many names
> mydata[, ascent:=strsplit(gsub('^\\[|\\]$','',ascent),',')]
> mydata$ascent[10]
[[1]]
[1] "-999" " -999"

So the command I generate to make the modification is good, but I have many names I want to do this for, and I don't want to copy and paste by hand, as was necessary above. I tried using the eval trick discussed here: "dynamic column names in data.table, R"
But once I introduce eval, the code doesn't work:
> name = "ascent"
> mydata[, eval(paste0(name, ":=strsplit(gsub('^\\[|\\]$','',", name, "),',')"))]
[1] "ascent:=strsplit(gsub('^\\[|\\]$','',ascent),',')"

So how can I implement this so that it works for an arbitrary name, without having to create the command by hand for each desired name via paste0? I have an entire vector of names that need this modification.
Here's the data table right after fread and before making any modifications:
> mydata[1:10, .(sport, ascent)]
                       sport                     ascent
 1:         [24, 18, 24, 18] [-999, 140.0, -999, 140.0]
 2:            [2, 2, 2, 22]   [-999, -999, -999, -999]
 3: [-999, -999, -999, -999]   [-999, -999, -999, -999]
 4:             [-999, -999]             [173.0, 173.0]
 5:                 [18, 18]               [-999, -999]
 6:                   [-999]                     [-999]
 7:                   [-999]                     [-999]
 8:                   [-999]                     [-999]
 9:             [-999, -999]               [-999, -999]
10:             [-999, -999]               [-999, -999]
Don't use names at all...
for (j in which(names(mydata) %in% names))
  set(mydata, i=NULL, j=j,
      value=strsplit(gsub('^\\[|\\]$','',mydata[[j]]),','))

As an aside, eval needs parse to work the way you are trying to use it, for example eval(parse(text=paste0(name,":=1+1")))
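
To make the loop concrete, here is a minimal sketch on made-up data that also folds in the numeric conversion the question calls step 2. The sample table, the `cols` vector (used instead of `names` so that base `names()` is not shadowed) and the `parse_array` helper are all hypothetical and only illustrate the approach. The second part tries the `eval(parse(text = ...))` route; note that the backslashes have to be doubled once more there, because the generated string is parsed as R source.

```r
library(data.table)

# Hypothetical sample data mimicking what fread returns for bracketed cells
mydata <- data.table(
  sport  = c("[24, 18, 24, 18]", "[-999, -999]", "[-999]"),
  ascent = c("[-999, 140.0, -999, 140.0]", "[173.0, 173.0]", "[-999]")
)

# Hypothetical vector of column names to convert
cols <- c("sport", "ascent")

# Hypothetical helper: strip the brackets, split on commas, convert to numeric
# (as.numeric ignores the leading space left behind by the split)
parse_array <- function(x) {
  stripped <- gsub("^\\[|\\]$", "", x)
  lapply(strsplit(stripped, ","), as.numeric)
}

# Update each selected column by reference, addressing it by position
for (j in which(names(mydata) %in% cols)) {
  set(mydata, i = NULL, j = j, value = parse_array(mydata[[j]]))
}

mydata$sport[[1]]
# [1] 24 18 24 18

# The eval(parse(text = ...)) variant on a second hypothetical table.
# Four backslashes in this source yield two in the generated command,
# which is what the R parser needs inside the quoted regex.
mydata2 <- data.table(ascent = c("[-999, -999]", "[173.0, 173.0]"))
name <- "ascent"
cmd <- paste0(name, " := strsplit(gsub('^\\\\[|\\\\]$', '', ", name, "), ',')")
mydata2[, eval(parse(text = cmd))]
```

The `set()` loop avoids building code as text at all, which sidesteps the escaping issue; the `eval(parse(...))` form is shown only because the aside above mentions it.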