Suppose I have the following data frame:
df <- data.frame("yearmonth" = c("2005-01", "2005-02", "2005-03", "2005-01", "2005-02", "2005-03"),
                 "state" = c(1, 1, 1, 2, 2, 2),
                 "county" = c(3, 3, 3, 3, 3, 3),
                 "unemp" = c(4.0, 3.6, 1.4, 3.7, 6.5, 5.4))

I'm trying to create a lag of unemployment within each unique state-county combination. I want to end up with this:
df2 <- data.frame("yearmonth" = c("2005-01", "2005-02", "2005-03", "2005-01", "2005-02", "2005-03"),
                  "state" = c(1, 1, 1, 2, 2, 2),
                  "county" = c(3, 3, 3, 3, 3, 3),
                  "unemp" = c(4.0, 3.6, 1.4, 3.7, 6.5, 5.4),
                  "unemp_lag" = c(NA, 4.0, 3.6, NA, 3.7, 6.5))

Now, imagine this situation, except with thousands of different county-state combinations and over several years. I tried using the lag function and the zoo lag function, but I couldn't make them take the state-county codes into account. One possibility is to make a giant loop, but I think that would be slow with my data (R does not handle loops well), and I'm looking for a cleaner way to do it. Any ideas? Thanks!
With data.table:

library(data.table)
# shift() lags unemp by one row within each state-county group;
# the trailing [] prints the updated table
setDT(df)[, `:=`(unemp_lag1 = shift(unemp, n = 1L, fill = NA, type = "lag")), by = .(state, county)][]

   yearmonth state county unemp unemp_lag1
1:   2005-01     1      3   4.0         NA
2:   2005-02     1      3   3.6        4.0
3:   2005-03     1      3   1.4        3.6
4:   2005-01     2      3   3.7         NA
5:   2005-02     2      3   6.5        3.7
6:   2005-03     2      3   5.4        6.5
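If you would rather not add a package, the same grouped lag can probably be done in base R with ave(). This is only a minimal sketch on the example data from the question: the helper name lag1 is made up for illustration, and it assumes yearmonth is a "YYYY-MM" string, so sorting it as text gives chronological order. Like shift(), it lags by row position, so the rows need to be sorted by date within each group first.

```r
# Rebuild the example data from the question
df <- data.frame(
  yearmonth = c("2005-01", "2005-02", "2005-03", "2005-01", "2005-02", "2005-03"),
  state     = c(1, 1, 1, 2, 2, 2),
  county    = c(3, 3, 3, 3, 3, 3),
  unemp     = c(4.0, 3.6, 1.4, 3.7, 6.5, 5.4)
)

# Sort so each state-county group is in chronological order
# (assumes "YYYY-MM" strings, which sort correctly as text)
df <- df[order(df$state, df$county, df$yearmonth), ]

# Hypothetical helper: drop the last value and pad the front with NA
lag1 <- function(x) c(NA, head(x, -1))

# ave() applies lag1 separately within each state-county group
# and returns the results in the original row positions
df$unemp_lag <- ave(df$unemp, df$state, df$county, FUN = lag1)

print(df)
```

The first row of every state-county group gets NA, because there is no earlier month to take a value from. I have not timed this against data.table, so with thousands of groups the data.table version above may well be the faster of the two.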