I am facing an issue with what should be a simple problem. My data contain the following variables: bcsid, id, dd, mm and day. These are, respectively, a personal identifier, an id-day identifier, the calendar day, the calendar month and the day of the week. I need to create a dd_flag variable to correct dd, which is wrong because it does not increment when day changes.
My data:

          bcsid       id dd mm day
    200 b10011q b10011q2 24 10   2
    201 b10011q b10011q2 24 10   2
    202 b10011q b10011q2 24 10   2
    203 b10011q b10011q2 24 10   2
    204 b10011q b10011q2 24 10   2
    205 b10011q b10011q2 24 10   2
    206 b10011q b10011q2 24 10   2
    207 b10011q b10011q3 24 10   3
    208 b10011q b10011q3 24 10   3
    209 b10011q b10011q3 24 10   3
    210 b10011q b10011q3 24 10   3
    211 b10011q b10011q3 24 10   3
    212 b10011q b10011q3 24 10   3
    213 b10011q b10011q3 24 10   3
    214 b10011q b10011q3 24 10   3

I create a dd_flag variable based on dd:
    dtadate$dd_flag <- as.numeric(dtadate$dd)

What I need is to increment the dd_flag variable by 1 each time day changes within each identifier bcsid.
I thought it would be simpler to use the collapsed identifier id for the loop.
1. I tried an R loop, but I am not sure why my solution is wrong:
    for(i in 2:nrow(dtadate)){
      if( dtadate$id[i] == dtadate$id[i-1] ) {
        dtadate$dd_flag[i] = dtadate$dd_flag[i] + 1
      }
    }

2. I tried an Rcpp solution, which gives me output that is only partly correct. Here I used bcsid and day.
The increment itself is correct but, unfortunately, the incremented value is not reused for the rest of the loop.
    #include <Rcpp.h>
    using namespace Rcpp;

    // [[Rcpp::export]]
    NumericVector timeaddonecpp(CharacterVector idday, CharacterVector day, NumericVector time) {
      int n = idday.size();
      int len = n ;
      for ( int i = 1; i < len; ++i ) {
        if( ( idday[i] == idday[i - 1] ) & ( day[i] != day [i - 1] ) )
          time[i] = time[i-1] + 1;
      }
      return time;
    }

The function call:

    timeaddonecpp(idday = dtadate$bcsid, day = dtadate$day, time = dtadate$dd_flag)

Expected output

The output I want is the following:
          bcsid       id dd mm day dd_flag
    200 b10011q b10011q2 24 10   2      24
    201 b10011q b10011q2 24 10   2      24
    202 b10011q b10011q2 24 10   2      24
    203 b10011q b10011q2 24 10   2      24
    204 b10011q b10011q2 24 10   2      24
    205 b10011q b10011q2 24 10   2      24
    206 b10011q b10011q2 24 10   2      24
    207 b10011q b10011q3 24 10   3      25
    208 b10011q b10011q3 24 10   3      25
    209 b10011q b10011q3 24 10   3      25
    210 b10011q b10011q3 24 10   3      25
    211 b10011q b10011q3 24 10   3      25
    212 b10011q b10011q3 24 10   3      25
    213 b10011q b10011q3 24 10   3      25
    214 b10011q b10011q3 24 10   3      25
    215 b10011q b10011q3 24 10   3      25
    216 b10011q b10011q3 24 10   3      25
    217 b10011q b10011q3 24 10   3      25
    218 b10011q b10011q3 24 10   3      25
    219 b10011q b10011q3 24 10   3      25
    220 b10011q b10011q4 24 10   4      26
    ...

So each time day changes within a bcsid, dd_flag, which is based on dd, should be incremented by 1.
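The rule stated above can be sketched as a plain R loop that carries the previous row's value forward. This is only a minimal illustration under a few assumptions: the rows are already sorted by bcsid and in time order, and the data frame `toy` and the helper `add_day_flag` are hypothetical names made up for the example, not part of the original data or code. The step that both attempts seem to miss is that every row after a change must inherit the already incremented value, not just the row where day changes.

```r
# Toy data (hypothetical values, same column layout as the question's data)
toy <- data.frame(
  bcsid = c("a", "a", "a", "a", "a", "b", "b", "b"),
  day   = c("2", "2", "3", "3", "4", "1", "1", "2"),
  dd    = c("24", "24", "24", "24", "24", "1", "1", "1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: assumes rows are sorted by bcsid and in time order
add_day_flag <- function(df) {
  flag <- as.numeric(df$dd)
  for (i in seq_len(nrow(df))[-1]) {
    # Only look back when the previous row belongs to the same person
    if (df$bcsid[i] == df$bcsid[i - 1]) {
      # Add 1 when day changed, 0 otherwise, always starting from the
      # previous row's (possibly already incremented) flag
      step <- if (df$day[i] != df$day[i - 1]) 1 else 0
      flag[i] <- flag[i - 1] + step
    }
  }
  flag
}

toy$dd_flag <- add_day_flag(toy)
print(toy)
```

The first row of each bcsid keeps its own dd, so the counter restarts for every person.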
The data:
dta = structure(list(bcsid = c("b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10011q", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10015u", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w", "b10017w"), id = c("b10011q2", "b10011q2", "b10011q2", "b10011q2", "b10011q2", "b10011q2", "b10011q2", "b10011q3", "b10011q3", "b10011q3", "b10011q3", "b10011q3", "b10011q3", "b10011q3", "b10011q3", "b10011q3", "b10011q3", "b10011q3", "b10011q3", "b10011q3", "b10011q4", "b10011q4", "b10011q4", "b10011q4", "b10011q4", "b10011q4", "b10011q4", "b10011q4", "b10011q4", "b10011q4", "b10011q5", "b10011q5", "b10015u1", "b10015u1", "b10015u1", "b10015u1", "b10015u1", "b10015u1", "b10015u1", "b10015u1", "b10015u1", "b10015u1", "b10015u1", "b10015u1", "b10015u1", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u2", "b10015u3", "b10015u3", "b10015u3", "b10015u3", "b10015u3", "b10015u3", "b10015u3", "b10015u3", "b10015u3", 
"b10015u4", "b10015u4", "b10015u4", "b10015u4", "b10015u4", "b10015u4", "b10015u4", "b10015u4", "b10015u4", "b10015u4", "b10015u4", "b10015u4", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1", "b10017w1"), dd = c("24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "24", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "13", "13", "13", "13", "13", "13", "13", "13", "13", "13", "13", "13", "13", "13", "13", "13", "13", "13", "13"), mm = c("10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "8", "6", "6", "6", "6", "6", "6", "6", "6", "6", "6", "6", "6", "6", "6", "6", "6", "6", "6", "6"), day = c("2", "2", "2", "2", "2", "2", "2", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", "5", "5", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "3", "3", "3", "3", "3", "3", "3", "3", "3", "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1")), 
.Names = c("bcsid", "id", "dd", "mm", "day"), row.names = 200:300, class = "data.frame")
    library(dplyr)
    dta %>%
      group_by(bcsid) %>%
      mutate(dd_flag = c(0, cumsum(diff(as.integer(day)))) + as.integer(dd))
    # Source: local data frame [101 x 6]
    # Groups: bcsid
    #
    #      bcsid       id dd mm day dd_flag
    # 1  b10011q b10011q2 24 10   2      24
    # 2  b10011q b10011q2 24 10   2      24
    # 3  b10011q b10011q2 24 10   2      24
    # 4  b10011q b10011q2 24 10   2      24
    # 5  b10011q b10011q2 24 10   2      24
    # 6  b10011q b10011q2 24 10   2      24
    # 7  b10011q b10011q2 24 10   2      24
    # 8  b10011q b10011q3 24 10   3      25
    # 9  b10011q b10011q3 24 10   3      25
    # 10 b10011q b10011q3 24 10   3      25
    # ..     ...      ... .. ..  ...     ...
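The dplyr approach above adds up the differences in day, which works as long as day only goes up by one within a person. If day can wrap around (for example from 7 back to 1), the summed difference would go negative. A possible variant, sketched here in base R, counts the number of changes instead of summing the differences. The data frame `toy` and the helper `count_changes` are hypothetical names for illustration only, and the rows are assumed to be sorted by bcsid and in time order.

```r
# Toy data (hypothetical values) where day wraps from 7 back to 1
toy <- data.frame(
  bcsid = c("a", "a", "a", "a", "b", "b", "b"),
  day   = c("6", "7", "7", "1", "1", "2", "2"),
  dd    = c("30", "30", "30", "30", "5", "5", "5"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: running count of how many times x differs
# from the previous element (0 for the first element)
count_changes <- function(x) cumsum(c(FALSE, x[-1] != x[-length(x)]))

# ave() applies the helper separately within each bcsid
offset <- ave(as.integer(toy$day), toy$bcsid, FUN = count_changes)
toy$dd_flag <- as.integer(toy$dd) + offset
print(toy)
```

On data where day never wraps, this should give the same dd_flag as the cumsum(diff(...)) version.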