I had an issue using ifelse() in a function, which was solved in a Stack Overflow thread. After implementing the suggestions, the code performed as desired. The code is below:

    country_panel <- function(x, y) {
      ifelse(cnames$time < y,
             cnames[match(x, cnames$country), ]$panel,
             cnames[match(x, cnames$country), ]$standardize)
    }

I generated fake data like this:

    countryname <- c("viet nam", "viet nam", "viet nam", "viet nam", "viet nam")
    year <- c(1974, 1975, 1976, 1977, 1978)
    df <- data.frame(countryname, year, stringsAsFactors = FALSE)

    country <- c("vietnam, north", "vietnam, n.", "vietnam north",
                 "viet nam", "democratic republic of vietnam")
    standardize <- c("vietnam, democratic republic of", "vietnam, democratic republic of",
                     "vietnam, democratic republic of", "vietnam, democratic republic of",
                     "vietnam, democratic republic of")
    panel <- c("vietnam", "vietnam", "vietnam", "vietnam", "vietnam")
    time <- c(1976, 1976, 1976, 1976, 1976)
    cnames <- data.frame(country, standardize, panel, time, stringsAsFactors = FALSE)

I then evaluated it with the function using:

    d1 <- df %>% mutate(new_name = country_panel(countryname, year))

However, when I went to implement the suggestions on my real data, the problem returned: the function does not evaluate the condition in the ifelse statement and just returns the $panel value.
Because using stringsAsFactors = FALSE in data.frame worked with the fake data, I thought that using read.csv(path, stringsAsFactors = FALSE) instead of read_csv would work, but both perform equally.
I should note that I checked the attributes of each vector in the data frame using str(), and forced them to match those found in the fake data.
The real data and the scripts to replicate this can be found on GitHub here.
Here is dput(head(cnames)):

    structure(list(country = c("afghanistan", "afghanistan", "albania",
    "albania", "albania", "algeria"), standardize = c("afghanistan",
    "afghanistan", "albania", "albania", "albania", "algeria"),
    time = c(2015L, 2015L, 2015L, 2015L, 2015L, 2015L),
    panel = c("afghanistan", "afghanistan", "albania", "albania",
    "albania", "algeria")), .Names = c("country", "standardize",
    "time", "panel"), class = c("tbl_df", "data.frame"),
    row.names = c(NA, -6L))

and dput(head(d1)):

    structure(list(countryname = c("afghanistan", "afghanistan",
    "afghanistan", "afghanistan", "afghanistan", "afghanistan"),
    year = 1970:1975), .Names = c("countryname", "year"),
    class = c("tbl_df", "data.frame"), row.names = c(NA, -6L))

    d1 <- df %>% mutate(new_name = country_panel(countryname, year))

    df2 <- structure(list(country = c("afghanistan", "afghanistan", "albania",
    "albania", "albania", "algeria"), standardize = c("afghanistan",
    "afghanistan", "albania", "albania", "albania", "algeria"),
    time = c(2015L, 2015L, 2015L, 2015L, 2015L, 2015L),
    panel = c("afghanistan", "afghanistan", "albania", "albania",
    "albania", "algeria")), .Names = c("country", "standardize",
    "time", "panel"), class = c("tbl_df", "data.frame"),
    row.names = c(NA, -6L))

    d2 <- df2 %>% mutate(new_name = country_panel(countryname, year))

This gives:

    Error: wrong result size (5), expected 6 or 1

The immediate problem is that mutate expects country_panel to return 6 values, since df2 has 6 rows (see dim(df2)), or, alternatively, 1 value that it can recycle as needed. The first example you made with fake data in fact works only because the numbers of rows happen to match.
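To illustrate the size mismatch, here is a minimal sketch showing that ifelse() returns a vector as long as its test argument, not as long as the yes and no values. All object names and values below are made up for illustration and are not taken from the real data.

```r
# Hypothetical stand-ins: a lookup column with 5 rows,
# and replacement values sized for a data frame with 6 rows
lookup_time <- rep(1976, 5)
yes_values  <- rep("panel name", 6)
no_values   <- rep("standardized name", 6)

# The test has 5 elements, so ifelse() returns 5 elements,
# regardless of how long `yes` and `no` are
result <- ifelse(lookup_time < 1980, yes_values, no_values)
length(result)
```

Since the test inside country_panel is built from cnames$time, the length of the result follows the lookup table rather than the data frame passed to mutate.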
Try running your example again after running:

    debug(country_panel)
    ...
    # after you are done:
    undebug(country_panel)

This will give you a line-by-line view of the function as it is called, and you can check which objects exist or are created within the function as it runs (exit at any time with Q).
Instead of using ifelse the way it is used now, it might be better to use sequential matching, first on country and then on time. Or try making a data frame out of the x and y vectors passed to the function, merging it with cnames, and picking the name you want with conditions within that data frame.
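One way the sequential matching idea could look is sketched below, using the fake data from the question. The helper name country_panel_lookup and its lookup argument are hypothetical, and the sketch assumes each country appears only once in the lookup table (match() returns only the first matching row). It keeps the question's condition, time < y, unchanged.

```r
# Lookup table and data, rebuilt from the fake data in the question
cnames <- data.frame(
  country = c("vietnam, north", "vietnam, n.", "vietnam north",
              "viet nam", "democratic republic of vietnam"),
  standardize = rep("vietnam, democratic republic of", 5),
  panel = rep("vietnam", 5),
  time = rep(1976, 5),
  stringsAsFactors = FALSE
)
df <- data.frame(countryname = rep("viet nam", 5),
                 year = 1974:1978,
                 stringsAsFactors = FALSE)

# Hypothetical helper: match on country first, then compare time
# element by element, so the result has one value per element of x
country_panel_lookup <- function(x, y, lookup = cnames) {
  idx <- match(x, lookup$country)   # one lookup row index per input element
  ifelse(lookup$time[idx] < y,      # test now has the same length as x
         lookup$panel[idx],
         lookup$standardize[idx])
}

df$new_name <- country_panel_lookup(df$countryname, df$year)
df
```

Because every vector inside ifelse() is indexed by idx, the output length always equals the number of rows being mutated. Countries that are missing from the lookup table come back as NA.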