I'm new to R and need to do something that I haven't been able to do so far:
I have a data frame with a random number of columns, and I need to keep only the unique values in each column of the data frame, with this done independently of the other columns.
For example, if I have the data frame below:

    column_a column_b column_c
    a        1        a1
    a        2        a2
    b        1        a3
    b        2        a4
    c        3        a5
    c        4        a6

the output after the code must be this:

    column_a column_b column_c
    a        1        a1
    b        2        a2
    c        3        a3
             4        a4
                      a5
                      a6

I've tried ds <- unique(ds), but that only leaves the unique relations between columns (the unique rows).
I'd appreciate any help or orientation you can give me.
Thanks in advance.
data
    > str(df)
    'data.frame': 6 obs. of 3 variables:
     $ a: chr "a" "b" "c" "a" ...
     $ b: num 1 2 1 2 3 4
     $ c: chr "a1" "a2" "a3" "a4" ...

loop

    i <- 1
    while (i <= ncol(df)) {                  # <= so the last column is processed too
      x <- df[[i]]                           # work on one column at a time
      x[duplicated(x)] <- ''                 # blank out repeated values
      df[[i]] <- c(x[x != ''], x[x == ''])   # move the blanks to the end
      i <- i + 1
    }
If there are 'factor' columns, it is better to convert them to character, or to include '' as one of the levels of the factor column. Here, I am changing the factor columns to character first.

    indx <- sapply(df1, is.factor)
    df1[indx] <- lapply(df1[indx], as.character)

Loop through the columns with lapply, replace the duplicated elements with '', and arrange the elements so that the empty strings are at the end (c(x[x != ''], x[x == ''])).

    df1[] <- lapply(df1, function(x) {
      x[duplicated(x)] <- ''
      c(x[x != ''], x[x == ''])
    })
    df1
    #  column_a column_b column_c
    #1        a        1       a1
    #2        b        2       a2
    #3        c        3       a3
    #4                 4       a4
    #5                         a5
    #6                         a6

Or another option is to use match:

    df1[] <- lapply(df1, function(x)
      c(x[match(unique(x), x)], rep('', length(x) - length(unique(x)))))

NOTE: Using '' will change the numeric column classes to 'character/factor' class. It may be better to replace with NA, as the NA values can then be removed with functions such as is.na/na.omit/complete.cases etc.
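The NA suggestion in the note could look roughly like the sketch below. It is only an illustration: `pad_unique` is a hypothetical helper name, not part of the answer above, and the data frame is rebuilt with character columns (no factors) so that the block runs on its own.

```r
# Hypothetical helper (not from the original answer): keep the unique values
# of a vector at the top and pad the remaining positions with NA.
pad_unique <- function(x) {
  u <- unique(x)
  c(u, rep(NA, length(x) - length(u)))
}

# Same example data as in the question, assuming character (not factor) columns.
df1 <- data.frame(
  column_a = c("a", "a", "b", "b", "c", "c"),
  column_b = c(1L, 2L, 1L, 2L, 3L, 4L),
  column_c = paste0("a", 1:6),
  stringsAsFactors = FALSE
)

# Apply the helper to every column independently; df1[] keeps the data.frame.
df1[] <- lapply(df1, pad_unique)

df1
sapply(df1, class)  # column classes are preserved when padding with NA
```

Because the padding is NA rather than '', column_b stays an integer column, and the unique values of any single column can be pulled out afterwards with, for example, na.omit(df1$column_b).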
data
    df1 <- structure(list(
      column_a = structure(c(1L, 1L, 2L, 2L, 3L, 3L),
        .Label = c("a", "b", "c"), class = "factor"),
      column_b = c(1L, 2L, 1L, 2L, 3L, 4L),
      column_c = structure(1:6,
        .Label = c("a1", "a2", "a3", "a4", "a5", "a6"), class = "factor")),
      .Names = c("column_a", "column_b", "column_c"),
      row.names = c(NA, -6L), class = "data.frame")