How can I merge 2 dataframes in special conditions [R] -


i have problem, complicated explain, think not hard solve, newbie in r... hope guys can me.

i have 2 dataframes:

-the first 1 consists on column portfolio1.

-the second 1 consists in 4 columns : date, portfolio2, simulation, value.

for each day there simulation each portfolio , value obtained.

i create new table or dataframe in have values of simulations, 1 day (this not important, because input data of 1 day) , each portfolio. therefore table have column each portfolio , row each simulation.

it important mention there values in portfolio1 won't exist in portfolio2. add column 0's in new table.

thanks in advance!

p.s. have tried merge them unsuccessfully, using comand merge.

the input data this:

portfolio1 port1 port2 port3  date          portfolio2      simulation     value 20150715      port1           simul1         12.6 20150715      port1           simul2         13.6 20150715      port1           simul3         14.6 20150715      port1           simul4         15.6 20150715      port1           simul5         16.6 20150715      port3           simul1         12.7 20150715      port3           simul2         13.7 20150715      port3           simul3         14.7 20150715      port3           simul4         15.7 20150715      port3           simul5         16.7 20150715      port4           simul1         12.8 20150715      port4           simul2         13.8 20150715      port4           simul3         14.8 20150715      port4           simul4         15.8 20150715      port4           simul5         16.8 

and resulting table:

        port1   port3   port2 simul1  12.6    12.7    0 simul2  13.6    13.7    0 simul3  14.6    14.7    0 simul4  15.6    15.7    0 simul5  16.6    16.7    0 

the problem don't know exact number of columns. determined depending on given input files.

i not sure understand question, here goes.

to reorganize data have row per simulation , column per portfolio can use tidyr::spread in following example:

> library(tidyr) > df1 <- data.frame(x=c(1,2,3,1,2,3), p=c(1,1,1,2,2,2), v = c(111,222,333,444,555,666)) > spread(df1, p, v)   x   1   2 1 1 111 444 2 2 222 555 3 3 333 666 

to missing portfolio added best think of along lines of:

df2 <- setnames(expand.grid(unique(df1$x), c(1,2,3)), c("x", "p"))

using x column simulation ids, , c(1,2,3) possible portfolio identifiers. using dplyr anti_join, can combinations didn't appear in original:

df3 <- anti_join(df2, df1, = c("x", "p")) 

add values want:

df3$v <- 0 

and add rows onto original

> rbind(df1, df3)   x p   v 1 1 1 111 2 2 1 222 3 3 1 333 4 1 2 444 5 2 2 555 6 3 2 666 7 3 3   0 8 2 3   0 9 1 3   0 

probably best ignore second part of answer, doesn't feel quite right. use of spread reorganizing data.frame think idea though.


Comments