dataframe - Union of data frames in R
I have 4 data frames in the list l below:

l[[1]]:
  v1 v2
  b  c
  a  b
  z  b

l[[2]]:
  v1 v2
  b  d
  a  b
  z  b

l[[3]]:
  v1 v2
  z  y
  x  z
  n  z

l[[4]]:
  v1 v2
  z  j
  x  z
  n  z
These come from graphs with heads c, d, y, and j. Obviously, c and d belong to the same graph, and so do y and j. How can I merge c with d, and y with j, given these data frames in the list l?

What I'm thinking is to iterate over the list and do a pairwise comparison: if dfx intersects dfy, merge them. Can this be done in R code?

Edit: I'm thinking of something like this: take the first element and compare it with the second; if they match, merge them, save the result as the first element, remove the second element, and move on to the next element until the last. Repeat until no remaining element can be removed. After this, the list consists of the remaining elements, which have been merged. Does anyone know how to implement this in code? The expected output is:
l[[1]]:
  v1 v2
  b  c
  b  d
  a  b
  z  b

l[[2]]:
  v1 v2
  z  y
  z  j
  x  z
  n  z
Could this approach be a solution for you?
# create list of data.frames
ld <- list(
  data.frame(v1 = c("b","a","z"), v2 = c("c","b","b")),
  data.frame(v1 = c("b","a","z"), v2 = c("d","b","b")),
  data.frame(v1 = c("z","x","n"), v2 = c("y","z","z")),
  data.frame(v1 = c("z","x","n"), v2 = c("j","z","z"))
)

# suggested solution
union_ld <- data.table::rbindlist(ld)
unique(union_ld)
Results:

   v1 v2
1:  b  c
2:  a  b
3:  z  b
4:  b  d
5:  z  y
6:  x  z
7:  n  z
8:  z  j
Update 1

Quick hack: two data frames in a list, as requested by the OP. According to the OP's comment, the order of rows within each result data frame doesn't matter.
list(
  unique(data.table::rbindlist(ld[1:2])),
  unique(data.table::rbindlist(ld[3:4]))
)
Results in:

[[1]]
   v1 v2
1:  b  c
2:  a  b
3:  z  b
4:  b  d

[[2]]
   v1 v2
1:  z  y
2:  x  z
3:  n  z
4:  z  j
The proposed solution combines the first two data frames in the list into one data frame and removes the duplicate rows. This is repeated for the last two data frames in the list. Then, the resulting data frames are combined into a list again.
Update 2

This solution uses rbindlist from the package data.table. If you don't like this, the result can be returned as "pure" data frames like this:
library(data.table)
list(
  setDF(unique(rbindlist(ld[1:2]))),
  setDF(unique(rbindlist(ld[3:4])))
)
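For readers who would rather avoid the data.table dependency altogether, a base R sketch along the same lines might look as follows. It assumes that all data frames share the same column names; the helper name union_df is made up for this illustration and is not part of any package.

```r
# Same example data as in the answer
ld <- list(
  data.frame(v1 = c("b", "a", "z"), v2 = c("c", "b", "b")),
  data.frame(v1 = c("b", "a", "z"), v2 = c("d", "b", "b")),
  data.frame(v1 = c("z", "x", "n"), v2 = c("y", "z", "z")),
  data.frame(v1 = c("z", "x", "n"), v2 = c("j", "z", "z"))
)

# Hypothetical helper: stack data frames with rbind() and drop duplicate rows
union_df <- function(dfs) {
  out <- unique(do.call(rbind, dfs))
  rownames(out) <- NULL  # renumber rows after de-duplication
  out
}

result <- list(union_df(ld[1:2]), union_df(ld[3:4]))
result
```

The grouping is still hard-coded here, exactly as in the data.table version; only the row-binding and de-duplication steps are swapped for base R functions.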
Update 3

According to the OP's comment, there are more data frames that need to be combined in several groups.
# set up a list of vectors with the numbers of the data.frames to combine
dfs_to_combine <- list(c(1:2), c(3:4))
dfs_to_combine
[[1]]
[1] 1 2

[[2]]
[1] 3 4

# now, combine the data.frames as specified
library(data.table)
lapply(dfs_to_combine, function(x) setDF(unique(rbindlist(ld[x]))))
[[1]]
  v1 v2
1  b  c
2  a  b
3  z  b
4  b  d

[[2]]
  v1 v2
1  z  y
2  x  z
3  n  z
4  z  j
This will reproduce the initial example. If you want to combine them differently, just change the numbers, e.g.,
dfs_to_combine <- list(c(1), c(2, 4), c(3))
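If the groups are not known in advance, the iterative pairwise merge described in the question could be sketched roughly like this. The sketch assumes that two data frames "intersect" when they share at least one identical row (sharing only a single value such as z would link all four data frames in the example). merge_overlapping and row_keys are hypothetical helper names, not functions from any package.

```r
# Same example data as in the answer
ld <- list(
  data.frame(v1 = c("b", "a", "z"), v2 = c("c", "b", "b")),
  data.frame(v1 = c("b", "a", "z"), v2 = c("d", "b", "b")),
  data.frame(v1 = c("z", "x", "n"), v2 = c("y", "z", "z")),
  data.frame(v1 = c("z", "x", "n"), v2 = c("j", "z", "z"))
)

# Hypothetical helper: merge data frames that share at least one identical row,
# repeating full passes over the list until a pass makes no further merge
merge_overlapping <- function(dfs) {
  # one string key per row, so rows can be compared with intersect()
  row_keys <- function(df) do.call(paste, c(df, sep = "\t"))
  repeat {
    merged <- FALSE
    i <- 1
    while (i < length(dfs)) {
      j <- i + 1
      while (j <= length(dfs)) {
        if (length(intersect(row_keys(dfs[[i]]), row_keys(dfs[[j]]))) > 0) {
          combined <- unique(rbind(dfs[[i]], dfs[[j]]))
          rownames(combined) <- NULL
          dfs[[i]] <- combined  # save the union as element i
          dfs[[j]] <- NULL      # remove element j; the next one shifts into j
          merged <- TRUE
        } else {
          j <- j + 1
        }
      }
      i <- i + 1
    }
    if (!merged) break
  }
  dfs
}

res <- merge_overlapping(ld)
res
```

On the example list this leaves two data frames of four rows each, one containing the c and d rows and one containing the y and j rows. The pairwise comparison is quadratic in the number of data frames, which should be fine for short lists but may need a different approach for very long ones.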