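My challenge turned out to be bigger than I realized in this earlier question: Merge contents from three dataframes into one column, hence this new question. I have the following three data frames.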
df1 <- data.frame(c("A", "B", "C", "D"),
c("text1", "texta", "textk", "textx"),
c("texti", "textI", "texti", "textI"))
names(df1) <- c('dummy_1', 'dummy_2', 'dummy_3')
df2 <- data.frame(c("A", "B", "C", "D"),
c("text2", "textb", "textl", "texty"),
c("textii", "textII", "textii", "textII"))
names(df2) <- c('dummy_1', 'dummy_2', 'dummy_3')
df3 <- data.frame(c("A", "B", "C", "D"),
c("text3", "textc", "textm", "textz"),
c("textiii", "textIII", "textiii", "textIII"))
names(df3) <- c('dummy_1', 'dummy_2', 'dummy_3')
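How can I merge the text of every dummy_2 column and every dummy_3 column from the data frames df1, df2 and df3 into one column each, separated by " \n "? The ideal result would be this data.frame: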
dummy_1 dummy2_merge dummy3_merge
A text1 \n text2 \n text3 texti \n textii \n textiii
B texta \n textb \n textc textI \n textII \n textIII
C textk \n textl \n textm texti \n textii \n textiii
D textx \n texty \n textz textI \n textII \n textIII
Thanks for any suggestions.
Answer 0 (score: 3)
Use data.table to join and update the columns in place:
library(data.table)
setDT(df1);setDT(df2);setDT(df3)
df1[df2, on = .(dummy_1), `:=` (dummy_2 = paste0(dummy_2, ' \n ', i.dummy_2),
dummy_3 = paste0(dummy_3, ' \n ', i.dummy_3))][]
df1[df3, on = .(dummy_1), `:=` (dummy_2 = paste0(dummy_2, ' \n ', i.dummy_2),
dummy_3 = paste0(dummy_3, ' \n ', i.dummy_3))][]
which yields
dummy_1 dummy_2 dummy_3
1: A text1 \n text2 \n text3 texti \n textii \n textiii
2: B texta \n textb \n textc textI \n textII \n textIII
3: C textk \n textl \n textm texti \n textii \n textiii
4: D textx \n texty \n textz textI \n textII \n textIII
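Note that setDT() and := work by reference, so after the code above df1 itself holds the merged text. If the inputs should stay untouched, the same update join can be run on a copy. This is only a minimal sketch: res and other are illustrative names, and the final rename simply follows the column names asked for in the question.

```r
library(data.table)

# Sample data from the question
df1 <- data.frame(dummy_1 = c("A", "B", "C", "D"),
                  dummy_2 = c("text1", "texta", "textk", "textx"),
                  dummy_3 = c("texti", "textI", "texti", "textI"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(dummy_1 = c("A", "B", "C", "D"),
                  dummy_2 = c("text2", "textb", "textl", "texty"),
                  dummy_3 = c("textii", "textII", "textii", "textII"),
                  stringsAsFactors = FALSE)
df3 <- data.frame(dummy_1 = c("A", "B", "C", "D"),
                  dummy_2 = c("text3", "textc", "textm", "textz"),
                  dummy_3 = c("textiii", "textIII", "textiii", "textIII"),
                  stringsAsFactors = FALSE)

# as.data.table() returns a copy, so df1 stays a plain data.frame
res <- as.data.table(df1)

# Apply the same update join once per additional data frame
for (other in list(df2, df3)) {
  other <- as.data.table(other)
  res[other, on = "dummy_1",
      `:=`(dummy_2 = paste0(dummy_2, " \n ", i.dummy_2),
           dummy_3 = paste0(dummy_3, " \n ", i.dummy_3))]
}

# Rename to the column names used in the question's desired output
setnames(res, c("dummy_2", "dummy_3"), c("dummy2_merge", "dummy3_merge"))
res
```

The loop also extends to more than three data frames by adding them to the list.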
Answer 1 (score: 2)
You can use dplyr and tidyr:
library(dplyr)
library(tidyr)
# combine into one data frame based on dummy_1
df <- left_join(left_join(df1, df2, by = 'dummy_1'), df3, by = 'dummy_1')
# unite columns that have dummy_2 in their column name
df <- df %>% unite('dummy2_merge', grep('dummy_2', colnames(df), value = TRUE), sep = ' \n ')
# unite columns that have dummy_3 in their column name
df <- df %>% unite('dummy3_merge', grep('dummy_3', colnames(df), value = TRUE), sep = ' \n ')
Hope this helps.
Answer 2 (score: 2)
An idea using base R could be:
d1 <- Reduce(function(...)merge(..., by = 'dummy_1'), list(df1, df2, df3))
sapply(unique(sub('\\..*', '', names(d1))), function(i)
do.call(paste, c(d1[grepl(i, names(d1))], sep = ' \n ')))
# dummy_1 dummy_2 dummy_3
#[1,] "A" "text1 \n text2 \n text3" "texti \n textii \n textiii"
#[2,] "B" "texta \n textb \n textc" "textI \n textII \n textIII"
#[3,] "C" "textk \n textl \n textm" "texti \n textii \n textiii"
#[4,] "D" "textx \n texty \n textz" "textI \n textII \n textIII"
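As the printed output shows, sapply() returns a character matrix here rather than a data.frame. If a data.frame with the column names from the question is needed, the matrix can be converted afterwards. A minimal sketch, where m and res are illustrative names not used in the answer:

```r
# Sample data from the question
df1 <- data.frame(dummy_1 = c("A", "B", "C", "D"),
                  dummy_2 = c("text1", "texta", "textk", "textx"),
                  dummy_3 = c("texti", "textI", "texti", "textI"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(dummy_1 = c("A", "B", "C", "D"),
                  dummy_2 = c("text2", "textb", "textl", "texty"),
                  dummy_3 = c("textii", "textII", "textii", "textII"),
                  stringsAsFactors = FALSE)
df3 <- data.frame(dummy_1 = c("A", "B", "C", "D"),
                  dummy_2 = c("text3", "textc", "textm", "textz"),
                  dummy_3 = c("textiii", "textIII", "textiii", "textIII"),
                  stringsAsFactors = FALSE)

# Same steps as in the answer: merge on dummy_1, then paste columns
# that share a base name (suffixes like .x and .y are stripped first)
d1 <- Reduce(function(...) merge(..., by = "dummy_1"), list(df1, df2, df3))
m <- sapply(unique(sub("\\..*", "", names(d1))), function(i)
  do.call(paste, c(d1[grepl(i, names(d1))], sep = " \n ")))

# Convert the character matrix to a data.frame and rename the merged columns
res <- as.data.frame(m, stringsAsFactors = FALSE)
names(res) <- c("dummy_1", "dummy2_merge", "dummy3_merge")
res
```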
Answer 3 (score: 2)
Here is an option using the tidyverse:
library(tidyverse)
mget(paste0("df", 1:3)) %>%
reduce(left_join, by = 'dummy_1') %>%
split.default(str_remove(names(.), '\\..*$')) %>%
map_dfc(~ .x %>%
unite(!!rlang::sym(names(.)[length(.x)]),
!!! rlang::syms(names(.)), sep=" \n "))
# A tibble: 4 x 3
# dummy_1 dummy_2 dummy_3
# <chr> <chr> <chr>
#1 A "text1 \n text2 \n text3" "texti \n textii \n textiii"
#2 B "texta \n textb \n textc" "textI \n textII \n textIII"
#3 C "textk \n textl \n textm" "texti \n textii \n textiii"
#4 D "textx \n texty \n textz" "textI \n textII \n textIII"