Objective
I have two datasets: df1 and df2
df1
Date Name Duration
1/2/2020 Tanisha 50
1/3/2020 Lisa 10
1/5/2020 Lisa 10
df2
Date Name Duration
1/2/2020 Tanisha 80
1/3/2020 Lisa 50
1/5/2020 Tom 10
Desired output:
Date Name Duration Date Name Duration
1/2/2020 Tanisha 50 1/2/2020 Tanisha 80
1/3/2020 Lisa 10 1/3/2020 Lisa 50
I want to match the rows of df1 and df2 on both the Name column and the Date column.
dput of df1 and df2:
structure(list(Date = structure(1:3, .Label = c("1/2/2020", "1/3/2020",
"1/5/2020"), class = "factor"), Name = structure(c(2L, 1L, 1L
), .Label = c("Lisa", "Tanisha"), class = "factor"), Duration = c(50L,
10L, 10L), X = c(NA, NA, NA), X.1 = c(NA, NA, NA), X.2 = c(NA,
NA, NA), X.3 = c(NA, NA, NA)), class = "data.frame", row.names = c(NA,
-3L))
structure(list(Date = structure(1:3, .Label = c("1/2/2020", "1/3/2020",
"1/5/2020"), class = "factor"), Name = structure(c(2L, 1L, 3L
), .Label = c("lisa", "tanisha", "tom"), class = "factor"), Duration2 = c(80L,
50L, 10L)), class = "data.frame", row.names = c(NA, -3L))
What I have tried:
A horizontal merge
merge(df1, df2, all.x = TRUE)
I am not sure how to match on the contents of both Name and Date.
Thanks for your help.
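Answer 0: (score: 2)
This is a simple merge, but your Name columns are inconsistent: df1 uses "Tanisha" and "Lisa", while df2 uses "tanisha" and "lisa". Convert them to a common format (upper case, lower case, or title case) and then merge. Also, you do not need duplicated Date and Name columns in the result, since they would hold exactly the same information.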
library(dplyr)
df1 %>% mutate(Name = tolower(Name)) %>% inner_join(df2, by = c('Date', 'Name'))
Or in base R:
merge(transform(df1, Name = tolower(Name)), df2, by = c('Date', 'Name'))
# Date Name Duration Duration2
#1 1/2/2020 tanisha 50 80
#2 1/3/2020 lisa 10 50
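To make the effect of the case mismatch easier to see, here is a minimal, self-contained sketch in base R. It rebuilds the two data frames by hand (as character columns, and leaving out the all-NA columns X to X.3 from the dput, which is an assumption made for brevity) and normalises Name on both sides rather than only in df1. The helper norm_name is a hypothetical name introduced for this illustration; it is not part of the original answer.

```r
# Rebuild the data from the question as plain character/integer columns.
# Assumption: the all-NA columns X, X.1, X.2, X.3 in df1 are not needed.
df1 <- data.frame(
  Date = c("1/2/2020", "1/3/2020", "1/5/2020"),
  Name = c("Tanisha", "Lisa", "Lisa"),
  Duration = c(50L, 10L, 10L),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Date = c("1/2/2020", "1/3/2020", "1/5/2020"),
  Name = c("tanisha", "lisa", "tom"),
  Duration2 = c(80L, 50L, 10L),
  stringsAsFactors = FALSE
)

# Without normalisation the keys never match, because "Lisa" != "lisa".
raw_matches <- nrow(merge(df1, df2, by = c("Date", "Name")))

# Hypothetical helper: coerce to character, trim whitespace, lower-case.
norm_name <- function(x) tolower(trimws(as.character(x)))

df1$Name <- norm_name(df1$Name)
df2$Name <- norm_name(df2$Name)

# Inner join on both keys; only rows present in both data frames survive.
result <- merge(df1, df2, by = c("Date", "Name"))
print(result)
```

Normalising both sides means the join does not depend on which data frame happens to be lower case already. The as.character() call is there so the same helper also works when Name is a factor, as it is in the dput output above.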