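I have two data frames in R, shown below. I need to add a new column (count_orders) to df1 that contains the number of orders each buyer has in df2 (that is, how many times each buyer appears in df2). Please help.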
> df1
  buyer city
1     A   xx
2     B   yy
3     C   zz
> df2
  order buyer item
1     1     A    1
2     2     A    2
3     3     B    1
4     4     A    2
5     5     B    1
6     6     C    3
7     7     C    4
Expected output:
> df1
  buyer city count_orders
1     A   xx            3
2     B   yy            2
3     C   zz            2
Answer 0 (score: 3)
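Here is a possible data.table solution that performs a binary join between df1 and df2 while using by = .EACHI, so that .N is computed separately for each row of df1: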
library(data.table)
setkey(setDT(df2), buyer)
df2[df1, list(city, count_orders = .N), by = .EACHI]
# buyer city count_orders
# 1: A xx 3
# 2: B yy 2
# 3: C zz 2
An alternative (a modification of @nicolas's comment) is to update df1 by reference:
library(data.table)
setkey(setDT(df1), buyer)
df1[setDT(df2)[, .N, keyby = buyer], count_orders := i.N]
Answer 1 (score: 2)
You can try:
df1$count_orders <- as.vector(table(df2$buyer)[as.character(df1$buyer)])
# buyer city count_orders
#1 A xx 3
#2 B yy 2
#3 C zz 2
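To see why this one-liner works: table(df2$buyer) returns a named vector of counts, and indexing it with as.character(df1$buyer) looks up each buyer by name. Below is a minimal self-contained sketch in base R. The data frames are rebuilt from the question, and stringsAsFactors = FALSE is my assumption about the column types. The final step is my addition, not part of the answer: a buyer in df1 with no orders in df2 would come back as NA from the lookup, so it is replaced with 0.

```r
# Rebuild the example data from the question (column types are an assumption)
df1 <- data.frame(buyer = c("A", "B", "C"),
                  city  = c("xx", "yy", "zz"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(order = 1:7,
                  buyer = c("A", "A", "B", "A", "B", "C", "C"),
                  item  = c(1, 2, 1, 2, 1, 3, 4),
                  stringsAsFactors = FALSE)

# Named counts of orders per buyer
counts <- table(df2$buyer)

# Look up each df1 buyer by name; as.vector() drops the names
df1$count_orders <- as.vector(counts[as.character(df1$buyer)])

# Addition to the answer: buyers missing from df2 are treated as zero orders
df1$count_orders[is.na(df1$count_orders)] <- 0L

print(df1)
```

With the example data every buyer has at least one order, so the NA replacement changes nothing here.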
Answer 2 (score: 1)
Here is a dplyr approach:
library(dplyr)
count(df2, buyer) %>% right_join(df1, "buyer")
#Source: local data frame [3 x 3]
#
# buyer n city
#1 A 3 xx
#2 B 2 yy
#3 C 2 zz
You can also use count(df2, buyer) %>% right_join(df1) and let dplyr work out the join column on its own (in this case, "buyer").
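Note that the dplyr result names the count column n and places it before city, which differs from the expected output in the question. The following is a sketch of one way to match that output, assuming a reasonably recent dplyr in which count() accepts a name argument. The data frames are rebuilt from the question, and the coalesce() step for buyers without any orders is my addition rather than part of the answer.

```r
library(dplyr)

# Rebuild the example data from the question (column types are an assumption)
df1 <- data.frame(buyer = c("A", "B", "C"),
                  city  = c("xx", "yy", "zz"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(order = 1:7,
                  buyer = c("A", "A", "B", "A", "B", "C", "C"),
                  item  = c(1, 2, 1, 2, 1, 3, 4),
                  stringsAsFactors = FALSE)

result <- df1 %>%
  # Start from df1 so its row and column order is preserved
  left_join(count(df2, buyer, name = "count_orders"), by = "buyer") %>%
  # Buyers with no orders get NA from the join; report them as 0 instead
  mutate(count_orders = coalesce(count_orders, 0L))

print(result)
```

Starting from df1 and using left_join() keeps buyer and city first, so count_orders ends up as the last column.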