I have two data.frames. I want to fill the "custMeanPrice" column of the data.frame "d" with the corresponding "meanPrice" values from the data.frame "att.customer", matched by "customerID".
> head(d)
customerID custMeanPrice
1 794 0
2 794 0
3 794 0
4 808 0
5 825 0
6 825 0
> dim(d)
[1] 428165 2
> head(att.customer)
customerID meanPrice
1 794 68.91000
2 808 39.90000
3 825 79.34444
4 850 76.18571
5 860 93.72353
6 873 69.90000
> dim(att.customer)
[1] 49870 2
I tried the following loop, but because the data are so large it is very slow, and I have not seen it finish.
for(i in 1:nrow(att.customer)){
k = which(d$customerID == att.customer$customerID[i])
d$custMeanPrice[k] = att.customer$meanPrice[i]
}
What is the best way to do this quickly and efficiently?
Answer 0: (score: 3)
You can try data.table:
library(data.table)
setDT(d)
setkey(setDT(att.customer), customerID)
att.customer[d][,custMeanPrice:=NULL][]
# customerID meanPrice
#1: 794 68.91000
#2: 794 68.91000
#3: 794 68.91000
#4: 808 39.90000
#5: 825 79.34444
#6: 825 79.34444
As @David Arenburg pointed out in the comments, the above can also be done without converting att.customer to a data.table:
setkey(setDT(d), customerID)[, custMeanPrice := as.numeric(custMeanPrice)]
d[att.customer, custMeanPrice := meanPrice][]
# customerID custMeanPrice
#1: 794 68.91000
#2: 794 68.91000
#3: 794 68.91000
#4: 808 39.90000
#5: 825 79.34444
#6: 825 79.34444
Or, if custMeanPrice is already of class numeric:
setkey(setDT(d), customerID)
d[att.customer, custMeanPrice := meanPrice][]
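The same update join can also be written without setting a key, by naming the join column in the `on=` argument (available in data.table 1.9.6 and later). The following is a minimal sketch on toy data modeled on the example above, not a benchmark; it assumes `custMeanPrice` is already numeric.

```r
library(data.table)

# Toy data modeled on the question (values are illustrative).
d <- data.table(customerID = c(794L, 794L, 808L, 825L),
                custMeanPrice = 0)   # numeric column, so no conversion is needed
att.customer <- data.frame(customerID = c(794L, 808L, 825L),
                           meanPrice = c(68.91, 39.90, 79.34444))

# Update join: for each row of d whose customerID appears in att.customer,
# assign the matching meanPrice to custMeanPrice by reference.
d[att.customer, custMeanPrice := meanPrice, on = "customerID"]
d
```

Because `:=` modifies `d` in place, no copy of the large table is made; rows of `d` without a match in `att.customer` are left unchanged.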
Or you can use match from base R:
d$custMeanPrice <- att.customer$meanPrice[match(d$customerID,
att.customer$customerID)]
d
# customerID custMeanPrice
#1 794 68.91000
#2 794 68.91000
#3 794 68.91000
#4 808 39.90000
#5 825 79.34444
#6 825 79.34444
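Note that `match` returns `NA` for any `customerID` in `d` that does not occur in `att.customer`, so the one-line assignment above would set those rows to `NA`. If the existing values should be kept instead, a variant along these lines may help. It is a sketch on toy data; the unmatched ID 999 is an assumption added for illustration.

```r
# Toy data modeled on the question, plus one ID (999) with no match.
d <- data.frame(customerID = c(794L, 794L, 808L, 999L),
                custMeanPrice = 0)
att.customer <- data.frame(customerID = c(794L, 808L, 825L),
                           meanPrice = c(68.91, 39.90, 79.34444))

# Position of each d$customerID in the lookup table; NA where there is no match.
idx <- match(d$customerID, att.customer$customerID)

# Overwrite only the rows that found a match; unmatched rows keep their old value.
found <- !is.na(idx)
d$custMeanPrice[found] <- att.customer$meanPrice[idx[found]]
d
```

Here `match` replaces the repeated `which(d$customerID == ...)` scans of the original loop with a single vectorized lookup.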
Data used in the examples:
d <- structure(list(customerID = c(794L, 794L, 794L, 808L, 825L, 825L
), custMeanPrice = c(0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("customerID",
"custMeanPrice"), class = "data.frame", row.names = c("1", "2",
"3", "4", "5", "6"))
att.customer <- structure(list(customerID = c(794L, 808L, 825L, 850L, 860L, 873L
), meanPrice = c(68.91, 39.9, 79.34444, 76.18571, 93.72353, 69.9
)), .Names = c("customerID", "meanPrice"), class = "data.frame", row.names =
c("1", "2", "3", "4", "5", "6"))