I have a data frame, say prod_score:
product score
a 1
d 2
ff 2
e 3
fvf 1
I have another data frame, prod_rank, with the same products and their ranks:
product rank
a 11
d 4
ff 1
e 5
fvf 9
Just to clarify: I have many observations, which is why I am only showing sample data.
First I filter all products with a score of 2:
library(dplyr)
prod_scr_2 <- prod_score %>% filter(score == 2)
Now I want to take the products in prod_scr_2 and update their scores based on the prod_rank data frame.
I used a join:
decision_tbl <- inner_join(prod_scr_2, prod_rank, by = "product") %>%
top_n(2,desc(rank))
Now I am looking at decision_tbl$product and want to update the score of only the highest-ranked product.
I used match to do this:
prods2update_idx <- match(decision_tbl$product, prod_score$product)
Given the matched indices, I am trying to update the prod_score data frame. How should I do that?
Answer 0 (score: 1)
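Assume the score of interest is 2 (as in your example) and that the updated score for the highest-ranked product is 100. Both values can be changed.
Here is a dplyr solution, since I see you have already started using that package: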
library(dplyr)
prod_score = read.table(text = "
product score
a 1
d 2
ff 2
e 3
fvf 1
", header = TRUE, stringsAsFactors = FALSE)
prod_rank = read.table(text = "
product rank
a 11
d 4
ff 1
e 5
fvf 9
", header = TRUE, stringsAsFactors = FALSE)
prod_score %>%
filter(score == 2) %>% # select products with score = 2
inner_join(prod_rank, by = "product") %>% # join to get ranks
filter(rank == max(rank)) %>% # keep product(s) with maximum ranks
rename(given_score = score) %>% # change column name (for the next join)
right_join(prod_score, by = "product") %>% # join to get scores
mutate(score = ifelse(!is.na(rank), 100, score)) %>% # update score when there's a rank value
select(-given_score, -rank) # remove unnecessary columns
# product score
# 1 a 1
# 2 d 100
# 3 ff 2
# 4 e 3
# 5 fvf 1
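The same idea can also be sketched with a single left_join, which keeps the rows of prod_score in their original order. Depending on the dplyr version, right_join may return rows in a different order than the output shown above. The names target_score, new_score, is_top and updated are made up for this illustration, and the sketch assumes each product appears only once in prod_rank.

```r
library(dplyr)

# Sample data from the question, rebuilt so the sketch is self-contained
prod_score <- data.frame(
  product = c("a", "d", "ff", "e", "fvf"),
  score   = c(1, 2, 2, 3, 1),
  stringsAsFactors = FALSE
)
prod_rank <- data.frame(
  product = c("a", "d", "ff", "e", "fvf"),
  rank    = c(11, 4, 1, 5, 9),
  stringsAsFactors = FALSE
)

target_score <- 2    # assumption: the score of interest
new_score    <- 100  # assumption: the replacement value

updated <- prod_score %>%
  left_join(prod_rank, by = "product") %>%   # attach ranks, keep original row order
  mutate(
    # TRUE only for rows with the target score whose rank is the
    # largest among all rows with that score
    is_top = score == target_score &
      rank == max(rank[score == target_score], na.rm = TRUE),
    # which() drops NA, so products without a rank are left untouched
    score = replace(score, which(is_top), new_score)
  ) %>%
  select(-rank, -is_top)

updated
```

If no product has the target score, max() is called on an empty vector and emits a warning, so a real pipeline would probably want to check for that case first.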
An alternative in base R. Remember to rebuild the initial example dataset first:
# get products with score = 2
prds_score_2 <- prod_score$product[prod_score$score == 2]
# get ranks for those products
prds_score_2_ranks <- prod_rank[prod_rank$product %in% prds_score_2, ]
# keep products with maximum rank to update
prds_to_update <- prds_score_2_ranks$product[prds_score_2_ranks$rank == max(prds_score_2_ranks$rank)]
# update values for those products in your initial table
prod_score$score[prod_score$product %in% prds_to_update] <- 100
# see the updates
prod_score
# product score
# 1 a 1
# 2 d 100
# 3 ff 2
# 4 e 3
# 5 fvf 1
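Since the question specifically asks how to update by the indices returned from match, here is a minimal base R sketch of that route. It assumes product names are unique in both data frames, because match only returns the first position of each value. The variable names candidates, cand_ranks, to_update and idx are made up for this illustration.

```r
# Sample data from the question, rebuilt so the sketch is self-contained
prod_score <- data.frame(
  product = c("a", "d", "ff", "e", "fvf"),
  score   = c(1, 2, 2, 3, 1),
  stringsAsFactors = FALSE
)
prod_rank <- data.frame(
  product = c("a", "d", "ff", "e", "fvf"),
  rank    = c(11, 4, 1, 5, 9),
  stringsAsFactors = FALSE
)

# products with the score of interest
candidates <- prod_score$product[prod_score$score == 2]

# ranks of those products, aligned by name via match()
cand_ranks <- prod_rank$rank[match(candidates, prod_rank$product)]

# product(s) holding the maximum rank among the candidates
to_update <- candidates[!is.na(cand_ranks) &
                          cand_ranks == max(cand_ranks, na.rm = TRUE)]

# row positions of those products in prod_score; drop unmatched (NA) entries
idx <- match(to_update, prod_score$product)
idx <- idx[!is.na(idx)]

# update in place using the matched indices
prod_score$score[idx] <- 100

prod_score
```

Dropping the NA entries from idx matters because assigning through an index vector that contains NA raises an error in R when more than one position is involved.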