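Apologies if this is a duplicate. I've looked through a lot of answers but haven't found one that really addresses what I'm trying to do.
I have a dataset with repeated names that don't always have an account number assigned. For example: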
df <- data.frame(Name = c("Hilton", "Comcast", "Comcast", "Comcast", "Froyos", "Froyos", "BigFive"),
Account = c("123", "456", NA, NA, "789", NA, "111"))
df
     Name Account
1  Hilton     123
2 Comcast     456
3 Comcast    <NA>
4 Comcast    <NA>
5  Froyos     789
6  Froyos    <NA>
7 BigFive     111
I'd like to match on the names to fill in the corresponding account numbers, so the result looks like this:
     Name Account
1  Hilton     123
2 Comcast     456
3 Comcast     456
4 Comcast     456
5  Froyos     789
6  Froyos     789
7 BigFive     111
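After making sure all the classes match, I tried building a separate list and using ifelse and %in%, but the correct value doesn't get assigned to each name. My code is below: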
library(dplyr)
df$Name <- as.character(df$Name)
df$Account <- as.numeric(as.character(df$Account))
# Keep only the rows that already have an account number
# (df2 inherits the converted column types from df)
df2 <- df %>%
  filter(as.numeric(Account) > 0)
# Attempt: pull the account from df2 for every name that appears there
df3 <- within(df, {New = ifelse(df$Name %in% df2$Name,
                                df2$Account, NA)})
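I feel like this should be simple, but I'm having trouble figuring out how to phrase the problem well enough to find the right approach. Any help or pointers would be much appreciated.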
Answer 0 (score: 1)
Note stringsAsFactors = FALSE, which keeps Name and Account as character vectors rather than factors:
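library(dplyr)

df <- data.frame(Name = c("Hilton", "Comcast", "Comcast", "Comcast", "Froyos", "Froyos", "BigFive"),
                 Account = c("123", "456", NA, NA, "789", NA, "111"), stringsAsFactors = FALSE)

# Within each Name group, set Account to the group's largest non-missing value
# (here each group has exactly one, so that value is copied to the NA rows)
df %>%
  group_by(Name) %>%
  mutate(Account = max(Account, na.rm = TRUE)) %>%
  ungroup()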
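For comparison, here is a base R sketch of the lookup the question was attempting. As far as I can tell, the ifelse attempt goes wrong because df2$Account is shorter than df and simply gets recycled row by row, so nothing ties a value to its name. Using match() to get the position of each name in the lookup table avoids that. The names known, idx and filled are made up for this sketch, and it assumes each name has at most one distinct account number.

```r
# Example data from the question
df <- data.frame(
  Name = c("Hilton", "Comcast", "Comcast", "Comcast", "Froyos", "Froyos", "BigFive"),
  Account = c("123", "456", NA, NA, "789", NA, "111"),
  stringsAsFactors = FALSE
)

# Lookup table: only the rows that already carry an account number
known <- df[!is.na(df$Account), ]

# For each name in df, the row of its first occurrence in the lookup table
# (NA if the name never has an account)
idx <- match(df$Name, known$Name)

# Keep existing accounts and fill the gaps from the lookup table
filled <- df
filled$Account <- ifelse(is.na(df$Account), known$Account[idx], df$Account)
filled
```

A name that never appears with an account stays NA here, since match() returns NA for it.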