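I've been teaching myself R from scratch, mostly by trying things, reading posts here, and trial and error based on that. Sometimes I hit a wall and reach out.
I've hit a wall. I have dplyr 0.7 installed. I have a tibble with a column - call it contract_key
- that I added by applying mutate(coalesce()) to three other columns in the tibble. Here is some sample data: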
library(dplyr)
# The 8th product is a real missing value (NA), not the string "NA"; otherwise coalesce() would not skip it
product <- c("655393265191","655393265191","168145850127","168145850127","350468621217","350468621217","977939797847",NA,"928893912852")
supplier <- c("person5","person3","person10","person5","person11","person5","person11","person14","person5")
vendor <- c("org2","org3","org3","org2","org1","org2","org1","org5","org2")
quantity <- c(7,5,6,1,2,1,18,2,2)
gross <- c(0.0419,0.0193,0.0439,0.0069,0.0027,0.0055,0.0233,NA,0.0004)
# tibble() replaces data_frame(), which is deprecated in current tibble releases
df <- tibble(product,supplier,vendor,quantity,gross)
Here is how I generate contract_key:
df <- df %>%
  mutate(contract_key = coalesce(product, supplier, vendor))
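I now want to add another column that classifies the contents of contract_key according to which of the three columns supplied the value (via coalesce()). So if contract_key = "person5", for example, the new column contract_level would be "supplier". And contract_key = "org2" would map to contract_level = "vendor", and so on.
Basically, I'm going to use contract_level as a join variable for another tibble.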
I'm stumped. I've tried if_else, and I read that I shouldn't bother with case_when (because it's inside mutate()). I've also tried nested if_else, to no avail.
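It's probably basic R syntax that I don't know, something to do with dot notation. If someone provides an answer, I'll work backwards until I figure out what you did. (And I'll have learned a new lesson in R!)
Thanks!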
Answer 0: (score: 2)
How about this:
df %>% mutate(contract_key = coalesce(product, supplier, vendor),
              # Test each row's own values, in the same order coalesce() uses;
              # `contract_key %in% product` would compare against the whole column.
              contract_level = case_when(!is.na(product)  ~ "product",
                                         !is.na(supplier) ~ "supplier",
                                         !is.na(vendor)   ~ "vendor",
                                         TRUE             ~ "none"))
  product      supplier vendor quantity  gross contract_key contract_level
1 655393265191 person5  org2          7 0.0419 655393265191 product
2 655393265191 person3  org3          5 0.0193 655393265191 product
3 168145850127 person10 org3          6 0.0439 168145850127 product
4 168145850127 person5  org2          1 0.0069 168145850127 product
5 350468621217 person11 org1          2 0.0027 350468621217 product
6 350468621217 person5  org2          1 0.0055 350468621217 product
7 977939797847 person11 org1         18 0.0233 977939797847 product
8 <NA>         person14 org5          2     NA person14     supplier
9 928893912852 person5  org2          2 0.0004 928893912852 product
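One thing worth checking with any version of this: a membership test with %in% looks at the whole column, while coalesce() works row by row. The two agree on the sample data above, but they can disagree when the same value shows up in more than one column. A minimal sketch with made-up data (toy, a, b, key, by_membership and by_row are hypothetical names, not part of the question):

```r
library(dplyr)

# Hypothetical toy data: "x1" is in column a on row 1 and in column b on row 2
toy <- tibble(a = c("x1", NA), b = c("b1", "x1"))

toy <- toy %>%
  mutate(key = coalesce(a, b),
         # Whole-column membership: row 2's key "x1" is found somewhere in a
         by_membership = case_when(key %in% a ~ "a",
                                   key %in% b ~ "b"),
         # Row-wise test: which column is non-missing on this row
         by_row = case_when(!is.na(a) ~ "a",
                            !is.na(b) ~ "b"))

toy$by_membership  # "a" "a"  (row 2 is mislabelled)
toy$by_row         # "a" "b"
```

If the three source columns can never share a value, either test gives the same labels.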
Other options that need less code:
# Option 1: nested if_else(); a row where all three columns are NA falls through to 'vendor'
df %>% mutate(contract_key = coalesce(product, supplier, vendor),
              contract_level = if_else(!is.na(product), 'product',
                                       if_else(!is.na(supplier), 'supplier', 'vendor')))
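# Option 2: row-wise apply(); `.` is the data frame piped in by %>%.
# Only the three source columns are scanned, so quantity or gross can never be picked.
df %>% mutate(contract_key = coalesce(product, supplier, vendor),
              contract_level = apply(select(., product, supplier, vendor), 1,
                                     function(x) c("product", "supplier", "vendor")[which(!is.na(x))[1]]))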