Change a column value only when two other columns are duplicated

Time: 2017-03-01 21:36:04

Tags: r

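I'm having a hard time figuring this out in R.

Here is what I want to do.

In a data frame like the one below, if both Name and Class are duplicated, I want to add up the scores of the matching rows; if not, the row should stay as it is.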

+------------------+-----------+-------+
|       Name       |   Class   | Score |
+------------------+-----------+-------+
| Sara             | Sophomore |    10 |
| John             | Freshman  |    20 |
| Taylor           | Sophomore |    30 |
| Tyler            | Junior    |    10 |
| Keith            | Junior    |    20 |
| Andrew           | Senior    |    30 |
| Victor           | Senior    |    10 |
| Nancy            | Sophomore |    20 |
| Taylor           | Junior    |    30 |
| John             | Senior    |    10 |
| Victor           | Freshman  |    20 |
| Sara             | Sophomore |    30 |
| John             | Freshman  |    10 |
| Taylor           | Sophomore |    20 |
| John             | Senior    |    30 |
+------------------+-----------+-------+

Basically, the end result should look like this:

+--------+-----------+-------+
|  Name  |   Class   | Score |
+--------+-----------+-------+
| Sara   | Sophomore |    40 |
| John   | Freshman  |    30 |
| Taylor | Sophomore |    50 |
| Tyler  | Junior    |    10 |
| Keith  | Junior    |    20 |
| Andrew | Senior    |    30 |
| Victor | Senior    |    10 |
| Nancy  | Sophomore |    20 |
| Taylor | Junior    |    30 |
| John   | Senior    |    40 |
| Victor | Freshman  |    20 |
+--------+-----------+-------+

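As you can see, if Name is the only duplicated value, nothing is combined across those rows (for example, John Freshman and John Senior stay separate). If Class is the only duplicated value, nothing changes either. Both columns must be duplicated in a row for its scores to be summed.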

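My attempt is below, but it does not work and I get this error message:

Error in if ((experiment[i, 1] == experiment[j, 1]) & (experiment[i, 2] ==  : missing value where TRUE/FALSE needed

My code: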

# creating an empty data frame
experiment1 <- data.frame(matrix(ncol=3, nrow=15))

for(i in 1: nrow(experiment)){
  for(j in i+1: nrow(experiment)){
    if((experiment[i,1] == experiment[j,1]) & (experiment[i,2] == experiment[j,2])){
      experiment1[i,1] <- experiment[i,1]
      experiment1[i,2] <- experiment[i,2]
      experiment1[i,3] <- experiment[i,3] + experiment[j,3]
    } else {
      experiment1[i,1] <- experiment[i,1]
      experiment1[i,2] <- experiment[i,2]
      experiment1[i,3] <- experiment[i,3]
    }
  }
}

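Can anyone help fix my code or come up with more "elegant" code?

1 answer:

Answer 0 (score: 2)

Aggregation is one of the first things explained in any basic R tutorial; I suggest you work through a few.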

Base R

aggregate(Score ~ Name + Class, data = mydf, FUN = sum)
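
To make the base R call concrete, here is a minimal sketch that rebuilds the question's table and aggregates it. The data frame name `mydf` follows the answer; the `result` variable and the way the data frame is constructed are assumptions for illustration. The formula is passed unnamed, which should work across R versions.

```r
# Rebuild the question's data (construction is illustrative, not from the original post).
mydf <- data.frame(
  Name  = c("Sara", "John", "Taylor", "Tyler", "Keith", "Andrew", "Victor", "Nancy",
            "Taylor", "John", "Victor", "Sara", "John", "Taylor", "John"),
  Class = c("Sophomore", "Freshman", "Sophomore", "Junior", "Junior", "Senior", "Senior",
            "Sophomore", "Junior", "Senior", "Freshman", "Sophomore", "Freshman",
            "Sophomore", "Senior"),
  Score = c(10, 20, 30, 10, 20, 30, 10, 20, 30, 10, 20, 30, 10, 20, 30),
  stringsAsFactors = FALSE
)

# Sum Score within each (Name, Class) pair; pairs that occur only once keep their score.
result <- aggregate(Score ~ Name + Class, data = mydf, FUN = sum)
print(result)
```

The rows of `result` come back grouped rather than in the original row order, so compare by Name and Class rather than by position.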

dplyr

mydf %>% group_by(Name, Class) %>% summarize(scoreSum = sum(Score))

data.table

setDT(mydf)[ , .(scoreSum = sum(Score)), by = .(Name, Class)]
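
As for the error in the question's loop, a likely cause is operator precedence: in R, `:` binds more tightly than `+`, so `i+1: nrow(experiment)` is evaluated as `i + (1:nrow(experiment))` and `j` runs past the last row. Indexing past the last row gives `NA`, and `if` cannot branch on `NA`. The sketch below demonstrates this with small stand-in values; `n`, `wrong`, `right`, and `df` are hypothetical names, not from the question.

```r
n <- 15   # stands in for nrow(experiment)
i <- 3

wrong <- i + 1:n      # what `i+1: n` actually evaluates to: 4, 5, ..., 18
right <- (i + 1):n    # what was presumably intended: 4, 5, ..., 15

print(range(wrong))
print(range(right))

# Indexing a data frame past its last row yields NA, so the comparison is NA too,
# and `if (NA)` stops with "missing value where TRUE/FALSE needed".
df <- data.frame(x = c("a", "b"), stringsAsFactors = FALSE)
print(df[3, 1])
print(df[1, 1] == df[3, 1])
```

Note that `(i + 1):n` still counts downward when `i` equals `n`, so a loop written this way would also need the outer index to stop at `n - 1`. The aggregation approaches above avoid the problem entirely.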