How do I sum one column based on another column?

Time: 2018-10-18 21:28:46

Tags: r

Suppose the data frame is stored as fruit and has the following format:

State           Fruit Category         Fruit Type         Gross Value
ACT             CitrusFruit            Mandarins          $4,500,000
ACT             CitrusFruit            Oranges            
NSW             PomeFruit              Apple              $139,130,203.50
NSW             Grapes                 Wine Production    $50,000,000
NSW             OrchardStoneFruit      Avocados           $10,031,123
QLD             CitrusFruit            Oranges

How can I sum the gross value by state while excluding the blank values? The values should be added up into a single total per state, not shown separately for CitrusFruit, PomeFruit, and so on.

I tried using the following, to no avail:

library(plyr)
counts

Any help would be greatly appreciated.

Edit: I tried the following approach:

library(dplyr)
fruit %>% 
  group_by(State) %>% 
  summarise(Gross = sum(Gross))

However, I get an error message:

Evaluation Error: 'sum' not meaningful for factors.

Edit: output of dput(fruit)

2 answers:

Answer 0: (score: 2)

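There are a few problems here:

  • Your data has no Gross Value column; it has Gross.Value.
  • That column is a factor, which is a more storage-efficient form of string. Neither a factor nor a character vector can be summed. R knows nothing about accounting, so "$" means nothing to it in this context.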
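To make the two points above concrete, here is a minimal base R sketch. The three-row `fruit` sample is made up from rows shown in the question, and `stringsAsFactors = TRUE` is an assumption that mimics how R versions before 4.0 built text columns by default; the real data may have been loaded differently.

```r
# Hypothetical sample (not the asker's real data).
# stringsAsFactors = TRUE is an assumption mimicking the pre-4.0 default.
fruit <- data.frame(
  State = c("ACT", "ACT", "NSW"),
  `Gross Value` = c("$4,500,000", "", "$139,130,203.50"),
  stringsAsFactors = TRUE
)

# data.frame() makes names syntactic by default, so the space becomes a dot
col_names <- names(fruit)
col_class <- class(fruit$Gross.Value)

# sum() refuses factors; capture the error message instead of stopping
err <- tryCatch(sum(fruit$Gross.Value), error = function(e) conditionMessage(e))

# as.numeric() applied directly to a factor returns its internal level codes,
# not the dollar amounts
codes <- as.numeric(fruit$Gross.Value)

# Converting to character first and stripping everything except digits and
# the decimal point recovers the amounts; blanks become NA
values <- as.numeric(gsub("[^0-9.]", "", as.character(fruit$Gross.Value)))
```

Comparing `codes` with `values` shows why the conversion has to go through `as.character()` before `as.numeric()`.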

Try this:

library(dplyr)
someData %>%
  mutate(Gross.Value = as.numeric(gsub("[^0-9.]", "", as.character(Gross.Value)))) %>%
  group_by(State) %>%
  summarize(Gross.Value = sum(Gross.Value, na.rm=TRUE))
# # A tibble: 8 x 2
#   State Gross.Value
#   <fct>       <dbl>
# 1 ACT            0 
# 2 NSW    564400574.
# 3 NT      20133040.
# 4 QLD   1053007677.
# 5 SA     691850721.
# 6 TAS    112902970.
# 7 VIC   1069102796.
# 8 WA     281014929.

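As in my comments, the only changes are (1) using the correct column name and (2) adding na.rm=TRUE, because you have a lot of blanks. This means you need to be careful with this data: rows with blank values contribute nothing to their state's total, so your summary now contains bias and inaccuracies.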
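One way to see how much the blanks affect each total is to report the number of missing values next to the sum. This is only a sketch: the six-row `fruit` sample is hypothetical (copied from the rows shown in the question, with factor columns assumed), and the output column names `total`, `n_rows`, and `n_missing` are illustrative.

```r
library(dplyr)

# Hypothetical sample mirroring the rows shown in the question
fruit <- data.frame(
  State = c("ACT", "ACT", "NSW", "NSW", "NSW", "QLD"),
  Gross.Value = c("$4,500,000", "", "$139,130,203.50",
                  "$50,000,000", "$10,031,123", ""),
  stringsAsFactors = TRUE  # assumption: columns arrive as factors
)

coverage <- fruit %>%
  # strip "$" and "," and convert; blank strings become NA
  mutate(Gross.Value = as.numeric(gsub("[^0-9.]", "", as.character(Gross.Value)))) %>%
  group_by(State) %>%
  summarise(
    total     = sum(Gross.Value, na.rm = TRUE),  # sum of the non-missing values
    n_rows    = n(),                             # rows per state
    n_missing = sum(is.na(Gross.Value))          # rows ignored by the sum
  )

print(coverage)
```

A state whose `n_missing` equals `n_rows` gets a total of 0 only because it has no usable values, which is different from a real total of zero.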

Answer 1: (score: 0)

You should convert the factor to a number and then sum it. Here is the solution I came up with:

library(tidyverse)

## Convert the factor to numeric: make it a character vector, strip the
## dollar signs and commas, then convert the result to a number
fruit$`Gross Value` <- as.numeric(str_replace_all(as.character(fruit$`Gross Value`),"\\$|\\,",""))

## Then you can run your sum function, one total per state

fruit %>% 
  group_by(State) %>% 
  summarise(Gross = sum(`Gross Value`, na.rm = TRUE))