I recently started learning R, and I have been trying to debug a problem in one of my scripts for a long time.
I have the following data:
dd
[,1] [,2] [,3]
[1,] "Category" "A" "B"
[2,] "ONE" "23" "45"
[3,] "TWO" "234" "23"
[4,] "THREE" "565" "324"
[5,] "FOUR" "676" "343"
[6,] "FIVE" "1231" "544"
For each column, I want to add up the rows ONE, THREE and FIVE (as named in Category). The output would look like this:
sum 1819 913
I tried using rowSums and sum, and I get an error every time. One of the most common errors is shown below.
sum = rowSums(subset(dd, CATEGORY == 'ONE', 'THREE', 'FIVE'))
Error in rowSums(subset(spread_DNT_TXN, CATEGORY == "Invoiced")) :
'x' must be numeric
I am looking for a way to do this and could not find one anywhere.
Thanks!
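Answer 0 (score: 0)
First, assuming dd is a data.frame with a Category column and numeric A and B columns, you can use the %in% operator together with colSums: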
colSums(dd[dd$Category %in% c("ONE", "THREE", "FIVE"), c("A", "B")])
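The one-liner above expects a data.frame, while the question's dd is a character matrix with the header stored in its first row. A minimal sketch of one way to bridge that gap follows; the names dd_df, keep and sums are hypothetical and used only for this illustration.

```r
# The question's data: a character matrix whose first row holds the column names.
dd <- cbind(c("Category", "ONE", "TWO", "THREE", "FOUR", "FIVE"),
            c("A", 23, 234, 565, 676, 1231),
            c("B", 45, 23, 324, 343, 544))

# Rebuild it as a data.frame, dropping the header row and converting
# the value columns from character to numeric.
# `dd_df` is a hypothetical name used only for this illustration.
dd_df <- data.frame(Category = dd[-1, 1],
                    A = as.numeric(dd[-1, 2]),
                    B = as.numeric(dd[-1, 3]),
                    stringsAsFactors = FALSE)

# Keep the rows of interest and sum each numeric column.
keep <- c("ONE", "THREE", "FIVE")
sums <- colSums(dd_df[dd_df$Category %in% keep, c("A", "B")])
print(sums)
```

The result is a named numeric vector, so sums["A"] and sums["B"] can be used individually.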
However, I would recommend using data.table rather than data.frame or dplyr. I find this package's slicing and grouping syntax very clear.
First, install and load data.table:
install.packages("data.table")
library(data.table)
Then convert your old data.frame into a data.table:
dd <- as.data.table(dd)
Now compute the sums:
dd[Category %in% c("ONE", "THREE", "FIVE"), list(Sum_of_A = sum(A), Sum_of_B = sum(B))]
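The steps above assume dd already has a Category column and numeric A and B columns. With the question's character matrix, the values would stay character and the header would remain a data row, so sum() would fail. Below is a hedged sketch that builds the data.table directly from the matrix; it assumes data.table is already installed, and dt and res are hypothetical names.

```r
library(data.table)  # assumes the package is already installed

# The question's data: a character matrix with the header in row 1.
dd <- cbind(c("Category", "ONE", "TWO", "THREE", "FOUR", "FIVE"),
            c("A", 23, 234, 565, 676, 1231),
            c("B", 45, 23, 324, 343, 544))

# Build the data.table from the data rows only, converting A and B to numeric.
# `dt` and `res` are hypothetical names used only for this illustration.
dt <- data.table(Category = dd[-1, 1],
                 A = as.numeric(dd[-1, 2]),
                 B = as.numeric(dd[-1, 3]))

# Filter rows in i, compute the sums in j.
res <- dt[Category %in% c("ONE", "THREE", "FIVE"),
          list(Sum_of_A = sum(A), Sum_of_B = sum(B))]
print(res)
```

The result is a one-row data.table with the columns Sum_of_A and Sum_of_B.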
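Answer 1 (score: 0)
We can do this in base R. Note that the OP's dataset is a matrix, and a matrix can hold only one class. If there is a single character element, the whole matrix is converted to character class. Here, for some reason, the header is stored as the first row and the first column is character. One option is to subset the numeric columns, convert their type, and then apply rowSums: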
m1 <- matrix(as.numeric(dd[-1, 2:3]), ncol = 2)
i1 <- dd[-1, 1] %in% c("ONE", "THREE", "FIVE")
rowSums(m1[i1, ])
#[1] 68 889 1775
Or, if the sums are needed by column:
colSums(m1[i1, ])
#[1] 1819 913
Data:
dd <- cbind(c("Category", "ONE", "TWO", "THREE", "FOUR", "FIVE"),
            c("A", 23, 234, 565, 676, 1231), c("B", 45, 23, 324, 343, 544))
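The coercion described in this answer is what produces the 'x' must be numeric error in the question. The sketch below illustrates it on the same data and also shows a guard, drop = FALSE, that keeps the subset a matrix when only one category matches. The names err, i_one and one_row are hypothetical.

```r
# The same character matrix as in the question.
dd <- cbind(c("Category", "ONE", "TWO", "THREE", "FOUR", "FIVE"),
            c("A", 23, 234, 565, 676, 1231),
            c("B", 45, 23, 324, 343, 544))

# Mixing text and numbers in one matrix coerces everything to character.
print(is.numeric(dd))    # FALSE
print(is.character(dd))  # TRUE

# rowSums() rejects the character subset; capture the message instead of stopping.
# `err` is a hypothetical name used only for this illustration.
err <- tryCatch(rowSums(dd[-1, 2:3]),
                error = function(e) conditionMessage(e))
print(err)

# After converting to numeric, a single matching row would normally be
# simplified to a plain vector; drop = FALSE keeps it a 1 x 2 matrix
# so that colSums() still works.
m1 <- matrix(as.numeric(dd[-1, 2:3]), ncol = 2)
i_one <- dd[-1, 1] %in% "ONE"
one_row <- colSums(m1[i_one, , drop = FALSE])
print(one_row)
```

Without drop = FALSE, a selection that matches exactly one row is reduced to a vector, and colSums() or rowSums() then stops because its input no longer has two dimensions.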
Answer 2 (score: 0)
In base R, you can do the following:
# Load your data first
dd <- read.table(header = TRUE, text = '
"Category" "A" "B"
"ONE" "23" "45"
"TWO" "234" "23"
"THREE" "565" "324"
"FOUR" "676" "343"
"FIVE" "1231" "544"')
# Summarize by selected categories
colSums(subset(dd, Category %in% c("ONE", "THREE", "FIVE"), select = -Category))
# A B
#1819 913
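colSums works here because read.table converts columns that contain only numbers to a numeric type, even when the values are quoted in the input. A quick check of that assumption, using the hypothetical name col_is_numeric, might look like this:

```r
# Same input as above: the numbers are quoted in the text,
# but read.table still converts all-number columns to a numeric type.
dd <- read.table(header = TRUE, text = '
"Category" "A" "B"
"ONE" "23" "45"
"TWO" "234" "23"
"THREE" "565" "324"
"FOUR" "676" "343"
"FIVE" "1231" "544"')

# `col_is_numeric` is a hypothetical name used only for this illustration.
col_is_numeric <- sapply(dd[c("A", "B")], is.numeric)
print(col_is_numeric)
```

If either entry were FALSE, the column would need as.numeric() before summing, as with the matrix in the question.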
Or use aggregate:
aggregate(cbind(A, B) ~ 1,
data = subset(dd, Category %in% c("ONE", "THREE", "FIVE")),
FUN = sum)
# A B
#1 1819 913
Or, perhaps more idiomatically:
dd$ofInterest <- dd$Category %in% c("ONE", "THREE", "FIVE")
aggregate(cbind(A, B) ~ ofInterest, data = dd, FUN = sum)
# ofInterest A B
#1 FALSE 910 366
#2 TRUE 1819 913