In R

Time: 2017-04-14 22:46:22

Tags: r aggregate

Here is a sample portion of a bank statement:

Category<-c(
"Merchandise",
"Dining",
"Lodging",
"Other Services",
"Dining",
"Merchandise",
"Merchandise",
"Other Services",
"Entertainment",
"Merchandise",
"Merchandise",
"Internet",
"Other Services",
"Merchandise",
"Merchandise",
"Merchandise",
"Other Services",
"Phone/Cable",
"Airfare",
"Airfare",
"Other Services",
"Merchandise",
"Merchandise",
"Internet",
"Other Services",
"Other Services",
"Phone/Cable",
"Other Services",
"Healthcare"
)

Debit<-as.numeric(c(
"26.34",
"4.75",
"9.88",
"31.26",
"8.67",
"64.64",
"5.18",
"15.5",
"10",
"12.93",
"10.02",
"6.95",
"39.93",
"16.39",
"24",
"40.35",
"27.33",
"11.12",
"214.2",
"214.2",
"4",
"86.28",
"19.99",
"19.99",
"13.68",
"205",
"10.96",
"85",
"1525"
))

df<-data.frame(Category,Debit)

This produces the following output:

         Category   Debit
1     Merchandise   26.34
2          Dining    4.75
3         Lodging    9.88
4  Other Services   31.26
5          Dining    8.67
6     Merchandise   64.64
7     Merchandise    5.18
8  Other Services   15.50
9   Entertainment   10.00
10    Merchandise   12.93
11    Merchandise   10.02
12       Internet    6.95
13 Other Services   39.93
14    Merchandise   16.39
15    Merchandise   24.00
16    Merchandise   40.35
17 Other Services   27.33
18    Phone/Cable   11.12
19        Airfare  214.20
20        Airfare  214.20
21 Other Services    4.00
22    Merchandise   86.28
23    Merchandise   19.99
24       Internet   19.99
25 Other Services   13.68
26 Other Services  205.00
27    Phone/Cable   10.96
28 Other Services   85.00
29     Healthcare 1525.00

From there, to see the total amount I spent in a given category, for example "Merchandise", I have to do this:

> sum(df$Debit[which(df$Category=="Merchandise")])
[1] 306.12

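But doing this one category at a time is tedious. I'd like to know whether there is a more concise way to display it, with one column listing every level of df$Category and a second column giving the total for each category.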

Something like this:

Merchandise 306.12
Other Services  421.7
Phone/Cable 22.08
etc...

Any suggestions?

3 answers:

Answer 0: (score: 2)

In base R, you can use aggregate():

aggregate(Debit ~ Category, df, FUN = sum)        

This gives:

        Category   Debit
1        Airfare  428.40
2         Dining   13.42
3  Entertainment   10.00
4     Healthcare 1525.00
5       Internet   26.94
6        Lodging    9.88
7    Merchandise  306.12
8 Other Services  421.70
9    Phone/Cable   22.08
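
For comparison, here is a minimal sketch of the same grouping done with base R's tapply() instead of aggregate(). It is an illustration, not part of the original answer, and to stay self-contained it rebuilds only the first six rows of the statement from the question; the name `totals` is just a placeholder.

```r
# First six rows of the statement from the question, so the block runs on its own.
df <- data.frame(
  Category = c("Merchandise", "Dining", "Lodging",
               "Other Services", "Dining", "Merchandise"),
  Debit    = c(26.34, 4.75, 9.88, 31.26, 8.67, 64.64)
)

# tapply() splits Debit by Category and applies sum() to each group.
# The result is a named one-dimensional array (names are the categories).
totals <- tapply(df$Debit, df$Category, sum)

# Sort so the largest spending categories come first.
totals <- sort(totals, decreasing = TRUE)
print(totals)
```

aggregate() is probably the better fit when the result should remain a data frame with the two columns asked for in the question; tapply() is handy when a named set of totals is enough.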

Answer 1: (score: 2)

This can also be solved easily with the data.table package.
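
A minimal sketch of what a data.table approach could look like, assuming the data.table package is installed. This is an illustration rather than the answer's own code; it uses only the first six rows of the statement from the question, and the names `dt` and `totals` are placeholders.

```r
library(data.table)  # assumption: the data.table package is installed

# First six rows of the statement from the question, built as a data.table.
dt <- data.table(
  Category = c("Merchandise", "Dining", "Lodging",
               "Other Services", "Dining", "Merchandise"),
  Debit    = c(26.34, 4.75, 9.88, 31.26, 8.67, 64.64)
)

# In dt[i, j, by], the expression j is evaluated once per group defined by `by`.
# .() builds the output columns; here a single Debit column holding the group sum.
totals <- dt[, .(Debit = sum(Debit)), by = Category]
print(totals)
```

With an existing data frame such as the question's `df`, as.data.table(df) should give an equivalent starting point without rebuilding the columns.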

Answer 2: (score: 1)

You can also use a classic SQL GROUP BY. First load the sqldf package with library(sqldf), then run the query below. It returns one row per category, with the total in a column named debit_sum:

sqldf("select Category, sum(Debit) `debit_sum` from df group by Category")