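Pandas: sum up unique values and put them in a table

Time: 2019-06-19 12:19:01

Tags: python pandas

I am trying to figure out how to build a table in pandas that counts the unique values in data read from an Excel sheet.

Table: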

|--------------|--------------------|
|  location    |   signal           |
|--------------|--------------------|
|  New York    |  Vehicle 20 open   |
|  New York    |  Vehicle 22 open   |
|  Washington  |  Vehicle 20 open   |
|  Washington  |  Vehicle 21 open   |
|  New York    |  Vehicle 20 open   |
|  New York    |  Vehicle 22 open   |
|  Washington  |  Vehicle 20 open   |
|  Washington  |  Vehicle 21 open   |
|  New York    |  Vehicle 20 open   |
|  New York    |  Vehicle 22 open   |
|  Washington  |  Vehicle 20 closed |
|  Washington  |  Vehicle 21 closed |
|  New York    |  Vehicle 20 closed |
|  New York    |  Vehicle 22 closed |
|  Washington  |  Vehicle 20 closed |
|  Washington  |  Vehicle 21 closed |
|  New York    |  Vehicle 20 open   |
|  New York    |  Vehicle 20 open   |
|  New York    |  Vehicle 20 open   |
|--------------|--------------------|

How can I print it out like this (and export it to Excel)?

|--------------|-------------------|------------------|
|  Alarmtype   |   Vehicle open    |  Vehicle Closed  | 
|--------------|-------------------|------------------|
|  New York    |      9            |      2           |
|  Washington  |      4            |      4           |
|--------------|-------------------|------------------|

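So I want to count how many times each event (group) occurs at each location, and summarize some of them in a table.

This is what I have tried: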

top = df.groupby(['Location', 'Sign Descr']).count()

or

sorted = df.sort_values(["Location", "Sign Descr"]).groupby(['Location', 'Sign Descr']).nunique()

3 answers:

Answer 0 (score: 4)

First strip the digits from the signal column, then use pd.pivot_table:

# regex=True is needed on pandas 2.0+, where str.replace matches literally by default
df['signal'] = df['signal'].str.replace(r'[0-9]', '', regex=True)

pd.pivot_table(df, index='location', columns='signal', aggfunc='size')

signal      Vehicle  closed  Vehicle  open
location                                  
New York                  2              9
Washington                4              4

If you want Alarmtype as the index name, add rename_axis:

pd.pivot_table(df, index='location', columns='signal', aggfunc='size').rename_axis('Alarmtype')
signal      Vehicle  closed  Vehicle  open
Alarmtype                                 
New York                  2              9
Washington                4              4
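For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's table in code. It assumes the real column names are `location` and `signal` as shown in the table (the question's own attempt uses `Location` and `Sign Descr`). The whitespace cleanup and the column reordering are additions to match the requested headers, and the file name in the export comment is hypothetical.

```python
import pandas as pd

# The 19 rows from the question's table, in the same order.
rows = (
    [("New York", "Vehicle 20 open"), ("New York", "Vehicle 22 open"),
     ("Washington", "Vehicle 20 open"), ("Washington", "Vehicle 21 open")] * 2
    + [("New York", "Vehicle 20 open"), ("New York", "Vehicle 22 open")]
    + [("Washington", "Vehicle 20 closed"), ("Washington", "Vehicle 21 closed"),
       ("New York", "Vehicle 20 closed"), ("New York", "Vehicle 22 closed"),
       ("Washington", "Vehicle 20 closed"), ("Washington", "Vehicle 21 closed")]
    + [("New York", "Vehicle 20 open")] * 3
)
df = pd.DataFrame(rows, columns=["location", "signal"])

# Drop the vehicle number, then collapse the leftover double space,
# so "Vehicle 20 open" becomes "Vehicle open".
df["signal"] = (
    df["signal"]
    .str.replace(r"\d+", "", regex=True)
    .str.replace(r"\s+", " ", regex=True)
    .str.strip()
)

# Count rows per (location, signal) pair and spread the signals into columns.
table = pd.pivot_table(df, index="location", columns="signal", aggfunc="size")
table = table.rename_axis("Alarmtype")[["Vehicle open", "Vehicle closed"]]
print(table)

# Export (hypothetical file name; needs an Excel writer engine such as openpyxl):
# table.to_excel("summary.xlsx")
```

If some location never produces one of the signals, that cell would come out as NaN rather than 0, so a `fillna(0)` may be worth adding for real data.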

Answer 1 (score: 2)

Another option is crosstab:

pd.crosstab(df.location, df.signal.str.replace(r'\d+', '', regex=True))

signal      Vehicle  closed  Vehicle  open
location                                  
New York                  2              9
Washington                4              4
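A variation on the crosstab idea, as a rough sketch: instead of deleting the digits, pull out the trailing state word, which avoids the double space in the column labels. This assumes every signal ends in either `open` or `closed`; the `status` name and the `repeats` dictionary are only illustrative and do not come from the thread.

```python
import pandas as pd

# Same rows as the question's table, written as (location, signal) -> repeat count.
# Row order is not preserved, which does not matter for counting.
repeats = {
    ("New York", "Vehicle 20 open"): 6,
    ("New York", "Vehicle 22 open"): 3,
    ("New York", "Vehicle 20 closed"): 1,
    ("New York", "Vehicle 22 closed"): 1,
    ("Washington", "Vehicle 20 open"): 2,
    ("Washington", "Vehicle 21 open"): 2,
    ("Washington", "Vehicle 20 closed"): 2,
    ("Washington", "Vehicle 21 closed"): 2,
}
rows = [key for key, n in repeats.items() for _ in range(n)]
df = pd.DataFrame(rows, columns=["location", "signal"])

# Keep only the trailing state word ("open" or "closed") of each signal.
status = df["signal"].str.extract(r"(open|closed)$", expand=False).rename("status")

# Cross-tabulate location against the extracted state.
table = pd.crosstab(df["location"], status)
print(table)
```

Rows whose signal does not match the pattern get a missing status and are left out of the crosstab, so it is worth checking `status.isna().sum()` on real data.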

Answer 2 (score: 0)

You can also do this with groupby followed by pivot. To try it, see the code below:

import pandas as pd

data = pd.read_csv('c.csv')
print(data)



grp_data = data.groupby(by=['location','status']).count().reset_index()
print(grp_data)
grp_data.pivot(index='location',columns='status',values=['signal'])

Raw data:

      location  signal  status
0     New York      20    open
1     New York      22    open
2   Washington      20    open
3   Washington      21    open
4     New York      20    open
5     New York      22    open
6   Washington      20    open
7   Washington      21    open
8     New York      20    open
9     New York      22    open
10  Washington      20  closed
11  Washington      21  closed
12    New York      20  closed
13    New York      22  closed
14  Washington      20  closed
15  Washington      21  closed
16    New York      20    open
17    New York      20    open
18    New York      20    open

Output after groupby:

     location  status  signal
0    New York  closed       2
1    New York    open       9
2  Washington  closed       4
3  Washington    open       4

Final output:

           signal
status     closed open
location
New York        2    9
Washington      4    4
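Note that this answer reads a c.csv in which the vehicle number and the state are already separate columns, which is not how the question's data looks. A possible way to get from the question's single signal column to that shape is sketched below, assuming every value follows the pattern "Vehicle <number> <state>". Only a few rows from the question's table are used here to keep it short.

```python
import pandas as pd

# A handful of rows taken from the question's table (not the full data).
df = pd.DataFrame(
    {
        "location": ["New York", "New York", "Washington", "Washington", "New York"],
        "signal": [
            "Vehicle 20 open",
            "Vehicle 22 open",
            "Vehicle 20 open",
            "Vehicle 21 closed",
            "Vehicle 20 closed",
        ],
    }
)

# Named groups become column names: the number goes to "signal", the state to "status".
parts = df["signal"].str.extract(r"^Vehicle (?P<signal>\d+) (?P<status>\w+)$")
data = pd.concat([df[["location"]], parts], axis=1)
data["signal"] = data["signal"].astype(int)

# size() counts rows per group; unstack() moves status into the columns.
counts = data.groupby(["location", "status"]).size().unstack(fill_value=0)
print(counts)
```

Using `size().unstack()` here skips the intermediate `reset_index()` and `pivot()` steps and returns flat column labels instead of a two-level header.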