Transforming data by time and class

Time: 2018-03-13 00:17:20

Tags: pandas datetime grouping transform

My data frame is as follows:

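There are 15 classes in total for each hour, with one count per class per hour.

I would like to transform the data so that the hourly counts are laid out column-wise, as shown below.

Requested output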

         DateTime       Class   Count
0   2017-10-01 00:00:00 1       0
1   2017-10-01 00:00:00 2       240
2   2017-10-01 00:00:00 3       17
3   2017-10-01 00:00:00 4       0
4   2017-10-01 00:00:00 5       1
5   2017-10-01 00:00:00 6       0
6   2017-10-01 00:00:00 7       0
7   2017-10-01 00:00:00 8       0
8   2017-10-01 00:00:00 9       0
9   2017-10-01 00:00:00 10      0
10  2017-10-01 00:00:00 11      0
11  2017-10-01 00:00:00 12      0
12  2017-10-01 00:00:00 13      0
13  2017-10-01 00:00:00 14      0
14  2017-10-01 00:00:00 15      0
..............................
30  2017-10-01 01:00:00 1       0
31  2017-10-01 01:00:00 2       209
32  2017-10-01 01:00:00 3       14
33  2017-10-01 01:00:00 4       0
34  2017-10-01 01:00:00 5       4
35  2017-10-01 01:00:00 6       0
36  2017-10-01 01:00:00 7       0
37  2017-10-01 01:00:00 8       0
38  2017-10-01 01:00:00 9       0
39  2017-10-01 01:00:00 10      0
40  2017-10-01 01:00:00 11      0
41  2017-10-01 01:00:00 12      0
42  2017-10-01 01:00:00 13      0
43  2017-10-01 01:00:00 14      0
44  2017-10-01 01:00:00 15      0
....... and so on

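2 answers:

Answer 0: (score: 0)

You can use pandas to read the data into a pd.DataFrame(), select the counts for each class by slicing the data frame with a condition, and then join the data using the datetime as the index: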

import pandas as pd

# create the dataframe from a file ('fname' is a placeholder path)
df = pd.read_csv('fname', parse_dates=['DateTime'])
# or from a numpy array
# df = pd.DataFrame(data=np_array, columns=['DateTime', 'Class', 'Count'])

# select the counts for each class, indexed by DateTime so the rows align
df_c1 = df[df.Class == 1].set_index('DateTime')
df_c2 = df[df.Class == 2].set_index('DateTime')
df_c3 = df[df.Class == 3].set_index('DateTime')
df_c4 = df[df.Class == 4].set_index('DateTime')

# build the wide frame; each assignment aligns on the DateTime index
df_new = pd.DataFrame(index=df_c1.index)
df_new['Class1'] = df_c1['Count']
df_new['Class2'] = df_c2['Count']
df_new['Class3'] = df_c3['Count']
df_new['Class4'] = df_c4['Count']

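The code sample is very rough and I have probably missed a lot, but perhaps it will give you an idea. I also suggest looking at the pandas documentation for concat() and DataFrame().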
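As a rough sketch of the concat() route mentioned above, the per-class slices can be built in a loop and placed side by side. This assumes a long-format frame with DateTime, Class, and Count columns; the sample rows are a small excerpt of the values in the question, and the names `columns` and `df_wide` are only illustrative.

```python
import pandas as pd

# Assumed input: long format, one row per (DateTime, Class) pair.
# The values are a small excerpt of the rows shown in the question.
df = pd.DataFrame({
    "DateTime": pd.to_datetime(
        ["2017-10-01 00:00:00"] * 3 + ["2017-10-01 01:00:00"] * 3
    ),
    "Class": [1, 2, 3, 1, 2, 3],
    "Count": [0, 240, 17, 0, 209, 14],
})

# One Series per class, indexed by DateTime and named after the class.
columns = [
    group.set_index("DateTime")["Count"].rename(f"Class{cls}")
    for cls, group in df.groupby("Class")
]

# axis=1 puts the Series next to each other, aligning rows on DateTime.
df_wide = pd.concat(columns, axis=1)
print(df_wide)
```

This works for any number of classes without writing one line per class, at the cost of requiring each (DateTime, Class) pair to appear only once.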

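I will review and refactor my example code tomorrow in case the problem has not been solved by then. In the meantime, you could fix the layout of the data in your question; it is unreadable.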

Answer 1: (score: 0)

Try pivot_table:

(df.pivot_table(index='DateTime',columns='Class',
               values='Count',
               aggfunc='sum')
  .add_prefix('Class_'))
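For a self-contained check of the pivot_table call, here is a minimal sketch on a few rows taken from the question. It assumes the input is in long format with DateTime, Class, and Count columns; the final reset_index step and the name `df_flat` are optional additions for illustration, not part of the answer.

```python
import pandas as pd

# Assumed input: long format; values are an excerpt of the question's rows.
df = pd.DataFrame({
    "DateTime": pd.to_datetime(
        ["2017-10-01 00:00:00"] * 3 + ["2017-10-01 01:00:00"] * 3
    ),
    "Class": [2, 3, 5, 2, 3, 5],
    "Count": [240, 17, 1, 209, 14, 4],
})

# One row per DateTime, one column per class, counts summed per cell.
df_wide = (
    df.pivot_table(index="DateTime", columns="Class",
                   values="Count", aggfunc="sum")
      .add_prefix("Class_")
)

# Optional: make DateTime a regular column again and drop the "Class"
# label that pivot_table leaves on the columns axis.
df_flat = df_wide.reset_index().rename_axis(columns=None)
print(df_flat)
```

Using aggfunc='sum' means duplicate (DateTime, Class) rows are added together rather than raising an error, which is the main practical difference from a plain pivot().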