My DataFrame looks like this:
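There are 15 classes in total for each hour, with a count for each class per hour.
I would like to convert the data to a column-wise layout per hour, using the counts as shown below.
Requested output: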
DateTime Class Count
0 2017-10-01 00:00:00 1 0
1 2017-10-01 00:00:00 2 240
2 2017-10-01 00:00:00 3 17
3 2017-10-01 00:00:00 4 0
4 2017-10-01 00:00:00 5 1
5 2017-10-01 00:00:00 6 0
6 2017-10-01 00:00:00 7 0
7 2017-10-01 00:00:00 8 0
8 2017-10-01 00:00:00 9 0
9 2017-10-01 00:00:00 10 0
10 2017-10-01 00:00:00 11 0
11 2017-10-01 00:00:00 12 0
12 2017-10-01 00:00:00 13 0
13 2017-10-01 00:00:00 14 0
14 2017-10-01 00:00:00 15 0
..............................
30 2017-10-01 01:00:00 1 0
31 2017-10-01 01:00:00 2 209
32 2017-10-01 01:00:00 3 14
33 2017-10-01 01:00:00 4 0
34 2017-10-01 01:00:00 5 4
35 2017-10-01 01:00:00 6 0
36 2017-10-01 01:00:00 7 0
37 2017-10-01 01:00:00 8 0
38 2017-10-01 01:00:00 9 0
39 2017-10-01 01:00:00 10 0
40 2017-10-01 01:00:00 11 0
41 2017-10-01 01:00:00 12 0
42 2017-10-01 01:00:00 13 0
43 2017-10-01 01:00:00 14 0
44 2017-10-01 01:00:00 15 0
....... and so on
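Answer 0: (score: 0)
You can use pandas to read the data into a pd.DataFrame(), select the counts for each class by slicing the DataFrame with a condition, and then concatenate the data using DateTime as the index: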
import pandas as pd

# create the DataFrame from a file
df = pd.read_csv('fname')
# or from a NumPy array (np_array must already hold the three columns)
# df = pd.DataFrame(data=np_array, columns=['DateTime', 'Class', 'Count'])

# select the counts for each class, indexed by DateTime so the rows align
df_c1 = df[df.Class == 1].set_index('DateTime')
df_c2 = df[df.Class == 2].set_index('DateTime')
df_c3 = df[df.Class == 3].set_index('DateTime')
df_c4 = df[df.Class == 4].set_index('DateTime')

# concatenate the per-class counts side by side on the shared DateTime index
df_new = pd.concat(
    [df_c1['Count'], df_c2['Count'], df_c3['Count'], df_c4['Count']],
    axis=1,
    keys=['Class1', 'Class2', 'Class3', 'Class4'],
).reset_index()
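The code sample is quite rough and I have probably missed a lot, but maybe it will give you an idea. I also suggest looking at the pandas documentation for concat() and DataFrame().
I will review and refactor my example code tomorrow in case the problem has not been solved by then. In the meantime, you could fix the layout of the data in your question, which is unreadable.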
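As a rough sketch of the same slicing idea extended to every class without writing one line per class, the selection can be done in a loop. The sample rows below reuse a few values from the question (only classes 1 to 3 for two hours), and the names `per_class` and `df_wide` are made up for illustration. It also assumes each class appears at most once per hour.

```python
import pandas as pd

# Small sample in the long layout from the question (classes 1-3 only).
df = pd.DataFrame({
    'DateTime': pd.to_datetime(['2017-10-01 00:00:00'] * 3
                               + ['2017-10-01 01:00:00'] * 3),
    'Class': [1, 2, 3, 1, 2, 3],
    'Count': [0, 240, 17, 0, 209, 14],
})

# Slice out each class, index it by DateTime, and keep its Count column.
per_class = {
    f'Class{c}': df.loc[df['Class'] == c].set_index('DateTime')['Count']
    for c in sorted(df['Class'].unique())
}

# Passing a dict to concat uses its keys as the new column names;
# axis=1 aligns the per-class Series on their DateTime index.
df_wide = pd.concat(per_class, axis=1).reset_index()
print(df_wide)
```

If a class could occur more than once in the same hour, the index would no longer be unique and an aggregating approach would be the safer choice.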
Answer 1: (score: 0)
Try pivot_table:
(df.pivot_table(index='DateTime', columns='Class',
                values='Count',
                aggfunc='sum')
   .add_prefix('Class_'))
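To make this easier to try, here is a minimal self-contained sketch of the pivot_table approach. The sample rows reuse a few values from the question (classes 1 to 3 for two hours). The `fill_value=0`, `rename_axis` and `reset_index` steps are optional additions of mine, assuming you want DateTime back as an ordinary column and zeros for missing combinations.

```python
import pandas as pd

# Small sample in the long layout from the question (classes 1-3 only).
df = pd.DataFrame({
    'DateTime': pd.to_datetime(['2017-10-01 00:00:00'] * 3
                               + ['2017-10-01 01:00:00'] * 3),
    'Class': [1, 2, 3, 1, 2, 3],
    'Count': [0, 240, 17, 0, 209, 14],
})

wide = (df.pivot_table(index='DateTime', columns='Class',
                       values='Count', aggfunc='sum',
                       fill_value=0)          # zeros for absent class/hour pairs
          .add_prefix('Class_')               # 1 -> 'Class_1', 2 -> 'Class_2', ...
          .rename_axis(columns=None)          # drop the leftover 'Class' axis label
          .reset_index())                     # turn DateTime back into a column
print(wide)
```

Because `aggfunc='sum'` aggregates, duplicate rows for the same hour and class are added together rather than raising an error.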