I have a dataset that initially looks like this:
ContextID VariableID Timestamp Timestampms Value
7304693 516 2018-07-11 10:49:36 153 1.00000001335143e-10
7304693 516 2018-07-11 10:49:36 291 1.00000001335143e-10
7304693 516 2018-07-11 10:49:36 455 1.00000001335143e-10
7304693 517 2018-07-11 10:49:36 153 0.00266113295219839
7304693 517 2018-07-11 10:49:36 291 0.00266113295219839
7304693 517 2018-07-11 10:49:36 455 0.00236816401593387
7304693 517 2018-07-11 10:49:36 483 0.00236816401593387
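I want to pivot the dataset so that each VariableID becomes its own column. To do that I had to combine Timestamp and Timestampms to create unique values, which I did like this: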
data = pd.read_excel('Book1.xlsx', header = 0, parse_dates = [['Timestamp', 'Timestampms']])
data = data.rename(columns={'Timestamp_Timestampms': 'Time'})
data = data.pivot(index= 'Time', columns='VariableID', values='Value')
data = data.reset_index(level=0)
and got the following DataFrame:
Time 516 517
2018-07-11 10:49:36 153 1.00000001335143e-10 0.00266113295219839
2018-07-11 10:49:36 291 1.00000001335143e-10 0.00266113295219839
2018-07-11 10:49:36 455 1.00000001335143e-10 0.00236816401593387
2018-07-11 10:49:36 483 nan 0.00236816401593387
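Now I would like some help splitting the Time column into two separate columns: the first containing only the date and the second containing the time, followed by the other columns such as 516 and 517.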
Date Time_ms
2018-07-11 10:49:36_153
2018-07-11 10:49:36_291
2018-07-11 10:49:36_455
2018-07-11 10:49:36_483
2018-07-11 10:49:36_578
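In addition, I would like to set the ContextID column from the original table as the index of the pivoted table, and would like to know how to do that.
Thanks in advance.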
Answer 0 (score: 2)
Use Series.str.split together with Series.str.replace:
data = data.rename(columns={'Timestamp_Timestampms': 'Time'})
#added ContextID column
data = data.set_index(['ContextID','Time','VariableID'])['Value'].unstack()
data = data.reset_index()
data[['Time','Time_ms']] = data.Time.str.split(n=1, expand=True)
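# Python's separator for fractional seconds is "." (alternative solution)
#data['Time_ms'] = data['Time_ms'].str.replace(r'\s+', '.', regex=True)
# regex=True is needed on pandas >= 2.0, where str.replace treats the pattern literally by default
data['Time_ms'] = data['Time_ms'].str.replace(r'\s+', '_', regex=True)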
c = ['ContextID','Time','Time_ms']
data = data[c + data.columns.difference(c).tolist()]
data = data.rename_axis(None, axis=1)
print (data)
ContextID Time Time_ms 516 517
0 7304693 2018-07-11 10:49:36_153 1.000000e-10 0.002661
1 7304693 2018-07-11 10:49:36_291 1.000000e-10 0.002661
2 7304693 2018-07-11 10:49:36_455 1.000000e-10 0.002368
3 7304693 2018-07-11 10:49:36_483 NaN 0.002368
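
For anyone who wants to try this without Book1.xlsx, here is a minimal self-contained sketch of the same idea. It is not the exact code above: it assumes the sample rows from the question are typed in by hand with Timestamp and Timestampms held as strings, and the names rows, raw, parts and wide are only illustrative. It also builds the Date and Time_ms columns before pivoting rather than after, and names the date column Date as in the desired output.

```python
import pandas as pd

# Sample rows copied from the question; Timestamp and Timestampms are kept as strings (assumption).
rows = [
    (7304693, 516, "2018-07-11 10:49:36", "153", 1.00000001335143e-10),
    (7304693, 516, "2018-07-11 10:49:36", "291", 1.00000001335143e-10),
    (7304693, 516, "2018-07-11 10:49:36", "455", 1.00000001335143e-10),
    (7304693, 517, "2018-07-11 10:49:36", "153", 0.00266113295219839),
    (7304693, 517, "2018-07-11 10:49:36", "291", 0.00266113295219839),
    (7304693, 517, "2018-07-11 10:49:36", "455", 0.00236816401593387),
    (7304693, 517, "2018-07-11 10:49:36", "483", 0.00236816401593387),
]
raw = pd.DataFrame(
    rows, columns=["ContextID", "VariableID", "Timestamp", "Timestampms", "Value"]
)

# Split "date time" on the first whitespace, then append the milliseconds to the time part.
parts = raw["Timestamp"].str.split(n=1, expand=True)
raw["Date"] = parts[0]
raw["Time_ms"] = parts[1] + "_" + raw["Timestampms"]

# Keep ContextID in the index next to the date/time keys and spread VariableID into columns.
wide = (
    raw.set_index(["ContextID", "Date", "Time_ms", "VariableID"])["Value"]
    .unstack()
    .reset_index()
    .rename_axis(None, axis=1)
)
print(wide)
```

If ContextID should stay as the index of the result rather than become a regular column, wide.set_index('ContextID') can be applied at the end.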