My data looks like this:
Time Pressure Normal/Abnormal
11/30/2011 22:50 74.3 0
11/30/2011 23:00 74.8 1
11/30/2011 23:10 77.7 1
11/30/2011 23:30 74.8 0
11/30/2011 13:00 80.9 0
Desired Output:
Time Normal Time Abnormal
11/30/2011 22:50 74.3 11/30/2011 23:00 74.8
11/30/2011 23:30 74.8 11/30/2011 23:10 77.7
11/30/2011 13:00 80.9
I want to transpose the rows as shown in the desired output above. I know I need something like melt and cast (as used in R), but I am not sure how to use them.
Answer 0 (score: 0)
Using the data above:
import itertools
from io import StringIO

import pandas as pd

text = 'Time \t Pressure\tNormal/Abnormal\n11/30/2011 22:50\t74.3\t 0\n11/30/2011 23:00\t74.8\t 1\n11/30/2011 23:10\t77.7\t 1\n11/30/2011 23:30\t74.8\t 0\n11/30/2011 13:00\t80.9\t 0'
df = pd.read_table(StringIO(text))

# Split the rows by flag and convert each subset to a NumPy array.
# to_numpy() replaces as_matrix(), which was removed in pandas 1.0.
normal = df.loc[df['Normal/Abnormal'] == 0].to_numpy()
abnormal = df.loc[df['Normal/Abnormal'] == 1].to_numpy()

columns = ["Time", "Normal", "Time", "Abnormal"]
out = []
# zip_longest (izip_longest in Python 2) pads the shorter side with blanks.
for nr, ar in itertools.zip_longest(normal, abnormal, fillvalue=['', '']):
    # Concat rows horizontally (i.e. hstack)
    r = list(nr[:2]) + list(ar[:2])
    out.append(r)

df2 = pd.DataFrame(out, columns=columns)
print(df2.to_string(index=False))
''' Output
Time Normal Time Abnormal
11/30/2011 22:50 74.3 11/30/2011 23:00 74.8
11/30/2011 23:30 74.8 11/30/2011 23:10 77.7
11/30/2011 13:00 80.9
'''
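Answer 1 (score: 0)
Build two DataFrames, one for the normal rows and one for the abnormal rows, then concat them side by side and rename the columns: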
out = pd.concat([
df[df['Normal/Abnormal'] == k].iloc[:, [0,1]].reset_index(drop=True)
for k in [0, 1]], axis=1
)
out.columns = ['Time', 'Normal', 'Time', 'Abnormal']
out
Time Normal Time Abnormal
0 11/30/2011 22:50 74.3 11/30/2011 23:00 74.8
1 11/30/2011 23:30 74.8 11/30/2011 23:10 77.7
2 11/30/2011 13:00 80.9 NaN NaN
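For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same concat idea. It assumes the sample data is tab-separated and that the columns are named exactly `Time`, `Pressure` and `Normal/Abnormal`; the helper name `split_side_by_side` is made up for illustration and is not part of pandas.

```python
from io import StringIO

import pandas as pd

# Sample data from the question; the tab separator is an assumption.
text = (
    "Time\tPressure\tNormal/Abnormal\n"
    "11/30/2011 22:50\t74.3\t0\n"
    "11/30/2011 23:00\t74.8\t1\n"
    "11/30/2011 23:10\t77.7\t1\n"
    "11/30/2011 23:30\t74.8\t0\n"
    "11/30/2011 13:00\t80.9\t0\n"
)
df = pd.read_csv(StringIO(text), sep="\t")


def split_side_by_side(frame, flag_col="Normal/Abnormal"):
    """Hypothetical helper: place flag == 0 rows next to flag == 1 rows."""
    parts = [
        # Reset the index so both subsets line up on 0, 1, 2, ...
        frame.loc[frame[flag_col] == k, ["Time", "Pressure"]].reset_index(drop=True)
        for k in (0, 1)
    ]
    # axis=1 aligns on the index; the shorter subset is padded with NaN.
    out = pd.concat(parts, axis=1)
    out.columns = ["Time", "Normal", "Time", "Abnormal"]
    return out


result = split_side_by_side(df)
print(result.to_string(index=False))
```

Because the result has two columns named `Time`, selecting `result["Time"]` returns both of them; if that gets in the way, distinct names such as `Time_Normal` and `Time_Abnormal` would be easier to work with.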