Rearranging data in Python with Pandas

Date: 2018-04-23 08:08:55

Tags: python pandas csv dataframe

I receive data in the following format:

Date ,20100423
Open ,1028.75
High ,1029.5
Low ,1026
Close ,1026
S1 ,1030.62082869339
R1 ,1033.6233971724
S2 ,1026.87917130661
R2 ,1023.8766028276
Date ,20100426
Open ,1037.75
High ,1040.5
Low ,1037
Close ,1038.75
S1 ,1043.86350963032
R1 ,1040.79138126515
S2 ,1031.63649036968
R2 ,1034.70861873485

It needs to be rearranged into the following format, with one row per date:

Date    Open    High    Low     Close   S1  R1  S2  R2
xx      xx      xx      xx      xx      xx  xx  xx  xx
xx      xx      xx      xx      xx      xx  xx  xx  xx

How can I do this in Python / Pandas?

1 answer:

Answer 0 (score: 0)

Here is one way.

Setup

from io import StringIO
import pandas as pd

mystr = StringIO("""Date ,20100423
Open ,1028.75
High ,1029.5
Low ,1026
Close ,1026
S1 ,1030.62082869339
R1 ,1033.6233971724
S2 ,1026.87917130661
R2 ,1023.8766028276
Date ,20100426
Open ,1037.75
High ,1040.5
Low ,1037
Close ,1038.75
S1 ,1043.86350963032
R1 ,1040.79138126515
S2 ,1031.63649036968
R2 ,1034.70861873485""")

Read the file and restructure the dataframe

# read csv file with flexible separator
df = pd.read_csv(mystr, sep=r'\s*,\s*', engine='python',
                 header=None, names=['col', 'value'])

# create dataframe through iterating rows in chunks
res = pd.DataFrame([df.iloc[i*9:(i+1)*9, 1].tolist() for i in range(len(df.index) // 9)],
                   columns=df.iloc[:9, 0].values)

# convert date column to datetime (values were parsed as floats, so go via int and str)
res['Date'] = pd.to_datetime(res['Date'].astype(int).astype(str), format='%Y%m%d')

print(res)

#         Date     Open    High     Low    Close           S1           R1  \
# 0 2010-04-23  1028.75  1029.5  1026.0  1026.00  1030.620829  1033.623397   
# 1 2010-04-26  1037.75  1040.5  1037.0  1038.75  1043.863510  1040.791381   

#             S2           R2  
# 0  1026.879171  1023.876603  
# 1  1031.636490  1034.708619
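
The chunking above relies on every record having exactly nine lines. A minimal sketch of an alternative follows, assuming instead that each record starts with a `Date` line: a running count of the `Date` rows labels each line with its record number, and `DataFrame.pivot` then spreads the keys into columns. The names `raw`, `record`, `columns` and `num_cols` are illustrative and not part of the original answer.

```python
from io import StringIO

import pandas as pd

# Same two records as in the question, in "key ,value" layout.
raw = StringIO("""Date ,20100423
Open ,1028.75
High ,1029.5
Low ,1026
Close ,1026
S1 ,1030.62082869339
R1 ,1033.6233971724
S2 ,1026.87917130661
R2 ,1023.8766028276
Date ,20100426
Open ,1037.75
High ,1040.5
Low ,1037
Close ,1038.75
S1 ,1043.86350963032
R1 ,1040.79138126515
S2 ,1031.63649036968
R2 ,1034.70861873485
""")

# Read everything as strings so the date keeps its YYYYMMDD form.
df = pd.read_csv(raw, sep=r'\s*,\s*', engine='python',
                 header=None, names=['col', 'value'], dtype=str)

# Assumption: every record begins with a 'Date' line, so a running count
# of those lines gives each row the number of the record it belongs to.
df['record'] = (df['col'] == 'Date').cumsum() - 1

# Remember the order in which the keys first appear.
columns = df['col'].drop_duplicates().tolist()

# Pivot from long to wide, then restore the original column order
# (pivot sorts the column labels alphabetically).
res = df.pivot(index='record', columns='col', values='value')
res = res.reindex(columns=columns)
res.columns.name = None

# Convert the strings to proper dtypes.
res['Date'] = pd.to_datetime(res['Date'], format='%Y%m%d')
num_cols = [c for c in columns if c != 'Date']
res = res.astype({c: float for c in num_cols})

print(res)
```

If a record is missing a field, this version leaves a `NaN` in that cell rather than shifting later values into the wrong column. `pivot` raises an error if a key is repeated within one record.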