I receive data in the following format:
Date ,20100423
Open ,1028.75
High ,1029.5
Low ,1026
Close ,1026
S1 ,1030.62082869339
R1 ,1033.6233971724
S2 ,1026.87917130661
R2 ,1023.8766028276
Date ,20100426
Open ,1037.75
High ,1040.5
Low ,1037
Close ,1038.75
S1 ,1043.86350963032
R1 ,1040.79138126515
S2 ,1031.63649036968
R2 ,1034.70861873485
It needs to be rearranged into the following format:
Date Open High Low Close S1 R1 S2 R2
xx xx xx xx xx xx xx xx xx
xx xx xx xx xx xx xx xx xx
How can I do this in Python / pandas?
Answer 0 (score: 0)
Here is one way.
Setup
from io import StringIO
import pandas as pd
mystr = StringIO("""Date ,20100423
Open ,1028.75
High ,1029.5
Low ,1026
Close ,1026
S1 ,1030.62082869339
R1 ,1033.6233971724
S2 ,1026.87917130661
R2 ,1023.8766028276
Date ,20100426
Open ,1037.75
High ,1040.5
Low ,1037
Close ,1038.75
S1 ,1043.86350963032
R1 ,1040.79138126515
S2 ,1031.63649036968
R2 ,1034.70861873485""")
Read the file and restructure the dataframe
# read csv file with flexible separator
df = pd.read_csv(mystr, sep=r'\s*,\s*', engine='python',
                 header=None, names=['col', 'value'])
# build one output row per record by slicing the values in chunks of 9 rows
# (each record has 9 fields: Date, Open, High, Low, Close, S1, R1, S2, R2)
res = pd.DataFrame([df.iloc[i*9:(i+1)*9, 1].tolist() for i in range(len(df) // 9)],
                   columns=df.iloc[:9, 0].values)
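# convert date column to datetime; the values were parsed as floats,
# so cast to int and then str before applying the format
res['Date'] = pd.to_datetime(res['Date'].astype(int).astype(str), format='%Y%m%d')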
print(res)
# Date Open High Low Close S1 R1 \
# 0 2010-04-23 1028.75 1029.5 1026.0 1026.00 1030.620829 1033.623397
# 1 2010-04-26 1037.75 1040.5 1037.0 1038.75 1043.863510 1040.791381
# S2 R2
# 0 1026.879171 1023.876603
# 1 1031.636490 1034.708619
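The chunking above relies on every record having exactly 9 rows in the same order. As an alternative sketch (not part of the original answer), the records can be labelled by counting `Date` rows and then pivoted from long to wide. This assumes each record starts with a `Date` row; the names `raw`, `long_df`, `record`, `field_order`, and `wide` are illustrative, and the sample is shortened to three fields per record.

```python
from io import StringIO
import pandas as pd

# Shortened sample in the same "name ,value" layout (assumed input)
raw = StringIO("""Date ,20100423
Open ,1028.75
High ,1029.5
Date ,20100426
Open ,1037.75
High ,1040.5""")

long_df = pd.read_csv(raw, sep=r'\s*,\s*', engine='python',
                      header=None, names=['col', 'value'])

# Every 'Date' row starts a new record, so a running count labels the records
long_df['record'] = long_df['col'].eq('Date').cumsum() - 1

# Remember the order in which the field names first appear
field_order = long_df['col'].drop_duplicates().tolist()

# Long to wide: one row per record, one column per field
wide = long_df.pivot(index='record', columns='col', values='value')
wide = wide.reindex(columns=field_order)  # pivot sorts columns alphabetically
wide.columns.name = None

# Values were read as floats, so cast to int and str before parsing the date
wide['Date'] = pd.to_datetime(wide['Date'].astype(int).astype(str),
                              format='%Y%m%d')
print(wide)
```

With this approach a record that is missing a field would show up as `NaN` in that column rather than shifting the remaining values, which may or may not be what you want.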