I have a DataFrame (df) that looks like this:
            0                     1                   2                     3  \
0        date  BBG.XASX.ABP.S_price  BBG.XASX.ABP.S_pos  BBG.XASX.ABP.S_trade
1  2017-09-11             2.8303586                 0.0                   0.0
2  2017-09-12             2.8135189             98570.0               98570.0
3  2017-09-13             2.7829274             98570.0                   0.0
4  2017-09-14             2.7928042             98570.0                   0.0
                     4                            5
0  BBG.XASX.ABP.S_cost  BBG.XASX.ABP.S_pnl_pre_cost
1                 -0.0                          0.0
2     -37.439355326355                          0.0
3                 -0.0          -3015.4041549999965
4                 -0.0            973.5561759999837
and df.columns is:
Int64Index([ 0, 1, 2, 3, 4, 5], dtype='int64')
Could someone tell me how to modify the DataFrame so that row 0 becomes the header row? The DataFrame would then look like:
         date  BBG.XASX.ABP.S_price  BBG.XASX.ABP.S_pos  BBG.XASX.ABP.S_trade
0  2017-09-11             2.8303586                 0.0                   0.0
1  2017-09-12             2.8135189             98570.0               98570.0
2  2017-09-13             2.7829274             98570.0                   0.0
3  2017-09-14             2.7928042             98570.0                   0.0
   BBG.XASX.ABP.S_cost  BBG.XASX.ABP.S_pnl_pre_cost
0                 -0.0                          0.0
1     -37.439355326355                          0.0
2                 -0.0          -3015.4041549999965
3                 -0.0            973.5561759999837
with df.columns set to:
[date,BBG.XASX.ABP.S_price,BBG.XASX.ABP.S_pos,BBG.XASX.ABP.S_trade,BBG.XASX.ABP.S_cost,BBG.XASX.ABP.S_pnl_pre_cost]
The code that creates the DataFrame is shown below:
for subdirname in glob.iglob('C:/Users/stacey/WorkDocs/tradeopt/'+filename+'//BBG*/tradeopt.is-pnl*.lzma', recursive=True):
    a = pd.DataFrame(numpy.zeros((0,27)))#data is 35 columns
    row = 0
    with lzma.open(subdirname, mode='rt') as file:
        print(subdirname)
        for line in file:
            items = line.split(",")
            a.loc[row] = items
            row = row+1
    #a.columns = a.iloc[0]
    print(a.columns)
    print(a.head())
Thanks

Answer 0 (score: 3)
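Build a list of lists, then pass every list except the first (out[1:]) to the DataFrame constructor as the data, and pass the first list (out[0]) as the column names: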
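out = []
with lzma.open(subdirname, mode='rt') as file:
    print(subdirname)
    for line in file:
        # strip the line ending so the last field has no trailing newline
        items = line.rstrip('\n').split(",")
        out.append(items)
# the first row holds the column names, the remaining rows are the data
a = pd.DataFrame(out[1:], columns=out[0])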
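For a version that can be run without the original files, here is a minimal sketch of the same idea. It reads from an in-memory string instead of an lzma file; the sample text and the helper name `rows_to_frame` are assumptions made up for illustration and do not come from the question.

```python
import io

import pandas as pd


def rows_to_frame(lines):
    """Build a DataFrame from comma-separated lines, using the first
    line as the column names (hypothetical helper for illustration)."""
    # Skip blank lines and drop the line ending before splitting,
    # so the last column name does not end in "\n".
    out = [line.rstrip("\r\n").split(",") for line in lines if line.strip()]
    return pd.DataFrame(out[1:], columns=out[0])


# Made-up sample standing in for the decompressed file contents.
sample = io.StringIO(
    "date,price,pos\n"
    "2017-09-11,2.8303586,0.0\n"
    "2017-09-12,2.8135189,98570.0\n"
)
a = rows_to_frame(sample)
print(a.columns.tolist())
```

Note that every value built this way is still a string, so numeric columns would need a conversion such as `pd.to_numeric` before doing arithmetic on them.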
Answer 1 (score: 0)

I haven't tested this, but it should work:
with lzma.open(subdirname, mode='rt') as file:
    df = pd.read_csv(file, sep=',', header=0)
This approach assumes that your file is formatted like a CSV.
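The read_csv approach can be tried out with a small round trip in memory. In this sketch the CSV text is made up and compressed with `lzma.compress` to stand in for a file on disk; it assumes the real files are plain comma-separated text with a header line.

```python
import io
import lzma

import pandas as pd

# Made-up CSV text, compressed in memory to stand in for a compressed file on disk.
csv_text = (
    "date,price,pos\n"
    "2017-09-11,2.8303586,0.0\n"
    "2017-09-12,2.8135189,98570.0\n"
)
compressed = lzma.compress(csv_text.encode("utf-8"))

# lzma.open accepts a file object as well as a path; mode="rt" yields text.
with lzma.open(io.BytesIO(compressed), mode="rt") as file:
    df = pd.read_csv(file, sep=",", header=0)

print(df.dtypes)
```

Unlike building the rows by hand, read_csv takes the header from the first line and infers numeric dtypes for the value columns.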