Question

I have a DataFrame (df) that looks like this:

           0                     1                   2                     3   \
0        date  BBG.XASX.ABP.S_price  BBG.XASX.ABP.S_pos  BBG.XASX.ABP.S_trade   
1  2017-09-11             2.8303586                 0.0                   0.0   
2  2017-09-12             2.8135189             98570.0               98570.0   
3  2017-09-13             2.7829274             98570.0                   0.0   
4  2017-09-14             2.7928042             98570.0                   0.0   

                    4                            5   
0  BBG.XASX.ABP.S_cost  BBG.XASX.ABP.S_pnl_pre_cost   
1                 -0.0                          0.0   
2     -37.439355326355                          0.0   
3                 -0.0          -3015.4041549999965   
4                 -0.0            973.5561759999837

and df.columns is:

Int64Index([ 0,  1,  2,  3,  4,  5], dtype='int64')

Can someone let me know how to modify the DataFrame so that row 0 becomes the header row? The DataFrame would then look like this:

        date  BBG.XASX.ABP.S_price  BBG.XASX.ABP.S_pos  BBG.XASX.ABP.S_trade   
0  2017-09-11             2.8303586                 0.0                   0.0   
1  2017-09-12             2.8135189             98570.0               98570.0   
2  2017-09-13             2.7829274             98570.0                   0.0   
3  2017-09-14             2.7928042             98570.0                   0.0   

   BBG.XASX.ABP.S_cost  BBG.XASX.ABP.S_pnl_pre_cost   
0                 -0.0                          0.0   
1     -37.439355326355                          0.0   
2                 -0.0          -3015.4041549999965   
3                 -0.0            973.5561759999837

and df.columns would be:

[date,BBG.XASX.ABP.S_price,BBG.XASX.ABP.S_pos,BBG.XASX.ABP.S_trade,BBG.XASX.ABP.S_cost,BBG.XASX.ABP.S_pnl_pre_cost]

The code that creates the DataFrame is shown below:

for subdirname in glob.iglob('C:/Users/stacey/WorkDocs/tradeopt/'+filename+'//BBG*/tradeopt.is-pnl*.lzma', recursive=True):
            a = pd.DataFrame(numpy.zeros((0,27)))#data is 35 columns  
            row = 0

            with lzma.open(subdirname, mode='rt') as file:
                print(subdirname)
                for line in file:
                    items = line.split(",")
                    a.loc[row] = items
                    row = row+1
                    #a.columns = a.iloc[0]
            print(a.columns)    
            print(a.head())

Thanks

Answer 1

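Build a list of lists and pass it to the DataFrame constructor: everything except the first list (out[1:]) as the data, and the first list (out[0]) as the column names: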

out = []
with lzma.open(subdirname, mode='rt') as file:
    print(subdirname)
    for line in file:
        items = line.rstrip("\n").split(",")  # drop the trailing newline so the last field stays clean
        out.append(items)

a = pd.DataFrame(out[1:], columns=out[0])
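
If the DataFrame already exists with the header stuck in row 0, as in the question, the same idea can be applied after the fact. This is a minimal sketch on made-up data; the shortened column names and the `fixed` and `numeric_cols` variables are only for illustration and are not from the original post:

```python
import pandas as pd

# Hypothetical stand-in for the frame in the question: the header ended up in row 0.
df = pd.DataFrame([
    ["date", "price", "pos"],
    ["2017-09-11", "2.8303586", "0.0"],
    ["2017-09-12", "2.8135189", "98570.0"],
])

# Drop row 0 from the data, renumber the index from 0,
# and use the values of the old row 0 as column labels.
fixed = df.iloc[1:].reset_index(drop=True)
fixed.columns = df.iloc[0].tolist()

# Everything is still a string at this point; convert numeric columns explicitly.
numeric_cols = ["price", "pos"]
fixed[numeric_cols] = fixed[numeric_cols].apply(pd.to_numeric)

print(fixed.columns.tolist())
```

Note that building the frame from split strings leaves every column as text, so the explicit pd.to_numeric step is likely needed with either approach.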

Answer 2

I haven't tested this, but it should work:

with lzma.open(subdirname, mode='rt') as file:
    df = pd.read_csv(file, sep=',', header=0)

This approach assumes that your file looks like a CSV.
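
To check the idea without the real data, here is a small round-trip sketch. It writes a made-up sample to a temporary compressed file and reads it back; the file name, sample contents, and shortened column names are assumptions for illustration only:

```python
import lzma
import os
import tempfile

import pandas as pd

# Hypothetical sample standing in for the real file contents.
sample = (
    "date,price,pos\n"
    "2017-09-11,2.8303586,0.0\n"
    "2017-09-12,2.8135189,98570.0\n"
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.lzma")

    # Write the sample as compressed text.
    with lzma.open(path, mode="wt") as fh:
        fh.write(sample)

    # Read it back: header=0 tells read_csv to use the first line as column names.
    with lzma.open(path, mode="rt") as fh:
        df = pd.read_csv(fh, header=0)

print(df.columns.tolist())
print(df.dtypes)
```

A side benefit of read_csv is that it infers numeric dtypes, so the price and position columns should come back as floats rather than strings.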

Set row 0 of a DataFrame as the header
