Reading a "cell" that is split over two lines in a fixed-width table (.txt file) with Python / pandas

Time: 2017-05-22 06:52:46

Tags: python pandas

How can I read one "cell" of a fixed-width column when it is split over two lines? The input is a fixed-width table like this:

ID   Description                 QTY
1    Description split over      1
     two lines
2    Description on one line     2

I would like the DataFrame to hold the data like this:

ID   Description                           QTY
1    Description split over two lines      1       
2    Description on one line               2

My current code is:

import pandas as pd

df = pd.read_fwf('test.txt', names = ['ID', 'Description', 'QTY'])
df

But this gives me:

ID   Description                 QTY
1    Description split over      1
NaN  two lines                   NaN 
2    Description on one line     2

Any ideas?

1 answer:

Answer 0: (score: 0)

# Append the next row's description to the current row if the next row's ID is NaN.
# The last row has no next row, so it is returned unchanged. Assumes the default 0..n-1 index.
df['Description'] = df.apply(lambda x: x.Description if x.name == len(df) - 1 else x.Description + ' ' + df.iloc[x.name + 1]['Description'] if pd.isna(df.iloc[x.name + 1]['ID']) else x.Description, axis=1)

# Drop the continuation rows, which still have NaN in ID and QTY.
df = df.dropna()
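
The approach above joins only one continuation row. If a description might wrap over three or more lines, one possible alternative is to forward-fill the ID and then group by it. The following is a minimal sketch, not a tested solution for the real file: the column widths (5, 28, 3) are an assumption read off the sample table, the file is assumed to contain the header line, and `rows`, `text`, and `merged` are names made up for this illustration.

```python
import io
import pandas as pd

# Rebuild the sample table from the question as fixed-width text.
# The widths (5, 28, 3) are an assumption based on the sample layout.
rows = [
    ("ID", "Description", "QTY"),
    ("1", "Description split over", "1"),
    ("", "two lines", ""),
    ("2", "Description on one line", "2"),
]
text = "\n".join(f"{a:<5}{b:<28}{c:<3}" for a, b, c in rows)

# The first line is used as the header, so names= is not passed here.
df = pd.read_fwf(io.StringIO(text), widths=[5, 28, 3])

# Continuation lines have no ID; forward-fill so they inherit the ID above.
df["ID"] = df["ID"].ffill()

# Join all description fragments per ID and keep the first non-null QTY.
merged = (
    df.groupby("ID", as_index=False, sort=False)
      .agg({"Description": " ".join, "QTY": "first"})
      .astype({"ID": int, "QTY": int})
)
print(merged)
```

This relies on every continuation line leaving the ID column blank and on IDs being unique per record. If either assumption does not hold for the real data, the grouping key would need to be built differently.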