How do I read a single "cell" of a fixed-width column that is split across two lines? The input data is a fixed-width table like this:
ID  Description               QTY
1   Description split over    1
    two lines
2   Description on one line   2
I would like the DataFrame to hold the data like this:
ID  Description                        QTY
1   Description split over two lines   1
2   Description on one line            2
My current code is:
import pandas as pd
df = pd.read_fwf('test.txt', names = ['ID', 'Description', 'QTY'])
df
But this gives me:
ID   Description               QTY
1    Description split over    1
NaN  two lines                 NaN
2    Description on one line   2
Any ideas?
Answer 0 (score: 0)
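import numpy as np

# Append the next row's Description to the current row when the next row's ID is NaN
# (that is, when the next row is a continuation line). The last row has no next row.
# x.name is the row label, so this assumes the default 0..n-1 index.
df['Description'] = df.apply(lambda x: x.Description if x.name == len(df) - 1 or not np.isnan(df.iloc[x.name + 1]['ID']) else x.Description + ' ' + df.iloc[x.name + 1]['Description'], axis=1)
# Drop the continuation rows, which still hold NaN in ID and QTY.
df = df.dropna()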
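
The row-by-row approach above only looks one line ahead, so a description wrapped over three or more lines would lose text. A possible alternative, sketched here and not taken from the original answer, is to number each record by counting non-null IDs and then join the descriptions within each record. The DataFrame is built by hand to mirror the output shown in the question, and the names group and merged are only illustrative. It assumes the first row always has an ID.

```python
import numpy as np
import pandas as pd

# Stand-in for the frame that read_fwf produced in the question
# (assumption: continuation lines have NaN in both ID and QTY).
df = pd.DataFrame({
    "ID": [1, np.nan, 2],
    "Description": ["Description split over", "two lines", "Description on one line"],
    "QTY": [1, np.nan, 2],
})

# Every non-null ID starts a new record; continuation rows share its number.
group = df["ID"].notna().cumsum().rename("record")

merged = (
    df.groupby(group)
      .agg({"ID": "first", "Description": " ".join, "QTY": "first"})
      .reset_index(drop=True)
      .astype({"ID": int, "QTY": int})  # safe here because no NaN remains
)
print(merged)
```

Because the grouping key is derived from the ID column itself, this handles any number of continuation lines per record without indexing into neighbouring rows.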