I am trying to read an ASCII file line by line into a Pandas DataFrame.
I wrote the following script:
import pandas as pd
col_labels = ['Sg', 'Krg', 'Krw', 'Pc']
df = pd.DataFrame(columns=col_labels)
f = open('EPS.INC', 'r')
for line in f:
    if 'SGWFN' in line:
        print('Reading relative permeability table')
        for line in f:
            line = line.strip()
            if (line.split() and not line.startswith('/') and not line.startswith('--')):
                cols = line.split()
                print(repr(cols))
                df=df.append(cols)
print('Resulting Dataframe')
print(df)
The file I am parsing looks like this:
SGWFN
--Facies 1 Drainage SATNUM 1
--Sg Krg Krw J
0.000000 0.000000 1.000000 0.000000
0.030000 0.000000 0.500000 0.091233
0.040000 0.000518 0.484212 0.093203
0.050000 0.001624 0.468759 0.095237
/
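I expected the four values on each line to be appended as one DataFrame row. Instead, they end up in a new column, one value per row, like this: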
Resulting Dataframe
    Sg  Krg  Krw   Pc         0
0  NaN  NaN  NaN  NaN  0.000000
1  NaN  NaN  NaN  NaN  0.000000
2  NaN  NaN  NaN  NaN  1.000000
3  NaN  NaN  NaN  NaN  0.000000
4  NaN  NaN  NaN  NaN  0.030000
5  NaN  NaN  NaN  NaN  0.000000
6  NaN  NaN  NaN  NaN  0.500000
Can someone explain what I am doing wrong?
Thanks! D
Answer 0 (score: 0)
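I suggest creating an empty list L, appending each row's values to it inside the loop, and calling the DataFrame constructor once at the end: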
import pandas as pd

L = []
# a with block makes sure the file is closed properly
with open("EPS.INC") as f:
    for line in f:
        if 'SGWFN' in line:
            print('Reading relative permeability table')
            for line in f:
                line = line.strip()
                if (line.split() and not line.startswith('/') and not line.startswith('--')):
                    cols = line.split()
                    print(repr(cols))
                    L.append(cols)
print('Resulting Dataframe')
col_labels = ['Sg', 'Krg', 'Krw', 'Pc']
df = pd.DataFrame(L, columns=col_labels)
print(df)
         Sg       Krg       Krw        Pc
0  0.000000  0.000000  1.000000  0.000000
1  0.030000  0.000000  0.500000  0.091233
2  0.040000  0.000518  0.484212  0.093203
3  0.050000  0.001624  0.468759  0.095237
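One thing to keep in mind with the list approach: line.split() returns strings, so the columns of the resulting DataFrame hold text rather than numbers. Below is a minimal sketch of the same idea with a float conversion added. The helper name read_sgwfn and the in-memory SAMPLE text (copied from the question, standing in for EPS.INC) are my own assumptions for illustration. Unlike the loop above, the sketch stops at the first line starting with '/', assuming that the slash terminates the table.

```python
import io

import pandas as pd

# Sample text copied from the question; it stands in for the real EPS.INC file.
SAMPLE = """SGWFN
--Facies 1 Drainage SATNUM 1
--Sg Krg Krw J
0.000000 0.000000 1.000000 0.000000
0.030000 0.000000 0.500000 0.091233
0.040000 0.000518 0.484212 0.093203
0.050000 0.001624 0.468759 0.095237
/
"""


def read_sgwfn(handle, col_labels=("Sg", "Krg", "Krw", "Pc")):
    """Hypothetical helper: collect the SGWFN rows, then build one DataFrame."""
    rows = []
    in_table = False
    for line in handle:
        line = line.strip()
        if not in_table:
            # Skip everything up to and including the SGWFN keyword line.
            in_table = "SGWFN" in line
            continue
        if line.startswith("/"):
            break  # assumption: a slash ends the table
        if line and not line.startswith("--"):
            rows.append(line.split())
    # The split values are strings, so convert every column to float.
    return pd.DataFrame(rows, columns=list(col_labels)).astype(float)


df = read_sgwfn(io.StringIO(SAMPLE))
print(df.dtypes)
print(df)
```

With a real file, the same function could be called as read_sgwfn(f) inside a with open("EPS.INC") as f: block.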
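Your own solution can be fixed by appending a Series whose index is set to the column labels. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this version only runs on older pandas releases: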
import pandas as pd

col_labels = ['Sg', 'Krg', 'Krw', 'Pc']
df = pd.DataFrame()
f = open('EPS.INC', 'r')
for line in f:
    if 'SGWFN' in line:
        print('Reading relative permeability table')
        for line in f:
            line = line.strip()
            if (line.split() and not line.startswith('/') and not line.startswith('--')):
                cols = line.split()
                print(repr(cols))
                df=df.append(pd.Series(cols, index=col_labels), ignore_index=True)
print('Resulting Dataframe')
print(df)
        Krg       Krw        Pc        Sg
0  0.000000  1.000000  0.000000  0.000000
1  0.000000  0.500000  0.091233  0.030000
2  0.000518  0.484212  0.093203  0.040000
3  0.001624  0.468759  0.095237  0.050000
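As for why the original script went wrong: a flat list of values handed to pandas is laid out as a single column named 0 with one value per row, which matches the output in the question. Wrapping the list in another list (or giving it an index of column labels, as above) makes it a single row instead. The sketch below shows the difference and, as a rough equivalent for pandas versions without DataFrame.append, builds the table with pd.concat. The parsed_lines list is made-up stand-in data taken from the first two rows of the question's file.

```python
import pandas as pd

col_labels = ["Sg", "Krg", "Krw", "Pc"]

# Two parsed lines from the question's file, as line.split() would return them.
parsed_lines = [
    ["0.000000", "0.000000", "1.000000", "0.000000"],
    ["0.030000", "0.000000", "0.500000", "0.091233"],
]

# A flat list becomes one column (named 0) with four rows.
as_column = pd.DataFrame(parsed_lines[0])
print(as_column.shape)

# A list wrapped in a list becomes one row with four columns.
as_row = pd.DataFrame([parsed_lines[0]], columns=col_labels)
print(as_row.shape)

# Build one single-row frame per parsed line, then concatenate them once.
frames = [pd.DataFrame([cols], columns=col_labels) for cols in parsed_lines]
df = pd.concat(frames, ignore_index=True)
print(df)
```

Concatenating once at the end avoids copying the growing DataFrame on every iteration, which is the same reason the list-based version is usually preferable.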