Question

I have a DataFrame df:

         Filename         Weight
0  '\file path\file.txt'    NaN
1  '\file path\file.txt'    NaN
2  '\file path\file.txt'    NaN

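I have a function that takes a file name and extracts a float value from that file. What I want is to pass the file path from each row of df's Filename column to my function and write the result to the Weight column. My current code is: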

df['Weight'] = df['Weight'].apply(x_wgt_pct(df['filename'].to_string()), axis = 1)

The error I get is:

pandas\parser.pyx in pandas.parser.TextReader.__cinit__ (pandas\parser.c:3173)()

pandas\parser.pyx in pandas.parser.TextReader._setup_parser_source (pandas\parser.c:5912)()

IOError: File 0      file0.txt
1      file1.txt
2      file2.txt
3      file3.txt does not exist

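I'm not sure whether this error occurs because all the file paths are being passed at once as a single string, or because I'm not entering the file path correctly.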

Answer 1

to_string builds one single string out of the entire column, which is not what you want:

In [11]: df['Filename'].to_string()
Out[11]: "0  '\\file    path\\file.txt'\n1  '\\file    path\\file.txt'\n2  '\\file    path\\file.txt'"

Assuming x_wgt_pct is the function that takes a file path and returns a float, you can iterate over the entries:

for idx, f in df["Filename"].items():
    weight = x_wgt_pct(f)  # Note: you may have to slice off the quotes, i.e. f[1:-1]
    df.loc[idx, "Weight"] = weight  # label-based assignment (df.ix was removed in pandas 1.0)

Note: if you have duplicate row index labels, you will need to take further care.
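
As an alternative to the explicit loop, here is a minimal sketch that calls the function once per row with Series.apply. The x_wgt_pct below is only a hypothetical stand-in (it reads one float from the first line of a file), and the temporary files and their values are made up so the example runs end to end. Because the result has the same index as df, this should also work when the index labels are duplicated.

```python
import os
import tempfile

import pandas as pd


def x_wgt_pct(path):
    """Hypothetical stand-in: read a single float from the first line of a file."""
    with open(path) as fh:
        return float(fh.readline())


# Create a few throwaway files so the sketch is runnable (made-up data).
tmp_dir = tempfile.mkdtemp()
paths = []
for n, value in enumerate([1.5, 2.5, 3.5]):
    path = os.path.join(tmp_dir, "file%d.txt" % n)
    with open(path, "w") as fh:
        fh.write(str(value))
    paths.append(path)

# The index deliberately contains a duplicate label.
df = pd.DataFrame({"Filename": paths}, index=[0, 0, 1])

# Series.apply calls x_wgt_pct on each element, one path at a time,
# and returns a Series aligned with df's index.
df["Weight"] = df["Filename"].apply(x_wgt_pct)
print(df["Weight"])
```

If the stored paths are wrapped in literal quote characters, something like df["Filename"].str.strip("'") before the apply call may be needed.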
