Question

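I have a pandas DataFrame in which the "path" column specifies the paths to image files.

I also have a function f, e.g. "get_info", which makes an API call and outputs (lists of) lists "a", "b" and "c".

I now want to apply f to my DataFrame and get back new columns containing the outputs 'a', 'b' and 'c'. I want to do this in batches of 10 rows so that intermediate results are saved (the function makes requests that take some time to complete, and the DataFrame has many rows).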

My desired output is a DataFrame like this:

        x             a                   b                    c      
35      'path1'       [[some lists]]       [[some lists]]      [[some lists]]      
1       'path2'       NaN                  NaN                 NaN       
362     'path3'       [[some lists]]       [[some lists]]      [[some lists]]

I tried the following code:

import sys

import numpy as np
import pandas as pd

df['a']=np.nan
df['b']=np.nan
df['c']=np.nan

for i in range(0, len(df),10):
    try:
        df[['a','b','c']].iloc[i:(i+10)]= df['path'].iloc[i:(i+10)].apply(get_info).apply(pd.Series)
    except:
        print("Unexpected error:", sys.exc_info()[1])

However, this produces a DataFrame in which the columns "a", "b" and "c" are filled only with NaN, and it raises a SettingWithCopyWarning:

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py:543: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s
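
A note on the warning: selecting a list of columns with df[['a','b','c']] returns a new object, so a following .iloc[...] = assignment writes into that temporary object rather than into df. The sketch below illustrates this on a small made-up DataFrame (the data and the value 1.0 are assumptions for illustration only), next to a single-step .iloc assignment that does reach df.

```python
import warnings

import numpy as np
import pandas as pd

# Small made-up frame standing in for the real one.
df = pd.DataFrame({"path": ["p0", "p1", "p2"]})
df["a"] = np.nan

# Chained form: df[["a"]] builds a new DataFrame first, and .iloc then
# assigns into that temporary object, so df itself is left untouched.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the chained-assignment warning
    df[["a"]].iloc[0:2] = 1.0
chained_result = df["a"].copy()  # still all NaN

# Single-step form: one indexer on df, rows and column given together.
df.iloc[0:2, df.columns.get_loc("a")] = 1.0
single_step_result = df["a"].copy()  # first two rows are now 1.0
```

The only difference between the two assignments is whether the row and column selection happen in one indexing call on df or in two chained calls.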

If I run only the right-hand side:

df['path'].iloc[0:(0+10)].apply(get_info).apply(pd.Series)

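the output is as expected (a DataFrame with the columns "a", "b", "c").

Similarly, when the function is applied to the entire DataFrame at once with the following code, the output is as expected: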

df['a']=np.nan
df['b']=np.nan
df['c']=np.nan

df[['a','b','c']]= df['path'].apply(get_info).apply(pd.Series)

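Why is the output not written to the DataFrame's columns as I expect, and how can I fix this? I tried following the hints in the documentation linked from the warning, but still without success.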
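
For reference, one possible way to do the batched processing is sketched below. It avoids assigning into slices altogether: each batch's results are collected in a small DataFrame keyed by the original index, and everything is joined back onto df at the end. This is only a sketch under assumptions: the get_info shown here is a hypothetical stand-in for the real API call, apply_in_batches is a made-up helper name, and df is assumed not to contain columns 'a', 'b', 'c' yet (join would otherwise complain about overlapping column names).

```python
import pandas as pd


def get_info(path):
    """Hypothetical stand-in for the real API call; returns three lists of lists."""
    return [[path, "a"]], [[path, "b"]], [[path, "c"]]


def apply_in_batches(df, func, batch_size=10, cols=("a", "b", "c")):
    """Apply func to df['path'] batch by batch and join the results onto df."""
    pieces = []
    for start in range(0, len(df), batch_size):
        batch = df["path"].iloc[start:start + batch_size]
        try:
            # One tuple (a, b, c) per path in this batch.
            rows = [tuple(func(p)) for p in batch]
        except Exception as exc:
            # As in the question, a failure skips the whole batch;
            # its rows end up as NaN after the join below.
            print("Unexpected error:", exc)
            continue
        # Keep the original index labels so the join can align rows later.
        pieces.append(pd.DataFrame(rows, index=batch.index, columns=list(cols)))
        # Intermediate results could be saved here, e.g. by pickling
        # pd.concat(pieces) to a file of your choice.
    if pieces:
        results = pd.concat(pieces)
    else:
        results = pd.DataFrame(columns=list(cols))
    # Left join on the index: rows without results get NaN in the new columns.
    return df.join(results)


# Tiny example mirroring the index labels from the desired output above.
df = pd.DataFrame({"path": ["path1", "path2", "path3"]}, index=[35, 1, 362])
result = apply_in_batches(df, get_info, batch_size=2)
```

Catching the exception per path instead of per batch would lose fewer rows when a single request fails; the per-batch form is kept here only because it matches the loop in the question.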

Write multiple outputs of a function to columns in a DataFrame: SettingWithCopyWarning

0 answers: