Say I have the following pandas DataFrame:
import pandas as pd

df = pd.DataFrame({'one': ['Baseline', 5, 6], 'two': [10, 10, 10]})
print(df)
print(df.dtypes)
# one object
# two int64
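I want to collect all rows where df.one != 'Baseline' and then convert the one column of this new DataFrame to the int dtype. I thought the following would work, but I get a SettingWithCopyWarning when I try to cast one to int: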
df_sub = df[df['one'] != 'Baseline']
df_sub['one'] = df_sub['one'].astype(int)
script.py:15. SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
df_sub['one'] = df_sub['one'].astype(int)
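The code seems to work fine (see below), but I would like to know how to avoid this warning (should I use a different approach, etc.?). I followed this question to change the dtype of a specific column. I also tried df_sub.loc[:, 'one'] = df_sub['one'].astype(int) and df_sub.loc[:, 'one'] = df_sub.loc[:, 'one'].astype(int), and I got the same warning.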
print(df_sub.dtypes)
# one int64
# two int64
Answer 0 (score: 5)
To avoid the warning, make an explicit copy of the filtered DataFrame:
df_sub = df[df['one'] != 'Baseline'].copy()  # A deep copy, so df_sub owns its data; without it, df_sub may still refer to df's data in memory.
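
Putting the pieces together, here is a minimal end-to-end sketch of the fix, assuming a reasonably recent pandas. The warning bookkeeping (the `caught` and `copy_warnings` names) is only added here for illustration and is not part of the original answer; it records any warnings raised by the assignment and keeps those whose class is named SettingWithCopyWarning.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'one': ['Baseline', 5, 6], 'two': [10, 10, 10]})

# Filter the rows and take an explicit copy, so df_sub owns its own data.
df_sub = df[df['one'] != 'Baseline'].copy()

# Record every warning raised while assigning the converted column.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    df_sub['one'] = df_sub['one'].astype(int)

# Keep only warnings whose class is named SettingWithCopyWarning.
copy_warnings = [
    w for w in caught if w.category.__name__ == 'SettingWithCopyWarning'
]

print(len(copy_warnings))  # number of SettingWithCopyWarning occurrences
print(df_sub.dtypes)       # 'one' should now have an integer dtype
print(df.dtypes)           # the original frame keeps its object column
```

The trade-off is a little extra memory for the copied rows. In exchange, later assignments to df_sub are unambiguous and cannot affect df.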