I have the code below, which reads a dataframe and gives the following warning:
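A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
I know there is already a post about this warning, but I don't understand how to solve this particular case. Can you help?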
import pandas as pd

x = pd.read_csv(r'C:\Users\user\Desktop\Dataset.csv', sep = ',')
x['dates'] = pd.to_datetime(x['dates']) # convert the column to datetime type
v = x[(x['proj'].str.contains('3'))] ### This part is causing the issue.
v['mnth_yr'] = v['dates'].apply(lambda d: d.strftime('%B-%Y')) # the warning is raised here
Edit: it still gives the warning with the version below. Is it possible to do the assignment without the warning?
x = pd.read_csv(r'C:\Users\user\Desktop\Dataset.csv', sep = ',')
x.loc[:,'dates'] = pd.to_datetime(x['dates']) #turn column to datetime type
v = x[(x['proj'].str.contains('3'))] ###This part is causing the issue.
###And in the next line gives the warning, since it's a copy.
v.loc[:,'mnth_yr'] = v['dates'].apply(lambda x: x.strftime('%B-%Y'))
Answer 0 (score: 0)
You can usually get rid of the warning by using .loc and specifying the column together with all rows. For example,
x.loc[:, 'dates'] = pd.to_datetime(x['dates'])
...
v.loc[:, 'mnth_yr'] = v['dates'].apply(lambda x: x.strftime('%B-%Y'))
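The difference between the two is that, in your example, x['dates'] can return a copy of the part of the dataframe selected by the column 'dates' (the slice). When you use .loc, it returns the slice itself rather than a copy. Usually this is not a problem, unless you do nested slicing of the data. In that case, a nested slice without .loc will fail to update the original dataframe. See more details here: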
https://pandas.pydata.org/pandas-docs/stable/indexing.html#returning-a-view-versus-a-copy
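
The nested-slicing point can be checked with a small sketch. The in-memory frame below is a hypothetical stand-in for Dataset.csv, and the 'flag' column is invented purely for illustration. The last step, taking an explicit .copy() of the filtered rows, is not part of the answer above; it is a commonly used way to make v an independent frame, offered here as an assumption about what the asker wants, since the question reports that v.loc still warns.

```python
import warnings
import pandas as pd

# Hypothetical stand-in for Dataset.csv; 'flag' is an illustrative column.
x = pd.DataFrame({
    "proj": ["A3", "B1", "C3"],
    "dates": pd.to_datetime(["2020-01-15", "2020-02-20", "2021-03-05"]),
    "flag": [0, 0, 0],
})
mask = x["proj"].str.contains("3")

# 1) Nested (chained) slicing: the write lands on a temporary copy,
#    so the original frame is not updated. pandas warns about this,
#    which is silenced here only to keep the demonstration quiet.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    x[mask]["flag"] = 1
after_chained = x["flag"].tolist()

# 2) A single .loc call with both row and column indexers updates x itself.
x.loc[mask, "flag"] = 1
after_loc = x["flag"].tolist()

# 3) If v is meant to be a separate frame, copy it explicitly;
#    adding a column to v then leaves x untouched.
v = x[mask].copy()
v["mnth_yr"] = v["dates"].dt.strftime("%B-%Y")
```

Here after_chained still holds the original zeros, while after_loc shows the filtered rows updated. Whether to use .loc on x or a .copy() for v depends on whether the new column should end up in the original dataframe or only in the subset.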