Question

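I am working with a time series of hourly traffic congestion data. It has a categorical column that contains many zeros where specific labels should be, and I need to map and replace them. The column marks holidays, which are determined by the timestamp (the month and day of that year). However, because the information is missing for the remaining rows, only one row per holiday is populated. I have been able to fill these rows without using .loc or .iloc, but pandas gives the usual SettingWithCopyWarning here. I would like to see a better alternative to this code block, one that performs the task cleanly without raising the warning. The code I am using is very simple: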

for row in range(len(main_data[main_data.Year == 2012])):
    if main_data.Month[row] == 12 and main_data.Day[row] == 25:  # Christmas
        if main_data.is_holiday[row] == 0:
            main_data.is_holiday[row] = 3  # Label mapped for Xmas

__main__:4: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

# Before reset
print(len(main_data.query("is_holiday == 3 and Year == 2012")))
1  # This is supposed to be 24 for hourly timestamps for 12-25-2012

# After reset
print(len(main_data.query("is_holiday == 3 and Year == 2012")))
24

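So although the code above successfully fills all the zeros with the appropriate label, it returns the warning every time. I have looked at some of the documentation for .loc and .iloc and tried them out, but got a KeyError: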

for row in range(len(main_data.loc[main_data.Year == 2012])):
    if main_data.Month[row] == 11 and main_data.Day[row] == 22:
        main_data.loc[main_data.is_holiday == 0, row] = 2

KeyError: "None of [Int64Index([0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n            ...\n            0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n           dtype='int64', length=36508)] are in the [columns]"

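I have some understanding of how indexing works in pandas, but how do I implement it in a loop?

What is a better way to do this get-and-set task in pandas?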
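
For reference, here is a minimal sketch of the boolean-mask form of .loc that the indexing documentation points toward. It is not a confirmed fix for the real dataset. The small frame below is a hypothetical stand-in for main_data, built only so the example runs on its own, and the names stamps, xmas and xmas_count are illustrative. It assumes Year, Month, Day and is_holiday are plain integer columns, as in the loops above.

```python
import pandas as pd

# Hypothetical stand-in for main_data: 72 hourly rows covering
# 24-26 December 2012, with every is_holiday value starting at 0.
stamps = pd.date_range("2012-12-24", periods=72, freq=pd.Timedelta(hours=1))
main_data = pd.DataFrame({
    "Year": stamps.year,
    "Month": stamps.month,
    "Day": stamps.day,
    "is_holiday": 0,
})

# As in the question, only the first hour of 25 December carries its label.
main_data.loc[24, "is_holiday"] = 3

# Boolean mask over rows: Christmas 2012 rows that are still unlabelled.
xmas = (
    (main_data["Year"] == 2012)
    & (main_data["Month"] == 12)
    & (main_data["Day"] == 25)
    & (main_data["is_holiday"] == 0)
)

# .loc takes the row selector first and the column label second, so this
# single call writes to main_data itself rather than to a temporary object.
main_data.loc[xmas, "is_holiday"] = 3

xmas_count = len(main_data.query("is_holiday == 3 and Year == 2012"))
print(xmas_count)
```

In this form the row condition goes in the first position of .loc and the column name in the second. The .loc attempt above passes row in the column position, which is consistent with the KeyError mentioning [columns]. No explicit loop over row numbers is needed, because the mask selects all matching rows at once.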

0 answers: