Question

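I have a very large dataset (test) with about 1 million rows. I want to update one column ('Date') of the dataset. I only want to put 3 dates in the 'Date' column: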

2014-04-01, 2014-05-01, 2014-06-01

So each row gets one of these dates, and the dates repeat every 3 rows.

I have tried this:

for i in range(0, len(test), 3):
    if i <= len(test):
        test['Date'][i] = '2014-04-01'
        test['Date'][i+1] = '2014-05-01'
        test['Date'][i+2] = '2014-06-01'

I get this warning:

__main__:3: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
__main__:4: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
__main__:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

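I have gone through the link but could not solve my problem. I also searched on Google and found some suggestions, such as calling copy() on the dataset before slicing, but neither that nor the other solutions worked.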

Answer 1

I believe what you want is np.tile:

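import numpy as np
import pandas as pd

# df is your existing DataFrame (test in the question)
dates = pd.Series(['2014-04-01', '2014-05-01', '2014-06-01'], dtype='datetime64[ns]')
# Tile the 3 dates enough times to cover every row, then trim to len(df)
repeated_dates = np.tile(dates, len(df) // 3 + 1)[:len(df)]
df['dates'] = repeated_dates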

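This builds an array of the repeated values, trimmed to the length of the DataFrame, and assigns it to a column of the DataFrame in a single step.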
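
As a minimal, self-contained sketch of the same idea, assume a small stand-in DataFrame named `test` with a `Date` column, as in the question. The toy frame and its `value` column are made up for illustration. Assigning the whole column at once avoids the chained `test['Date'][i] = ...` indexing that triggers SettingWithCopyWarning:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's large DataFrame.
test = pd.DataFrame({"value": range(7)})

dates = pd.to_datetime(["2014-04-01", "2014-05-01", "2014-06-01"])

# Repeat the 3 dates enough times to cover every row, then trim the excess.
reps = len(test) // len(dates) + 1
test["Date"] = np.tile(dates.to_numpy(), reps)[: len(test)]

print(test)
```

Because the row count here (7) is not a multiple of 3, the final cycle is cut short, which is what the `[: len(test)]` slice is for.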

Answer 2

You could also look at itertools islice and cycle, which let you cycle a list or Series over the length of the DataFrame.

from itertools import islice, cycle

import numpy as np
import pandas as pd

dates = pd.Series(['2014-04-01', '2014-05-01', '2014-06-01'], dtype='datetime64[ns]')
df = pd.DataFrame(np.random.randint(0, 50, 50).reshape(10, 5))

# Cycle through the 3 dates and take exactly len(df) of them
df['dates'] = list(islice(cycle(dates), len(df)))
print(df)

    0   1   2   3   4      dates
0  45   3  13  24  13 2014-04-01
1  30  44   6  17  24 2014-05-01
2  47  22  16  28  12 2014-06-01
3  11  13  10   0  47 2014-04-01
4  32  12  49  14   2 2014-05-01
5  15   6  21  17  49 2014-06-01
6  49  49  28  18   9 2014-04-01
7  18  35  35  40   7 2014-05-01
8  44  15  13  49  28 2014-06-01
9   9  14  36  36   6 2014-04-01
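
To see what `islice(cycle(...))` produces on its own, here is a small sketch using only the standard library. The name `n_rows` is a made-up stand-in for `len(df)`, and plain strings are used instead of timestamps to keep it short:

```python
from itertools import cycle, islice

labels = ["2014-04-01", "2014-05-01", "2014-06-01"]
n_rows = 7  # hypothetical row count, standing in for len(df)

# cycle() repeats the labels endlessly; islice() stops after n_rows items.
repeated = list(islice(cycle(labels), n_rows))
print(repeated)
```

This approach builds an ordinary Python list, so on a frame with around a million rows it may be noticeably slower than the np.tile version. That is worth timing on your own data before choosing one.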

SettingWithCopyWarning in pandas
