Question

I want to implement this simple R code in pandas, using simple syntax.

Here is the R code:

> head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
> mtcars$year <- c(1973, 1974)
> head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb year
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 1973
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 1974
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 1973
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1 1974
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 1973
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 1974

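As you can see, the year column has been added to the data frame and filled with the two values, repeated until the end of the column.

How can I achieve this in pandas with simple code?

Note that I don't want a for loop in the solution, because it would take a long time on a large dataset.

Thanks!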

Answer 1

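When you add a column to a pandas DataFrame, you must supply an object whose length matches the number of rows in the DataFrame (unless every value is the same, in which case a scalar value can be assigned to the column). To do this, you can repeat the list enough times to exceed the length of the DataFrame and then slice it down to the correct length: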

mtcars['year'] = ([1973, 1974] * (len(mtcars) // 2 + 1))[:len(mtcars)]

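Thanks to MaxU for the inspiration for this solution.

If the DataFrame has an even number of rows, you can simply repeat the list elements up to the length of the DataFrame: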

mtcars['year'] = [1973, 1974] * (len(mtcars) // 2)
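The repeat-then-slice idea above can be wrapped in a small helper so that it works for any list of values and for odd or even row counts. This is only a minimal sketch: the helper name `cycle_to_length` and the five-row stand-in DataFrame are illustrative assumptions, not part of the original answer.

```python
import pandas as pd


def cycle_to_length(values, n):
    """Repeat `values` in order until the result has exactly `n` items."""
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    # Ceiling division: the fewest repetitions that cover n items.
    repeats = -(-n // len(values))
    return (values * repeats)[:n]


# Small stand-in for mtcars (5 rows, an odd count on purpose).
df = pd.DataFrame({"mpg": [21.0, 21.0, 22.8, 21.4, 18.7]})
df["year"] = cycle_to_length([1973, 1974], len(df))
print(df)
```

Slicing to `len(df)` is what makes the odd-length case safe, since the repeated list may be one or more items too long.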

Answer 2

I would suggest:

$('ul#sub-menu > li').each(() => {
        const classProp = $(this).className;
        if (classProp.indexOf('select') >= 0) { //de-selecting
          const lastIndex = classProp.lastIndexOf(" ");
          $(this).className = classProp.substring(0, lastIndex);
        }
      });

It may be a bit complicated, but it also works for an even number of rows.

Answer 3

Use numpy tile (much faster than the list-repetition technique):

import numpy as np

years = (1973, 1974)
mtcars['year'] = np.tile(years, int(len(mtcars) / len(years)) + 1)[:len(mtcars)]
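For anyone who wants to try the tile approach without loading mtcars, here is a self-contained sketch on a made-up seven-row DataFrame (the column name `x` is an assumption for illustration). It also checks the result against `np.resize`, which fills a longer array with repeated copies of its input. That is an alternative not mentioned in the original answer.

```python
import numpy as np
import pandas as pd

# Stand-in frame with 7 rows; the column name "x" is illustrative.
df = pd.DataFrame({"x": range(7)})

years = (1973, 1974)

# Ceiling division gives the fewest repetitions that cover every row.
reps = -(-len(df) // len(years))

# Tile the values, then trim the surplus so the length matches the frame.
df["year"] = np.tile(years, reps)[:len(df)]

# np.resize repeats its input to fill the requested length directly.
alt = np.resize(years, len(df))

print(df)
print(alt)
```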

Numpy tile on a DataFrame with 1 million rows:

import pandas as pd

mtcars = pd.DataFrame(np.arange(1000000))

years = (1973, 1974)
mtcars['year'] = np.tile(years, int(len(mtcars) / len(years)) + 1)[:len(mtcars)]

CPU times: user 0 ns, sys: 4 ms, total: 4 ms
Wall time: 3.81 ms

List repetition on a DataFrame with 1 million rows:

mtcars['year'] = ([1973, 1974] * (len(mtcars) // 2 + 1))[:len(mtcars)]

CPU times: user 140 ms, sys: 0 ns, total: 140 ms
Wall time: 136 ms
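The timings above look like IPython `%time` output. To repeat the comparison in a plain script, something like the following sketch with the standard `timeit` module should work. The function names and the choice of five runs are assumptions, and the absolute numbers will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

YEARS = [1973, 1974]


def fill_with_list(df):
    # Plain-list repetition, then slice to the row count.
    n = len(df)
    df["year"] = (YEARS * (n // len(YEARS) + 1))[:n]


def fill_with_tile(df):
    # NumPy repetition, then slice to the row count.
    n = len(df)
    df["year"] = np.tile(YEARS, n // len(YEARS) + 1)[:n]


df_list = pd.DataFrame(np.arange(1_000_000))
df_tile = df_list.copy()

# Average over a few runs; absolute numbers depend on the machine.
runs = 5
t_list = timeit.timeit(lambda: fill_with_list(df_list), number=runs) / runs
t_tile = timeit.timeit(lambda: fill_with_tile(df_tile), number=runs) / runs
print(f"list: {t_list * 1e3:.2f} ms, tile: {t_tile * 1e3:.2f} ms")
```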

Add a new column to a pandas DataFrame and fill it with 2 values until the end of the column
