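Cycle through a range of numbers and assign it to df.col using pandas or itertools

Time: 2018-12-06 17:54:56

Tags: python pandas loops apply

I want to cycle through a series of numbers and write them into a DataFrame column.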

import pandas as pd

data = {'NAME': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy', 'Tina3', 'Jake2', 'Amy1', 'Jake3', 'Amy2'],
        'REPORTS': [4, 24, 31, 2, 3, 12, 13, 63, 22, 64]}

df = pd.DataFrame(data)
df['col'] = 0
range = [1, 2, 3]  # note: this name shadows the built-in range()

I would like the output to look like this:

Jason  4    1
Molly  24   2
Tina   31   3
Jake   2    1
Amy    3    2

I have tried:

for row in df['col']:
    d['col'].append(range)

df['col'] = df.apply(lambda x: df['col']+range)

2 Answers:

Answer 0: (score: 0)

IIUC, you can use itertools.cycle to cycle through the values for the length of the DataFrame:

from itertools import cycle

c = cycle(range(1,4))

df['new_column'] = [next(c) for _ in range(len(df))]

>>> df
    NAME  REPORTS  new_column
0  Jason        4           1
1  Molly       24           2
2   Tina       31           3
3   Jake        2           1
4    Amy        3           2
5  Tina3       12           3
6  Jake2       13           1
7   Amy1       63           2
8  Jake3       22           3
9   Amy2       64           1
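
The same idea can be written without calling next() in a comprehension by slicing the cycle with itertools.islice. This is only a sketch under the assumption that the data is the frame from the question; the helper name cycled_values is hypothetical and not part of the original answer.

```python
from itertools import cycle, islice

import pandas as pd


def cycled_values(values, n):
    """Return the first n items of `values` repeated end to end (hypothetical helper)."""
    return list(islice(cycle(values), n))


data = {'NAME': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy',
                 'Tina3', 'Jake2', 'Amy1', 'Jake3', 'Amy2'],
        'REPORTS': [4, 24, 31, 2, 3, 12, 13, 63, 22, 64]}
df = pd.DataFrame(data)

# One value per row; the list is assigned positionally.
df['new_column'] = cycled_values([1, 2, 3], len(df))
```

Because islice stops after len(df) items, the infinite cycle is never exhausted and no manual counter is needed.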

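An alternative is to use np.tile to repeat your range, but this seems less readable to me:

import numpy as np  # pd.np is no longer available in recent pandas versions

# Repeat 1..3 enough times to cover the frame, then trim to its length.
df['new_column'] = np.tile(range(1, 4), (len(df) // 3) + 1)[:len(df)]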
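
If the tile-and-trim arithmetic feels awkward, np.resize may be a tidier option, since it fills a larger output with repeated copies of the input. The following is a sketch rather than part of the original answer, and it assumes the frame from the question.

```python
import numpy as np
import pandas as pd

data = {'NAME': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy',
                 'Tina3', 'Jake2', 'Amy1', 'Jake3', 'Amy2'],
        'REPORTS': [4, 24, 31, 2, 3, 12, 13, 63, 22, 64]}
df = pd.DataFrame(data)

values = [1, 2, 3]

# np.resize repeats `values` cyclically until the requested length is reached.
df['new_column'] = np.resize(values, len(df))
```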

Answer 1: (score: 0)

Use a lambda together with axis=1. Example code:

import pandas as pd

data = {'NAME': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy', 'Tina3', 'Jake2', 'Amy1', 'Jake3', 'Amy2'],
        'REPORTS': [4, 24, 31, 2, 3, 12, 13, 63, 22, 64]}
df = pd.DataFrame(data)
df['col'] = 0
values = [1, 2, 3]  # renamed from `range` to avoid shadowing the built-in
# x.name is the row's index label, so this relies on the default 0..n-1 index.
df['col'] = df.apply(lambda x: values[x.name % len(values)], axis=1)
print(df)

The output is

    NAME  REPORTS  col
0  Jason        4    1
1  Molly       24    2
2   Tina       31    3
3   Jake        2    1
4    Amy        3    2
5  Tina3       12    3
6  Jake2       13    1
7   Amy1       63    2
8  Jake3       22    3
9   Amy2       64    1
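
One caveat with the apply approach: x.name is the index label, so the result depends on the frame having the default integer index. A sketch of a position-based variant follows, assuming a hypothetical frame whose index is not 0..n-1 (for example after filtering or setting a custom index); it is not from the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default index, where label % 3 would give wrong values.
df = pd.DataFrame(
    {'NAME': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
     'REPORTS': [4, 24, 31, 2, 3]},
    index=[10, 20, 30, 40, 50],
)

values = [1, 2, 3]

# Use row positions (0, 1, 2, ...) instead of index labels.
positions = np.arange(len(df)) % len(values)
df['col'] = np.take(values, positions)
```

This also avoids the per-row Python call that apply with axis=1 makes, which can matter on larger frames.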