Question

How to repeat the values of a dataframe n times while adding a new column in each repetition

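I have managed to repeat the values n times, but I don't know how to add the new column. This is my initial dataframe of randomly generated temperatures -

I can repeat it n times using the following code -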

df2 = pd.DataFrame(np.repeat(df.values, 2, axis=0))

Now I want the new df to have a new column named city, where each repetition gets a different value taken from the following list -

cities = ['Bangalore', 'Hyderabad']  # no. of cities will be the same as n

expected output -
df2 = 
    temp city
0   30   Bangalore
1   40   Bangalore
2   50   Bangalore
3   60   Bangalore
4   30   Hyderabad
5   40   Hyderabad
6   50   Hyderabad
7   60   Hyderabad

How can I get this?

Answer 1

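Use `DataFrame.assign` and `pd.concat`:

We iterate over each city in your cities list and assign it as a new column on a copy of the original dataframe (called `df1` here). Then we use concat to join the separate dataframes into one final dataframe; `ignore_index=True` renumbers the rows from 0.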

final = pd.concat([df1.assign(city=c) for c in cities], ignore_index=True)

Output

   temp       city
0    30  Bangalore
1    40  Bangalore
2    50  Bangalore
3    60  Bangalore
4    30  Hyderabad
5    40  Hyderabad
6    50  Hyderabad
7    60  Hyderabad
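
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample dataframe is an assumption reconstructed from the expected output in the question (the original dataframe is not shown), and it is named `df` here rather than `df1`.

```python
import pandas as pd

# Assumed sample data, reconstructed from the question's expected output
df = pd.DataFrame({"temp": [30, 40, 50, 60]})
cities = ["Bangalore", "Hyderabad"]

# Build one tagged copy of df per city, then stack the copies in list order
final = pd.concat([df.assign(city=c) for c in cities], ignore_index=True)
print(final)
```

Because `assign` returns a new dataframe each time, the original `df` is left unchanged.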

Answer 2

Use numpy.tile and numpy.repeat:

import pandas as pd
import numpy as np

temps = [30, 40, 50, 60]
cities = ['Bangalore', 'Hyderabad']

temp = np.tile(temps, len(cities))
city = np.repeat(cities, len(temps))
df = pd.DataFrame({"temp": temp, "city": city})

Output:

    temp    city
0   30  Bangalore
1   40  Bangalore
2   50  Bangalore
3   60  Bangalore
4   30  Hyderabad
5   40  Hyderabad
6   50  Hyderabad
7   60  Hyderabad
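
The reason both functions are needed is that they repeat in different ways, which may also explain why `np.repeat` on its own in the question does not give the block-by-block layout shown in the expected output. A small sketch of the difference, using the same temperatures as above:

```python
import numpy as np

temps = [30, 40, 50, 60]

# np.repeat duplicates each element in place
repeated = np.repeat(temps, 2)
# np.tile repeats the whole sequence end to end
tiled = np.tile(temps, 2)

print(repeated)  # [30 30 40 40 50 50 60 60]
print(tiled)     # [30 40 50 60 30 40 50 60]
```

So the temperatures are tiled (one full block per city), while each city name is repeated once per temperature.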

Answer 3

Use pandas.MultiIndex.from_product:

pd.MultiIndex.from_product([df['temp'], cities], names=['temp', 'city']) \
    .to_frame(index=False) \
    .sort_values('city')

    temp    city
0   30  Bangalore
2   40  Bangalore
4   50  Bangalore
6   60  Bangalore
1   30  Hyderabad
3   40  Hyderabad
5   50  Hyderabad
7   60  Hyderabad
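
Note that the index in the output above is not sequential after sorting. If a clean 0 to 7 index is wanted, one possible variation (a sketch, not part of the original answer) is to put the cities first in the product so that no sort is needed, and then reorder the columns. The sample dataframe is again an assumption based on the question's expected output.

```python
import pandas as pd

# Assumed sample data, reconstructed from the question's expected output
df = pd.DataFrame({"temp": [30, 40, 50, 60]})
cities = ["Bangalore", "Hyderabad"]

# The first level of the product varies slowest, so rows come out grouped by city
result = (
    pd.MultiIndex.from_product([cities, df["temp"]], names=["city", "temp"])
    .to_frame(index=False)[["temp", "city"]]
)
print(result)
```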
