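I have a DataFrame:
speciality_id  speciality_name
1              Acupuncturist
2              Andrologist
3              Anaesthesiologist
4              Audiologist
5              Ayurvedic Doctor
6              Biochemist
7              Biophysicist
I want to replicate the DataFrame above for a range of years and months.
For example:
year = [2018]
Month = [1,2]
I want to produce a DataFrame with Year and Month columns in which every row above appears once for each year and month combination.
I can't think of a way to do this. What is the correct approach?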
Answer 0 (score: 2)
Use product to build all combinations, create a DataFrame from them, and then merge it with the original using a left join:
from itertools import product
import pandas as pd
year = [2018]
Month = [1,2]
# one row per (year, month, speciality_id) combination
df1 = pd.DataFrame(list(product(year, Month, df['speciality_id'])),
                   columns=['Year','Month','speciality_id'])
print (df1)
    Year  Month  speciality_id
0   2018      1              1
1   2018      1              2
2   2018      1              3
3   2018      1              4
4   2018      1              5
5   2018      1              6
6   2018      1              7
7   2018      2              1
8   2018      2              2
9   2018      2              3
10  2018      2              4
11  2018      2              5
12  2018      2              6
13  2018      2              7
df = df1.merge(df, on='speciality_id', how='left')
print (df)
    Year  Month  speciality_id    speciality_name
0   2018      1              1      Acupuncturist
1   2018      1              2        Andrologist
2   2018      1              3  Anaesthesiologist
3   2018      1              4        Audiologist
4   2018      1              5   Ayurvedic Doctor
5   2018      1              6         Biochemist
6   2018      1              7       Biophysicist
7   2018      2              1      Acupuncturist
8   2018      2              2        Andrologist
9   2018      2              3  Anaesthesiologist
10  2018      2              4        Audiologist
11  2018      2              5   Ayurvedic Doctor
12  2018      2              6         Biochemist
13  2018      2              7       Biophysicist
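For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same product-plus-left-merge idea. The helper name replicate_for_periods and its key parameter are my own, purely illustrative, and it assumes the key column is unique in the original frame (duplicate keys would multiply rows in the merge).

```python
from itertools import product

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "speciality_id": [1, 2, 3, 4, 5, 6, 7],
    "speciality_name": [
        "Acupuncturist", "Andrologist", "Anaesthesiologist", "Audiologist",
        "Ayurvedic Doctor", "Biochemist", "Biophysicist",
    ],
})


def replicate_for_periods(frame, years, months, key="speciality_id"):
    """Return one copy of every row of `frame` per (year, month) pair.

    Hypothetical helper; assumes `key` is unique within `frame`.
    """
    combos = pd.DataFrame(
        list(product(years, months, frame[key])),
        columns=["Year", "Month", key],
    )
    # A left join keeps the (year, month, key) order produced by product().
    return combos.merge(frame, on=key, how="left")


result = replicate_for_periods(df, [2018], [1, 2])
print(result)
```

Because the combinations are built first and the original columns are joined on afterwards, extra years or months only change the lists passed in.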
Answer 1 (score: 0)
You can compute the Cartesian product with pd.MultiIndex.from_product and then combine it with a tiled copy of the DataFrame:
import numpy as np
import pandas as pd
year = [2018]
month = [1, 2]
# Cartesian product of years and months as a MultiIndex
cart_prod = pd.MultiIndex.from_product([year, month], names=['year', 'month'])
# tile the DataFrame once per (year, month) pair, then attach each pair,
# repeated once per original row, as the index and move it into columns
res = (df.loc[np.tile(df.index, len(year) * len(month))]
         .set_index(cart_prod.repeat(df.shape[0]))
         .reset_index())
print(res)
    year  month  speciality_id    speciality_name
0   2018      1              1      Acupuncturist
1   2018      1              2        Andrologist
2   2018      1              3  Anaesthesiologist
3   2018      1              4        Audiologist
4   2018      1              5   Ayurvedic Doctor
5   2018      1              6         Biochemist
6   2018      1              7       Biophysicist
7   2018      2              1      Acupuncturist
8   2018      2              2        Andrologist
9   2018      2              3  Anaesthesiologist
10  2018      2              4        Audiologist
11  2018      2              5   Ayurvedic Doctor
12  2018      2              6         Biochemist
13  2018      2              7       Biophysicist
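The approach works because np.tile and repeat expand in complementary ways: tile repeats the whole sequence of rows, while repeat duplicates each (year, month) pair in place, so the two line up row by row. A small sketch of that alignment, using a three-element stand-in index (the names ids, periods and pairs are illustrative only):

```python
import numpy as np
import pandas as pd

# Stand-in for df.index / the id column, shortened to three rows.
ids = pd.Index([1, 2, 3])
periods = pd.MultiIndex.from_product([[2018], [1, 2]], names=["year", "month"])

# tile repeats the whole sequence: 1 2 3 1 2 3
tiled = np.tile(ids.to_numpy(), len(periods))
# repeat duplicates each pair in place: (2018, 1) three times, then (2018, 2)
repeated = periods.repeat(len(ids))

# Zipping the two shows every id paired with every month exactly once.
pairs = [(int(m), int(i)) for m, i in zip(repeated.get_level_values("month"), tiled)]
print(pairs)
```

If the two calls were swapped (repeat on the rows, tile on the pairs), the result would still contain every combination, only ordered by id instead of by month.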
Answer 2 (score: 0)
I hope this helps.
# Create the new columns (Month is filled in below)
df['Year'], df['Month'] = 2018, None
# Create two copies of the DataFrame
df1 = df.copy()
df2 = df.copy()
# Set the month in each copy
df1['Month'], df2['Month'] = 1, 2
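The snippet above stops once the two copies exist. One way to combine them into a single frame, assuming that was the intent, is pd.concat. The sketch below is my own variation rather than the answerer's code: it uses assign so the original df is not modified, and loops over the months instead of creating df1 and df2 by hand.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "speciality_id": [1, 2, 3, 4, 5, 6, 7],
    "speciality_name": [
        "Acupuncturist", "Andrologist", "Anaesthesiologist", "Audiologist",
        "Ayurvedic Doctor", "Biochemist", "Biophysicist",
    ],
})

year, months = 2018, [1, 2]

# assign() returns a new frame each time, leaving df itself unchanged.
copies = [df.assign(Year=year, Month=m) for m in months]

# Stack the copies and renumber the rows 0..n-1.
result = pd.concat(copies, ignore_index=True)

# Put the period columns first, matching the layout in the other answers.
result = result[["Year", "Month", "speciality_id", "speciality_name"]]
print(result)
```

This only covers a single year; for several years a nested loop or one of the product-based answers would be needed.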