Question

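I'm having trouble with a matrix lookup and calculation. I have df1, which holds the values for different years for each indicator and country. I have df2, which holds combinations of two countries and a year. The desired output is df3, where each country combination gets a new column per indicator containing the product of the two countries' indicator values for that year.

I tried a few things with loc and split, but couldn't get it to work.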

Df1

import pandas as pd

df1 = pd.DataFrame({'Country': ['Armenia', 'Azerbaidjan', 'Belarus', 'Armenia', 'Azerbaidjan', 'Belarus'],
                    'Indictaor': ['G', 'G', 'G', 'H', 'H', 'H'],
                    '2005': [3, 4, 5, 6, 7, 4],
                    '2006': [6, 3, 1, 3, 5, 6]})

Df2

df2 = pd.DataFrame({'Year':[2005,2006,2005,2006],
                    'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'],
                    'Country2': ['Belarus','Belarus','Belarus','Belarus']})

Df3

df3 = pd.DataFrame({'Year': [2005, 2006, 2005, 2006],
                    'Country2': ['Belarus', 'Belarus', 'Belarus', 'Belarus'],
                    'Country1': ['Armenia', 'Armenia', 'Azerbaidjan', 'Azerbaidjan'],
                    'IndictaorGProduct': [15, 6, 35, 5],
                    'IndictaorHProduct': [24, 18, 28, 30]})

Answer 1

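You can do it like this. This is just one of the many ways I can think of.

df[df['col'] == hoge] returns the rows whose 'col' value equals hoge.

Build one list per indicator, then add the lists to the DataFrame as new columns.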
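As a small illustration of that filtering idiom, here is a sketch using the df1 from the question. The names g_rows, armenia_g and value_2005 are only placeholders for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'Country': ['Armenia', 'Azerbaidjan', 'Belarus',
                                'Armenia', 'Azerbaidjan', 'Belarus'],
                    'Indictaor': ['G', 'G', 'G', 'H', 'H', 'H'],
                    '2005': [3, 4, 5, 6, 7, 4],
                    '2006': [6, 3, 1, 3, 5, 6]})

# Keep only the rows whose indicator is 'G'
g_rows = df1[df1['Indictaor'] == 'G']

# Filter again to get the single row for one country
armenia_g = g_rows[g_rows['Country'] == 'Armenia']

# Read the 2005 value from that row (positional access on the filtered column)
value_2005 = armenia_g['2005'].iloc[0]
print(value_2005)
```

Chaining two such filters is how the loop in this answer finds the row for one indicator and one country.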

G = []
H = []

# loop over every row of df2 instead of hard-coding the row count
for i in range(len(df2)):

    rowlist = df2.iloc[i].to_list()

    year = str(rowlist[0])  # df1 uses string column names such as '2005'
    country1 = rowlist[1]
    country2 = rowlist[2]

    # make G list
    tmp = df1[df1['Indictaor'] == 'G']
    row1 = tmp[tmp['Country'] == country1].index[0]
    row2 = tmp[tmp['Country'] == country2].index[0]
    # row1 and row2 are index labels, so look the values up with .loc
    G.append(df1.loc[row1, year] * df1.loc[row2, year])
    # make H list
    tmp = df1[df1['Indictaor'] == 'H']
    row1 = tmp[tmp['Country'] == country1].index[0]
    row2 = tmp[tmp['Country'] == country2].index[0]
    H.append(df1.loc[row1, year] * df1.loc[row2, year])


df2['IndictaorGProduct'] = G
df2['IndictaorHProduct'] = H
print(df2)
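If you would rather avoid the explicit loop, the same lookup can probably be done with melt and merge. The following is only a sketch under the assumption that every (Country, Year) pair in df2 exists in df1; the intermediate names long, wide, c1, c2, merged and result are made up for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'Country': ['Armenia', 'Azerbaidjan', 'Belarus',
                                'Armenia', 'Azerbaidjan', 'Belarus'],
                    'Indictaor': ['G', 'G', 'G', 'H', 'H', 'H'],
                    '2005': [3, 4, 5, 6, 7, 4],
                    '2006': [6, 3, 1, 3, 5, 6]})
df2 = pd.DataFrame({'Year': [2005, 2006, 2005, 2006],
                    'Country1': ['Armenia', 'Armenia', 'Azerbaidjan', 'Azerbaidjan'],
                    'Country2': ['Belarus', 'Belarus', 'Belarus', 'Belarus']})

# Long form: one row per (Country, Indictaor, Year)
long = df1.melt(id_vars=['Country', 'Indictaor'],
                var_name='Year', value_name='Value')
long['Year'] = long['Year'].astype(int)  # match the integer years in df2

# One row per (Country, Year) with a column for each indicator
wide = (long.set_index(['Country', 'Year', 'Indictaor'])['Value']
            .unstack('Indictaor')
            .reset_index())

# Two renamed copies of the lookup table, one for each country column in df2
c1 = wide.rename(columns={'Country': 'Country1', 'G': 'G1', 'H': 'H1'})
c2 = wide.rename(columns={'Country': 'Country2', 'G': 'G2', 'H': 'H2'})

# Left merges keep the row order of df2
merged = (df2.merge(c1, on=['Year', 'Country1'], how='left')
             .merge(c2, on=['Year', 'Country2'], how='left'))

result = df2.copy()
result['IndictaorGProduct'] = (merged['G1'] * merged['G2']).to_numpy()
result['IndictaorHProduct'] = (merged['H1'] * merged['H2']).to_numpy()
print(result)
```

On the sample data this should give the same two product columns as the loop. The merge version may scale better when df2 has many rows, because the lookups are done in bulk rather than row by row.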

Pandas merge and matrix calculation
