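I'm a beginner in data science. While working through the pandas topic, I came across a task where I can't work out what is going wrong. Let me explain the problem.
I have three DataFrames: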
gold = pd.DataFrame({'Country': ['USA', 'France', 'Russia'],
'Medals': [15, 13, 9]}
)
silver = pd.DataFrame({'Country': ['USA', 'Germany', 'Russia'],
'Medals': [29, 20, 16]}
)
bronze = pd.DataFrame({'Country': ['France', 'USA', 'UK'],
'Medals': [40, 28, 27]}
)
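I need to add up all the medals into one column, with the countries in the other column. When I add the frames I get NaN. I tried filling the NaN values with zero, but I still don't get the output I expect.
Code: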
gold.set_index('Country', inplace = True)
silver.set_index('Country',inplace = True)
bronze.set_index('Country', inplace = True)
Total = silver.add(gold,fill_value = 0)
Total = bronze.add(silver,fill_value = 0)
Total = gold + silver + bronze
print(Total)
Actual output:
Medals
Country
France NaN
Germany NaN
Russia NaN
UK NaN
USA 72.0
Expected output:
Medals
Country
USA 72.0
France 53.0
UK 27.0
Russia 25.0
Germany 20.0
Please let me know what is wrong.
Answer 0 (score: 2)
Just use concat, then groupby with sum:
pd.concat([gold,silver,bronze]).groupby('Country').sum()
Out[1306]:
Medals
Country
France 53
Germany 20
Russia 25
UK 27
USA 72
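A minimal self-contained sketch of the same idea, assuming the three frames as defined in the question (with `Country` still a regular column). The final `sort_values` call is my addition, used only to reproduce the ordering shown in the expected output:

```python
import pandas as pd

gold = pd.DataFrame({'Country': ['USA', 'France', 'Russia'],
                     'Medals': [15, 13, 9]})
silver = pd.DataFrame({'Country': ['USA', 'Germany', 'Russia'],
                       'Medals': [29, 20, 16]})
bronze = pd.DataFrame({'Country': ['France', 'USA', 'UK'],
                       'Medals': [40, 28, 27]})

# Stack the three frames on top of each other, then sum the medals per country.
total = pd.concat([gold, silver, bronze]).groupby('Country').sum()

# Order from most to fewest medals (added here to match the expected output).
total = total.sort_values(by='Medals', ascending=False)
print(total)
```

Because no country is ever missing from the stacked frame, no NaN appears and the counts stay integers.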
To fix your own code, chain the add calls:
silver.add(gold,fill_value = 0).add(bronze,fill_value=0)
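As a rough illustration of why the question's code prints NaN (this is my reading of the behavior, not something stated in the answer): the `+` operator aligns the frames on their index and has no `fill_value`, so any country missing from even one frame ends up as NaN. The two earlier `Total = ...` assignments are also simply overwritten by the last one. The sketch below rebuilds the same frames and compares both approaches; the names `plain_sum` and `chained` are only illustrative.

```python
import pandas as pd

gold = pd.DataFrame({'Country': ['USA', 'France', 'Russia'],
                     'Medals': [15, 13, 9]}).set_index('Country')
silver = pd.DataFrame({'Country': ['USA', 'Germany', 'Russia'],
                       'Medals': [29, 20, 16]}).set_index('Country')
bronze = pd.DataFrame({'Country': ['France', 'USA', 'UK'],
                       'Medals': [40, 28, 27]}).set_index('Country')

# Plain + aligns on the index; a country absent from any frame becomes NaN.
plain_sum = gold + silver + bronze

# Chained .add with fill_value=0 treats an absent country as 0 at each step.
chained = silver.add(gold, fill_value=0).add(bronze, fill_value=0)

print(plain_sum)
print(chained)
```

Only USA appears in all three frames, which is why it is the only row that survives the plain `+`.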
Answer 1 (score: 0)
# For a video solution of the code, copy-paste the following link on your browser:
# https://youtu.be/p0cnApQDotA
import pandas as pd
# Defining the three dataframes indicating the gold, silver, and bronze medal counts
# of different countries
gold = pd.DataFrame({'Country': ['USA', 'France', 'Russia'],
'Medals': [15, 13, 9]}
)
silver = pd.DataFrame({'Country': ['USA', 'Germany', 'Russia'],
'Medals': [29, 20, 16]}
)
bronze = pd.DataFrame({'Country': ['France', 'USA', 'UK'],
'Medals': [40, 28, 27]}
)
# Set the index of the dataframes to 'Country' so that you can get the countrywise
# medal count
gold.set_index('Country', inplace = True)
silver.set_index('Country', inplace = True)
bronze.set_index('Country', inplace = True)
# Add the three dataframes and set the fill_value argument to zero to avoid getting
# NaN values
total = gold.add(silver, fill_value = 0).add(bronze, fill_value = 0)
# Sort the resultant dataframe in a descending order
total = total.sort_values(by = 'Medals', ascending = False)
# Print the sorted dataframe
print(total)
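If there were more than three frames, the chained `.add` calls could be folded with `functools.reduce`. This is only a sketch of one way to generalize the approach above, not part of the original answer; the `frames` list and the final cast to `int` are my own additions.

```python
from functools import reduce

import pandas as pd

gold = pd.DataFrame({'Country': ['USA', 'France', 'Russia'],
                     'Medals': [15, 13, 9]}).set_index('Country')
silver = pd.DataFrame({'Country': ['USA', 'Germany', 'Russia'],
                       'Medals': [29, 20, 16]}).set_index('Country')
bronze = pd.DataFrame({'Country': ['France', 'USA', 'UK'],
                       'Medals': [40, 28, 27]}).set_index('Country')

frames = [gold, silver, bronze]  # any number of frames indexed by Country

# Fold the list pairwise, treating a missing country as 0 at every step.
total = reduce(lambda left, right: left.add(right, fill_value=0), frames)

# Sort descending and cast back to integers (safe here because no NaN remains).
total = total.sort_values(by='Medals', ascending=False).astype(int)
print(total)
```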