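I looked for a concise answer but nothing I found helped. I am trying to add a row to a DataFrame that holds a string in the first column and the sum of each of the remaining columns. I ran into a scalar error, so I tried building the row as a Series and converting it to a DataFrame, but that apparently adds four rows with one column of values instead of one row with four columns.
My code: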
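import os
import pandas as pd

# `source` and `path1` are assumed to be defined elsewhere in the script
def country_csv():
    # loop through absolute paths of each file in source
    for filename in os.listdir(source):
        filepath = os.path.join(source, filename)
        if not os.path.isfile(filepath):
            continue
        df = pd.read_csv(filepath)
        df = df.groupby(['Country']).sum()
        # the result is not assigned, so Country stays as the index
        df.reset_index()
        print(df)
        # df.to_csv(os.path.join(path1, filename))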
Sample DataFrame:
             Confirmed  Deaths  Recovered
Country
Afghanistan        299       7         10
Albania            333      20         99
I would like to see this as the first row:
World              632      27        109
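Answer 0 (score: 2)
IIUC, you can build a dictionary of the column sums, turn it back into a one-row DataFrame, and concatenate it with the original.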
data = df.sum(axis=0).to_dict()
data.update({'Country' : 'World'})
df2 = pd.concat([pd.DataFrame(data,index=[0]).set_index('Country'),df],axis=0)
print(df2)
             Confirmed  Deaths  Recovered
Country
World              632      27        109
Afghanistan        299       7         10
Albania            333      20         99
Or as a one-liner using assign and transpose (.T):
df2 = pd.concat(
    [df.sum(axis=0).to_frame().T.assign(Country="World").set_index("Country"), df],
    axis=0,
)
print(df2)
             Confirmed  Deaths  Recovered
Country
World              632      27        109
Afghanistan        299       7         10
Albania            333      20         99
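The transpose is the step that addresses the "many rows, one column" symptom from the question: `df.sum()` returns a Series, and `to_frame()` turns it into a single column whose index holds the original column names. The following is a minimal sketch of that, assuming the two sample rows from the question as input; the intermediate names (`totals`, `as_column`, `as_row`, `result`) are only illustrative.

```python
import pandas as pd

# Sample data copied from the question (assumed to be the grouped result).
df = pd.DataFrame(
    {"Confirmed": [299, 333], "Deaths": [7, 20], "Recovered": [10, 99]},
    index=pd.Index(["Afghanistan", "Albania"], name="Country"),
)

totals = df.sum()              # Series indexed by column name
as_column = totals.to_frame()  # one column, one row per original column
as_row = as_column.T           # one row, one column per original column

# Label the single row "World" and put it ahead of the country rows.
as_row.index = pd.Index(["World"], name="Country")
result = pd.concat([as_row, df])
print(result)
```

Printing `as_column.shape` and `as_row.shape` side by side is a quick way to see which orientation you are about to concatenate.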
Answer 1 (score: 2)
import pandas as pd
df
             Confirmed  Deaths  Recovered
Country
Afghanistan        299       7         10
Albania            333      20         99
# append a 'World' row holding the sum of each column
df.loc['World'] = [df['Confirmed'].sum(),df['Deaths'].sum(),df['Recovered'].sum()]
# sorting by Confirmed (descending) moves the totals row to the top
df.sort_values(by=['Confirmed'], ascending=False)
             Confirmed  Deaths  Recovered
Country
World              632      27        109
Albania            333      20         99
Afghanistan        299       7         10
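Note that the sort puts World first only because the total is the largest value in Confirmed, and it also reorders the countries (Albania now comes before Afghanistan). If the original country order should be kept, one possible variation is to append the totals row and then reindex explicitly. This is a sketch under the same sample-data assumption, not part of the answer above; `countries` is an illustrative name.

```python
import pandas as pd

# Sample data copied from the question (assumed to be the grouped result).
df = pd.DataFrame(
    {"Confirmed": [299, 333], "Deaths": [7, 20], "Recovered": [10, 99]},
    index=pd.Index(["Afghanistan", "Albania"], name="Country"),
)

countries = list(df.index)   # remember the original row order
df.loc["World"] = df.sum()   # append totals; values align by column name

# Move "World" to the top without touching the order of the other rows.
df = df.reindex(["World"] + countries)
print(df)
```

Assigning `df.sum()` directly also avoids listing every column by hand, which helps when the CSV files gain extra numeric columns.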