我想计算所有具有相同名字的人的总年龄:请参阅此处的示例表。
这是我到目前为止编写的代码..但它不完整而且它不起作用..
final_df = DataFrame()
for i in [list of names]:
dummy = sort_df.loc[sort_df['name'] == i]
total_age = 0
for j in dummy.age:
age2 = dummy.age(j)
total_age = total_age + age2
final_df.append(total_age)
final_df['total_age'] = total_age
如何解决此问题,我可以编写一个代码,该代码将遍历具有相同名称的年龄段的人并将它们相加并将它们存储在新列中?
最后它应该是这样的:
答案 0 :(得分:2)
查看您的代码我假设有一个名为input.csv
的csv文件已经被sort_df
读入,其中包含以下数据:
name,age,total age
Alfredo,13,
Alfredo,12,
Alfredo,15,
Jaap,12,
Jaap,14,
Koen,16,
Lian,76,
Lian,45,
Lian,34,
Lian,14,
在这种情况下,无需声明另一个dummy
数据帧。使用此:
from pandas import DataFrame
sort_df = DataFrame.from_csv("inCSV.txt", index_col=False)
final_df = sort_df
# Use a dictionary to keep track instead
total_age = {}
for name in sort_df["name"]:
if name not in total_age.keys():
total_age[name] = 0
# Add up the ages
for index in xrange(len(sort_df)):
person = sort_df.loc[index]
name = person["name"]
age = person["age"]
total_age[name] += age
# Set the new ages into final_df
for index in xrange(len(final_df)):
person = final_df.loc[index]
name = person["name"]
final_df.set_value(index, "total age", total_age[name])
print final_df
会给你(在final_df
中):
name age total age
0 Alfredo 13 40.0
1 Alfredo 12 40.0
2 Alfredo 15 40.0
3 Jaap 12 26.0
4 Jaap 14 26.0
5 Koen 16 16.0
6 Lian 76 169.0
7 Lian 45 169.0
8 Lian 34 169.0
9 Lian 14 169.0
答案 1 :(得分:1)
例如数据,
name,age
Alfredo,13,
Alfredo,12,
Alfredo,15,
Jaap,12,
Jaap,14,
Koen,16,
Lian,76,
Lian,45,
Lian,34,
Lian,14,
import csv
from collections import defaultdict
result = defaultdict(int)
reader = csv.DictReader(csv_file_handle)
for person in reader:
name = person['name'].lower()
age = int(person['age'])
result[name] += age
>>> result
defaultdict(int, {'alfredo': 40, 'jaap': 26, 'koen': 16, 'lian': 169})
使用result
# make a reader object
# make a writer object with fieldnames ['name', 'age', 'total_age']
# write header
for person in reader:
person.update({'total_age': result[person['name'].lower()]})
writer.writerow(person)
答案 2 :(得分:0)
你在这里找到答案:
from collections import defaultdict
list_name = [
{'age': 25, 'name': 'alfredo'},
{'age': 44, 'name': 'alfredo'},
{'age': 23, 'name': 'jaap'},
{'age': 60, 'name': 'jaap'}
]
k={}
c = defaultdict(int)
for d in list_name:
c[d['name']] += d['age']
k[d['name']] = c[d['name']]
for d in list_name:
d['total_age'] = k[d['name']]
print list_name