import pandas as pd

q = [{"name":"Mike","age":21, "text": "aaa"},{"name":"Jow","age":22, "text": "bbb"},{"name":"Piter","age":22, "text": "ccc"},{"name":"David","age":25, "text": "ddd"}]
df = pd.DataFrame(q)
result = df["name"].groupby(df['age']).agg(','.join).to_frame()
print(df)
print('---')
print(result)
Output:
$ app.py
   age   name text
0   21   Mike  aaa
1   22    Jow  bbb
2   22  Piter  ccc
3   25  David  ddd
---
          name
age
21        Mike
22   Jow,Piter
25       David
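But where is my text column? How can I add it to the result output?
Answer 0: (score: 2)
When you write df["name"], you select only the "name" column, so all other columns "disappear".
I think you are trying to do the following: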
result = df.groupby('age').agg(','.join)
print(df)
print('---')
print(result)
   age   name text
0   21   Mike  aaa
1   22    Jow  bbb
2   22  Piter  ccc
3   25  David  ddd
---
          name     text
age
21        Mike      aaa
22   Jow,Piter  bbb,ccc
25       David      ddd
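To see why the text column vanishes, it may help to compare the two objects being grouped. The following is a minimal sketch that reuses the question's data; the variable names `by_series` and `by_frame` are illustrative and do not appear in the original post.

```python
import pandas as pd

q = [
    {"name": "Mike", "age": 21, "text": "aaa"},
    {"name": "Jow", "age": 22, "text": "bbb"},
    {"name": "Piter", "age": 22, "text": "ccc"},
    {"name": "David", "age": 25, "text": "ddd"},
]
df = pd.DataFrame(q)

# df["name"] is a single Series, so only that column reaches the aggregation.
by_series = df["name"].groupby(df["age"]).agg(",".join).to_frame()

# Grouping the whole DataFrame keeps every non-key column.
by_frame = df.groupby("age").agg(",".join)

print(list(by_series.columns))  # ['name']
print(list(by_frame.columns))   # ['name', 'text']
```

Both results are indexed by age; the only difference is which columns were available when the groups were joined.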
Answer 1: (score: 1)
Your code selects only the name column, so the text column is omitted:
# group column name by column age
result = df["name"].groupby(df['age']).agg(','.join).to_frame()
# more common alternative: group by column age and select the columns in [] after groupby()
result = df.groupby('age')['name'].agg(','.join).to_frame()
print(result)
          name
age
21        Mike
22   Jow,Piter
25       David
result = df.groupby('age')['text'].agg(','.join).to_frame()
print(result)
        text
age
21       aaa
22   bbb,ccc
25       ddd
# if multiple columns are needed, pass a list of column names
result = df.groupby('age')[['name', 'text']].agg(','.join)
print(result)
          name     text
age
21        Mike      aaa
22   Jow,Piter  bbb,ccc
25       David      ddd
# if [] is omitted, all remaining columns are aggregated (both are strings here; ','.join cannot combine numeric columns)
result = df.groupby('age').agg(','.join)
print(result)
          name     text
age
21        Mike      aaa
22   Jow,Piter  bbb,ccc
25       David      ddd
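If each column should get its own aggregation, or age should come back as an ordinary column rather than the index, a dictionary can be passed to agg. This is a minimal sketch using the question's data; the variable name `per_column` is illustrative and is not part of the original answers.

```python
import pandas as pd

q = [
    {"name": "Mike", "age": 21, "text": "aaa"},
    {"name": "Jow", "age": 22, "text": "bbb"},
    {"name": "Piter", "age": 22, "text": "ccc"},
    {"name": "David", "age": 25, "text": "ddd"},
]
df = pd.DataFrame(q)

# Map each column to the function that combines its values within a group.
per_column = (
    df.groupby("age")
      .agg({"name": ",".join, "text": ",".join})
      .reset_index()  # turn the 'age' index back into an ordinary column
)
print(per_column)
```

The dictionary form makes it easy to swap in a different function for one column later without touching the others.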