I have to use the concatenate function on a large number of columns. Say the call looks like this:
pd.concat([mdf1[['user','tag1','tag2','tag3','tag4']].groupby(['user']).agg(sum)
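Here I have a large number of tags, so I want the call to take every column from, say, 'tag1' onward instead of listing them one by one. How can I do that?
MDF1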
         user        page_name            category  tag1  tag2  tag3
0  random guy        BlackBuck   Transport/Freight     1     1     0
1   mank nion        DJ CHETAS  Arts/Entertainment     0     1     1
2  random guy      GiveMeSport               Sport     1     0     1
3   mank nion  Gurkeerat Singh      Actor/Director     1     0     1
MDF2
          user         page_name            category  tag1  tag2  tag3
0   pop rajuel      WOW Editions        Concert Tour   NaN   NaN   NaN
1  Roshan ghai            MensXP  News/Media Website   NaN   NaN   NaN
2    mank nion     Celina Jaitly             Actress   NaN   NaN   NaN
3   pop rajuel      500 Startups            App Page   1.0   0.0   1.0
4  Roshan ghai          No Abuse           Community   NaN   NaN   NaN
5   random guy   Analytics Ninja   Insurance Company   NaN   NaN   NaN
6   pop rajuel  Biswapati Sarkar      Actor/Director   1.0   0.0   0.0
7  Roshan ghai      the smartian       Public Figure   0.0   1.0   1.0
Output
          user  tag1  tag2  tag3
0    mank nion   1.0   1.0   2.0
1   random guy   2.0   1.0   1.0
2  Roshan ghai   0.0   1.0   1.0
3    mank nion   NaN   NaN   NaN
4   pop rajuel   2.0   0.0   1.0
5   random guy   NaN   NaN   NaN
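The only difference in my real case is that I have a large number of columns, i.e. 'tag4', 'tag5' and so on. So I want my code to take all the columns after 'tag1'. Basically, I am concatenating the two mdf frames after grouping each of them by user and summing.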
Answer 0 (score: 0)
df = pd.concat([mdf1,mdf2])
print (df)
          user         page_name            category  tag1  tag2  tag3
0   random guy         BlackBuck   Transport/Freight   1.0   1.0   0.0
1    mank nion         DJ CHETAS  Arts/Entertainment   0.0   1.0   1.0
2   random guy       GiveMeSport               Sport   1.0   0.0   1.0
3    mank nion   Gurkeerat Singh      Actor/Director   1.0   0.0   1.0
0   pop rajuel      WOW Editions        Concert Tour   NaN   NaN   NaN
1  Roshan ghai            MensXP  News/Media Website   NaN   NaN   NaN
2    mank nion     Celina Jaitly             Actress   NaN   NaN   NaN
3   pop rajuel      500 Startups            App Page   1.0   0.0   1.0
4  Roshan ghai          No Abuse           Community   NaN   NaN   NaN
5   random guy   Analytics Ninja   Insurance Company   NaN   NaN   NaN
6   pop rajuel  Biswapati Sarkar      Actor/Director   1.0   0.0   0.0
7  Roshan ghai      the smartian       Public Figure   0.0   1.0   1.0
print (df.groupby('user', as_index=False).sum())
          user  tag1  tag2  tag3
0  Roshan ghai   0.0   1.0   1.0
1    mank nion   1.0   1.0   2.0
2   pop rajuel   2.0   0.0   1.0
3   random guy   2.0   1.0   1.0
The page_name and category columns are omitted because of the automatic exclusion of nuisance columns.
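If you would rather not depend on nuisance columns being dropped (as far as I know, recent pandas releases no longer do this silently), one option is to pick the tag columns by position. The following is only a sketch: it rebuilds mdf1 and mdf2 from the sample data in the question and assumes the tag columns are the last columns of the frame, starting at 'tag1'. The names tag_cols and result are just illustrative.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Sample frames rebuilt from the question's data.
mdf1 = pd.DataFrame({
    "user": ["random guy", "mank nion", "random guy", "mank nion"],
    "page_name": ["BlackBuck", "DJ CHETAS", "GiveMeSport", "Gurkeerat Singh"],
    "category": ["Transport/Freight", "Arts/Entertainment", "Sport",
                 "Actor/Director"],
    "tag1": [1, 0, 1, 1],
    "tag2": [1, 1, 0, 0],
    "tag3": [0, 1, 1, 1],
})
mdf2 = pd.DataFrame({
    "user": ["pop rajuel", "Roshan ghai", "mank nion", "pop rajuel",
             "Roshan ghai", "random guy", "pop rajuel", "Roshan ghai"],
    "page_name": ["WOW Editions", "MensXP", "Celina Jaitly", "500 Startups",
                  "No Abuse", "Analytics Ninja", "Biswapati Sarkar",
                  "the smartian"],
    "category": ["Concert Tour", "News/Media Website", "Actress", "App Page",
                 "Community", "Insurance Company", "Actor/Director",
                 "Public Figure"],
    "tag1": [nan, nan, nan, 1, nan, nan, 1, 0],
    "tag2": [nan, nan, nan, 0, nan, nan, 0, 1],
    "tag3": [nan, nan, nan, 1, nan, nan, 0, 1],
})

df = pd.concat([mdf1, mdf2], ignore_index=True)

# Every column from 'tag1' to the end, however many tag columns exist.
# Assumption: 'tag1' is a unique column name and all tags come after it.
tag_cols = list(df.columns[df.columns.get_loc("tag1"):])

# Sum only the tag columns per user; NaN values are skipped by sum().
result = df.groupby("user", as_index=False)[tag_cols].sum()
print(result)
```

With this approach, adding 'tag4', 'tag5' and so on to both frames needs no change to the code, since tag_cols is derived from the column order rather than typed out.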