如何使用join / merge / concat / append / add将这两个表粘贴在一起,使得0-14岁的人口和15-64列是并排的?
我不需要这两个DataFrame的笛卡尔产品。
我试过了:
population ages = t3.merge(t4, on='Country Name', how='inner')
T3
Country Name Year Population Age 0 - 14
0 Aruba 1960 43.847771
1 Andorra 1960 NaN
2 Afghanistan 1960 43.712284
3 Angola 1960 43.759289
4 Albania 1960 41.757282
t4
Country Name Population Age 15 - 64
0 Aruba 53.667355
1 Andorra NaN
2 Afghanistan 53.834637
3 Angola 53.587101
4 Albania 52.941044
理想地
Country Name Population Age 15 - 64 Population Ages 0 - 14
0 Aruba 53.667355 43.847771
1 Andorra NaN NaN
2 Afghanistan 53.834637 43.712284
3 Angola 53.587101 43.759289
4 Albania 52.941044 41.757282
测试结果:
population_ages = t3.merge(t4, on='Country Name', how='inner')
我收到的数据框是t3,t4的笛卡尔积,形状为(734832,4),而不是(13608,4)
Country Name Year Population Age 0 - 14 Population Age 15 - 64
0 Aruba 1960 43.847771 53.667355
1 Aruba 1960 43.847771 53.890141
2 Aruba 1960 43.847771 54.216911
3 Aruba 1960 43.847771 54.637810
4 Aruba 1960 43.847771 55.119324
5 Aruba 1960 43.847771 55.631104
6 Aruba 1960 43.847771 56.168560
7 Aruba 1960 43.847771 56.736549
8 Aruba 1960 43.847771 57.341782
9 Aruba 1960 43.847771 57.983109
10 Aruba 1960 43.847771 58.674343
11 Aruba 1960 43.847771 59.404758
12 Aruba 1960 43.847771 60.164749
答案 0 :(得分:1)
怎么样
t4['Population Age 0 - 14'] = t3['Population Age 0 - 14']
或
pd.concat( t4, t3['Population Age 0 - 14'], axis=1)
完整的工作示例:
import pandas as pd
from StringIO import StringIO
d1 = '''Country Name Year Population Age 0 - 14
Aruba 1960 43.847771
Andorra 1960 NaN
Afghanistan 1960 43.712284
Angola 1960 43.759289
Albania 1960 41.757282'''
d2 = '''Country Name Population Age 15 - 64
Aruba 53.667355
Andorra NaN
Afghanistan 53.834637
Angola 53.587101
Albania 52.941044'''
t3 = pd.DataFrame.from_csv( StringIO(d1), sep='\s{2,}', index_col=None )
print '\nt3:\n',t3
t4 = pd.DataFrame.from_csv( StringIO(d2), sep='\s{2,}', index_col=None )
print '\nt4:\n',t3
print '\n--- merge ---\n'
print pd.merge( t4, t3, on='Country Name')
print pd.merge( t4, t3[ ['Country Name', 'Population Age 0 - 14'] ], on='Country Name')
print '\n--- concat ---\n'
print pd.concat( (t4, t3['Population Age 0 - 14']), axis=1)
print '\n--- [xxx] = [xxx] ---\n'
t4['Population Age 0 - 14'] = t3['Population Age 0 - 14']
print t4
结果:
Country Name Population Age 15 - 64 Population Age 0 - 14
0 Aruba 53.667355 43.847771
1 Andorra NaN NaN
2 Afghanistan 53.834637 43.712284
3 Angola 53.587101 43.759289
4 Albania 52.941044 41.757282