我是Python的新手,我正在尝试将两个数据集合并以进行研究:
我想将 grouptcode 从df1合并到df2,但是 companyname 稍有不同,但是在两个数据帧之间却非常相似。
对于每个代码,都有一个关联的 Dscode 。但是,多个公司具有相同的
问题
我只对合并与 companyname ( ticker 和 Dscode )相关的 ticker 和 Dscode 的 grouptcode 感兴趣(有时会略有不同-这是我无法逾越的部分)。我一直在使用的代码如下。
代码
import pandas as pd
import os
# set working directory
path = "/Users/name/Desktop/Python"
os.chdir(path)
os.getcwd() # Prints the working directory
# read in excel file
file = "/Users/name/Desktop/Python/Excel/DSROE.xlsx"
x1 = pd.ExcelFile(file)
print(x1.sheet_names)
df1 = x1.parse('Sheet1')
df1.head()
df1.tail()
file2 = "/Users/name/Desktop/Python/Excel/tcode2.xlsx"
x2 = pd.ExcelFile(file2)
print(x2.sheet_names)
df2 = x2.parse('Sheet1')
df2['companyname'] = df2['companyname'].str.upper() ## make column uppercase
df2.head()
df2.tail()
df2 = df2.dropna()
x3 = pd.merge(df1, df2,how = 'outer') # merge
数据
df1
Dscode ticker companyname
65286 8933TC 3pl 3P LEARNING LIMITED
79291 9401FP a2m A2 MILK COMPANY LIMITED
1925 14424Q aac AUSTRALIAN AGRICULTURAL COMPANY LIMITED
39902 675493 aad ARDENT LEISURE GROUP
1400 133915 aba AUSWIDE BANK LIMITED
74565 922472 abc ADELAIDE BRIGHTON LIMITED
7350 26502C abp ABACUS PROPERTY GROUP
39202 675142 ada ADACEL TECHNOLOGIES LIMITED
80866 9661AD adh ADAIRS
80341 9522QV afg AUSTRALIAN FINANCE GROUP LIMITED
45327 691938 agg ANGLOGOLD ASHANTI LIMITED
2625 14880E agi AINSWORTH GAME TECHNOLOGY LIMITED
75090 923040 agl AGL ENERGY LIMITED
19251 29897X ago ATLAS IRON LIMITED
64409 890588 agy ARGOSY MINERALS LIMITED
24151 31511D ahg AUTOMOTIVE HOLDINGS GROUP LIMITED
64934 8917JD ahy ASALEO CARE LIMITED
42877 691152 aia AUCKLAND INTERNATIONAL AIRPORT LIMITED
61433 88013C ajd ASIA PACIFIC DATA CENTRE GROUP
44452 691704 ajl AJ LUCAS GROUP LIMITED
700 13288C ajm ALTURA MINING LIMITED
19601 29929D akp AUDIO PIXELS HOLDINGS LIMITED
79816 951404 alk ALKANE RESOURCES LIMITED
56008 865613 all ARISTOCRAT LEISURE LIMITED
51807 771351 alq ALS LIMITED
44277 691685 alu ALTIUM LIMITED
42702 68625C alx ATLAS ARTERIA GROUP
30101 41162F ama AMA GROUP LIMITED
67386 902201 amc AMCOR LIMITED
33426 50431L ami AURELIA METALS LIMITED
df2
companyname grouptcode ticker
524 3P LEARNING LIMITED.. tpn1 3pl
1 THE A2 MILK COMPANY LIMITED a2m1 a2m
2 AUSTRALIAN AGRICULTURAL COMPANY LIMITED. aac2 aac
3 AAPC LIMITED. aad1 aad
6 ADVANCE BANK AUSTRALIA LIMITED aba1 aba
7 ADELAIDE BRIGHTON CEMENT HOLDINGS LIMITED abc1 abc
8 ABACUS PROPERTY GROUP abp1 abp
9 ADACEL TECHNOLOGIES LIMITED ada1 ada
288 ADA CORPORATION LIMITED khs1 ada
10 AERODATA HOLDINGS LIMITED adh1 adh
11 ADAMS (HERBERT) HOLDINGS LIMITED adh2 adh
12 ADAIRS LIMITED adh3 adh
431 ALLCO FINANCE GROUP LIMITED rcd1 afg
13 AUSTRALIAN FINANCE GROUP LTD afg1 afg
14 ANGLOGOLD ASHANTI LIMITED agg1 agg
15 APGAR INDUSTRIES LIMITED agi1 agi
16 AINSWORTH GAME TECHNOLOGY LIMITED agi2 agi
17 AUSTRALIAN GAS LIGHT COMPANY (THE) agl1 agl
18 ATLAS IRON LIMITED ago1 ago
393 ACM GOLD LIMITED pgo2 ago
19 AUSTRALIAN GYPSUM INDUSTRIES LIMITED agy1 agy
142 ARGOSY MINERALS INC cio1 agy
21 ARCHAEAN GOLD NL ahg1 ahg
22 AUSTRALIAN HYDROCARBONS N.L. ahy1 ahy
23 ASALEO CARE LIMITED ahy2 ahy
24 AUCKLAND INTERNATIONAL AIRPORT LIMITED aia1 aia
25 ASIA PACIFIC DATA CENTRE GROUP ajd1 ajd
26 AJ LUCAS GROUP LIMITED ajl1 ajl
27 AJAX MCPHERSON'S LIMITED ajm1 ajm
29 ALKANE EXPLORATION (TERRIGAL) N.L. alk1 alk
Dscode
524 8933TC
1 9401FP
2 14424Q
3 675493
6 133915
7 922472
8 26502C
9 675142
288 675142
10 9661AD
11 9661AD
12 9661AD
431 9522QV
13 9522QV
14 691938
15 14880E
16 14880E
17 923040
18 29897X
393 29897X
19 890588
142 890588
21 31511D
22 8917JD
23 8917JD
24 691152
25 88013C
26 691704
27 13288C
29 951404