熊猫在ID上合并

时间:2019-05-28 16:07:59

标签: python pandas merge

尽管尝试了不同的合并方式,但我还是无法如我所愿。

我想在 piat_retest_id_2 上合并df的 piat_comp piat2 ,但是下面的代码(也尝试作为内部联接使用)不行达到预期的水平。

    piat_comp2 = piat_comp.merge(piat2, on='piat_retest_id_2', how='outer')

例如,我期望piat_comp的索引8与piat 2的索引9成对,因为它们在'piat_retest_id_2'中共享相同的ID。相反,他们只是不排队。

在结果数据框中,这些行应合并为:

piat_retest_id_1 ----- PIAT.ability ----- piat_retest_id_2 ----- PIAT.ability_2

UK_543970282 ----- 0.270247832130706 ----- UK_358142873 ----- 0.697223344529081

请注意piat_retest_id_1和piat_retest_id_2不应相同。 piat_retest_id_2都在df中,我想对此进行合并,但最终只包含其中一列。

可复制:

piat_comp = {'piat_retest_id_1': {0: 'UK_982598362',
  1: 'UK_554290758',
  2: 'UK_416473256',
  3: 'UK_717765782',
  4: 'UK_340767839',
  5: 'UK_393447950',
  6: 'UK_16532112',
  7: 'UK_887499021',
  8: 'UK_543970282',
  9: 'UK_86558749',
  10: 'UK_763524616',
  11: 'UK_59234772',
  12: 'UK_891870396',
  13: 'UK_416946337',
  14: 'UK_208235437',
  15: 'UK_697728998',
  16: 'UK_283032428',
  17: 'UK_927006735',
  18: 'UK_838548458',
  19: 'UK_935961427',
  20: 'UK_534651143',
  21: 'UK_469442638',
  22: 'UK_357516555',
  23: 'UK_869226039',
  24: 'UK_675587892',
  25: 'UK_331843371',
  26: 'UK_570264371',
  27: 'UK_818769311',
  28: 'UK_708776594',
  29: 'UK_811947376',
  30: 'AUS_182800035',
  31: 'AUS_436205218',
  32: 'AUS_519722760',
  33: 'AUS_948860279',
  34: 'AUS_273917058',
  35: 'AUS_174076947',
  36: 'AUS_462744594',
  37: 'AUS_718809552',
  38: 'AUS_142275638',
  39: 'AUS_646335936',
  40: 'AUS_775355864',
  41: 'AUS_523210619',
  42: 'AUS_242767995',
  43: 'AUS_485797627',
  44: 'AUS_192289763',
  45: 'AUS_971858664',
  46: 'AUS_272182103',
  47: 'AUS_6470694',
  48: 'AUS_960651008',
  49: 'AUS_36924588',
  50: 'AUS_921547727',
  51: 'AUS_469946434',
  52: 'AUS_591613124',
  53: 'AUS_575220499',
  54: 'AUS_767535190',
  55: 'AUS_6760645',
  56: 'AUS_33323414',
  57: 'AUS_453834043',
  58: 'AUS_576415932',
  59: 'AUS_359646415',
  60: 'AUS_647631159',
  61: 'AUS_535492855',
  62: 'AUS_488064434',
  63: 'AUS_230122761',
  64: 'AUS_996001169',
  65: 'AUS_830133002',
  66: 'AUS_82508132',
  67: 'AUS_922040010',
  68: 'AUS_754682968',
  69: 'AUS_79125001',
  70: 'AUS_581895146',
  71: 'AUS_100799448'},
 'PIAT.ability': {0: 3.8309313391931497,
  1: 1.99292247701631,
  2: 3.94306748097956,
  3: -0.19562132382300002,
  4: 0.6505183735635122,
  5: 0.731816086976127,
  6: 0.8137871489410059,
  7: 3.22777961014248,
  8: 0.270247832130706,
  9: 1.3424922846183798,
  10: -0.301396636649614,
  11: -1.9332907828737598,
  12: 0.917831930881864,
  13: 0.12488682346195801,
  14: 0.357263190639869,
  15: -0.771073750440309,
  16: 0.947225971482024,
  17: 1.09754260961856,
  18: 3.89556727394303,
  19: 0.000776309348843904,
  20: 0.633059075095001,
  21: -0.7613018941029259,
  22: 0.35232544411647604,
  23: 1.12508794021388,
  24: -0.689808173864224,
  25: 1.5182029956795,
  26: 0.8041551225465369,
  27: -0.49821628662950396,
  28: 1.0916595064983101,
  29: 3.6305272212891,
  30: -0.941759632235263,
  31: 0.8281845827398409,
  32: 0.38229348622269504,
  33: 1.53338336352128,
  34: 0.41351872726046995,
  35: 1.5705599264116397,
  36: 1.97524137560666,
  37: -0.00901326645847487,
  38: -0.6601342995997729,
  39: 0.764888906047256,
  40: 1.5069746380603897,
  41: 2.7867287535613703,
  42: -0.0572566797274714,
  43: 1.18340947496768,
  44: 0.21109444697996899,
  45: -0.11442018948849901,
  46: 0.260924655953102,
  47: 0.20230025184467001,
  48: -0.3911319281544,
  49: 1.66298720615184,
  50: 2.0562478076601596,
  51: -0.802324601478322,
  52: 0.0949041456718715,
  53: 0.39139573420752705,
  54: 0.0536255839392736,
  55: 1.32571209008263,
  56: 0.532393162267327,
  57: 0.269601175227049,
  58: 0.7076081560697621,
  59: 0.46112169587998597,
  60: -0.403030461213409,
  61: 0.43675734847674497,
  62: -0.879126649374346,
  63: -0.421994389129437,
  64: 1.77097042768753,
  65: -0.9110846702038922,
  66: -1.07482469386456,
  67: 3.7599324868259596,
  68: 0.359258100506062,
  69: 3.63222036081698,
  70: 0.389727330252132,
  71: 0.656178769811206},
 'piat_retest_id_2': {0: 'UK_873131581',
  1: 'UK_536050247',
  2: 'UK_288912',
  3: 'UK_601549101',
  4: 'UK_654724631',
  5: 'UK_927459150',
  6: 'UK_161992453',
  7: 'UK_898365573',
  8: 'UK_358142873',
  9: 'UK_749339291',
  10: 'UK_137508979',
  11: 'UK_802078996',
  12: 'UK_418768387',
  13: 'UK_538152168',
  14: 'UK_825889086.',
  15: 'UK_56800313',
  16: 'UK_90101007',
  17: 'UK_77933368',
  18: 'UK_288469917',
  19: 'UK_908770966',
  20: 'UK_661264755',
  21: 'UK_154114403',
  22: 'UK_826457452',
  23: 'UK_626934971',
  24: 'UK_852937537',
  25: 'UK_32221105',
  26: 'UK_516899485',
  27: nan,
  28: 'UK_213817864',
  29: 'UK_112922227',
  30: 'AUS_1828000352',
  31: 'AUS_4362052182',
  32: 'AUS_5197227602',
  33: 'AUS_9488602792',
  34: 'AUS_2739170582',
  35: 'AUS_1740769472',
  36: 'AUS_4627445942',
  37: 'AUS_7188095522',
  38: 'AUS_1422756382',
  39: 'AUS_6463359362',
  40: 'AUS_7753558642',
  41: nan,
  42: 'AUS_4857976272',
  43: 'AUS_1922897632',
  44: 'AUS_9718586642',
  45: 'AUS_2721821032',
  46: 'AUS_64706942',
  47: 'AUS_9606510082',
  48: nan,
  49: 'AUS_9215477272',
  50: 'AUS_5916131242',
  51: 'AUS_5752204992',
  52: 'AUS_7675351902',
  53: 'AUS_67606452',
  54: 'AUS_333234142',
  55: 'AUS_4538340432',
  56: 'AUS_5764159322',
  57: 'AUS_3596464152',
  58: 'AUS_6476311592',
  59: 'AUS_5354928552',
  60: 'AUS_4880644342',
  61: 'AUS_2301227612',
  62: 'AUS_9960011692',
  63: 'AUS_8301330022',
  64: 'AUS_825081322',
  65: 'AUS_9220400102',
  66: 'AUS_7546829682',
  67: 'AUS_791250012',
  68: 'AUS_5818951462',
  69: 'AUS_1007994482',
  70: nan,
  71: nan}}

  piat_comp = pd.DataFrame().from_dict(piat_comp)

    piat2 = {'piat_retest_id_2': {0: 'UK_873131581',
  1: 'UK_536050247',
  2: 'UK_288912',
  3: 'UK_601549101',
  4: 'UK_654724631',
  5: 'UK_927459150',
  6: 'UK_605723093',
  7: 'UK_161992453',
  8: 'UK_898365573',
  9: 'UK_358142873',
  10: 'UK_749339291',
  11: 'UK_137508979',
  12: 'UK_802078996',
  13: 'UK_418768387',
  14: 'UK_538152168',
  15: 'UK_56800313',
  16: 'UK_90101007',
  17: 'UK_77933368',
  18: 'UK_288469917',
  19: 'UK_908770966',
  20: 'UK_661264755',
  21: 'UK_154114403',
  22: 'UK_826457452',
  23: 'UK_852937537',
  24: 'UK_32221105',
  25: 'UK_516899485',
  26: 'UK_213817864',
  27: 'UK_112922227',
  28: 'AUS_1828000352',
  29: 'AUS_4362052182',
  30: 'AUS_5197227602',
  31: 'AUS_5782596212',
  32: 'AUS_9488602792',
  33: 'AUS_2739170582',
  34: 'AUS_1740769472',
  35: 'AUS_4627445942',
  36: 'AUS_7188095522',
  37: 'AUS_1422756382',
  38: 'AUS_6463359362',
  39: 'AUS_7753558642',
  40: 'AUS_4857976272',
  41: 'AUS_1922897632',
  42: 'AUS_9718586642',
  43: 'AUS_2721821032',
  44: 'AUS_64706942',
  45: 'AUS_9606510082',
  46: 'AUS_9215477272',
  47: 'AUS_4699464342',
  48: 'AUS_5916131242',
  49: 'AUS_5752204992',
  50: 'AUS_7675351902',
  51: 'AUS_67606452'},
 'PIAT.ability': {0: 4.0,
  1: 3.03585323865227,
  2: 3.0659628668283796,
  3: -0.984877273151229,
  4: 3.9214698118886804,
  5: 2.51547803673487,
  6: 1.1199284033743502,
  7: 0.776919848266456,
  8: 1.27559145269151,
  9: 0.697223344529081,
  10: 1.20628362697982,
  11: 1.3367662330669798,
  12: -0.862955270810047,
  13: 3.8621560811803697,
  14: 0.0635995324043882,
  15: -0.5886718197915529,
  16: -1.24297429168435,
  17: 3.5438183255585898,
  18: 4.0,
  19: 1.11913257500304,
  20: 0.7775141672201971,
  21: 1.1006024338623002,
  22: -0.24300164716209502,
  23: -0.494374704856605,
  24: 2.85613759713724,
  25: 1.3029798990839299,
  26: 0.0710404544671652,
  27: 4.0,
  28: 0.902265996923405,
  29: 1.4454497071397798,
  30: -1.2020915721021102,
  31: -1.8592336815873403,
  32: 1.61547349789574,
  33: 0.26694017591396,
  34: 2.44205760912957,
  35: 0.6621194092522099,
  36: -0.9020563539838921,
  37: 2.15717946932124,
  38: 1.16781892225373,
  39: 1.20554720958442,
  40: 1.7657674979533402,
  41: 0.40870760302225895,
  42: 0.0479209217874616,
  43: 1.6004708092039999,
  44: 0.19277544914143,
  45: 2.06504675907639,
  46: 2.95312483834473,
  47: -0.870384004997555,
  48: 0.7775511527014491,
  49: 1.3847884057742,
  50: 0.12852600156989802,
  51: 1.15655034425564}}

piat2 = pd.DataFrame().from_dict(piat2)

1 个答案:

答案 0 :(得分:1)

使用left_on和right_on进行合并,因为每个数据帧中都有不同的键。

piat_comp2 = piat_comp.merge(
    piat2, 
    left_on='piat_retest_id_1',
    right_on='piat_retest_id_2', 
    how='outer'
)