Question

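I'm trying to join df61 and df_petsy_gz on pic_code. I've also included the data types of the variables. My code outputs a bunch of NaN values, which suggests that no pic_code values match between the two datasets. There are several million rows, so I'm sure there are plenty of matches. I think I'm doing something wrong.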

monthly_fiscal_year  month                        pic_code class_of_mail  \
             2017.0   11.0  420606019300189843900566128707            FC   
             2017.0   11.0  420731629300189843900584700299            FC   
             2017.0   11.0  420405029300189843900568579224            FC   
             2017.0   11.0  420301349300189843900567382542            FC   

   weight  calc_postage  calc_total_postage  MikeZone mpe_wgt  
   0.8750          4.02                4.02       5.0     NaN  
   0.3750          2.77                2.77       6.0     NaN  
   0.6875          3.60                3.60       8.0     NaN  
   0.5000          2.77                2.77       4.0     NaN

Output

{{1}}

Answer 1

I don't know what your data looks like, but I do know that the join type affects which rows end up in the result.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.join.html

how : {‘left’, ‘right’, ‘outer’, ‘inner’}, default: ‘left’

How to handle the operation of the two objects.

left: use calling frame’s index (or column if on is specified)
right: use other frame’s index
outer: form union of calling frame’s index (or column if on is specified) with other frame’s index, and sort it lexicographically
inner: form intersection of calling frame’s index (or column if on is specified) with other frame’s index, preserving the order of the calling’s one

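Try an 'inner' join and see whether it fits your requirement. It returns only the rows whose pic_code is found in both data frames, together with their mpe_wgt values.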
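As a rough illustration of the difference, here is a minimal sketch using `DataFrame.merge`. The two small frames are hypothetical stand-ins for df61 and df_petsy_gz (the real column sets and values are not known here). The `indicator=True` option is used on a left join to count how many keys actually matched, which can help confirm whether the NaN values come from missing matches.

```python
import pandas as pd

# Hypothetical stand-ins for df61 and df_petsy_gz; the real frames have more columns.
df61 = pd.DataFrame({
    "pic_code": ["A1", "B2", "C3"],
    "weight": [0.875, 0.375, 0.6875],
})
df_petsy_gz = pd.DataFrame({
    "pic_code": ["B2", "C3", "D4"],
    "mpe_wgt": [0.4, 0.7, 1.2],
})

# Inner join: keep only pic_code values present in both frames.
inner = df61.merge(df_petsy_gz, on="pic_code", how="inner")

# Left join with indicator=True adds a _merge column that records
# whether each row matched ("both") or not ("left_only").
left = df61.merge(df_petsy_gz, on="pic_code", how="left", indicator=True)
match_counts = left["_merge"].value_counts()

print(inner)
print(match_counts)
```

If `match_counts` shows only `left_only` rows on the real data, the keys themselves differ between the two frames, and changing the join type alone will not help.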

Also, make sure pic_code has no leading or trailing whitespace, so that identical pic_code values in the two data frames actually match.
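One way to check this is sketched below, assuming pic_code is stored as text in both frames. The frames, the shortened key values, and the `clean_key` helper are all hypothetical and only serve to show the effect of stripping whitespace before merging.

```python
import pandas as pd

# Hypothetical frames with stray whitespace in the key column.
left = pd.DataFrame({
    "pic_code": [" 4206060193 ", "4207316293"],
    "weight": [0.875, 0.375],
})
right = pd.DataFrame({
    "pic_code": ["4206060193", "4207316293\n"],
    "mpe_wgt": [0.9, 0.4],
})

def clean_key(s: pd.Series) -> pd.Series:
    """Cast the key to str and strip leading/trailing whitespace (hypothetical helper)."""
    return s.astype(str).str.strip()

# Without cleaning, none of the keys compare equal.
before = left.merge(right, on="pic_code", how="inner")

left["pic_code"] = clean_key(left["pic_code"])
right["pic_code"] = clean_key(right["pic_code"])

# After cleaning, both keys match.
after = left.merge(right, on="pic_code", how="inner")

print(len(before), len(after))
```

Comparing `df61["pic_code"].dtype` with `df_petsy_gz["pic_code"].dtype` is also worth doing, since a key stored as text in one frame and as a number in the other will not match either.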
