Question

I have two pandas DataFrames: seren1 and bbox. I want to perform an inner join of them on the column named filepath.

seren1[["filepath", "label"]].join(bbox[["filepath", "label"]], on="filepath", how="inner", lsuffix='_caller', rsuffix='_other')

This raises an error:

ValueError                                Traceback (most recent call last)
<ipython-input-74-c001a7adc7cd> in <module>
----> 1 seren1[["filepath", "label"]].join(bbox[["filepath", "label"]], on="filepath", how="inner", lsuffix='_caller', rsuffix='_other')

/projects/community/py-data-science-stack/5.1.0/kp807/envs/fastai/lib/python3.7/site-packages/pandas/core/frame.py in join(self, other, on, how, lsuffix, rsuffix, sort)
   6822         # For SparseDataFrame's benefit
   6823         return self._join_compat(other, on=on, how=how, lsuffix=lsuffix,
-> 6824                                  rsuffix=rsuffix, sort=sort)
   6825 
   6826     def _join_compat(self, other, on=None, how='left', lsuffix='', rsuffix='',

/projects/community/py-data-science-stack/5.1.0/kp807/envs/fastai/lib/python3.7/site-packages/pandas/core/frame.py in _join_compat(self, other, on, how, lsuffix, rsuffix, sort)
   6837             return merge(self, other, left_on=on, how=how,
   6838                          left_index=on is None, right_index=True,
-> 6839                          suffixes=(lsuffix, rsuffix), sort=sort)
   6840         else:
   6841             if on is not None:

/projects/community/py-data-science-stack/5.1.0/kp807/envs/fastai/lib/python3.7/site-packages/pandas/core/reshape/merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
     45                          right_index=right_index, sort=sort, suffixes=suffixes,
     46                          copy=copy, indicator=indicator,
---> 47                          validate=validate)
     48     return op.get_result()
     49 

/projects/community/py-data-science-stack/5.1.0/kp807/envs/fastai/lib/python3.7/site-packages/pandas/core/reshape/merge.py in __init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, copy, indicator, validate)
    531         # validate the merge keys dtypes. We may need to coerce
    532         # to avoid incompat dtypes
--> 533         self._maybe_coerce_merge_keys()
    534 
    535         # If argument passed to validate,

/projects/community/py-data-science-stack/5.1.0/kp807/envs/fastai/lib/python3.7/site-packages/pandas/core/reshape/merge.py in _maybe_coerce_merge_keys(self)
    978                       (inferred_right in string_types and
    979                        inferred_left not in string_types)):
--> 980                     raise ValueError(msg)
    981 
    982             # datetimelikes must match exactly

ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

But if I intersect the two filepath columns and wrap the result in a Series:

import numpy as np
pd.Series(np.intersect1d(seren1["filepath"].values,bbox["filepath"].values))

it works fine:

0        S1/B04/B04_R1/S1_B04_R1_PICT0006
1        S1/B04/B04_R1/S1_B04_R1_PICT0007
2        S1/B04/B04_R1/S1_B04_R1_PICT0008
3        S1/B04/B04_R1/S1_B04_R1_PICT0013
4        S1/B04/B04_R1/S1_B04_R1_PICT0039
5        S1/B04/B04_R1/S1_B04_R1_PICT0040
6        S1/B04/B04_R1/S1_B04_R1_PICT0041
7        S1/B05/B05_R1/S1_B05_R1_PICT0056
......

Type checks:

seren1.dtypes

filepath     object
timestamp    object
label        object
dtype: object

bbox.dtypes

filepath    object
label       object
X            int64
Y            int64
W            int64
H            int64
dtype: object

all (seren1.filepath.apply(lambda x: isinstance(x, str)) )
True

all (bbox.filepath.apply(lambda x: isinstance(x, str)) )
True

What is going wrong?

Answer 1

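I was able to resolve this error as follows:

Suppose you are trying to join df2 onto df1. When `on` is given, join matches that column of the calling DataFrame against the index of the other DataFrame, not against a column of the same name. So the column must be named 'Column' in both DataFrames, and the DataFrame being joined must have 'Column' set as its index with set_index. To join df2 onto df1 on the 'Column' column:

df1.join(df2.set_index('Column'), on='Column')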
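Applied to the question, this can be sketched as below. The two small frames are hypothetical stand-ins for seren1 and bbox (the file paths and labels are made up for illustration). The sketch assumes the error comes from bbox having a default integer index, which `join(on="filepath")` then compares with the string filepath column. `merge` is shown as an alternative because it matches column to column and needs no `set_index`.

```python
import pandas as pd

# Hypothetical stand-ins for the question's seren1 and bbox frames.
seren1 = pd.DataFrame({
    "filepath": ["a/img1", "a/img2", "a/img3"],
    "label": ["cat", "dog", "bird"],
})
bbox = pd.DataFrame({
    "filepath": ["a/img2", "a/img3", "a/img4"],
    "label": ["dog", "bird", "fox"],
})

# join(on="filepath") compares seren1["filepath"] with the *index* of the
# other frame. bbox has a default integer index, so without set_index the
# keys would be object (strings) on one side and int64 on the other.

# Option 1: make filepath the index of the right-hand frame first.
joined = seren1.join(
    bbox.set_index("filepath"),
    on="filepath",
    how="inner",
    lsuffix="_caller",
    rsuffix="_other",
)

# Option 2: merge matches the filepath columns of both frames directly.
merged = seren1.merge(
    bbox,
    on="filepath",
    how="inner",
    suffixes=("_caller", "_other"),
)

print(joined)
print(merged)
```

Both results contain only the file paths present in both frames, with the overlapping label column split into label_caller and label_other.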

Trying to join two pandas DataFrames, but getting "ValueError: You are trying to merge on object and int64 columns."?
