pandas DataFrame: ValueError: Can only compare identically-labeled DataFrame objects

Time: 2016-09-20 17:54:38

Tags: python-2.7 pandas dataframe

I am using Python 2.7 and have the following code:

df_cut = df_in.copy()
df_cut[df_cut > df_boundry.iloc[[-1]]] = pd.concat([df_boundry.iloc[[-1]]] * len(df_cut)).set_index(df_cut.index)

It raises the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-30-4eb788bd44c5> in <module>()
      1 df_cut = df_in.copy()
----> 2 df_cut[df_cut > df_boundry.iloc[[-1]]] = pd.concat([df_boundry.iloc[[-1]]] * len(df_cut)).set_index(df_cut.index)


/home/edamame/anaconda2/lib/python2.7/site-packages/pandas/core/ops.pyc in f(self, other)
   1175     def f(self, other):
   1176         if isinstance(other, pd.DataFrame):  # Another DataFrame
-> 1177             return self._compare_frame(other, func, str_rep)
   1178         elif isinstance(other, ABCSeries):
   1179             return self._combine_series_infer(other, func)

/home/edamame/anaconda2/lib/python2.7/site-packages/pandas/core/frame.pyc in _compare_frame(self, other, func, str_rep)
   3582     def _compare_frame(self, other, func, str_rep):
   3583         if not self._indexed_same(other):
-> 3584             raise ValueError('Can only compare identically-labeled '
   3585                              'DataFrame objects')
   3586         return self._compare_frame_evaluate(other, func, str_rep)

ValueError: Can only compare identically-labeled DataFrame objects

where df_cut is:

    column_A | column_B | column_C
    --------------------------------
 0    0.5     |   0.5    |  NaN
 1    1.2     |   NaN    |  NaN
 2    NaN     |   8.1    | 21.1
 3    9.1     |   9.3    |  2.1
 4    4.5     |  90.1    |  1.4
 5  112.3     |  79.2    |  1.1
        :
        :

and df_boundry is:

    |  column_A  |  column_B  |  column_C
----------------------------------------
0.0 |     0.1    |    0.4     |   0.0
0.8 |    110.4   |   80.1     |  20.5

Does anyone know what I am missing? Thanks!

1 answer:

Answer 0: (score: 1)

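UPDATE:

It works fine when the comparison uses the Series df_boundry.iloc[-1] (single brackets) instead of the one-row DataFrame df_boundry.iloc[[-1]]: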

In [27]: df_cut
Out[27]:
   column_A  column_B  column_C
0       0.5       0.5       NaN
1       1.2       NaN       NaN
2       NaN       8.1      21.1
3       9.1       9.3       2.1
4       4.5      90.1       1.4
5     112.3      79.2       1.1

In [28]: df_boundry
Out[28]:
     column_A  column_B  column_C
0.0       0.1       0.4       0.0
0.8     110.4      80.1      20.5

In [29]: df_cut[df_cut > df_boundry.iloc[-1]] = pd.concat([df_boundry.iloc[[-1]]] * len(df_cut)).set_index(df_cut.index)

In [31]: df_cut
Out[31]:
   column_A  column_B  column_C
0       0.5       0.5       NaN
1       1.2       NaN       NaN
2       NaN       8.1      20.5
3       9.1       9.3       2.1
4       4.5      80.1       1.4
5     110.4      79.2       1.1

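OLD answer:

I guess the problem is that df_boundry.iloc[[-1]] is a DataFrame containing one row, and df_cut is also a DataFrame. Two DataFrames must be identically labeled (same columns, same index) before they can be compared.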
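A minimal sketch of that difference, using a shortened two-row, two-column version of the frames from the question (the names `df`, `bounds`, `one_row_frame`, `last_row` and `frame_compare_failed` are made up for this illustration):

```python
import pandas as pd

# Shortened stand-ins for df_cut and df_boundry (illustrative data only).
df = pd.DataFrame({"column_A": [0.5, 112.3], "column_B": [0.5, 79.2]})
bounds = pd.DataFrame(
    {"column_A": [0.1, 110.4], "column_B": [0.4, 80.1]},
    index=[0.0, 0.8],
)

one_row_frame = bounds.iloc[[-1]]  # list indexer -> one-row DataFrame, index [0.8]
last_row = bounds.iloc[-1]         # scalar indexer -> Series indexed by column name

# DataFrame vs. DataFrame: the row labels differ ([0, 1] vs. [0.8]), so pandas refuses.
try:
    df > one_row_frame
    frame_compare_failed = False
except ValueError:
    frame_compare_failed = True

# DataFrame vs. Series: the Series is matched to the columns and applied to every row.
series_mask = df > last_row
print(type(one_row_frame).__name__, type(last_row).__name__)
print(frame_compare_failed)
print(series_mask)
```

The only change between the failing and the working comparison is the pair of brackets passed to `iloc`.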

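df_boundry.iloc[-1] is a Series. If its number of elements equals the number of columns in the DataFrame you are comparing it with (and the labels match those columns), it can be compared against every row...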
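Because the Series lines up with the columns, the capping can also be written without building a repeated frame via pd.concat. The sketch below swaps in `DataFrame.clip` with `axis=1`, which is a different technique from the assignment used in the answer; it assumes the goal is simply to cap each column at the last row of df_boundry, and the names `upper` and `capped` are introduced here for illustration:

```python
import numpy as np
import pandas as pd

# Data copied from the question.
df_cut = pd.DataFrame({
    "column_A": [0.5, 1.2, np.nan, 9.1, 4.5, 112.3],
    "column_B": [0.5, np.nan, 8.1, 9.3, 90.1, 79.2],
    "column_C": [np.nan, np.nan, 21.1, 2.1, 1.4, 1.1],
})
df_boundry = pd.DataFrame(
    {"column_A": [0.1, 110.4], "column_B": [0.4, 80.1], "column_C": [0.0, 20.5]},
    index=[0.0, 0.8],
)

upper = df_boundry.iloc[-1]             # Series: one upper bound per column

# axis=1 tells clip to match the Series labels against the column names.
capped = df_cut.clip(upper=upper, axis=1)
print(capped)
```

Unlike the in-place assignment, `clip` returns a new frame and leaves df_cut unchanged; NaN cells stay NaN.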