Identical dataframes asserting not equal - Python Pandas

Date: 2017-06-06 20:22:50

Tags: python unit-testing pandas dataframe python-unittest

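I am trying to unit test my code. I have a method that, given a MySQL query, returns the result as a pandas DataFrame. Note that in the database, all returned values in created and external_id are NULL. Here is the test: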

def test_get_data(self):

    ### SET UP

    self.report._query = "SELECT * FROM floor LIMIT 3";
    self.report._columns = ['id', 'facility_id', 'name', 'created', 'modified', 'external_id']
    self.d = {'id': p.Series([1, 2, 3]),
              'facility_id': p.Series([1, 1, 1]),
              'name': p.Series(['1st Floor', '2nd Floor', '3rd Floor']),
              'created': p.Series(['None', 'None', 'None']),
              'modified': p.Series([datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S'),
                                    datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S'),
                                    datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S')]),
              'external_id': p.Series(['None', 'None', 'None'])
              }
    self.df = p.DataFrame(data=self.d, columns=['id', 'facility_id', 'name', 'created', 'modified', 'external_id'])
    self.df.fillna('None')
    print(self.df)
    ### CODE UNDER TEST

    result = self.report.get_data(self.report._cursor_web)
    print(result)
    ### ASSERTIONS

    assert_frame_equal(result, self.df)

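Here is the console output (note the print statements in the test code; the manually constructed dataframe is on top, and the one derived from the function under test is on the bottom):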

.   id  facility_id       name created            modified external_id
0   1            1  1st Floor    None 2012-10-06 01:08:27        None
1   2            1  2nd Floor    None 2012-10-06 01:08:27        None
2   3            1  3rd Floor    None 2012-10-06 01:08:27        None
   id  facility_id       name created            modified external_id
0   1            1  1st Floor    None 2012-10-06 01:08:27        None
1   2            1  2nd Floor    None 2012-10-06 01:08:27        None
2   3            1  3rd Floor    None 2012-10-06 01:08:27        None
F
======================================================================
FAIL: test_get_data (__main__.ReportTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/path/to/file/ReportsTestCase.py", line 46, in test_get_data
    assert_frame_equal(result, self.df)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1313, in assert_frame_equal
    obj='DataFrame.iloc[:, {0}]'.format(i))
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1181, in assert_series_equal
    obj='{0}'.format(obj))
  File "pandas/src/testing.pyx", line 59, in pandas._testing.assert_almost_equal (pandas/src/testing.c:4156)
  File "pandas/src/testing.pyx", line 173, in pandas._testing.assert_almost_equal (pandas/src/testing.c:3274)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1018, in raise_assert_detail
    raise AssertionError(msg)

AssertionError: DataFrame.iloc[:, 3] are different

DataFrame.iloc[:, 3] values are different (100.0 %)
[left]:  [None, None, None]
[right]: [None, None, None]

----------------------------------------------------------------------
Ran 1 test in 0.354s

FAILED (failures=1)

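By my reckoning, the column 'created' contains three string values of 'None' in both the left and right dataframes. Why does it not assert equal?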

1 Answer:

Answer 0: (score: 1)

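Python also has a built-in constant None, which is different from the string 'None'. From the docs:

    The sole value of the type NoneType. None is frequently used to
    represent the absence of a value, as when default arguments are not
    passed to a function. Assignments to None are illegal and raise a
    SyntaxError.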

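If you compare None with 'None' (None == 'None'), the result will be False. Both values print as None in a DataFrame, which is why the two frames look identical in the console output. So assert_frame_equal will raise an AssertionError if one of the DataFrames contains None but the other contains 'None'.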
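The difference can be reproduced without a database. The following is a minimal sketch, not the asker's actual code: the two small frames are made up for illustration, and it assumes a pandas version that exposes `pandas.testing` (the traceback in the question uses the older `pandas.util.testing` path). It builds one frame with the string 'None' and one with the None object, shows that the comparison fails, and then shows one possible fix, which is to put real None values in the expected frame.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hand-built expected frame using the *string* 'None', as in the question.
expected = pd.DataFrame({"id": [1, 2, 3], "created": ["None", "None", "None"]})

# Hypothetical stand-in for the query result: SQL NULL arrives as the None object.
result = pd.DataFrame({"id": [1, 2, 3], "created": [None, None, None]})

# The two values are not equal, even though both print as None in a frame.
none_equals_string = (None == "None")

# Inspecting the element types exposes the difference that print() hides.
result_types = [type(v).__name__ for v in result["created"]]
expected_types = [type(v).__name__ for v in expected["created"]]

# The frames therefore do not compare equal.
try:
    assert_frame_equal(result, expected)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True

# One possible fix: build the expected frame with real None values.
expected_fixed = pd.DataFrame({"id": [1, 2, 3], "created": [None, None, None]})
assert_frame_equal(result, expected_fixed)  # passes silently

print(none_equals_string, result_types, expected_types, mismatch_detected)
```

When a test fails with left and right values that look identical, checking the element types as above is usually a quick way to tell whether the values are strings or genuinely missing.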