pandas.merge fails when merging Timestamp columns with tzinfo

Time: 2015-11-01 22:34:22

Tags: python datetime numpy pandas

I need to merge on Timestamp columns, but the behaviour depends on whether a timezone is set.

The following code works fine:

import datetime
import pandas as pd

# Naive timestamps (no timezone): the merge succeeds.
now = datetime.datetime.now()
df1 = pd.DataFrame({'ts': pd.to_datetime([now])})
df2 = pd.DataFrame({'ts': pd.to_datetime([now])})
pd.merge(df1, df2, on='ts')

By contrast, this does not:

import datetime
import pandas as pd
import pytz

# Same timestamp with tzinfo set to UTC: the merge raises a TypeError.
now = datetime.datetime.now().replace(tzinfo=pytz.utc)
df3 = pd.DataFrame({'ts': pd.to_datetime([now])})
df4 = pd.DataFrame({'ts': pd.to_datetime([now])})
pd.merge(df3, df4, on='ts')

My environment:

  1. python 3.4
  2. pandas 0.17.0
  3. numpy 1.10.1

The dtypes differ, and I get the following error:

    /path/to/env3.4/lib/python3.4/site-packages/pandas/tools/merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator)
         33                          right_index=right_index, sort=sort, suffixes=suffixes,
         34                          copy=copy, indicator=indicator)
    ---> 35     return op.get_result()
         36 if __debug__:
         37     merge.__doc__ = _merge_doc % '\nleft : DataFrame'
    
    /path/to/env3.4/lib/python3.4/site-packages/pandas/tools/merge.py in get_result(self)
        194             self.left, self.right = self._indicator_pre_merge(self.left, self.right)
        195 
    --> 196         join_index, left_indexer, right_indexer = self._get_join_info()
        197 
        198         ldata, rdata = self.left._data, self.right._data
    
    /path/to/env3.4/lib/python3.4/site-packages/pandas/tools/merge.py in _get_join_info(self)
        323              right_indexer) = _get_join_indexers(self.left_join_keys,
        324                                                  self.right_join_keys,
    --> 325                                                  sort=self.sort, how=self.how)
        326 
        327             if self.right_index:
    
    /path/to/env3.4/lib/python3.4/site-packages/pandas/tools/merge.py in _get_join_indexers(left_keys, right_keys, sort, how)
        514 
        515     # get left & right join labels and num. of levels at each location
    --> 516     llab, rlab, shape = map(list, zip( * map(fkeys, left_keys, right_keys)))
        517 
        518     # get flat i8 keys from label lists
    
    TypeError: type object argument after * must be a sequence, not map
    

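Date handling in pandas is somewhat cryptic. You have to know:

• whether you are manipulating datetime.datetime, numpy.datetime64 or pandas.Timestamp
• whether a timezone is set or not
• whether the precision is seconds, milliseconds, microseconds or nanoseconds

What am I missing here?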
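The three properties listed above can be inspected directly on a column. The following is only a minimal sketch: the fixed timestamp and the names `naive` and `aware` are illustrative, and `datetime.timezone.utc` stands in for `pytz.utc`.

```python
import datetime

import pandas as pd

# Illustrative fixed timestamp (an assumption, not taken from the question).
now = datetime.datetime(2015, 11, 1, 18, 33, 59, 771962)

naive = pd.Series(pd.to_datetime([now]))
aware = pd.Series(pd.to_datetime([now.replace(tzinfo=datetime.timezone.utc)]))

# 1. Type: elements of a datetime column come back as pandas.Timestamp.
print(type(naive.iloc[0]))

# 2. Timezone: .dt.tz is None for a naive column and set for an aware one.
print(naive.dt.tz, aware.dt.tz)

# 3. Precision: the unit is part of the dtype, and so is the timezone,
#    which is why the two columns report different dtypes.
print(naive.dtype, aware.dtype)
```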

1 answer:

Answer 0: (score: 2)

This is a bug in 0.17.0. It has been fixed in master here and will be included in the upcoming 0.17.1.
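Where upgrading is not possible, one conceivable workaround is to merge on naive UTC values and re-attach the timezone afterwards. This is not part of the answer; it is a sketch under the assumption that both key columns are timezone-aware, and the helper name `merge_on_utc` and the extra columns `a` and `b` are hypothetical. It is written against a current pandas and has not been checked on 0.17.0 itself.

```python
import datetime

import pandas as pd


def merge_on_utc(left, right, on):
    """Merge two frames on a tz-aware column via naive UTC keys (hypothetical helper)."""
    left = left.copy()
    right = right.copy()
    for frame in (left, right):
        # Normalize to UTC, then drop the tz so the key is a plain datetime64 column.
        frame[on] = frame[on].dt.tz_convert("UTC").dt.tz_localize(None)
    merged = pd.merge(left, right, on=on)
    # Re-attach UTC to the merged key column.
    merged[on] = merged[on].dt.tz_localize("UTC")
    return merged


now = datetime.datetime.now(datetime.timezone.utc)
df3 = pd.DataFrame({"ts": pd.to_datetime([now]), "a": [1]})
df4 = pd.DataFrame({"ts": pd.to_datetime([now]), "b": [2]})
result = merge_on_utc(df3, df4, on="ts")
print(result)
```

The key column of the result is always in UTC, so a column that originally carried another timezone would need a further `tz_convert` after the merge.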

Without tz:

In [13]: now = datetime.datetime.now()

In [14]: df1 = pd.DataFrame({'ts': pd.to_datetime([now])})

In [15]: df2 = pd.DataFrame({'ts': pd.to_datetime([now])})

In [16]: pd.merge(df1, df2, on='ts')
Out[16]: 
                          ts
0 2015-11-01 18:33:59.771962

With tz:

In [8]: now = datetime.datetime.now().replace(tzinfo=pytz.utc)

In [9]: df3 = pd.DataFrame({'ts': pd.to_datetime([now])})

In [10]: df4 = pd.DataFrame({'ts': pd.to_datetime([now])})

In [11]: pd.merge(df3, df4, on='ts')
Out[11]: 
                                ts
0 2015-11-01 18:32:46.801009+00:00
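A side note on how the timestamps above are built: `datetime.datetime.now().replace(tzinfo=...)` only labels the local wall-clock time as UTC and does not convert it. This does not affect the merge, but it matters if the values are meant to be real UTC instants. A small sketch, with `datetime.timezone.utc` standing in for `pytz.utc` and all variable names chosen for illustration:

```python
import datetime

local_naive = datetime.datetime.now()

# replace() keeps the wall-clock fields and only attaches the UTC label.
labelled = local_naive.replace(tzinfo=datetime.timezone.utc)

# Asking for the current time in UTC gives the actual UTC instant.
actual_utc = datetime.datetime.now(datetime.timezone.utc)

# The gap between the two is approximately the local UTC offset.
offset = local_naive.astimezone().utcoffset()
print(labelled - actual_utc, offset)
```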