Question

Is this a bug? If so, any suggestions for a workaround?

I have the following:

df1['DecisionDate'].head()
Out[238]: 
ID
RED            2017-02-13 00:00:00
GREEN          2016-07-29 00:00:00
ORANGE         2017-01-26 00:00:00
PURPLE         2016-10-31 00:00:00
YELLOW          NaT
Name: DecisionDate, dtype: datetime64[ns]

And:

df2['DecisionDate']
Out[239]: 
YELLOW   2014-04-05 00:00:00
Name: DecisionDate, dtype: datetime64[ns]

Now, if I try to do this:

for ID in df2.index:
    df1.ix[ID,'DecisionDate'] = df2.ix[ID,'DecisionDate']

I get:

TypeError: long() argument must be a string or a number, not 'Timestamp'

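No matter what I try, I can't seem to replace pd.NaT with a Timestamp value.

All values in both dataframes are Timestamps. The df2 observations are a subset of the df1 observations, so every value in df2.index is also in df1.index.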
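For anyone trying to reproduce this, here is a minimal sketch that rebuilds the two frames from the printed output and checks both claims (datetime dtype, and df2.index being a subset of df1.index). The construction is an assumption based on the output shown above, not the original data-loading code.

```python
import pandas as pd

# Hypothetical reconstruction of the two frames from the printed output
df1 = pd.DataFrame(
    {"DecisionDate": pd.to_datetime(
        ["2017-02-13", "2016-07-29", "2017-01-26", "2016-10-31", None])},
    index=pd.Index(["RED", "GREEN", "ORANGE", "PURPLE", "YELLOW"], name="ID"),
)
df2 = pd.DataFrame(
    {"DecisionDate": pd.to_datetime(["2014-04-05"])},
    index=pd.Index(["YELLOW"], name="ID"),
)

# Both columns should hold datetime values
dtypes_ok = all(
    pd.api.types.is_datetime64_any_dtype(d["DecisionDate"]) for d in (df1, df2)
)
# Every label in df2.index should also appear in df1.index
is_subset = df2.index.isin(df1.index).all()
print(dtypes_ok, is_subset)
```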

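Am I overlooking something obvious? Or is this a bug?

Thanks in advance for your help.

Edit

Here is the full traceback. I'm not good at reading these, but maybe it helps with the diagnosis: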

  File "C:\Users\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 141, in __setitem__
    def _slice(self, obj, axis=0, kind=None):

  File "C:\Users\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 533, in _setitem_with_indexer

  File "C:\Users\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 473, in setter
    value = getattr(value, 'values', value).ravel()

  File "C:\Users\Anaconda2\lib\site-packages\pandas\core\internals.py", line 3168, in setitem

  File "C:\Users\Anaconda2\lib\site-packages\pandas\core\internals.py", line 3056, in apply
    align_copy = False

  File "C:\Users\Anaconda2\lib\site-packages\pandas\core\internals.py", line 668, in setitem
    def _replace_single(self, *args, **kwargs):

  File "C:\Users\Anaconda2\lib\site-packages\pandas\core\internals.py", line 2265, in _try_coerce_args
    Parameters

TypeError: long() argument must be a string or a number, not 'Timestamp'

Answer 1

I think the problem lies in identifying the NaT type. This works:

    import pandas as pd

    def identify_null(r):
        # Keep the existing date unless it is missing (NaT)
        if not pd.isnull(r['date']):
            return r['date']
        X = df2.loc[df2.ID == r['ID'], 'date']  # collect date from other dataframe
        # Return the first match as a scalar; stay NaT if df2 has no such ID
        return X.iloc[0] if len(X) else r['date']

    df['date'] = df.apply(identify_null, axis=1)
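The same idea can probably be expressed without a row-wise apply. The following is only a sketch, assuming both frames have 'ID' and 'date' columns as in the function above and that each ID appears at most once in df2; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical frames with 'ID' and 'date' columns
df = pd.DataFrame({
    "ID": ["RED", "GREEN", "YELLOW"],
    "date": pd.to_datetime(["2017-02-13", "2016-07-29", None]),
})
df2 = pd.DataFrame({
    "ID": ["YELLOW"],
    "date": pd.to_datetime(["2014-04-05"]),
})

# Look up each ID's date in df2 (NaT where df2 has no such ID)
lookup = df["ID"].map(df2.set_index("ID")["date"])
# Fill only the missing dates; existing dates are left untouched
df["date"] = df["date"].fillna(lookup)
print(df)
```

Here `fillna` aligns `lookup` with `df["date"]` on the row index, so only the NaT entries are replaced.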

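Edit: Previously, I had suggested this workaround: split off the rows with missing values. Take the rows whose date is not null:

    df2 = df[df.date.notnull()]

And take the rows whose date is null:

    df1 = df[df.date.isnull()]

For df1, drop the date column and merge it with the dataset that has the dates.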

    frame = [df1_merged, df2]
    df = pd.concat(frame)

Note: Sorry this is not a very clever answer, as I haven't experimented with NaT myself. But this should work for you.
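Put together, the split, merge, and concat steps might look roughly like this. It is a sketch only: the `dates` frame is a hypothetical stand-in for "the dataset that has the dates", and the sample values are invented.

```python
import pandas as pd

# Hypothetical data; 'dates' stands in for the dataset that has the dates
df = pd.DataFrame({
    "ID": ["RED", "GREEN", "YELLOW"],
    "date": pd.to_datetime(["2017-02-13", "2016-07-29", None]),
})
dates = pd.DataFrame({
    "ID": ["YELLOW"],
    "date": pd.to_datetime(["2014-04-05"]),
})

df2 = df[df.date.notnull()]   # rows that already have a date
df1 = df[df.date.isnull()]    # rows whose date is missing

# Drop the empty date column, then pull the dates in by ID
df1_merged = df1.drop(columns="date").merge(dates, on="ID", how="left")

frame = [df1_merged, df2]
df = pd.concat(frame, ignore_index=True)
print(df)
```

Note that the row order changes: the previously missing rows end up first after the concat.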

Answer 2

This appears to be a bug in Spyder.

for ID in df2.index:
    df1.ix[ID,'DecisionDate'] = df2.ix[ID,'DecisionDate']

works.

What I noticed afterwards is that my line actually read as follows:

for ID in df2.index:
    df1.ix[ID,'DecisionDate'] = df2.ix[ID,'DecisionDate']# some comment

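Spyder did not like the comment hash sitting right next to active code. df2.ix[ID,'DecisionDate']# some comment was converted to a tuple (myvalue,) instead of a single value, and that is how I got the type error. Once I removed the comment from that line, it worked as it should.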
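For readers on a newer pandas: `.ix` was deprecated in 0.20 and removed in 1.0, so the loop has to be written with `.loc` there. Below is a minimal sketch of that version, using a made-up two-row reconstruction of the frames. The check inside the loop simply guards against the one-element tuple described above.

```python
import pandas as pd

# Hypothetical, shortened reconstruction of the two frames
df1 = pd.DataFrame(
    {"DecisionDate": pd.to_datetime(["2017-02-13", None])},
    index=["RED", "YELLOW"],
)
df2 = pd.DataFrame(
    {"DecisionDate": pd.to_datetime(["2014-04-05"])},
    index=["YELLOW"],
)

for ID in df2.index:
    value = df2.loc[ID, "DecisionDate"]
    # Guard against accidentally passing a tuple such as (value,)
    if isinstance(value, tuple):
        raise TypeError("expected a single Timestamp, got a tuple")
    df1.loc[ID, "DecisionDate"] = value

print(df1)
```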

Pandas: replace null datetime values in one df with Timestamp values from another
