Question

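I am trying to incrementally copy documents from one database to another.

Some fields contain datetime values in the following format:

2016-09-22 00:00:00

while others are in this format:

2016-09-27 09:03:08.988

I extract the documents and insert them with insert_many, which fails with this error:

ValueError: NaTType does not support utcoffset

I don't actually need to modify the timestamps, just insert them as they are.

Any help is appreciated.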

Answer 1

Replace the NaT values with None, which pandas can store once the column is cast to object:

df[['_updated_at']] = df[['_updated_at']].astype(object).where(df[['_updated_at']].notnull(), None)
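The same idea can also be applied after the frame has been turned into records, which sidesteps dtype coercion inside pandas. The following is only a minimal sketch: the helper name `to_mongo_records` and the two-row sample frame are made up for illustration, and it assumes every cell holds a scalar value.

```python
import pandas as pd

# Hypothetical sample data: one valid timestamp and one missing value (NaT).
df = pd.DataFrame({
    "_id": [1, 2],
    "_updated_at": pd.to_datetime(["2016-09-27 09:03:08.988", None]),
})

def to_mongo_records(frame):
    """Turn a DataFrame into a list of dicts, replacing NaT/NaN with None."""
    records = frame.to_dict("records")
    return [
        # pd.isna is True for NaT, NaN and None when given a scalar
        {key: (None if pd.isna(value) else value) for key, value in row.items()}
        for row in records
    ]

records = to_mongo_records(df)
```

The resulting list can then be passed to `insert_many`; documents whose timestamp was missing carry `None` (stored as null) instead of `NaT`.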

Answer 2

Apparently, this worked for me:

df.fillna("-",inplace=True)

Answer 3

The cells may not all share the same datetime format, so you should normalize them first with pandas.DataFrame.apply. Here is an example approach.

import datetime as dt

def handleString(probably_string):
    # expected string pattern: 2016-09-27 09:03:08.988
    try:
        _date, _time = probably_string.split(' ')
        _year, _month, _day = (int(x) for x in _date.split('-'))
        _hour, _minute, _second = _time.split(':')
        # the seconds may carry a fractional part ("08.988"); truncate it
        return dt.datetime(_year, _month, _day, int(_hour), int(_minute), int(float(_second)))
    except AttributeError:
        # it's a NoneType object (no .split method),
        # but you should return a datetime object for the mongodb datetime field
        return dt.datetime(1970, 1, 1)
    except ValueError:
        # not enough values to unpack, or a part that is not a number,
        # but you should return a datetime object for the mongodb datetime field
        return dt.datetime(1970, 1, 1)

def formatTime(row, column_name):
    datetime_cell = row[column_name]
    try:
        # datetime-like cells have a .second attribute; strings and None do not
        _second = datetime_cell.second
        return datetime_cell.replace(second=_second, microsecond=0)
    except AttributeError:
        return handleString(datetime_cell)

time_column = 'time_field'
df[time_column] = df.apply(lambda row: formatTime(row, time_column), axis='columns')
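Since the question asks to keep the timestamps as they are, a variant that preserves the milliseconds may be preferable. The sketch below is an alternative to handleString above, not the answerer's method: it uses `datetime.strptime` with the two formats shown in the question, and it returns None for unparseable input rather than the 1970 placeholder. The names `FORMATS` and `parse_timestamp` are hypothetical.

```python
from datetime import datetime

# The two layouts mentioned in the question; extend this tuple if other formats occur.
FORMATS = ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S")

def parse_timestamp(text):
    """Try each known format in turn; return None if nothing matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except (TypeError, ValueError):
            # TypeError: text is not a string (e.g. None)
            # ValueError: text does not match this format
            continue
    return None
```

For example, `parse_timestamp("2016-09-27 09:03:08.988")` keeps the fractional part as 988000 microseconds, while `parse_timestamp(None)` yields `None`.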

Answer 4

def new_replace(k):
    return k.replace(tzinfo=None)


df[time_column] = df.apply(lambda row: new_replace(row[time_column]), axis=1)

This worked in my case. You can also add a try/except inside the new_replace function to suit your situation.
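One possible shape for that try/except is sketched below. The name `strip_tz` is hypothetical, and returning None for unusable cells is an assumption about what the target collection should store. The self-comparison relies on missing markers such as pandas NaT comparing unequal to themselves, as NaN does.

```python
from datetime import datetime, timezone

def strip_tz(value):
    """Drop tzinfo from a datetime-like value; return None for anything unusable."""
    try:
        cleaned = value.replace(tzinfo=None)
    except (AttributeError, TypeError):
        # AttributeError: None or numbers have no .replace
        # TypeError: str.replace does not accept a tzinfo keyword
        return None
    # A missing-value marker such as NaT is not equal to itself
    return cleaned if cleaned == cleaned else None
```

It can be used in place of new_replace in the apply call, for example `df[time_column] = df[time_column].apply(strip_tz)`.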

Pymongo - ValueError: NaTType does not support utcoffset when using insert_many
