Question

I need to run pd.read_sql without date parsing.

According to the pd.read_sql documentation, the parse_dates parameter can be a "Dict of {column_name: arg dict}, where the arg dict corresponds to the keyword arguments of pandas.to_datetime(). Especially useful with databases without native Datetime support, such as SQLite."

The to_datetime documentation says the default is errors='raise'. If I could change that to errors='ignore' or errors='coerce', it should solve the problem.

I tried to implement it like this:

pd.read_sql(query, con, parse_dates={'col_name': {'errors': 'ignore'}}, chunksize=10**5)

This runs without errors, but the dates are still parsed.

The code itself is not very relevant to the question. It is basically this:
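For reference, here is a minimal sketch of how the dict form of parse_dates behaves, assuming an in-memory SQLite database. The table name `expenses` and its sample rows are hypothetical and exist only for illustration. It uses errors='coerce', one of the two values mentioned above. Note that parse_dates only acts on values pandas has already received from the database driver.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table; SQLite stores the date column as plain text.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE expenses (id INTEGER, col_name TEXT)")
con.executemany(
    "INSERT INTO expenses VALUES (?, ?)",
    [(1, "2020-01-15"), (2, "not a date")],
)
con.commit()

query = "SELECT * FROM expenses ORDER BY id"

# parse_dates=None: pandas leaves the text column exactly as the driver returned it.
raw = pd.read_sql(query, con, parse_dates=None)

# Dict form: the inner dict is forwarded to pandas.to_datetime as keyword arguments,
# so values that cannot be parsed become NaT instead of raising.
coerced = pd.read_sql(query, con, parse_dates={"col_name": {"errors": "coerce"}})

print(raw["col_name"].tolist())
print(coerced["col_name"].isna().tolist())
```

If the failure happens before pandas receives the rows, as it may when the driver itself converts the values, this option would not be expected to help.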

df = pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=10**5)

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql.html

I need to turn off date parsing to prevent this error:


  File "expense.py", line 20, in <module>
    for df in gen:
  File "C:\Users\rfrigo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\sql.py", line 1453, in _query_iterator
    data = cursor.fetchmany(chunksize)
ValueError: year -6371 is out of range

Answer 1

Your problem occurs when you specify chunksize. Take a look at the following example:

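import pandas as pd

if __name__ == '__main__':
    # `connection` is assumed to be an already-open database connection.
    empty_query = 'select * from some_table where id = 8456314523;'
    # With chunksize set, read_sql returns a generator rather than a DataFrame.
    df = pd.read_sql(empty_query, connection, chunksize=10**5)
    print("df : {}".format(df if not df.empty else "df is empty"))
    print('END')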

When I do not specify chunksize=10**5, df is simply empty, but when I do specify chunksize, it raises:

AttributeError: 'generator' object has no attribute 'empty'

Maybe try running a smaller query first (for example, limited to 1 row). This time I was able to run the query with chunksize successfully.
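Because read_sql returns a generator of DataFrames when chunksize is given, the result has to be iterated before attributes such as .empty can be used. The following is a minimal sketch of that pattern, assuming an in-memory SQLite database. The helper `read_in_chunks` and the sample rows are hypothetical and not part of pandas.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table, used only for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE some_table (id INTEGER, amount REAL)")
con.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [(1, 9.5), (2, 3.0), (3, 7.25)],
)
con.commit()


def read_in_chunks(query, con, chunksize):
    """Collect the chunks yielded by read_sql into a single DataFrame."""
    chunks = list(pd.read_sql(query, con, chunksize=chunksize))
    # Guard pd.concat in case the generator yields no chunks at all.
    if not chunks:
        return pd.DataFrame()
    return pd.concat(chunks, ignore_index=True)


full = read_in_chunks("SELECT * FROM some_table", con, chunksize=2)
empty = read_in_chunks(
    "SELECT * FROM some_table WHERE id = 8456314523", con, chunksize=2
)

print("df : {}".format(full if not full.empty else "df is empty"))
print("df : {}".format(empty if not empty.empty else "df is empty"))
```

Collecting every chunk defeats the memory savings of chunksize for large results, so in practice each chunk would usually be processed inside the loop instead.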

How can I run pd.read_sql without date parsing?
