Question

I am reading a CSV file and inserting it into Cloud SQL using the following procedure:

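    import pandas as pd
    from sqlalchemy import create_engine

    df = pd.read_csv(csv_file, sep=';', encoding='utf-8', keep_default_na=False)

    # Remove a trailing "C.", "County", or "Cnty" from the "world" field.
    # str.rstrip('C.CountyCnty') would strip any of those *characters* from the
    # right end, not the suffixes, so a regex anchored at the end is used instead.
    df['world'] = df['world'].str.replace(r'\s*(?:County|Cnty|C\.)$', '', regex=True)

    # Connect to the MySQL database and append the DataFrame to the table
    connection_string = 'mysql+mysqlconnector://xxxx:xxxx@xx.xxx.x.xx:aaaa/mydatabase'

    engine = create_engine(connection_string, echo=False)
    conn = engine.connect()
    df.to_sql(name="mytable", con=engine, if_exists='append', index=False)
    conn.close()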

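However, this inserts into SQL without problems until it encounters a file that starts with empty values in the world field. Note: files whose empty values appear only in later rows are inserted without problems.

I think the error occurs because of the empty fields in the CSV data. I used keep_default_na=False to fix it, but the error persists. Any help would be highly appreciated.

This is what the error looks like: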

 (mysql.connector.errors.OperationalError) 2055: Lost connection to MySQL server at 'xx.xxx.xx.x:aaaa', system error: 10053 An established connection was aborted by the software in your host machine

Answer 1

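From http://pandas.pydata.org/pandas-docs/stable/io.html#sql-queries: with some databases, writing large DataFrames can result in errors due to packet size limitations being exceeded. This can be avoided by setting the chunksize parameter when calling to_sql. For example, the following writes data to the database in batches of 1000 rows at a time: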

    data.to_sql('data_chunked', engine, chunksize=1000)
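
Applied to the call in the question, this could look roughly like the sketch below. It is only an illustration under stated assumptions: an in-memory SQLite connection stands in for the Cloud SQL MySQL engine so the snippet runs on its own, and the DataFrame contents (including the "value" column) are made-up placeholder data. With the real engine, only the `con` argument would change.

```python
import sqlite3

import pandas as pd

# Placeholder data (an assumption, not the asker's CSV): 3000 rows,
# some of which have an empty string in the "world" column.
df = pd.DataFrame({
    "world": ["", "Kent", "Essex"] * 1000,
    "value": range(3000),
})

# Stand-in for the MySQL engine from the question; pandas accepts a
# sqlite3 connection directly as the `con` argument.
conn = sqlite3.connect(":memory:")
try:
    # chunksize=1000 makes pandas write the rows in batches of 1000
    # instead of sending the whole DataFrame in a single statement.
    df.to_sql(name="mytable", con=conn, if_exists="append",
              index=False, chunksize=1000)
    row_count = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
finally:
    conn.close()
```

If the failure really is a packet size limit, a smaller chunksize keeps each batch under it; the right value depends on the row width and the server settings, so 1000 is just a starting point.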

Cloud SQL insert error with missing values / large files
