Question

I have a pandas DataFrame object and iterate over its rows with the following:

for idx, row in df.iterrows():
    # do some stuff
    # save row to database

The problem is that when I try to save it to the database, to_sql treats my row as a column.

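The variable row appears to be of type Series. I searched the manual entry for Series.to_sql carefully, but found no way to have it treated as a database row rather than a column.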

The workaround I came up with is to convert the Series to a DataFrame and then transpose it:

    temp = pd.DataFrame(row).T
    temp.to_sql(table, con=engine, if_exists='append', index_label='idx')

Is there a simpler way?

Answer 1

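Rather than using df.iterrows, which returns the index and a Series representation of each row, one approach is to iterate over the positions in df.index and use integer-location based indexing (iloc) to slice the DataFrame for row processing. A slice such as df.iloc[i:i+1] stays a one-row DataFrame, so to_sql writes it as a row.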

import pandas as pd

df = pd.DataFrame.from_dict({'a':[1,2,3],'b':[4,5,6]})
for i in range(len(df.index)):
    # iloc[i:i+1] keeps the row as a one-row DataFrame rather than a Series
    row = df.iloc[i:i+1,:]
    # do stuff
    row.to_sql(...)
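
To see the slicing approach end to end, here is a minimal sketch. The in-memory SQLite connection and the table name `demo` are assumptions made for illustration; the question's own `engine` and `table` would take their place.

```python
import sqlite3

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Assumption: an in-memory SQLite database stands in for the real engine.
con = sqlite3.connect(':memory:')

for i in range(len(df)):
    # A one-row slice is still a DataFrame, so to_sql writes it as a row.
    row = df.iloc[i:i + 1]
    row.to_sql('demo', con, if_exists='append', index_label='idx')

# Read the table back to check that each slice became one database row.
stored = pd.read_sql('SELECT * FROM demo ORDER BY idx', con)
con.close()
```

Each call appends a single row with columns idx, a and b, so no transpose is needed.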

Indexing into the DataFrame directly is also the recommended way to modify it. From the df.iterrows docstring:

2. You should **never modify** something you are iterating over.
   This is not guaranteed to work in all cases. Depending on the
   data types, the iterator returns a copy and not a view, and writing
   to it will have no effect.
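
A small sketch of what that warning means in practice. The mixed-dtype frame below is my own example, chosen because with mixed column types iterrows has to build a new Series for each row; the write-back through df.at is one possible way to modify the frame, not the only one.

```python
import pandas as pd

# Assumption: a mixed-dtype frame, so each yielded row is a fresh object Series.
df = pd.DataFrame({'a': [1, 2, 3], 'b': ['x', 'y', 'z']})

# Writing to the row yielded by iterrows only changes the copy.
for idx, row in df.iterrows():
    row['a'] = row['a'] * 10
unchanged = df['a'].tolist()

# Writing through the DataFrame's own indexer changes the frame itself.
for idx in df.index:
    df.at[idx, 'a'] = df.at[idx, 'a'] * 10
changed = df['a'].tolist()
```

After the first loop the column a still holds its original values; only the second loop updates it.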
