Replacing duplicates pandas to_sql (sqlite)

Date: 2015-07-08 15:57:26

Tags: sqlite pandas

I am appending pandas dataframes to sqlite. My primary key is:

Datetime | UserID | CustomerID

My issue is that sometimes I get a new file containing old data that I want to append to the existing sqlite table. I am not reading that table into memory, so I can't use drop_duplicates in pandas. (For example, one file is always month-to-date data, and it is sent to me every day.)

How can I ensure that I only append rows that are unique on my primary key? Does pandas to_sql offer an insert-or-replace option when I append the new data?

Also, should I specify dtypes in pandas before writing to SQL? I got error messages when I tried to write to SQLite while some columns had categorical dtypes.

1 Answer:

Answer 0: (score: 0)

If you try to insert duplicate data, you will get a sqlite3.IntegrityError exception. You can catch it and do nothing, for example:

import sqlite3
import pandas as pd

try:
    df.to_sql('t', conn, if_exists='append', index=False)
except sqlite3.IntegrityError:
    pass
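A minimal, self-contained sketch of this catch-and-skip approach (the table schema and column values are hypothetical, chosen to match the composite primary key described in the question). One caveat worth knowing: `to_sql` appends each chunk inside a single transaction, so a duplicate row rolls back that entire append, not just the offending row.

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")

# Hypothetical table matching the composite primary key in the question.
conn.execute(
    "CREATE TABLE t ("
    "  Datetime TEXT, UserID TEXT, CustomerID TEXT, Value REAL,"
    "  PRIMARY KEY (Datetime, UserID, CustomerID))"
)

df = pd.DataFrame({
    "Datetime": ["2015-07-01", "2015-07-02"],
    "UserID": ["u1", "u1"],
    "CustomerID": ["c1", "c1"],
    "Value": [1.0, 2.0],
})

# First append succeeds: no key conflicts yet.
df.to_sql("t", conn, if_exists="append", index=False)

# Re-sending the same month-to-date file violates the primary key;
# catch the IntegrityError and skip the whole duplicate batch.
try:
    df.to_sql("t", conn, if_exists="append", index=False)
except sqlite3.IntegrityError:
    pass  # rows already present, nothing to do

print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])
```

Because the whole batch is rolled back on one conflict, this pattern only works cleanly when an incoming file is either entirely new or entirely duplicate. If a file can mix new and old rows, SQLite's own `INSERT OR IGNORE` (applied via a staging table and a raw SQL statement) is a common workaround, since pandas' `to_sql` itself has no insert-or-replace mode.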