Replacing duplicates pandas to_sql (sqlite)

Date: 2015-07-08 15:57:26

Tags: sqlite pandas

I am appending pandas dataframes to sqlite. My primary key is:

Datetime | UserID | CustomerID

My issue is that sometimes I get a new file containing old data that I want to append to the existing sqlite table. I am not reading that table into memory, so I can't use drop_duplicates in pandas. (For example, one file is always month-to-date data, and it is sent to me every day.)

How can I ensure that I only append rows that are unique on my primary key? Does pandas to_sql offer an insert-or-replace option when I append the new data?

Also, should I specify dtypes in pandas before writing to SQL? I got error messages when I tried to write to SQLite while some columns had categorical dtypes.

1 Answer:

Answer 0: (score: 0)

If you try to insert duplicate data, you will get a sqlite3.IntegrityError exception. You can catch it and do nothing, for example:

import sqlite3
import pandas as pd

try:
    df.to_sql('t', conn, if_exists='append', index=False)
except sqlite3.IntegrityError:
    pass
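A minimal, self-contained sketch of this catch-and-skip approach (the table schema and column values are hypothetical, chosen to match the composite primary key described in the question). One caveat worth knowing: `to_sql` appends each chunk inside a single transaction, so a duplicate row rolls back that entire append, not just the offending row.

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")

# Hypothetical table matching the composite primary key in the question.
conn.execute(
    "CREATE TABLE t ("
    "  Datetime TEXT, UserID TEXT, CustomerID TEXT, Value REAL,"
    "  PRIMARY KEY (Datetime, UserID, CustomerID))"
)

df = pd.DataFrame({
    "Datetime": ["2015-07-01", "2015-07-02"],
    "UserID": ["u1", "u1"],
    "CustomerID": ["c1", "c1"],
    "Value": [1.0, 2.0],
})

# First append succeeds: no key conflicts yet.
df.to_sql("t", conn, if_exists="append", index=False)

# Re-sending the same month-to-date file violates the primary key;
# catch the IntegrityError and skip the whole duplicate batch.
try:
    df.to_sql("t", conn, if_exists="append", index=False)
except sqlite3.IntegrityError:
    pass  # rows already present, nothing to do

print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])
```

Because the whole batch is rolled back on one conflict, this pattern only works cleanly when an incoming file is either entirely new or entirely duplicate. If a file can mix new and old rows, SQLite's own `INSERT OR IGNORE` (applied via a staging table and a raw SQL statement) is a common workaround, since pandas' `to_sql` itself has no insert-or-replace mode.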