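I have 11 .json files (2006.json, 2007.json, ..., 2016.json) stored in a directory named /arbetsformedlingen. I want to convert each of these 11 .json files into its own .db file (2006.db, 2007.db, ..., 2016.db). I am using a Jupyter (IPython) notebook with Python 3.6 (py36). The following code reads all the .json files and converts them into a single file named arbetsformedlingen.db (which is a large file).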
import pandas as pd
import sqlite3
import uuid

conn = sqlite3.connect('/Users/mo/PBL/arbetsformedlingen/arbetsformedlingen.db')
conn.execute('drop table if exists stillinger')

for year in range(2006, 2017, 1):
    file = '/Users/mo/PBL/arbetsformedlingen/' + str(year) + '.json'
    df = pd.read_json(file, lines=True)
    guids = []
    for i in range(0, len(df)):
        guids.append(str(uuid.uuid4()))
    guids_s = pd.Series(guids)
    df.insert(0, 'ID', guids_s)
    df.to_sql("stillinger", conn, if_exists="append", index=False, chunksize=1000)

sql = """
select * from stillinger limit 1
"""
res = pd.read_sql(sql, conn); res
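This works well if I want to store everything in a single .db file. Any suggestions on how to create 11 .db files instead of one? Perhaps with a simple modification of the code, or with a more efficient way of converting them?
Would this be a reasonable approach?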
for year in range(2006, 2017, 1):
    for file in year:
        conn = sqlite3.connect('/Users/mo/PBL/arbetsformedlingen/' + str(year) + '.db')
        conn.execute('drop table if exists stillinger')
        file = '/Users/mo/PBL/arbetsformedlingen/' + str(year) + '.json'
        df = pd.read_json(file, lines=True)
        guids = []
        for i in range(0, len(df)):
            guids.append(str(uuid.uuid4()))
        guids_s = pd.Series(guids)
        df.insert(0, 'ID', guids_s)
        df.to_sql("stillinger", conn, if_exists="append", index=False, chunksize=1000)

sql = """
select * from stillinger limit 1
"""
res = pd.read_sql(sql, conn); res
Best regards, Mo
Answer 0 (score: 0)
Thanks to @Christian Stade-Schuldt for the hint! This solved my problem:
import sqlite3
import uuid
import pandas as pd

for year in range(2006, 2017, 1):
    # Open (or create) a separate database file for each year.
    conn = sqlite3.connect('/Users/mo/PBL/arbetsformedlingen/' + str(year) + '.db')
    conn.execute('drop table if exists stillinger')
    file = '/Users/mo/PBL/arbetsformedlingen/' + str(year) + '.json'
    df = pd.read_json(file, lines=True)
    # Generate one UUID per row and insert it as the first column.
    guids = []
    for i in range(0, len(df)):
        guids.append(str(uuid.uuid4()))
    guids_s = pd.Series(guids)
    df.insert(0, 'ID', guids_s)
    df.to_sql("stillinger", conn, if_exists="append", index=False, chunksize=1000)
    # Close this year's connection before moving on to the next file.
    conn.close()
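Since the question also asked about a tidier or more efficient conversion, here is a minimal sketch of the same per-year loop wrapped in a function. The function name `convert_year` and its parameters are hypothetical, not part of the original code. It assumes each file is line-delimited JSON with flat records, as in the code above. It swaps the explicit `drop table` for `if_exists="replace"` and builds the UUID column with a list comprehension.

```python
import sqlite3
import uuid
from contextlib import closing
from pathlib import Path

import pandas as pd


def convert_year(base_dir, year, table="stillinger"):
    """Convert <base_dir>/<year>.json into <base_dir>/<year>.db.

    Hypothetical helper for illustration; returns the number of rows written.
    """
    base_dir = Path(base_dir)
    json_path = base_dir / f"{year}.json"
    db_path = base_dir / f"{year}.db"

    # One JSON object per line, as in the original code.
    df = pd.read_json(json_path, lines=True)

    # One random UUID per row, built with a list comprehension
    # instead of an explicit append loop.
    df.insert(0, "ID", [str(uuid.uuid4()) for _ in range(len(df))])

    # closing() ensures the connection is closed even if to_sql raises.
    with closing(sqlite3.connect(str(db_path))) as conn:
        # if_exists="replace" drops and recreates the table, so no
        # separate DROP TABLE statement is needed.
        df.to_sql(table, conn, if_exists="replace", index=False, chunksize=1000)

    return len(df)
```

With this in place, the loop from the answer reduces to calling `convert_year('/Users/mo/PBL/arbetsformedlingen', year)` for each `year` in `range(2006, 2017)`. Keeping one connection per file, and closing it before the next one is opened, is what produces the 11 separate .db files.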