I am working in a Jupyter notebook on AWS SageMaker. I have performed text processing on a dataset with 5000 rows, and I want to write the result to a SQLite database using the following code:
import sqlite3

conn = sqlite3.connect('final_2.sqlite')
c = conn.cursor()
conn.text_factory = str
final.to_sql('Reviews', conn, schema=None, if_exists='replace')
It writes about 2.09 GB and then stops running. When I open the file, it is not recognized as a valid file. I then tried writing to a .csv file instead, but the same problem occurred. When I download and open the csv, I get the following error:
Error! Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/tornado/web.py", line 1699, in _execute
    result = await result
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/tornado/gen.py", line 209, in wrapper
    yielded = next(result)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/notebook/services/contents/handlers.py", line 112, in get
    path=path, type=type, format=format, content=content,
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/notebook/services/contents/filemanager.py", line 438, in get
    model = self._file_model(path, content=content, format=format)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/notebook/services/contents/filemanager.py", line 365, in _file_model
    content, format = self._read_file(os_path, format)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/notebook/services/contents/fileio.py", line 309, in _read_file
    bcontent = f.read()
MemoryError

Saving disabled.
See Console for more details.
I checked my available disk space from Python, and there are still about 30 GB free.
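(For reference, a minimal sketch of how free space can be checked from Python using only the standard library; the path argument here is just illustrative:)

import shutil

# disk_usage returns total/used/free in bytes for the filesystem containing the given path
total, used, free = shutil.disk_usage('.')
print("Free: {:.1f} GB of {:.1f} GB".format(free / 1e9, total / 1e9))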
Could someone tell me what the problem is here? Thanks!
Answer 0 (score: 0)
This exact problem happened to me, and I solved it by increasing the RAM of the instance.
The problem occurs because the to_sql command tries to convert the entire dataframe into SQL inserts in one go, and at some point it runs out of memory.
The workaround is to load the data in batches, like this:
batch_size = 10000
# write the dataframe in slices of 10,000 rows, appending each batch to the table
for i in range(0, len(final), batch_size):
    final.iloc[i:i + batch_size].to_sql('Reviews', con=conn, schema=None, if_exists='append')
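As a side note, pandas can also do this batching for you: to_sql takes a chunksize argument that writes the rows in batches internally, so the loop above can be replaced with a single call (a sketch, assuming the same final dataframe and conn connection as before):

# Equivalent approach: let pandas insert 10,000 rows per batch instead of the whole frame at once
final.to_sql('Reviews', con=conn, schema=None, if_exists='replace', chunksize=10000)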