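First off, I'm sorry if this isn't the right structure for a question. I'm not sure where to start or stop, so I've tried to give you as much information as possible.
I'm working on an AWS m3.large with py2neo 2.0.4 and neo4j-community-2.1.7.
I'm trying to import a large dataset into neo4j using py2neo. My problem is that after I've read in roughly 150k entries, it fails with: py2neo.packages.httpstream.http.SocketError: timed out
I need to insert many millions of entries, so 150k really ought to work.
The whole error: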
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/py2neo/packages/httpstream/http.py", line 322, in submit
response = send()
File "/usr/local/lib/python3.4/dist-packages/py2neo/packages/httpstream/http.py", line 318, in send
return http.getresponse(**getresponse_args)
File "/usr/lib/python3.4/http/client.py", line 1147, in getresponse
response.begin()
File "/usr/lib/python3.4/http/client.py", line 351, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.4/http/client.py", line 313, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.4/socket.py", line 371, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/py2neo/packages/httpstream/http.py", line 331, in submit
response = send("timeout")
File "/usr/local/lib/python3.4/dist-packages/py2neo/packages/httpstream/http.py", line 318, in send
return http.getresponse(**getresponse_args)
File "/usr/lib/python3.4/http/client.py", line 1147, in getresponse
response.begin()
File "/usr/lib/python3.4/http/client.py", line 351, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.4/http/client.py", line 313, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.4/socket.py", line 371, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "transactions.py", line 221, in <module>
read_zip("data")
File "transactions.py", line 44, in read_zip
create_tweets(lines)
File "transactions.py", line 215, in create_tweets
tx.process()
File "/usr/local/lib/python3.4/dist-packages/py2neo/cypher/core.py", line 296, in process
return self.post(self.__execute or self.__begin)
File "/usr/local/lib/python3.4/dist-packages/py2neo/cypher/core.py", line 248, in post
rs = resource.post({"statements": self.statements})
File "/usr/local/lib/python3.4/dist-packages/py2neo/core.py", line 322, in post
response = self.__base.post(body, headers, **kwargs)
File "/usr/local/lib/python3.4/dist-packages/py2neo/packages/httpstream/http.py", line 984, in post
return rq.submit(**kwargs)
File "/usr/local/lib/python3.4/dist-packages/py2neo/packages/httpstream/http.py", line 433, in submit
http, rs = submit(self.method, uri, self.body, self.headers)
File "/usr/local/lib/python3.4/dist-packages/py2neo/packages/httpstream/http.py", line 362, in submit
raise SocketError(code, description, host_port=uri.host_port)
py2neo.packages.httpstream.http.SocketError: timed out
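Right now I use Cypher. I write in batches of ~1000, but smaller batches don't work either. My question is: can I use something else to make this faster?
At the moment, I do it like this: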
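from py2neo import Graph

graph = Graph()  # py2neo 2.0 default: http://localhost:7474/db/data/
# ON CREATE SET is only valid after MERGE; {N} is the parameter passed in below
statement = "MERGE (p:person {id: {N}}) ON CREATE SET p.age = 132"

def add_names(names):
    # one transaction per batch of 1000 names
    for start in range(0, len(names), 1000):
        tx = graph.cypher.begin()
        for name in names[start:start + 1000]:
            tx.append(statement, {"N": name})
        tx.process()
        tx.commit()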
But would it be better to use execute or stream, or is there something else I can do to make this work?
Answer 0 (score: 6)
Try adding
from py2neo.packages.httpstream import http
http.socket_timeout = 9999
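Raising the timeout only gives each request more time. It may also help to keep each request small by committing one transaction per batch. Below is a minimal sketch of that idea, not a definitive implementation. The helper names chunked and import_in_batches are hypothetical, and the graph argument is assumed to expose the py2neo 2.0 style API from the question (graph.cypher.begin() returning an object with append() and commit()).

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def import_in_batches(graph, statement, rows, batch_size=1000):
    """Run `statement` once per parameter dict in `rows`.

    Hypothetical helper: `graph` is assumed to behave like a py2neo 2.0
    Graph, i.e. graph.cypher.begin() returns a transaction object with
    append(statement, parameters) and commit(). Returns the row count.
    """
    sent = 0
    for batch in chunked(rows, batch_size):
        tx = graph.cypher.begin()
        for params in batch:
            tx.append(statement, params)
        tx.commit()  # send the queued statements and commit this batch
        sent += len(batch)
    return sent
```

With py2neo 2.0.x this would presumably be called as import_in_batches(Graph(), statement, ({"N": name} for name in names)), after setting http.socket_timeout as shown above. Because rows can be any iterable, the input can be read lazily from a file instead of being held in memory as one list.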