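I am writing a crawler in Python that clones git repositories and analyzes them. I use subprocess.call() to clone a given repository. The problem is that after only a few repositories I get an "OSError: [Errno 12] Cannot allocate memory":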
      File "main.py", line 44, in main
        call(["git", "clone", remote_url.strip(), os.getcwd() + '/' + DIR_NAME])
      File "/usr/lib/python2.7/subprocess.py", line 522, in call
        return Popen(*popenargs, **kwargs).wait()
      File "/usr/lib/python2.7/subprocess.py", line 709, in __init__
        errread, errwrite)
      File "/usr/lib/python2.7/subprocess.py", line 1222, in _execute_child
        self.pid = os.fork()
    OSError: [Errno 12] Cannot allocate memory
I have also tried the sh module and GitPython. How can I avoid this problem?
My code is as follows:
    for remote_url in remote_urls:
        try:
            if os.path.isdir(os.getcwd() + '/' + DIR_NAME):
                shutil.rmtree(os.getcwd() + '/' + DIR_NAME)
            os.mkdir(DIR_NAME)
            # repo_url = remote_url.replace('ssh://', '')
            call(["git", "clone", remote_url.strip(), os.getcwd() + '/' + DIR_NAME])
            # with sh.git.bake(_cwd=os.getcwd() + '/' + DIR_NAME) as git:
            #     git.clone(remote_url.strip())
            print 'Pulled # ' + str(repo_count) + ' repos'
        except:
            traceback.print_exc()
            continue
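
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same loop. The names clone_all, work_dir, and the run parameter are made up for illustration and are not in my actual script. It swaps subprocess.call() for subprocess.check_call() so that a failed clone raises, and it leaves out os.mkdir() because git clone creates the target directory itself. It only isolates the clone step and does not by itself address the memory error.

```python
import os
import shutil
import subprocess


def clone_all(remote_urls, work_dir, run=subprocess.check_call):
    """Clone each URL into a fresh work_dir and return how many clones succeeded.

    `run` is a hypothetical injection point (not in the original script) so the
    loop can be exercised without actually invoking git.
    """
    cloned = 0
    for remote_url in remote_urls:
        url = remote_url.strip()
        if not url:
            # Skip blank lines in the URL list.
            continue
        # Remove whatever the previous clone left behind.
        if os.path.isdir(work_dir):
            shutil.rmtree(work_dir)
        try:
            run(["git", "clone", url, work_dir])
        except (OSError, subprocess.CalledProcessError) as exc:
            # OSError covers a failed fork(); CalledProcessError covers
            # git exiting with a non-zero status.
            print("clone failed for %s: %s" % (url, exc))
            continue
        cloned += 1
    return cloned
```

Calling clone_all(remote_urls, os.path.join(os.getcwd(), DIR_NAME)) would stand in for the loop above, with the analysis step still to be added after each successful clone.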