I'm getting a very long error message when I try to pickle and dump my data. The main failure is a RuntimeError:
Error
<previous messages cut>
File "/opt/conda/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "/opt/conda/lib/python2.7/pickle.py", line 425, in save_reduce
save(state)
File "/opt/conda/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/opt/conda/lib/python2.7/pickle.py", line 655, in save_dict
self._batch_setitems(obj.iteritems())
File "/opt/conda/lib/python2.7/pickle.py", line 669, in _batch_setitems
save(v)
File "/opt/conda/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "/opt/conda/lib/python2.7/pickle.py", line 425, in save_reduce
save(state)
File "/opt/conda/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/opt/conda/lib/python2.7/pickle.py", line 655, in save_dict
self._batch_setitems(obj.iteritems())
File "/opt/conda/lib/python2.7/pickle.py", line 669, in _batch_setitems
save(v)
File "/opt/conda/lib/python2.7/pickle.py", line 306, in save
rv = reduce(self.proto)
File "/opt/conda/lib/python2.7/copy_reg.py", line 74, in _reduce_ex
getstate = self.__getstate__
RuntimeError: maximum recursion depth exceeded while calling a Python object
The main purpose of my code is to scrape information from a number of web pages, dump it, and then load it back with pickle for further analysis. The goal is to avoid redoing all of the scraping from scratch every time I run the analysis, so pickle sounded like a great tool for the job. From the traceback, pickle appears to keep recursing through save_dict/save as it walks the objects until it hits the recursion limit.
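For reference, this is the basic dump/load round-trip I'm trying to follow (a toy example with a plain dict; the toy version works, so the failure seems specific to the BeautifulSoup objects):

import pickle

toy = {"url": "http://example.com", "names": ["Alice", "Bob"]}

## dump to disk in binary mode, then load it back
with open("toy.pkl", 'wb') as f:
    pickle.dump(toy, f)

with open("toy.pkl", 'rb') as f:
    restored = pickle.load(f)

print(toy == restored)  # True for plain built-in containers

Here is my code: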
Main code
import urllib
import re
from bs4 import BeautifulSoup

## Index pages 3-26 of the party-pictures listing
Links = ["http://www.newyorksocialdiary.com/party-pictures?page=" + str(i) for i in range(3, 27)]
Rs = [urllib.urlopen(Link).read() for Link in Links]
soups = [BeautifulSoup(R, "html.parser") for R in Rs]

## Collect every href that points at an individual party page
pattern = re.compile(r"/party-pictures/")
parties = [[a["href"] for a in soup.find_all('a', href=pattern)] for soup in soups]
flattened_parties = [href for party in parties for href in party]

## Make new soups with all of the 1193 links
NewLinks = ["http://www.newyorksocialdiary.com" + path for path in flattened_parties]
NewRs = [urllib.urlopen(NewLink).read() for NewLink in NewLinks]
NewSoups = [BeautifulSoup(NewR, "html.parser") for NewR in NewRs]
Pickle / Unpickle
import pickle

a = NewSoups  # the object I want to persist
file_Name = "soupsfile"
fileObject = open(file_Name, 'wb')
pickle.dump(a, fileObject)  # this is the line that raises the RuntimeError
fileObject.close()

## open the file for reading (binary, to match the 'wb' above)
fileObject = open(file_Name, 'rb')
# load the object from the file into var b
b = pickle.load(fileObject)
fileObject.close()
print(a == b)
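One idea I've been toying with (an untested sketch that reuses NewRs and re-parses with BeautifulSoup) is to pickle the raw HTML strings instead of the soup objects, since plain strings should pickle without any deep recursion, and to rebuild the soups after loading:

import pickle
from bs4 import BeautifulSoup

## Untested idea: strings pickle trivially, unlike deeply nested Tag objects
with open("htmlfile", 'wb') as f:
    pickle.dump(NewRs, f)

with open("htmlfile", 'rb') as f:
    saved_html = pickle.load(f)

## Re-parsing is cheap compared with re-downloading 1193 pages
NewSoups = [BeautifulSoup(html, "html.parser") for html in saved_html]

I'm not sure whether that is the right approach, though, which brings me to my question.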
Could someone help me understand the best strategy for storing and retrieving this information with pickle (or another module)? Thank you!!