I've recently been teaching myself web scraping, and I ran into an error while trying to make my code modular. The error I get is:
  File "ValueScraper.py", line 16, in <module>
    table = ts.lol_table(soup)
TypeError: 'list' object is not callable
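I'm trying to write a new program that reuses the functions from my original program, but I can't figure out why it fails. Here is the old program, which defines all of the functions: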
from bs4 import BeautifulSoup
from urllib2 import urlopen
import csv
import pandas
#url of Basketball Reference page
url = "http://www.basketball-reference.com/leagues/NBA_2017_totals.html"
#set columns for pandas dataframe
header = ["Player", "Pos", "Age", "Tm", "G", "GS", "MP", "FG", "FGA", "FG%", "3P", "3PA", "3P%", "2P", "2PA", "2P%", "eFG%", "FT", "FTA", "FT%", "ORB", "DRB", "TRB", "AST", "STL", "BLK", "TOV", "PF", "PTS"]
#open url and turn to BS object
def make_soup(url):
    html = urlopen(url).read()
    soup = BeautifulSoup(html, "lxml")
    return soup
#returns list of lists (lol_table) of html table
def lol_table(soup, class_name = ""):
    rows = []
    #compare strings with ==, not is (is tests object identity)
    if class_name == "":
        rows = soup.find_all('tr')
    else:
        rows = soup.find_all('tr', class_ = class_name)
    data = []
    for row in rows:
        cols = row.find_all('td')
        data_row = []
        for col in cols:
            data_row.append(col.find(text=True))
        data.append(data_row)
    return data
#create pandas dataframe from lol_table and create csv of it
def to_pandas_csv(lol_table):
    df = pandas.DataFrame(lol_table, columns=header)
    df.to_csv("nba.csv")
    return df
soup = make_soup(url)
lol_table = lol_table(soup, "full_table")
data_frame = to_pandas_csv(lol_table)
Here is the new file:
from bs4 import BeautifulSoup
from urllib2 import urlopen
import csv
import pandas
import TableScraper as ts
#url of table with values
url = "http://www.rotowire.com/daily/NBA/optimizer.php?site=FanDuel"
#columns for table
columns = ["Player", "Value"]
#make soup of url
soup = ts.make_soup(url)
table = ts.lol_table(soup)
Any help would be greatly appreciated.
Answer 0 (score: 4)
lol_table = lol_table(soup, "full_table")
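Don't rebind the function's name to the result of calling it. Python does not distinguish between names for function objects and names for other objects: a scope can hold only one object under a given name, and the most recent binding wins. Because importing TableScraper runs that module's top-level code, ts.lol_table already refers to a list by the time the new file calls it. Choose a different name for the result: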
result = lol_table(soup, "full_table")
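To see the rebinding in isolation, here is a minimal sketch that reproduces the error without any network access. The lol_table below is a hypothetical stand-in for the scraper function (it only copies rows into a list of lists), and the names rows, original_function, and error_message are made up for illustration:

```python
def lol_table(rows):
    """Hypothetical stand-in for the scraper's lol_table: returns a list of lists."""
    return [list(row) for row in rows]


rows = [("a", 1), ("b", 2)]

# Keep a second reference to the function so it can be restored after the demo.
original_function = lol_table

# The problem: after this line, the name lol_table refers to a list, not a function.
lol_table = lol_table(rows)

try:
    lol_table(rows)  # calling a list raises TypeError
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The fix: leave the function's name alone and bind the result to another name.
lol_table = original_function
result = lol_table(rows)
second_result = lol_table(rows)  # the function can still be called again
```

Renaming the result fixes the TypeError, but importing TableScraper would still run its top-level scrape and write nba.csv. Moving those three top-level calls under an if __name__ == "__main__": guard is one common way to avoid that, assuming the old file should remain runnable as a script.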