Trying to make a CSV file from data scraped in Python

Time: 2017-06-18 20:24:57

Tags: python export export-to-csv

I'm new to Python and to coding of any kind... I hope this isn't too simple a question.

I'm trying to make a CSV file from data scraped from the web.

AttributeError: 'Doctype' object has no attribute 'find_all'

But this error won't go away!

Here is the whole code:

import bs4 as bs
import urllib.request


req = urllib.request.Request('http://www.mobygames.com/game/tom-clancys-rainbow-six-siege',headers={'User-Agent': 'Mozilla/5.0'})

sauce = urllib.request.urlopen(req).read()

soup = bs.BeautifulSoup(sauce,'lxml')

scores = soup.find_all("div")

filename = "scores1.csv"
f = open(filename, "w")

headers = "Hi, Med, Low\n"

f.write(headers)

for scores in soup:
    scoreHi = scores.find_all("div", {"class":"scoreHi"})
    Hi = scoreHi[0].text
    scoreMed = scores.find_all("div", {"class":"scoreMed"})
    Med = scoreMed[0].text
    scoreLow = scores.find_all("div", {"class":"scoreLow"})
    Low = scoreLow[0].text

    print ("Hi: " + Hi)

    print ("Med: " + Med)

    print ("Low: "+ Low)

    f.write(Hi + "," + Med.replace(",","|") + "," + Low + "\n")


f.close() 

1 Answer:

Answer 0 (score: 0)

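First you assign to scores:

scores = soup.find_all("div")

That is fine, but then you should walk through those scores:

for score in scores:
    scoreHi = score.find_all("div", {"class":"scoreHi"})
    Hi = scoreHi[0].text
    scoreMed = score.find_all("div", {"class":"scoreMed"})
    Med = scoreMed[0].text
    scoreLow = score.find_all("div", {"class":"scoreLow"})
    Low = scoreLow[0].text

Trying to iterate over the document itself (i.e. for scores in soup) makes no sense. The top-level nodes of soup include the Doctype, which has no find_all method, and that is where the AttributeError comes from.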
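To make the point concrete, here is a minimal sketch of the whole flow that runs without a network connection. The sample markup, the helper names (first_text, extract_scores, write_scores), and the assumption that each score sits in its own div with class scoreHi, scoreMed, or scoreLow are all hypothetical; the real page layout may differ. It uses the built-in "html.parser" instead of 'lxml' only to avoid an extra dependency, and the csv module instead of hand-built strings so that commas inside a value get quoted automatically.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """<!DOCTYPE html>
<html><body>
<div class="scoreHi">85</div>
<div class="scoreMed">70</div>
<div class="scoreLow">40</div>
</body></html>"""


def first_text(soup, css_class):
    """Return the stripped text of the first div with this class, or ''."""
    tag = soup.find("div", class_=css_class)
    return tag.get_text(strip=True) if tag is not None else ""


def extract_scores(html):
    """Parse the HTML and return [Hi, Med, Low] as strings."""
    soup = BeautifulSoup(html, "html.parser")
    return [first_text(soup, name) for name in ("scoreHi", "scoreMed", "scoreLow")]


def write_scores(rows, handle):
    """Write a header line plus one CSV line per row to an open file object."""
    writer = csv.writer(handle)
    writer.writerow(["Hi", "Med", "Low"])
    writer.writerows(rows)


# Iterating over the soup itself yields top-level nodes, starting with the
# Doctype, which is why `for scores in soup` fails on find_all.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
top_level_types = [type(node).__name__ for node in soup]

row = extract_scores(SAMPLE_HTML)
buffer = io.StringIO()  # stands in for open("scores1.csv", "w", newline="")
write_scores([row], buffer)
print(top_level_types)
print(buffer.getvalue())
```

To write a real file, pass open("scores1.csv", "w", newline="") in a with block instead of the StringIO buffer. Returning an empty string when a div is missing avoids the IndexError that scoreHi[0] would raise on a page without that class.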