So this is the script I'm running. It outputs fine on Windows, but on Ubuntu it just prints an empty list.
import urllib2
import os
import re
import csv
from bs4 import BeautifulSoup
useragent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1309.0 Safari/537.17'
def main():
    # lib-talkingpointsmemo.py
    archive = 'http://talkingpointsmemo.com/archive.php'
    getweeklinks(archive)

def getweeklinks(archivelink):
    print 'something'
    urls = []
    request = urllib2.Request(archivelink, headers={'User-agent': useragent})
    webpage = urllib2.urlopen(request).read()
    soup = BeautifulSoup(webpage)
    anchors = soup('a')
    print anchors
    for a in anchors:
        print a['href']

if __name__ == '__main__': main()
And the output:
something
[]
What's going wrong? I'm using Ubuntu 12.04.1 LTS.
Answer 0 (score: 3)
soup = BeautifulSoup(webpage,"html.parser")
...to make sure the same parser is used in both the Windows and Ubuntu tests. When you don't name a parser, BeautifulSoup 4 picks the "best" one installed on that machine, which can differ between systems and produce different results. You may also want to try some of the other parser options.
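As a rough illustration of that suggestion (the comparison loop and the parser list are just an example, not part of the original answer), you could fetch the page once and see how many anchors each installed parser finds:

import urllib2
from bs4 import BeautifulSoup

useragent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1309.0 Safari/537.17'
archive = 'http://talkingpointsmemo.com/archive.php'

request = urllib2.Request(archive, headers={'User-agent': useragent})
webpage = urllib2.urlopen(request).read()

# "html.parser" ships with the standard library; "lxml" and "html5lib"
# are only available if those packages are installed (e.g. pip install lxml html5lib).
for parser in ('html.parser', 'lxml', 'html5lib'):
    try:
        soup = BeautifulSoup(webpage, parser)
    except Exception as e:
        print parser, 'not available:', e
        continue
    print parser, 'found', len(soup('a')), 'anchors'

If the anchor counts differ between parsers, that confirms the empty list on Ubuntu comes from the default parser choice rather than from the request itself.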