Why doesn't my def function work in Python?

Asked: 2019-11-22 19:09:35

Tags: python beautifulsoup

I am trying to save some data from a table to a CSV file.

import requests
import csv
from bs4 import BeautifulSoup

#Main function
def getContent(link):
    #Request content
    result1 = requests.get(link)

    #Save source in var
    src1 = result1.content

    #Activate soup
    soup = BeautifulSoup(src1,'lxml')

    #Look for table
    table = soup.find('table')

    #Save in csv
    with open('averageheight.csv','w',newline='') as f:
        writer = csv.writer(f)
        for tr in table('tr'):
            row = [t.get_text(strip=True)for t in tr(['td','th'])]
            writer.writerow(row)


#LINKS
getContent('https://en.wikipedia.org/wiki/Average_human_height_by_country')

The error I get:

  File "c:/Users/Agent 1/Desktop/Datapackages/Average Height/process.py", line 31, in <module>
    getContent('https://en.wikipedia.org/wiki/Average_human_height_by_country')
  File "c:/Users/Agent 1/Desktop/Datapackages/Average Height/process.py", line 27, in getContent
    writer.writerow(row)
  File "C:\Users\Agent 1\AppData\Local\Programs\Python\Python38-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2044' in position 24: character maps to <undefined>

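2 answers:

Answer 0 (score: 3)

I ran your code on my machine and got no errors. However, you may want to pass encoding='utf-8' to open(...). Without it, open() falls back to the platform's default encoding, which on your machine is cp1252 (note the cp1252.py frame in your traceback), and cp1252 has no mapping for the character '\u2044'.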

import requests
import csv
from bs4 import BeautifulSoup

#Main function
def getContent(link):
    #Request content
    result1 = requests.get(link)

    #Save source in var
    src1 = result1.content

    #Activate soup
    soup = BeautifulSoup(src1,'lxml')

    #Look for table
    table = soup.find('table')

    #Save in csv
    with open('averageheight.csv','w',newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        for tr in table('tr'):
            row = [t.get_text(strip=True)for t in tr(['td','th'])]
            writer.writerow(row)


#LINKS
getContent('https://en.wikipedia.org/wiki/Average_human_height_by_country')
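
To see why the encoding argument matters, here is a minimal sketch using only the standard library. It checks whether the character from the error message, '\u2044' (FRACTION SLASH), can be encoded by cp1252 and by UTF-8. The helper name can_encode is made up for this illustration and is not part of the original code.

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


fraction_slash = '\u2044'  # the character named in the traceback

cp1252_ok = can_encode(fraction_slash, 'cp1252')
utf8_ok = can_encode(fraction_slash, 'utf-8')

print('cp1252 can encode it:', cp1252_ok)
print('utf-8 can encode it:', utf8_ok)
```

If cp1252 reports False here, that is consistent with the UnicodeEncodeError in the question, while UTF-8 can represent any Unicode character.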

Answer 1 (score: 2)

Encode the text as UTF-8. Use the modified line below:

row = [(t.get_text(strip=True)).encode('utf-8') for t in tr(['td','th'])]
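
Before adopting this on Python 3 (the traceback in the question shows Python38), it may be worth checking what csv.writer does with bytes cells. The sketch below writes one row to an in-memory buffer in both ways. The sample cell value is invented for the test, and io.StringIO stands in for the real file. As far as I know, csv.writer converts non-string cells with str(), so the bytes variant likely ends up as a b'...' literal in the output rather than the original text.

```python
import csv
import io

cell = '1\u20442'  # invented sample text containing the fraction slash

# Variant from this answer: encode each cell to bytes before writing.
buf_bytes = io.StringIO()
csv.writer(buf_bytes).writerow([cell.encode('utf-8')])
bytes_line = buf_bytes.getvalue().strip()

# Variant that keeps the cell as str and leaves encoding to the file object.
buf_text = io.StringIO()
csv.writer(buf_text).writerow([cell])
text_line = buf_text.getvalue().strip()

print(bytes_line)
print(text_line)
```

If the first line printed starts with b', the CSV would contain escaped byte literals, in which case keeping str cells and opening the file with encoding='utf-8' is probably the cleaner route.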