Question

TypeError: a bytes-like object is required, not 'str' in Python and CSV

I get the above error when running the Python code below, which saves HTML table data to a CSV file. I don't know how to get rid of it. Please help me.

import csv
import requests
from bs4 import BeautifulSoup

url='http://www.mapsofindia.com/districts-india/'
response=requests.get(url)
html=response.content

soup=BeautifulSoup(html,'html.parser')
table=soup.find('table', attrs={'class':'tableizer-table'})
list_of_rows=[]
for row in table.findAll('tr')[1:]:
    list_of_cells=[]
    for cell in row.findAll('td'):
        list_of_cells.append(cell.text)
    list_of_rows.append(list_of_cells)
outfile=open('./immates.csv','wb')
writer=csv.writer(outfile)
writer.writerow(["SNo", "States", "Dist", "Population"])
writer.writerows(list_of_rows)

The error is raised on the line just above the last one.

Answer 1

You are using the Python 2 approach instead of the Python 3 one.

Change:

outfile=open('./immates.csv','wb')

to:

outfile=open('./immates.csv','w')

You will then get a file with the following output:

SNo,States,Dist,Population
1,Andhra Pradesh,13,49378776
2,Arunachal Pradesh,16,1382611
3,Assam,27,31169272
4,Bihar,38,103804637
5,Chhattisgarh,19,25540196
6,Goa,2,1457723
7,Gujarat,26,60383628
.....

In Python 3, the csv module expects a file opened in text mode, whereas in Python 2 it expected one opened in binary mode.
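To illustrate the difference, here is a minimal sketch that writes the same header in text mode and then tries binary mode. The helper names write_rows and try_binary_write and the temporary file demo.csv are assumptions made for this example, not part of the original code.

```python
import csv
import os
import tempfile


def write_rows(path, header, rows):
    # Text mode ('w'); newline='' lets the csv module control line endings.
    with open(path, 'w', newline='', encoding='utf-8') as outfile:
        writer = csv.writer(outfile)
        writer.writerow(header)
        writer.writerows(rows)


def try_binary_write(path, header):
    # Returns the TypeError message that binary mode ('wb') produces in Python 3,
    # or None if no error was raised.
    try:
        with open(path, 'wb') as outfile:
            csv.writer(outfile).writerow(header)
    except TypeError as exc:
        return str(exc)
    return None


# Hypothetical demo file in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.csv')
write_rows(demo_path, ["SNo", "States"], [["1", "Andhra Pradesh"]])
binary_error = try_binary_write(demo_path, ["SNo", "States"])
print(binary_error)
```

Passing newline='' is what the csv documentation recommends for files handed to csv.writer; without it, Windows can end up with an extra blank line between rows.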

Edited to add

Here is the code I ran:

import csv
import urllib.request
from bs4 import BeautifulSoup

url='http://www.mapsofindia.com/districts-india/'
html = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html)
table=soup.find('table', attrs={'class':'tableizer-table'})
list_of_rows=[]
# skip the header row, then collect the text of every cell
for row in table.findAll('tr')[1:]:
    list_of_cells=[]
    for cell in row.findAll('td'):
        list_of_cells.append(cell.text)
    list_of_rows.append(list_of_cells)
# text mode ('w') instead of binary mode ('wb')
outfile = open('./immates.csv','w')
writer=csv.writer(outfile)
writer.writerow(['SNo', 'States', 'Dist', 'Population'])
writer.writerows(list_of_rows)
outfile.close()

Answer 2

I had the same problem with Python 3. My code was writing into io.BytesIO().

Replacing it with io.StringIO() solved it.
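A minimal sketch of that swap, assuming the goal is to build CSV data in memory. The function names rows_to_csv_text and rows_to_csv_bytes are made up for this example; if bytes are needed in the end, one option is to encode the finished text.

```python
import csv
import io


def rows_to_csv_text(rows):
    # csv.writer produces str, so it needs a text buffer (StringIO), not BytesIO.
    buffer = io.StringIO(newline='')
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def rows_to_csv_bytes(rows, encoding='utf-8'):
    # If bytes are required downstream, encode the text after writing it.
    return rows_to_csv_text(rows).encode(encoding)


def bytesio_fails(rows):
    # Shows that handing a BytesIO to csv.writer raises TypeError in Python 3.
    try:
        csv.writer(io.BytesIO()).writerows(rows)
    except TypeError:
        return True
    return False


sample = [["SNo", "States"], ["1", "Goa"]]
print(rows_to_csv_text(sample))
print(bytesio_fails(sample))
```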

Answer 3

You are opening the csv file in binary mode; the mode should be 'w'.

import csv

# open the csv file in text write mode with utf-8 encoding
with open('output.csv', 'w', encoding='utf-8', newline='') as w:
    fieldnames = ["SNo", "States", "Dist", "Population"]
    writer = csv.DictWriter(w, fieldnames=fieldnames)
    # write a list of dicts (list_of_dicts is assumed to be defined earlier)
    writer.writerows(list_of_dicts)  # use writerow(dict) to write one row at a time
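The snippet above does not show where list_of_dicts comes from. As a rough sketch, assuming the rows are lists of cell strings like the list_of_rows built in the question, they could be turned into dicts and written with a header as follows. The helpers rows_to_dicts and dicts_to_csv are hypothetical, and an in-memory StringIO stands in for the output file to keep the example self-contained.

```python
import csv
import io

fieldnames = ["SNo", "States", "Dist", "Population"]


def rows_to_dicts(list_of_rows):
    # Pair each cell with its column name: ["1", "Goa", ...] -> {"SNo": "1", ...}
    return [dict(zip(fieldnames, row)) for row in list_of_rows]


def dicts_to_csv(list_of_dicts):
    # A text buffer stands in for open('output.csv', 'w', newline='').
    buffer = io.StringIO(newline='')
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()  # DictWriter writes the header row only when asked to
    writer.writerows(list_of_dicts)
    return buffer.getvalue()


sample_rows = [["1", "Andhra Pradesh", "13", "49378776"]]
print(dicts_to_csv(rows_to_dicts(sample_rows)))
```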

Answer 4

import re

# soup is assumed to be an existing BeautifulSoup object
file = open('parsed_data.txt', 'w')  # text mode ('w'), not binary ('wb')
for link in soup.findAll('a', attrs={'href': re.compile("^http")}):
    soup_link = str(link)
    print(soup_link)
    file.write(soup_link)
file.flush()
file.close()

In my case, I was using BeautifulSoup to write a .txt file with Python 3.x and ran into the same problem. As @tsduteba said, change the 'wb' in the first line to 'w'.

Answer 5

Just change wb to w:

outfile=open('./immates.csv','wb')

to

outfile=open('./immates.csv','w')
