Question

I have some code that parses HTML with BeautifulSoup and prints the result. Here is the source code:

import csv
import requests
from bs4 import BeautifulSoup
import lxml

r = requests.post('https://opir.fiu.edu/instructor_evals/instr_eval_result.asp', data={'Term': '1175', 'Coll': 'CBADM'})
soup = BeautifulSoup(r.text, "lxml")

tables = soup.find_all('table')
print(tables)

Before exporting to CSV, the output of my code looks like this:

Question    No Response    Excellent    Very Good    Good    Fair    Poor
Description of course objectives and assignments    0.0%    76.1%    17.4%    6.5%    0.0%    0.0%
Communication of ideas and information    0.0%    78.3%    17.4%    4.3%    0.0%    0.0%

I really like this output and want to export it to CSV, so I added the following:

writer = csv.writer(open("C:\\Temp\\output_file.csv", 'w'))

for table in tables:
    rows = table.find_all("tr")
    for row in rows:
        cells = row.find_all("td")
        if len(cells) == 7:  # this filters out rows with 'Term', 'Instructor Name' etc.
            for cell in cells:
                print(cell.text + "\t", end="")
                writer.writerow(cell.text)
            print("")  # newline after each row
    print("-------------")  # table delimiter

Unfortunately, this code puts every individual character or letter in its own cell.

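So my question is: how can I fix this code so that the output is exported to the CSV file correctly, without a new cell for every character? I'm not entirely sure why it does this. It also seems to export only the first table and ignore all the other data in the code.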

Answer 1

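cell.text is a string, but writerow expects an iterable of values and writes each element to its own cell. A string is itself iterable, one character at a time, so when you pass a bare string each character is treated as a separate element and written to a separate cell.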
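
The difference can be seen in a small sketch that writes to an in-memory buffer instead of a file. The helper name `write_one_row` and the sample value are assumptions made for illustration; only `csv.writer` and `writerow` come from the question.

```python
import csv
import io


def write_one_row(row):
    """Write a single row with csv.writer and return the resulting CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(row)
    return buffer.getvalue()


# A bare string is iterated character by character: one cell per character.
as_string = write_one_row("76.1%")

# A list holding the string is iterated element by element: a single cell.
as_list = write_one_row(["76.1%"])

print(repr(as_string))
print(repr(as_list))
```

The first call produces five comma-separated cells, while the second produces one cell containing the whole value.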

You have to wrap the string in [] for this to work, so that you pass a list of strings:

writer.writerow([cell.text])
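
Note that calling writerow once per cell puts every cell on its own CSV line. If the goal is one CSV line per table row, one possible approach is to collect all cell texts of a row into a list and call writerow once per row. The sketch below assumes that approach; `scraped_rows` is a hypothetical stand-in for the scraped data (each inner list plays the role of `[cell.text for cell in cells]`), and `write_table_rows` is an illustrative helper, not part of the original code.

```python
import csv
import io

# Hypothetical stand-in for the scraped data: each inner list plays the role
# of [cell.text for cell in row.find_all("td")] in the question's loop.
scraped_rows = [
    ["Term", "1175"],  # wrong cell count, skipped like the 'Term' rows
    ["Description of course objectives and assignments",
     "0.0%", "76.1%", "17.4%", "6.5%", "0.0%", "0.0%"],
    ["Communication of ideas and information",
     "0.0%", "78.3%", "17.4%", "4.3%", "0.0%", "0.0%"],
]


def write_table_rows(rows, out, expected_cells=7):
    """Write each row that has expected_cells cells as one CSV line."""
    writer = csv.writer(out, lineterminator="\n")
    written = 0
    for cells in rows:
        if len(cells) == expected_cells:
            # Strip surrounding whitespace, then write the whole row at once.
            writer.writerow([text.strip() for text in cells])
            written += 1
    return written


# With a real file, the csv documentation recommends open(path, "w", newline="").
buffer = io.StringIO()
count = write_table_rows(scraped_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Creating the writer once and passing every table's rows through it keeps all tables in the same file.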
