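My CSV_Output file is empty, even though I don't get any errors. I'm trying to write the rows from the CSV_to_Read file with one extra column added. Printing article.cleaned_text works, so I think I'm just doing something silly. Thanks!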
from csv import reader, writer
import unicodecsv as csv
from goose import Goose

with open('CSV_to_Read.csv','r') as csvfile:
    readCSV = csv.reader(csvfile, encoding='utf-8')
    out = writer(open("CSV_Output.csv", "a"))
    for row in readCSV:
        g = Goose({'browser_user_agent': 'Mozilla', 'parser_class':'soup'})
        try:
            article = g.extract(url=row[0])
            print article.cleaned_text
            out.writerow([row[0], row[1], row[2], row[3], row[4], row[5], row[6], article.cleaned_text, row[7], row[8], row[9]])
        except Exception:
            pass
Answer 0 (score: 0)
Here you open the file object for the output file, but you never close it:
out = writer(open("CSV_Output.csv", "a"))
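The data may have been buffered and not yet flushed to disk. One way to avoid this is to make sure the file object is closed. The file object's context manager (that is, the with open(path) as file: syntax) takes care of closing it for you.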
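As a small, self-contained illustration of that point, here is a sketch that writes one row inside a with block and then reads the file back. It assumes Python 3 and the standard-library csv module rather than the Python 2 setup in the question, and the scratch path and row contents are made up for the demonstration.

```python
import csv
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo_output.csv")

with open(path, "a", newline="") as outfile:
    out = csv.writer(outfile)
    out.writerow(["http://example.com", "some text"])
    # At this point the row may still be sitting in the write buffer.

# Leaving the with block closed the file, which flushes the buffer to disk.
file_closed = outfile.closed

with open(path) as check:
    written = check.read()

print(file_closed)
print(written)
```

If the writer wraps a file that is opened without with and never closed, nothing guarantees when the buffered rows reach the disk.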
So I suggest changing your code to:
with open('CSV_to_Read.csv','r') as csvfile:
    readCSV = csv.reader(csvfile, encoding='utf-8')
    # The with block closes outfile on exit, which flushes buffered rows to disk.
    with open("CSV_Output.csv", "a") as outfile:
        out = writer(outfile)
        for row in readCSV:
            g = Goose({'browser_user_agent': 'Mozilla', 'parser_class':'soup'})
            try:
                article = g.extract(url=row[0])
                print article.cleaned_text
                # Insert the extracted text as a new column after the seventh field.
                out.writerow([row[0], row[1], row[2], row[3], row[4], row[5], row[6], article.cleaned_text, row[7], row[8], row[9]])
            except Exception:
                pass
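It may also be worth checking whether rows are being skipped silently. The except Exception: pass above discards any error raised inside the try block, including one raised by out.writerow after the print has already succeeded, and that could be another reason for an empty file. The following is only a sketch of logging those errors instead of hiding them. It assumes Python 3 and the standard-library csv module. The names add_text_column and fake_fetch are hypothetical: fake_fetch stands in for the Goose extraction call so the example runs without network access. The slicing also differs slightly from the explicit row[0] ... row[9] indexing in that it does not raise on short rows.

```python
import csv
import io
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("csv_enrich")


def add_text_column(rows, fetch_text, out):
    """Write each row with one extra column; return how many rows were written."""
    written = 0
    for line_no, row in enumerate(rows, start=1):
        try:
            text = fetch_text(row[0])
            # Same column order as above: seven fields, the text, then three more.
            out.writerow(row[:7] + [text] + row[7:10])
            written += 1
        except Exception:
            # Log the full traceback instead of silently dropping the row.
            log.exception("skipping row %d", line_no)
    return written


def fake_fetch(url):
    # Hypothetical stand-in for the article extraction step.
    if "bad" in url:
        raise ValueError("could not extract " + url)
    return "text for " + url


buffer = io.StringIO()
rows = [
    ["http://ok.example"] + ["c%d" % i for i in range(1, 10)],
    ["http://bad.example"] + ["c%d" % i for i in range(1, 10)],
]
count = add_text_column(rows, fake_fetch, csv.writer(buffer))
print(count)
print(buffer.getvalue())
```

With the failures logged, it becomes clear whether the output is empty because nothing was flushed or because every row raised an exception.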