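How do I save files after processing them in Python?

Time: 2018-08-17 16:29:49

Tags: python beautifulsoup

I'm new to Python. I started writing a script that processes HTML files with Beautiful Soup. Everything is processed correctly, but now I want to save each article in a new folder named nowe instead of printing it. After processing, I need either all the articles placed in that one folder or a single CSV file.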

from bs4 import BeautifulSoup
import glob
import os, os.path


path = '/home/darek/Dokumenty/pliki/'
path_out = '/home/darek/Dokumenty/pliki/nowe'
for filename in glob.glob(os.path.join(path, '*.html',)):
    f = filename
    tresc = open(f)
    soup = BeautifulSoup(tresc, 'html.parser') 
    article = soup.find('div',class_='post')
    tagi = soup.find('div', class_='ph_social_share_box ph_social_share_box_bottom')

    fout = open( +filename, "w")
    fout.close()

print(article)

My error log:

File "/home/darek/Dokumenty/parser.py", line 21, in <module>
    fout = open( +filename, "w")

TypeError: bad operand type for unary +: 'str'

This version works for printing:

from bs4 import BeautifulSoup
import glob
import os, os.path

path = '/home/darek/Dokumenty/pliki/'
path_out = '/home/darek/Dokumenty/pliki/nowe'
for filename in glob.glob(os.path.join(path, '*.html',)):
    f = filename
    content = open(f)
    soup = BeautifulSoup(content, 'html.parser') 
    article = soup.find('div',class_='post')
    tags = soup.find('div', class_='ph_social_share_box ph_social_share_box_bottom')


print(article)

That works, but I can't write to a file. Any hints?

2 answers:

Answer 0: (score: 0)

Remove the "+" in this line: fout = open( +filename, "w")

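"w" means "open the file in write mode". The "+" belongs inside the mode string, not in front of the filename: "w+" opens the file for both reading and writing and, like "w", truncates it first. Plain "w" is enough if you only write, but "w+" works too. So the line should be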

fout = open(filename, "w+")
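
To make the two points above concrete, here is a small sketch using only the standard library. It first reproduces the TypeError from the question (unary plus applied to a string), then contrasts the "w" and "w+" modes. The scratch file demo.txt and the variable names are made up for this illustration and are not part of the original script.

```python
import os
import tempfile

# Unary plus is only defined for numbers, so applying it to a str raises TypeError.
name = "page.html"
try:
    +name
except TypeError as exc:
    unary_plus_error = str(exc)

# Hypothetical scratch file used only for this demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# "w": write-only; creates the file or truncates an existing one.
with open(demo_path, "w") as fout:
    fout.write("first")

# "w+": also truncates, but the same handle can be read back as well.
with open(demo_path, "w+") as fout:
    fout.write("second")
    fout.seek(0)  # rewind before reading what was just written
    read_back = fout.read()

print(unary_plus_error)
print(read_back)
```

Both modes discard whatever the file held before, so opening one of the source HTML files this way would erase it.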

Answer 1: (score: 0)

Change this code block:

fout = open( +filename, "w")
fout.close()

to this:

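# filename is the full path of the source file, so build the output path inside path_out
out_name = os.path.join(path_out, os.path.basename(filename))
fout = open(out_name, "w")
fout.write(str(article))  # I assume article is what you want to write; it is a bs4 Tag, so convert it to str
fout.close()

tresc.close()  # You never closed this, so the file handle was leaked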
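
For the two outcomes the question asks about (one file per article in the nowe folder, or a single CSV file), a minimal sketch could look like the following. The helper names save_article and save_articles_csv, the CSV column names, and the UTF-8 encoding are assumptions made for this example. The helpers take plain strings, so the Beautiful Soup result has to be converted with str() before it is passed in.

```python
import csv
import os


def save_article(text, src_path, out_dir):
    """Write text into out_dir under the source file's base name; return the new path."""
    os.makedirs(out_dir, exist_ok=True)  # create the folder if it is missing
    out_path = os.path.join(out_dir, os.path.basename(src_path))
    # "with" closes the file even if the write fails
    with open(out_path, "w", encoding="utf-8") as fout:
        fout.write(text)
    return out_path


def save_articles_csv(rows, csv_path):
    """Write (filename, article_text) pairs to a single CSV file."""
    # newline="" lets the csv module control line endings itself
    with open(csv_path, "w", newline="", encoding="utf-8") as fout:
        writer = csv.writer(fout)
        writer.writerow(["filename", "article"])
        writer.writerows(rows)
```

Inside the question's loop this would be called as save_article(str(article), filename, path_out). Note that soup.find returns None when nothing matches, so it may be worth skipping those files instead of writing the text "None".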