BeautifulSoup truncates the old file but writes the new file empty

Time: 2019-12-06 08:40:55

Tags: python beautifulsoup

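In "r+" mode, the soup content is appended to the end of every file in the loop, and everything else works fine. But I want to replace the old file content with the new content. When I change the file mode to "w" or "w+", however, the function only truncates the first file and writes nothing to it, so it ends up empty. The remaining files are left unchanged. Finally, it raises this error: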

    blog_text.attrs['href'] = replace_string_1
    AttributeError: 'NoneType' object has no attribute 'attrs'

    def replace_urls(self):
        find_string_1 = '/blog/'
        find_string_2 = '/contact/'
        replace_string_1 = 'blog.html'
        replace_string_2 = 'contact.html'

        exclude_dirs = ['media', 'static']

        for (root_path, dirs, files) in os.walk(f'{settings.BASE_DIR}/static/'):
            dirs[:] = [d for d in dirs if d not in exclude_dirs]
            for file in files:
                get_file = os.path.join(root_path, file)
                with codecs.open(get_file, "r+", "utf-8") as f:
                    soup = BeautifulSoup(f, "lxml", from_encoding="utf-8")
                    blog_text = soup.find('a', attrs={'href': find_string_1})
                    contact_text = soup.find('a', attrs={'href': find_string_2})
                    blog_text.attrs['href'] = replace_string_1
                    contact_text.attrs['href'] = replace_string_2
                    f.write(str(soup.prettify()))
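
The symptoms are consistent with how the file modes behave. Opening a file with "w" or "w+" truncates it immediately, so BeautifulSoup would parse an empty document, `soup.find` would return `None`, and the `AttributeError` would stop the loop after the first file. In "r+" mode the parser reads to the end of the file, so the following `write` lands after the old content. A minimal sketch of one possible fix is to read and parse first, then reopen the file with "w" only when something changed. The names `rewrite_links` and `REPLACEMENTS`, the `base_dir` parameter (standing in for `f'{settings.BASE_DIR}/static/'`), and the `.html` filter are assumptions added for illustration, and `html.parser` is swapped in for `lxml` only so the sketch needs nothing beyond `bs4`.

```python
import os

from bs4 import BeautifulSoup

# Old href -> new href, using the values from the question.
# (REPLACEMENTS is a hypothetical name introduced for this sketch.)
REPLACEMENTS = {
    "/blog/": "blog.html",
    "/contact/": "contact.html",
}


def rewrite_links(path, replacements=REPLACEMENTS, parser="html.parser"):
    """Rewrite matching <a href> values in one file; return True if changed."""
    # Step 1: read and parse while the file still has its content.
    with open(path, "r", encoding="utf-8") as f:
        soup = BeautifulSoup(f.read(), parser)

    changed = False
    for old, new in replacements.items():
        # find_all returns an empty result when nothing matches, so a file
        # without the link is skipped instead of raising AttributeError.
        for link in soup.find_all("a", href=old):
            link["href"] = new
            changed = True

    # Step 2: only now open with "w", which truncates and replaces the content.
    if changed:
        with open(path, "w", encoding="utf-8") as f:
            f.write(soup.prettify())
    return changed


def replace_urls(base_dir, exclude_dirs=("media", "static")):
    """Walk base_dir and rewrite links in every .html file found."""
    for root_path, dirs, files in os.walk(base_dir):
        # Prune excluded directories in place, as in the question.
        dirs[:] = [d for d in dirs if d not in exclude_dirs]
        for name in files:
            # Assumption: only HTML files should be parsed and rewritten.
            if name.endswith(".html"):
                rewrite_links(os.path.join(root_path, name))
```

An alternative that keeps a single "r+" handle would be to call `f.seek(0)` and `f.truncate()` before writing. Separating the read from the write has the advantage that a file is never emptied unless there is new content ready to go into it.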

0 answers:

No answers