Question

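I am trying to remove blank lines from a .txt file. My .txt files are generated by Python from downloaded HTML, and I want to save them in a specific location, so I have to use os.path.join.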

This is the code that saves the HTML to that location after removing all the tags and keeping only the text inside them:

cntent = re.sub('<[^>]+>', "\n", str(cntent))
with open(os.path.join('/Users/Brian/Documents/test', titles), "wb") as file:
    file.writelines(str(cntent))

How can I achieve this?

The resulting file:

Productspecificaties




Uiterlijke kenmerken















Gewicht










185 g

What I have tried:

filtered = filter(lambda x: not re.match(r'^\s*$', x), original)

Desired result:

 Productspecificaties
 Uiterlijke Kenmerken
 Gewicht
 185Gr

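Note that in the first line of code, re.sub..., I use "\n" as the replacement; otherwise there would be no whitespace between the pieces of text at all.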

Answer 1

You don't need a regular expression to drop the blank lines:

import os
import re

cntent = re.sub('<[^>]+>', "\n", str(cntent))
with open(os.path.join('/Users/Brian/Documents/test', titles), "w") as f:  # text mode, since the lines are str
    f.writelines(line for line in cntent.splitlines(True) if line.strip())  # keep only non-blank lines

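str.strip() removes whitespace (including newlines) from the beginning and end of a string. For a line that contains only whitespace it returns an empty string, which evaluates as false.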
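As a small illustration of that truthiness check, here is a sketch using made-up sample lines (the list `samples` is hypothetical and not part of the original code):

```python
# Hypothetical sample lines; only the last one contains visible text.
samples = ["\n", "   \n", "\t\r\n", "185 g\n"]

for line in samples:
    # strip() returns '' for whitespace-only lines, and '' is falsy.
    print(repr(line), "->", repr(line.strip()), bool(line.strip()))

# Keeping only the lines whose stripped form is non-empty.
kept = [line for line in samples if line.strip()]
```

Note that the test uses `line.strip()` only as a condition; the line that is kept is the original one, so its trailing newline is preserved.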

str.splitlines is called with True so that the text is split into lines without discarding the newline characters.
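The difference the True argument makes can be seen in a short sketch; the sample `text` below is an assumption modeled on the file contents shown in the question:

```python
# Sample text with blank and whitespace-only lines (illustrative only).
text = "Productspecificaties\n\n\nGewicht\n \n185 g\n"

without_ends = text.splitlines()      # newline characters are dropped
with_ends = text.splitlines(True)     # newline characters are kept

# Filtering the keepends version yields lines that can be passed
# straight to writelines(), which does not add newlines itself.
non_blank = [line for line in with_ends if line.strip()]
result = "".join(non_blank)
```

Without True, the surviving lines would have no line endings, and writelines() would run them together on one line.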

Answer 2

Try this pattern:
^\s+ with the m (multiline) option
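In Python, the m option corresponds to the re.MULTILINE flag. A minimal sketch of how the pattern might be applied, assuming the text is already in a string (the sample `text` is made up to resemble the file in the question):

```python
import re

# Illustrative input with runs of blank and whitespace-only lines.
text = "Productspecificaties\n\n\n\nUiterlijke kenmerken\n\n \nGewicht\n\n185 g\n"

# With re.MULTILINE, ^ matches at the start of every line. Because \s also
# matches newlines, ^\s+ consumes a whole run of blank lines in one match.
cleaned = re.sub(r"^\s+", "", text, flags=re.MULTILINE)
```

One side effect to be aware of: the same pattern also removes leading indentation from non-blank lines, so it is only suitable when that indentation does not need to be kept.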
