Question

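I am learning Python and am stuck on what I believe is a trivial problem. I am trying to append the delimiter // to the end of every line in a text file that does not already have it.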

Example text file 'example.txt':

A string of information that does not require the delimiter
95 full !oe, !oeha //
96 new  kaba
100 name    !uo5 //

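In this example text file, I would like // to be added to the end of the line beginning with 96. My strategy is to find the lines that need the delimiter (i.e. lines that start with a number), test whether // is present, and if it is not, append // to the end of the line. My code is below: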

import re
infile = open("example.txt", 'r+w')

for line in infile:
    m = re.match(r'(\d+)\s+\w+\s+([^/]+)', line)
    if m:
        test = line.find('//')
        if test == -1:
            infile.write(line + ' // \n')
        continue

The resulting example.txt file looks like this:

A string of information that does not require the delimiter
95 full !oe, !oeha //
96 new  kaba
100 name    !uo5 //
96 new  kaba
 //

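Why does infile.write(line + ' // \n') append a new line to the .txt file instead of replacing the line that is missing the delimiter? Also, why does the delimiter // not appear on the same line?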

I have experimented with infile.replace(line, line + ' // \n') in place of infile.write(line + ' // \n'), but I get the error message AttributeError: 'file' object has no attribute 'replace'.

Answer 1

You can greatly simplify the code by using the re.sub function with this regex:

^(\d+.*)(?<!//)$

Example usage:

>>> file = open('input', 'r')
>>> for line in file:
...     print re.sub(r'^(\d+.*)(?<!//)$', r'\1//', line),

This will produce the output

A string of information that does not require the delimiter
95 full !oe, !oeha //
96 new  kaba//
100 name    !uo5 //

Regex explanation

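^ anchors the regex at the start of the string
\d+ matches one or more digits. Together with the anchor, it ensures the line starts with a digit
.* matches anything up to the end of the line
(?<!//) is a negative lookbehind. It asserts that the end of the string, $, is not preceded by //
$ anchors the regex at the end of the string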
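
For anyone on Python 3, where print is a function, the same substitution can be sketched roughly as follows. The helper name add_delimiter and the in-memory list of lines are assumptions made for illustration; they are not part of the answer above.

```python
import re

# Same pattern as above: a line that starts with digits
# and does not already end in "//".
PATTERN = re.compile(r'^(\d+.*)(?<!//)$')

def add_delimiter(line):
    """Append '//' to a digit-led line that lacks it (hypothetical helper)."""
    # "$" also matches just before a trailing newline, so the newline
    # stays at the end of the line after the substitution.
    return PATTERN.sub(r'\1//', line)

# Sample data copied from the question, held in memory instead of a file.
lines = [
    "A string of information that does not require the delimiter\n",
    "95 full !oe, !oeha //\n",
    "96 new  kaba\n",
    "100 name    !uo5 //\n",
]
fixed = [add_delimiter(line) for line in lines]
print("".join(fixed), end="")
```

Only the line starting with 96 is changed; the other three lines pass through untouched.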

Answer 2

^(?=\d+(?:(?!\/\/).)*$)(.*)

Try this. Replace with \1 //. See the demo.

http://regex101.com/r/rA7aS3/13

import re
# A plain raw string works on Python 2 and 3; the ur'' prefix is Python 2 only.
p = re.compile(r'^(?=\d+(?:(?!\/\/).)*$)(.*)', re.MULTILINE)
test_str = u"A string of information that does not require the delimiter\n95 full !oe, !oeha //\n96 new kaba\n100 name !uo5 //\n100 name !uo5 "
# Raw string, so that \1 is a backreference and not the control character \x01.
subst = r"\1 //"

result = re.sub(p, subst, test_str)

Replace test_str with file.read().
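
A minimal sketch of that last step, assuming the whole file fits in memory. The function name fix_file and the temporary demo file are made up for illustration; the pattern is the one from this answer.

```python
import os
import re
import tempfile

# Pattern from this answer; MULTILINE lets ^ and $ match at every line.
PATTERN = re.compile(r'^(?=\d+(?:(?!\/\/).)*$)(.*)', re.MULTILINE)

def fix_file(path):
    """Read the whole file, add ' //' where it is missing, write it back.

    Hypothetical helper, not part of the original answer.
    """
    with open(path) as f:
        text = f.read()
    result = PATTERN.sub(r'\1 //', text)
    # Reopening with 'w' truncates the file before the new text is written.
    with open(path, 'w') as f:
        f.write(result)
    return result

# Demonstration on a throwaway copy of the question's sample data.
sample = (
    "A string of information that does not require the delimiter\n"
    "95 full !oe, !oeha //\n"
    "96 new  kaba\n"
    "100 name    !uo5 //\n"
)
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, 'w') as f:
    f.write(sample)
fixed_text = fix_file(demo_path)
os.remove(demo_path)
print(fixed_text, end="")
```

Reading first and writing only after the read has finished avoids mixing reads and writes on one file handle, which is what went wrong in the question.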

Answer 3

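You don't need a regex. If the line starts with a digit and does not end with "//", just remove the newline and add "//\n" to the end, then reopen the file in overwrite mode and write the updated lines.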
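
One way this could look, as a rough sketch rather than this answer's exact code. The names add_missing_delimiters and rewrite_file are hypothetical, the 'w' mode is an assumption about which mode was meant, and ignoring trailing spaces before checking for "//" is an extra choice made here.

```python
def add_missing_delimiters(lines):
    """Return the lines with '//' appended to digit-led lines that lack it."""
    updated = []
    for line in lines:
        body = line.rstrip("\n")
        # body[:1] is '' for an empty line, so isdigit() is safely False.
        # Trailing spaces are ignored when checking for an existing "//".
        if body[:1].isdigit() and not body.rstrip().endswith("//"):
            line = body + "//\n"
        updated.append(line)
    return updated

def rewrite_file(path):
    """Read every line first, then reopen the file for writing and overwrite it."""
    with open(path) as f:
        lines = f.readlines()
    with open(path, "w") as f:  # assumed mode: 'w' truncates the file
        f.writelines(add_missing_delimiters(lines))

# In-memory check using the sample data from the question.
sample_lines = [
    "A string of information that does not require the delimiter\n",
    "95 full !oe, !oeha //\n",
    "96 new  kaba\n",
    "100 name    !uo5 //\n",
]
updated = add_missing_delimiters(sample_lines)
print("".join(updated), end="")
```

Calling rewrite_file("example.txt") would then update the file in place.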

Answer 4

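I would write the output to a different file than the input. If you really need to replace the original, overwrite it manually afterwards. I did the following in Python 2.7: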

import re

# Open an output file distinct from the input file
infile = open("example.txt", 'r')
outfile = open("output.txt", 'w')

for line in infile:
    # Newline already present in input line - rstrip() to kill it
    result = line.rstrip()
    m = re.match(r'(\d+)\s+\w+\s+([^/]+)', result)
    if m:
        test = result.find('//')
        if test == -1:

            # Add the delimiter
            result += ' //'

    # Write the line back, changed or not, restoring the newline
    outfile.write(result + "\n")

# Close the streams
infile.close()
outfile.close()

Python: how to find a missing delimiter and append it to a text file
