Question

I am trying to let the user do the following.

Suppose the original text says:

"hello world hello earth"

When the user searches for "hello", it should display:

|hello| world |hello| earth

This is what I have:

m = re.compile(pattern)
i =0
match = False
while i < len(self.fcontent):
    content = " ".join(self.fcontent[i])
    i = i + 1;
    for find in m.finditer(content):    
        print i,"\t"+content[:find.start()]+"|"+content[find.start():find.end()]+"|"+content[find.end():]
        match = True
        pr = raw_input( "(n)ext, (p)revious, (q)uit or (r)estart? ")
        if (pr == 'q'):
            break
        elif (pr == 'p'):
            i = i -  2
        elif (pr == 'r'):
            i = 0
if match is False:
    print "No matches in the file!"

where:

pattern = user specified pattern
fcontent = contents of a file read in and stored as array of words and lines e.g:
[['line','1'],['line','2','here'],['line','3']]

However, it prints:

|hello| world hello earth
hello world |hello| earth

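How can I combine the two lines into one? Thanks.

Edit:

This is part of a larger search feature in which the pattern (in this case the word "hello") is passed in by the user, so I have to use regex search/match/finditer to find the pattern. Unfortunately, replace and similar methods won't work, because the user can choose to search for "[0-9]$", which means putting the trailing digit between |'s.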

Answer 1

If that is all you are doing, use str.replace.

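# Assumes `pattern` is a plain string rather than a regex; `content` is the joined line from the question.
print(content.replace(pattern, "|%s|" % pattern))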

Answer 2

You can use a regular expression as follows:

import re
src = "hello world hello earth"
dst = re.sub('hello', '|hello|', src)
print dst

Or use string replacement:

dst = src.replace('hello', '|hello|')
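
Both snippets above hard-code the word "hello" in the replacement. The question's edit says the pattern may be an arbitrary regex such as "[0-9]$", so the replacement has to be built from whatever text actually matched. A minimal Python 3 sketch of that idea follows; the helper name mark_matches and its boundary parameter are made up for illustration and are not part of the original code.

```python
import re

def mark_matches(pattern, text, boundary="|"):
    """Wrap every regex match in `text` with `boundary` (hypothetical helper)."""
    compiled = re.compile(pattern)
    # mo.group(0) is the full text of the current match, so the same
    # replacement works for literal words and for patterns like "[0-9]$".
    return compiled.sub(lambda mo: boundary + mo.group(0) + boundary, text)

print(mark_matches("hello", "hello world hello earth"))  # |hello| world |hello| earth
print(mark_matches("[0-9]$", "line 42"))                 # line 4|2|
```

Because all matches are substituted in a single pass, each line is printed once with every occurrence marked, rather than once per match.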

Answer 3

OK, back to the original solution, since the OP confirmed that the word stands on its own (i.e. it is not a substring of another word).

target = 'hello'
line = 'hello world hello earth'
rep_target = '|{}|'.format(target)

line = line.replace(target, rep_target)

This yields:

|hello| world |hello| earth
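
If the word could also appear inside another word, plain str.replace would mark that occurrence too (for example "othello" would become "ot|hello|"). One possible way around that, sketched here in Python 3 under the assumption that the target is literal text, is a regex with \b word boundaries. The helper name mark_whole_word is hypothetical.

```python
import re

def mark_whole_word(target, line):
    """Mark only standalone occurrences of `target` (hypothetical helper)."""
    # re.escape treats `target` as literal text; \b matches at word boundaries.
    word_re = re.compile(r"\b" + re.escape(target) + r"\b")
    return word_re.sub(lambda mo: "|" + mo.group(0) + "|", line)

print(mark_whole_word("hello", "hello world othello hello"))
# |hello| world othello |hello|
```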

Answer 4

As has been pointed out, based on your example, using str.replace is the simplest approach. If you need more complex criteria, you can adapt the following...

import re

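def highlight(string, words, boundary='|'):
    # Accept a single word as well as a list of words
    # (basestring is Python 2; use str on Python 3).
    if isinstance(words, basestring):
        words = [words]
    # Join the alternatives with the regex '|' operator, not with `boundary`,
    # so that a custom boundary does not break the pattern. Longest words go first.
    rs = '({})'.format('|'.join(sorted(map(re.escape, words), key=len, reverse=True)))
    return re.sub(rs, lambda L: '{0}{1}{0}'.format(boundary, L.group(1)), string)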
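
For illustration, here is a Python 3 sketch of the same idea together with a usage example. It assumes str in place of basestring, and it joins the alternatives with the regex | operator regardless of the chosen boundary, so a non-default boundary such as "*" also works.

```python
import re

def highlight(string, words, boundary="|"):
    # Accept a single word as well as a list of words.
    if isinstance(words, str):
        words = [words]
    # Escape each word and try longer words first so they win over their prefixes.
    alternatives = sorted(map(re.escape, words), key=len, reverse=True)
    rs = "({})".format("|".join(alternatives))
    return re.sub(rs, lambda m: "{0}{1}{0}".format(boundary, m.group(1)), string)

print(highlight("hello world hello earth", ["hello", "earth"]))
# |hello| world |hello| |earth|
print(highlight("hello world hello earth", "hello", boundary="*"))
# *hello* world *hello* earth
```

Note that re.escape makes every word literal, so this variant suits lists of fixed words rather than user-supplied regex patterns.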

Python: search for multiple values and show boundaries
