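I extracted some expressions from a file and want to insert them back into the same file in a different format, for example wrapped in brackets. My problem is that I want each expression to be replaced only once. The file looks like this: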
file = """he is a good man
she is a beautiful woman
this is a clever student
he is a bad neighbour
they are bad men
She is very beautiful"""
and the expressions look like this:
ex = """ good, clever, beautiful, bad,"""
The code I am using is:
adj = ex.split(",")
for a in adj:
    if a in file:
        file = file.replace(a, ' ' +'[[' + a + ']]')
print(file)
This gives the following output:
he is a [[good]] man [[
]]she is a [[ beautiful]] woman [[
]]this is a [[ clever]] student [[
]]he is a [[ bad]] neighbour [[
]]they are [[ bad]] men [[
]]She is very [[ beautiful]] [[
]] [[
]]
whereas the expected output is:
he is a [[good]] man
she is a [[ beautiful]] woman
this is a [[ clever]] student
he is a [[ bad]] neighbour
they are bad men # so here "bad" will not be replaced because there is another 'bad' replaced
She is very beautiful # and here 'beautiful' will not be replaced like 'bad'
Answer 0 (score: 1)
The string replace method also accepts an optional third argument, called max (the official Python documentation names it count).
http://www.tutorialspoint.com/python/string_replace.htm
This lets you choose how many occurrences of the word are replaced.
For example,
>>> "he is a good man, and a good husband".replace('good', '[[ good ]]', 1)
'he is a [[ good ]] man, and a good husband'
>>>
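Hang on, I am working on your example now.
In the approach above, I assumed that you had already read the file and stored its contents as a single string. In the second part of the answer below, I show how to implement code that solves your problem.
Suppose testfile.txt contains the following:
he is a good man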
she is a beautiful woman
this is a clever student
he is a bad neighbour
they are bad men
She is very beautiful
#!/usr/bin/env python
# your expression
ex = """ good, clever, beautiful, bad,"""
# list comprehension to clean up your expression:
# first split it by comma, then drop anything that is just an empty string
wanted_terms = [x.strip() for x in ex.split(',') if x.strip() != '']
## read file using with statement
with open('testfile.txt') as f:
    for line in f:
        line = line.strip()
        ## for each wanted term, check if it exists in the line
        ## (iterate over a copy, because the list is modified inside the loop)
        for x in wanted_terms[:]:
            if x in line:
                ## I prefer to use string format here.
                #replacement = "[[ %s ]]" % x
                #line = line.replace(x, replacement, 1)
                ## if the term exists, do the replacement. Use max=1 to ensure it replaces only the first instance.
                line = line.replace(x, '[[' + x + ']]', 1)
                ## remove it from the term list so that later occurrences are not replaced
                wanted_terms.remove(x)
        ## show the (possibly modified) line
        print(line)
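For the single-string case assumed in the first part of this answer, the same idea can be sketched as a small function. This is only an illustration: the function name bracket_first_occurrence and the variable names are made up here, and it relies on nothing beyond the third argument of replace described above.

```python
def bracket_first_occurrence(text, expressions):
    """Wrap only the first occurrence of each term in [[...]]."""
    # Split on commas, trim whitespace, and drop empty pieces
    terms = [t.strip() for t in expressions.split(",") if t.strip()]
    for term in terms:
        # The third argument limits the number of replacements to one
        text = text.replace(term, "[[" + term + "]]", 1)
    return text


text = """he is a good man
she is a beautiful woman
this is a clever student
he is a bad neighbour
they are bad men
She is very beautiful"""

result = bracket_first_occurrence(text, " good, clever, beautiful, bad,")
print(result)
```

Because the whole text is one string, no bookkeeping of already-replaced terms is needed: each term is handled exactly once, and replace stops after its first match.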
Let me know whether you find this useful or have any other comments.
Cheers, Biobirdman
Answer 1 (score: 0)
biobirdman seems to have a good solution, so use that to do it properly. My post here is only to explain what went wrong. When you do this:
ex = """ good, clever, beautiful, bad,"""
adj = ex.split(",")
what you get is not what you think:
print(adj)
[' good', ' clever', ' beautiful', ' bad', '']
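I don't know whether you meant to have a space before each of those strings, but you almost certainly did not mean to have the '' at the end. In fact, I think you did not run exactly this example, or you would have seen a different bad behaviour. I think you had a newline character just before the closing quotes. So the '' shown here was actually a newline character in what you tried.
As a result, it matched everything you expected, plus every newline in your text. Anyone running the code exactly as you posted it will instead get a match between every pair of characters: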
[[]]h [[]]e [[]] [[]]i [[]]s [[]] [[]]a ........
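To see why an empty string matches everywhere, here is a small sketch. The sample string and the "|" marker are made up for illustration; the behaviour of in and replace with an empty string is standard Python.

```python
sample = "he is"

# The empty string counts as a substring of every string
found = "" in sample
print(found)  # True

# Replacing "" inserts the replacement at every position, including both ends
marked = sample.replace("", "|")
print(marked)  # |h|e| |i|s|
```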
The fix: get rid of the newline and of the extra spaces. How? Take a look at strip.
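A minimal sketch of that cleanup, assuming the original string really did end with a newline before the closing quotes (that is this answer's guess, not something the question confirms):

```python
# Assumed input: same expressions, but with a trailing newline
ex = """ good, clever, beautiful, bad,
"""

raw = ex.split(",")
# The last piece is the stray newline, and every term has a leading space
print(raw)

# strip() trims surrounding whitespace; pieces that become empty are dropped
adj = [piece.strip() for piece in raw if piece.strip()]
print(adj)  # ['good', 'clever', 'beautiful', 'bad']
```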
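Answer 2 (score: 0)
Make two changes to your code: avoid empty strings in adj, and remove the leading whitespace when replacing word with [[word]]. In your code, the value of word is something like " beautiful" or " clever", with a leading space.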
file = """he is a good man
she is a beautiful woman
this is a clever student
he is a bad neighbour
they are bad men
She is very beautiful"""
ex = """ good, clever, beautiful, bad,"""
adj = filter(None, ex.split(",")) # removing empty strings from list
# SO ref: http://stackoverflow.com/questions/3845423/remove-empty-strings-from-a-list-of-strings
for a in adj:
    if a in file:
        file = file.replace(a, ' ' +'[[' + a.strip() + ']]') # strip() removes leading and trailing whitespace
print(file)
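One caveat applies to every in/replace approach in this thread: they match substrings, so a term such as bad would also match inside badly. If whole-word matching matters, one option is a regular expression with word boundaries. Note that this swaps in the re module, which none of the answers here use; the function name and the sample sentence are hypothetical. The sketch also passes count=1 so that only the first occurrence of each term is wrapped.

```python
import re


def bracket_first_whole_word(text, terms):
    """Wrap the first whole-word occurrence of each term in [[...]]."""
    for term in terms:
        # \b matches a word boundary; re.escape protects special characters
        pattern = r"\b" + re.escape(term) + r"\b"
        # A function as replacement avoids backslash handling in the term
        text = re.sub(pattern, lambda m: "[[" + m.group(0) + "]]", text, count=1)
    return text


sample = "he reacted badly, which was bad"
result = bracket_first_whole_word(sample, ["bad"])
print(result)  # he reacted badly, which was [[bad]]
```

With a plain replace and a limit of one, the same input would instead become "he reacted [[bad]]ly, which was bad".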