I am trying to replace occurrences of one word with another:
word_list = {"ugh": "disappointed"}
tmp = ['laughing ugh']
for index, data in enumerate(tmp):
    for key, value in word_list.items():
        if key in data:
            tmp[index] = data.replace(key, word_list[key])
print(tmp)
This works, but the occurrence of ugh inside laughing is also replaced, so the output is ladisappointeding disappointed.
How can I avoid this so that the output is laughing disappointed?
Answer 0 (score: 4)
In this case, you may want to consider replacing word by word.
Example:
word_list = {"ugh": "disappointed"}
tmp = ['laughing ugh']
for t in tmp:
    words = t.split()
    for i in range(len(words)):
        if words[i] in word_list.keys():
            words[i] = word_list[words[i]]
    newline = " ".join(words)
    print(newline)
Output:
laughing disappointed
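Step-by-step explanation:
Take each sentence in the tmp list:
for t in tmp:
Split the sentence into words:
words = t.split()
Check whether each word in words is one of the keys of word_list. If it is, replace it with the corresponding value:
for i in range(len(words)):
    if words[i] in word_list.keys():
        words[i] = word_list[words[i]]
Rejoin the replaced words and print the result:
newline = " ".join(words)
print(newline)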
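The same word-by-word idea can be condensed into a small helper. This is only a sketch: the function name replace_words is made up for illustration, and it assumes words are separated by whitespace, so a token such as "ugh," (with punctuation attached) is not treated as the key "ugh".

```python
def replace_words(sentence, mapping):
    """Replace whole whitespace-separated words that appear as keys in mapping."""
    # dict.get falls back to the word itself when no replacement exists.
    return " ".join(mapping.get(word, word) for word in sentence.split())


word_list = {"ugh": "disappointed"}
print(replace_words("laughing ugh", word_list))    # laughing disappointed
print(replace_words("ugh, not again", word_list))  # ugh, not again
```

Note that split() followed by " ".join() also collapses runs of whitespace into single spaces, which may or may not matter for your data.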
Answer 1 (score: 3)
You can do this with a regular expression:
>>> import re
>>> re.sub(r'\bugh\b', 'disappointed', 'laughing ugh')
'laughing disappointed'
\b stands for a word boundary.
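To see what the word boundary does in practice, here is a small sketch that runs the same pattern over a few sample strings (the sample strings are made up for illustration):

```python
import re

pattern = re.compile(r"\bugh\b")
texts = ["laughing ugh", "ugh, really?", "ughs and laughs"]

# "ugh" is replaced only where it stands alone as a word;
# punctuation next to it still counts as a boundary.
results = [pattern.sub("disappointed", text) for text in texts]
for line in results:
    print(line)
# laughing disappointed
# disappointed, really?
# ughs and laughs
```

Unlike splitting on whitespace, this also handles "ugh," because the comma is not a word character.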
Answer 2 (score: 1)
Use re.sub:
import re
for index in range(len(tmp)):
    for key, value in word_list.items():
        tmp[index] = re.sub(r"\b{}\b".format(key), value, tmp[index])  # assign to tmp[index], not tmp
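One thing to watch when the pattern is built from the dictionary keys: a key containing regex metacharacters is interpreted as a pattern, not as literal text. Wrapping the key in re.escape avoids that. The sketch below uses a made-up key, "3.5", purely to show the difference.

```python
import re

word_list = {"3.5": "three and a half"}  # hypothetical key containing "."
text = "version 3.5 and build 325"

for key, value in word_list.items():
    # Without escaping, "." matches any character, so "325" is replaced too.
    unescaped = re.sub(r"\b{}\b".format(key), value, text)
    # With re.escape, "." only matches a literal dot.
    escaped = re.sub(r"\b{}\b".format(re.escape(key)), value, text)

print(unescaped)  # version three and a half and build three and a half
print(escaped)    # version three and a half and build 325
```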
Answer 3 (score: 1)
word_list = {"ugh": "disappointed", "123": "lol"}
tmp = ['laughing 123 ugh']
for word in tmp:
    words = word.split()
    for i in words[:]:
        if i in word_list.keys():
            replace_value = word_list.get(i)
            words[words.index(i)] = replace_value
    output = " ".join(words)
    print(output)
This code replaces every word that is a key of the dict (the word to be replaced) with that key's value (the word you want in its place), and it works with multiple entries!
Output:
laughing lol disappointed
Hope that helps!
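Answer 4 (score: 0)
You can use a regular expression:
import re
for index, data in enumerate(tmp):
    for key, value in word_list.items():
        if key in data:
            # Raw strings keep \b as a regex word boundary rather than a backspace character
            pattern = r'\b' + key + r'\b'
            data = re.sub(pattern, value, data)
            tmp[index] = data
Side note: you need the data = ... line (to overwrite the data variable); otherwise it will not work correctly when word_list contains more than one entry.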
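A detail worth checking whenever such a pattern is built by string concatenation: in an ordinary Python string literal, '\b' is the backspace character, not the regex word boundary. The literal has to be a raw string (r'\b') for the regex engine to see a backslash followed by b. A quick sketch of the difference:

```python
import re

plain = '\b'   # one character: backspace (\x08)
raw = r'\b'    # two characters: backslash and "b"
print(len(plain), len(raw))  # 1 2

# With the plain literal the pattern looks for backspace characters and finds none.
no_match = re.sub('\b' + 'ugh' + '\b', 'disappointed', 'laughing ugh')
# With the raw literal the pattern uses word boundaries as intended.
match = re.sub(r'\b' + 'ugh' + r'\b', 'disappointed', 'laughing ugh')

print(no_match)  # laughing ugh
print(match)     # laughing disappointed
```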
Answer 5 (score: 0)
Fast:
>>> [re.sub(r'\w+', lambda m: word_list.get(m.group(), m.group()), t)
...  for t in tmp]
['laughing disappointed']
>>>
Very fast:
>>> [re.sub(r'\b(?:%s)\b' % '|'.join(word_list.keys()), lambda m: word_list.get(m.group(), m.group()), t)
... for t in tmp]
['laughing disappointed']
>>>
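If the same dictionary is applied to many strings, the second variant can be written with the pattern compiled once up front. This is a sketch rather than a benchmarked solution; the re.escape call and the longest-first ordering are additions of mine, meant to cover keys that contain regex metacharacters or spaces.

```python
import re

word_list = {"ugh": "disappointed", "123": "lol"}
tmp = ["laughing 123 ugh"]

# Try longer keys first so a multi-word key is not pre-empted by a shorter key that is its prefix.
keys = sorted(word_list, key=len, reverse=True)
pattern = re.compile(r"\b(?:%s)\b" % "|".join(re.escape(k) for k in keys))

# Every match is exactly one of the keys, so a direct lookup is enough.
result = [pattern.sub(lambda m: word_list[m.group()], t) for t in tmp]
print(result)  # ['laughing lol disappointed']
```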