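Given a list of words, I want to emphasize those words in a string by wrapping them in <b> ... </b> tags, without using regular expressions.
For example, I have: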
list_of_words = ["python", "R", "Julia" ...]
a_Speech = "A paragraph about programming languages ......R is good for statisticians . Python is good for programmers . ....."
The output should be:
a_Speech = "A paragraph about programming languages ......<b>R</b> is good for statisticians . <b>Python</b> is good for programmers . ....."
I tried something like this:
def right_shift(astr, index, n):
    # shift right by n characters (n = 3 for "<b>", n = 4 for "</b>")
def function_name(a_speech):
    for x in list_of_words:
        if x in a_speech:
            loc = a_speech.index(x)
            right_shift(a_speech, loc, 3)
            a_speech[loc] = "<b>"
            right_shift(a_speech, loc+len(x), 4)
            a_speech[loc+len(x)] = "</b>"
    return a_speech
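Answer 0 (score: 0)
This works in full.
You need to split a_Speech on spaces and also on periods, so we write a compound split function, is_split_char(), and pass it to itertools.groupby(), which is a very neat iterator.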
import itertools
# faster to use a set than a list to test membership
bold_words = set(word.lower() for word in ["python", "R", "Julia"])
def bold_specific_words(bold_words, splitchars, text):
    """Generator to split on specified splitchars, and bold words in bold_words, case-insensitive.
    Don't split contiguous blocks of splitchars. Don't discard the split chars, unlike str.split()."""
    def is_split_char(char, charset=splitchars):
        return char in charset
    # groupby yields alternating runs of split characters and word characters
    for is_splitchar, chars in itertools.groupby(text, is_split_char):
        word = ''.join(chars)  # reform our word from the sub-iterators
        if word.lower() in bold_words:
            yield '<b>' + word + '</b>'
        else:
            yield word
>>> ''.join(word for word in bold_specific_words(bold_words, ' .', a_Speech))
'A paragraph about programming languages ......<b>R</b> is good for statisticians . <b>Python</b> is good for programmers . .....'
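To see why groupby() suits this, it may help to look at the pieces it produces before any bolding happens. The sketch below is only an illustration; split_keep_separators is a hypothetical helper name, not part of the answer above. It groups consecutive characters by whether they are split characters, so separators are kept and the pieces join back into the original text.

```python
import itertools

def split_keep_separators(text, splitchars):
    """Split text into alternating runs of separator and non-separator characters."""
    # groupby starts a new group each time the key (separator or not) changes
    return [''.join(run) for _, run in itertools.groupby(text, lambda ch: ch in splitchars)]

pieces = split_keep_separators("R is good . Python", " .")
print(pieces)
# ['R', ' ', 'is', ' ', 'good', ' . ', 'Python']
```

Because nothing is discarded, ''.join(pieces) gives back the input unchanged, which is what lets the generator wrap only the matching pieces.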
Answer 1 (score: 0)
Something like this might work: build a list of substrings with the pieces, and join them at the end:
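def function_name(a_speech):
    loc = 0
    substrings = []
    for word in list_of_words:
        if word in a_speech[loc:]:
            currentloc = loc
            # str.index takes the start position positionally, not as a keyword
            loc = a_speech.index(word, currentloc)
            substrings.append(a_speech[currentloc:loc])
            substrings.append("<b>")
            substrings.append(word)
            substrings.append("</b>")
            # advance past the word only; the tags are not part of a_speech
            loc += len(word)
    # keep whatever follows the last match
    substrings.append(a_speech[loc:])
    return "".join(substrings)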
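(Note: untested. You may need to work out some of the final details; for example, matching is case-sensitive, and words are only found in the order they appear in list_of_words.)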
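One way to settle those details is to scan the text once from left to right instead of searching word by word. The following is a minimal sketch under a few assumptions that are not in the question: a "word" is taken to be a run of alphanumeric characters, matching is case-insensitive, and bold_whole_words is a hypothetical name chosen for illustration.

```python
def bold_whole_words(text, words):
    """Wrap each whole-word, case-insensitive match of `words` in <b>...</b>."""
    targets = {w.lower() for w in words}  # set for fast membership tests
    out = []
    i, n = 0, len(text)
    while i < n:
        if text[i].isalnum():
            # consume one full run of alphanumeric characters as a token
            j = i
            while j < n and text[j].isalnum():
                j += 1
            token = text[i:j]
            out.append("<b>" + token + "</b>" if token.lower() in targets else token)
        else:
            # copy punctuation and whitespace through unchanged
            j = i + 1
            out.append(text[i])
        i = j
    return "".join(out)

speech = "R is good for statisticians . Python is good for programmers ."
print(bold_whole_words(speech, ["python", "R", "Julia"]))
# <b>R</b> is good for statisticians . <b>Python</b> is good for programmers .
```

Since whole tokens are compared, the "r" inside "programmers" is left alone, and the order of the words in the list no longer matters.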