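I have content on a website in which certain keywords and key phrases should be linked to other content. I don't want to insert the links into the content by hand. What is the best way to do this in Python, preferably without using any DOM library?
For example, I have this key phrase and this text: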
Key-phrase: awesome method
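...And this can be accomplished with the Awesome Method. Blah blah blah....
This is the desired output:
...And this can be accomplished with the <a href="/path/to/awesome-method">Awesome Method</a>. Blah blah blah....
I have a list of such key phrases and their corresponding URLs. The phrases can appear in the content in any case, but in the key-phrase definitions they are all lowercase.
At the moment I am using a combination of string find-and-replace and case-changed variants of the words. This is very inefficient.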
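Answer 0 (score: 1)
Something like:
for keyphrase, url in links:
    # re.escape keeps regex metacharacters in a phrase from being interpreted
    content = re.sub('(%s)' % re.escape(keyphrase), r'<a href="%s">\1</a>' % url, content, flags=re.IGNORECASE)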
For instance, in your example you could do:
import re
content = "...And this can be accomplished with the Awesome Method. Blah blah blah...."
links = [('awesome method', '/path/to/awesome-method')]
for keyphrase, url in links:
    # \1 re-inserts the matched text, so the original capitalization is kept
    content = re.sub('(%s)' % re.escape(keyphrase), r'<a href="%s">\1</a>' % url, content, flags=re.IGNORECASE)
# content:
# '...And this can be accomplished with the <a href="/path/to/awesome-method">Awesome Method</a>. Blah blah blah....'
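A possible refinement, offered only as a sketch: running one re.sub per phrase rescans the content for every phrase, and a later phrase can match inside a link that an earlier pass inserted. Joining all phrases into a single alternation handles them in one pass. The helper name link_keyphrases and the extra 'method' entry are hypothetical, added purely for illustration, and the dictionary keys are assumed to be lowercase, as the question states.

```python
import re

def link_keyphrases(content, links):
    """Wrap every key phrase found in content in an <a> tag, in one pass.

    links maps lowercase key phrases to URLs (hypothetical helper).
    """
    if not links:
        return content
    # Try longer phrases first so 'awesome method' wins over 'method'.
    phrases = sorted(links, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(p) for p in phrases), re.IGNORECASE)

    def replace(match):
        matched = match.group(0)
        # Look the URL up by the lowercase form; keep the original casing in the link text.
        return '<a href="%s">%s</a>' % (links[matched.lower()], matched)

    return pattern.sub(replace, content)

content = "...And this can be accomplished with the Awesome Method. Blah blah blah...."
links = {'awesome method': '/path/to/awesome-method', 'method': '/path/to/method'}
result = link_keyphrases(content, links)
print(result)
```

Because the replacement is a function rather than a template string, URLs containing backslashes are inserted as they are instead of being read as escape sequences.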
Answer 1 (score: 1)
You can iterate over the positions in the text and build the new text with basic string operations:
text = """And this can be accomplished with the Awesome Method. Blah blah blah"""
keyphrases = [
    ('awesome method', 'http://awesome.com/'),
    ('blah', 'http://blah.com')
]
new_parts = []
pos = 0
while pos < len(text):
    replaced = False
    for phrase, url in keyphrases:
        # Compare the slice starting at pos with the phrase, ignoring case
        substring = text[pos:pos+len(phrase)]
        if substring.lower() == phrase.lower():
            new_parts.append('<a href="%s">%s</a>' % (url, substring))
            pos += len(substring)
            replaced = True
            break
    if not replaced:
        # No phrase starts here: copy one character through unchanged
        new_parts.append(text[pos])
        pos += 1
new_text = ''.join(new_parts)
print(new_text)
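One thing to watch for with a plain position scan is that a phrase such as 'blah' also matches inside a longer word such as 'blahs'. The variant below is a minimal sketch, not part of the original answer, that links a phrase only when it is not surrounded by letters or digits. The function name link_phrases and the sample sentence are hypothetical.

```python
def link_phrases(text, keyphrases):
    """Scan text left to right, linking phrases only at word boundaries."""
    parts = []
    pos = 0
    while pos < len(text):
        for phrase, url in keyphrases:
            if not phrase:
                continue  # an empty phrase would never advance pos
            end = pos + len(phrase)
            # A boundary is the start/end of the text or a non-alphanumeric neighbour.
            starts_word = pos == 0 or not text[pos - 1].isalnum()
            ends_word = end >= len(text) or not text[end].isalnum()
            if starts_word and ends_word and text[pos:end].lower() == phrase.lower():
                parts.append('<a href="%s">%s</a>' % (url, text[pos:end]))
                pos = end
                break
        else:
            # No phrase starts here: copy one character through unchanged.
            parts.append(text[pos])
            pos += 1
    return ''.join(parts)

sample = "Blah blahs and the Awesome Method."
result = link_phrases(sample, [
    ('awesome method', 'http://awesome.com/'),
    ('blah', 'http://blah.com'),
])
print(result)
```

Here the standalone "Blah" and "Awesome Method" are linked, while "blahs" is left alone.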
Answer 2 (score: 0)
Take a look at the anchorman module, which turns text into hypertext.
import anchorman
text = "...And this can be accomplished with the Awesome Method. Blah blah blah...."
links = [{'awesome method': {'value': '/path/to/awesome-method'}}]
markup_format = {
    'tag': 'a',
    'value_key': 'href',
    'attributes': [
        ('class', 'anups')
    ],
    'case_sensitive': False
}
a = anchorman.add(text, links, markup_format=markup_format)
print(a)
# output:
# ...And this can be accomplished with the <a href="/path/to/awesome-method" class="anups">Awesome Method</a>. Blah blah blah....