filecontents = f.read()
from nltk.tokenize import sent_tokenize
sent_tokenize_list = sent_tokenize(filecontents)
for sentence in sent_tokenize_list:
    sentence = "Start " + sentence + " End"
    print(sentence)
The output looks like:
"start ~~~~~~ end "
"start ~~~~~ end"
"start ~~~~~ end"
But I want to join them all together into one whole string. How can I do that?
Answer 0 (score: 0)
A one-liner:
" ".join(["Start " + sentence + " End" for sentence in sent_tokenize_list])
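As a minimal runnable sketch (the sentence list here is a hypothetical stand-in for the output of `sent_tokenize`):

```python
# Hypothetical stand-in for the output of sent_tokenize
sent_tokenize_list = ["First sentence.", "Second sentence."]

# Wrap each sentence, then join the pieces with single spaces
result = " ".join(["Start " + sentence + " End" for sentence in sent_tokenize_list])
print(result)  # Start First sentence. End Start Second sentence. End
```

Joining on `" "` keeps one space between each wrapped sentence.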
Answer 1 (score: 0)
You can append each processed sentence to an accumulating result, e.g.
result = ''
for sentence in sent_tokenize_list:
    result += "Start " + sentence + " End"
print(result)
Alternatively (and, I think, in a more pythonic way) you can use a list comprehension instead of the for loop to build the list of modified sentences, then join them all together:
new_sentences = ['Start ' + sentence + ' End' for sentence in sent_tokenize_list]
result = ''.join(new_sentences)
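A self-contained sketch of the two-step version, again using a hypothetical sentence list in place of real `sent_tokenize` output; note that joining on the empty string leaves no separator between the wrapped sentences:

```python
# Hypothetical stand-in for the output of sent_tokenize
sent_tokenize_list = ["Hello there.", "Goodbye now."]

# Step 1: build the list of wrapped sentences
new_sentences = ['Start ' + sentence + ' End' for sentence in sent_tokenize_list]
# Step 2: concatenate them; '' as separator means no space between "End" and "Start"
result = ''.join(new_sentences)
print(result)  # Start Hello there. EndStart Goodbye now. End
```

Use `' '.join(...)` instead if you want a space between consecutive sentences.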
Answer 2 (score: 0)
You can use a list comprehension together with the join method on an empty string to concatenate all the sentences.
print("".join(["Start " + sentence + " End" for sentence in sent_tokenize_list]))
As mentioned in the comments, you can also use a generator expression instead, which avoids building the intermediate list:
print("".join("Start " + sentence + " End" for sentence in sent_tokenize_list))
Answer 3 (score: 0)
Try a generator expression:
result = "".join("Start {} End".format(sentence) for sentence in sent_tokenize_list)
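For illustration, here is that `str.format` pattern run on a hypothetical sentence list; the `{}` placeholder is filled with each sentence in turn:

```python
# Hypothetical stand-in for the output of sent_tokenize
sent_tokenize_list = ["One.", "Two."]

# str.format substitutes each sentence into the {} placeholder
result = "".join("Start {} End".format(sentence) for sentence in sent_tokenize_list)
print(result)  # Start One. EndStart Two. End
```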