How do I add characters to the string elements of a tuple or list?

Time: 2018-04-20 12:37:06

Tags: python string replace

I have the following sentence:

sentence = "<s> online auto body <s>" 

First I want to get its word 3-grams:

('<s>', 'online', 'auto')
('online', 'auto', 'body')
('auto', 'body', '<s>')

To do that, I used the following code:

from nltk.util import ngrams  # assumption: ngrams is NLTK's helper; the post omits the import

sentence = '<s> online auto body <s>'
n = 3
word_3grams = ngrams(sentence.split(), n)
for grams in word_3grams:
    print(grams)

Now I want to add "#" to the beginning and end of every word, like this:

('#<s>#', '#online#', '#auto#')
('#online#', '#auto#', '#body#')
('#auto#', '#body#', '#<s>#')

But I don't know how to get there. Side note: the elements here are tuples, but I don't mind using lists instead.

5 answers:

Answer 0: (score: 1)

What you want is something like a sliding-window function.

from itertools import islice

sentence = "<s> online auto body <s>"
myList = sentence.split()
myList = ['#' + word + '#' for word in myList]

slidingWindow = [islice(myList, s, None) for s in range(3)]
print(list(zip(*slidingWindow)))

# [('#<s>#', '#online#', '#auto#'), ('#online#', '#auto#', '#body#'), ('#auto#', '#body#', '#<s>#')]
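The same sliding-window idea can be wrapped in a function so that the window size is not hard-coded. This is only a minimal sketch: the name padded_ngrams and its parameters n and pad are made up for illustration and are not part of the original answer.

```python
from itertools import islice

def padded_ngrams(sentence, n=3, pad="#"):
    """Return the word n-grams of `sentence`, with `pad` wrapped around every word.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Wrap each whitespace-separated word in the padding character.
    words = [pad + word + pad for word in sentence.split()]
    # Build n staggered views of the word list; zip stops at the shortest view,
    # so only complete windows are produced.
    windows = [islice(words, start, None) for start in range(n)]
    return list(zip(*windows))

result = padded_ngrams("<s> online auto body <s>")
for grams in result:
    print(grams)
```

If the sentence has fewer than n words, zip has nothing to pair up and the function returns an empty list.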

Answer 1: (score: 0)

If you only want to change the strings, try:

# list() forces the result; on Python 3, map() alone returns a lazy iterator
list(map(lambda s: "#" + s + "#", sentence.split()))
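One thing to keep in mind on Python 3 is that map returns a lazy iterator rather than a list, so the padded words only exist once the iterator is consumed, and it can be consumed only once. A small sketch of that behaviour (the variable names lazy, padded_words and leftover are just for illustration):

```python
sentence = "<s> online auto body <s>"

# On Python 3 this is a lazy iterator; nothing has been computed yet.
lazy = map(lambda s: "#" + s + "#", sentence.split())

# Consuming the iterator produces the padded words.
padded_words = list(lazy)
print(padded_words)

# The iterator is now exhausted, so a second pass yields nothing.
leftover = list(lazy)
print(leftover)
```

Storing the result in a list (or tuple) up front avoids surprises if the padded words are needed more than once.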

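Answer 2: (score: 0)

In Python, tuples are immutable, which means they cannot be modified in place. As you more or less suggested yourself, it is better to use a list, or more precisely a list comprehension: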

aList = ['auto', 'body', '<s>']
newList = ['#' + item + '#' for item in aList]
print(newList)
# ['#auto#', '#body#', '#<s>#']
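If the result has to stay a tuple, immutability is not really an obstacle: the existing tuple cannot be changed, but a new one can be built from it. The following is a small sketch of both points; the variable names are illustrative only.

```python
grams = ("auto", "body", "<s>")

# Assigning to an element of a tuple raises TypeError.
try:
    grams[0] = "#auto#"
    mutated = True
except TypeError:
    mutated = False

# Building a new tuple from the old one works fine.
new_grams = tuple("#" + item + "#" for item in grams)

print(mutated)
print(new_grams)
```

The original tuple is left untouched; new_grams is a separate object.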

Answer 3: (score: 0)

You can do this with a list comprehension and the str.format method:
word_3grams = [('<s>', 'online', 'auto'),
               ('online', 'auto', 'body'),
               ('auto', 'body', '<s>')]

for grams in word_3grams:
    # The print(...) call form works on both Python 2 and Python 3.
    print(["{pad}{data}{pad}".format(pad='#', data=s) for s in grams])

Output:

['#<s>#', '#online#', '#auto#']
['#online#', '#auto#', '#body#']
['#auto#', '#body#', '#<s>#']

Answer 4: (score: 0)

Here is a solution that starts from scratch:

sentence = "<s> online auto body <s>" 
n = 3

# Split the sentence into words and append the '#' symbol.
words = tuple(map(lambda w: '#'+w+'#', sentence.split()))

# Create a list of elements consisting of three consecutive words.
splits = [words[i:i+n] for i in range(len(words)-(n-1))]

# Print results.
for elem in splits:
    print(elem)

Output:

('#<s>#', '#online#', '#auto#')
('#online#', '#auto#', '#body#')
('#auto#', '#body#', '#<s>#')