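I split a string into words and punctuation and then tried to put it back together with join(), but that inserts a space between each word and the punctuation that follows it.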
b = ['Hello', ',', 'who', 'are', 'you', '?']
c = " ".join(b)
But this returns:
c = 'Hello , who are you ?'
I want:
c = 'Hello, who are you?'
Answer 0 (score: 1)
You can attach the punctuation to the preceding word first:
def join_punctuation(seq, characters='.,;?!'):
    characters = set(characters)
    seq = iter(seq)
    try:
        current = next(seq)
    except StopIteration:
        # Empty input: nothing to yield.
        return
    for nxt in seq:
        if nxt in characters:
            # Attach the punctuation mark to the preceding token.
            current += nxt
        else:
            yield current
            current = nxt
    yield current

c = ' '.join(join_punctuation(b))
The join_punctuation generator yields strings with any following punctuation already attached:
>>> b = ['Hello', ',', 'who', 'are', 'you', '?']
>>> list(join_punctuation(b))
['Hello,', 'who', 'are', 'you?']
>>> ' '.join(join_punctuation(b))
'Hello, who are you?'
Answer 1 (score: 1)
Do this after you have the joined result. It is not a complete solution, but it works...
import re
c = re.sub(r' ([^A-Za-z0-9])', r'\1', c)
Output:
>>> import re
>>> c = 'Hello , who are you ?'
>>> c = re.sub(r' ([^A-Za-z0-9])', r'\1', c)
>>> c
'Hello, who are you?'
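The pattern above removes the space before any non-alphanumeric character, so it would also glue an opening bracket or quote to the word before it. A narrower sketch, assuming only the marks .,;?! should attach to the previous word (the name tighten_punctuation and its marks parameter are made up for illustration):

```python
import re

def tighten_punctuation(text, marks=".,;?!"):
    # Build a character class from the marks; re.escape keeps any
    # regex-special characters (such as . and ?) literal.
    pattern = r"\s+([" + re.escape(marks) + r"])"
    # Drop the whitespace in front of each listed mark, keep the mark itself.
    return re.sub(pattern, r"\1", text)

print(tighten_punctuation("Hello , who are you ?"))  # Hello, who are you?
print(tighten_punctuation("call ( me ) , ok ?"))     # call ( me ), ok?
```

Brackets are left alone here because they are not in the default marks; whether that is the right behaviour depends on the text being rebuilt.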
Answer 2 (score: 1)
Maybe something like this:
>>> from string import punctuation
>>> punc = set(punctuation) # or whatever special chars you want
>>> b = ['Hello', ',', 'who', 'are', 'you', '?']
>>> ''.join(w if set(w) <= punc else ' '+w for w in b).lstrip()
'Hello, who are you?'
This adds a space before each word in b that is not made up entirely of punctuation.
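The test set(w) <= punc is a subset check: it is true only when every character of w is a punctuation character. A small sketch of that behaviour, wrapped in a helper whose name (join_tokens) is hypothetical:

```python
from string import punctuation

PUNC = set(punctuation)

def join_tokens(tokens):
    # Tokens made up solely of punctuation attach directly to the text so far;
    # every other token gets a leading space.
    return "".join(w if set(w) <= PUNC else " " + w for w in tokens).lstrip()

print(set("?!") <= PUNC)     # True: both characters are punctuation
print(set("you?") <= PUNC)   # False: the token also contains letters
print(join_tokens(["Wait", "...", "what", "?!"]))  # Wait... what?!
```

Because string.punctuation also contains opening brackets and quotes, a token such as ( would be attached to the preceding word as well; passing a narrower set is one way around that.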
Answer 3 (score: 0)
How about:
c = " ".join(b).replace(" ,", ",")
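As written, this only fixes commas; for the example list the ? would still have a space in front of it. A sketch that repeats the same replace for each mark, assuming the set .,;?! used as the default in the first answer:

```python
b = ['Hello', ',', 'who', 'are', 'you', '?']

c = " ".join(b)
for mark in ".,;?!":
    # Drop the single space that join() put in front of this mark.
    c = c.replace(" " + mark, mark)

print(c)  # Hello, who are you?
```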