How do I remove whitespace inside a word using Python?

Asked: 2019-04-26 09:04:24

Tags: python-3.x text-mining spacy removing-whitespace textacy

Given the input John plays chess and l u d o., I want the output in the format shown below:

John plays chess and ludo.

I tried removing the spaces with a regular expression, but it does not work for me.

import re
sentence='John plays chess and l u d o'
sentence = re.sub(r"\s+", "", sentence, flags=re.UNICODE)

print(sentence)

I expect the output John plays chess and ludo.
but the output I get is Johnplayschessandludo
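
The result follows from what the pattern matches: `\s+` matches every run of whitespace in the string, including the spaces between whole words, so nothing tells it to touch only the spaces inside "l u d o". A minimal sketch that makes this visible, using the same sentence as above:

```python
import re

sentence = 'John plays chess and l u d o'

# \s+ matches every run of whitespace, not only the runs inside "l u d o".
matches = re.findall(r"\s+", sentence)
print(len(matches))  # 7 matches, one per space in the sentence

# Replacing all of them with "" therefore also glues the whole words together.
stripped = re.sub(r"\s+", "", sentence)
print(stripped)  # Johnplayschessandludo
```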

2 Answers:

Answer 0 (score: 2)

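This should work! Essentially, the solution pulls the single characters out of the sentence, joins them into a word, and then appends that word to the rest of the sentence.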

s = 'John plays chess and l u d o'

chars = []
# Start at 1 so that s[idx-1] never wraps around to the end of the string
idx = 1

# Collect the letters of the word that was split into single characters
while idx < len(s)-1:

    # Pick up a single letter that has a space on both sides
    if s[idx-1] == ' ' and s[idx].isalpha() and s[idx+1] == ' ':
        chars.append(s[idx])

    idx+=1

# Pick up the final single letter if the string ends with one
if s[len(s)-2] == ' ' and s[len(s)-1].isalpha():
    chars.append(s[len(s)-1])

# Build the word out of the single characters
join_word = ''.join(chars)

# Keep the remaining words (everything longer than one character)
old_words = [item for item in s.split() if len(item) > 1]

# Form the final string; the rebuilt word always goes at the end
res = ' '.join(old_words + [join_word])

print(res)

The output will look like this:

John plays chess and ludo

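Answer 1 (score: 0)

The code above does not preserve the order of the words. For example, try it with a sentence in which the spaced-out word is not at the end, such as "John plays c h e s s and ludo".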
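
To make the ordering problem concrete, here is a minimal sketch that condenses the first answer's logic into a function. The name `join_single_chars_at_end` is hypothetical, and the body is an approximation of that answer rather than a copy of it:

```python
def join_single_chars_at_end(s):
    # Condensed approximation of the first answer: join all single-letter
    # tokens into one word and append it after the remaining words.
    tokens = s.split()
    joined = ''.join(t for t in tokens if len(t) == 1 and t.isalpha())
    others = [t for t in tokens if len(t) > 1]
    return ' '.join(others + [joined])

result = join_single_chars_at_end("John plays c h e s s and ludo")
print(result)  # John plays and ludo chess
```

The rebuilt word "chess" ends up at the end of the sentence instead of in its original position.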

If your text contains a single spaced-out word at any position, try this:

sentence = "John plays c h e s s and ludo"
sentence_list = sentence.split()
index = [index for index, item in enumerate(sentence_list) if len(item) == 1]
join_word = "".join([item for item in sentence_list if len(item) == 1])
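if index:
    # Remove all but one of the single-character tokens, then put the
    # joined word into the slot that is left (assumes they are contiguous)
    del sentence_list[index[0]:index[0] + len(index) - 1]
    sentence_list[index[0]] = join_word
    sentence = " ".join(sentence_list)

print(sentence)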
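
As an alternative that does not depend on where the spaced-out word sits, or on there being only one such word, a regular expression with lookarounds may be enough. This is a minimal sketch, not taken from either answer: the names `SPACED_LETTER_GAP` and `join_spaced_letters` are hypothetical. It removes a space only when the token on each side of it is a single word character.

```python
import re

# Hypothetical names, introduced for illustration only.
# Matches one space whose left neighbour and right neighbour are each
# a single word character standing alone as a token.
SPACED_LETTER_GAP = re.compile(r"(?<=\b\w) (?=\w\b)")

def join_spaced_letters(text):
    """Remove the spaces between consecutive single-character tokens."""
    return SPACED_LETTER_GAP.sub("", text)

print(join_spaced_letters("John plays chess and l u d o."))  # John plays chess and ludo.
print(join_spaced_letters("John plays c h e s s and ludo"))  # John plays chess and ludo
```

Because `\w` also matches digits and underscores, a run such as "1 2 3" would be merged too, and two genuine one-letter words that happen to be adjacent (for example "I" followed by "a") would be joined as well. Whether that is acceptable depends on the text being cleaned.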