Split a sentence on capital letters

Asked: 2018-03-16 20:28:50

Tags: python string split

How can I split this string?

'Symptoms may include:Absent or small knucklesCleft palateDecreased skin creases at finger jointsDeformed earsDroopy eyelidsInability to fully extend the joints from birth (contracture deformity)Narrow shouldersPale skinTriple-jointed thumbs'

The desired output should look like this:

Symptoms may include:
Absent or small knuckles
Cleft palate
Decreased skin creases at finger joints
Deformed ears
Droopy eyelids
Inability to fully extend the joints from birth (contracture deformity)
Narrow shoulders
Pale skin
Triple-jointed thumbs

In other words, split at each capital letter.

3 answers:

Answer 0 (score: 6)

Use re.findall (pattern improved thanks to @Brendan Abel and @JFF):

import re

fragments = re.findall(r'[A-Z][^A-Z]*', text)

print(fragments)
['Symptoms may include:',
 'Absent or small knuckles',
 'Cleft palate',
 'Decreased skin creases at finger joints',
 'Deformed ears',
 'Droopy eyelids',
 'Inability to fully extend the joints from birth (contracture deformity)',
 'Narrow shoulders',
 'Pale skin',
 'Triple-jointed thumbs']

Details

[A-Z]      # match must begin with an uppercase char
[^A-Z]*    # further characters in match must not contain an uppercase char

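Note: the * lets you capture fragments that consist of a single uppercase character. If that is not the behavior you want, replace it with +.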
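To make the difference between * and + concrete, here is a small sketch. The sample string is made up for illustration and does not come from the question.

```python
import re

sample = "AbcDEfg"  # hypothetical input with two adjacent capitals

# '*' allows a fragment made of a lone capital letter
with_star = re.findall(r"[A-Z][^A-Z]*", sample)
# '+' requires at least one non-capital character after the capital
with_plus = re.findall(r"[A-Z][^A-Z]+", sample)

print(with_star)  # ['Abc', 'D', 'Efg']
print(with_plus)  # ['Abc', 'Efg']
```

Note that with + the lone D is dropped from the result entirely; it is not attached to a neighbouring fragment.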

Also, if you want the output as a multi-line string:

print('\n'.join(fragments))

Answer 1 (score: 2)

>>> s = 'Symptoms may include:Absent or small knucklesCleft palateDecreased skin creases at finger jointsDeformed earsDroopy eyelidsInability to fully extend the joints from birth (contracture deformity)Narrow shouldersPale skinTriple-jointed thumbs'
>>> print(''.join(('\n' + c if c.isupper() else c) for c in s)[1:])
Symptoms may include:
Absent or small knuckles
Cleft palate
Decreased skin creases at finger joints
Deformed ears
Droopy eyelids
Inability to fully extend the joints from birth (contracture deformity)
Narrow shoulders
Pale skin
Triple-jointed thumbs

How it works

  • (('\n' + c if c.isupper() else c) for c in s)

    The above yields each character c of the string s unchanged, unless c is uppercase, in which case it prepends a newline to that character.

  • ''.join(('\n' + c if c.isupper() else c) for c in s)

    This joins the pieces back into a single string.

  • ''.join(('\n' + c if c.isupper() else c) for c in s)[1:]

    This removes the extra newline from the start of the string.
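The [1:] slice assumes the string starts with an uppercase letter, as it does in the question; otherwise it would cut off the first real character. A hedged variant that does not rely on this is sketched below. The function name break_before_capitals is hypothetical and not part of the original answer.

```python
def break_before_capitals(s):
    """Insert a newline before every uppercase character in s."""
    joined = "".join("\n" + c if c.isupper() else c for c in s)
    # Drop the leading newline only when the first character was uppercase,
    # instead of unconditionally slicing off the first character.
    return joined[1:] if s[:1].isupper() else joined


print(break_before_capitals("Pale skinTriple-jointed thumbs"))
print(break_before_capitals("pale skinTriple-jointed thumbs"))
```

Both calls print two lines; the second keeps its lowercase first letter intact.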

Answer 2 (score: -1)

I think the following code may be of interest:

import re
output = re.sub( r"([A-Z])", r"\n\1", inputString)
print(output)

You can also store the result back in a list by splitting on every \n:
outputList = output.split('\n')[1:]

This first replaces every capital letter with \n followed by that same capital letter.
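For reference, here is a self-contained sketch of this approach. The helper name split_on_capitals and the shortened sample string are illustrative assumptions, and the sketch assumes the input contains no newlines of its own. It filters out empty pieces instead of slicing with [1:], so it also works when the input does not start with a capital letter.

```python
import re


def split_on_capitals(input_string):
    # Put a newline in front of every capital letter ...
    marked = re.sub(r"([A-Z])", r"\n\1", input_string)
    # ... then split on newlines, discarding the empty piece that appears
    # when the string starts with a capital letter.
    return [piece for piece in marked.split("\n") if piece]


print(split_on_capitals("Narrow shouldersPale skinTriple-jointed thumbs"))
# ['Narrow shoulders', 'Pale skin', 'Triple-jointed thumbs']
```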