This is the list of strings I have:
[
['It', 'was', 'the', 'besst', 'of', 'times,'],
['it', 'was', 'teh', 'worst', 'of', 'times']
]
I need to split the punctuation off times, so that it becomes 'times', ','
Or, as another example, if I have Why?!? I need it to become 'Why', '?!?'
import string
def punctuation(string):
    for word in string:
        if word contains (string.punctuation):
            word.split()
I know this isn't Python at all! But it shows what I want.
Answer 0 (score: 3)
You can use finditer, which works even when the string is more complex.
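>>> import re, string
>>> # re.escape keeps ']', '\' and '^' literal inside the character class
>>> r = re.compile(r"(\w+)([" + re.escape(string.punctuation) + r"]*)")
>>> s = 'Why?!?Why?*Why'
>>> [x.groups() for x in r.finditer(s)]
[('Why', '?!?'), ('Why', '?*'), ('Why', '')]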
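To apply the same pattern to the nested list from the question, something along these lines might work. This is only a sketch: the names `TOKEN`, `split_punctuation`, `sentences` and `split_sentences` are made up for illustration, and it assumes each word should be flattened into its text followed by any punctuation run.

```python
import re
import string

# Word characters followed by an optional run of punctuation.
# re.escape keeps characters such as ']' and '\' literal inside the class.
TOKEN = re.compile(r"(\w+)([" + re.escape(string.punctuation) + r"]*)")

def split_punctuation(words):
    """Return a flat list in which each punctuation run is its own item."""
    result = []
    for word in words:
        for match in TOKEN.finditer(word):
            text, punct = match.groups()
            result.append(text)
            if punct:  # skip the empty group when there is no punctuation
                result.append(punct)
    return result

sentences = [
    ['It', 'was', 'the', 'besst', 'of', 'times,'],
    ['it', 'was', 'teh', 'worst', 'of', 'times'],
]
split_sentences = [split_punctuation(sentence) for sentence in sentences]
print(split_sentences)
```

Note that a word made up only of punctuation produces no match with this pattern and would be dropped, and that `\w` also matches the underscore.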
Answer 1 (score: 1)
You can use a regular expression, for example:
In [1]: import re
In [2]: re.findall(r'(\w+)(\W+)', 'times,')
Out[2]: [('times', ',')]
In [3]: re.findall(r'(\w+)(\W+)', 'why?!?')
Out[3]: [('why', '?!?')]
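One thing to watch with `(\w+)(\W+)`: because `\W+` needs at least one non-word character, a word without punctuation returns an empty list. A possible tweak, assuming you want such words kept, is to use `\W*` instead. The helper name `split_word` is made up for this sketch.

```python
import re

def split_word(word):
    # \W* (zero or more) instead of \W+, so a word without
    # punctuation still matches, with an empty second group.
    return re.findall(r'(\w+)(\W*)', word)

print(split_word('times,'))  # [('times', ',')]
print(split_word('times'))   # [('times', '')]
```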
Answer 2 (score: 0)
Something like this? (Assuming the punctuation always comes at the end.)
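def lcheck(word):
    # Split at the first non-alphabetic character
    for i, letter in enumerate(word):
        if not letter.isalpha():
            # word[:i], not word[0:(i-1)], which would drop the last letter
            return [word[:i], word[i:]]
    return [word]

value = 'times,'
print(lcheck(value))  # ['times', ',']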
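Under the same assumption (punctuation only at the end), `str.rstrip` with `string.punctuation` may be a simpler way to find the split point. This is a rough sketch, and `split_trailing` is a hypothetical name.

```python
import string

def split_trailing(word):
    """Split trailing punctuation off a word, if there is any."""
    core = word.rstrip(string.punctuation)  # the word without trailing punctuation
    tail = word[len(core):]                 # whatever rstrip removed
    # Leave the word alone if nothing was stripped or if it was all punctuation.
    return [core, tail] if core and tail else [word]

print(split_trailing('times,'))  # ['times', ',']
print(split_trailing('Why?!?'))  # ['Why', '?!?']
print(split_trailing("don't"))   # ["don't"]
```

Because only the end of the word is stripped, an apostrophe inside a word such as "don't" is left alone.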
Answer 3 (score: 0)
A generator-based solution without regular expressions:
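import string
from itertools import takewhile, dropwhile

def splitp(s):
    # s is an iterable of words; letters and apostrophes count as non-punctuation
    not_punc = lambda c: c in string.ascii_letters + "'"  # won't split "don't"
    for w in s:
        punc = ''.join(dropwhile(not_punc, w))
        if punc:
            yield ''.join(takewhile(not_punc, w))
            yield punc
        else:
            yield w

list(splitp(s))  # s: a list of words, such as one inner list from the question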
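If punctuation can also appear in the middle of a word, as in 'Why?!?Why?*Why', a different regex-free option is itertools.groupby, which groups consecutive characters by whether they are punctuation. This is a variation rather than the approach above, and `split_runs` is a made-up name.

```python
import string
from itertools import groupby

def split_runs(word):
    """Split a word into alternating runs of punctuation and non-punctuation."""
    is_punc = lambda c: c in string.punctuation
    return [''.join(run) for _, run in groupby(word, key=is_punc)]

print(split_runs('Why?!?Why?*Why'))  # ['Why', '?!?', 'Why', '?*', 'Why']
print(split_runs('times,'))          # ['times', ',']
```

Unlike splitp, this treats the apostrophe as punctuation, so "don't" would come out as three pieces.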