How do I remove a single leading space from a paragraph and capitalize the paragraph's first letter in Python?
Input:
this is a sample sentence. This is a sample second sentence.
Output:
This is a sample sentence. This is a sample second sentence.
My attempt so far:
import spacy, re
nlp = spacy.load('en_core_web_sm')
doc = nlp(unicode(open('2.txt').read().decode('utf8')) )
tagged_sent = [(w.text, w.tag_) for w in doc]
normalized_sent = [w.capitalize() if t in ["NN","NNS"] else w for (w,t) in tagged_sent]
normalized_sent1 = normalized_sent[0].capitalize()
string = re.sub(" (?=[\.,'!?:;])", "", ' '.join(normalized_sent1))
rtn = re.split('([.!?] *)', string)
final = ''.join([i.capitalize() for i in rtn])
print final
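With this code, the first word of every sentence in the paragraph is capitalized except the one at the very start of the paragraph. How can I fix that?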
Output:
on the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look.
Expected output:
On the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look.
Answer 0 (score: 2)
You can use a regular expression together with str.capitalize():
import re
s = " this is a sample sentence. This is a sample second sentence."
# Strip leading whitespace, split on ". ", capitalize each sentence, then rejoin.
new_s = '. '.join(i.capitalize() for i in re.split(r'\.\s', re.sub(r'^\s+', '', s)))
Output:
'This is a sample sentence. This is a sample second sentence.'
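One caveat with the approach above: str.capitalize() also lowercases every character after the first, so a sentence containing an acronym or a proper noun loses its capitals. A minimal sketch of a variant that only uppercases the first letter of each sentence is below. The function name capitalize_sentences is made up for illustration, and the pattern assumes sentences end in ., ! or ? followed by whitespace, so abbreviations such as "e.g." will be treated as sentence ends.

```python
import re

def capitalize_sentences(text):
    # Hypothetical helper, not part of the original answer.
    # Drop leading whitespace, then uppercase a lowercase letter that is
    # either the first character or follows ., ! or ? plus whitespace.
    text = text.lstrip()
    return re.sub(
        r'(^|[.!?]\s+)([a-z])',
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )

print(capitalize_sentences(" this is NASA data. it is a sample second sentence."))
# This is NASA data. It is a sample second sentence.
```

Everything the pattern does not match is left untouched, which is why "NASA" keeps its capitals here.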
Answer 1 (score: 1)
A simple solution (though I recommend @Ajax's answer):
x = 'on the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look. '
print( '. '.join(map(lambda s: s.strip().capitalize(), x.split('.'))))
Output:
On the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look.
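Note that x.split('.') splits on every period, so a number like "2.5" would come out as "2. 5". If that matters for your text, here is a rough sketch that keeps the same split-and-join shape but only splits where whitespace follows the sentence terminator. The name capitalize_by_split is hypothetical, and treating ., ! and ? as the only terminators is an assumption.

```python
import re

def capitalize_by_split(text):
    # Hypothetical helper, not part of the original answer.
    # Split only where whitespace follows ., ! or ?, so "2.5" stays intact.
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    # Uppercase just the first character of each piece; leave the rest as-is.
    return ' '.join(p[:1].upper() + p[1:] for p in parts)

print(capitalize_by_split(' version 2.5 is out. you can use these galleries. '))
# Version 2.5 is out. You can use these galleries.
```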
Answer 2 (score: 1)
If your requirement is only to remove the leading space and then capitalize the first letter, you can try something like this:
your_data=' on the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. you can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. when you create pictures, charts, or diagrams, they also coordinate with your current document look. '
conversion=list(your_data)
if conversion[0]==' ':
    del conversion[0]
capitalize="".join(conversion).split()
for j,i in enumerate(capitalize):
    try:
        if j==0:
            capitalize[j]=capitalize[j].capitalize()
        if '.' in i:
            capitalize[j + 1] = capitalize[j + 1].capitalize()
    except IndexError:
        pass
print(" ".join(capitalize))
Output:
On the insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look.
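The same word-by-word idea can be written without the index lookahead and the try/except. Since str.split() with no arguments already discards leading and trailing whitespace, the explicit deletion of the first space is not needed either. This is only a sketch: the name capitalize_after_periods is made up, and unlike the code above it checks word.endswith('.') instead of '.' in word and leaves the rest of each word's casing alone.

```python
def capitalize_after_periods(text):
    # Hypothetical helper, not part of the original answer.
    words = text.split()        # also drops leading/trailing whitespace
    capitalize_next = True      # the first word always starts a sentence
    for idx, word in enumerate(words):
        if capitalize_next:
            words[idx] = word[:1].upper() + word[1:]
        # A word ending in "." closes a sentence, so the next word opens one.
        capitalize_next = word.endswith('.')
    return ' '.join(words)

print(capitalize_after_periods(' on the insert tab. you can use these. '))
# On the insert tab. You can use these.
```

Like the original loop, this collapses any run of whitespace into a single space when the words are joined back together.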