Question

Given a string such as "The organization for health, safety and education", how can I get:

Required_Output = OHSE

The output should be a string made of the first letters (uppercased) of the words whose length is greater than 3.

Answer 1

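Here is one way using a generator expression: split the string first, then take the upper-cased first character of every word whose length is greater than 3: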

s = "The organization for health, safety and education"
''.join(i[0].upper() for i in s.split() if len(i) > 3)
# 'OHSE'
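One caveat with the split-based approach is that punctuation stays attached to the words, so "health," is counted as 7 characters and a token like "(help)," would contribute "(" as its first character. The following is a minimal sketch of a punctuation-aware variant; the function name acronym and the parameter min_len are hypothetical names introduced here for illustration, not part of the original answer.

```python
import string

def acronym(text, min_len=4):
    """Join the upper-cased first letters of words with at least min_len characters."""
    initials = []
    for word in text.split():
        # Strip leading/trailing punctuation so "health," counts as 6 characters, not 7.
        core = word.strip(string.punctuation)
        if len(core) >= min_len:
            initials.append(core[0].upper())
    return "".join(initials)

print(acronym("The organization for health, safety and education"))  # OHSE
print(acronym("Ask for (help), and go."))  # H
```

Note that min_len=4 corresponds to "length greater than 3" in the question.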

As @tobias_k points out, though, a better option may be to exclude words using a stop word list rather than a length threshold. For that you can use nltk.corpus.stopwords:

# Requires the corpus to be downloaded once: nltk.download('stopwords')
from nltk.corpus import stopwords
stop_words = set(stopwords.words('english'))
# {'but', 'wasn', 'during', 'does', 'very', 'at',...

Now change the expression above to:

''.join(i[0].upper() for i in s.split() if i.lower() not in stop_words)
# 'OHSE'
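If installing NLTK is not an option, the same idea can be sketched with a hand-written stop word set. The set below is a tiny hypothetical sample chosen for this example (a real list such as NLTK's is much longer), and the function name is likewise made up for illustration. The second call shows where the two approaches differ: a length filter would drop "Art", while a stop word filter keeps it.

```python
# Hypothetical, hand-picked stop words; a real list (e.g. NLTK's) is much longer.
STOP_WORDS = {"the", "for", "and", "of", "a", "an", "in", "to"}

def acronym_without_stop_words(text, stop_words=STOP_WORDS):
    """Join the upper-cased first letters of all words that are not stop words."""
    return "".join(
        word[0].upper()
        for word in text.split()
        if word.lower() not in stop_words
    )

s = "The organization for health, safety and education"
print(acronym_without_stop_words(s))  # OHSE
print(acronym_without_stop_words("Museum of Modern Art"))  # MMA
```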

Answer 2

This can also be done with a regular expression (the re module):

import re
txt = "The organization for health, safety and education"
letters = re.findall(r'([A-Za-z])[A-Za-z]{3,}', txt)
output = ''.join(letters).upper()
print(output)  # OHSE

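The pattern I used grabs the first letter of every substring made of 4 or more letters: 1 letter inside the only capture group, followed by 3 or more letters outside it.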
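To make the role of the capture group more concrete, here is a small sketch of the same pattern, compiled once under the hypothetical name PATTERN. It relies on the documented behavior that re.findall returns only the group's text when the pattern contains exactly one group, and uses re.finditer to show the full match next to the captured letter.

```python
import re

# One capture group for the first letter, followed by at least three more letters.
PATTERN = re.compile(r"([A-Za-z])[A-Za-z]{3,}")

txt = "The organization for health, safety and education"

# With exactly one group, findall returns the group's text for each match.
print(PATTERN.findall(txt))  # ['o', 'h', 's', 'e']

# finditer exposes both the whole match and the captured first letter.
for m in PATTERN.finditer(txt):
    print(m.group(0), "->", m.group(1).upper())
```

Because the pattern only matches runs of letters, the comma after "health" does not need any special handling here.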

Answer 3

This one-liner should do the trick.

# Named "text" rather than "input" to avoid shadowing the built-in input()
text = 'The organization for health, safety and education'

print(''.join(map(lambda y: y[0].upper(), filter(lambda x: len(x) > 3, text.split()))))
