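Is there a way to group words of a given length in Python?
I started working on this function:
def length_words(a, b, text):
    returnlist = []
In the returned list I want the words whose length satisfies:
a <= length <= b
So I was thinking: I know there is the splitlines() method, but I don't understand how to use it (even after reading about it).
Here is an example of how the function should work: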
function(6,7,'All in the golden afternoon\nFull leisurely we glide;\nFor both our oars, with little skill,\nBy little arms are plied.')
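The function should separate the lines:
All in the golden afternoon
Full leisurely we glide;
For both our oars, with little skill,
By little arms are plied.
-> remove the punctuation and return: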
['golden','','little','little']
I know I have to append these words to the return list, but I don't know how to proceed.
Answer 0 (score: 0)
You can write a list comprehension like this:
[token for token in s.split(" ") if a <= len(token) <= b]
It returns all the words in the variable s (str) whose length in characters is between a (int) and b (int), inclusive. An example of how to use it:
s = 'All in the golden afternoon\nFull leisurely we glide;'
s += '\nFor both our oars, with little skill,\nBy little arms are plied.'
a = 6
b = 7
result = [token for token in s.split(" ") if a <= len(token) <= b]
The result will be:
['golden', 'little', 'little', 'plied.']
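One caveat, shown here as a minimal sketch: `split(" ")` splits on the space character only, so in a multi-line string the last word of one line and the first word of the next stay glued together by the `\n`. Calling `split()` with no argument splits on any run of whitespace, newlines included. The variable names `by_space` and `by_whitespace` are only for illustration.

```python
s = 'All in the golden afternoon\nFull leisurely we glide;'

# split(" ") splits on single spaces only, so the newline stays inside a token
by_space = s.split(" ")

# split() with no argument splits on any whitespace, including "\n"
by_whitespace = s.split()

print(by_space)       # contains the single token 'afternoon\nFull'
print(by_whitespace)  # contains 'afternoon' and 'Full' as separate tokens
```

For the example in this thread the difference does not change the result, but it would if a word of the wanted length sat at the start or end of a line.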
To get rid of the punctuation, just add
import string
s = "".join([char for char in s if char not in string.punctuation])
above the last line. The result is:
['golden', 'little', 'little']
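As an alternative sketch for the punctuation step, `str.maketrans` and `str.translate` from the standard library can delete the same characters in one pass. The names `table` and `cleaned` are made up for this example, and `split()` is used here instead of `split(" ")` so that newlines also separate words.

```python
import string

s = 'For both our oars, with little skill,\nBy little arms are plied.'

# Build a translation table that deletes every ASCII punctuation character
table = str.maketrans('', '', string.punctuation)
cleaned = s.translate(table)

# Keep only the words whose length is between 6 and 7, inclusive
result = [token for token in cleaned.split() if 6 <= len(token) <= 7]
print(result)  # ['little', 'little']
```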
Hope this works for you!
Edit:
If you want to search each line separately, I would suggest a solution like this:
import string

def split_by_line_and_find_words_with_length(a, b, s):
    # store result
    result = []
    # separate string lines
    lines = s.splitlines()
    for line in lines:
        # remove punctuation
        cleaned = "".join([char for char in line if char not in string.punctuation])
        # find words with length between a and b (inclusive)
        find = [token for token in cleaned.split(" ") if a <= len(token) <= b]
        # add empty string to result if no match
        if not find:
            find.append("")
        # add any findings to result
        result += find
    return result
With your example string and preferred word lengths, this returns ['golden', '', 'little', 'little']
Answer 1 (score: 0)
You were on the right track when you thought about ranges. Here is how I would write the function.
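1. Create a function whose range parameters are start and stop, and whose target sentence is sentence.
2. Create a list to hold the matching words, word_list.
3. Use .splitlines() to split the sentence and iterate over each line in it.
4. For each line, run tmp = [word for word in line.split() if start <= len(word) <= stop]. This assigns the result of the list comprehension to a list named tmp.
5. If the length of tmp is greater than 1, join the words in tmp with a space and add the joined string to word_list.
6. If the tmp list is only one element long, add that word to word_list.
7. Otherwise, add an empty string to word_list.
8. Return word_list.
Using the steps above, here is how I would write your function: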
# import the string module so we can use its punctuation attribute.
import string

# create a function with the parameters `start`, `stop` and `sentence`
# `start` and `stop` are for the range, and `sentence` is the
# target sentence to iterate over.
def group_words_by_length(start: int, stop: int, sentence: str) -> list:
    # create a list to hold words that
    # are in the given `start`-`stop` range
    word_list = []
    # iterate over each line in the sentence
    # using the string method `.splitlines()`
    # which splits the string at every new line
    for line in sentence.splitlines():
        # filter out punctuation from
        # every line.
        line = ''.join([char for char in line if char not in string.punctuation])
        # iterate over every word in each line
        # via list comprehension. Inside the list comprehension
        # we only add a word if it is in the given range.
        tmp = [word for word in line.split() if start <= len(word) <= stop]
        # if we found more than one valid word
        # in the current line...
        if len(tmp) > 1:
            # join each word in the
            # list by a space, and add
            # the joined string to the `word_list`.
            tmp = ' '.join(tmp)
            word_list.append(tmp)
        # if we found only
        # one valid word...
        elif len(tmp) == 1:
            # simply add the word
            # to the `word_list`.
            word_list.extend(tmp)
        # otherwise...
        else:
            # add an empty string to the
            # `word_list`.
            word_list.append("")
    # return the `word_list`
    return word_list

# testing of the function with
# your test string.
print(group_words_by_length(6, 7, 'All in the golden afternoon\nFull leisurely we glide;\nFor both our oars, with little skill,\nBy little arms are plied.'))
Output:
['golden', '', 'little', 'little']
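If a shorter version is preferred, the three cases (several matches, one match, no match) can probably be folded into a single `' '.join(...)`, since joining an empty list gives `''` and joining a one-element list gives that element unchanged. The sketch below assumes that behaviour is what you want; `words_in_range` and `poem` are hypothetical names used only for this example.

```python
import string

def words_in_range(start, stop, text):
    """Hypothetical helper: return one entry per line of `text`."""
    # translation table that deletes ASCII punctuation
    table = str.maketrans('', '', string.punctuation)
    result = []
    for line in text.splitlines():
        words = [w for w in line.translate(table).split()
                 if start <= len(w) <= stop]
        # ' '.join gives '' for no match, the word itself for one match,
        # and a space-separated string for several matches
        result.append(' '.join(words))
    return result

poem = ('All in the golden afternoon\nFull leisurely we glide;\n'
        'For both our oars, with little skill,\nBy little arms are plied.')
print(words_in_range(6, 7, poem))  # ['golden', '', 'little', 'little']
```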