Given a text file "words.txt", use a list comprehension to read in all the words in the file and find all the words that contain at least 2 vowels.
So, I have a text file:
The quick brown fox jumps over the lazy dog
And my best attempt at getting all the words, as well as all the words that contain two or more vowels, is:
#This could be hardcoded in, but for the sake of simplicity (as simple as simplicity gets)
vowels = ["a","e","i","o","u"]
filename = "words.txt"
words = [word for word in open(filename, "r").read().split()]
multivowels = [each for each in open(filename, "r").read().split() if sum(letter in vowels for letter in each) >= 2]
The output should look like:
All words in the file: ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
The words in the file that contain 2 or more vowels: ['quick', 'over']
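I tried to put this on one line simply by printing the list comprehensions for "words" and "multivowels" together with the "All words in the file" labels, and so on.
Would anyone take on the challenge of combining these two list comprehensions into one? My teammate and I are stumped, but we would love to show it to our professor!
For reference, my final one-liner is: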
print "All words in the file: " + str([word for word in open(filename, "r").read().split()]) + "\nAll words with more than 2 vowels: " + str([each for each in open(filename, "r").read().split() if sum(letter in vowels for letter in each) >= 2])
Edit: I am trying to get all the words in the file, as well as all the words that contain two or more vowels.
vowels = ["a", "e", "i", "o", "u"]
filename = "words.txt"
print [(word, each) for word in open(filename, "r").read().split() if sum([1 for each in word if each in vowels]) >= 2]
Answer 0 (score: 1)
There are some corner cases to handle here, but if you assume a simple text file:
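import re
vowels = "a","e","i","o","u"
filename = "words.txt"  # same file as in the question
# One inner list per line of the file; non-word characters become spaces before splitting
answer = [[word for word in re.sub(r"[^\w]", " ", sentence).split() if (sum(1 for letter in word if letter in vowels)>=2)] for sentence in open(filename,"r").readlines()]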
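The corner cases mentioned above include uppercase vowels (a word such as "Over" has two vowels, but only one is lowercase) and punctuation attached to words. Here is a minimal sketch of one way to handle both, assuming ASCII text and treating only a, e, i, o, u as vowels. The helper name `multivowel_words` and the sample lines are made up for illustration, and it returns one flat list rather than one list per line:

```python
import re

VOWELS = set("aeiou")

def multivowel_words(lines, minimum=2):
    """Return words with at least `minimum` vowels, ignoring case and punctuation."""
    return [
        word
        for line in lines
        for word in re.findall(r"[A-Za-z']+", line)  # letters and apostrophes only
        if sum(1 for letter in word.lower() if letter in VOWELS) >= minimum
    ]

# Hypothetical sample input standing in for the lines of a file.
sample_lines = ["The quick brown fox jumps over the lazy dog.", "Over AND out!"]
result = multivowel_words(sample_lines)
print(result)
```

With a real file, the lines could be supplied by passing an open file object to the function inside a `with open(filename) as f:` block.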
Answer 1 (score: 0)
I ran a slightly different version on a larger data set and found that the repeated calls to str start to add up.
vowels = ['a', 'e', 'i', 'o', 'u']
#filename = 'vowelcount.txt'
filename = 'largetextfile.txt'
print "All words in the file: ", [w for w in open(filename).read().split()], "\n", "All words with more than 2 vowels: ", [w for w in open(filename).read().split() if sum(1 for l in w if l in vowels) > 1]
Running this version under cProfile shows a small improvement:
python -m cProfile vowelcount1.py
         7839 function calls in 0.023 seconds
   Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.005    0.005    0.023    0.023 vowelcount1.py:1(<module>)
     6045    0.009    0.000    0.009    0.000 vowelcount1.py:4(<genexpr>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {method 'read' of 'file' objects}
        2    0.000    0.000    0.000    0.000 {method 'split' of 'str' objects}
        2    0.000    0.000    0.000    0.000 {open}
     1786    0.009    0.000    0.018    0.000 {sum}
The code you posted makes almost twice as many function calls:
#filename = 'vowelcount.txt'
filename = 'largetextfile.txt'
vowels = ['a', 'e', 'i', 'o', 'u']
print "All words in the file: " + str([word for word in open(filename, "r").read().split()]) + "\nAll words with more than 2 vowels: " + str([each for each in open(filename, "r").read().split() if sum(letter in vowels for letter in each) >= 2])
python -m cProfile vowelcount2.py
         14568 function calls in 0.036 seconds
   Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.004    0.004    0.036    0.036 vowelcount2.py:2(<module>)
    12774    0.016    0.000    0.016    0.000 vowelcount2.py:4(<genexpr>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {method 'read' of 'file' objects}
        2    0.000    0.000    0.000    0.000 {method 'split' of 'str' objects}
        2    0.000    0.000    0.000    0.000 {open}
     1786    0.016    0.000    0.032    0.000 {sum}
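Reading the two profiles side by side, the extra calls all appear in the `<genexpr>` row (12774 versus 6045), while `sum`, `open`, `read` and `split` are called the same number of times in both. A likely explanation is that `sum(1 for l in w if l in vowels)` resumes its generator only for vowels, whereas `sum(letter in vowels for letter in each)` resumes it for every letter. The sketch below counts how many values each style produces. It uses the question's sample sentence, not the profiled file, so the numbers will not match the profiles above:

```python
vowels = ['a', 'e', 'i', 'o', 'u']
words = "The quick brown fox jumps over the lazy dog".split()

# Style from the first script: a value is produced only when a letter is a vowel.
yields_filtered = sum(len([1 for l in w if l in vowels]) for w in words)

# Style from the second script: True or False is produced for every letter.
yields_unfiltered = sum(len([l in vowels for l in w]) for w in words)

# Both styles still compute the same vowel count for each word.
same_counts = all(
    sum(1 for l in w if l in vowels) == sum(l in vowels for l in w)
    for w in words
)

print(yields_filtered, yields_unfiltered, same_counts)
```

Each generator is also resumed one final time when it is exhausted, which would account for one extra `<genexpr>` call per word in both profiles.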
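As everyone has already made clear, this is not how you want to write code that someone else has to read. Although I admit I also enjoy seeing how much I can cram into a single Python list comprehension :D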
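In the spirit of the one-liner challenge, here is a sketch of a single list comprehension that yields both lists while splitting the text only once. It is written for Python 3, uses a string in place of `open(filename).read()` so it runs on its own, and the `minimum` loop variable is an illustrative trick rather than anything from the question:

```python
vowels = "aeiou"
text = "The quick brown fox jumps over the lazy dog"  # stands in for open(filename).read()

all_words, multivowels = [
    [word for word in words if sum(letter in vowels for letter in word) >= minimum]
    for words in [text.split()]   # bind the split words once
    for minimum in (0, 2)         # 0 keeps every word, 2 keeps the multi-vowel ones
]

print("All words in the file:", all_words)
print("The words in the file that contain 2 or more vowels:", multivowels)
```

The outer comprehension runs the same filter twice with different thresholds, so the result is a list of two lists that can be unpacked directly.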
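Answer 2 (score: 0)
So, thank you all for your input. There are a lot of really interesting ways to find words with two or more vowels. I talked to my professor this morning about the trouble I was having with this problem, and he cleared up my misunderstanding.
I was under the impression that he wanted a single list comprehension that would return one list containing both a list of all the words in the file and a list of only the words with two or more vowels. In fact, he just wanted what I had already done: one list comprehension per scenario, one for all the words in the file and one for all the words in the file that contain two or more vowels.
Thanks for everyone's input!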
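For completeness, the "one list comprehension per scenario" reading could be sketched as follows, reading the file only once and closing it explicitly. The temporary file is only there so the example runs on its own; it stands in for the real "words.txt":

```python
import os
import tempfile

vowels = "aeiou"

# Create a throwaway copy of the sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("The quick brown fox jumps over the lazy dog\n")
    filename = handle.name

# Scenario 1: all the words in the file (read once, file closed afterwards).
with open(filename) as f:
    words = [word for word in f.read().split()]
os.remove(filename)

# Scenario 2: the words that contain two or more vowels.
multivowels = [word for word in words if sum(letter in vowels for letter in word) >= 2]

print("All words in the file:", words)
print("The words in the file that contain 2 or more vowels:", multivowels)
```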