I have a list of file names:
names = ['aet2000','ppt2000', 'aet2001', 'ppt2001']
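I have found some functions that can grep a single string, but I haven't figured out how to grep all the elements of a list.
For example, I would like to run: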
grep(names,'aet')
and get:
['aet2000','aet2001']
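Surely this isn't too hard, but I am new to Python.
Update: The question above apparently wasn't accurate enough. All of the answers below work for the example, but not for my actual data. Here is the code I use to build the list of file names: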
years = range(2000,2011)
months = ["jan","feb","mar","apr","may","jun","jul","aug","sep","oct","nov","dec"]
variables = ["cwd","ppt","aet","pet","tmn","tmx"] # *variable name* with wildcards
tifnames = list(range(0,(len(years)*len(months)*len(variables)+1) ))
i = 0
for variable in variables:
    for year in years:
        for month in months:
            fullname = str(variable)+str(year)+str(month)+".tif"
            tifnames[i] = fullname
            i = i+1
Running filter(lambda x: 'aet' in x, tifnames) or any of the other answers returns:
Traceback (most recent call last):
  File "<pyshell#89>", line 1, in <module>
    func(tifnames,'aet')
  File "<pyshell#88>", line 2, in func
    return [i for i in l if s in i]
TypeError: argument of type 'int' is not iterable
even though tifnames appears to be a list of strings:
type(tifnames[1])
<type 'str'>
Do you see what is going on here? Thanks again!
Answer 0 (score: 41)
Use filter():
>>> names = ['aet2000','ppt2000', 'aet2001', 'ppt2001']
>>> filter(lambda x:'aet' in x, names)
['aet2000', 'aet2001']
Or with a regex:
>>> import re
>>> filter(lambda x: re.search(r'aet', x), names)
['aet2000', 'aet2001']
In Python 3, filter() returns an iterator, so call list() on it to get a list:
>>> list(filter(lambda x:'aet' in x, names))
['aet2000', 'aet2001']
Otherwise, use a list comprehension (it works in both Python 2 and 3):
>>> [name for name in names if 'aet' in name]
['aet2000', 'aet2001']
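The comment in the question mentions wildcards, so a shell-style pattern may be closer to what is wanted than a substring test. This is a sketch using the standard library's fnmatch.filter; the patterns 'aet*' and '*2001' are only illustrations and are not taken from the question.

```python
import fnmatch

names = ['aet2000', 'ppt2000', 'aet2001', 'ppt2001']

# Shell-style wildcard: keep names that start with 'aet'
aet_names = fnmatch.filter(names, 'aet*')
print(aet_names)

# Keep names that end with '2001'
names_2001 = fnmatch.filter(names, '*2001')
print(names_2001)
```

Unlike the substring test, the pattern has to match the whole name, so 'aet' without the trailing * would match nothing here.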
Answer 1 (score: 10)
Try this. It may not be the "shortest" of all the code shown, but for someone trying to learn Python I think it teaches more:
names = ['aet2000','ppt2000', 'aet2001', 'ppt2001']
found = []
for name in names:
    # Keep every name that contains the substring 'aet'
    if 'aet' in name:
        found.append(name)
print(found)
Output:
['aet2000', 'aet2001']
Edit: Changed to produce a list.
See also:
How to use Python to find out the words begin with vowels in a list?
Answer 2 (score: 5)
>>> names = ['aet2000', 'ppt2000', 'aet2001', 'ppt2001']
>>> def grep(l, s):
...     return [i for i in l if s in i]
...
>>> grep(names, 'aet')
['aet2000', 'aet2001']
A regular-expression version, closer to grep, although it isn't needed in this case:
>>> import re
>>> def func(l, s):
...     return [i for i in l if re.search(s, i)]
...
>>> func(names, r'aet')
['aet2000', 'aet2001']
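One thing to watch with the regex version: characters such as . have a special meaning in a pattern. The sketch below uses made-up file names (they are assumptions, not from the question) to show how re.escape turns a literal search string into a safe pattern.

```python
import re

def grep(items, pattern):
    """Return the items in which the regular expression matches anywhere."""
    return [item for item in items if re.search(pattern, item)]

# Hypothetical file names, chosen only to show the difference
names = ['aet2000.tif', 'aet2000_tif.bak', 'ppt2000.tif']

# Unescaped: '.' matches any character, so '_tif' also counts as a hit
loose = grep(names, r'.tif')

# Escaped: only a literal '.tif' matches
literal = grep(names, re.escape('.tif'))

print(loose)
print(literal)
```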
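Answer 3 (score: 3)
You should take a look at the Python module called re. Below is an implementation of a grep function in Python that uses re. It will help you understand how re works (only after you have read the re documentation, of course).
import re

def grep(pattern,word_list):
    expr = re.compile(pattern)
    # match() only matches at the start of each element; use expr.search() to match anywhere, as grep does
    return [elem for elem in word_list if expr.match(elem)]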
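Note that match() and search() behave differently, which matters if the text you are looking for is not at the start of the name. A small sketch of the difference, using the list from the question and '2001' as an arbitrary example pattern:

```python
import re

names = ['aet2000', 'ppt2000', 'aet2001', 'ppt2001']
expr = re.compile(r'2001')

# match() only succeeds if the pattern matches at the start of the string
anchored = [n for n in names if expr.match(n)]

# search() scans the whole string, which is how grep behaves
anywhere = [n for n in names if expr.search(n)]

print(anchored)
print(anywhere)
```

With the pattern 'aet' both calls give the same result for this list, because 'aet' always appears at the start of the name.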
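Answer 4 (score: 1)
You don't need to pre-allocate the list tifnames or use a counter to place the elements. Just append each name to the list as it is generated, or use a list comprehension.
That is, just do this: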
import re
years = ['2000','2011']
months = ["jan","feb","mar","apr","may","jun","jul","aug","sep","oct","nov","dec"]
variables = ["cwd","ppt","aet","pet","tmn","tmx"] # *variable name* with wildcards
tifnames = []
for variable in variables:
    for year in years:
        for month in months:
            fullname = variable+year+month+".tif"
            tifnames.append(fullname)
print(tifnames)
print('===')
# list() makes the result print as a list in Python 3 as well
print(list(filter(lambda x: re.search(r'aet',x),tifnames)))
Prints:
['cwd2000jan.tif', 'cwd2000feb.tif', 'cwd2000mar.tif', 'cwd2000apr.tif', 'cwd2000may.tif', 'cwd2000jun.tif', 'cwd2000jul.tif', 'cwd2000aug.tif', 'cwd2000sep.tif', 'cwd2000oct.tif', 'cwd2000nov.tif', 'cwd2000dec.tif', 'cwd2011jan.tif', 'cwd2011feb.tif', 'cwd2011mar.tif', 'cwd2011apr.tif', 'cwd2011may.tif', 'cwd2011jun.tif', 'cwd2011jul.tif', 'cwd2011aug.tif', 'cwd2011sep.tif', 'cwd2011oct.tif', 'cwd2011nov.tif', 'cwd2011dec.tif', 'ppt2000jan.tif', 'ppt2000feb.tif', 'ppt2000mar.tif', 'ppt2000apr.tif', 'ppt2000may.tif', 'ppt2000jun.tif', 'ppt2000jul.tif', 'ppt2000aug.tif', 'ppt2000sep.tif', 'ppt2000oct.tif', 'ppt2000nov.tif', 'ppt2000dec.tif', 'ppt2011jan.tif', 'ppt2011feb.tif', 'ppt2011mar.tif', 'ppt2011apr.tif', 'ppt2011may.tif', 'ppt2011jun.tif', 'ppt2011jul.tif', 'ppt2011aug.tif', 'ppt2011sep.tif', 'ppt2011oct.tif', 'ppt2011nov.tif', 'ppt2011dec.tif', 'aet2000jan.tif', 'aet2000feb.tif', 'aet2000mar.tif', 'aet2000apr.tif', 'aet2000may.tif', 'aet2000jun.tif', 'aet2000jul.tif', 'aet2000aug.tif', 'aet2000sep.tif', 'aet2000oct.tif', 'aet2000nov.tif', 'aet2000dec.tif', 'aet2011jan.tif', 'aet2011feb.tif', 'aet2011mar.tif', 'aet2011apr.tif', 'aet2011may.tif', 'aet2011jun.tif', 'aet2011jul.tif', 'aet2011aug.tif', 'aet2011sep.tif', 'aet2011oct.tif', 'aet2011nov.tif', 'aet2011dec.tif', 'pet2000jan.tif', 'pet2000feb.tif', 'pet2000mar.tif', 'pet2000apr.tif', 'pet2000may.tif', 'pet2000jun.tif', 'pet2000jul.tif', 'pet2000aug.tif', 'pet2000sep.tif', 'pet2000oct.tif', 'pet2000nov.tif', 'pet2000dec.tif', 'pet2011jan.tif', 'pet2011feb.tif', 'pet2011mar.tif', 'pet2011apr.tif', 'pet2011may.tif', 'pet2011jun.tif', 'pet2011jul.tif', 'pet2011aug.tif', 'pet2011sep.tif', 'pet2011oct.tif', 'pet2011nov.tif', 'pet2011dec.tif', 'tmn2000jan.tif', 'tmn2000feb.tif', 'tmn2000mar.tif', 'tmn2000apr.tif', 'tmn2000may.tif', 'tmn2000jun.tif', 'tmn2000jul.tif', 'tmn2000aug.tif', 'tmn2000sep.tif', 'tmn2000oct.tif', 'tmn2000nov.tif', 'tmn2000dec.tif', 'tmn2011jan.tif', 'tmn2011feb.tif', 'tmn2011mar.tif', 
'tmn2011apr.tif', 'tmn2011may.tif', 'tmn2011jun.tif', 'tmn2011jul.tif', 'tmn2011aug.tif', 'tmn2011sep.tif', 'tmn2011oct.tif', 'tmn2011nov.tif', 'tmn2011dec.tif', 'tmx2000jan.tif', 'tmx2000feb.tif', 'tmx2000mar.tif', 'tmx2000apr.tif', 'tmx2000may.tif', 'tmx2000jun.tif', 'tmx2000jul.tif', 'tmx2000aug.tif', 'tmx2000sep.tif', 'tmx2000oct.tif', 'tmx2000nov.tif', 'tmx2000dec.tif', 'tmx2011jan.tif', 'tmx2011feb.tif', 'tmx2011mar.tif', 'tmx2011apr.tif', 'tmx2011may.tif', 'tmx2011jun.tif', 'tmx2011jul.tif', 'tmx2011aug.tif', 'tmx2011sep.tif', 'tmx2011oct.tif', 'tmx2011nov.tif', 'tmx2011dec.tif']
===
['aet2000jan.tif', 'aet2000feb.tif', 'aet2000mar.tif', 'aet2000apr.tif', 'aet2000may.tif', 'aet2000jun.tif', 'aet2000jul.tif', 'aet2000aug.tif', 'aet2000sep.tif', 'aet2000oct.tif', 'aet2000nov.tif', 'aet2000dec.tif', 'aet2011jan.tif', 'aet2011feb.tif', 'aet2011mar.tif', 'aet2011apr.tif', 'aet2011may.tif', 'aet2011jun.tif', 'aet2011jul.tif', 'aet2011aug.tif', 'aet2011sep.tif', 'aet2011oct.tif', 'aet2011nov.tif', 'aet2011dec.tif']
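And, depending on whether you find it more readable, this would be more idiomatic Python:
years = ['2000','2011']
months = ["jan","feb","mar","apr","may","jun","jul","aug","sep","oct","nov","dec"]
variables = ["cwd","ppt","aet","pet","tmn","tmx"]
# The for clauses are in the same order as the nested loops above, so the output order matches
tifnames = [v+y+m+".tif" for v in variables for y in years for m in months]
print(tifnames)
print('===')
print([e for e in tifnames if re.search(r'aet',e)])
...same output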
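As for the TypeError in the question, it appears to come from the +1 in the pre-allocation: the list gets one more slot than the loops fill, so the last element stays an integer. The sketch below reproduces the question's code and checks for leftovers; the itertools.product variant at the end is just one possible alternative, not something from the original post.

```python
from itertools import product

years = range(2000, 2011)
months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
variables = ["cwd", "ppt", "aet", "pet", "tmn", "tmx"]

# Pre-allocation as in the question: one slot too many because of the +1
expected = len(years) * len(months) * len(variables)
tifnames = list(range(0, expected + 1))
i = 0
for variable in variables:
    for year in years:
        for month in months:
            tifnames[i] = str(variable) + str(year) + str(month) + ".tif"
            i = i + 1

# Any element that was never overwritten is still an int
leftovers = [x for x in tifnames if not isinstance(x, str)]
print(leftovers)

# Building the list directly avoids the problem
fixed = [v + str(y) + m + ".tif" for v, y, m in product(variables, years, months)]
print(len(fixed))
```

Checking type(tifnames[1]) only inspects one element, which is why the list looked like it held nothing but strings.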
Answer 5 (score: -1)
If you use one of the filters above on a list of bytes objects in Python 3, for example:
filter(lambda x:'aet' in x, names)
you will get this error:
TypeError: a bytes-like object is required, not 'str'
You need to encode the search term so that it is a bytes object too:
filter(lambda x:'aet'.encode() in x, names)
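Whether encoding is needed depends on what the list holds. The sketch below assumes Python 3 and uses two made-up lists, one of str and one of bytes, to show that the search term simply has to be the same type as the elements.

```python
# Hypothetical data: the same names as text and as bytes
str_names = ['aet2000', 'ppt2000']
byte_names = [b'aet2000', b'ppt2000']

# str term with str elements: works
str_hits = [n for n in str_names if 'aet' in n]

# bytes term with bytes elements: works
byte_hits = [n for n in byte_names if 'aet'.encode() in n]

# str term with bytes elements: raises TypeError in Python 3
try:
    [n for n in byte_names if 'aet' in n]
    mixed_error = None
except TypeError as exc:
    mixed_error = str(exc)

print(str_hits)
print(byte_hits)
print(mixed_error)
```

For a plain list of strings, as in the question, no encoding is needed.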