Check for gzip or plain text and read the file without checking the extension - python

Time: 2014-10-29 10:00:43

Tags: python file-io compression gzip

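I have files whose contents have exactly the same format once read; the only difference is that I'm not sure whether some of them are gzipped.

A sample file looks like this: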

der ||| the ||| 0.3 ||| ||| 
das ||| the ||| 0.4 ||| |||  
das ||| it ||| 0.1 ||| ||| 
das ||| this ||| 0.1 ||| ||| 
die ||| the ||| 0.3 ||| ||| 

This is how I currently read it:

import gzip

try:
    with gzip.open(phrasetablefile, 'rb') as fin:
        for line in fin:
            pass  # do something
except:
    with open(phrasetablefile, 'rb') as fin:
        for line in fin:
            pass  # do something

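Is there another way to do this without the ugly duplicated code? (Note that the "# do something" part is a long piece of code.)

Is there a way to do something like the following?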

try: 
    with gzip.open(phrasetablefile, 'rb') as fin:
except:
    with open(phrasetablefile, 'rb') as fin:
        for line in fin:
            # do something

2 answers:

Answer 0: (score: 1)

Warning: untested code

Either (as @jonrsharpe suggested):

import gzip

def process(fin):
    for line in fin:
        pass # do something

try:
    with gzip.open(phrasetablefile, 'rb') as fin:
        process(fin)
except IOError:
    # not a gzip file: the first read fails with IOError (OSError on Python 3)
    with open(phrasetablefile, 'rb') as fin:
        process(fin)

Or try something like this:

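import gzip

fin = gzip.open(phrasetablefile, 'rb')
try:
    # gzip.open() itself does not fail on a plain file; the first read does
    fin.read(1)
    fin.seek(0)
except IOError:
    fin.close()
    fin = open(phrasetablefile, 'rb')

for line in fin:
    pass # do something
fin.close()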
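
Another way to avoid the duplicated loop is to look at the first two bytes of the file instead of relying on an exception. Every gzip stream starts with the magic bytes 0x1f 0x8b (RFC 1952), so a small helper can choose the opener up front and the processing code only has to be written once. This is a minimal sketch, not something from the original answers: the names `open_maybe_gzip` and `GZIP_MAGIC` are made up for illustration, and a plain file that happens to begin with those two bytes would be misdetected (unlikely for text like the sample above).

```python
import gzip

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b'\x1f\x8b'

def open_maybe_gzip(path, mode='rb'):
    """Hypothetical helper: return a file object for a gzip or plain file.

    The decision is based on the file's first two bytes, not its name.
    """
    with open(path, 'rb') as probe:
        is_gzip = probe.read(2) == GZIP_MAGIC
    opener = gzip.open if is_gzip else open
    return opener(path, mode)

# Intended usage (phrasetablefile is the path from the question):
#
#     with open_maybe_gzip(phrasetablefile) as fin:
#         for line in fin:
#             ...  # do something
```

Because the helper returns an ordinary file object, it works with a single `with` block, which is close to what the question was asking for.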

Answer 1: (score: 0)

If your gzipped files have a .gz suffix, could you do something like this?

import gzip

if phrasetablefile.endswith('.gz'):
    opener = gzip.open
else:
    opener = open

with opener(phrasetablefile, 'rb') as fin:
    for line in fin:
        pass  # do something
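
One detail that may matter if this runs on Python 3: in 'rb' mode both openers yield bytes, so a field such as der arrives as b'der'. As a sketch under the assumption of Python 3.3+ and UTF-8 encoded files (both are guesses about the asker's setup), the same opener pattern can be used in text mode so the loop sees str lines either way. The function name `read_phrase_lines` is hypothetical.

```python
import gzip

def read_phrase_lines(path, encoding='utf-8'):
    """Hypothetical helper: yield decoded lines from a plain or .gz file."""
    # Same suffix-based choice as above.
    opener = gzip.open if path.endswith('.gz') else open
    # 'rt' requests text mode from both openers, so each line is a str.
    with opener(path, 'rt', encoding=encoding) as fin:
        for line in fin:
            yield line.rstrip('\n')
```

Wrapping the loop in a generator keeps the long "do something" code in one place: the caller iterates over `read_phrase_lines(phrasetablefile)` without caring how the file was opened.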