Question

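I get an error when trying to decompress a file, drop the lines I'm not interested in, and then write the remaining lines to a file. Here is my code: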

import gzip, os, sys
dataset_names=[]
dir_path=('local drive path')
dataset_names= os.listdir(dir_path)
count=0
read_zip = [];
for dataset in dataset_names:
        each_dataset=os.path.join(dir_path+'\\'+dataset+'\\'+'soft'+'\\'+dataset+'_full'+'.soft')
        with gzip.open(each_dataset+'.gz', 'rb') as each_gzip_file:
            if count == 2: # I wanted to check with 2 datasets first
                continue;
            for line in each_gzip_file:    
                if line.startwith !=('#', '!', '^'):
                    continue;
                read_zip.append('\t' + line);
            with open('name of a file', 'wb') as f:                   

                    f.writelines(read_zip)
        print(dataset);
        count+=1;

This is the error I get:

 AttributeError: 'bytes' object has no attribute 'startwith'

I then tried changing it to this code:

......
.......            
for line in each_gzip_file:
                if not PY3K:
                    if lines.startwith != ('#', '!', '^'):
                        continue;
                    lines.append(line)

                else:
                    lines.append(line.decode('cp437'))                
                    makeitastring = ''.join(map(str, lines))
               with open('fine name', 'wb') as f:   

                    my_str_as_bytes = str.encode(str(,lines))
                    f.writelines(makeitastring)

This time I got this error:

TypeError: a bytes-like object is required, not 'str'

I also changed it to the following, but that didn't work either. It seems to repeat the same thing over and over:

for line in each_gzip_file:
                read_zip.append(line);
                for x in read_zip:
                    if str(x).startswith != ('#', '!', '^'):
                     continue;                         
                else:
                    final.append(x);                        

                with open('file name', 'ab') as f:  

                f.writelines(final)

Am I missing something? Thanks,

Answer 1

I see two errors. First, you misspelled the method name. It is bytes.startswith(), not bytes.startwith(). Note the "s" between "start" and "with".

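Second, the code line.startswith != ('#', '!', '^') does not do what you think. startswith() is a method of bytes objects, and what you want is to call that method with '#' and so on as arguments. As written, you are asking "is this method unequal to this tuple of three strings?" That makes no sense in this context, but Python happily evaluates it anyway: a method is never equal to a tuple, so the comparison is always True and every line is skipped.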
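The difference between comparing the method and calling it can be seen in a small sketch. The sample line here is made up for illustration and is not taken from the asker's data:

```python
line = b"#comment line\n"  # hypothetical sample line, as bytes

# Comparing the bound method itself to a tuple: a method is never equal
# to a tuple, so != yields True no matter what the line contains.
method_vs_tuple = line.startswith != ('#', '!', '^')

# Calling the method with a tuple of bytes prefixes: the actual prefix test.
called = line.startswith((b'#', b'!', b'^'))

print(method_vs_tuple, called)
```

Only the second expression depends on the contents of the line.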

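What you want is line.startswith((b'#', b'!', b'^')). (The b prefix is needed to distinguish bytes from strings, which are different types in Python 3.) This returns True if the line starts with any of those three characters.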
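Putting the pieces together, one way to filter a gzip file might look like the sketch below. The function name filter_gzip_lines and the keep_matching flag are hypothetical, introduced only for illustration; the question does not make clear whether the '#', '!' and '^' lines are the ones to keep or the ones to drop, so the flag covers both cases. Everything stays as bytes from read to write, which also avoids the "a bytes-like object is required, not 'str'" error.

```python
import gzip

PREFIXES = (b'#', b'!', b'^')  # bytes prefixes, because 'rb' mode yields bytes


def filter_gzip_lines(src_path, dst_path, prefixes=PREFIXES, keep_matching=True):
    """Copy lines from a gzip file to a plain file, filtering on line prefixes.

    Hypothetical helper: keep_matching=True keeps lines that start with one
    of the prefixes; keep_matching=False drops those lines instead.
    Returns the number of lines written.
    """
    kept = 0
    # Both files are opened in binary mode, so no decode/encode step is needed.
    with gzip.open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        for line in src:
            if line.startswith(prefixes) == keep_matching:
                dst.write(line)
                kept += 1
    return kept
```

Each line is written once as it is read, rather than appending a growing list to the output file on every pass, which is the likely cause of the repeated output in the third attempt.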
