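How do I read this file in Python while skipping some lines?

Time: 2016-03-02 14:44:14

Tags: python

I need to skip the first 47 lines of the file (headers etc.) and then read this data: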

 4.163186002791e+04  3.578830331359e+04  3.076496349687e+04  2.644671278966e+04  2.273458304119e+04 
 1.954349752908e+04  1.680032112209e+04  1.444218412726e+04  1.241504140604e+04  1.067243373686e+04 
 9.174423035938e+03  7.886677033340e+03  6.779682426302e+03  5.828068476394e+03  5.010025548360e+03 
 1.737988920100e+03  1.284332855871e+03  1.104060538508e+03  8.158747205330e+02  7.013564117662e+02 
 6.029121922103e+02  5.182858606802e+02  4.455379022877e+02  2.433020871700e+02  2.091515701348e+02 
 1.797945089525e+02  1.545580816278e+02  1.328639052196e+02  9.818329499070e+01  7.255514128762e+01 
 5.361653963401e+01  4.609078195788e+01  3.962135930423e+01  3.406000172766e+01  2.927925083995e+01 
 2.516953864546e+01  2.163667639887e+01  1.859969593339e+01  1.598899398582e+01  1.374473698894e+01 
 1.181548977143e+01  1.015703673713e+01  8.731368506527e+00  7.505810795983e+00  6.452275569743e+00 
 5.546617302183e+00  4.768079596776e+00  4.098819479081e+00  3.523498461194e+00  3.028931005477e+00 
 2.603782330822e+00  2.238308635635e+00  1.924133783786e+00  1.654057335509e+00  1.421889523591e+00 
 1.222309392724e+00  1.050742850800e+00  9.032578372386e-01  7.764742057598e-01  6.674862562538e-01 
 5.737961402745e-01  4.932566139141e-01  3.133421372728e-01  2.315524554696e-01  1.990511474577e-01 
 1.711118080085e-01  1.470941072881e-01  1.264475938317e-01  1.086990789815e-01  9.344179207682e-02 
 8.032605785014e-02  6.905128236880e-02  5.935906385039e-02  5.102727046220e-02 

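Ideally I would read it into a list, then skip another 21 lines and read the next part of the file, which has the same format as above. My first idea was this: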

from itertools import islice
n=15
with open('91_FULLMERGED.edi') as f:
    lines_after_48 = f.readlines()[48:]
    while True:
        next_15_lines = list(islice(lines_after_48, n))
        if not next_15_lines:
            break

But this does not work.

milenko@milenko-HP-Compaq-6830s:~/EDIs$ python k1.py 

It just hangs in the terminal.

How can I solve this?

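2 Answers:

Answer 0: (score: 1)

Code

I used the csv reader because the format is CSV with the delimiter ' '. First I skip the 47 lines with next(f, None). After that the csv module does the trick. If you want to write the output to a file, you can use the csv writer. If you want to remove the empty strings from the output list, uncomment the commented-out code. In that case, however, the output file no longer looks like the input. It depends on how you want to use the data.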


The script prints the values to the screen and writes them to output.txt:

import csv

# Python 3: open text files with newline='' so the csv module handles line endings
with open('input.txt', newline='') as f:
    # skip the 47 header lines
    for i in range(47):
        next(f, None)
    reader = csv.reader(f, delimiter=' ')
    values = list(reader)

# if you want to remove the ''
#for idx, val in enumerate(values):
#    values[idx] = [x for x in values[idx] if x != '']

print(values)

with open('output.txt', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=' ', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for line in values:
        writer.writerow(line)

Output printed to the screen (the same rows are written to output.txt):

[['4.163186002791e+04', '', '3.578830331359e+04', '', '3.076496349687e+04', '', '2.644671278966e+04', '', '2.273458304119e+04'],
['1.954349752908e+04', '', '1.680032112209e+04', '', '1.444218412726e+04', '', '1.241504140604e+04', '', '1.067243373686e+04'],
['9.174423035938e+03', '', '7.886677033340e+03', '', '6.779682426302e+03', '', '5.828068476394e+03', '', '5.010025548360e+03'],
['1.737988920100e+03', '', '1.284332855871e+03', '', '1.104060538508e+03', '', '8.158747205330e+02', '', '7.013564117662e+02'],
['6.029121922103e+02', '', '5.182858606802e+02', '', '4.455379022877e+02', '', '2.433020871700e+02', '', '2.091515701348e+02'],
['1.797945089525e+02', '', '1.545580816278e+02', '', '1.328639052196e+02', '', '9.818329499070e+01', '', '7.255514128762e+01'],
['5.361653963401e+01', '', '4.609078195788e+01', '', '3.962135930423e+01', '', '3.406000172766e+01', '', '2.927925083995e+01'],
['2.516953864546e+01', '', '2.163667639887e+01', '', '1.859969593339e+01', '', '1.598899398582e+01', '', '1.374473698894e+01'],
['1.181548977143e+01', '', '1.015703673713e+01', '', '8.731368506527e+00', '', '7.505810795983e+00', '', '6.452275569743e+00'],
['5.546617302183e+00', '', '4.768079596776e+00', '', '4.098819479081e+00', '', '3.523498461194e+00', '', '3.028931005477e+00'],
['2.603782330822e+00', '', '2.238308635635e+00', '', '1.924133783786e+00', '', '1.654057335509e+00', '', '1.421889523591e+00'],
['1.222309392724e+00', '', '1.050742850800e+00', '', '9.032578372386e-01', '', '7.764742057598e-01', '', '6.674862562538e-01'],
['5.737961402745e-01', '', '4.932566139141e-01', '', '3.133421372728e-01', '', '2.315524554696e-01', '', '1.990511474577e-01'],
['1.711118080085e-01', '', '1.470941072881e-01', '', '1.264475938317e-01', '', '1.086990789815e-01', '', '9.344179207682e-02'],
['8.032605785014e-02', '', '6.905128236880e-02', '', '5.935906385039e-02', '', '5.102727046220e-02']]
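The empty strings in the output above appear because the columns are separated by two spaces while the reader splits on a single space. If the goal is a list of numbers rather than a file that mirrors the input, one alternative (not part of the original answer) is to split each line on any run of whitespace and convert the fields to float. This is only a sketch; the helper name parse_rows and the sample lines are made up for illustration.

```python
def parse_rows(lines):
    """Turn whitespace-separated text lines into lists of floats."""
    rows = []
    for line in lines:
        fields = line.split()  # no argument: splits on any run of whitespace
        if fields:             # ignore blank lines
            rows.append([float(x) for x in fields])
    return rows


# Two shortened sample lines in the same layout as the data (illustrative only)
sample = [
    " 4.163186002791e+04  3.578830331359e+04 \n",
    " 8.032605785014e-02  6.905128236880e-02 \n",
]
rows = parse_rows(sample)
print(rows)
```

Leading and trailing spaces, as in the lines shown in the question, are handled by the same split() call.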

Input values I used

4.163186002791e+04  3.578830331359e+04  3.076496349687e+04  2.644671278966e+04  2.273458304119e+04
1.954349752908e+04  1.680032112209e+04  1.444218412726e+04  1.241504140604e+04  1.067243373686e+04
9.174423035938e+03  7.886677033340e+03  6.779682426302e+03  5.828068476394e+03  5.010025548360e+03
1.737988920100e+03  1.284332855871e+03  1.104060538508e+03  8.158747205330e+02  7.013564117662e+02
6.029121922103e+02  5.182858606802e+02  4.455379022877e+02  2.433020871700e+02  2.091515701348e+02
1.797945089525e+02  1.545580816278e+02  1.328639052196e+02  9.818329499070e+01  7.255514128762e+01
5.361653963401e+01  4.609078195788e+01  3.962135930423e+01  3.406000172766e+01  2.927925083995e+01
2.516953864546e+01  2.163667639887e+01  1.859969593339e+01  1.598899398582e+01  1.374473698894e+01
1.181548977143e+01  1.015703673713e+01  8.731368506527e+00  7.505810795983e+00  6.452275569743e+00
5.546617302183e+00  4.768079596776e+00  4.098819479081e+00  3.523498461194e+00  3.028931005477e+00
2.603782330822e+00  2.238308635635e+00  1.924133783786e+00  1.654057335509e+00  1.421889523591e+00
1.222309392724e+00  1.050742850800e+00  9.032578372386e-01  7.764742057598e-01  6.674862562538e-01
5.737961402745e-01  4.932566139141e-01  3.133421372728e-01  2.315524554696e-01  1.990511474577e-01
1.711118080085e-01  1.470941072881e-01  1.264475938317e-01  1.086990789815e-01  9.344179207682e-02
8.032605785014e-02  6.905128236880e-02  5.935906385039e-02  5.102727046220e-02
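Neither the code above nor the question's attempt reads the file in repeated blocks. The question's loop most likely hangs because islice is applied to a list: every call starts again at the first element, so the result is never empty and the loop never breaks. A possible fix is to slice one shared iterator, which remembers its position. The sketch below assumes the layout described in the question (47 header lines, then blocks of 15 data lines separated by 21 lines to skip) and that every data line contains only numbers; the function name read_blocks and its parameters are hypothetical.

```python
from itertools import islice


def read_blocks(lines, header=47, block=15, gap=21):
    """Yield successive blocks of parsed rows from an iterable of lines."""
    it = iter(lines)  # one shared iterator, so the position is remembered
    skip = header
    while True:
        # consume `skip` lines without storing them
        for _ in islice(it, skip):
            pass
        chunk = list(islice(it, block))
        if not chunk:  # nothing left to read
            break
        yield [[float(x) for x in line.split()] for line in chunk]
        skip = gap  # after the first block, skip the gap instead of the header


# In-memory stand-in for the real file (made-up content, same layout)
demo = (["header\n"] * 47 + ["1.0  2.0\n"] * 15 +
        ["other\n"] * 21 + ["3.0  4.0\n"] * 15)
blocks = list(read_blocks(demo))
print(len(blocks), len(blocks[0]), blocks[1][0])
```

With a real file, the open file object can be passed directly, for example read_blocks(f) inside a with open('91_FULLMERGED.edi') as f: block, since a file object is itself an iterator over its lines.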

Answer 1: (score: 0)

Give linecache a chance. If the file is large, try a generator.

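import linecache
from sys import argv

def get_specific_lines(f, start, stop):
    # linecache.getline uses 1-based line numbers; stop is excluded, as in range()
    for i in range(start, stop):
        yield linecache.getline(f, i)

script, f, s, e = argv

# Add checks here that the (f)ile exists and that (s)tart and (e)nd are numeric.

for line in get_specific_lines(f, int(s), int(e)):
    # each line keeps its trailing newline, so the output is double-spaced
    print(line)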



$ wc -l asbc.txt 
197310 asbc.txt
$ python read.py asbc.txt 144322 144325
Bu yarım dəqiqə ərzində Qleb qabaqcadan dəqiq hesablanmış ləng hərəkətlə əlini qəbulediciyə uzatdı, diktora xırıldamağa imkan vermədən boynunu bururmuş kimi açarın dəstəyini burdu.

Onun azca bundan əvvəlki canlı sifəti yorğun, bozumtul rəng almışdı.

Pryançikovun başı isə başqa problemə qarışmışdı. Hansı gücləndirici kaskadı qoymaq lazım olduğu haqda fikirləşə-fikirləşə o, qayğısız zümzümə edirdi:
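
Note that linecache reads and caches the whole file in memory on first access. For a very large file, a variant that streams the file instead might look like the sketch below. It keeps the same numbering as the code above (1-based start, stop excluded); the function name iter_line_range is hypothetical, UTF-8 encoding is an assumption, and start must be at least 1.

```python
import os
import tempfile
from itertools import islice


def iter_line_range(path, start, stop):
    """Yield lines numbered start..stop-1 (1-based), like range(start, stop)."""
    with open(path, encoding="utf-8") as f:
        # islice counts from 0, so line number `start` is index start - 1
        for line in islice(f, start - 1, stop - 1):
            yield line.rstrip("\n")


# Small temporary file standing in for a large one (made-up content)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("".join("line %d\n" % i for i in range(1, 11)))

selected = list(iter_line_range(tmp.name, 3, 6))
os.remove(tmp.name)
print(selected)
```

Only the lines up to stop are read, and none of them are kept after they have been yielded.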