How do I periodically skip a few entries in Python?

Time: 2018-04-19 15:50:56

Tags: python numpy

I have a data file, name.txt, with the following contents:

object 1
num 2
24
56
67
object 3
num 4
34
56
78
num 5
count 4
69
78
56

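I want to remove the text lines and build a 3 * 3 matrix from the numbers. Can anyone help me with the Python code? I want to run the code for 100 events.

I tried the following:

from itertools import islice

with open('name.txt') as fp:
    # skip the first two lines, then print every remaining line
    for line in islice(fp, 2, None):
        print(line)

This only skips the first two lines, which are strings, but I want to skip all the text lines and build a 3 * 3 matrix.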

2 Answers:

Answer 0: (score: 0)

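One way to keep only the lines you need is to filter them with itertools.compress. Because the pattern of lines to drop and keep is regular, we can generate it with itertools.repeat.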
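
As a small illustration of how the two functions fit together, here is a sketch that works on an in-memory list instead of a file. The list `lines` and the names `period`, `mask` and `kept` are made up for this example; the layout (two text lines, then three numeric lines) is assumed from the question.

```python
from itertools import chain, compress, repeat

# In-memory stand-in for the file: two text lines followed by
# three numeric lines, repeated (layout assumed from the question).
lines = ["object 1", "num 2", "24", "56", "67",
         "object 3", "num 4", "34", "56", "78"]

# One period of the drop/keep pattern: drop 2 lines, keep 3.
period = [False] * 2 + [True] * 3

# repeat(period, 2) yields the period twice; chain.from_iterable
# flattens those two lists into a single sequence of booleans.
mask = list(chain.from_iterable(repeat(period, 2)))

# compress keeps each line whose matching mask entry is true.
kept = list(compress(lines, mask))
print(kept)  # ['24', '56', '67', '34', '56', '78']
```

The mask is purely positional, so this only works as long as every block has exactly two text lines followed by three numbers.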

So, one way to generate the matrix is:

from itertools import chain, repeat, compress

lines_filter = chain.from_iterable(repeat([False]*2 + [True]*3, 3))
# will repeat 3 times the sequence 2 x False, 3 x True

matrix = [[0]*3 for i in range(3)]

with open('test.txt') as f:
    lines = compress(f, lines_filter)
    values = map(lambda line: int(line.strip()), lines)  # or float

    # Your question doesn't make clear whether the values are given by row
    # or by column. I assume by row; swap i and j otherwise.
    for i in range(3):
        for j in range(3):
            matrix[i][j] = next(values)

print(matrix)
# [[24, 56, 67], [34, 56, 78], [69, 78, 56]]

We can also build the matrix with itertools, though it may be harder to read:

from itertools import chain, repeat, compress

lines_filter = chain.from_iterable(repeat([False]*2 + [True]*3, 3))
# will repeat 3 times the sequence 2 x False, 3 x True

with open('test.txt') as f:
    lines = compress(f, lines_filter)
    values = map(lambda line: int(line.strip()), lines)  # or float
    line_items = [iter(values)]*3
    matrix = list(map(list, zip(*line_items)))

print(matrix)
# [[24, 56, 67], [34, 56, 78], [69, 78, 56]]

# and if you want it transposed:
t = list(map(list, zip(*matrix)))
print(t)
# [[24, 34, 69], [56, 56, 78], [67, 78, 56]]
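
The `[iter(values)]*3` line is the least obvious part of the version above, so here is a minimal sketch of just that grouping step on a plain list. The names `same_iterator` and `rows` are only for illustration.

```python
values = iter([24, 56, 67, 34, 56, 78, 69, 78, 56])

# Three references to the SAME iterator object, not three copies.
same_iterator = [values] * 3

# zip takes one item from each argument in turn. Since all three
# arguments are the same iterator, each tuple it produces holds
# three consecutive values.
rows = [list(group) for group in zip(*same_iterator)]
print(rows)  # [[24, 56, 67], [34, 56, 78], [69, 78, 56]]
```

Note that zip stops at the first incomplete group, so leftover values that do not fill a whole row are silently dropped.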

Or, shorter and better, using islice:

from itertools import chain, repeat, compress, islice

lines_filter = chain.from_iterable(repeat([False]*2 + [True]*3, 3))
# will repeat 3 times the sequence 2 x False, 3 x True

with open('test.txt') as f:
    lines = compress(f, lines_filter)
    values = map(lambda line: int(line.strip()), lines)  # or float
    matrix = [list(islice(values, 3)) for i in range(3)] 

print(matrix)
# [[24, 56, 67], [34, 56, 78], [69, 78, 56]]
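
The question mentions running this over 100 events, while the snippets above hard-code three blocks. As a rough sketch only, and assuming that each event contributes nine numeric lines and that any line which is not a bare integer can be discarded, the filter could look at the content of each line rather than its position. `read_matrices` is a hypothetical helper, not part of the original answer:

```python
from itertools import islice

def read_matrices(lines, size=3):
    """Yield size x size matrices built from the numeric lines only."""
    def numbers():
        for line in lines:
            try:
                yield int(line.strip())  # or float
            except ValueError:
                continue  # text or blank line: skip it

    values = numbers()
    while True:
        flat = list(islice(values, size * size))
        if len(flat) < size * size:
            return  # no more complete matrices (a partial one is dropped)
        yield [flat[r * size:(r + 1) * size] for r in range(size)]

# The question's data, held in memory instead of read from name.txt.
sample = """object 1
num 2
24
56
67
object 3
num 4
34
56
78
num 5
count 4
69
78
56
"""

matrices = list(read_matrices(sample.splitlines()))
print(matrices)  # [[[24, 56, 67], [34, 56, 78], [69, 78, 56]]]
```

With a real file, `read_matrices(f)` inside a `with open(...) as f:` block should work the same way, since a file object also iterates line by line.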

Answer 1: (score: 0)

Since you tagged your question with numpy, I thought an answer using numpy might be useful. You can try the following:

import numpy as np
t = np.genfromtxt('name.txt', usecols=0)
m = t[np.isfinite(t)].reshape((3, 3))

Then, for your specific example:

In [44]: import numpy as np
    ...: t = np.genfromtxt('name.txt', usecols=0)
    ...: m = t[np.isfinite(t)].reshape((3, 3))
    ...: print(m)
    ...: 
[[ 24.  56.  67.]
 [ 34.  56.  78.]
 [ 69.  78.  56.]]
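
To see why the isfinite step does the filtering: with its default float dtype, genfromtxt turns fields it cannot parse as numbers into nan, so the text lines show up as nan in column 0. The sketch below makes that intermediate step visible, reading the question's data from an in-memory string (io.StringIO stands in for name.txt). The reshape to (-1, 3, 3) is my own assumption about how several events might be handled, namely that every event supplies nine values.

```python
import io
import numpy as np

# The question's data, held in memory instead of read from name.txt.
sample = """object 1
num 2
24
56
67
object 3
num 4
34
56
78
num 5
count 4
69
78
56
"""

# Column 0 of each line; text such as "object" cannot be parsed as a float.
t = np.genfromtxt(io.StringIO(sample), usecols=0)

# True where the line held a real number, False where it became nan.
mask = np.isfinite(t)
values = t[mask]

# -1 lets numpy work out how many 3 x 3 events the values fill.
events = values.reshape(-1, 3, 3)
print(events.shape)
print(events[0])
```

If the number of numeric lines is not a multiple of nine, the reshape raises a ValueError, which is a useful sanity check on the file.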