Question

I have a file that looks like this:

    1   2   3   4   5   6   7
1   0   1   1   1   1   1   1
2   0   0   1   1   1   1   1
3   0   0   0   1   1   1   1
4   0   0   0   0   1   1   1
5   0   0   0   0   0   1   1
6   0   0   0   0   0   0   1
7   0   0   0   0   0   0   0

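I want to read only the 1s and 0s, ignoring the header row at the top and the row names (the first column).

So far I have managed to skip the header row, but how do I skip the first column? My code so far: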

with open('file') as f:
    next(f) #skips header row
    content = [x.strip('\n') for x in f.readlines()]

I am trying to use only basic Python, with no libraries.

Answer 1

Use simple indexing:

with open('file') as f:
    next(f)  # skip the header row
    content = [x.strip().split()[1:] for x in f]

This gives you the zeros and ones split into a nested list.

If you don't want to split the lines, you can still use indexing to remove the first character (this works only while the row names are a single character):

with open('file') as f:
    next(f)  # skip the header row
    content = [x[1:].strip() for x in f]

Or, as a NumPy-style approach, you can use the loadtxt() function.
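
A minimal sketch of the loadtxt() approach, assuming NumPy is installed and the file has the 7 data columns shown in the question. Here io.StringIO stands in for the real file so the example runs on its own; the skiprows and usecols values are chosen for this layout and would need adjusting for a different one.

```python
import io

import numpy as np

# Stand-in for the file from the question (assumption: whitespace-separated columns).
sample = io.StringIO(
    "    1   2   3   4   5   6   7\n"
    "1   0   1   1   1   1   1   1\n"
    "2   0   0   1   1   1   1   1\n"
    "3   0   0   0   1   1   1   1\n"
    "4   0   0   0   0   1   1   1\n"
    "5   0   0   0   0   0   1   1\n"
    "6   0   0   0   0   0   0   1\n"
    "7   0   0   0   0   0   0   0\n"
)

# skiprows=1 drops the header row; usecols=range(1, 8) keeps columns 1..7,
# which leaves out column 0 (the row names).
matrix = np.loadtxt(sample, dtype=int, skiprows=1, usecols=range(1, 8))
print(matrix)
```

With a real file, the path can be passed in place of the StringIO object.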

Answer 2

Use pandas.read_csv:

import pandas as pd

df = pd.read_csv(filename, sep=r"\s+", index_col=0)  # split on any run of whitespace
matrix = df.to_numpy()  # as_matrix() was removed in pandas 1.0

print(matrix)
# Output
[[0 1 1 1 1 1 1]
 [0 0 1 1 1 1 1]
 [0 0 0 1 1 1 1]
 [0 0 0 0 1 1 1]
 [0 0 0 0 0 1 1]
 [0 0 0 0 0 0 1]
 [0 0 0 0 0 0 0]]

Answer 3

If there are always exactly 3 spaces between the first and second numbers, you can use this:

with open('file1.txt','r') as f:
    next(f)
    content = [x.strip('\n')[4:] for x in f.readlines()]

Output:

>>> for i in content:
    print(i)


0   1   1   1   1   1   1
0   0   1   1   1   1   1
0   0   0   1   1   1   1
0   0   0   0   1   1   1
0   0   0   0   0   1   1
0   0   0   0   0   0   1
0   0   0   0   0   0   0
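
The fixed slice [4:] relies on the row name and its padding taking up exactly four characters. A small sketch of a width-independent variant, assuming whitespace-separated columns; the sample lines and the row names "a" and "alpha" are made up for illustration and are not from the question's file.

```python
# Hypothetical sample with row names of different lengths.
lines = [
    "    1   2   3\n",
    "a   0   1   1\n",
    "alpha   0   0   1\n",
]

rows = lines[1:]  # drop the header row

# Fixed-width slicing: cuts the first four characters of every row.
fixed = [x.strip('\n')[4:] for x in rows]

# Splitting once on whitespace removes the first field whatever its length.
flexible = [x.strip().split(None, 1)[1] for x in rows]

print(fixed)     # the second entry still contains part of the row name
print(flexible)  # both entries contain only the data columns
```

The split-based version keeps the original spacing between the remaining columns, as the slice does.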

Answer 4

You can map str.split over the file object:

with open("in.txt") as f:
    next(f)
    matrix = [list(map(int, row[1:])) for row in map(str.split, f)]

If you have a tab-delimited file, you can use the csv lib:

import csv
with open("in.txt") as f:
    next(f)
    matrix = [list(map(int, row[1:])) for row in csv.reader(f, delimiter="\t")]

Unless you actually want a list of lines, you never need to call readlines; you can simply iterate over the file object.
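
As a sketch of the point about iterating over the file object directly, here is a hypothetical helper, read_matrix (the name is an illustration, not something from the answers), that accepts any iterable of lines. It assumes whitespace-separated integer cells; io.StringIO stands in for open("in.txt") so the example runs without a file on disk.

```python
import io


def read_matrix(lines):
    """Return the numeric cells as a list of lists of ints.

    `lines` may be any iterable of strings, including an open file object.
    """
    it = iter(lines)
    next(it, None)  # skip the header row; the default avoids StopIteration on empty input
    # str.split with no arguments splits on any run of whitespace;
    # row[1:] drops the row name, and blank lines (empty lists) are filtered out.
    return [list(map(int, row[1:])) for row in map(str.split, it) if row]


sample = io.StringIO(
    "    1   2   3\n"
    "1   0   1   1\n"
    "2   0   0   1\n"
    "3   0   0   0\n"
)
matrix = read_matrix(sample)
print(matrix)
```

Because the helper only iterates, the lines are consumed one at a time and no intermediate list of raw lines is built.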

How to skip the first column when reading in a txt matrix

4 answers: