Question

Say I have an Excel file exported to CSV, with 5 rows and 3 columns, containing the following values:

1.0 0.0 5.0
2.0 0.0 4.0
3.0 0.0 3.0
4.0 0.0 2.0
5.0 0.0 1.0

I need to get a list of lists holding the values of each column, in file order (3 columns in this example, but there could be more), like:

OutputList = [[1.0, 2.0, 3.0, 4.0, 5.0], [0.0, 0.0, 0.0, 0.0, 0.0], [5.0, 4.0, 3.0, 2.0, 1.0]]

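Unfortunately I can't use pandas. All the answers I have found either rely on pandas or list the values by row rather than by column (or are code snippets that didn't work for me).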

Answer 1

Use the built-in csv module.

Demo:

import csv
with open(filename, "r") as infile:
    reader = csv.reader(infile, delimiter=' ')
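    # zip(*reader) transposes rows into columns; list() is needed on
    # Python 3, where map() returns an iterator instead of a list
    OutputList = [list(map(float, i)) for i in zip(*reader)]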

print(OutputList)

Output:

[[1.0, 2.0, 3.0, 4.0, 5.0], [0.0, 0.0, 0.0, 0.0, 0.0], [5.0, 4.0, 3.0, 2.0, 1.0]]

Edit, based on the comments (this version also handles rows of unequal length):

from itertools import zip_longest  # named izip_longest on Python 2
import csv
with open(filename, "r") as infile:
    reader = csv.reader(infile, delimiter=' ')
    # zip_longest pads short rows with None; drop the padding before converting
    OutputList = [[float(j) for j in i if j is not None] for i in zip_longest(*reader)]

print(OutputList)
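
To see what the padded transpose does with rows of unequal length, here is a small self-contained sketch for Python 3, where the function is called `zip_longest` (`izip_longest` on Python 2). The helper name `columns_from_rows` and the sample rows are assumptions for illustration; it works on in-memory rows rather than on a file.

```python
from itertools import zip_longest

def columns_from_rows(rows):
    """Transpose rows of strings into columns of floats, dropping padding."""
    return [
        [float(cell) for cell in column if cell is not None]
        for column in zip_longest(*rows)
    ]

# Hypothetical ragged input: the second row has only two cells.
sample_rows = [["1.0", "0.0", "5.0"], ["2.0", "0.0"], ["3.0", "0.0", "3.0"]]
ragged_columns = columns_from_rows(sample_rows)
print(ragged_columns)
```

A plain zip would stop at the shortest row and silently drop the third column, which is why the padded variant is used here.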

Answer 2

Here is a way to solve the problem without using pandas or csv.

Read the file into a list of rows, then use zip to turn it into a list of columns:

delim = ";"  # based on OP's comment
with open("myfile") as f:
    OutputList = [[float(x) for x in line.split(delim)] for line in f]
# list() is needed on Python 3, where zip() returns an iterator
OutputList = list(zip(*OutputList))

print(OutputList)
#[(1.0, 2.0, 3.0, 4.0, 5.0),
# (0.0, 0.0, 0.0, 0.0, 0.0),
# (5.0, 4.0, 3.0, 2.0, 1.0)]

This produces a list of tuples. If you want lists instead, you can convert them easily:

OutputList = [list(val) for val in OutputList]
print(OutputList)
#[[1.0, 2.0, 3.0, 4.0, 5.0],
# [0.0, 0.0, 0.0, 0.0, 0.0],
# [5.0, 4.0, 3.0, 2.0, 1.0]]
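
One caveat: the comprehension above calls float on every field, so a blank line (for example a trailing one at the end of the file) would raise a ValueError. A possible variant that skips such lines is sketched below. The in-memory `lines` list and its contents are assumptions standing in for the real file, and `;` is the delimiter assumed earlier in this answer.

```python
delim = ";"

# Hypothetical file contents, including a trailing blank line.
lines = ["1.0;0.0;5.0\n", "2.0;0.0;4.0\n", "3.0;0.0;3.0\n", "\n"]

# Skip lines that are empty after stripping whitespace, then parse the rest.
rows = [[float(x) for x in line.split(delim)] for line in lines if line.strip()]

# Transpose rows into columns and convert each tuple to a list.
columns = [list(col) for col in zip(*rows)]
print(columns)
```

With a real file, `lines` would be replaced by the open file object.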

Answer 3

You can try the default csv module together with the zip function:

import csv
with open('book1.csv') as f:
    reader = csv.reader(f)
    a = list(zip(*reader))
    for i in a:
        print(i)

The output is:

('1.0', '2.0', '3.0', '4.0', '5.0')
('0.0', '0.0', '0.0', '0.0', '0.0')
('5.0', '4.0', '3.0', '2.0', '1.0')
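
Note that the tuples above contain strings, whereas the question asks for lists of floats. A minimal sketch of the conversion follows, assuming every cell parses as a number. The `io.StringIO` sample and its comma delimiter are assumptions standing in for book1.csv, whose actual contents are not shown.

```python
import csv
import io

# Assumed stand-in for book1.csv: the question's values, comma-separated.
sample = io.StringIO(
    "1.0,0.0,5.0\n2.0,0.0,4.0\n3.0,0.0,3.0\n4.0,0.0,2.0\n5.0,0.0,1.0\n"
)

reader = csv.reader(sample)
# zip(*reader) transposes rows into columns; float() converts each cell.
float_columns = [[float(cell) for cell in column] for column in zip(*reader)]
print(float_columns)
```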

Answer 4

def sort_columns(myfile):
    # open the file with your data
    with open(myfile, "r") as f:
        # read all lines of the file into "rows"
        rows = f.readlines()

    # store the number of columns or width of your file
    width = len(rows[0].split())
    # initialize your "result" variable that will be a list of lists
    result = []
    # initialize i to 0 and use it to access each column value from your csv data
    i = 0
    while i < width:
        # initialize temp list before each while loop run
        temp = []
        # using list comprehension, store the i'th column from each row into temp
        temp = [ float(row.split()[i]) for row in rows if row.split() ]
        # temp now has the value of entire i'th column, append this to result
        result.append(temp)
        # increment i to access the next column
        i += 1
    # return your result
    return result

print(sort_columns("file-sort-columns.txt"))
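
The loop above splits every row again for each column. A shorter variant, offered as an alternative sketch rather than as this answer's own code, splits each row once and transposes with zip. The function name `columns_via_zip` and the temporary file holding the question's sample data are illustrative assumptions.

```python
import os
import tempfile

def columns_via_zip(path):
    """Read whitespace-separated numbers and return them column by column."""
    with open(path, "r") as f:
        # Split each line once; skip lines that contain no fields.
        rows = [line.split() for line in f if line.split()]
    # zip(*rows) transposes rows into columns.
    return [[float(cell) for cell in column] for column in zip(*rows)]

# Write the question's sample data to a temporary file for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1.0 0.0 5.0\n2.0 0.0 4.0\n3.0 0.0 3.0\n4.0 0.0 2.0\n5.0 0.0 1.0\n")
    tmp_path = tmp.name

result = columns_via_zip(tmp_path)
os.remove(tmp_path)
print(result)
```

Unlike the index-based loop, zip truncates to the shortest row, so this variant assumes every non-blank line has the same number of fields.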

Put CSV columns into a list of lists without using pandas
