Generate 3D array from my data

Time: 2016-10-20 19:57:19

Tags: python arrays io formatting

I am struggling to get the data below into a 3D array, and a little help would be appreciated. If the outline below is not what you would do, feel free to change it.

with open("data.txt","r") as f:
    info = []
    for line in f:
        if line[0] == "#":
            continue
        freq, delay, reading = line.split(',')

        if not info:
            # Enter first entry into info
        else:
            # Insert freq, delay and reading into 3D array

    # Print data.txt in something like the format at the end of question 
    # (not necessarily in order)
    for freq in info:
        print freq[0]
        for delay in freq:
            print "\t%d" % delay[0]
            for reading in delay:
                print "\t\t%f" % reading

Below is a sample data.txt

#freq,delay,reading
291,1000,-29.22
320,1000,-29.33
270,2000,-29.11
240,1500,-29.04
220,1000,-28.89
272,1000,-29.11
291,1500,-29.21
320,1000,-29.34
270,2000,-29.1
240,1000,-29.02
220,1500,-28.89
272,1500,-29.12
291,1000,-29.19
320,1000,-29.32
270,2000,-29.1
240,1000,-29.02
220,1000,-28.88
272,1000,-29.1

Roughly desired output below.

220
    1000
        -28.89
        -28.88
    1500    
        -28.89
240    
    1000    
        -29.02
        -29.02
    1500    
        -29.04
270    
    2000    
        -29.11
        -29.1
        -29.1
272    
    1000    
        -29.11
        -29.1
    1500    
        -29.12
291    
    1000    
        -29.22
        -29.19
    1500    
        -29.21
320    
    1000    
        -29.34
        -29.33
        -29.32

2 answers:

Answer 0 (score: 1)

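I don't know how big your text file is, but the simplest approach is to sort the lines before splitting them. That does require reading them all into memory (or writing a new file if there isn't enough memory, which is unlikely).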

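# Read every data line into memory, skipping blank lines and "#" comments.
lines = []
with open("data.txt") as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith("#"):
            lines.append(line)

# Plain text sort: rows with the same freq and delay end up adjacent,
# which is all the grouping below needs. It is not a numeric sort.
sorted_lines = sorted(lines)

last_freq = None
last_delay = None

for line in sorted_lines:
    freq, delay, reading = line.split(',')
    if freq != last_freq:
        print(freq)
        last_freq = freq
        last_delay = None  # a new freq always starts a new delay group
    if delay != last_delay:
        print("\t" + delay)
        last_delay = delay
    print("\t\t" + reading)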

Treat it as pseudocode, since I haven't tested it and there may be typos. But you'll get the idea.

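By the way, your output is not actually a 3D array. But if you want to generate a list of lists (which is how you would have to do it in Python), you can easily modify the code.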
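One possible way to build such a nested structure directly, without sorting first, is a dictionary of dictionaries of lists. This is only a sketch: the helper names `parse_readings` and `format_nested` are made up for illustration, the sample lines are a subset of the data in the question, and it assumes freq and delay are always integers.

```python
from collections import defaultdict

def parse_readings(lines):
    """Hypothetical helper: group readings as {freq: {delay: [reading, ...]}}."""
    nested = defaultdict(lambda: defaultdict(list))
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        freq, delay, reading = line.split(",")
        # Assumption: freq and delay are integers, reading is a float.
        nested[int(freq)][int(delay)].append(float(reading))
    return nested

def format_nested(nested):
    """Hypothetical helper: render the mapping as tab-indented text, sorted by key."""
    out = []
    for freq in sorted(nested):
        out.append("%d" % freq)
        for delay in sorted(nested[freq]):
            out.append("\t%d" % delay)
            for reading in sorted(nested[freq][delay]):
                out.append("\t\t%s" % reading)
    return "\n".join(out)

# A few lines from the question's data.txt, inlined so the sketch is self-contained.
sample = """#freq,delay,reading
291,1000,-29.22
220,1000,-28.89
291,1500,-29.21
220,1000,-28.88
"""
nested = parse_readings(sample.splitlines())
print(format_nested(nested))
```

To read the real file, something like `parse_readings(open("data.txt"))` should work, since a file object iterates over its lines. The dictionary keeps each reading under its freq and delay, so sorting is only needed at print time.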

Answer 1 (score: 1)

Not quite what you asked for, but concise:

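import numpy as np

# loadtxt skips lines starting with "#" by default, so the header is ignored.
data = np.loadtxt("data.txt", delimiter=",")
# lexsort uses its last key as the primary one; reversing the columns sorts
# by freq, then delay, then reading.
dsort = data[np.lexsort(data.T[::-1])]

print(dsort)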

Yields:

[[  220.    1000.     -28.89]
 [  220.    1000.     -28.88]
 [  220.    1500.     -28.89]
 [  240.    1000.     -29.02]
 [  240.    1000.     -29.02]
 [  240.    1500.     -29.04]
 [  270.    2000.     -29.11]
 [  270.    2000.     -29.1 ]
 [  270.    2000.     -29.1 ]
 [  272.    1000.     -29.11]
 [  272.    1000.     -29.1 ]
 [  272.    1500.     -29.12]
 [  291.    1000.     -29.22]
 [  291.    1000.     -29.19]
 [  291.    1500.     -29.21]
 [  320.    1000.     -29.34]
 [  320.    1000.     -29.33]
 [  320.    1000.     -29.32]]
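If the indented layout from the question is wanted, the sorted rows can be grouped afterwards, for example with `itertools.groupby`. The following is a rough sketch rather than part of the answer above: `group_sorted` is a hypothetical helper, and the array is a small inline subset of the sample data so that it runs without data.txt.

```python
from itertools import groupby
import numpy as np

# Inline sample rows (freq, delay, reading), an assumption made so the
# sketch does not depend on data.txt.
data = np.array([
    [291, 1000, -29.22],
    [220, 1500, -28.89],
    [220, 1000, -28.89],
    [291, 1000, -29.19],
    [220, 1000, -28.88],
])

# np.lexsort treats the LAST key as the primary one, so reversing the
# columns sorts by freq, then delay, then reading.
dsort = data[np.lexsort(data.T[::-1])]

def group_sorted(rows):
    """Hypothetical helper: nest sorted rows as [(freq, [(delay, [readings])])]."""
    grouped = []
    for freq, by_freq in groupby(rows, key=lambda r: r[0]):
        delays = []
        # Each inner group is consumed before the outer groupby advances.
        for delay, by_delay in groupby(by_freq, key=lambda r: r[1]):
            delays.append((int(delay), [float(r[2]) for r in by_delay]))
        grouped.append((int(freq), delays))
    return grouped

grouped = group_sorted(dsort)
for freq, delays in grouped:
    print(freq)
    for delay, readings in delays:
        print("\t%d" % delay)
        for reading in readings:
            print("\t\t%s" % reading)
```

`groupby` only merges adjacent rows with equal keys, which is why the rows have to be sorted first.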