Converting .txt data into two lists in Python

Time: 2014-08-04 13:17:01

Tags: python list text-files

I have this data in a .txt file:

    Round 1
Data Point 0: time=  0.0[hour], movement=   0.5[feet]
Data Point 1: time=  3.2[hour], movement=   5.54[feet]
Data Point 2: time= 10.1[hour], movement=   6.4[feet]
Data Point 3: time= 14.0[hour], movement=   7.02[feet]

+++++++++++++++++++++++++++++++++++++++++++++
    Round 2
Data Point 0: time=  0.0[hour], movement=  -5.2[feet]
Data Point 1: time=  2.3[hour], movement=   3.06[feet]
Data Point 2: time= 8.9[hour], movement=   4.07[feet]
Data Point 3: time= 9.4[hour], movement=   9.83[feet]

I want to extract the time and movement values and put them into two separate lists for each of Round 1 and Round 2. Example output:

time_1 = [0.0, 3.2, 10.1, 14.0]
movement_1 = [0.5, 5.54, 6.4, 7.02]

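Round 2 should follow the same format. I know the general approach of opening the file with a with statement and using for and if statements to look at each line, but I'm not sure how to handle each Round's data separately, or how to deal with the +++++ separator.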

4 answers:

Answer 0: (score: 2)

You can read the file first and split it into rounds:

import re
with open("myfile.txt") as infile:
    # raw string, so that \+ reaches the regex engine unchanged
    rounds = re.split(r"\+{10,}", infile.read())

Then iterate over the rounds and their lines:

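result = []
for round_text in rounds:   # renamed from "round" to avoid shadowing the built-in
    r = {"time":[], "move":[]}
    # each match is a (time, movement) pair of strings taken from one line
    for match in re.findall(r"time=\s+(\d+\.\d+).*movement=\s+(-?\d+\.\d+)",
                            round_text):
        time, move = float(match[0]), float(match[1])
        r["time"].append(time)
        r["move"].append(move)
    result.append(r)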

Result:

>>> result
[{'time': [0.0, 3.2, 10.1, 14.0], 'move': [0.5, 5.54, 6.4, 7.02]}, 
 {'time': [0.0, 2.3, 8.9, 9.4], 'move': [-5.2, 3.06, 4.07, 9.83]}]
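
To get the variable names used in the question, the dictionaries can be unpacked afterwards. The following is a minimal, self-contained sketch of the same split-then-match approach, assuming Python 3. The sample text is embedded as a string instead of being read from a file, and `SAMPLE`, `PAIR` and `parse_rounds` are names made up for this illustration.

```python
import re

# Sample data from the question, embedded so the sketch runs without a file.
SAMPLE = """\
    Round 1
Data Point 0: time=  0.0[hour], movement=   0.5[feet]
Data Point 1: time=  3.2[hour], movement=   5.54[feet]
Data Point 2: time= 10.1[hour], movement=   6.4[feet]
Data Point 3: time= 14.0[hour], movement=   7.02[feet]

+++++++++++++++++++++++++++++++++++++++++++++
    Round 2
Data Point 0: time=  0.0[hour], movement=  -5.2[feet]
Data Point 1: time=  2.3[hour], movement=   3.06[feet]
Data Point 2: time= 8.9[hour], movement=   4.07[feet]
Data Point 3: time= 9.4[hour], movement=   9.83[feet]
"""

# One (time, movement) pair per data line; "." does not cross line breaks.
PAIR = re.compile(r"time=\s*(-?\d+\.\d+).*?movement=\s*(-?\d+\.\d+)")


def parse_rounds(text):
    """Return one {"time": [...], "move": [...]} dict per round (hypothetical helper)."""
    result = []
    for chunk in re.split(r"\+{10,}", text):
        pairs = PAIR.findall(chunk)
        if not pairs:
            continue  # skip chunks without data lines
        result.append({
            "time": [float(t) for t, _ in pairs],
            "move": [float(m) for _, m in pairs],
        })
    return result


rounds = parse_rounds(SAMPLE)
time_1, movement_1 = rounds[0]["time"], rounds[0]["move"]
time_2, movement_2 = rounds[1]["time"], rounds[1]["move"]
print(time_1, movement_1)
print(time_2, movement_2)
```

Skipping chunks with no matches is a design choice made here so that a trailing separator line would not produce an empty round.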

Answer 1: (score: 0)

If your file looks exactly as posted:

import re

time_1 = []
movement_1 = []
time_2 = []
movement_2 = []
with open("in.txt") as f:
    for line in f:
        if line.startswith("+++++"):   # stop at the separator line
            break
        # numbers with a decimal point, optionally negative
        match = re.findall(r'-?\d+\.\d+', line)
        if match:
            time_1.append(match[0])
            movement_1.append(match[1])
    for line in f:       # continue with the lines after the separator
        match = re.findall(r'-?\d+\.\d+', line)
        if match:
            time_2.append(match[0])
            movement_2.append(match[1])

print(time_1, movement_1)
print(time_2, movement_2)
['0.0', '3.2', '10.1', '14.0'] ['0.5', '5.54', '6.4', '7.02']
['0.0', '2.3', '8.9', '9.4'] ['-5.2', '3.06', '4.07', '9.83']

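If you want floats, use time_1.append(float(match[0])) and so on.

A more general version follows. The sublists at the same index in times and movements belong to the same round: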

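import re

times = []
movements = []

with open("in.txt") as f:
    # split the whole file into one chunk per round at the row of "+" signs
    chunks = re.split(r"\+{5,}", f.read())

for chunk in chunks:
    # findall returns the numbers in file order: time, movement, time, movement, ...
    match = re.findall(r'-?\d+\.\d+', chunk)
    times.append(match[::2])        # even positions are times
    movements.append(match[1::2])   # odd positions are movements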

If you have three rounds, just unpack:

# unpacking like this requires exactly three rounds in the file
r1_times, r2_times, r3_times = times
r1_move, r2_move, r3_move = movements

print(r1_times, r1_move)
['0.0', '3.2', '10.1', '14.0'] ['0.5', '5.54', '6.4', '7.02']
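
The [::2] and [1::2] slices work because re.findall returns the numbers in the order they appear in the text, so times end up at even positions and movements at odd ones. A small sketch of that idea on part of Round 2, assuming Python 3 and converting to floats along the way; the `chunk` string is copied from the question and the variable names are only illustrative.

```python
import re

# Two data lines copied from Round 2 of the question.
chunk = (
    "Data Point 0: time=  0.0[hour], movement=  -5.2[feet]\n"
    "Data Point 1: time=  2.3[hour], movement=   3.06[feet]\n"
)

# Only numbers with a decimal point match, so the "0" and "1"
# in "Data Point 0" / "Data Point 1" are ignored.
numbers = [float(n) for n in re.findall(r"-?\d+\.\d+", chunk)]

times = numbers[::2]        # positions 0, 2, ...
movements = numbers[1::2]   # positions 1, 3, ...
print(times, movements)
```

Note that this pairing relies on every value being written with a decimal point. A value written as a plain integer would not match and would shift the alternation.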

Answer 2: (score: 0)

This is a bit dirty, but it gives you two lists that each contain one list per round, so totalTime will be [time_1, time_2] and totalMovement will be [movement_1, movement_2].

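time = []
movement = []
totalTime = []
totalMovement = []
with open('data.txt') as f:
    for line in f:
        if line.startswith('Data'):   # skips blank lines and "Round" headers
            # the text between the first '=' and the first '[' is the time
            tempTime = line[line.find('=')+1:line.find('[')]
            time.append(float(tempTime))
            # the text between 't=' (end of 'movement=') and '[feet' is the movement
            tempMovement = line[line.find('t=')+2:line.find('[feet')]
            movement.append(float(tempMovement))
        elif line.find('+') != -1:
            # separator line: store the finished round and start a new one
            totalTime.append(time)
            totalMovement.append(movement)
            time = []
            movement = []

# the last round has no separator after it, so store it here
if time:
    totalTime.append(time)
    totalMovement.append(movement)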
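
To see what the find-based slicing extracts, here is a short worked example on a single line from the question, assuming Python 3. The variable names are only illustrative. The slices keep the leading spaces, which float() tolerates.

```python
line = "Data Point 1: time=  3.2[hour], movement=   5.54[feet]"

# The first "=" belongs to "time=" and the first "[" opens "[hour]".
time_text = line[line.find("=") + 1 : line.find("[")]

# "t=" occurs only at the end of "movement=" in this line format.
movement_text = line[line.find("t=") + 2 : line.find("[feet")]

# float() ignores surrounding whitespace, so no strip() is needed.
time_value = float(time_text)
movement_value = float(movement_text)
print(repr(time_text), repr(movement_text))
print(time_value, movement_value)
```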

Answer 3: (score: 0)

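This way you get two lists, each containing as many sublists as there are rounds. You can get what you want by looking at the first sublist, the second sublist (first round, second round), and so on.
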
with open("prova.txt", "r") as f:       # put the right filename here
    Round = -1
    Times = []
    Movements = []
    for i in f:
        if "Round" in i:
            # a new round starts: open a fresh sublist for it
            Round = Round + 1
            Times.append([])
            Movements.append([])
        if i[0:4] == "Data":
            Times[Round].append(float(i.split("=")[1].split("[")[0]))
            Movements[Round].append(float(i.split("=")[2].split("[")[0]))

print(Times)
print(Movements)

# Output:
# [[0.0, 3.2, 10.1, 14.0], [0.0, 2.3, 8.9, 9.4]]
# [[0.5, 5.54, 6.4, 7.02], [-5.2, 3.06, 4.07, 9.83]]

print(Times[0])  # times of the first round
print(Times[1])  # times of the second round

...and so on (depending on how many rounds the text file contains).
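
The same line-by-line idea can be wrapped in a function that accepts any iterable of lines, which makes it easy to try without a file. This is only a sketch under the assumption that the format is exactly as in the question; `collect_rounds` and `sample` are hypothetical names, and Python 3 is assumed.

```python
def collect_rounds(lines):
    """Return (times, movements), each a list with one sublist per round."""
    times, movements = [], []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("Round"):
            # A "Round N" header opens a new pair of sublists.
            times.append([])
            movements.append([])
        elif stripped.startswith("Data") and times:
            # "…time=  0.0[hour], movement=   0.5[feet]" split on "=" gives
            # three fields; the numbers precede the "[" in fields 1 and 2.
            fields = stripped.split("=")
            times[-1].append(float(fields[1].split("[")[0]))
            movements[-1].append(float(fields[2].split("[")[0]))
    return times, movements


# A shortened sample in the question's format.
sample = [
    "    Round 1",
    "Data Point 0: time=  0.0[hour], movement=   0.5[feet]",
    "Data Point 1: time=  3.2[hour], movement=   5.54[feet]",
    "",
    "+++++++++++++++++++++++++++++++++++++++++++++",
    "    Round 2",
    "Data Point 0: time=  0.0[hour], movement=  -5.2[feet]",
]

times, movements = collect_rounds(sample)
print(times)
print(movements)
```

Because an open file object is itself an iterable of lines, it could be passed to the function directly inside a with block.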