Parsing a txt file in Python

Time: 2018-11-22 13:36:02

Tags: python parsing

I have a txt file containing accelerometer data, and I want to parse it into columns.

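The data is shown below. I only want the values as columns (X value, Y value, Z value, time difference in milliseconds), and I want to strip the file's header and footer.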

# Accelerometer Values
# filename:  default__3.txt
# Saving start time: Sat Nov 15 11:09:33 GMT+03:30 2014

# sensor resolution: 0.1m/s^2
#Sensorvendor: ST Microelectronic, name: ST accelerometer, type: 1,version : 104, range 16.0

# X value, Y value, Z value, time diff in ms
-3.236 -4.726 8.982 1
-3.206 -4.716 8.884 10
-3.187 -4.716 8.816 10
-3.138 -4.716 8.757 10
-3.138 -4.746 8.757 1
-3.059 -4.815 8.816 9
-3.059 -4.864 8.825 10
-3.069 -5.021 8.865 10
-3.069 -4.903 8.865 1
-3.089 -4.864 8.924 9
-3.108 -4.903 9.051 13
-3.157 -4.903 9.247 8
-3.206 -4.893 9.404 9
-3.275 -4.883 9.581 11
-3.314 -4.726 9.62 10
-3.314 -4.805 9.62 1
-3.324 -4.756 9.512 9
-3.324 -4.667 9.335 11
-3.246 -4.589 9.247 9
-3.177 -4.56 9.041 11
-3.02 -4.56 8.855 9
-3.128 -4.54 8.855 1
-3.098 -4.628 8.708 10
-3.098 -4.628 8.62 9
-3.02 -4.687 8.62 1
-3.02 -4.687 8.541 9
-2.991 -4.775 8.541 1
-2.961 -4.805 8.512 10

# end
#Sat Nov 15 11:10:36 GMT+03:30 2014

2 answers:

Answer 0 (score: 1)

Since the comment lines in your file are clearly marked with '#', filtering them out is straightforward.

Here is what I came up with:

with open("default__3.txt", "r") as f:
    lines = f.readlines()

x_values = []
y_values = []
z_values = []
time_diffs = []

for line in lines:
    if line.startswith('#'):  # filter out comment lines
        continue
    tokens = line.split()  # split on any whitespace; this also drops the trailing newline
    if len(tokens) < 4:  # filter out blank or incomplete lines
        continue
    x_values.append(float(tokens[0]))
    y_values.append(float(tokens[1]))
    z_values.append(float(tokens[2]))
    time_diffs.append(int(tokens[3]))

print(x_values)
print(y_values)
print(z_values)
print(time_diffs)

This puts your values into lists, which you can then work with however you like. Running it prints the following:

[-3.236, -3.206, -3.187, -3.138, -3.138, -3.059, -3.059, -3.069, -3.069, -3.089, -3.108, -3.157, -3.206, -3.275, -3.314, -3.314, -3.324, -3.324, -3.246, -3.177, -3.02, -3.128, -3.098, -3.098, -3.02, -3.02, -2.991, -2.961]
[-4.726, -4.716, -4.716, -4.716, -4.746, -4.815, -4.864, -5.021, -4.903, -4.864, -4.903, -4.903, -4.893, -4.883, -4.726, -4.805, -4.756, -4.667, -4.589, -4.56, -4.56, -4.54, -4.628, -4.628, -4.687, -4.687, -4.775, -4.805]
[8.982, 8.884, 8.816, 8.757, 8.757, 8.816, 8.825, 8.865, 8.865, 8.924, 9.051, 9.247, 9.404, 9.581, 9.62, 9.62, 9.512, 9.335, 9.247, 9.041, 8.855, 8.855, 8.708, 8.62, 8.62, 8.541, 8.541, 8.512]
[1, 10, 10, 10, 1, 9, 10, 10, 1, 9, 13, 8, 9, 11, 10, 1, 9, 11, 9, 11, 9, 1, 10, 9, 1, 9, 1, 10]
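
If you need this in more than one place, the same filtering can be wrapped in a function. The following is only a sketch of one way to do it: the name parse_columns and the inline sample string are mine, not part of your file or of any library, and it assumes every data line has exactly four whitespace-separated fields.

```python
def parse_columns(lines):
    """Return (xs, ys, zs, dts) lists from an iterable of text lines.

    Hypothetical helper: lines starting with '#' and blank lines are skipped;
    every remaining line is assumed to hold exactly four fields.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # header, footer, or blank line
        x, y, z, dt = line.split()
        rows.append((float(x), float(y), float(z), int(dt)))
    if not rows:
        return [], [], [], []
    # zip(*rows) transposes the list of rows into four columns
    xs, ys, zs, dts = (list(col) for col in zip(*rows))
    return xs, ys, zs, dts


# Small inline sample in the same layout as the file in the question
sample = """# Accelerometer Values
# X value, Y value, Z value, time diff in ms
-3.236 -4.726 8.982 1
-3.206 -4.716 8.884 10

# end
"""

xs, ys, zs, dts = parse_columns(sample.splitlines())
print(xs, ys, zs, dts)
```

For the real file you could pass the open file object directly, e.g. parse_columns(f) inside the with block, since iterating over a file yields its lines.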

Answer 1 (score: 0)

Don't reinvent the wheel. Use pandas to load and process the data:

>>> import pandas as pd
>>> data = pd.read_csv('data.txt', sep=' ', header=None, comment='#')
>>> data
        0      1      2   3
0  -3.236 -4.726  8.982   1
1  -3.206 -4.716  8.884  10
2  -3.187 -4.716  8.816  10
3  -3.138 -4.716  8.757  10
4  -3.138 -4.746  8.757   1
5  -3.059 -4.815  8.816   9
6  -3.059 -4.864  8.825  10
7  -3.069 -5.021  8.865  10
8  -3.069 -4.903  8.865   1
9  -3.089 -4.864  8.924   9
10 -3.108 -4.903  9.051  13
11 -3.157 -4.903  9.247   8
12 -3.206 -4.893  9.404   9
13 -3.275 -4.883  9.581  11
14 -3.314 -4.726  9.620  10
15 -3.314 -4.805  9.620   1
16 -3.324 -4.756  9.512   9
17 -3.324 -4.667  9.335  11
18 -3.246 -4.589  9.247   9
19 -3.177 -4.560  9.041  11
20 -3.020 -4.560  8.855   9
21 -3.128 -4.540  8.855   1
22 -3.098 -4.628  8.708  10
23 -3.098 -4.628  8.620   9
24 -3.020 -4.687  8.620   1
25 -3.020 -4.687  8.541   9
26 -2.991 -4.775  8.541   1
27 -2.961 -4.805  8.512  10

To get a specific column as an array:

>>> data[2].values
array([8.982, 8.884, 8.816, 8.757, 8.757, 8.816, 8.825, 8.865, 8.865,
       8.924, 9.051, 9.247, 9.404, 9.581, 9.62 , 9.62 , 9.512, 9.335,
       9.247, 9.041, 8.855, 8.855, 8.708, 8.62 , 8.62 , 8.541, 8.541,
       8.512])
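
Since the question asks for columns labelled X, Y, Z and time difference, it may be more convenient to name them when loading. This is a sketch under a few assumptions: the labels x, y, z and dt_ms are my own choice, the data is read from an inline string instead of the real file so the example is self-contained, and sep is set to a whitespace regex so that repeated spaces or tabs would also be tolerated.

```python
import io

import pandas as pd

# Inline stand-in for the file; same layout as the data in the question
sample = io.StringIO(
    "# Accelerometer Values\n"
    "\n"
    "# X value, Y value, Z value, time diff in ms\n"
    "-3.236 -4.726 8.982 1\n"
    "-3.206 -4.716 8.884 10\n"
    "\n"
    "# end\n"
)

data = pd.read_csv(
    sample,
    sep=r"\s+",                       # any run of whitespace separates fields
    header=None,                      # the file has no header row to parse
    comment="#",                      # ignore everything from '#' onward
    names=["x", "y", "z", "dt_ms"],   # illustrative column labels
)

z_values = data["z"].tolist()
dt_values = data["dt_ms"].tolist()
print(data)
```

Columns can then be selected by label, for example data["z"].values, instead of by position.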