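I'm trying to read a SoDAR data file into Python. It would be easy if the file had only one header block, but it has 4 header lines every 40 lines. In addition, the date and time are in one of the header lines.
Can anyone help me read in both the data and the date/time? (I assume I'll have to read the file twice to get these two pieces of information.)
The file is a space-delimited .dat file.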
SJSU_Sodar_DiabloCanyon 04/21/2013 00:00:00 TO 04/21/2013 00:10:00 VR1.44 4400 150 100 60 15 0 0
600 5 20 7 0 0 25 15 64 1000 6 5 5 -600 600 -600 600 -400 400 0 10 359 100 63 1 80 7 1 0 0 59 2 12 6 14 6 0 0 0 5 5
3 COMPONENT 37HTS ZENITH 16-16 ARA 359 SEPANG 090 MXHT 0 UNOISE 15 VNOISE 13 WNOISE 21 ANTENNA STATUS: OK AC STATUS:N/A BATTV:12.67 TEMPC: 3.9
HT SPD DIR W SDW IW GSPD GDIR U SDU NU IU SNRU V SDV NV IV SNRV NW SNRW
200 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 5 15 5 99.99 99.99 3 12 5 3 6
195 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 3 15 5 99.99 99.99 0 12 5 2 6
190 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 2 15 5 99.99 99.99 4 12 5 8 6
185 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 2 15 5 99.99 99.99 1 12 5 5 6
180 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 0 15 5 99.99 99.99 2 12 5 3 6
175 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 0 15 5 99.99 99.99 3 12 5 1 6
170 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 0 15 5 99.99 99.99 4 12 5 4 6
165 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 0 15 5 99.99 99.99 1 12 5 7 6
160 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 1 15 5 99.99 99.99 5 12 5 5 6
155 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 1 15 5 99.99 99.99 3 13 5 5 6
150 99.99 9999 99.99 99.99 22 99.99 9999 99.99 99.99 1 15 5 99.99 99.99 1 13 5 6 6
145 99.99 9999 99.99 99.99 22 99.99 9999 99.99 99.99 1 15 5 99.99 99.99 0 13 5 3 6
140 99.99 9999 99.99 99.99 22 99.99 9999 99.99 99.99 2 15 5 99.99 99.99 1 12 5 5 6
135 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 2 15 5 99.99 99.99 1 12 5 4 6
130 99.99 9999 99.99 99.99 22 99.99 9999 99.99 99.99 4 16 5 99.99 99.99 2 13 5 1 6
125 99.99 9999 99.99 99.99 22 99.99 9999 99.99 99.99 3 16 5 99.99 99.99 1 13 5 5 6
120 99.99 9999 99.99 99.99 22 99.99 9999 99.99 99.99 0 16 5 99.99 99.99 5 13 5 7 6
115 99.99 9999 99.99 99.99 22 99.99 9999 99.99 99.99 2 16 5 99.99 99.99 2 13 5 4 6
110 99.99 9999 99.99 99.99 23 99.99 9999 99.99 99.99 1 16 5 99.99 99.99 0 14 5 7 6
105 1.54 137 0.28 1.13 27 2.32 146 -1.02 0.36 16 20 6 1.15 0.80 16 18 6 16 6
100 1.24 93 0.16 0.75 33 3.53 118 -1.24 0.61 36 24 6 0.09 0.42 34 26 6 32 7
95 0.77 70 0.09 0.58 40 3.06 140 -0.73 0.51 48 30 7 -0.25 0.57 46 32 7 42 7
90 0.63 80 -0.06 0.55 39 3.42 84 -0.62 0.46 38 34 7 -0.10 0.60 45 37 7 41 7
85 0.33 13 0.00 0.55 41 3.11 239 -0.08 0.41 49 39 7 -0.32 0.54 37 36 7 61 9
80 0.62 352 0.06 0.42 43 3.02 30 0.08 0.43 60 41 8 -0.62 0.62 30 33 6 57 9
75 0.96 354 0.06 0.45 46 3.50 305 0.08 0.51 53 42 7 -0.96 0.75 26 35 6 61 9
70 0.80 318 0.03 0.41 58 4.13 1 0.52 0.36 57 51 8 -0.60 0.70 41 42 7 69 10
65 1.23 273 0.01 0.33 86 3.25 263 1.23 0.40 68 75 9 -0.08 0.49 63 61 8 83 13
60 1.40 262 0.00 0.39 118 2.77 268 1.39 0.38 83 103 10 0.18 0.39 71 78 9 98 15
55 0.66 267 0.05 0.26 152 2.14 356 0.66 0.29 93 129 12 0.02 0.33 90 97 11 103 17
50 1.12 260 -0.09 0.51 173 2.61 243 1.11 0.34 100 134 12 0.18 0.54 95 114 11 102 19
45 0.44 345 0.07 0.32 214 4.70 26 0.11 0.33 99 153 12 -0.42 0.56 91 158 12 105 19
40 0.37 15 0.03 0.25 268 2.38 0 -0.10 0.34 101 172 13 -0.36 0.55 103 208 14 105 22
35 0.10 144 -0.03 0.30 270 1.88 191 -0.06 0.29 94 169 13 0.08 0.45 92 189 13 100 22
30 0.40 40 0.03 0.32 241 2.50 107 -0.26 0.37 90 175 12 -0.30 0.35 92 185 12 98 19
25 0.62 70 0.00 0.38 234 2.75 76 -0.58 0.37 96 190 13 -0.20 0.40 94 171 12 102 21
20 1.42 78 -0.07 0.35 333 4.40 82 -1.39 0.43 103 284 14 -0.28 0.40 95 232 11 104 18
SJSU_Sodar_DiabloCanyon 04/21/2013 00:10:00 TO 04/21/2013 00:20:00 VR1.44 4400 150 100 60 15 0 0
600 5 20 7 0 0 25 15 64 1000 6 5 5 -600 600 -600 600 -400 400 0 10 359 100 63 1 80 7 1 0 0 59 2 12 6 14 6 0 0 0 5 5
3 COMPONENT 37HTS ZENITH 16-16 ARA 359 SEPANG 090 MXHT 0 UNOISE 15 VNOISE 12 WNOISE 21 ANTENNA STATUS: OK AC STATUS:N/A BATTV:12.67 TEMPC: 3.8
HT SPD DIR W SDW IW GSPD GDIR U SDU NU IU SNRU V SDV NV IV SNRV NW SNRW
200 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 0 15 5 99.99 99.99 3 13 5 1 6
195 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 2 15 5 99.99 99.99 1 12 5 1 6
190 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 1 15 5 99.99 99.99 0 12 5 1 6
185 99.99 9999 99.99 99.99 21 99.99 9999 99.99 99.99 1 15 5 99.99 99.99 2 12 5 0 6
Answer 0 (score: 4)
If it's always 4 header lines, then a 40-line block, then 4 header lines, and so on, you could adapt the following:
from itertools import cycle, islice

with open('input_file.dat') as fin:
    # Alternately read 4 header lines and 40 data lines until the file runs out;
    # adjust the counts to match the file.
    lines = iter(lambda limit=cycle([4, 40]): [line.split() for line in islice(fin, next(limit))], [])
    for header, data in zip(*[iter(lines)] * 2):
        print(header[0][1])  # date
        print(data[0])  # first row of data under header
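The one-liner above packs a lot into a single expression, so here is a more explicit sketch of the same fixed-size idea. The function name `read_blocks`, its parameters, and the shortened sample lines (with the placeholder station name `SITE`) are assumptions made for illustration, not part of the original answer. Note that the sample in the question shows 37 data rows per block (heights 200 down to 20, and `37HTS` in the header), so the data count probably needs adjusting for that file.

```python
from itertools import islice


def read_blocks(lines, n_header=4, n_data=40):
    """Yield (header_rows, data_rows) pairs from an iterable of text lines.

    Each row is a list of whitespace-separated tokens. Iteration stops
    when no header lines are left.
    """
    it = iter(lines)
    while True:
        header = [line.split() for line in islice(it, n_header)]
        if not header:
            return
        data = [line.split() for line in islice(it, n_data)]
        yield header, data


# Shortened, hypothetical sample: 4 header lines followed by 2 data rows.
sample = [
    "SITE 04/21/2013 00:00:00 TO 04/21/2013 00:10:00",
    "600 5 20",
    "3 COMPONENT 37HTS",
    "HT SPD DIR",
    "200 99.99 9999",
    "195 99.99 9999",
    "SITE 04/21/2013 00:10:00 TO 04/21/2013 00:20:00",
    "600 5 20",
    "3 COMPONENT 37HTS",
    "HT SPD DIR",
    "200 1.54 137",
    "195 1.24 93",
]

blocks = list(read_blocks(sample, n_header=4, n_data=2))
for header, data in blocks:
    # header[0] is the first header line; tokens 1 and 2 are the start date and time.
    print(header[0][1], header[0][2], data[0])
```

For a real file, passing the open file object in place of `sample` should work the same way, since `read_blocks` only needs an iterable of lines.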
If checking whether a line starts with a digit is a more appropriate test than fixed-size blocks, then:
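from itertools import groupby

with open('input_file.dat') as fin:
    # Group consecutive lines by whether they start with a digit. Note: header
    # lines 2 and 3 in the sample also start with a digit, so adapt the test.
    grouped = groupby(fin, lambda L: L.lstrip()[0].isdigit())
    lines = ([line.split() for line in g] for k, g in grouped)
    for header, data in zip(*[iter(lines)] * 2):
        print(header[0][1])  # date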
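Since some header lines in the sample begin with a digit, another option is to key on content instead: treat a line containing a `date time TO date time` stamp as the start of a block, and the line beginning with `HT` as the column names. The sketch below does this in a single pass, so the file does not need to be read twice. The function name `parse_sodar`, the regular expression, and the shortened sample are assumptions for illustration; values such as 99.99 and 9999 are kept as they are, because the thread does not say how they should be interpreted.

```python
import re
from datetime import datetime

# Matches "MM/DD/YYYY HH:MM:SS TO MM/DD/YYYY HH:MM:SS" in the first header line.
STAMP = re.compile(
    r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) TO (\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})"
)
FMT = "%m/%d/%Y %H:%M:%S"


def parse_sodar(lines):
    """Return one dict per data row, tagged with its block's start and end time."""
    records = []
    start = end = columns = None
    for line in lines:
        match = STAMP.search(line)
        if match:
            start, end = (datetime.strptime(s, FMT) for s in match.groups())
            columns = None  # data rows begin only after the column-name line
            continue
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if tokens[0] == "HT":
            columns = tokens
            continue
        if columns is None or len(tokens) != len(columns):
            continue  # remaining header lines
        row = dict(zip(columns, map(float, tokens)))
        row["start"], row["end"] = start, end
        records.append(row)
    return records


# Shortened sample based on the question's file (fewer columns and rows).
sample = """\
SJSU_Sodar_DiabloCanyon 04/21/2013 00:00:00 TO 04/21/2013 00:10:00 VR1.44 4400
600 5 20 7
3 COMPONENT 37HTS ZENITH 16-16
HT SPD DIR
200 99.99 9999
105 1.54 137
SJSU_Sodar_DiabloCanyon 04/21/2013 00:10:00 TO 04/21/2013 00:20:00 VR1.44 4400
600 5 20 7
3 COMPONENT 37HTS ZENITH 16-16
HT SPD DIR
200 99.99 9999
""".splitlines()

records = parse_sodar(sample)
for rec in records:
    print(rec["start"], rec["HT"], rec["SPD"], rec["DIR"])
```

With a real file, `parse_sodar(open('input_file.dat'))` should behave the same way, and the resulting list of dicts can be handed to whatever analysis tool is in use.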