I am trying to write a Python script that extracts Wi-Fi data from a txt file into a csv file.
This is the txt data:
Wed Oct 7 09:00:01 UTC 2020
BSS 02:ca:fe:ca:ca:40(on ap0_1)
freq: 2422
capability: IBSS (0x0012)
signal: -60.00 dBm
primary channel: 3
last seen: 30 ms ago
BSS ac:86:74:0a:73:a8(on ap0_1)
TSF: 229102338752 usec (2d, 15:38:22)
freq: 2422
capability: ESS (0x0421)
signal: -62.00 dBm
primary channel: 3
I need to extract the txt data into a csv file in the following format:
Time                        |BSS                        | freq |capability   |signal| primary channel |
----------------------------+---------------------------+------+-------------+------+-----------------+
Wed Oct 7 09:00:01 UTC 2020 |02:ca:fe:ca:ca:40(on ap0_1)| 2422 |IBSS (0x0012)|-60.00| 3               |
                            |ac:86:74:0a:73:a8(on ap0_1)| 2422 |IBSS (0x0012)|-62.00| 3               |
This is my unfinished code:
import csv
import re
fieldnames = ['TIME', 'BSS', 'FREQ','CAPABILITY', 'SIGNAL', 'CHANNEL']
re_fields = re.compile(r'({})+:\s(.*)'.format('|'.join(fieldnames)), re.I)
with open('ap0_1.txt') as f_input, open('ap0_1.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames= fieldnames)
    csv_output.writeheader()
    start = False
    for line in f_input:
        line = line.strip()
        if len(line):
            if 'BSS' in line:
                if start:
                    start = False
                    block.append(line)
                    text_block = '\n'.join(block)
                    for field, value in re_fields.findall(text_block):
                        entry[field.upper()] = value
                    if line[0] == 'on ap0_1':
                        entry['BSS'] = block[0]
                    csv_output.writerow(entry)
                else:
                    start = True
                    entry = {}
                    block = [line]
            elif start:
                block.append(line)
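When I run it, the data ends up in the wrong places.
Please let me know how to fix this. I am only a beginner at programming, so any help would be appreciated. Thanks.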
Answer 0 (score: 1)
Use str.startswith.
For example:
import csv
# Paths taken from the question; adjust as needed.
filename = 'ap0_1.txt'
outfile = 'ap0_1.csv'
fieldnames = ('TIME', 'BSS', 'freq', 'capability', 'signal', 'primary channel')
with open(filename) as f_input, open(outfile, 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames)
    csv_output.writeheader()
    result = {"TIME": next(f_input).strip()}  # the first line holds the timestamp
    for line in f_input:
        line = line.strip()
        # str.startswith accepts a tuple and matches any of the prefixes
        if line.startswith(fieldnames):
            if line.startswith('BSS'):
                key, value = line.split(" ", 1)
            else:
                key, value = line.split(": ", 1)
            result[key] = value
    # Only one row is written, so with several BSS blocks the last one wins
    csv_output.writerow(result)
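To isolate the key step of this answer: `str.startswith` accepts a tuple of prefixes and returns True if any of them matches. The sketch below shows only that line-classification step. `parse_field` and `FIELDNAMES` are hypothetical names introduced for illustration and are not part of the answer.

```python
FIELDNAMES = ('BSS', 'freq', 'capability', 'signal', 'primary channel')

def parse_field(line):
    """Return (key, value) for a recognised line, or None otherwise."""
    line = line.strip()
    if not line.startswith(FIELDNAMES):  # a tuple means "any of these prefixes"
        return None
    # 'BSS' lines have no colon, so they are split on the first space instead
    sep = ' ' if line.startswith('BSS') else ': '
    key, _, value = line.partition(sep)
    return key, value

print(parse_field('BSS 02:ca:fe:ca:ca:40(on ap0_1)'))
print(parse_field('primary channel: 3'))
print(parse_field('last seen: 30 ms ago'))  # not a wanted field
```

Lines such as `TSF: ...` or `last seen: ...` are skipped because they do not start with any of the listed prefixes.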
Edit based on the comments
If the text above contains multiple blocks:
import re
import csv
# Paths taken from the question; adjust as needed.
filename = 'ap0_1.txt'
outfile = 'ap0_1.csv'
week_ptrn = re.compile(r"\b(" + "|".join(('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')) + r")\b")
fieldnames = ('TIME', 'BSS', 'freq', 'capability', 'signal', 'primary channel')
with open(filename) as f_input, open(outfile, 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames)
    csv_output.writeheader()
    result = []  # one dict per timestamped block
    for line in f_input:
        line = line.strip()
        week = week_ptrn.match(line)
        if week:
            result.append({"TIME": line})  # a weekday prefix starts a new block
        if line.startswith(fieldnames):
            if line.startswith('BSS'):
                key, value = line.split(" ", 1)
            else:
                key, value = line.split(": ", 1)
            result[-1][key] = value
    csv_output.writerows(result)
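The sample in the question has two BSS entries under a single timestamp, and the requested table has one row per BSS with the time shown only on the first row. The following is a minimal sketch of one way to get that shape, not the answerer's method. `iter_rows`, `DAY_PTRN` and the in-memory `sample` are hypothetical names, and stripping the ` dBm` unit is an assumption based on the requested table.

```python
import csv
import io
import re

FIELDNAMES = ('TIME', 'BSS', 'freq', 'capability', 'signal', 'primary channel')
DAY_PTRN = re.compile(r'(Mon|Tue|Wed|Thu|Fri|Sat|Sun)\b')

def iter_rows(lines):
    """Yield one dict per BSS block; TIME is set only on the first block of a scan."""
    pending_time = ''
    row = None
    for line in lines:
        line = line.strip()
        if DAY_PTRN.match(line):
            pending_time = line
        elif line.startswith('BSS '):
            if row is not None:
                yield row  # flush the previous BSS block
            row = {'TIME': pending_time, 'BSS': line.split(' ', 1)[1]}
            pending_time = ''  # later blocks of the same scan leave TIME empty
        elif row is not None and line.startswith(FIELDNAMES[2:]):
            key, _, value = line.partition(': ')
            # assumption: the unit is not wanted in the signal column
            row[key] = value.replace(' dBm', '') if key == 'signal' else value
    if row is not None:
        yield row

sample = """Wed Oct 7 09:00:01 UTC 2020
BSS 02:ca:fe:ca:ca:40(on ap0_1)
freq: 2422
capability: IBSS (0x0012)
signal: -60.00 dBm
primary channel: 3
last seen: 30 ms ago
BSS ac:86:74:0a:73:a8(on ap0_1)
TSF: 229102338752 usec (2d, 15:38:22)
freq: 2422
capability: ESS (0x0421)
signal: -62.00 dBm
primary channel: 3
"""

out = io.StringIO()  # stands in for the output file
writer = csv.DictWriter(out, fieldnames=FIELDNAMES, restval='')
writer.writeheader()
writer.writerows(iter_rows(sample.splitlines()))
print(out.getvalue())
```

To write a real file, the `io.StringIO()` object can be replaced by `open(outfile, 'w', newline='')` and `sample.splitlines()` by the opened input file.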
Answer 1 (score: 0)
Here is my version of the code.
import csv
fieldnames = ['TIME', 'BSS', 'FREQ', 'CAPABILITY', 'SIGNAL', 'CHANNEL']
days = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')
with open('ap0_1.txt') as f_input, open('ap0_1.csv', 'w', newline='') as f_output:
    # extrasaction='ignore' drops keys such as TSF that are not columns
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames,
                                restval='', extrasaction='ignore')
    csv_output.writeheader()
    row = dict()
    for line in f_input:
        line = line.strip()
        if not line:
            continue
        elif line.startswith(days):
            # a new timestamp: flush any finished block before starting over
            if 'BSS' in row:
                csv_output.writerow(row)
            row = {'TIME': line}
        elif line.startswith('BSS '):
            # not sure how you define the start of a new block, say, it is by 'BSS' string
            if 'BSS' in row:
                csv_output.writerow(row)
                row = dict()
            row['BSS'] = line.split(' ', 1)[1]  # split one time exactly
        elif ': ' in line:
            key, value = line.split(': ', 1)
            key = key.upper()
            if key == 'PRIMARY CHANNEL':
                key = 'CHANNEL'  # map onto the shorter column name
            row[key] = value
    if row:
        csv_output.writerow(row)  # write the last block
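It looks like the '\n' was creating the blank rows.
Answer 2 (score: 0)
You are searching for the time under the name "TIME", but the string "TIME" never appears in the input data, so an empty time column in the output is to be expected.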
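This can be checked by running the question's regular expression against the first block of the sample data. The sketch below is only an illustration; `text_block` here is built by hand, and `missing` is a hypothetical name for the columns the pattern never fills.

```python
import re

fieldnames = ['TIME', 'BSS', 'FREQ', 'CAPABILITY', 'SIGNAL', 'CHANNEL']
re_fields = re.compile(r'({})+:\s(.*)'.format('|'.join(fieldnames)), re.I)

text_block = '\n'.join([
    'Wed Oct 7 09:00:01 UTC 2020',
    'BSS 02:ca:fe:ca:ca:40(on ap0_1)',
    'freq: 2422',
    'capability: IBSS (0x0012)',
    'signal: -60.00 dBm',
    'primary channel: 3',
])

# The pattern only matches "<name>: <value>", so lines without that shape are skipped
entry = {field.upper(): value for field, value in re_fields.findall(text_block)}
missing = [name for name in fieldnames if name not in entry]

print(entry)
print(missing)  # the columns that stay empty
```

The timestamp line has no `TIME:` prefix and the `BSS` line has no colon after `BSS`, so neither is captured by the pattern.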
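I think the following condition is also a problem.
    if line[0] == 'on ap0_1':
        entry['BSS'] = block[0]
As far as I can tell, you are trying to find on ap0_1 in BSS ac:86:74:0a:73:a8(on ap0_1).
However, line is a string, so line[0] is only its first character ('B' for a line that starts with BSS) and can never equal 'on ap0_1'. It should be changed like this: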
    if 'on ap0_1' in block[0]:
        entry['BSS'] = block[0][4:].lstrip()
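A quick way to see the difference between indexing and a substring test is to try both on one of the sample lines. This is just an illustrative snippet; `first_line` and `bss_value` are hypothetical names standing in for `block[0]` and `entry['BSS']`.

```python
first_line = 'BSS ac:86:74:0a:73:a8(on ap0_1)'  # what block[0] would hold

print(first_line[0])               # indexing a string gives one character
print('on ap0_1' in first_line)    # substring test, which is what the check needs

# Drop the leading 'BSS' (3 characters plus the space) to keep only the address part
bss_value = first_line[4:].lstrip()
print(bss_value)
```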