Python - Parsing a text file into a CSV file

Time: 2017-12-19 03:44:19

Tags: python csv

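I have a text file containing the output of a command I ran with Netmiko to retrieve data from a Cisco WLC about what is causing interference on our WiFi network. I trimmed the original 600k lines down to just what I need, a few thousand lines like this: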

AP Name.......................................... 010-HIGH-FL4-AP04
Microwave Oven      11       10      -59         Mon Dec 18 08:21:23 2017   
WiMax Mobile               11       0       -84         Fri Dec 15 17:09:45 2017   
WiMax Fixed                11       0       -68         Tue Dec 12 09:29:30 2017   
AP Name.......................................... 010-2nd-AP04
Microwave Oven             11       10      -61         Sat Dec 16 11:20:36 2017   
WiMax Fixed                11       0       -78         Mon Dec 11 12:33:10 2017   
AP Name.......................................... 139-FL1-AP03
Microwave Oven             6        18      -51         Fri Dec 15 12:26:56 2017   
AP Name.......................................... 010-HIGH-FL3-AP04
Microwave Oven             11       10      -55         Mon Dec 18 07:51:23 2017   
WiMax Mobile               11       0       -83         Wed Dec 13 16:16:26 2017   

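The goal is to end up with a csv file in which the "AP Name..." text is removed and the AP name itself sits on the same line as the information from each line that follows it. The problem is that some APs have two lines under the AP name, while others have one or none. I have been working on this for 8 hours and cannot find the best way to do it.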

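This is the latest version of the code I have been trying. Are there any suggestions for making it work? I just want something I can load into Excel and build a report from: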

with open(outfile_name, 'w') as out_file:
    with open('wlc-interference_raw.txt', 'r')as in_file:
        #Variables
        _ap_name = ''
        _temp = ''
        _flag = False
        for i in in_file:
            if 'AP Name' in i:
                #write whatever was put in the temp file to disk because new ap now
                #add another temp variable in case an ap has more than 1 interferer and check if new AP name
                out_file.write(_temp)
                out_file.write('\n')
                #print(_temp)
                _ap_name = i.lstrip('AP Name.......................................... ')
                _ap_name = _ap_name.rstrip('\n')
                _temp = _ap_name
                #print(_temp)
            elif '----' in i:
                pass
            elif 'Class Type' in i:
                pass
            else:
                line_split = i.split()
                for x in line_split:
                    _temp += ','
                    _temp += x
                _temp += '\n'

1 answer:

Answer 0 (score: 0)

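I think your best option is to read all the lines of the file and split them into sections that each start with an AP Name line. Then you can parse each section.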

Example

s = """AP Name.......................................... 010-HIGH-FL4-AP04
Microwave Oven      11       10      -59         Mon Dec 18 08:21:23 2017   
WiMax Mobile               11       0       -84         Fri Dec 15 17:09:45 2017   
WiMax Fixed                11       0       -68         Tue Dec 12 09:29:30 2017   
AP Name.......................................... 010-2nd-AP04
Microwave Oven             11       10      -61         Sat Dec 16 11:20:36 2017   
WiMax Fixed                11       0       -78         Mon Dec 11 12:33:10 2017   
AP Name.......................................... 139-FL1-AP03
Microwave Oven             6        18      -51         Fri Dec 15 12:26:56 2017   
AP Name.......................................... 010-HIGH-FL3-AP04
Microwave Oven             11       10      -55         Mon Dec 18 07:51:23 2017   
WiMax Mobile               11       0       -83         Wed Dec 13 16:16:26 2017"""

import re

class AP:
    """ 
    A class holding each section of the parsed file
    """
    def __init__(self):
        self.header = ""
        self.content = []

sections = []
section = None
for line in s.split('\n'):  # Or 'for line in file:'
    # Starting new section
    if line.startswith('AP Name'):
        # If previously had a section, add to list
        if section is not None:
            sections.append(section)  
        section = AP()
        section.header = line
    else:
        # Skip blank lines so they do not become empty CSV rows
        if section is not None and line.strip():
            section.content.append(line)
if section is not None:  # Add last section outside of loop (skipped if no AP header was found)
    sections.append(section)


for section in sections:
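    # Remove the literal "AP Name....... " prefix with a regex. str.lstrip takes a set of
    # characters, not a literal string, so it would also eat a leading A, P, N, a, m, e or "."
    # from the AP name itself
    ap_name = re.sub(r'^AP Name\.*\s*', '', section.header).strip()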
    for line in section.content:
        print(ap_name + ",", end="") 
        # You can extract the date separately, if needed
        # Splitting on more than one space using a regex
        line = ",".join(re.split(r'\s\s+', line))
        print(line.rstrip(','))  # Remove trailing comma from imperfect split

Output

010-HIGH-FL4-AP04,Microwave Oven,11,10,-59,Mon Dec 18 08:21:23 2017
010-HIGH-FL4-AP04,WiMax Mobile,11,0,-84,Fri Dec 15 17:09:45 2017
010-HIGH-FL4-AP04,WiMax Fixed,11,0,-68,Tue Dec 12 09:29:30 2017
010-2nd-AP04,Microwave Oven,11,10,-61,Sat Dec 16 11:20:36 2017
010-2nd-AP04,WiMax Fixed,11,0,-78,Mon Dec 11 12:33:10 2017
139-FL1-AP03,Microwave Oven,6,18,-51,Fri Dec 15 12:26:56 2017
010-HIGH-FL3-AP04,Microwave Oven,11,10,-55,Mon Dec 18 07:51:23 2017
010-HIGH-FL3-AP04,WiMax Mobile,11,0,-83,Wed Dec 13 16:16:26 2017
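
Splitting on two or more spaces depends on the column spacing in the raw output. It would break if two columns were ever separated by a single space, or if the timestamp padded a single-digit day with an extra space. A possible alternative, sketched below, splits on any whitespace and counts fields from the right. It assumes every data line ends with a five-token timestamp preceded by exactly three numeric columns, as in the sample above. The function name `parse_interferer` is made up for this illustration.

```python
def parse_interferer(line):
    """Split one data line into [class_type, num1, num2, num3, timestamp].

    Assumption: the last 5 whitespace-separated tokens are the timestamp and
    the 3 tokens before them are numeric columns; whatever remains at the
    front is the (possibly multi-word) interferer type.
    """
    tokens = line.split()
    if len(tokens) < 9:  # 1 type token + 3 numbers + 5 timestamp tokens
        return None      # not a data line (blank line, header, separator)
    timestamp = " ".join(tokens[-5:])
    num1, num2, num3 = tokens[-8:-5]
    class_type = " ".join(tokens[:-8])
    return [class_type, num1, num2, num3, timestamp]


sample = "Microwave Oven      11       10      -59         Mon Dec 18 08:21:23 2017   "
print(",".join(parse_interferer(sample)))
```

Inside the loop above, `parse_interferer(line)` could replace the `re.split` call. Lines for which it returns `None` would simply be skipped.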

Tip:

You don't need Python to write the CSV file; you can redirect the script's output to a file from the command line

python script.py > output.csv
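
Redirecting printed text works as long as no field contains a comma or a quote character. If that could ever happen, the standard library `csv` module takes care of quoting. The following is only a minimal sketch: the helper name `write_rows` and the hard-coded `rows` list are placeholders for whatever the parsing step produces.

```python
import csv
import io


def write_rows(rows, out_file):
    """Write rows (lists of strings) to an already-open text file object."""
    writer = csv.writer(out_file)  # quotes fields that contain commas or quotes
    writer.writerows(rows)


# Placeholder rows taken from the sample output above
rows = [
    ["010-HIGH-FL4-AP04", "Microwave Oven", "11", "10", "-59", "Mon Dec 18 08:21:23 2017"],
    ["139-FL1-AP03", "Microwave Oven", "6", "18", "-51", "Fri Dec 15 12:26:56 2017"],
]

# An in-memory buffer stands in for a real file here
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue(), end="")
```

To write to disk instead, pass a file opened with `open("output.csv", "w", newline="")`. The `newline=""` argument is what the `csv` documentation recommends so that the module controls the line endings itself.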