I am trying to write a Python script that extracts data from a text file into a CSV file.
The data looks like this:
*------------*
102GCPC-XP
not online
*------------*
------------
105PEACHPC
name : 105PEACHPC
manufacturer : Dell Inc.
model : OptiPlex 755
totalphysicalmemory : 2101907456
domain : abc.com
serialnumber : 90QZGG
version : 5.1.2600
Processor: Intel(R) Pentium(R)
size : 79999073280
ipaddress : 255.255.0.0
------------
I would like the data to look like this:
COMPUTER NAME | STATUS | NAME | MANUFACTURER | MODEL | TOTALPHYSICALMEMORY | DOMAIN | SERIALNUMBER | VERSION | PROCESSOR | SIZE | IPADDRESS |
--------------+----------+----------+--------------+------------+---------------------+--------+--------------+---------+-------------------+-----------+-----------+
102GCPC-XP |not online| | | | | | | | | | |
--------------+----------+----------+--------------+------------+---------------------+--------+--------------+---------+-------------------+-----------+-----------+
105PEACHPC |Online |105PEACHPC|Dell Inc. |OptiPlex 755|2101907456 |abc.com |90QZGG |5.1.2600 |Intel(R) Pentium(R)|79999073280|255.255.0.0|
Thanks in advance.
Answer 0 (score: 0)
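Your data blocks appear to start and end with ------------, and an additional * on the delimiter marks an entry that is not online. The code first needs to split the text file into blocks by searching for this delimiter and then building each entry line by line. Once the closing delimiter is found, it uses a regular expression to find every line that matches one of the fieldnames. Finally, csv.DictWriter() is used to write the entries to a correctly formatted CSV file that can be loaded into Excel: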
import csv
import re

fieldnames = ['NAME', 'MANUFACTURER', 'MODEL', 'TOTALPHYSICALMEMORY',
              'DOMAIN', 'SERIALNUMBER', 'VERSION', 'PROCESSOR', 'SIZE', 'IPADDRESS']
# Matches "<field> : <value>" case-insensitively. Note that whitespace is
# required before the colon, so a line like "Processor: ..." is not matched.
re_fields = re.compile(r'({})\s+:\s(.*)'.format('|'.join(fieldnames)), re.I)

# Python 2: the csv module expects the output file opened in binary mode.
with open('input.txt') as f_input, open('output.csv', 'wb') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=['COMPUTER NAME', 'STATUS'] + fieldnames)
    csv_output.writeheader()
    start = False  # True while we are inside a block

    for line in f_input:
        line = line.strip()

        if len(line):
            if '------------' in line:
                if start:
                    # Closing delimiter: extract the fields and write the entry.
                    start = False
                    block.append(line)
                    text_block = '\n'.join(block)

                    for field, value in re_fields.findall(text_block):
                        entry[field.upper()] = value

                    if line[0] == '*':
                        # Starred block: line 1 is the name, line 2 the status.
                        entry['COMPUTER NAME'] = block[1]
                        entry['STATUS'] = block[2]
                    else:
                        entry['COMPUTER NAME'] = entry['NAME']
                        entry['STATUS'] = 'Online'

                    csv_output.writerow(entry)
                else:
                    # Opening delimiter: start collecting a new block.
                    start = True
                    entry = {}
                    block = [line]
            elif start:
                block.append(line)
So for the data you provided, output.csv will contain:
COMPUTER NAME,STATUS,NAME,MANUFACTURER,MODEL,TOTALPHYSICALMEMORY,DOMAIN,SERIALNUMBER,VERSION,PROCESSOR,SIZE,IPADDRESS
102GCPC-XP,not online,,,,,,,,,,
105PEACHPC,Online,105PEACHPC,Dell Inc.,OptiPlex 755,2101907456,abc.com,90QZGG,5.1.2600,,79999073280,255.255.0.0
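The PROCESSOR column is empty in that output because the input line "Processor: Intel(R) Pentium(R)" has no space before the colon, while the pattern requires at least one whitespace character there. If your real file mixes both styles, a more tolerant pattern may help. The sketch below compares the two; the `tolerant` pattern and the `sample` text are my own illustration, not part of the script above:

```python
import re

FIELDNAMES = ['NAME', 'MANUFACTURER', 'MODEL', 'TOTALPHYSICALMEMORY',
              'DOMAIN', 'SERIALNUMBER', 'VERSION', 'PROCESSOR', 'SIZE', 'IPADDRESS']
names = '|'.join(FIELDNAMES)

# Pattern used in the script: whitespace is mandatory before the colon.
strict = re.compile(r'({})\s+:\s(.*)'.format(names), re.I)

# Hypothetical tolerant variant: optional spaces or tabs on either side of
# the colon (newlines are excluded so a match cannot spill onto the next line).
tolerant = re.compile(r'({})[ \t]*:[ \t]*(.*)'.format(names), re.I)

# Three lines taken from the sample data in the question.
sample = 'version : 5.1.2600\nProcessor: Intel(R) Pentium(R)\nsize : 79999073280'

strict_fields = {field.upper(): value for field, value in strict.findall(sample)}
tolerant_fields = {field.upper(): value for field, value in tolerant.findall(sample)}

print(sorted(strict_fields))    # PROCESSOR is missing here
print(sorted(tolerant_fields))  # PROCESSOR is picked up here
```

Swapping the tolerant pattern into re_fields should fill in the PROCESSOR column, assuming none of your values rely on the stricter spacing.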
For Python 3.x, change the part that opens the output file to:
open('output.csv', 'w', newline='') as f_output:
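To show that change in context, here is a rough Python 3 port of the same approach, split into functions so the parsing can be tried without touching the disk. The names parse_entries, write_csv and convert are hypothetical helpers of mine, and the regular expression is copied unchanged from the script above, so treat this as a sketch rather than a drop-in replacement:

```python
import csv
import re

FIELDNAMES = ['NAME', 'MANUFACTURER', 'MODEL', 'TOTALPHYSICALMEMORY',
              'DOMAIN', 'SERIALNUMBER', 'VERSION', 'PROCESSOR', 'SIZE', 'IPADDRESS']
# Same pattern as the original script (whitespace required before the colon).
RE_FIELDS = re.compile(r'({})\s+:\s(.*)'.format('|'.join(FIELDNAMES)), re.I)
DELIMITER = '------------'


def parse_entries(lines):
    """Yield one dict per delimited block found in an iterable of lines."""
    block = None  # None means we are currently outside a block
    for raw_line in lines:
        line = raw_line.strip()
        if not line:
            continue
        if DELIMITER in line:
            if block is None:
                block = [line]  # opening delimiter
                continue
            # Closing delimiter: collect every "field : value" pair in the block.
            entry = {field.upper(): value
                     for field, value in RE_FIELDS.findall('\n'.join(block))}
            if line.startswith('*'):
                # Starred block: the name and status are the first two lines.
                entry['COMPUTER NAME'] = block[1]
                entry['STATUS'] = block[2]
            else:
                entry['COMPUTER NAME'] = entry.get('NAME', '')
                entry['STATUS'] = 'Online'
            yield entry
            block = None
        elif block is not None:
            block.append(line)


def write_csv(entries, f_output):
    """Write the entries to an open text-mode file object."""
    writer = csv.DictWriter(f_output, fieldnames=['COMPUTER NAME', 'STATUS'] + FIELDNAMES)
    writer.writeheader()
    writer.writerows(entries)


def convert(input_path, output_path):
    """Read input_path and write the parsed entries to output_path as CSV."""
    # newline='' stops the csv module's line endings from being translated.
    with open(input_path) as f_input, open(output_path, 'w', newline='') as f_output:
        write_csv(parse_entries(f_input), f_output)
```

Calling convert('input.txt', 'output.csv') would then mirror what the original script does. Fields missing from a block are simply left blank, because csv.DictWriter fills absent keys with an empty string by default.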