Python script to filter data from a file

Time: 2015-02-26 18:30:01

Tags: python json

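I am writing a script that scrapes data from a file (any format: csv, text, json, html, etc.), matches the resulting list against another file, and then replaces those particular strings with the values from that other file. Both files contain the same set of names. I want to use a regular expression because I need to scrape the text inside %%string%% markers and store those strings in a list. The file format is: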

FILE1.TXT

{ 
 "alias": "%%demo%%",
 "demo": "%%demo%%",
 "dns_domain": "googlr.com",
 "max_physical_memory": "%%maxmemory%%",
 "dataset_uuid": "%%DS_UUID%%",
 "nics": [
 {
  "nic_tag": "stub0",
  "ip": "%%ip%%",
  "netmask": "255.255.240.0",
  "primary": "1"
   }
 ]
 }

I want to collect every string that sits between the %% ____ %% signs into a list.

Python code

import sys
import re
list = []
list1 = []
i = 0
for n in sys.argv[1:]:
    #list = []
    #list1 = []
    print n
    input1 = open(n, "w")
    #print input1
    output = open(n,"r")
    for line1 in output:
        s = line1.split("=",1)[1:2]
        for m in s:
            list1.append(m.strip())
    for line in input1:
        a = re.findall(r"%%([^%^\n]+)%%", line)
        for val in a:
            list.append(val)
            stext = list[i:0]
            rtext = list1[i:0]
            input1.write(line.replace(val, rtext))
            i += 1
    input1.close()
    output.close()

The script should print list and list2; list2 contains the values from file2.txt.

FILE2.TXT

demo=somehost
demo=somehost2
maxmemory=1025
DS_UUID = 454s5da5d4a
ip=127.0.0.1

I want to replace the placeholders in file1 with the values from file2. Please check my code and let me know how this can be done.

2 answers:

Answer 0: (score: 2)

With a regular expression it is easy to find data inside well-known markers:

>>> import re
>>> re.findall(r"%%([^%^\n]+)%%", "hello %%there%% how\n are %%you%%")
['there', 'you']
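To make the pattern easier to read, here is a commented sketch of the same expression written with re.VERBOSE. The name PLACEHOLDER is only illustrative and does not appear in the question; the pattern itself is unchanged.

```python
import re

# Same pattern as above, split into commented pieces.
# PLACEHOLDER is an illustrative name, not taken from the question.
PLACEHOLDER = re.compile(r"""
    %%            # opening marker
    ([^%^\n]+)    # capture one or more characters that are not %, ^ or newline
    %%            # closing marker
""", re.VERBOSE)

names = PLACEHOLDER.findall("hello %%there%% how\n are %%you%%")
print(names)
```

Because the character class excludes newlines, a match can never span two lines, so scanning a file line by line or as one string should find the same names.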

Based on your updated example, you can extend the list instead of appending sublists:

import fileinput
import re
array = []
for line in fileinput.input():
    array.extend(re.findall(r"%%([^%^\n]+)%%", line))
print array
fileinput.close()
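Once the names are collected, the replacement step could be done with re.sub and a callback. This is only a minimal sketch: the function name fill_placeholders and the values dictionary are assumptions for illustration, and it presumes each name maps to a single value.

```python
import re


def fill_placeholders(text, values):
    """Replace each %%name%% in text with values[name].

    Names missing from values are left untouched, markers included.
    (fill_placeholders and values are hypothetical names.)
    """
    def lookup(match):
        # group(1) is the name between the markers, group(0) the whole match
        return values.get(match.group(1), match.group(0))

    return re.sub(r"%%([^%^\n]+)%%", lookup, text)


line = '"max_physical_memory": "%%maxmemory%%",'
print(fill_placeholders(line, {"maxmemory": "1025"}))
```

A plain dictionary keeps only one value per name, so a name that occurs twice in the values file (such as demo in the question) would need extra handling.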

Answer 1: (score: -1)

Thanks for all your time. I finally achieved what I wanted; my code is below.

import sys
import re

list2 = []

file1 = 'file1.json'
file2 = 'test-var.txt'
output = open(file2, "r")
for line1 in output:
    # Keep only the value after the first "=" on each line.
    s = line1.split("=",1)[1:2]
    for m in s:
        # Strip the trailing newline and any spaces around the value.
        list2.append(m.strip())
input1 = open(file1, "r")
list1 = []
txt = ''
for line in input1:
    a = re.findall(r"%%([^%^\n]+)%%",line)
    a = ''.join(a)
    if a =='':
        # No placeholder on this line; copy it through unchanged.
        txt = txt + line
        continue
    if any(a in s for s in list1):
        # Name already seen: rename it so it picks up the next value.
        val = '%%'+a+"1"+'%%'
        line = line.replace('%%'+a+'%%', val)
        a = a + "1"
    txt = txt + line
    list1.append(a)
# Pair the i-th placeholder with the i-th value from file2.
for i in range(len(list1)):
    string1 = '%%'+''.join(list1[i])+'%%'
    string2 = ''.join(list2[i])
    txt = txt.replace(string1,string2)
input1.close()
output.close()
output = open(file1, "w")
print txt
output.write(txt)
output.close()
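The same idea can be sketched without renaming repeated placeholders: keep a queue of values per name and consume them in order. This is a hedged alternative, not the code above; parse_values, substitute, template and settings are hypothetical names, and the sample strings are shortened from the question's two files.

```python
import re
from collections import defaultdict, deque

PLACEHOLDER = re.compile(r"%%([^%^\n]+)%%")


def parse_values(lines):
    """Map each name to a queue of its values, in file order."""
    values = defaultdict(deque)
    for line in lines:
        if "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)
        values[name.strip()].append(value.strip())
    return values


def substitute(text, values):
    """Replace each %%name%% with the next unused value for that name."""
    def replace(match):
        queue = values.get(match.group(1))
        if not queue:
            return match.group(0)  # no value left: keep the placeholder
        return queue.popleft()

    return PLACEHOLDER.sub(replace, text)


# Shortened sample data modelled on FILE1.TXT and FILE2.TXT.
template = '{"alias": "%%demo%%", "demo": "%%demo%%", "ip": "%%ip%%"}'
settings = ["demo=somehost", "demo=somehost2", "DS_UUID = 454s5da5d4a", "ip=127.0.0.1"]
print(substitute(template, parse_values(settings)))
```

Matching by name rather than by position means the order of lines in the values file only matters for names that repeat, and a placeholder with no remaining value is left visible instead of being silently mismatched.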