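I wrote a Python script that reads PDF file names and extracts fields from each name, using '#' as the delimiter. After extracting the fields, the script reads a template XML file and replaces the placeholder tokens in the template to create a new XML file. Everything works, but I feel the code is not Pythonic and needs to be made DRY. Please advise.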
Template XML file:
<FaxInfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Sender>
    <UserName>Administrator</UserName>
    <FaxNumber>23456789</FaxNumber>
    <TelephoneNumber>12345678</TelephoneNumber>
    <Company>RINCON</Company>
  </Sender>
  <RecipientList>
    <Recipient>
      <FaxNumber>6194</FaxNumber>
    </Recipient>
  </RecipientList>
  <DocumentList>
    <Document>pdfsample1.pdf</Document>
  </DocumentList>
  <Options>
    <SendOptions>
      <Subject>SUBJECT</Subject>
    </SendOptions>
    <OtherOptions>
      <Retry>3</Retry>
      <Interval>3</Interval>
      <BillingCode>EMAIL</BillingCode>
      <CustomCode>
        <CustomCode1></CustomCode1>
        <CustomCode2>STAMPTIME</CustomCode2>
      </CustomCode>
    </OtherOptions>
  </Options>
</FaxInfo>
Main script:
import os
import datetime
import glob
from xml.dom.minidom import parse, parseString
import bs4
path = r'C:\Users\sachin\Desktop\xmlcreater'
for file in glob.glob(os.path.join(path, '*.pdf')):
    email = file.split("#")[0]
    stamptime = file.split("#")[1]
    subject = file.split("#")[2].split('.')[0]
    f_date = datetime.datetime.strptime(stamptime, "%m%d%Y%H%M%S").strftime("%m/%d/%Y %H:%M:%S")
    f_email = email.split('\\')[-1]
    with open(r'C:\Users\sachin\Desktop\xmlcreater\sample\sample.xml', 'r') as infile:
        contents = infile.read()
        soup = bs4.BeautifulSoup(contents, 'html.parser')
        infile.close()
    with open('{}.xml'.format(file.split('#')[1]), 'w') as x:
        x.write(contents)
        x.close()
    for xml in glob.glob(os.path.join(path, '*.xml')):
        with open(xml, 'r') as f:
            data = f.read()
            data1 = data.replace(soup.billingcode.string, f_email)
        with open(xml, 'w+') as k:
            k.write(data1)
    for xml in glob.glob(os.path.join(path, '*.xml')):
        with open(xml, 'r') as f:
            data = f.read()
            data1 = data.replace(soup.customcode2.string, f_date)
        with open(xml, 'w+') as k:
            k.write(data1)
    for xml in glob.glob(os.path.join(path, '*.xml')):
        with open(xml, 'r') as f:
            data = f.read()
            data1 = data.replace(soup.subject.string, subject)
        with open(xml, 'w+') as k:
            k.write(data1)
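The three loops in the script differ only in which placeholder they replace, so they could probably collapse into a single pass over the template text. A rough sketch of that idea follows. The function name `fill_template` and the hard-coded placeholder names are assumptions made for illustration (the script reads the placeholders out of the template with BeautifulSoup instead), and the call to `xml.sax.saxutils.escape` is an addition that the script does not have.

```python
from xml.sax.saxutils import escape


def fill_template(template_text, values):
    """Return template_text with every placeholder replaced by its value.

    `values` maps a placeholder string (e.g. "EMAIL") to the text that
    should take its place. This helper is hypothetical, not part of the
    original script.
    """
    for placeholder, value in values.items():
        # Escape &, < and > so the value cannot break the XML markup.
        template_text = template_text.replace(placeholder, escape(value))
    return template_text


# A shortened stand-in for the real template, for demonstration only.
template = (
    "<BillingCode>EMAIL</BillingCode>"
    "<CustomCode2>STAMPTIME</CustomCode2>"
    "<Subject>SUBJECT</Subject>"
)
filled = fill_template(template, {
    "EMAIL": "example@example.co.in",
    "STAMPTIME": "06/14/2018 12:37:21",
    "SUBJECT": "testing",
})
print(filled)
```

With something like this, the template would be read once, filled in memory, and written once per PDF, rather than being rewritten three times. Note that the replacements are applied one after another, so a value that happens to contain a later placeholder (for example a subject containing "STAMPTIME") would be altered by the following pass.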
Sample PDF file name:
example@example.co.in#06142018123721#testing.pdf
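For reference, the way the script takes such a name apart can be sketched as one small function that splits the name only once. The name `parse_fax_filename` is hypothetical, and the use of `os.path.basename` and `os.path.splitext` is a substitution for the `split('\\')` and `split('.')` calls in the script, so behaviour differs slightly when the subject itself contains a dot.

```python
import datetime
import os


def parse_fax_filename(path):
    """Split '<email>#<MMDDYYYYHHMMSS>#<subject>.pdf' into its three fields.

    Hypothetical helper: returns (email, formatted_timestamp, subject).
    """
    # Drop the directory part and the extension before splitting on '#'.
    stem = os.path.splitext(os.path.basename(path))[0]
    email, stamp, subject = stem.split("#")
    # Reformat the timestamp using the same format strings as the script.
    when = datetime.datetime.strptime(stamp, "%m%d%Y%H%M%S")
    return email, when.strftime("%m/%d/%Y %H:%M:%S"), subject


print(parse_fax_filename("example@example.co.in#06142018123721#testing.pdf"))
# → ('example@example.co.in', '06/14/2018 12:37:21', 'testing')
```

Unpacking into three names also makes a malformed file name fail loudly with a `ValueError` instead of silently picking the wrong field.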