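**If you are going to downvote, please at least say why. I'm just trying to understand.**
Has anyone run into this? I haven't been able to figure it out. I have a large XML file that I'm trying to parse, and when I load it into Python with Beautiful Soup (or anything else), the output has a space between every character. It's the strangest thing.
Sample output: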
H e a d e r >
P I E S V e r s i o n > 6 . 5 / P I E S V e r s i o n >
S u b m i s s i o n T y p e > F U L L / S u b m i s s i o n T y p e >
P a r e n t D U N S N u m b e r > 0 5 5 4 3 3 1 9 7 / P a r e n t D U N S N u m b e r >
C u r r e n c y C o d e > U S D / C u r r e n c y C o d e >
L a n g u a g e C o d e > E N / L a n g u a g e C o d e >
T e c h n i c a l C o n t a c t > B r e n t B u s h n e l l / T e c h n i c a l C o n t a c t >
C o n t a c t E m a i l > B r e n t . B u s h n e l l @ g a t e s . c o m / C o n t a c t E m a i l >
/ H e a d e r >
What it should look like:
<Header>
<PIESVersion>6.5</PIESVersion>
<SubmissionType>FULL</SubmissionType>
<ParentDUNSNumber>055433197</ParentDUNSNumber>
<CurrencyCode>USD</CurrencyCode>
<LanguageCode>EN</LanguageCode>
<TechnicalContact>Brent Bushnell</TechnicalContact>
<ContactEmail>Brent.Bushnell@gates.com</ContactEmail>
</Header>
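One possible explanation for the spacing, offered as an assumption rather than a confirmed diagnosis: the file may be saved as UTF-16, so reading it with a single-byte codec leaves a NUL character after each letter, which many terminals display as a blank. The sketch below uses only the standard library to check for a byte-order mark. `sniff_bom` is a hypothetical helper, and `sample` is a small in-memory stand-in for the real file.

```python
import codecs


# Hypothetical helper (not from the original post): guess a codec from the BOM.
def sniff_bom(raw: bytes) -> str:
    """Return a codec name based on the byte-order mark, or 'utf-8' if none."""
    # Check UTF-32 first: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
    if raw.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return "utf-8"  # assumed default when no BOM is present


# In-memory stand-in for the real file, encoded as UTF-16 (with a BOM).
sample = "<Header><PIESVersion>6.5</PIESVersion></Header>".encode("utf-16")

# Decoding with a single-byte codec keeps the extra zero bytes as NUL characters.
wrong = sample.decode("latin-1")
# Decoding with the sniffed codec recovers the original text.
right = sample.decode(sniff_bom(sample))

print(repr(wrong[:12]))
print(right)
```

If the same check reports UTF-16 for the first bytes of the real file (read with `open('GATES.xml', 'rb')`), passing that encoding to `open`, or handing the raw bytes to the parser, would be the next thing to try.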
The part of the code that should matter:
from bs4 import BeautifulSoup
import csv

infile = open('GATES.xml', 'r')
contents = infile.read()
soup = BeautifulSoup(contents, 'xml')

bullets = {}
attributes = {}
attr_ids = {}
attr_test = {}

for part in soup.find_all('Item'):
    # --Pulls bullet point data from PIES File-------------------------------------------------------
    bullets[part.find('PartNumber').get_text()] = [x.get_text() for x in part.find_all('Description')]
    # --Pulls attribute data from PIES File----------------------------------------------------------
    for x in part.find_all('PartNumber'):
        partNum = x.get_text()
    for v in part.find_all('ProductAttribute'):
        attr_value = v.get_text()
        attr_id = v.get('AttributeID')
        attr_ids[attr_id] = attr_value
    attributes[partNum] = [attr_ids]
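For comparison, here is a minimal sketch of the same extraction that hands raw bytes to the parser, so the parser rather than `open` decides the encoding. It makes several assumptions: it swaps in the standard library's `xml.etree.ElementTree` for Beautiful Soup, the sample document and its `PIES`/`Items` wrapper elements are invented for illustration, and it stores one plain dict of attributes per part instead of the shared `attr_ids` dict wrapped in a list.

```python
import xml.etree.ElementTree as ET

# In-memory stand-in for GATES.xml; the element layout is assumed from the loop above.
xml_text = (
    '<?xml version="1.0" encoding="UTF-16"?>'
    "<PIES><Items>"
    "<Item><PartNumber>A1</PartNumber>"
    "<Description>First bullet</Description>"
    "<ProductAttribute AttributeID='Color'>Red</ProductAttribute></Item>"
    "<Item><PartNumber>B2</PartNumber>"
    "<Description>Second bullet</Description>"
    "<ProductAttribute AttributeID='Color'>Blue</ProductAttribute></Item>"
    "</Items></PIES>"
)
# Bytes, similar to what open('GATES.xml', 'rb').read() would return for a UTF-16 file.
raw = xml_text.encode("utf-16")

# Given bytes, the parser works out the encoding from the BOM and XML declaration.
root = ET.fromstring(raw)

bullets = {}
attributes = {}
for part in root.iter("Item"):
    part_num = part.findtext("PartNumber")
    bullets[part_num] = [d.text for d in part.iter("Description")]
    # Build a fresh dict per item so parts do not share one attribute mapping.
    attributes[part_num] = {
        a.get("AttributeID"): a.text for a in part.iter("ProductAttribute")
    }

print(bullets)
print(attributes)
```

Beautiful Soup also accepts bytes and tries to detect the encoding itself, so opening the file with `'rb'` instead of `'r'` in the original code may be enough if the encoding assumption holds.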