使用beautifulsoup将字符串作为列表放入一行

时间:2019-05-04 17:49:17

标签: python xml beautifulsoup

我想在行中获取地址内容,因为当我尝试将其写入csv时会产生问题

text = """
<B721>
<PARTY-US>
<NAM><FNM><PDAT>Minhua</PDAT></FNM><SNM><STEXT><PDAT>Lu</PDAT></STEXT></SNM></NAM>
<ADR>
<CITY><PDAT>Mohegan Lake</PDAT></CITY>
<STATE><PDAT>NY</PDAT></STATE>
</ADR>
</PARTY-US>
</B721>
<B721>
<PARTY-US>
<NAM><FNM><PDAT>Nobushige</PDAT></FNM><SNM><STEXT><PDAT>Korenaga</PDAT></STEXT></SNM></NAM>
<ADR>
<CITY><PDAT>Utsunomiya</PDAT></CITY>
<CTRY><PDAT>JP</PDAT></CTRY>
</ADR>
</PARTY-US>
</B721>
"""


from bs4 import BeautifulSoup
soup = BeautifulSoup(text, 'lxml')

### Address info
inventors = main_inventor.find_all("b721")
address_info = inventor_address = ", ".join([i.find("adr").text.strip() for i in inventors])

我得到以下输出:

Mohegan Lake
NY, Utsunomiya
JP

我该怎么做?

1 个答案:

答案 0 :(得分:1)

如果要替换所有换行符/换行符,请执行以下操作:

# you probably want to use a space ' ' to replace  newlines/breaks '\n'
# `\n` is used in unix like environments; `\r\n` is used in windows. 

address_info = address_info.replace('\n', ' ').replace('\r', '')