br /> (例如,没有。)每行之间的标记
29 BOSWALL PARKWAY
EDINBURGH
EH5 2BR。当我在这个文本编辑器中写作时,它们的标签会消失。
当我通过美丽的汤输出看起来像29 BOSWALL PARKWAYEDINBURGHEH5 2BR
有没有人知道生成逗号空间的好方法是标签是什么?非常感谢提前。
我已经根据其他示例尝试了以下内容
第一种方法在扫描字符串文字错误时生成SyntaxError:EOL 第二种方法AttributeError:' unicode'对象没有属性' replaceAll'
import requests
from bs4 import BeautifulSoup as soup
url = 'https://www.saa.gov.uk/search/?SEARCHED=1&ST=&SEARCH_TERM=city+of+edinburgh%2C+BOSWALL+PARKWAY%2C+EDINBURGH&ASSESSOR_ID=&SEARCH_TABLE=valuation_roll_cpsplit&DISPLAY_COUNT=10&TYPE_FLAG=CP&ORDER_BY=PROPERTY_ADDRESS&H_ORDER_BY=SET+DESC&DRILL_SEARCH_TERM=BOSWALL+PARKWAY%2C+EDINBURGH&DD_TOWN=EDINBURGH&DD_STREET=BOSWALL+PARKWAY&UARN=110B60329&PPRN=000000000001745&ASSESSOR_IDX=10&DISPLAY_MODE=FULL#results'
baseurl = 'https://www.saa.gov.uk'
session = requests.session()
response = session.get(url)
# content of search page in soup
html = soup(response.content,"lxml")
Address = LeftBlockData[3].get_text().strip()
#Address_1 = Address.replace("<br>,",");
Address_1 = Address.replaceAll("[\\t\\n\\r]"," ");
print (Address1)
答案 0 :(得分:1)
<br>
是换行符,我们通常不会在数据中包含这样的控制字符,但您可以使用列表来包含所有行:
In [72]: table = html.find('div', class_='table')
In [77]: for row in table.find_all('div', class_='row table-row'):
...: print(row.get_text(strip=True, separator='|').split('|'))
...:
...:
...:
...:
['Ref No.', '110B60329']
['Office', 'LOTHIAN VJB']
['Description', 'STORE']
['Property Address', '29 BOSWALL PARKWAY', 'EDINBURGH', 'EH5 2BR']
['Proprietor', 'SCOTTISH MIDLAND CO-OP SOCIETY LTD.']
['Tenant', 'PROPRIETOR']
['Occupier']