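I'm new to Python and am working with a list of URLs that I have scraped. Some of the URLs come back with a # symbol in front of them. How do I remove it?
Here is my code: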
from bs4 import BeautifulSoup
import requests
source = requests.get('https://www.census.gov/programs-surveys/popest.html').text
soup = BeautifulSoup(source, 'html.parser')
links = soup.find_all('a', href=True)
records = []
for results in links:
    url = results['href']
    records.append(url)
# Remove duplicate URLs from the records list
records = list(set(records))
# Keep only the URLs that contain 'http'
filter_records = [k for k in records if 'http' in k]
import csv
with open('test.csv', 'w', newline='') as f:
    csv_writer = csv.writer(f, delimiter=',')
    csv_writer.writerow(['Web Address'])
    # Write one URL per row
    csv_writer.writerows([record] for record in filter_records)
How do I remove the # from some of the results in the list?
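For reference, here is a minimal sketch of one possible approach, assuming the unwanted # only appears at the very start of the href (for example '#https://...'). The helper name clean_urls and the sample URLs are made up for illustration and are not part of my script. It strips leading # characters with str.lstrip, keeps only values that then start with http:// or https://, and drops duplicates while preserving order.

```python
def clean_urls(hrefs):
    """Strip leading '#' characters, keep http(s) URLs, drop duplicates.

    Assumption: the stray '#' is only at the start of the string.
    A '#' inside the URL (a fragment such as '/page#top') is left alone.
    """
    seen = set()
    cleaned = []
    for href in hrefs:
        # Remove surrounding whitespace, then any leading '#' characters
        url = href.strip().lstrip('#')
        if url.startswith(('http://', 'https://')) and url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned


# Hypothetical sample data, not real scraped output
sample = [
    '#https://example.com/data.html',
    'https://example.com/data.html',
    '#content',
    '/relative/path.html',
    'https://example.com/#top',
]
print(clean_urls(sample))
```

In the script above, this could replace the filter step, e.g. filter_records = clean_urls(records). Note that it uses startswith rather than 'http' in k, so an in-page anchor like '#content' is dropped instead of being kept.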