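I have a list containing a large number of URLs, thousands of them. Here is a sample:
UrlList = ["www.test.com", "www.123.com", "www.youtube.com", "youtube.com", "123.com", "test.com", "c.microsoft.com", "office.microsoft.com"]
Some URLs have .com, www., http:// or https://, and some do not.
I want to ignore all of that and just search the URL list for test, youtube, microsoft and so on, and print the whole URL whenever one is found.
How can I do that?
Edit: Sorry, I forgot to post my attempts.
Attempt 1 code: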
fileURLs = urlReader()
print("Here is the URLs in the File which needs to be search in the List.")
print(fileURLs)
for x in data:
    for y in x['urls']:
        url = str(y)
        if url in fileURLs:
            print(x['id'] , url)
Attempt 1 output:
Here is the URLs in the File which needs to be search in the List
['youtube.com', 'test.com', '123.com']
(u'CUSTOM_03', 'test.com')
(u'CUSTOM_05', 'youtube.com')
(u'CUSTOM_07', 'test.com')
(u'CUSTOM_07', 'youtube.com')
(u'CUSTOM_08', 'youtube.com')
(u'CUSTOM_15', 'test.com')
(u'CUSTOM_16', 'test.com')
(u'CUSTOM_17', 'test.com')
(u'CUSTOM_18', 'test.com')
(u'CUSTOM_19', 'test.com')
(u'CUSTOM_20', 'youtube.com')
(u'CUSTOM_23', 'test.com')
(u'CUSTOM_24', 'youtube.com')
Attempt 2 code:
for x in data:
    for s in x['urls']:
        url = str(s)
        matching = [y for y in fileURLs if url in y]
        if matching:
            print(x['id'], x['configuredName'], matching)
Attempt 2 output:
Here is the URLs in the File which needs to be search in the List.
['www.youtube.com', 'www.test.com', 'www.123.com']
(u'CUSTOM_03', ['www.test.com'])
(u'CUSTOM_03', ['www.test.com'])
(u'CUSTOM_05', ['www.youtube.com'])
(u'CUSTOM_07', ['www.test.com'])
(u'CUSTOM_07', ['www.youtube.com'])
(u'CUSTOM_08', ['www.youtube.com'])
(u'CUSTOM_10', ['www.youtube.com'])
(u'CUSTOM_15', ['www.test.com'])
(u'CUSTOM_16', ['www.test.com'])
(u'CUSTOM_17', ['www.test.com'])
(u'CUSTOM_18', ['www.test.com'])
(u'CUSTOM_19', ['www.test.com'])
(u'CUSTOM_20', ['www.youtube.com'])
(u'CUSTOM_22', ['www.test.com'])
(u'CUSTOM_23', ['www.test.com'])
(u'CUSTOM_24', ['www.test.com'])
(u'CUSTOM_24', ['www.youtube.com'])
(u'CUSTOM_02', ['www.test.com'])
(u'CUSTOM_02', ['www.123.com'])
Looking at the difference between the two attempts, I changed fileURLs from fileURLs = ['youtube.com', 'test.com', '123.com']
to fileURLs = ['www.youtube.com', 'www.test.com', 'www.123.com']
which added two new entries to the output:
(u'CUSTOM_02', ['www.test.com'])
(u'CUSTOM_02', ['www.123.com'])
Answer 0 (score: 0)
# Print every URL that contains the keyword as a substring
for i in UrlList:
    if 'microsoft' in i:
        print(i)
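The same loop can probably be extended to several keywords at once. The following is only a sketch, assuming Python 3; the names `url_list`, `keywords` and `matches` are made up for illustration, and the match is a plain case-insensitive substring test, so a keyword like "test" would also match something like "latest.com".

```python
# Sample list taken from the question
url_list = ["www.test.com", "www.123.com", "www.youtube.com", "youtube.com",
            "123.com", "test.com", "c.microsoft.com", "office.microsoft.com"]

# Hypothetical keywords to look for
keywords = ["test", "youtube", "microsoft"]

# Keep a URL if any keyword occurs anywhere in it (case-insensitive)
matches = [url for url in url_list
           if any(keyword in url.lower() for keyword in keywords)]

for url in matches:
    print(url)
```

With the sample list this prints every entry except the two 123.com ones.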
Answer 1 (score: 0)
This does the job with a simple list comprehension:
UrlList = ["www.test.com", "www.123.com", "www.youtube.com", "youtube.com", "123.com", "test.com", "c.microsoft.com", "office.microsoft.com"]
searcher = [i for i in UrlList if "www.test.com" in i]
print(searcher)
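If the goal is to treat "www.test.com", "http://test.com" and "test.com" as the same site, one option might be to reduce every URL to a bare host name before comparing. This is a rough sketch, assuming Python 3 and the standard `urllib.parse` module; `bare_host`, `file_urls` and `wanted` are hypothetical names, and only the scheme, path and a leading "www." are stripped, so subdomains such as "c.microsoft.com" are left alone.

```python
from urllib.parse import urlsplit

def bare_host(url):
    """Return the lowercase host of a URL without scheme, path or leading 'www.'."""
    url = url.strip().lower()
    # urlsplit only recognises the host when a scheme or '//' prefix is present
    if "://" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host

# Hypothetical search terms, written in mixed styles on purpose
file_urls = ["www.youtube.com", "http://test.com", "123.com"]
wanted = {bare_host(u) for u in file_urls}

url_list = ["www.test.com", "www.123.com", "www.youtube.com", "youtube.com",
            "123.com", "test.com", "c.microsoft.com", "office.microsoft.com"]

# Print the original URL whenever its bare host is one of the wanted hosts
matches = [u for u in url_list if bare_host(u) in wanted]
print(matches)
```

Comparing against a set keeps each lookup cheap, which may matter with thousands of URLs.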