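I ran some CLI utilities and captured a lot of output, and there is a web URL at the end of the file. I need to find that link with a Python regex and print it. Below is the code I wrote for this purpose.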
file = str('/root/PycharmProjects/rest_project/sponge_link')
with open(file, 'r') as fo:
    fo.read().__str__()
    urls = re.findall('https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+', fo)
    print(urls)
Here is the content of the file:
INFO: Streaming results to http://abc/56659bf3-a66d-482b-80e8-6484cafc650d
INFO: Analyzed target <path/path/path> (73 packages loaded, 10521 targets configured).
INFO: Found 1 target...
Target <path>/dence up-to-date:
utility-<path>/dence_0.0-5_amd64.deb
utility-<path>/dence_0.4-5_amd64.changes
INFO: Elapsed time: 23.669s, Critical Path: 0.47s, Remote (0.00% of the time): [queue: 0.00%, setup: 0.00%, process: 0.00%]
INFO: Build Event Protocol files produced successfully.
INFO: Build completed successfully, 1 total action
INFO: Still uploading to http://abc/56659bf3-a66d-482b-80e8-6484cafc650d
However, when I run the program, I get the following error:
Traceback (most recent call last):
  File "/root/PycharmProjects/rest_project/sel.py", line 24, in <module>
    urls = re.findall('https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+', fo)
  File "/usr/lib/python3.6/re.py", line 222, in findall
    return _compile(pattern, flags).findall(string)
TypeError: expected string or bytes-like object
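It complains that the argument should be a string. So I wrapped the file path in str(), but even that did not help.
Can someone help me understand my mistake?
Answer 0 (score: 1)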
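You are passing the file object to re.findall, not a string. Calling fo.read() returns the file's text, but your code discards that return value, and fo itself is still a file object. You need to assign the result of reading the file to a variable and then pass that variable to re.findall.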
fo.read().__str__()
should be something like
lines = fo.read()
and
urls = re.findall('https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+', fo)
should be
urls = re.findall('https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+', lines)
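Putting the two changes together, a minimal sketch of the corrected script could look like the following. The helper names find_urls and find_urls_in_file are made up for illustration and are not part of the original code. Note also that the pattern from the question has no "/" in its character class, so it stops after the host name. The looser https?://\S+ pattern is my own assumption about what you want here (everything up to the next whitespace), not something taken from the question.

```python
import re

# Pattern from the question, written as a raw string so that \w and \d
# are passed to the regex engine unchanged. It matches scheme and host only,
# because "/" is not in the character class.
HOST_PATTERN = r'https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+'

# Assumption (not from the question): a looser pattern that takes everything
# up to the next whitespace character, so the path part of the link is kept.
FULL_URL_PATTERN = r'https?://\S+'


def find_urls(text, pattern=FULL_URL_PATTERN):
    """Return every substring of text that matches pattern."""
    return re.findall(pattern, text)


def find_urls_in_file(path, pattern=FULL_URL_PATTERN):
    """Read the whole file into a string, then search that string."""
    with open(path, 'r') as fo:
        text = fo.read()  # read() already returns a str; keep the result
    return find_urls(text, pattern)


if __name__ == '__main__':
    # Two lines taken from the file content shown in the question.
    sample = (
        "INFO: Streaming results to http://abc/56659bf3-a66d-482b-80e8-6484cafc650d\n"
        "INFO: Still uploading to http://abc/56659bf3-a66d-482b-80e8-6484cafc650d\n"
    )
    print(find_urls(sample, HOST_PATTERN))  # host-only matches
    print(find_urls(sample))                # full links
```

For the real file, calling find_urls_in_file('/root/PycharmProjects/rest_project/sponge_link') should return the link once per line it appears on. If only the last occurrence matters, taking the final element of the returned list is one option.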