Question

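I am setting up code to check the reputation of any URL, for example http://go.mobisla.com/, on the website https://www.virustotal.com/gui/home/url

To start with, the most basic thing I want to do is extract all of the page's content using BeautifulSoup, but the information I am looking for seems to be inside a shadow-root (open): div.detections and span.individual-detection.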

Example element copied from the results page:

No engines detected this URL

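I am new to Python and was wondering whether you could share the best way to extract this information.

I tried the requests.get() function, but it does not return the information I need: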

import requests
import os, sys
from bs4 import BeautifulSoup
import pandas as pd

url_check = "deloplen.com:443"
url = "https://www.virustotal.com/gui/home/url"
# the original referenced an undefined name (url_str); url_check is the variable defined above
req = requests.get(url + url_check)
html = req.text
soup = BeautifulSoup(html, 'html.parser')
print(soup.prettify())

I expect to see "2 engines detected this URL" together with the individual detections, for example: Dr. Web Malicious

Answer 1

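If you request their website directly, all you get back is VirusTotal's loading screen, because the results are filled in by JavaScript after the page loads. This is not the right way to do it.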


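Instead, you should make the request through their public API. You do have to register an account to get a public API key, however.

You can use the code below to retrieve JSON information about the link. You must fill in your own API key.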

import json
import requests

user_api_key = "<api key>"
resource = "deloplen.com:443"

# feel free to remove this, it just makes the output look nicer
# accepts either a JSON string or an already-parsed object and returns formatted text
def pp_json(json_thing, sort=True, indents=4):
    if isinstance(json_thing, str):
        json_thing = json.loads(json_thing)
    return json.dumps(json_thing, sort_keys=sort, indent=indents)

# pass the query parameters via params so that requests URL-encodes them
response = requests.get(
    "https://www.virustotal.com/vtapi/v2/url/report",
    params={"apikey": user_api_key, "resource": resource},
)

json_response = response.json()

pretty_json = pp_json(json_response)

print(pretty_json)
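
To get from the raw JSON to the kind of summary the question asks for ("2 engines detected this URL" plus the individual detections), something along these lines might work. It is only a sketch: it assumes the v2 URL report contains the fields response_code, verbose_msg, positives, total and scans (with detected and result per engine), and the function name summarize_report and the sample data are made up for illustration, so check the field names against a real response.

```python
from typing import Any, Dict, List, Tuple


def summarize_report(report: Dict[str, Any]) -> Tuple[str, List[str]]:
    """Return a headline and the per-engine detections from a v2 URL report."""
    # Assumption: response_code is 1 only when a report exists for the resource.
    if report.get("response_code") != 1:
        return report.get("verbose_msg", "No report available"), []

    positives = report.get("positives", 0)
    total = report.get("total", 0)
    headline = f"{positives} of {total} engines detected this URL"

    # Keep only the engines that flagged the URL, sorted by engine name.
    detections = [
        f"{engine}: {verdict.get('result')}"
        for engine, verdict in sorted(report.get("scans", {}).items())
        if verdict.get("detected")
    ]
    return headline, detections


# Hand-written sample in the assumed shape of a report (not real scan data).
sample_report = {
    "response_code": 1,
    "positives": 1,
    "total": 2,
    "scans": {
        "Dr.Web": {"detected": True, "result": "malicious site"},
        "Google Safebrowsing": {"detected": False, "result": "clean site"},
    },
}

headline, detections = summarize_report(sample_report)
print(headline)
for line in detections:
    print(line)
```

With a live request you would pass the parsed JSON from response.json() to summarize_report instead of the sample dictionary.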

If you want to learn more about the API, you can consult its documentation.

Is there a way to extract information from a shadow root on a website?
