<field name="http.user_agent" showname="User-Agent: CORE/6.506.4.1 OpenCORE/2.02 (Linux;Android 2.2)\r\n" size="62" pos="542" show="CORE/6.506.4.1 OpenCORE/2.02 (Linux;Android 2.2)" value="557365722d4167656e743a20434f52452f362e3530362e342e31204f70656e434f52452f322e303220284c696e75783b416e64726f696420322e32290d0a"/>
<field name="http.user_agent" showname="User-Agent: HTC Streaming Player htc_wwe / 1.0 / htc_vivo / 2.3.5\r\n" size="67" pos="570" show="HTC Streaming Player htc_wwe / 1.0 / htc_vivo / 2.3.5" value="557365722d4167656e743a204854432053747265616d696e6720506c61796572206874635f777765202f20312e30202f206874635f7669766f202f20322e332e350d0a"/>
<field name="http.user_agent" showname="User-Agent: AppleCoreMedia/1.0.0.8C148 (iPad; U; CPU OS 4_2_1 like Mac OS X; sv_se)\r\n" size="85" pos="639" show="AppleCoreMedia/1.0.0.8C148 (iPad; U; CPU OS 4_2_1 like Mac OS X; sv_se)" value="557365722d4167656e743a204170706c65436f72654d656469612f312e302e302e38433134382028695061643b20553b20435055204f5320345f325f31206c696b65204d6163204f5320583b2073765f7365290d0a"/>
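Listed above are the User-Agent samples I captured. I would like to know whether there is a Python module that can parse user agents. For these samples I would like output such as: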
Android
HTC Streaming player
ipad
For PC users, I would like to get the web browser type.
Answer 0: (score: 13)
There is a library called httpagentparser:
>>> import httpagentparser
>>> s = "Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/532.9 (KHTML, like Gecko) Chrome/5.0.307.11 Safari/532.9"
>>> print(httpagentparser.simple_detect(s))
('Linux', 'Chrome 5.0.307.11')
>>> print(httpagentparser.detect(s))
{'os': {'name': 'Linux'},
'browser': {'version': '5.0.307.11', 'name': 'Chrome'}}
Answer 1: (score: 3)
Werkzeug has a built-in user-agent parser.
http://werkzeug.pocoo.org/docs/quickstart/?highlight=user_agent#header-parsing
Answer 2: (score: 0)
You can try writing your own code with regular expressions: http://docs.python.org/library/re.html or take a look at this: http://pypi.python.org/pypi/httpagentparser
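To make the regular-expression suggestion concrete, here is a minimal sketch that uses only the standard re module. The rule tables, the labels, and the function name classify_user_agent are illustrative assumptions, not part of any library, and the patterns cover only the samples in the question plus a few common desktop browsers. Device rules are checked first, so a mobile browser is reported by device, and Chrome is tested before Safari because Chrome's User-Agent string also contains "Safari".

```python
import re

# Ordered (label, pattern) rules; the first match wins.
# These tables are hypothetical and far from exhaustive.
DEVICE_RULES = [
    ("HTC Streaming Player", re.compile(r"HTC Streaming Player", re.IGNORECASE)),
    ("iPad", re.compile(r"\biPad\b")),
    ("iPhone", re.compile(r"\biPhone\b")),
    ("Android", re.compile(r"\bAndroid\b", re.IGNORECASE)),
]

# Each browser pattern captures the version number in group 1.
BROWSER_RULES = [
    ("Chrome", re.compile(r"Chrome/([\d.]+)")),
    ("Firefox", re.compile(r"Firefox/([\d.]+)")),
    ("Safari", re.compile(r"Version/([\d.]+).*Safari/")),
    ("Internet Explorer", re.compile(r"MSIE ([\d.]+)")),
]


def classify_user_agent(ua):
    """Return a device label, else 'Browser version', else 'unknown'."""
    for label, pattern in DEVICE_RULES:
        if pattern.search(ua):
            return label
    for name, pattern in BROWSER_RULES:
        match = pattern.search(ua)
        if match:
            return f"{name} {match.group(1)}"
    return "unknown"


samples = [
    "CORE/6.506.4.1 OpenCORE/2.02 (Linux;Android 2.2)",
    "HTC Streaming Player htc_wwe / 1.0 / htc_vivo / 2.3.5",
    "AppleCoreMedia/1.0.0.8C148 (iPad; U; CPU OS 4_2_1 like Mac OS X; sv_se)",
    "Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/532.9 "
    "(KHTML, like Gecko) Chrome/5.0.307.11 Safari/532.9",
]

for ua in samples:
    print(classify_user_agent(ua))
```

Hand-written rules like these need ongoing maintenance as new devices and browsers appear, which is the main reason to prefer a maintained library for anything beyond a handful of known clients.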
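Answer 3: (score: 0)
The answer I am about to give does not concern an open-source project, but it does provide information that anyone researching how to parse HTTP User-Agent strings will want to know.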
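WURFL is a long-established tool for analyzing the User-Agent (and, more generally, the whole HTTP request) and obtaining device and browser information that is easy to consume. It is the de facto standard in the ad-tech industry, thanks to a proprietary database that squeezes every last drop of information out of an HTTP request. In practice, the output of a lookup looks like this: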
device id = samsung_sm_g981u_ver1_subuau1
get_capability('model_name') = SM-G981U1
get_capabilities(static_capabilities) = {'model_name': 'SM-G981U1', 'brand_name': 'Samsung', 'device_os': 'Android'}
get_virtual_capability('complete_device_name') = Samsung SM-G981U1 (Galaxy S20 5G)
get_virtual_capabilities(virtual_capabilities) = {'complete_device_name': 'Samsung SM-G981U1 (Galaxy S20 5G)', 'form_factor': 'Smartphone'}
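Code for such a lookup looks like the following. This sample uses wmclient, the Python client for WURFL Microservice (introduced below), and a different User-Agent string: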
from wmclient import *

try:
    client = WmClient.create("http", "localhost", 8080, "")
    # ...
    ua = "Mozilla/5.0 (Linux; Android 7.1.1; ONEPLUS A5000 Build/NMF26X) AppleWebKit/537.36 (KHTML, like Gecko) " \
         "Chrome/56.0.2924.87 Mobile Safari/537.36 "
    # Request only the capabilities that are printed below
    client.set_requested_static_capabilities(["brand_name", "model_name"])
    client.set_requested_virtual_capabilities(["is_smartphone", "form_factor"])
    print()
    print("Detecting device for user-agent: " + ua)
    # Perform a device detection calling WM server API
    device = client.lookup_useragent(ua)
    # ...
    # Let's get the device capabilities and print some of them
    capabilities = device.capabilities
    print("Detected device WURFL ID: " + capabilities["wurfl_id"])
    print("Device brand & model: " + capabilities["brand_name"] + " " + capabilities["model_name"])
    print("Detected device form factor: " + capabilities["form_factor"])
    if capabilities["is_smartphone"] == "true":
        # The body of this branch is cut off in the original; this print is an assumption
        print("This is a smartphone")
except Exception as e:
    # Error handling is elided in the original; this generic handler is an assumption
    print("An error occurred: " + str(e))
More information: device intelligence.
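For those who want to try WURFL (and PyWURFL in particular) without obtaining a trial license from ScientiaMobile, my company recently released a version of WURFL, called WURFL Microservice, that is available from several sources, including AWS (and, of course, ScientiaMobile itself). Python is fully supported for that product too, although the syntax is slightly different because the product relies on a server-side component in the cloud for updates; the wmclient code shown above is an example of it.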
A complete example and a reference to the client code on GitHub can be found on Azure.
Disclosure: I work for the company that provides the libraries described here.