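I'm writing a simple web scraper script to extract a single word from a web page. The word I need changes regularly, but it always appears right after a word that never changes, so I can search for that fixed word.
My script so far: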
#!/bin/python
import requests
response = requests.get('http://vpnbook.com/freevpn')
print(response.text)
This obviously prints the HTML of the entire page. But what I need is the password:
<li>All bundles include UDP53, UDP 25000, TCP 80, TCP 443 profile</li>
<li>Username: <strong>vpnbook</strong></li>
<li>Password: <strong>binbd5ar</strong></li>
</ul>
How can I print only 'binbd5ar' (or whatever replaces it) to stdout?
Answer 1 (score: 0)
import re
# Non-greedy group so the match stops at the first closing tag; print sends it to stdout.
print(re.search(r'Password: <strong>(.+?)</strong>', response.text).group(1))
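Answer 2 (score: 0)
You can use a regular expression search.
"Python offers two different primitive operations based on regular expressions: re.match() checks for a match only at the beginning of the string, while re.search() checks for a match anywhere in the string." (Python re documentation)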
>>> import re
>>> x = re.search(r"Password: <strong>(?P<pass>\w+)</strong>", response.text)
>>> print(x.groupdict())
{'pass': 'binbd5ar'}
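To make the match/search difference concrete, here is a minimal sketch run against the HTML fragment from the question, hard-coded as a string so it works without network access. The helper name `extract_password` is hypothetical, and in the real script the input would be `response.text`.

```python
import re

# HTML fragment copied from the question; stands in for response.text.
HTML = """<li>All bundles include UDP53, UDP 25000, TCP 80, TCP 443 profile</li>
<li>Username: <strong>vpnbook</strong></li>
<li>Password: <strong>binbd5ar</strong></li>
</ul>"""

PATTERN = re.compile(r"Password: <strong>(?P<pass>\w+)</strong>")


def extract_password(html):
    """Return the word that follows 'Password:', or None if the marker is absent."""
    found = PATTERN.search(html)
    return found.group("pass") if found else None


# match() only tries position 0, where the text starts with "<li>All bundles".
print(PATTERN.match(HTML))       # None
# search() scans the whole string and finds the marker on the third line.
print(extract_password(HTML))    # binbd5ar
```

Returning None when the marker is missing avoids the AttributeError that calling .group() on a failed search would raise, which can happen if the page layout changes.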
Answer 3 (score: -1)
password = re.search(r'Password: <strong>(.*?)</strong>',response.text).group(1)
Then change it:
re.sub(re.escape(password),newPassword,response.text,count=1)
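A runnable sketch of the same two steps, under stated assumptions: the HTML is a shortened, hard-coded stand-in for `response.text`, and `new_password` is a made-up placeholder value. It uses re.search() because the marker is not at the start of the string, re.escape() so the extracted password is treated as literal text, and the `count` argument of re.sub() to limit the replacement to the first occurrence.

```python
import re

# Shortened stand-in for response.text; the trailing copy shows that count=1 leaves it alone.
html = "<li>Password: <strong>binbd5ar</strong></li> binbd5ar"
new_password = "s3cret"  # hypothetical replacement value

# Step 1: extract the current password.
found = re.search(r"Password: <strong>(.*?)</strong>", html)
password = found.group(1)

# Step 2: replace only the first occurrence of it.
updated = re.sub(re.escape(password), new_password, html, count=1)
print(updated)
```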