Using Python to extract a word from a web page

Time: 2015-10-01 18:31:42

Tags: python web scripting

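I am writing a simple web scraper script to extract a single word from a web page. The word I need changes regularly, but it always appears right after a word that never changes, so I can search for that fixed word.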

My script so far:

#!/usr/bin/env python

import requests
response = requests.get('http://vpnbook.com/freevpn')
print(response.text)

This obviously prints the HTML of the whole page. What I need is just the password:

<li>All bundles include UDP53, UDP 25000, TCP 80, TCP 443 profile</li>
<li>Username: <strong>vpnbook</strong></li>
<li>Password: <strong>binbd5ar</strong></li>
</ul>  

How do I print 'binbd5ar' (or whatever replaces it) to STDOUT?

4 answers:

Answer 0: (score: 0)

BEGIN TRANSACTION
SET QUOTED_IDENTIFIER ON
SET ARITHABORT ON
SET NUMERIC_ROUNDABORT OFF
SET CONCAT_NULL_YIELDS_NULL ON
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
COMMIT
BEGIN TRANSACTION
GO
CREATE TABLE dbo.TestIssue
    (
    id int NOT NULL,
    MYComments varchar(MAX) NOT NULL
    )  ON [PRIMARY]
     TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE dbo.TestIssue SET (LOCK_ESCALATION = TABLE)
GO
COMMIT
select Has_Perms_By_Name(N'dbo.TestIssue', 'Object', 'ALTER') as ALT_Per, Has_Perms_By_Name(N'dbo.TestIssue', 'Object', 'VIEW DEFINITION') as View_def_Per, Has_Perms_By_Name(N'dbo.TestIssue', 'Object', 'CONTROL') as Contr_Per 

INSERT INTO [dbo].[TestIssue]
           ([id]
           ,[MYComments])
     VALUES
           (1
           ,'MY COMMENT 1')
GO

SELECT * FROM [dbo].[TestIssue]

DECLARE @MYComments AS VARCHAR

UPDATE [dbo].[TestIssue] 
SET MYComments = ISNULL(@MYComments,MYComments)

SELECT * FROM [dbo].[TestIssue]

Answer 1: (score: 0)

import re
# Non-greedy capture of the text between the <strong> tags that follow "Password: "
print(re.search(r'Password: <strong>(.+?)</strong>', response.text).group(1))
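A slightly more defensive sketch of the same idea, run against the HTML fragment from the question instead of a live request. The names `SAMPLE_HTML`, `PASSWORD_RE`, and `extract_password` are illustrative and not part of the original answer; the real page may be laid out differently.

```python
import re

# HTML fragment copied from the question; the live page may differ.
SAMPLE_HTML = """<li>All bundles include UDP53, UDP 25000, TCP 80, TCP 443 profile</li>
<li>Username: <strong>vpnbook</strong></li>
<li>Password: <strong>binbd5ar</strong></li>
</ul>"""

# Non-greedy group so the match stops at the first closing </strong>.
PASSWORD_RE = re.compile(r'Password:\s*<strong>(.+?)</strong>')


def extract_password(html):
    """Return the text inside <strong> after 'Password:', or None if absent."""
    match = PASSWORD_RE.search(html)
    return match.group(1) if match else None


print(extract_password(SAMPLE_HTML))
```

Returning `None` instead of calling `.group(1)` directly avoids an `AttributeError` if the page layout changes and the pattern no longer matches.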

Answer 2: (score: 0)

You can use a regular expression search.

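"Python offers two different primitive operations based on regular expressions: re.match() checks for a match only at the beginning of the string, while re.search() checks for a match anywhere in the string" (link)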

>>> import re
>>> x = re.search(r"Password: <strong>(?P<pass>\w+)</strong>", response.text)
>>> print(x.groupdict())
{'pass': 'binbd5ar'}
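The difference between the two functions quoted above can be seen on a single line taken from the question's HTML. This is a minimal sketch; the variable names and the group name `password` are chosen here for illustration.

```python
import re

line = '<li>Password: <strong>binbd5ar</strong></li>'
pattern = r'Password: <strong>(?P<password>\w+)</strong>'

# re.match only tries at position 0, where the line starts with '<li>'.
at_start = re.match(pattern, line)
# re.search scans forward until the pattern matches somewhere in the line.
anywhere = re.search(pattern, line)

print(at_start)                    # None
print(anywhere.group('password'))  # binbd5ar
```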

Answer 3: (score: -1)

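# re.search, not re.match: the text is not at the start of the page
password = re.search(r'Password: <strong>(.*?)</strong>', response.text).group(1)

Then change it:

# the keyword is count (not max); re.escape guards against regex metacharacters
re.sub(re.escape(password), newPassword, response.text, count=1)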
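If the goal is to replace the password inside the HTML, one option is to anchor the substitution to the `Password:` label so that the same string appearing elsewhere on the page is left alone. A sketch under that assumption; `html` and `new_password` are made-up sample values.

```python
import re

html = ('<li>Username: <strong>vpnbook</strong></li>\n'
        '<li>Password: <strong>binbd5ar</strong></li>')
new_password = 's3cret'  # hypothetical replacement value

# Keep the surrounding markup (groups 1 and 2) and swap only the text between.
# A function replacement avoids backslash interpretation in new_password.
updated = re.sub(
    r'(Password: <strong>).*?(</strong>)',
    lambda m: m.group(1) + new_password + m.group(2),
    html,
    count=1,
)

print(updated)
```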