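Python - Return the characters that follow a phrase in a string

Asked: 2015-05-01 17:36:54

Tags: python regex string

I want to capture the characters that come after a word in a string. For example, given this output: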

Pinging 10.1.1.1 with 32 bytes of data:

Reply from 10.1.1.1: bytes=32 time=39ms TTL=253

Reply from 10.1.1.1: bytes=32 time=17ms TTL=253

Reply from 10.1.1.1: bytes=32 time=17ms TTL=253 

Reply from 10.1.1.1: bytes=32 time=17ms TTL=253

Ping statistics for 10.1.1.1:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 17ms, Maximum = 39ms, Average = 22ms

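I want to get the characters after time= for the first occurrence of time=, stopping at the space before TTL.

I know I could split on time= and take the characters that follow, but I don't know how to make it stop before TTL (the number may have more than 2 digits, so simply taking the next 4 characters is not an option).

Maybe a regex is also an option? I've seen things like (?:time=).* that would find the first instance, but again, I'm not sure how to tell it to stop after ms.

Edit - adding the final code, since it is working now. Thanks for the help!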

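import subprocess
import re

# Python 2 code (raw_input, print statement).
# Define target
hostname = raw_input("Enter IP: ")

# Run ping once ("-n 1" is the Windows count flag) and read everything it writes to stdout.
# subprocess.Popen runs the command-line ping with stdout piped; .stdout.read() reads
# that stream, and the result is assigned to ping_response.
ping_response = subprocess.Popen(["ping", hostname, "-n", "1"], stdout=subprocess.PIPE).stdout.read()

# Windows ping prints this in its statistics when the single reply arrived.
word = "Received = 1"

# \S+ matches the run of non-space characters that follows "time".
# Note: the lookbehind omits "=", so each match keeps its leading "=" (e.g. "=39ms").
p = re.compile(ur'(?<=time)\S+')
x = re.findall(p, ping_response)

if word in ping_response:
    print "Online with latency of "+x[0]
else:
    print "Offline"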
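
For readers on Python 3, a rough equivalent might look like the sketch below. It is not the code from the edit above: run_ping and describe are hypothetical helper names, the pattern is the one from the top answer (with the = inside the lookbehind), and the bare "Online" fallback is an assumption for output that contains no time= value.

```python
import re
import subprocess


def run_ping(hostname):
    """Run a single Windows-style ping ("-n 1") and return its output as text."""
    completed = subprocess.run(
        ["ping", hostname, "-n", "1"],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode bytes to str
    )
    return completed.stdout


def describe(ping_response):
    """Build the Online/Offline message from ping output text."""
    if "Received = 1" in ping_response:
        times = re.findall(r"(?<=time=)\S+", ping_response)
        if times:
            return "Online with latency of " + times[0]
        return "Online"  # assumption: reply received but no time= value found
    return "Offline"
```

It could be called as print(describe(run_ping(input("Enter IP: ")))). Keeping the parsing in its own function means it can be checked against saved ping output without touching the network.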

3 Answers:

Answer 0 (score: 4)

Try this regex:

(?<=time=)\S+

Used with re.findall, this should do what you need.

import re
p = re.compile(ur'(?<=time=)\S+')
test_str = u"\n\n Pinging 10.1.1.1 with 32 bytes of data:\n\n Reply from 10.1.1.1: bytes=32 time=39ms TTL=253\n\n Reply from 10.1.1.1: bytes=32 time=17ms TTL=253\n\n Reply from 10.1.1.1: bytes=32 time=17ms TTL=253\n\n Reply from 10.1.1.1: bytes=32 time=17ms TTL=253\n\n Ping statistics for 10.1.1.1: Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 17ms, Maximum = 39ms, Average = 22ms\n"     
re.findall(p, test_str)
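
To spell out what the pattern does: (?<=time=) is a lookbehind that requires time= immediately before the match without including it, and \S+ then takes every non-whitespace character up to the space before TTL. The ur'' prefix is not valid syntax in Python 3, so the sketch below uses a plain raw string; the shortened ping_output sample and the variable names are assumptions for illustration.

```python
import re

# Shortened sample of the ping output from the question.
ping_output = (
    "Reply from 10.1.1.1: bytes=32 time=39ms TTL=253\n"
    "Reply from 10.1.1.1: bytes=32 time=17ms TTL=253\n"
)

pattern = re.compile(r"(?<=time=)\S+")

# findall returns every match; the first element is the first reply's time.
times = pattern.findall(ping_output)
first = times[0] if times else None

print(times)  # ['39ms', '17ms']
print(first)  # 39ms
```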

Answer 1 (score: 1)

Use a regex, but keep it simple: skip lookahead/lookbehind and capture a group the traditional way:

>>> times = re.findall(r"time=(.*?) ", pingdata)
>>> times
['39ms', '17ms', '17ms', '17ms']

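Explanation: .*? is a non-greedy pattern, so it stops as soon as the space after the parentheses can match, which is exactly what you asked for. When the pattern contains a capturing group, re.findall() returns what matched inside the group rather than the whole match.

If you only want the first match (as you said in the question, I now notice), take times[0] or use re.search instead, which returns the first match (but as a match object, so you have to extract the captured group).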
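
A small sketch of the two points in that explanation, in Python 3. The sample line is one reply from the question with a trailing space added on purpose (an assumption, so that the greedy and non-greedy results actually differ); the variable names are only illustrative.

```python
import re

# One reply line; the trailing space is added deliberately for the greedy case.
line = "Reply from 10.1.1.1: bytes=32 time=17ms TTL=253 "

whole = re.findall(r"time=.*? ", line)       # no group: the whole match comes back
captured = re.findall(r"time=(.*?) ", line)  # group: only the group's text comes back
greedy = re.findall(r"time=(.*) ", line)     # greedy: runs on to the last space

print(whole)     # ['time=17ms ']
print(captured)  # ['17ms']
print(greedy)    # ['17ms TTL=253']
```

Note that the pattern needs a space after the value, so a value sitting at the very end of the text with nothing after it would not match.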

>>> m = re.search(r"time=(.*?) ", pingdata)
>>> m.group(1)
'39ms'
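
One thing to watch with re.search: it returns None when nothing matches (for example on a "Request timed out." line), and calling .group(1) on None raises AttributeError. A minimal guard might look like this; first_time is a hypothetical helper name, not something from the answer.

```python
import re


def first_time(text):
    """Return the first time= value in text, or None if there is none."""
    m = re.search(r"time=(.*?) ", text)
    if m is None:
        return None
    return m.group(1)


print(first_time("Reply from 10.1.1.1: bytes=32 time=39ms TTL=253"))  # 39ms
print(first_time("Request timed out."))                               # None
```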

Answer 2 (score: 0)

test = 'Reply from 10.1.1.1: bytes=32 time=39ms TTL=253'
test.split()[4].split('=')[1]

Step by step:

test.split()
['Reply', 'from', '10.1.1.1:', 'bytes=32', 'time=39ms', 'TTL=253']
test.split()[4]
'time=39ms'
test.split()[4].split('=')
['time', '39ms']
test.split()[4].split('=')[1]
'39ms'

Output:

#1 '39ms'

Another test:

test = 'Reply from 10.1.1.1: bytes=32 time=38833434343434349ms TTL=253'
test.split()[4].split('=')[1]
'38833434343434349ms'
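
The fixed index [4] relies on time= always being the fifth whitespace-separated field. If that cannot be guaranteed, a variation that looks for the field by its prefix might be safer. This is only a sketch, and time_field is a hypothetical helper name.

```python
def time_field(line):
    """Return the value of the time= field in a reply line, or None if absent."""
    for token in line.split():
        if token.startswith("time="):
            # Split once on "=" and keep the part after it.
            return token.split("=", 1)[1]
    return None


print(time_field("Reply from 10.1.1.1: bytes=32 time=39ms TTL=253"))  # 39ms
print(time_field("Request timed out."))                               # None
```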