Question

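I'm very new to Python, but I have some data that is constantly piped into a Python script, which reads it with sys.stdin.readline() and then uses re.search to filter out specific information. The problem is that it only reads the first string of information and then exits.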

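import re
import sys

while True:
    # Read one line from the pipe
    the_line = sys.stdin.readline()
    # Capture everything between ,"data":" and the closing }]}
    m = re.search(',"data":"(.+?)}]}', the_line)
    if m:
        print(m.group(1))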

Sample input (sorry, I know it's messy):

stat update: {"stat":{"time":"2018-02-03 19:37:59       GMT","lati":6.81661,"long":-       58.11185,"alti":0,"rxnb":0,"rxok":0,"rxfw":0,"ackr":0.0,"dwnb":0,"txnb":0,"pfrm":"Single Channel Gateway","mail":"kevic.lall@yahoo.com","desc":"433 MHz          gateway test project 1.0"}}
Packet RSSI: -56, RSSI: -97, SNR: 9, Length: 10
rxpk update: {"rxpk":                                                                                                                                                                  [{"tmst":4153364745,"chan":0,"rfch":0,"freq":433.000000,"stat":1,"modu":"LORA"   ,"datr":"SF7BW125","codr":"4/5","lsnr":9,"rssi":-   56,"size":10,"data":"aGVsbG8gMzA1Nw=="}]}
 Packet RSSI: -49, RSSI: -96, SNR: 9, Length: 10
rxpk update: {"rxpk":[{"tmst":4155404009,"chan":0,"rfch":0,"freq":433.000000,"stat":1,"modu":"LORA","datr":"SF7BW125","codr":"4/5","lsnr":9,"rssi":-49,"size":10,"data":"aGVsbG8gMzA1OA=="}]}
Packet RSSI: -51, RSSI: -97, SNR: 9, Length: 10
....

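These are just a few of the lines that stream in continuously.

Note that the input does not arrive all at once as shown here, but line by line, because the program I am piping into the Python script keeps running.

So the output I want should be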

aGVsbG8gMzA1Nw=="
aGVsbG8gMzA1OA=="
....

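streaming continuously.

But instead of that, nothing gets printed and the program just hangs until I manually press Ctrl + C.

The first string just exits because it doesn't contain the information I need. Even if I change the filter to match something that is in it, the script prints the output I want, then exits and stops the program that is piping into it. Is there a more efficient way to read and filter the information? Can I still use the re.search function?

Also, the reason I read it line by line with sys.stdin.readline() is that I want to filter each line and send it over MQTT.

Edited for clarity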

Answer 1

I don't see the same behavior.

import sys
import re
while True:
    the_line = sys.stdin.readline()
    m = re.search('he(.+?)you', the_line)
    if m:
        print(m.group(1))

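I ran the program. I was prompted to type something and press Enter. Your regex was tested against whatever I typed, and the matched text was printed. Then the prompt came back and I could type another random string. The program does not stop for me. Looking at your code, there is no reason for the program to end.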
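One detail worth keeping in mind with this loop: at end of input, readline() returns an empty string rather than raising, so a bare `while True` keeps spinning once the upstream program closes the pipe. The following is only a minimal sketch of an EOF check; the `filter_lines` helper and the `io.StringIO` stand-in for sys.stdin are assumptions made so the example runs without a real pipe.

```python
import io
import re


def filter_lines(stream, pattern):
    """Collect group(1) of every matching line until the stream hits EOF."""
    results = []
    while True:
        line = stream.readline()
        if line == "":  # readline() returns an empty string only at EOF
            break
        m = re.search(pattern, line)
        if m:
            results.append(m.group(1))
    return results


# io.StringIO stands in for sys.stdin so the sketch can run without a pipe.
fake_stdin = io.StringIO("hello there you\nno match here\nhe said you\n")
found = filter_lines(fake_stdin, "he(.+?)you")
print(found)
```

A blank line typed at the prompt arrives as "\n", not "", so the check only fires at real end of input.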

Your code is efficient enough. Python has other ways to prompt for input, just as it has other ways to search a string.

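It depends on what you are looking for. If you are searching for varying patterns, you won't get much faster than re.search(). However, if you know the exact phrase, or you are looking for a small set of exact phrases, str.find() or the in operator may be faster.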
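To make the comparison concrete, here is a rough sketch of both approaches applied to the "data" field. The helper names, the `MARKER` constant, and the shortened sample line are made up for illustration; no timing claim is made here, only that both routes give the same payload for this sample.

```python
import re

MARKER = ',"data":"'


def payload_with_find(line):
    """Return the text between MARKER and the next double quote, or None."""
    start = line.find(MARKER)  # str.find() returns -1 when MARKER is absent
    if start == -1:
        return None
    begin = start + len(MARKER)
    end = line.find('"', begin)
    if end == -1:
        return None
    return line[begin:end]


def payload_with_regex(line):
    """Same extraction with re.search, for comparison."""
    m = re.search(',"data":"(.+?)"', line)
    return m.group(1) if m else None


# Shortened, made-up variant of the rxpk line from the question.
sample = 'rxpk update: {"rxpk":[{"size":10,"data":"aGVsbG8gMzA1OA=="}]}'
other = "Packet RSSI: -51, RSSI: -97, SNR: 9, Length: 10"

print(MARKER in sample, MARKER in other)
print(payload_with_find(sample), payload_with_regex(sample))
```

The `in` test alone is enough to skip the "Packet RSSI" lines before doing any further work on a line.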

Answer 2

Try it with this pattern: (?<="data":")[\w=]+(?=")

import sys
import re
regex = r'(?<="data":")[\w=]+(?=")'
while True:
    text = sys.stdin.readline()
    matches = re.finditer(regex, text)
    for match in matches:
        print ("{match}".format(match = match.group()))
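As a quick check of this pattern, the sketch below runs it against a shortened copy of one rxpk line from the question. Unlike the asker's original regex, the lookarounds leave out the trailing quote. One caveat: the payloads look like Base64, and the standard Base64 alphabet also contains + and /, which [\w=] does not cover. The `WIDE` pattern and the "ab+/cd==" payload are assumptions added for illustration, not part of the answer.

```python
import re

NARROW = r'(?<="data":")[\w=]+(?=")'          # pattern from the answer
WIDE = r'(?<="data":")[A-Za-z0-9+/=]+(?=")'   # assumed variant covering + and /

# Shortened copy of an rxpk line from the question.
line = 'rxpk update: {"rxpk":[{"rssi":-56,"size":10,"data":"aGVsbG8gMzA1Nw=="}]}'
# Made-up payload containing + and /, to show where the two patterns differ.
tricky = 'rxpk update: {"rxpk":[{"size":6,"data":"ab+/cd=="}]}'

narrow_hits = [m.group() for m in re.finditer(NARROW, line)]
narrow_tricky = [m.group() for m in re.finditer(NARROW, tricky)]
wide_tricky = [m.group() for m in re.finditer(WIDE, tricky)]

print(narrow_hits)
print(narrow_tricky, wide_tricky)
```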

Answer 3

The following script, with a few small modifications, works for me:

import fileinput
import re

for the_line in fileinput.input():
    m = re.search(',"data":"(.+?)}]}', the_line)
    if m:
        print (m.group(1))

Output:

aGVsbG8gMzA1Nw=="
aGVsbG8gMzA1OA=="
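Because this version iterates over lines instead of calling readline() in a `while True` loop, it stops on its own at end of input. The same idea can be sketched as a small generator that accepts any iterable of lines, whether that is fileinput.input(), sys.stdin, or a plain list. The `payloads` name and the shortened sample lines are assumptions for illustration.

```python
import re

PATTERN = re.compile(',"data":"(.+?)}]}')  # same regex as in the script above


def payloads(lines):
    """Yield group(1) for each line that matches; the loop ends when lines runs out."""
    for line in lines:
        m = PATTERN.search(line)
        if m:
            yield m.group(1)


# Shortened copies of the question's sample lines; with a real pipe,
# pass fileinput.input() or sys.stdin instead of this list.
sample_lines = [
    "Packet RSSI: -56, RSSI: -97, SNR: 9, Length: 10\n",
    'rxpk update: {"rxpk":[{"size":10,"data":"aGVsbG8gMzA1Nw=="}]}\n',
    "Packet RSSI: -49, RSSI: -96, SNR: 9, Length: 10\n",
    'rxpk update: {"rxpk":[{"size":10,"data":"aGVsbG8gMzA1OA=="}]}\n',
]

collected = list(payloads(sample_lines))
for item in collected:
    # flush=True pushes each result out immediately when stdout is itself a pipe
    print(item, flush=True)
```

As in the output shown above, the captured text keeps the trailing double quote because the regex only stops at }]}.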

Actively filtering with re.search as sys.stdin.readline() changes
