Search each line of text for a specific ID#, then add the line to a list if there is a match, in Python

Time: 2017-05-24 17:54:27

Tags: python python-3.x

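I'm looking for a way to search lines of text for an ID#. Once an ID# is found, I want to add the entire line of text to a list.

This is what I have so far.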

import subprocess
import re

bash = ("curl************************>> ~/Desktop/output.txt")
output = subprocess.check_output([bash], shell=True, stderr=subprocess.STDOUT)
with open('/******/output.txt', 'r') as myfile:
    everything = myfile.read().replace('\n', '')
    brokendownbyline = re.findall(r'{\"EGG\"(.*?)SHELL',str(everything))
for i in brokendownbyline:
    print(i)

This code prints the following:

"addjaid fja fahf ioah fa hdfh ahf 1234 asl kjas kdjf l akdjf"
"alkgad fganf daohdg o aunf g aoh oahf 9876 asl kdfna lk jfds"
"kl asdjfk ajsdfja sfiha flka jlkd jfakjfda ijf 4567 asdkf"
"asdkjfnajs dhfuioahfj a bnfgiuabf 3456asdkl fafaadhaa"
"ajsdfjaod ifjoa isdjfoia jdfoia hdgo iaf4637ads jfajis"

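Every line contains an ID#. I'm only looking for certain ID#s. When one is found, I want the matching line or lines added to a list; everything else can be ignored.

3 answers:

Answer 0: (score: 1)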

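I'm not sure why you are using bash for this. You also don't explain your exact goal very well, but as I understand it, for a given ID, say 34, you want to find every line that contains ID 34 and add the whole line to a list.

This can be done quite easily: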

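import os

line_list = []

# expanduser resolves "~" on any platform; the HOMEPATH variable exists only on Windows
with open(os.path.expanduser("~/Desktop/output.txt")) as f:
    for line in f:
        if "34" in line:  # substring test, so it also matches e.g. "1234"
            line_list.append(line)

for line in line_list:
    print(line.rstrip("\n"))  # each line read from the file keeps its newline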
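
Since the question mentions looking for several ID#s, the same idea can be extended to a list of IDs. This is only a minimal sketch: the helper name `lines_with_any_id` and the sample lines (copied from the question's output) are for illustration, and it keeps the plain substring test used above.

```python
def lines_with_any_id(lines, wanted_ids):
    """Return the lines that contain at least one wanted ID as a substring."""
    return [line for line in lines
            if any(wanted in line for wanted in wanted_ids)]


# Sample lines taken from the output shown in the question
sample = [
    "addjaid fja fahf ioah fa hdfh ahf 1234 asl kjas kdjf l akdjf",
    "alkgad fganf daohdg o aunf g aoh oahf 9876 asl kdfna lk jfds",
    "kl asdjfk ajsdfja sfiha flka jlkd jfakjfda ijf 4567 asdkf",
]

matches = lines_with_any_id(sample, ["1234", "4567"])
for line in matches:
    print(line)
```

Keep in mind that a substring test treats "34" as present in "1234", so short IDs may pick up lines you did not intend.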

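Answer 1: (score: 0)

The other solutions here suggest using the in keyword, which is fast, but you cannot use it to test a list against a string.

["f","b"] in "foo boo"  # raises TypeError

Instead, I would use a regular expression to pull out the numbers, then use set intersection to compare them with the wanted IDs.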

import re

# update this list as needed
IDS = ["1234", "4567", "4637"]
rows = ["addjaid fja fahf ioah fa hdfh ahf 1234 asl kjas kdjf l akdjf",
        "alkgad fganf daohdg o aunf g aoh oahf 9876 asl kdfna lk jfds",
        "kl asdjfk ajsdfja sfiha flka jlkd jfakjfda ijf 4567 asdkf",
        "asdkjfnajs dhfuioahfj a bnfgiuabf 3456asdkl fafaadhaa",
        "ajsdfjaod ifjoa isdjfoia jdfoia hdgo iaf4637ads jfajis"]

for row in rows:
    # every run of digits in the row; & keeps only those that are wanted IDs
    if set(IDS) & set(re.findall(r"[0-9]+", row)):
        print(row)

This prints only the matching rows:

addjaid fja fahf ioah fa hdfh ahf 1234 asl kjas kdjf l akdjf
kl asdjfk ajsdfja sfiha flka jlkd jfakjfda ijf 4567 asdkf
ajsdfjaod ifjoa isdjfoia jdfoia hdgo iaf4637ads jfajis
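
The question asks for the matching lines to end up in a list rather than only being printed. A rough sketch of that, wrapping the same regex and set intersection idea in a function; the name `rows_matching_ids` is made up for this example, and the rows are the ones from the question.

```python
import re


def rows_matching_ids(rows, ids):
    """Collect the rows in which some complete run of digits equals a wanted ID."""
    wanted = set(ids)  # build the set once instead of once per row
    return [row for row in rows
            if wanted & set(re.findall(r"[0-9]+", row))]


rows = ["addjaid fja fahf ioah fa hdfh ahf 1234 asl kjas kdjf l akdjf",
        "alkgad fganf daohdg o aunf g aoh oahf 9876 asl kdfna lk jfds",
        "kl asdjfk ajsdfja sfiha flka jlkd jfakjfda ijf 4567 asdkf",
        "asdkjfnajs dhfuioahfj a bnfgiuabf 3456asdkl fafaadhaa",
        "ajsdfjaod ifjoa isdjfoia jdfoia hdgo iaf4637ads jfajis"]

matched = rows_matching_ids(rows, ["1234", "4567", "4637"])
```

Because whole runs of digits are compared, an ID such as "34" does not match a row that only contains "1234", unlike a plain substring test.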

Answer 2: (score: 0)

import subprocess
import re

listed = []
bash = ("curl************************>> ~/Desktop/output.txt")
output = subprocess.check_output([bash], shell=True, stderr=subprocess.STDOUT)
with open('/******/output.txt', 'r') as myfile:
    everything = myfile.read()
    brokendownbyline = re.findall(r'{\"EGG\"(.*?)SHELL',str(everything))
    for i in brokendownbyline:
        if "1234564789" in i:
            listed.append(i)
for j in listed:
    print(j)

This is the final solution that worked for me.
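
One thing to watch for: unlike the code in the question, this version no longer strips newlines before running the regex, and `.` does not match a newline by default. If a record in the file ever spans more than one line, it would be skipped silently. A hedged sketch of one way around that, using `re.DOTALL`; the sample text below is invented for illustration, since the real curl output is not shown in the question.

```python
import re

# re.DOTALL lets "." match newlines, so a record may span several lines
pattern = re.compile(r'\{"EGG"(.*?)SHELL', re.DOTALL)

# Hypothetical input: the first record is split across two lines
sample_text = '{"EGG" first 1234564789 rec\nord SHELL{"EGG" second 42 SHELL'

records = pattern.findall(sample_text)
listed = [record for record in records if "1234564789" in record]

for record in listed:
    print(record)
```

If the real output always keeps each record on a single line, the original version behaves the same and the flag is not needed.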