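Extract text between patterns - unix - AIX

Date: 2017-12-20 12:06:16

Tags: unix sed aix

I want to extract the text between two patterns. The command I am using works on Linux but not on AIX.

First, I look for occurrences of the pattern in the file, which gives me this: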

sed -n "/STEP 005450/p" step_100
STEP 005450 ***ACTIVATED*** Thu Oct  5 17:31:05 CEST 2017
STEP 005450 REF R-A493 STEP 000010 ( command  -s <CONTAINER> )
STEP 005450 ***FAILURE*** Thu Oct  5 17:31:05 CEST 2017 Return code : 2
STEP 005450 ***ACTIVATED*** Thu Oct  5 17:33:54 CEST 2017
STEP 005450 REF R-A493 STEP 000010 ( command -s CONT1 )
STEP 005450 ***SUCCESFUL*** Thu Oct  5 17:33:55 CEST 2017

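Second, I want to extract the part of the text between STEP 005450 ***ACTIVATED*** and STEP 005450 ***FAILURE***, but the command below does not work: it filters nothing and prints every line unchanged.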

sed -n "/STEP 005450/p" step_100 | sed -n "/STEP 005450/,/FAILURE/p"
STEP 005450 ***ACTIVATED*** Thu Oct  5 17:31:05 CEST 2017
STEP 005450 REF R-A493 STEP 000010 ( command  -s <CONTAINER> )
STEP 005450 ***FAILURE*** Thu Oct  5 17:31:05 CEST 2017 Return code : 2
STEP 005450 ***ACTIVATED*** Thu Oct  5 17:33:54 CEST 2017
STEP 005450 REF R-A493 STEP 000010 ( command -s CONT1 )
STEP 005450 ***SUCCESFUL*** Thu Oct  5 17:33:55 CEST 2017

Normally, I should get this:

STEP 005450 ***ACTIVATED*** Thu Oct  5 17:31:05 CEST 2017
STEP 005450 REF R-A493 STEP 000010 ( command  -s <CONTAINER> )
STEP 005450 ***FAILURE*** Thu Oct  5 17:31:05 CEST 2017 Return code : 2

Thanks for your help.

1 answer:

Answer 0: (score: 1)

Maybe this works with AIX sed:

sed -n '/STEP 005450/{ /ACTIVATED/,/FAILURE/{ /ACTIVATED/{h;b}; H; /FAILURE/{g;p}; }; }' file
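
Some non-GNU sed implementations may not accept `;` after `}`, or a `}` on the same line as `b`, because POSIX lets the label argument of `b` run to the end of the line. The following is a sketch of the same script with one command per line, which is usually more portable. It has not been verified on AIX. The wrapper name `print_failed_block` is hypothetical, and its argument is the log file (`step_100` in the question).

```shell
# Hypothetical wrapper around the same sed script, one command per line.
# $1 = log file to scan.
# Logic: on an ACTIVATED line, start a new block in the hold space (h) and
# skip to the next line (b); append every following line (H); on a FAILURE
# line, copy the collected block back (g) and print it (p).
print_failed_block() {
  sed -n '
/STEP 005450/{
/ACTIVATED/,/FAILURE/{
/ACTIVATED/{
h
b
}
H
/FAILURE/{
g
p
}
}
}' "$1"
}
```

Usage: `print_failed_block step_100`. A block that ends in anything other than FAILURE is collected but never printed, because the next ACTIVATED line overwrites the hold space.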

Output:

STEP 005450 ***ACTIVATED*** Thu Oct  5 17:31:05 CEST 2017
STEP 005450 REF R-A493 STEP 000010 ( command  -s <CONTAINER> )
STEP 005450 ***FAILURE*** Thu Oct  5 17:31:05 CEST 2017 Return code : 2
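
For context, the range `/STEP 005450/,/FAILURE/` in the question prints everything because every line matches the start pattern: once the range closes on the FAILURE line, the next line opens a new range that never finds another FAILURE and so runs to the end of the input. If sed proves awkward on AIX, the same buffering idea can be sketched in awk. This is an alternative under assumptions, not the answer's method: the helper name `extract_failed_step` is hypothetical, and it assumes an awk that supports `-v`.

```shell
# Hypothetical helper: print each block of a given step that ends in FAILURE.
# $1 = step number (e.g. 005450), $2 = log file.
extract_failed_step() {
  awk -v step="STEP $1" '
    # Ignore lines that do not begin with the requested step.
    index($0, step) != 1 { next }
    # An ACTIVATED line starts a fresh buffer, discarding any unfinished block.
    /ACTIVATED/ { buf = $0; next }
    # While a block is open, append the current line to it.
    buf != "" { buf = buf "\n" $0 }
    # A FAILURE line closes the block: print it and reset the buffer.
    /FAILURE/ && buf != "" { print buf; buf = "" }
  ' "$2"
}
```

Usage: `extract_failed_step 005450 step_100`. Blocks that end in a status other than FAILURE are dropped when the next ACTIVATED line resets the buffer.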