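Linux shell script: search by input string

Time: 2018-04-28 17:11:59

Tags: bash shell sh

So, there is a large file that I have to search multiple times using a bash shell script.

The file looks like this: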

TITLE and AUTHOR                                                     ETEXT NO.

Aspects of plant life; with special reference to the British flora,      56900
 by Robert Lloyd Praeger

The Vicar of Morwenstow, by Sabine Baring-Gould                          56899
 [Subtitle: Being a Life of Robert Stephen Hawker, M.A.]

Raamatun tutkisteluja IV, mennessä Charles T. Russell                    56898
 [Subtitle: Harmagedonin taistelu]
 [Language: Finnish]

Raamatun tutkisteluja III, mennessä Charles T. Russell                   56897
 [Subtitle: Tulkoon valtakuntasi]
 [Language: Finnish]

Tom Thatcher's Fortune, by Horatio Alger, Jr.                            56896

A Yankee Flier in the Far East, by Al Avery                              56895
 and George Rutherford Montgomery
 [Illustrator: Paul Laune]

Nancy Brandon's Mystery, by Lillian Garis                                56894

Nervous Ills, by Boris Sidis                                             56893
 [Subtitle: Their Cause and Cure]

Pensées sans langage, par Francis Picabia                                56892
 [Language: French]

Helon's Pilgrimage to Jerusalem, Volume 2 of 2, by Frederick Strauss     56891
 [Subtitle: A picture of Judaism, in the century
  which preceded the advent of our Savior]

Fra Tommaso Campanella, Vol. 1, di Luigi Amabile                         56890
 [Subtitle: la sua congiura, i suoi processi e la sua pazzia]
 [Language: Italian]

The Blue Star, by Fletcher Pratt                                         56889

Importanza e risultati degli incrociamenti in avicoltura,                56888
 di Teodoro Pascal
 [Language: Italian]

The Junior Classics, Volume 3: Tales from Greece and Rome, by Various    56887


~ ~ ~ ~ Posting Dates for the below eBooks:  1 Mar 2018 to 31 Mar 2018 ~ ~ ~ ~

TITLE and AUTHOR                                                     ETEXT NO.

The American Missionary, Volume 41, No. 1, January, 1887, by Various     56886

Morganin miljoonat, mennessä Sven Elvestad                               56885
 [Author a.k.a. Stein Riverton]
 [Subtitle: Salapoliisiromaani]
 [Language: Finnish]

"Trip to the Sunny South" in March, 1885, by L. S. D                     56884

Balaam and His Master, by Joel Chandler Harris                           56883
 [Subtitle: and Other Sketches and Stories]

Susien saaliina, mennessä Jack London                                    56882
 [Language: Finnish]

Forged Egyptian Antiquities, by T. G. Wakeling                           56881

The Secret Doctrine, Vol. 3 of 4, by Helena Petrovna Blavatsky           56880
 [Subtitle: Third Edition]

No Posting                                                               56879

First love and other stories, by Iván Turgénieff                         56878

Now I have to search it by etext no, author name, and title.

For example, if I search by etext no, like etext 56900, it should return:

Aspects of plant life; with special reference to the British flora,      56900

I am new to shell scripting. So far I can only read the file, with this:

#!/bin/sh
read -p 'string to search ' searchstring
grep --color searchstring GUTINDEX.ALL | #condition

I don't know what kind of condition I should use to search by author name or etext no ....

2 Answers:

Answer 0: (score: 1)

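As others have already pointed out, grep alone is not really the right way to approach this problem. Using Awk instead of grep would be a considerable improvement, but for a real production system you would parse the fields into a relational database and search with SQL. With a database index, searches will be much faster than sequentially scanning the entire index file for every search.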
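A minimal sketch of what the Awk approach could look like, under stated assumptions: an entry starts on an unindented line that ends in an etext number preceded by at least two spaces, and the indented lines that follow belong to that entry. The helper name `gutsearch` and its `etext`/`text` modes are hypothetical, not part of any existing tool.

```shell
#!/bin/sh
# gutsearch MODE PATTERN [FILE]   (hypothetical helper)
#   MODE "etext": print the entry whose number equals PATTERN.
#   MODE "text":  print every entry whose text matches the regex PATTERN.
# Assumption: FILE defaults to GUTINDEX.ALL in the current directory.
gutsearch() {
    awk -v mode="$1" -v pat="$2" '
        # Print the collected entry if it satisfies the query, then reset.
        function emit() {
            if (rec == "") return
            if ((mode == "etext" && num == pat) ||
                (mode == "text"  && rec ~ pat))
                print rec
            rec = ""; num = ""
        }
        # Unindented line ending in "  <digits>": start of a new entry.
        /  +[0-9]+$/ && !/^ / { emit(); rec = $0; num = $NF; next }
        # Indented line while an entry is open: continuation of that entry.
        /^ +[^ ]/ && rec != "" { rec = rec "\n" $0; next }
        # Anything else (blank line, header, separator) closes the entry.
        { emit() }
        END { emit() }
    ' "${3:-GUTINDEX.ALL}"
}
```

With the listing from the question, `gutsearch etext 56900` would print the "Aspects of plant life" line together with its " by Robert Lloyd Praeger" continuation line, and `gutsearch text 'Praeger'` would print the same entry. Because the whole entry is matched as one record, an author on a continuation line still leads back to the etext number.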

But if you are restricted to grep, here is a quick and dirty attempt.

author () { grep -E "(by|par|di|mennessä) $*" GUTINDEX.ALL; }
index () { grep " $*\$" GUTINDEX.ALL; }
title () { grep "^$*" GUTINDEX.ALL; }

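This declares three shell functions that search different parts of the file by supplying an anchored expression (^ matches the beginning of a line, $ matches the end of a line) or suitable context.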
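To see why the anchor matters, here is a small sketch that runs an unanchored and an anchored pattern against two lines copied from the listing in the question. The temporary file and the variable name `sample` are only there for illustration.

```shell
#!/bin/sh
# Two sample lines from the listing, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
The American Missionary, Volume 41, No. 1, January, 1887, by Various     56886
Forged Egyptian Antiquities, by T. G. Wakeling                           56881
EOF

# '1887'     unanchored: also hits the year inside the title.
# ' 1887$'   anchored:   only an etext no. at the end of the line counts.
# ' 56881$'  anchored:   a real etext no. in the sample.
for pattern in '1887' ' 1887$' ' 56881$'; do
    if grep -q -- "$pattern" "$sample"; then
        echo "match:    '$pattern'"
    else
        echo "no match: '$pattern'"
    fi
done
```

The unanchored search for 1887 reports a match because of the year in the title, while the anchored version correctly reports none, since no entry in the sample has that etext number.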

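Taking the search expression as a command-line argument instead of requiring interactive input is usually a huge usability improvement. You can now use the shell's history mechanism to recall and possibly edit earlier searches, and build new scripts on top of these simple building blocks.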
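As one possible illustration of building on these blocks, the sketch below adds a hypothetical `lookup` function that picks the search type from the argument. The three functions are repeated so the block runs on its own; the `GUTINDEX` variable is an assumption added here so the catalog path can be overridden.

```shell
#!/bin/sh
# Assumption: GUTINDEX names the catalog file (defaults to GUTINDEX.ALL).
: "${GUTINDEX:=GUTINDEX.ALL}"

author () { grep -E "(by|par|di|mennessä) $*" "$GUTINDEX"; }
index ()  { grep " $*\$" "$GUTINDEX"; }
title ()  { grep "^$*" "$GUTINDEX"; }

# lookup (hypothetical): an all-digit argument is treated as an etext no.;
# anything else is tried as an author first and, failing that, as a title.
lookup () {
    case "$*" in
        ''|*[!0-9]*) author "$@" || title "$@" ;;
        *)           index "$@" ;;
    esac
}
```

For example, `lookup 56889` and `lookup Fletcher Pratt` would both print the line for "The Blue Star", and either command can be recalled and edited from the shell history.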

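(By the way, "mennessä" is not at all the correct Finnish localization here. I have reported a bug to Project Gutenberg.)

Answer 1: (score: 0)

You can start with something like this, but as @tom-fenech pointed out, it is rather unreliable without structured input.

For example, author names are prefixed inconsistently, sometimes appear under "Subtitle", and only rarely under an "Author" tag.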

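#!/bin/bash

CATALOG=/tmp/s

# Print usage and exit with an error status.
usage()
{
    echo "Usage:"
    echo "$0 [etext <key>] [author <id>]"
    exit 1
}

# Print the title/author text that precedes the given etext number.
process_etext()
{
    local searchKey=$1
    grep -E "${searchKey}" "${CATALOG}" | awk -F"${searchKey}" '{print $1}'
}

# Print the numbered line of an entry whose author matches: either the
# matching line itself or the line just before it (-B1).
process_author()
{
    local searchKey=$1
    grep -E -B1 "${searchKey}" "${CATALOG}" | grep -E "[[:digit:]]{5}"
}

# Consume the arguments in <command> <value> pairs.
while [ $# -gt 0 ]
do
    key="$1"
    case $key in
    etext|author)
        [ $# -ge 2 ] || usage
        "process_${key}" "$2"
        shift 2
        ;;
    *)
        usage
        ;;
    esac
done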