Question

I have an input file containing a list of words to search for (dog, cat, bird and fish in the examples below).

I also have a main file like this. Each line has a different number of fields, but always in increments of 3 (so 3, 6, 9, 12, ... fields):

dog bird 123       asdf 456 cloud    sam 4444 barbara
bird sdf asdf
asdf 123 fdsa      cat asdff 1223sdf
aaaa fish ffff       ffff fish aaaa

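I want the program to search the main file and print the whole line whenever a word from the input list matches. The trick is that I don't want to check every column of the main file, only the first column of each triplet, as shown below.

Check whether column 1, 4 or 7 matches a word from the input list.

So the word dog matches column 1 of the first line.
So the word cat matches column 4 of the third line.
So the word bird matches column 1 of the second line.
The word fish does not match column 1, 4 or 7 on any line, so it doesn't count.

Does that make sense? I have already found a way to do this in awk, but it involves passing an array as an argument, and parsing the array is quite tricky.

Help?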

Answer 1

Try:

awk 'FNR==NR{a[$1];next} {for (i=1;i<=NF;i+=3) if ($i in a) {print;next}}'  input main

Example:

$ awk 'FNR==NR{a[$1];next} {for (i=1;i<=NF;i+=3) if ($i in a) {print;next}}'  input main
dog bird 123       asdf 456 cloud    sam 4444 barbara
bird sdf asdf
asdf 123 fdsa      cat asdff 1223sdf

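How it works

FNR==NR{a[$1];next}

While we are reading the first file, the one with the words, we store each word as a key in the associative array a. We then skip the remaining commands and jump to the next line to start over.

for (i=1;i<=NF;i+=3) if ($i in a) {print;next}

For every third field, starting with the first, we check whether it appears as a key in the associative array a. If it does, we print the line and jump to the next line to start over.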
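
To make the loop's behaviour visible, here is a small sketch that runs the same idea on the sample data but reports which field matched instead of printing the line. The temporary directory, the file names input and main, and the one-word-per-line layout of the word list (dog, cat, bird, fish) are assumptions based on the question's examples, not something the question shows.

```shell
#!/bin/sh
# Assumed sample data: the word list is inferred from the question's examples.
tmpdir=$(mktemp -d)
printf '%s\n' dog cat bird fish > "$tmpdir/input"
cat > "$tmpdir/main" <<'EOF'
dog bird 123       asdf 456 cloud    sam 4444 barbara
bird sdf asdf
asdf 123 fdsa      cat asdff 1223sdf
aaaa fish ffff       ffff fish aaaa
EOF

# First file: remember each word as an array key.
# Second file: test fields 1, 4, 7, ... and report the first hit per line.
matches=$(awk '
FNR == NR { a[$1]; next }
{
    for (i = 1; i <= NF; i += 3)
        if ($i in a) {
            print FNR ": " $i " in field " i
            next
        }
}' "$tmpdir/input" "$tmpdir/main")

printf '%s\n' "$matches"
rm -rf "$tmpdir"
```

With that data the sketch reports dog in field 1 of line 1, bird in field 1 of line 2, and cat in field 4 of line 3. The fourth line is skipped because fish only appears in fields 2 and 5.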

Answer 2

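We can use the newfangled associative array data type (available in bash 4 and later) to store the search keywords, then loop over each line of the main file and test its target words against the associative array to see whether the line matches.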

INPUT_FILE='input.txt';
MAIN_FILE='main.txt';

## first read in all words from the input file into an associative array
## assume one word per line
declare -A keys=(); while read -r; do keys["$REPLY"]=1; done <"$INPUT_FILE";

## now read in one line at a time from the main file
while read -r; do
    read -ra words <<<"$REPLY"; ## word splitting without glob expansion
    ## check for a match in multiple-of-3 words
    for ((i = 0; i < ${#words[@]}; i += 3)); do
        if [[ ${keys["${words[i]}"]} ]]; then
            echo "$REPLY"; ## echo the whole matching line
            break; ## don't need to check anymore
        fi;
    done;
done <"$MAIN_FILE";

Output:

dog bird 123       asdf 456 cloud    sam 4444 barbara
bird sdf asdf
asdf 123 fdsa      cat asdff 1223sdf
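
For shells without associative arrays (bash before version 4, or a plain POSIX sh), the same lookup can be approximated with a space-delimited string and a case pattern. This is a swapped-in technique, not the method used in the script above. It is a sketch only: the temporary directory, the file contents, and the assumption of exactly one word per line in the input file are taken from the question's examples.

```shell
#!/bin/sh
# Assumed sample data: the word list is inferred from the question's examples.
tmpdir=$(mktemp -d)
printf '%s\n' dog cat bird fish > "$tmpdir/input.txt"
cat > "$tmpdir/main.txt" <<'EOF'
dog bird 123       asdf 456 cloud    sam 4444 barbara
bird sdf asdf
asdf 123 fdsa      cat asdff 1223sdf
aaaa fish ffff       ffff fish aaaa
EOF

# Collect the keywords into one string: " dog cat bird fish "
keys=' '
while read -r word; do
    [ -n "$word" ] && keys="$keys$word "
done < "$tmpdir/input.txt"

result=''
while IFS= read -r line; do
    set -f          # disable globbing while splitting
    set -- $line    # split the line into positional parameters
    set +f
    while [ "$#" -gt 0 ]; do
        case "$keys" in
            *" $1 "*)   # first word of the current triplet is a keyword
                result="$result$line
"
                break ;;
        esac
        # move on to the next triplet (or drop a short remainder)
        if [ "$#" -ge 3 ]; then shift 3; else shift "$#"; fi
    done
done < "$tmpdir/main.txt"

printf '%s' "$result"
rm -rf "$tmpdir"
```

The surrounding spaces in the pattern keep a word such as do from matching inside dog. The trade-off is that keywords containing spaces are not supported.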

Answer 3

$ cat > trois.awk 
BEGIN {                       # in the beginning
    RS="( +|\n)"              # set input record separator to spaces or newline
} 
NR==FNR {                     # for the first or input file only
    a[$1]                     # store the keywords
    next                      # avoid the rest of the code for the first file
} 
(($1 in a) && FNR%3==1) || i%3 {   # if a keyword matches in place 1, 4, 7, ...
    i++                            # or i counter allows printing (2,3, 5,6 ...)
    printf "%s%s", $1, i%3?OFS:ORS # print it pretty
}
$ awk -f trois.awk file1 file2
dog bird 123
bird sdf asdf
cat asdff 1223sdf

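In short, RS turns file2 into a list of words, one per record. If a keyword matches at a record number where FNR mod 3 == 1 (1, 4, 7, ...), the script starts the counter i, which lets it print that word and the next two.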
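
A regular expression in RS is an extension that not every awk supports, so the sketch below produces the same triplet-only output with an ordinary field loop instead. This is an alternative technique, not the script above. The file names file1 and file2 follow the answer's command line, and their contents are assumed from the question's examples.

```shell
#!/bin/sh
# Assumed sample data: the word list is inferred from the question's examples.
tmpdir=$(mktemp -d)
printf '%s\n' dog cat bird fish > "$tmpdir/file1"
cat > "$tmpdir/file2" <<'EOF'
dog bird 123       asdf 456 cloud    sam 4444 barbara
bird sdf asdf
asdf 123 fdsa      cat asdff 1223sdf
aaaa fish ffff       ffff fish aaaa
EOF

# First file: store the keywords.
# Second file: for each triplet whose first field is a keyword,
# print just that triplet (every matching triplet, not only the first).
triples=$(awk '
NR == FNR { a[$1]; next }
{
    for (i = 1; i <= NF; i += 3)
        if ($i in a)
            print $i, $(i + 1), $(i + 2)
}' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$triples"
rm -rf "$tmpdir"
```

Because the loop has no next after the print, a line holding two matching triplets yields two output lines, which mirrors how the counter-based script behaves.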

bash script search for matching rows of data
