Parsing lines from two two-column files

Time: 2016-08-04 16:01:36

Tags: bash awk sed

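I need a little help.

Using bash, I want to print the lines of the second file (and only the second file) whose second column matches a line in the first file but whose first column does not: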

file1 
111403787651,111915870316631
111408649892,111917070744403
111408653841,111919750018614
111408655467,111917420005028

file2
111403787651,111915870316631
444444444441,111917070744403
222222222222,333333333333333

Output (lines from the second file only):

444444444441,111917070744403

Thanks.

2 answers:

Answer 0: (score: 1)

awk to the rescue!

$ awk -F, 'NR==FNR{a[$2]=$1; next} $2 in a && $1 != a[$2]' file1 file2
444444444441,111917070744403
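
For readers who find the one-liner dense, here is the same logic spelled out with comments. This is only a sketch: the function name changed_first_column is made up for illustration, and it assumes that file1 is non-empty (otherwise NR==FNR also holds while reading file2) and that each second-column value occurs at most once in file1 (if it repeats, the last first-column value wins).

```shell
#!/bin/sh
# Hypothetical wrapper name; the awk program is the one from the answer,
# only reformatted and commented.
changed_first_column() {
  awk -F, '
    # First file (NR == FNR holds only while reading it):
    # remember column 1 under its column-2 key, then skip to the next line.
    NR == FNR { a[$2] = $1; next }

    # Second file: a pattern with no action prints the line. It matches when
    # column 2 was seen in the first file but was paired with another column 1.
    ($2 in a) && $1 != a[$2]
  ' "$1" "$2"
}
```

Called as changed_first_column file1 file2, it should print the same single line as the one-liner for the sample data in the question.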

Answer 1: (score: 0)

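Assuming I've read your intent correctly (a big assumption, since the language in the question is quite imprecise), the following is a native bash implementation that requires no external tools and emits the desired output for the input given in the question: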

#!/bin/bash
#      ^^^^ - NOT /bin/sh, as this requires bash-only (indeed, bash-4.x+-only) features

# read first file's contents 
declare -A first=( ) second=( ) # define associative arrays; requires bash 4.0
while IFS=, read -r a b; do     # read columns into variables a and b
  first[$a]=1; second[$b]=1     # set associative-array keys for each
done <file1                     # ...doing the above reading from file1

# iterate through second file's contents
while IFS=, read -r a b; do     # again, read into a and b
  if [[ ${second[$b]} && ! ${first[$a]} ]]; then # if we already saw b, and did not see a
    printf '%s,%s\n' "$a" "$b"                   # ...then emit output.
  fi
done <file2                     # ...doing the above reading from file2

References:

  • BashFAQ #001 ("How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?")
  • BashFAQ #006 ("How can I use [...] associative arrays?")