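Compare two files and print the lines with matching strings into one file

Time: 2016-08-28 07:06:13

Tags: bash text compression

I have two files to compare. If the first column of file1 matches part of the first column of file2, I want the two lines added side by side in file3. Here is an example: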

File1:

123123,ABC,2016-08-18,18:53:53
456456,ABC,2016-08-18,18:53:53
789789,ABC,2016-08-18,18:53:53
123123,ABC,2016-02-15,12:46:22

File2:

789789_TTT,567774,223452
123123_TTT,121212,343434
456456_TTT,323232,223344

Thanks.

3 answers:

Answer 0: (score: 1)

Using GNU AWK:

$ awk -F, 'NR==FNR{a[gensub(/([^_]*)_.*/,"\\1","g",$1)]=$0;next} $1 in a{print $0","a[$1]}' file2 file1
123123,ABC,2016-08-18,18:53:53,123123_TTT,121212,343434
456456,ABC,2016-08-18,18:53:53,456456_TTT,323232,223344
789789,ABC,2016-08-18,18:53:53,789789_TTT,567774,223452
123123,ABC,2016-02-15,12:46:22,123123_TTT,121212,343434

Explanation:

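NR==FNR {                                   # while reading the first file (file2)
    a[gensub(/([^_]*)_.*/,"\\1","g",$1)]=$0 # key: $1 up to the first "_", value: whole line
    next                                    # skip the rule below for file2 lines
}
$1 in a {                                   # second file (file1): $1 is a stored key
    print $0","a[$1]                        # file1 line, a comma, the matching file2 line
}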
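`gensub` is a GNU awk extension, so the one-liner above will not run under other awks. As a minimal sketch, assuming only a POSIX awk, the same key can be built with `sub()` on a copy of `$1`. The sample lines are taken from the question; the temporary directory and the variable names `tmp`, `key` and `result` are only for this illustration.

```shell
# Sample data from the question, written to a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' '123123,ABC,2016-08-18,18:53:53' \
              '456456,ABC,2016-08-18,18:53:53' > "$tmp/file1"
printf '%s\n' '789789_TTT,567774,223452' \
              '123123_TTT,121212,343434' \
              '456456_TTT,323232,223344' > "$tmp/file2"

# First pass (file2): key = first field with "_" and everything after it removed.
# Second pass (file1): print the file1 line joined with the stored file2 line.
result=$(awk -F, '
  NR == FNR { key = $1; sub(/_.*/, "", key); a[key] = $0; next }
  $1 in a   { print $0 "," a[$1] }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Working on a copy (`key`) rather than on `$1` itself leaves `$0` untouched, so the stored line still contains the `_TTT` suffix.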

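Answer 1: (score: 1)

An awk solution that matches a key derived from file2 against column 1 of file1. It should also work on Solaris with /usr/xpg4/bin/awk. I took the liberty of assuming that the last line of the OP's output contains an error.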

file1=$1
file2=$2
AWK=awk
[[ $(uname) == SunOS ]] && AWK=/usr/xpg4/bin/awk
$AWK -F',' '
BEGIN{OFS=","}
# file2 key is part of $1 till underscore 
FNR==NR{key=substr($1,1,index($1,"_")-1); f2[key]=$0; next}
$1 in f2 {print $0, f2[$1]}
' "$file2" "$file1"

Test:

123123,ABC,2016-08-18,18:53:53,123123_TTT,121212,343434
456456,ABC,2016-08-18,18:53:53,456456_TTT,323232,223344
789789,ABC,2016-08-18,18:53:53,789789_TTT,567774,223452
123123,ABC,2016-02-15,12:46:22,123123_TTT,121212,343434

Answer 2: (score: 0)

A pure bash solution:

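file1=$1
file2=$2
declare -A f2                 # associative array (bash 4+), keyed by the text before "_"
while IFS= read -r line; do
  key=${line%%_*}
  f2[$key]=$line
done <"$file2"
while IFS= read -r line; do
  key=${line%%,*}             # first comma-separated field of file1
  [[ -n ${f2[$key]} ]] || continue
  echo "$line,${f2[$key]}"
done <"$file1"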
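The question asks for the merged lines to end up in file3, which only takes a redirect. The following is a sketch rather than a definitive version: it wraps the two loops in a hypothetical function `join_files` (a name not used in the thread), assumes bash 4 or later for associative arrays, and runs on sample lines from the question in a temporary directory.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: join_files FILE1 FILE2 prints the merged lines.
join_files() {
  local file1=$1 file2=$2 line key
  local -A f2                          # associative array, needs bash 4+
  while IFS= read -r line; do
    key=${line%%_*}                    # text before the first "_"
    f2[$key]=$line
  done < "$file2"
  while IFS= read -r line; do
    key=${line%%,*}                    # first comma-separated field
    [[ -n ${f2[$key]:-} ]] || continue # no match in file2: skip the line
    printf '%s,%s\n' "$line" "${f2[$key]}"
  done < "$file1"
}

# Demo with lines from the question; 456456 has no partner in this file2.
tmp=$(mktemp -d)
printf '%s\n' '789789,ABC,2016-08-18,18:53:53' \
              '456456,ABC,2016-08-18,18:53:53' \
              '123123,ABC,2016-02-15,12:46:22' > "$tmp/file1"
printf '%s\n' '789789_TTT,567774,223452' \
              '123123_TTT,121212,343434' > "$tmp/file2"

join_files "$tmp/file1" "$tmp/file2" > "$tmp/file3"
file3_content=$(cat "$tmp/file3")
printf '%s\n' "$file3_content"
rm -rf "$tmp"
```

Lines of file1 without a matching key are dropped, which is the same behaviour as the awk answers.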