Question

I am using bash to loop over a large input file (contents.txt) that looks like this:

searchterm1
searchterm2
searchterm3

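...and I am trying to remove a search term from the file if it is not used anywhere in the codebase. I tried using grep and awk, but without success. I also want to exclude the images and constants directories.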

#!/bin/bash
while read a; do
  output=`grep -R $a ../website | grep -v ../website/images | grep -v ../website/constants | grep -v ../website/.git`
  if [ -z "$output" ]
  then echo "$a" >> notneeded.txt
  else echo "$a used $($output | wc -l) times" >> needed.txt
  fi
done < constants.txt

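The desired result is two files: one listing all the search terms that were found in the codebase (needed.txt), and another listing the search terms that were not found in the codebase (notneeded.txt).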

needed.txt

   searchterm1 used 4 times
   searchterm3 used 10 times

notneeded.txt

   searchterm2

I also tried awk in a similar way, but I could not get it to loop and produce the output I need.

Answer 1

I'm not sure, but it sounds like you are looking for something like this (assuming there is no whitespace in your file names):

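awk '
# First file (constants.txt): remember every search term.
NR==FNR{ terms[$0]; next }
# Remaining files: count the lines that match each term (terms are treated as regexps).
{
    for (term in terms) {
        if ($0 ~ term) {
            hits[term]++
        }
    }
}
# Write each term to needed.txt or notneeded.txt depending on whether it matched.
END {
    for (term in terms) {
        if (term in hits) {
            print term " used " hits[term] " times" > "needed.txt"
        }
        else {
            print term > "notneeded.txt"
        }
    }
}
' constants.txt $( find ../website -type f -print | grep -E -v '^\.\./website/(images|constants|\.git)/' )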

There are probably some find options that would make the grep filter unnecessary.
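
One way to do that filtering inside find itself is the -prune action, which stops find from descending into the directories matched just before it. The following is a minimal sketch rather than a tested drop-in: list_files is a hypothetical helper name, and it assumes the excluded directories sit directly under the root that is passed in.

```shell
#!/bin/bash
# Sketch: print every regular file under a root directory, skipping the
# images, constants and .git subdirectories with find's -prune action.
# list_files is a hypothetical helper name, not part of the answer above.
list_files() {
    # Drop a trailing slash so the -path patterns line up with find's output.
    local root=${1%/}
    find "$root" \
        \( -path "$root/images" -o -path "$root/constants" -o -path "$root/.git" \) -prune \
        -o -type f -print
}
```

With this in place, the file list at the end of the awk command could be written as $( list_files ../website ) instead of the find and grep pipeline. The same assumption about file names without whitespace still applies.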
