Question

I want to copy the files in a directory that contain all the lines of inputFile. Here is an example:

inputFile

Line3
Line1
LineX
Line4
LineB

file1

Line1
Line2
LineX
LineB

file2

Line100
Line10
LineB
Line4
LineX
Line3
Line1
Line4
Line1

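The script should copy only file2 to the destination directory, because all lines of inputFile are found in file2 but not in file1.

I can compare a single file against inputFile, as partly described elsewhere, and copy the file manually if the command produces no output. That is: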

awk 'NR==FNR{a[$0];next}!($0 in a)' file1 inputFile
Line3
Line4
awk 'NR==FNR{a[$0];next}!($0 in a)' file2 inputFile

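The first command prints Line3 and Line4, so there is no need to copy file1. Substituting file2 produces no output, which indicates that all lines of inputFile are found in file2, so I then run cp file2 ../distDir/.

This takes a lot of time, and I hope there is some way to do it in a for loop. I am not particular about awk; any bash scripting tool can be used.

Thanks,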

Answer 1

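Assuming the following:

All the files you need to check are in the current directory
The base file is also in the current directory and is named inputFile
The destination path is ../distDir/

You can run a Bash script like the one below. It iterates over all the files, compares each one with the base file, and copies it when it contains every line of the base file.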

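#!/bin/bash

inputFile="./inputFile"
targetDir="../distDir/"
for file in *; do
  # Skip directories, empty files, and the base file itself
  [ -f "$file" ] && [ -s "$file" ] || continue
  [ "$file" -ef "$inputFile" ] && continue
  # Collect the lines of inputFile that are missing from $file
  dif=$(awk 'NR==FNR{a[$0];next}!($0 in a)' "$file" "$inputFile")
  if [ -z "$dif" ]; then
    # File contains all lines, copy
    cp -- "$file" "$targetDir"
  fi
done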
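
The same loop can also be written around grep instead of awk. The following is only a sketch under assumptions: the function name copy_if_superset and its three arguments are hypothetical and not part of the answer above. It relies on the standard grep options -F (fixed strings), -x (whole-line match), -v (invert the match), -c (count) and -f (read patterns from a file), so grep counts the lines of the base file that do not occur in the candidate file.

```shell
#!/bin/sh
# copy_if_superset INPUT SRC_DIR DEST_DIR   (hypothetical helper name)
# Copies every regular file in SRC_DIR that contains all lines of INPUT.
copy_if_superset() {
    input=$1
    src=$2
    dest=$3
    for f in "$src"/*; do
        [ -f "$f" ] || continue             # skip directories and an unmatched glob
        [ "$f" -ef "$input" ] && continue   # skip the base file itself
        # Count the lines of INPUT that match no whole line of $f.
        # "|| true" keeps the loop going, since grep exits 1 when the count is 0.
        missing=$(grep -F -x -v -c -f "$f" "$input" || true)
        if [ "$missing" = 0 ]; then
            cp -- "$f" "$dest"/
        fi
    done
}

# Example call for the layout assumed in the answer:
#   copy_if_superset ./inputFile . ../distDir
```

An empty candidate file supplies no patterns, so every line of the base file counts as missing and the file is not copied.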

Answer 2

A bash solution (comm + wc commands):

#!/bin/bash

n=$(sort -u inputFile | wc -l)   # number of distinct lines in inputFile
for f in /yourdir/file*
do
    # Copy when the count of lines common to both files equals n
    if [[ $n -eq $(comm -12 <(sort -u inputFile) <(sort -u "$f") | wc -l) ]]
    then
        cp "$f" "/dest/${f##*/}"
    fi
done

comm -12 FILE1 FILE2 - outputs only the lines that appear in both files (both inputs must be sorted)
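
A related way to use comm is to ask for the lines that appear only in the input file, which removes the need to compare two counts. This is a sketch, not the answer's own method: the helper name has_all_lines is hypothetical, comm -23 suppresses the lines unique to the second input and the lines common to both, and "-" tells comm to read the second input from standard input.

```shell
#!/bin/sh
# has_all_lines INPUT FILE   (hypothetical helper name)
# Succeeds when every distinct line of INPUT also appears in FILE.
has_all_lines() {
    sorted_input=$(mktemp) || return 2
    # Sort and compare in the same locale so comm sees consistently ordered input.
    LC_ALL=C sort -u "$1" > "$sorted_input"
    # Count the lines found only in INPUT; tr strips padding some wc versions add.
    missing=$(LC_ALL=C sort -u "$2" | LC_ALL=C comm -23 "$sorted_input" - | wc -l | tr -d '[:space:]')
    rm -f "$sorted_input"
    [ "$missing" -eq 0 ]
}

# Example use inside a loop like the one in the answer:
#   for f in /yourdir/file*; do
#       if has_all_lines inputFile "$f"; then cp "$f" "/dest/${f##*/}"; fi
#   done
```

Counting with wc -l rather than capturing the text keeps a missing blank line from being overlooked, because command substitution drops trailing newlines.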

Answer 3

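Could you please try the following and let me know whether it helps. I have put "echo cp " val " destination_path" inside system(), so once you are happy with the echoed result you can remove the echo and substitute the actual value of destination_path (as written it only prints, for example, cp file2 destination_path).

I will add an explanation shortly too.

EDIT1: Per the OP, the file named Input_file is the one every other file should be compared against, so the code has been changed as requested.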

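awk 'function check(array,val,count){
  # Every distinct line of Input_file was seen in the file named val
  if(length(array)==count){
    system("echo cp " val " destination_path")
  }
}
FNR==NR{                  # first file (Input_file): remember its lines
  a[$0];
  next
}
val!=FILENAME{            # a new file has started: check the previous one
  check(a,val,count)
}
FNR==1{                   # first line of a file: reset the per-file state
  val=FILENAME;
  count=total="";
  delete b
}
($0 in a) && !b[$0]++{    # count each Input_file line once per file
  count++
}
END{                      # check the last file
  check(a,val,count)
}
' Input_file file1  file2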

To run the same check against every regular file under the current directory, use find:

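find . -type f ! -name Input_file -exec awk 'function check(array,val,count){
  # Every distinct line of Input_file was seen in the file named val
  if(length(array)==count){
    system("echo cp " val " destination_path")
  }
}
FNR==NR{                  # first file (Input_file): remember its lines
  a[$0];
  next
}
val!=FILENAME{            # a new file has started: check the previous one
  check(a,val,count)
}
FNR==1{                   # first line of a file: reset the per-file state
  val=FILENAME;
  count=total="";
  delete b
}
($0 in a) && !b[$0]++{    # count each Input_file line once per file
  count++
}
END{                      # check the last file
  check(a,val,count)
}
' Input_file {} +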
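
Because the commands above build a shell command string from the file name inside system(), names containing spaces or shell metacharacters would cause trouble once the echo is removed. One possible alternative, sketched here under assumptions, is to let awk only print the names of the matching files and leave the copying to the shell. The helper name list_supersets is hypothetical, the input file is assumed to be non-empty, and file names are assumed to contain no newline characters.

```shell
#!/bin/sh
# list_supersets INPUT FILE...   (hypothetical helper name)
# Prints the name of every FILE that contains all distinct lines of INPUT.
list_supersets() {
    awk '
        # First file: remember each distinct line and count them.
        FNR == NR {
            if (!($0 in need)) { need[$0]; total++ }
            next
        }
        # First line of each later file: report the previous file, then reset.
        FNR == 1 { report(); name = FILENAME; count = 0; split("", seen) }
        # Count each needed line once per file.
        ($0 in need) && !($0 in seen) { seen[$0]; count++ }
        # Report the last file.
        END { report() }
        function report() { if (name != "" && count == total) print name }
    ' "$@"
}

# Example use (destination_path is the placeholder from the answer):
#   list_supersets Input_file file1 file2 |
#       while IFS= read -r f; do cp -- "$f" destination_path; done
```

The function only lists candidates, so the output can be inspected first, much like the echo in the answer.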

PS: I wrote and tested this in GNU awk.

Copy files that contain all lines of an input file

3 answers: