Question

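I'm curious which is faster in the following situation. I have output files of about 2 MB each, with thousands of lines (anywhere between 15k and 50k).

I'm looking for a string near the end of the file (within the last 10 lines or so). I do this many times, sometimes on the same last 10 lines of the same file, and across multiple files.

I'm curious which of the following is the fastest and most efficient:

1. tail the last 10 lines and save them in a variable. Whenever I need to grep or check for a string, echo that variable and grep the output.
2. Every time I need to grep, tail the output file first, then pipe the result into grep.
3. Skip both of the above and just grep the whole file every time.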

Option 1)

if [ -f "$jobFile".out ]; then
  # Read the last 10 lines once and reuse them for both checks.
  output=$(tail -n 10 "$jobFile".out)
  # Skip this job unless one of the two error strings is present.
  if ! echo "$output" | grep -q "Command exited with non-zero status" &&
     ! echo "$output" | grep -q "Error termination via Lnk1e"; then
    continue
  fi
  output "$(grep "$jobID" "$curJobsFile")"
  sed -i "/$jobID/d" "$jobIDsWithServer"
fi

Option 2)

if [ -f "$jobFile".out ]; then
  # tail runs again for each check; skip the job if neither string is found.
  if ! tail -n 10 "$jobFile".out | grep -q "Command exited with non-zero status" &&
     ! tail -n 10 "$jobFile".out | grep -q "Error termination via Lnk1e"; then
    continue
  fi
  output "$(grep "$jobID" "$curJobsFile")"
  sed -i "/$jobID/d" "$jobIDsWithServer"
fi

Option 3)

if [ -f "$jobFile".out ]; then
  # grep scans the whole file for each check; skip the job if neither string is found.
  if ! grep -q "Command exited with non-zero status" "$jobFile".out &&
     ! grep -q "Error termination via Lnk1e" "$jobFile".out; then
    continue
  fi
  output "$(grep "$jobID" "$curJobsFile")"
  sed -i "/$jobID/d" "$jobIDsWithServer"
fi

Answer 1

Option 2 runs tail twice, so it will probably be slightly slower than option 1. Both will be much faster than option 3.
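If you want to check this on your own machine, here is a rough timing sketch. The synthetic file, the function names (check_saved, check_tail_each, check_whole) and the count of 100 runs are all made up for illustration, and the nanosecond timestamps assume GNU date. Treat the numbers as a relative comparison only, not a proper benchmark.

```shell
# Assumption: GNU date (for %N) and a synthetic file standing in for a real .out file.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
sample="$tmpdir/sample.out"

# About 50k lines of filler, with the error marker on the final line.
seq 1 50000 | sed 's/^/step /' > "$sample"
echo "Error termination via Lnk1e" >> "$sample"

pat1="Command exited with non-zero status"
pat2="Error termination via Lnk1e"

# Option 1: tail once, then grep the saved variable.
check_saved() {
  last=$(tail -n 10 "$1")
  echo "$last" | grep -q "$pat1" || echo "$last" | grep -q "$pat2"
}

# Option 2: tail the file again for every grep.
check_tail_each() {
  tail -n 10 "$1" | grep -q "$pat1" || tail -n 10 "$1" | grep -q "$pat2"
}

# Option 3: grep the whole file.
check_whole() {
  grep -q "$pat1" "$1" || grep -q "$pat2" "$1"
}

# Run one approach 100 times so the difference is large enough to see.
repeat() {
  i=0
  while [ "$i" -lt 100 ]; do
    "$1" "$sample"
    i=$((i + 1))
  done
}

for fn in check_saved check_tail_each check_whole; do
  start=$(date +%s%N)
  repeat "$fn"
  end=$(date +%s%N)
  echo "$fn: $(( ($end - $start) / 1000000 )) ms for 100 runs"
done
```

On small files most of the time goes into starting processes, so the gap between options 1 and 2 may be hard to see. It should matter more as the files grow.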

Another thing you could do is:

if [ -f "$jobFile".out ]; then
  # Read the file backwards and stop at the first line matching either string.
  if ! tac "$jobFile".out |
       grep -E -m1 -q "(Command exited with non-zero status|Error termination via Lnk1e)"; then
    continue
  fi
  output "$(grep "$jobID" "$curJobsFile")"
  sed -i "/$jobID/d" "$jobIDsWithServer"
fi

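This outputs the file in reverse order, and grep stops after the first match. It also searches for both terms at once, which saves you from running grep a second time when the first term doesn't match.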
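To see the effect of reversing the file, here is a small sketch on a made-up log. The file contents are hypothetical, and it assumes tac (GNU coreutils) and a grep that supports -m, such as GNU grep.

```shell
# Hypothetical log with two matching lines; the later one is nearer the end.
log=$(mktemp)
cat > "$log" <<'EOF'
starting job
Command exited with non-zero status 1
more output
Error termination via Lnk1e
done
EOF

# tac emits the last line first, so the first match grep sees is the one
# closest to the end of the file; -m1 makes grep stop after that match.
last_match=$(tac "$log" |
  grep -E -m1 "Command exited with non-zero status|Error termination via Lnk1e")
echo "$last_match"
rm -f "$log"
```

This prints "Error termination via Lnk1e", the match nearest the end. Keep in mind that when nothing matches, the whole file still gets read, so the saving only applies to files that do contain one of the strings.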

Answer 2

Why not do this:

# -q makes grep exit on the first match instead of printing every matching line.
if tail -f "$jobFile.out" \
    | grep -q -F -e "Command exited with non-zero status" -e "Error termination via Lnk1e"
then
   output "$(grep "$jobID" "$curJobsFile")"
   sed -i "/$jobID/d" "$jobIDsWithServer"
fi

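This way you search the tail of the output in real time until you find what you're looking for. Note that tail -f keeps following the file, so this blocks until a matching line shows up.

The -F flag makes grep treat the patterns as fixed strings, which can be faster when you don't need regular expressions.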
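Besides speed, -F also changes what matches, which a short sketch can show. The sample lines below are made up for illustration.

```shell
# Hypothetical sample: the second line only matches if "." acts as a wildcard.
demo=$(mktemp)
printf '%s\n' "exit status 1.5" "exit status 1x5" > "$demo"

# With -F the patterns are literal strings, so "1.5" matches only the first line.
fixed_count=$(grep -c -F -e "1.5" -e "no such text" "$demo")

# Without -F the dot is a regex wildcard, so both lines match.
regex_count=$(grep -c -e "1.5" -e "no such text" "$demo")

echo "fixed: $fixed_count, regex: $regex_count"
rm -f "$demo"
```

This prints "fixed: 1, regex: 2". The two strings in the question contain no regex metacharacters, so for them -F only affects how grep searches, not what it finds.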

Which is faster: `echo` a variable, `tail` a long output file, or `grep` the whole thing?
