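I have been trying to write a grep expression that goes through all the text files in a directory and returns only the files that contain all of the patterns I am looking for. A sample input file looks like this: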
A 29 LIJ uniteresting_numbers uniteresting_numbers uniteresting_numbers
A 30 RTX uniteresting_numbers uniteresting_numbers uniteresting_numbers <=B
A 31 BRN uniteresting_numbers uniteresting_numbers uniteresting_numbers <=B
A 32 SJY uniteresting_numbers uniteresting_numbers uniteresting_numbers <=B
A 33 MRT uniteresting_numbers uniteresting_numbers uniteresting_numbers
A 34 MUY uniteresting_numbers uniteresting_numbers uniteresting_numbers
A 35 OOP uniteresting_numbers uniteresting_numbers uniteresting_numbers
I would like to search all the .txt files in a directory and return only the files that contain all of the following:
A 30 RTX uniteresting_numbers uniteresting_numbers uniteresting_numbers <=B
A 31 BRN uniteresting_numbers uniteresting_numbers uniteresting_numbers <=B
A 32 SJY uniteresting_numbers uniteresting_numbers uniteresting_numbers <=B
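If any of these three lines is missing, I want the file skipped. In each case I will know the two-digit numbers and three-letter codes I am looking for, and I want to take them as variables from user input. What I am after is the files in which every two-digit number and three-letter code I am interested in has <=B at the end of its line.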
Here is the code I have thus far:
echo What do you want to name your output file?
read myoutput
for file in *.txt; do
if grep -q "RTX$(printf '\t')*[0-9]$(printf '\t')*[0-9]$(printf '\t')*[0-9]" <"$file"; then
if grep -q "BRN$(printf '\t')*[0-9]$(printf '\t')*[0-9]$(printf '\t')*[0-9]" <"$file"; then
if grep -q "SJY$(printf '\t')*[0-9]$(printf '\t')*[0-9]$(printf '\t')*[0-9]" <"$file"; then
echo "$file" >>"$myoutput".txt
else
echo not found
fi
fi
fi
done
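Note that I have not yet added the part where the user enters the three-letter codes and two-digit numbers; that should not be too hard. In the input data, a tab separates each column. Right now, I can match everything up to the final tab and <=B.
I tried this, with no luck: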
echo What do you want to name your output file?
read myoutput
for file in *.txt; do
if grep -q "RTX$(printf '\t')*[0-9]$(printf '\t')*[0-9]$(printf '\t')*[0-9]$(printf '\t')$(printf '<=B')" <"$file"; then
if grep -q "BRN$(printf '\t')*[0-9]$(printf '\t')*[0-9]$(printf '\t')*[0-9]$(printf '\t')$(printf '<=B')" <"$file"; then
if grep -q "SJY$(printf '\t')*[0-9]$(printf '\t')*[0-9]$(printf '\t')*[0-9]*$(printf '\t')$(printf '<=B')*" <"$file"; then
echo "$file" >>"$myoutput".txt
else
echo not found
fi
fi
fi
done
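Any help would be greatly appreciated. In some cases there will be more than three lines that I am looking for. Is there a simple way to modify this to look for n <=B lines? Thanks so much, everyone!
EDIT: I moved to awk, as suggested.
To do so, I entered the following: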
#!/bin/bash
echo What do you want to name your output file?
read myoutput
for file in *.txt; do
if awk '/30/ && /RTX/ && /B/' "$file"; then
echo it worked
fi
done
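The phrase "it worked" is printed 6 times. The mini directory I am testing this script in contains 6 files, and only 3 of them actually match the awk pattern. How do I get the code after "then" to run only for the files that contain the awk pattern? I tried the following, based on the tutorial here: https://www.thegeekstuff.com/2010/02/awk-conditional-statements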
#!/bin/bash
echo What do you want to name your output file?
read myoutput
for file in *.txt; do
$ awk '{
if ($2 =="30" || $3 == "RTX" || $7 == "B")
echo it worked
}' "$file"
done
I had no success. Thank you for your guidance!
Answer 0 (score: 1)
Although it may differ from your approach, please try the following:
myoutput="myoutput.txt"
for f in *.txt; do
awk -v output="$myoutput" -v numbers="30 31 32" -v strings="RTX BRN SJY" '
BEGIN {
split(numbers, num)
split(strings, str)
delete matched
}
{
for (n in num) {
if (match($0, "^A\t" num[n] "\t" str[n] "\t[0-9]+\t[0-9]+\t[0-9]+\t<=B$")) {
matched[n]++
}
}
}
END {
for (n in num) {
if (!matched[n]) {
exit
}
}
print FILENAME >> output
} ' "$f"
done
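You can assign the shell variables numbers and strings lists of whatever length the user wants. The two lists are paired by position, so they need the same number of items in the same order.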
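On the follow-up about "it worked" printing for all six files: awk exits with status 0 whether or not any line matched, so a shell `if awk ...; then` is always true unless awk is told to exit with a different status. The following is only a sketch of one way to make the exit status carry the result, so the check can sit inside a shell `if`. The function names `file_has_all_rows` and `list_matching_files` are made up for illustration, and the middle columns are matched loosely with `.*` on the assumption that their exact format is not known from the sample.

```shell
#!/bin/bash

# Hypothetical helper (name is illustrative): exit status 0 only if file $1
# has, for every number/code pair, a tab-separated row "A <number> <code> ... <=B".
# $2 is a space-separated list of numbers, $3 the matching list of codes.
file_has_all_rows() {
    awk -v numbers="$2" -v codes="$3" '
        BEGIN {
            n = split(numbers, num, " ")
            # The lists are paired by position, so they must be the same length.
            if (n == 0 || split(codes, code, " ") != n) { bad = 1; exit 2 }
        }
        {
            for (i = 1; i <= n; i++)
                # Assumption: middle columns are matched loosely with ".*".
                if ($0 ~ ("^A\t" num[i] "\t" code[i] "\t.*\t<=B$"))
                    seen[i] = 1
        }
        END {
            if (bad) exit 2               # unusable input lists
            for (i = 1; i <= n; i++)
                if (!seen[i]) exit 1      # at least one wanted row is missing
        }
    ' "$1"
}

# Hypothetical wrapper: print the name of each .txt file in the current
# directory that contains all of the wanted rows.
list_matching_files() {
    local f
    for f in *.txt; do
        [ -f "$f" ] || continue   # skips the literal "*.txt" when nothing matches
        if file_has_all_rows "$f" "$1" "$2"; then
            printf '%s\n' "$f"
        fi
    done
}
```

The lists could come from the user, for example via `read -r numbers` and `read -r codes`, followed by `list_matching_files "$numbers" "$codes" >> "$myoutput".txt`. Because the pairs are walked in a loop, the same call handles any number of <=B rows.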