I have a txt file named hitlist.txt (it is updated regularly) that contains a list of words/strings I want to grep a directory against. It looks like this:
# This is just a comment and will not be part of the search
* Blah - this is a category
foo
bar
sibilance
# A new category
* Meh - another category
snakefish
sex panther
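My list is usually > 100 strings, each on its own line. Today, because of a deadline, I simply went through the list and ran the following command for each word:
find -iname "*" -type f -print0 | xargs -0 grep -HniI "foo" >> results.txt
As the command above shows, I am interested in the file path and name as well as the line containing the matched text. The file contains several category lists (marked by *), and I would like to be able to run my script against one, several, or all categories.
I would also like to be able to turn off the -i flag (that is, search case-sensitively) as an option. I have a script that recursively finds/lists all files in a directory, plus the command I used above. Finally, the hitlist format can be changed completely if needed.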
Answer 0 (score: 1)
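Set up the ghl() (grep hitlist) shell function to do the work (it depends on GNU grep's -o switch, plus a sed loop). Its output is a list of words from hitlist.txt (or <filename>):
# usage ghl <glob> <filename>
# sed prints the lines that follow the matching category header and stops
# at the first empty line or line that begins with a space
ghl() { grep -o '\* '"$1"' -' "$2" | grep -o '[[:alpha:]]*' | \
while read x ; do \
sed -n '/\* '"$1"'/{:show ;n;/^[^ ]/{p;b show;}}' "$2" ; \
done ; }
Feed the word list output by ghl, here called with the ".*ah" glob (which matches the Blah category), into grep -f -, plus some ad hoc bash process substitution to generate input text:
ghl '.*ah' hitlist.txt | grep -i -f - <(echo bar) <(echo foo) <(echo Foo)
Output:
/dev/fd/63:bar
/dev/fd/62:foo
/dev/fd/61:Foo
The second grep above can be passed switches as needed (see man grep). For example, the same thing but case-sensitive (i.e. with the -i switch removed):
ghl '.*ah' hitlist.txt | grep -f - <(echo bar) <(echo foo) <(echo Foo)
Output (note the missing uppercase item):
/dev/fd/63:bar
/dev/fd/62:foo
Since grep already has options that handle recursive searching, the rest is just a matter of adding switches as needed.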
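As a rough illustration of that last remark, here is a minimal sketch that wires a category word list into a recursive grep. It does not reuse ghl(); category_words and search_category are hypothetical names, the extraction is done with awk instead of the grep/sed pair, and the use of -F (fixed strings) is an assumption about how the hitlist entries should be interpreted. It relies on grep accepting -r and -f -, as GNU grep does.

```shell
# category_words CATEGORY_REGEX HITLIST
# Print the words listed under every "* Name - ..." header whose Name
# matches CATEGORY_REGEX. Comment (#) lines and blank lines are skipped.
category_words() {
    awk -v cat="$1" '
        /^#/ || /^[[:space:]]*$/ { next }
        /^\*/ { keep = ($2 ~ ("^(" cat ")$")); next }
        keep  { print }
    ' "$2"
}

# search_category CATEGORY_REGEX HITLIST DIR [extra grep switches...]
# Recursively search DIR for the selected words, printing file:line:text.
# Extra switches (for example -i) are passed straight through to grep.
search_category() {
    local cat=$1 hitlist=$2 dir=$3
    shift 3
    category_words "$cat" "$hitlist" | grep -rHnI -F "$@" -f - -- "$dir"
}
```

With this sketch, search_category '.*ah' hitlist.txt some_dir -i would search case-insensitively for the Blah words, and leaving off -i makes the search case-sensitive. A regex of '.*' selects every category.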
Answer 1 (score: 0)
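Your question is rather vague, but I imagine this is more or less what you want.
awk -v cat='Blah|Meh' 'NR==FNR && /^#/ { next } # Skip comments
NR==FNR && /^\*/ { if ($0~cat) c=1; else c=0; next } # Category header: remember whether it is selected
NR==FNR { if(c) a[tolower($0)]=1; next } # Store selected words, lowercased
# Note: this only matches when an entire input line equals a hitlist entry
tolower($0) in a { print FILENAME ":" FNR ":" $0 }' Hits.txt files to search
Figuring out how to selectively disable the tolower() and how to hook this up so that the list of file names other than Hits.txt comes from find should be fairly obvious.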
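For readers who want those two exercises spelled out, the following is one possible sketch, not the answerer's own solution. awk_hits is a hypothetical name, the ICASE argument (1 or 0) is an assumed way of switching case folding, and the match is changed from whole-line equality to a substring test with index(). File names are supplied by find -exec, with the hitlist passed as the first file of each awk invocation.

```shell
# awk_hits CATEGORY_REGEX ICASE HITLIST DIR
# Print FILE:LINE:TEXT for every line under DIR that contains a word from
# the selected categories. ICASE=1 folds case, ICASE=0 compares exactly.
awk_hits() {
    find "$4" -type f -exec awk -v cat="$1" -v icase="$2" '
        FNR == NR && /^#/  { next }                     # skip comments
        FNR == NR && /^\*/ { keep = ($0 ~ cat); next }  # category header
        FNR == NR {                                     # hitlist word
            if (keep && $0 != "") {
                key = (icase == 1) ? tolower($0) : $0
                words[key] = 1
            }
            next
        }
        {                                               # searched files
            line = (icase == 1) ? tolower($0) : $0
            for (w in words)
                if (index(line, w)) {
                    print FILENAME ":" FNR ":" $0
                    break
                }
        }
    ' "$3" {} +
}
```

For example, awk_hits 'Blah|Meh' 1 Hits.txt some_dir searches both categories case-insensitively. The sketch assumes the hitlist file is not empty (the NR==FNR test needs at least one line in it) and makes no attempt to skip binary files.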
Answer 2 (score: 0)
This is what I ended up with:
Hitlist format:
# MEH
never,going,to give,you up
# blah
word to,your,mother
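To make the format concrete, here is a small sketch of how such a file could be read: a line starting with # names a category, and the following line holds that category's comma-separated words. list_words is a hypothetical helper, and the optional category arguments are an assumption added to show how a subset of categories might be selected; names are compared exactly, including case.

```shell
# list_words HITLIST [CATEGORY...]
# Print "category<TAB>word" for every word in HITLIST. If category names
# are given, only those categories are printed.
list_words() {
    local file=$1
    shift
    local category="" line word name wanted
    local -a words
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ -z $line ]]; then
            continue                      # skip blank lines
        fi
        if [[ $line == "#"* ]]; then
            category=${line#"#"}          # drop the leading "#"
            category=${category# }        # and one following space
            continue
        fi
        wanted=1
        if (( $# > 0 )); then
            wanted=0
            for name in "$@"; do
                if [[ $name == "$category" ]]; then
                    wanted=1
                fi
            done
        fi
        if (( wanted )); then
            IFS=',' read -ra words <<< "$line"
            for word in "${words[@]}"; do
                printf '%s\t%s\n' "$category" "$word"
            done
        fi
    done < "$file"
}
```

Called as list_words hitlist.txt blah, this prints the three blah entries, one per line; with no category arguments it prints every entry in the file.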
Script:
#!/bin/bash
# Set defaults
OUTPUT_FILE="hits.txt"
HITLIST_FILE="hitlist.txt"
# Hold on to the args
ARGLIST=("$@")
# Declare any functions
help ()
{
echo "--------------------------------- Luffa --------------------------------"
echo "Usage: luffa.sh [DIRTOSCRUB]"
echo ""
echo "Searches DIRTOSCRUB for category specific words in $HITLIST_FILE."
echo ""
echo "EXAMPLE: luffa.sh dirtoscrub"
echo ""
echo "--help display this help and exit"
echo "--version display version information and exit"
}
version ()
{
echo "luffa.sh v1.0"
}
process ()
{
if [ ${#FILEARG} -lt 1 ] # check for proper number of args
then
echo "ERROR: Specify directory to be searched."
help
exit 1
else
SEARCH_DIR=${ARGLIST[0]}
fi
echo ""
echo "--------------------------------------------------------- Luffa ---------------------------------------------------" | tee -a "$OUTPUT_FILE"
echo "search command: find [DIRTOSCRUB] -type f -print0 | xargs -0 grep -HniI --color=always \$word | tee -a ../hits.txt | more" | tee -a "$OUTPUT_FILE"
echo
echo " .,,:::::." | tee -a "$OUTPUT_FILE"
echo " .,,::::~:::::.." | tee -a "$OUTPUT_FILE"
echo " ,,::::~~~~~~::~~:::." | tee -a "$OUTPUT_FILE"
echo " ,:,:~:~~~~~~~~~~~~~~::." | tee -a "$OUTPUT_FILE"
echo " ,,:::~:~~~~~~~~~~~~~~~~~~," | tee -a "$OUTPUT_FILE"
echo " .,,::::~~~~~~~~~~~~~~~~~~~~~~::" | tee -a "$OUTPUT_FILE"
echo " .,::~:~~~~~=~~~~=~~~~~~~~~~~=~~~~." | tee -a "$OUTPUT_FILE"
echo " ,::::~~:~~~=~~~~~~~~=~~=~~~===~~~~~~." | tee -a "$OUTPUT_FILE"
echo " ..:::~~~~=~~=~~~~~~=~~~~=~~===~~==~~~~~~," | tee -a "$OUTPUT_FILE"
echo " .,:::~~~~~~~~~~~~~~~~=~=~~~=~====~===~~~~~~~." | tee -a "$OUTPUT_FILE"
echo " .,::~~~~~~~~~~~~~~=~=~~~~~=~======~=~~~~=~=~~~:" | tee -a "$OUTPUT_FILE"
echo " ..,::~:~~~~~~=~~~=~~~~~~~~=~====+======~===~~~~~~~." | tee -a "$OUTPUT_FILE"
echo " ..,:,:~~~~~~=~::~~=~=~~~=~~=~=~=~======~~~==~~~~~~::." | tee -a "$OUTPUT_FILE"
echo " ,,.::~:=~~~~~~~~~~~~=~=~===~~~====+==~=====~~~~~::,." | tee -a "$OUTPUT_FILE"
echo " ,,,,:I++=:~==~=~~~~~~=~:==~=~+~====~=~===~~~~:~::,:" | tee -a "$OUTPUT_FILE"
echo " .,:+++?77+?=~~~~=~~=~=~~=~~+=~+~~+====~=~~~:::::,::," | tee -a "$OUTPUT_FILE"
echo " ..++++?++?II?=~~=~~~=~~~====~===~=====~~~:~::::::::,." | tee -a "$OUTPUT_FILE"
echo " ..=++?++++++???7+~~~~~~~~+~=~=====~~~~~~~~~::::~:::,,.." | tee -a "$OUTPUT_FILE"
echo " .=+++++++++++++++===:~~=~==+~~=~=~~:~~=~:~:::~::::,,.." | tee -a "$OUTPUT_FILE"
echo " .++++++?++++++?++=?~:~~~~===~==~==~~~~~:::::::::,,,..." | tee -a "$OUTPUT_FILE"
echo " ..=?+++++??+++++++===~::~~~~~~=~~~~~~:~~:::::,:,,,,,." | tee -a "$OUTPUT_FILE"
echo " ...=+?+++++++++=====~:,::,~:::~~~~~:~~~~::::~::::,,,,.." | tee -a "$OUTPUT_FILE"
echo " .=+++++++++++===~==::::,::~~,,,::~~~~~~::::::~:,:,,.." | tee -a "$OUTPUT_FILE"
echo " ..++++++++++=+===~,.,,:::,:~~~~~,.,:~:~::::::,::,:,.." | tee -a "$OUTPUT_FILE"
echo " ...++?++++++++=+=~~. ..,,,,,:,~,::~,:::,:,:,~::::,,.." | tee -a "$OUTPUT_FILE"
echo " .++++++++?++====~. ...,,:,~::~=::,::,:,:::,,,,.." | tee -a "$OUTPUT_FILE"
echo ".++?+++++?++++==~.. .,.:,,:::~,:,,,:::::,,,." | tee -a "$OUTPUT_FILE"
echo "++++++?+???==~=. ...,::~~~:,,:,:::,,." | tee -a "$OUTPUT_FILE"
echo "?+++?????+==~. ..,,,,::,:,,,,,." | tee -a "$OUTPUT_FILE"
echo "+?+++??+==~. ..,,,,,,,,." | tee -a "$OUTPUT_FILE"
echo "+I???+==~. ..,,.." | tee -a "$OUTPUT_FILE"
echo "??++==~." | tee -a "$OUTPUT_FILE"
echo "+===~." | tee -a "$OUTPUT_FILE"
echo "+=~." | tee -a "$OUTPUT_FILE"
echo "~" | tee -a "$OUTPUT_FILE"
echo "--------------------------------------------------------------------------------------------------------------------------" | tee -a "$OUTPUT_FILE"
echo "" | tee -a "$OUTPUT_FILE"
# Loop through hitlist
while IFS= read -r hitListWords || [[ -n "$hitListWords" ]]
do
    # A line starting with "#" names a category; any other non-blank line is a word list
    if [ "${hitListWords:0:1}" != "#" ]; then
        if [ -n "$hitListWords" ]; then
            # Parse comma delimited category specific hitlist
            IFS=',' read -ra categoryWords <<< "$hitListWords"
            # Search for occurrences/hits for each hitlist word
            for categoryWord in "${categoryWords[@]}"; do
                echo "---------------------------------------------------" | tee -a "$OUTPUT_FILE"
                echo "$category - \"$categoryWord\"" | tee -a "$OUTPUT_FILE"
                echo "---------------------------------------------------" | tee -a "$OUTPUT_FILE"
                # First pass appends plain hits to the output file, second pass pages colored hits
                find "$SEARCH_DIR" -type f -print0 | xargs -0 grep -HniI -e "$categoryWord" >> "$OUTPUT_FILE"
                find "$SEARCH_DIR" -type f -print0 | xargs -0 grep -HniI --color=always -e "$categoryWord" | more
                echo "" | tee -a "$OUTPUT_FILE"
            done
        fi
    else
        category="$(echo "$hitListWords" | cut -d "#" -f 2)"
    fi
done < "$HITLIST_FILE"
exit $?
}
# Process the options
while [[ "${ARGLIST[0]}" == -* ]]; do
OPTION="${ARGLIST[0]}"
NUM_OPTS=1;
case $OPTION in
--version)
version
exit 0
;;
--help)
help
exit 0
;;
*)
help
exit 1
;;
esac
ARGLIST=("${ARGLIST[@]:$NUM_OPTS}")
done
FILEARG="${ARGLIST[*]}"
process
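The script above always searches with -i. As a sketch of how the option loop could be extended to make that switchable, the function below collects grep switches in an array. parse_args, GREP_OPTS and the --case-sensitive option are all hypothetical names that are not part of luffa.sh; the defaults mirror the script's -HniI.

```shell
# parse_args ARGS...
# Sets GREP_OPTS (array of grep switches) and SEARCH_DIR from the arguments.
# Returns non-zero on an unknown option or a missing directory.
parse_args() {
    GREP_OPTS=(-H -n -I -i)          # defaults: same as -HniI
    SEARCH_DIR=""
    while (( $# > 0 )); do
        case $1 in
            --case-sensitive)        # hypothetical option: drop -i
                GREP_OPTS=(-H -n -I)
                ;;
            --)                      # explicit end of options
                shift
                break
                ;;
            -*)
                echo "ERROR: unknown option: $1" >&2
                return 1
                ;;
            *)                       # first non-option is the directory
                break
                ;;
        esac
        shift
    done
    if (( $# < 1 )); then
        echo "ERROR: Specify directory to be searched." >&2
        return 1
    fi
    SEARCH_DIR=$1
}
```

After parse_args "$@" succeeds, the search line in the loop could become find "$SEARCH_DIR" -type f -print0 | xargs -0 grep "${GREP_OPTS[@]}" -e "$categoryWord", so that a single option decides whether the search is case-sensitive.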