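At the start of my shell script I have a for loop that scans a folder to see whether it contains any files and, if so, processes each of them. Processing the files takes a while (a few minutes, say), depending on how many files are in the folder.
The problem: while each file is being processed, new files may arrive in the folder, but my tests show that the new files are not picked up and processed. So, is there a way to detect new files that show up while the for loop is running?
I have thought about checking the folder for new files periodically, but I don't want to reprocess existing files and, more importantly, since this is only the beginning of the script, I don't want the for loop to be repeated too many times. Thanks.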
for aFile in "$mydir"/*
do
    # some tasks that may take 30 secs or so to finish for each file
    :   # no-op placeholder so the loop body is valid shell
done
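A minimal sketch of the behaviour described above, assuming bash or a POSIX sh: the pattern in a for loop is expanded once, before the first iteration, so files created while the loop runs are not part of the list. The temporary directory and the file names `one`, `two` and `three` are made up for the demo.

```shell
#!/bin/sh
# Demo only: shows that "$dir"/* is expanded once, before the loop body runs.
demo_dir=$(mktemp -d)
touch "$demo_dir/one" "$demo_dir/two"

visited=""
for f in "$demo_dir"/*; do
    visited="$visited ${f##*/}"   # remember the file name without the directory
    touch "$demo_dir/three"       # a file that arrives while the loop is running
done

echo "visited:$visited"           # prints "visited: one two"

# Count what is actually in the directory now (re-expands the pattern).
set -- "$demo_dir"/*
on_disk=$#
echo "files on disk: $on_disk"    # prints "files on disk: 3"

rm -rf "$demo_dir"
```

The file `three` exists on disk but was never visited, which matches what the tests in the question showed.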
Answer 0 (score: 1)
Something like this:
#!/bin/bash -xe
# (bash is required: associative arrays and [[ ]] are not available in plain sh)
# create some dummy files to start with
touch filea
touch fileb
analyzeFile() {
    echo "analyzing $1"
    sleep 10 # dummy for the real stuff you need to do
}
declare stillGettingSomething
declare -A alreadyAnalyzed
stillGettingSomething=true
# compare against the string "true": a bare [ false ] also succeeds, because the string is non-empty
while [ "$stillGettingSomething" = true ]; do
    stillGettingSomething=false # prevent endless looping
    for i in ./file*; do
        # idea: see also http://superuser.com/questions/195598/test-if-element-is-in-array-in-bash
        if [[ ${alreadyAnalyzed[$i]} ]]; then
            echo "$i was already analyzed before; skipping it immediately"
            continue
        fi
        alreadyAnalyzed[$i]=true # Memorize the file which we visited
        stillGettingSomething=true # We found some new file; we have to run another scan iteration later on
        analyzeFile "$i"
        # create some new files for the purpose of demonstration
        # (every analyzed file spawns another one, so this demo runs until interrupted)
        echo "creating file $i-latecreate"
        touch "$i-latecreate"
    done
done
The output of this script is:
+ declare stillGettingSomething
+ declare -A alreadyAnalyzed
+ stillGettingSomething=true
+ '[' true ']'
+ stillGettingSomething=false
+ for i in './file*'
+ [[ -n '' ]]
+ alreadyAnalyzed[$i]=true
+ stillGettingSomething=true
+ analyzeFile ./filea
+ echo 'analyzing ./filea'
analyzing ./filea
+ sleep 10
+ echo 'creating file ./filea-latecreate'
creating file ./filea-latecreate
+ touch ./filea-latecreate
+ for i in './file*'
+ [[ -n '' ]]
+ alreadyAnalyzed[$i]=true
+ stillGettingSomething=true
+ analyzeFile ./fileb
+ echo 'analyzing ./fileb'
analyzing ./fileb
+ sleep 10
+ echo 'creating file ./fileb-latecreate'
creating file ./fileb-latecreate
+ touch ./fileb-latecreate
+ '[' true ']'
+ stillGettingSomething=false
+ for i in './file*'
+ [[ -n true ]]
+ echo './filea was already analyzed before; skipping it immediately'
./filea was already analyzed before; skipping it immediately
+ continue
+ for i in './file*'
+ [[ -n '' ]]
+ alreadyAnalyzed[$i]=true
+ stillGettingSomething=true
+ analyzeFile ./filea-latecreate
+ echo 'analyzing ./filea-latecreate'
analyzing ./filea-latecreate
+ sleep 10
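The idea behind it is to use an associative array that remembers which files have already been processed. If a file has already been processed, it is skipped the next time the scan comes across it. The scan is repeated as long as a pass finds at least one new file.
Here is a cleaned-up variant of the code above, with the demonstration-only parts trimmed, that tries to stay as close as possible to the original requirement.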
#!/bin/bash
# (bash is required: associative arrays and [[ ]] are not available in plain sh)
analyzeFile() {
    echo "analyzing $1"
    sleep 10 # dummy for the real stuff you need to do
}
declare stillGettingSomething
declare -A alreadyAnalyzed
stillGettingSomething=true
# compare against the string "true": a bare [ false ] also succeeds, because the string is non-empty
while [ "$stillGettingSomething" = true ]; do
    stillGettingSomething=false # prevent endless looping
    for i in "$mydir"/*; do
        [ -e "$i" ] || continue # empty directory: the pattern is left unexpanded
        if [[ ${alreadyAnalyzed[$i]} ]]; then
            echo "$i was already analyzed before; skipping it immediately"
            continue
        fi
        alreadyAnalyzed[$i]=true # Memorize the file which we visited
        stillGettingSomething=true # We found some new file; we have to run another scan iteration later on
        analyzeFile "$i"
    done
done
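The bookkeeping idea can also be tried out in isolation. The following is only a sketch under stated assumptions: it needs bash 4 or later for associative arrays, and `process_until_quiet`, `handle_file`, `processed_log` and the file names are hypothetical names invented for the demo. `handle_file` simulates a file arriving while `a.txt` is being processed.

```shell
#!/bin/bash
# Sketch: rescan a directory until a full pass finds nothing new.
# All names below are made up for illustration.

declare -a processed_log=()   # order in which files were handled

handle_file() {
    # Stand-in for the real per-file work.
    processed_log+=("${1##*/}")
    # Simulate a late arrival while the first file is being processed.
    if [ "${1##*/}" = "a.txt" ]; then
        touch "${1%/*}/late.txt"
    fi
}

process_until_quiet() {
    local dir=$1 f found_new=true
    local -A seen=()
    while [ "$found_new" = true ]; do
        found_new=false
        for f in "$dir"/*; do              # the pattern is re-expanded on every pass
            [ -f "$f" ] || continue        # skips non-files and the literal pattern of an empty dir
            [[ -n ${seen[$f]:-} ]] && continue
            seen[$f]=1
            found_new=true
            handle_file "$f"
        done
    done
}

demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt"
process_until_quiet "$demo_dir"
printf '%s\n' "${processed_log[@]}"        # prints a.txt, b.txt, late.txt (one per line)
rm -rf "$demo_dir"
```

Each file is handled exactly once, and the loop stops after the first pass that finds nothing new, so the number of extra scans stays small when files stop arriving.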
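Answer 1 (score: 1)
This is an interesting problem, and there are many ways to solve it. One approach is to somehow keep track of which files are done, and then process the first unfinished file on each loop iteration, e.g.: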
cd "$mydir" || exit 1
# make a donedir to put placeholder dummy files (-p: no error if it already exists)
mkdir -p donedir
while true; do
    # find first file with no corresponding dummy file in donedir
    # (assumes file names without quotes, newlines or other shell metacharacters)
    newfile=$(find * -maxdepth 0 -type f |
        sed 's|.*|[ ! -f "donedir/&" ] \&\& echo "&"|' |
        sh | head -n1)
    # break out of the loop if there aren't any
    [ "$newfile" = "" ] && break
    # do your thing with $newfile...
    # record that you're done with $newfile
    touch "donedir/$newfile"
done
A more efficient strategy is to move each file into donedir once it is done:
cd "$mydir" || exit 1
mkdir -p donedir
while true; do
    # find first file
    newfile=$(find * -maxdepth 0 -type f | head -n1)
    # break out of the loop if there aren't any
    [ "$newfile" = "" ] && break
    # do your thing with $newfile...
    # done with $newfile...
    mv -- "$newfile" donedir/
done
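It would also be possible to keep track of which files are done with, e.g., an associative array as EagleRainbow suggested, but the downsides of that approach are 1. unnecessary complexity, and 2. the record of which files are done isn't automatically preserved across different runs of the script.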
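To illustrate the point about state surviving across runs, here is a rough sketch of the move-based strategy wrapped in a function and run twice. It uses a plain glob instead of `find`, and `drain_dir`, `handle_file`, `handled` and the file names are hypothetical names chosen for the demo, not part of the original answer.

```shell
#!/bin/sh
# Sketch: process the first remaining file, move it to donedir, repeat.
# All names below are made up for illustration.

handled=""   # names of the files that were processed, in order

handle_file() {
    # Stand-in for the real per-file work.
    handled="$handled ${1##*/}"
    # Simulate a late arrival while the first file is being processed.
    if [ "${1##*/}" = "a.txt" ]; then
        touch "${1%/*}/late.txt"
    fi
}

drain_dir() {
    dir=$1
    mkdir -p "$dir/donedir"
    while true; do
        next=""
        for f in "$dir"/*; do          # re-expanded on every pass, so late files are seen
            if [ -f "$f" ]; then next=$f; break; fi
        done
        [ -n "$next" ] || break        # nothing left to do
        handle_file "$next"
        mv -- "$next" "$dir/donedir/"  # the move itself is the "done" record
    done
}

demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt"
drain_dir "$demo_dir"
drain_dir "$demo_dir"                  # second run: finds nothing and processes nothing
echo "handled:$handled"                # prints "handled: a.txt b.txt late.txt"

set -- "$demo_dir"/donedir/*
done_count=$#                          # number of files now in donedir
rm -rf "$demo_dir"
```

Because the record of finished work is the file's location on disk, a second invocation (or a later run of the whole script) skips everything already moved without any extra bookkeeping. One caveat of this sketch: a new file with the same name as an already processed one would overwrite the copy in donedir.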