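Here is the gist of my code. In general, I store my items in Dropbox at:
~/Dropbox/Public/drops/xx.xx.xx/whatever
The date is always 2 characters, 2 characters, and 2 characters, separated by dots. That folder can contain more folders and more files, which is why I do not set a depth when I use `find` and let it scan recursively.
https://gist.github.com/anonymous/ad51dc25290413239f6f
Below is a shortened version of the gist. I don't believe it will run as is, though the full gist should run, assuming you have Dropbox installed and there are files at the path I set up.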
General workflow:
SIZE="+250k" # For `find` this is the value in size I am looking for files to be larger than
# Location where I store the output to `find` to process that file further later on.
TEMP="/tmp/drops-output.txt"
Next I rm the tmp file and touch a new one.
I will then cd into
DEST=/Users/$USER/Dropbox/Public/drops
Perform a quick conditional check to make sure that I am working where I want to be.
With all my values in variables, I could easily slip up and not be working where I
thought I was.
# Conditional check: is the current directory the one I want to be the working directory?
if [ "$(pwd)" = "${DEST}" ]; then
echo -e "Destination and current working directory are equal, this is good!:\n $(pwd)\n"
fi
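A minimal sketch of a stricter variant of the check above, assuming it is acceptable to stop when the directory is missing. `enter_dir` is a hypothetical helper name, and `${HOME}` stands in for `/Users/$USER`; neither comes from the original script. Testing the result of `cd` directly means later commands never run in an unexpected directory.

```shell
#!/bin/sh
# enter_dir is a hypothetical helper, not part of the original script.
# It changes into the directory or reports failure, so later commands
# never run in an unexpected location.
enter_dir() {
    cd "$1" 2>/dev/null || {
        echo "Cannot cd into $1" >&2
        return 1
    }
}

# Assumed layout from the post; ${HOME} stands in for /Users/$USER.
DEST="${HOME}/Dropbox/Public/drops"
if enter_dir "$DEST"; then
    echo "Working in: $(pwd)"
else
    echo "Skipping: $DEST is not available" >&2
fi
```

In a real script the `else` branch would probably `exit 1` instead of continuing.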
The meat of step one is the `find` command
# Use `find` to locate a subset of files that are larger than a certain size
# save that to a temp file and process it. I believe this could all be done in
# one find command with -exec or similar but I can't figure it out
find . -type f -size "${SIZE}" -exec ls -lh {} \; >> "$TEMP"
Inside $TEMP will be a data set that looks like this:
-rw-r--r--@ 1 me staff 61K Dec 28 2009 /Users/me/Dropbox/Public/drops/12.28.09/wor-10e619e1-120407.png
-rw-r--r--@ 1 me staff 230K Dec 30 2009 /Users/me/Dropbox/Public/drops/12.30.09/hijack-loop-d6250496-153355.pdf
-rw-r--r--@ 1 me staff 49K Dec 31 2009 /Users/me/Dropbox/Public/drops/12.31.09/mt-5a819185-180538.png
The trouble is that not all file names are free of spaces, though I have done all I can to make sure variables are quoted
and wrapped in parens, braces, or quotes where applicable.
With the results in /tmp I run:
# Number of results located as a result of the find `command` above
RESULTS=$(wc -l "$TEMP" | awk '{print $1}')
echo -e "Located: [$RESULTS] total files greater than or equal to $SIZE\n"
# With a result set found via `find`, now use awk to print out the sorted list of file
# sizes and paths.
echo -e "SIZE DATE FILE PATH"
#awk '{print "["$5"] ", $9, $10}' < "$TEMP" | sort -n
awk '{for(i=5;i<=NF;i++) {printf $i " "} ; printf "\n"}' "$TEMP" | sort -n
With the changes to awk from how I had it originally, my result now looks like this:
751K Oct 21 19:00 ./10.21.14/netflix-67-190039.png
760K Sep 14 19:07 ./01.02.15/logos/RCA_old_logo.jpg
797K Aug 21 03:25 ./08.21.14/girl-88-032514.zip
916K Sep 11 21:47 ./09.11.14/small-shot-4d-214727.png
I want it to look like this:
SIZE FILE PATH
========================================
751K ./10.21.14/netflix-67-190039.png
760K ./01.02.15/logos/RCA_old_logo.jpg
797K ./08.21.14/girl-88-032514.zip
916K ./09.11.14/small-shot-4d-214727.png
# All Done
if [ "$?" -ne "0" ]; then
echo "find of drop files larger than $SIZE completed without errors.\n"
exit 1
fi
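One possible way to get from the current output to the desired one, sketched under the assumption that every line in `$TEMP` has the `ls -lh` shape shown above (size in field 5, path starting at field 9). `format_listing` is a hypothetical helper name, and the sample lines are taken from the listings in this post. The helper prints the size, skips the three date fields, and joins the remaining fields back together so that names with spaces survive.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): keep the size column ($5)
# and everything from field 9 onward, which is where the path starts in
# `ls -lh` output. Runs of spaces inside a name collapse to a single space.
format_listing() {
    awk 'NF >= 9 {
        printf "%s ", $5
        for (i = 9; i <= NF; i++)
            printf "%s%s", $i, (i < NF ? " " : "\n")
    }'
}

# Sample lines in the same shape as the ls -lh output stored in $TEMP.
printf '%-6s%s\n' "SIZE" "FILE PATH"
echo "========================================"
format_listing <<'EOF' | sort -n
-rw-r--r--@ 1 me staff 797K Aug 21 03:25 ./08.21.14/girl-88-032514.zip
-rw-r--r--@ 1 me staff 751K Oct 21 19:00 ./10.21.14/netflix-67-190039.png
-rw-r--r--@ 1 me staff 1.0M Nov 26 2014 ./11.26.14/Bruna Legal Name.pdf
EOF
```

Note that `sort -n` only compares the leading number and ignores the K or M suffix, so a 1.0M file sorts ahead of a 751K one. Against the real data the call would be `format_listing < "$TEMP" | sort -n`.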
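The original post, written before I got some new information, is below. Based on that new information I tried some new strategies, which gave the script and information above.
I have a simple script on Mac OS X that runs `find` on a directory and finds all items of type file whose size is over +SIZE, then appends them to a file via `>>`. From there I have a file that basically contains an `ls -la` listing, so I use awk to get the file size and file name with this command: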
# With a result set found via `find`, now use awk to print out the sorted list of file
# sizes and paths.
echo -e "SIZE FILE PATH"
awk '{print "["$5"] ", $9, $10}' < "$TEMP" | sort -n
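Everything works as I want, but I get some file name truncation from the code above. The whole file is about 30 lines, and I have pinned the problem down to this line. I think that throwing in a different internal field separator would fix it. I could use \t, as there can't be a \t in Mac OS X file names.
I thought it was just a quoting problem, but I can't seem to see where, if that is the case. Here is a sample of the returned data; typically I get about 50 results. The first file in this sample shows the file name truncation: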
[1.0M] ./11.26.14/Bruna Legal
[1.4M] ./12.22.14/card-88-082636.jpg
[1.6M] ./12.22.14/thrasher-8c-082637.jpg
[11M] ./01.20.15/td-6e-225516.mp3
Bruna Legal is "Bruna Legal Name.pdf" on the file system.
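The tab-separator idea mentioned above could be sketched roughly as follows, skipping `ls` entirely. This is only an illustration: `list_large_files` is a hypothetical helper name, the demo file is created in a scratch directory, and sizes are taken from `wc -c` in bytes. Because awk splits on tabs only, the path keeps its spaces. Names containing a tab or a newline would still break it.

```shell
#!/bin/sh
# list_large_files DIR SIZE: hypothetical helper, not from the original script.
# Prints "<bytes><TAB><path>" for each regular file matching the -size test.
list_large_files() {
    find "$1" -type f -size "$2" -exec sh -c '
        for f do
            # wc -c pads its output on some systems; $(( )) strips the padding.
            printf "%s\t%s\n" "$(( $(wc -c < "$f") ))" "$f"
        done' sh {} +
}

# Demo in a scratch directory with a file name that contains spaces.
demo=$(mktemp -d)
mkdir "$demo/11.26.14"
dd if=/dev/zero of="$demo/11.26.14/Bruna Legal Name.pdf" bs=1024 count=300 2>/dev/null

# Split on tabs only, so the path in $2 keeps its spaces.
list_large_files "$demo" +250k | sort -n |
    awk -F '\t' '{ printf "%.0fK\t%s\n", $1 / 1024, $2 }'

rm -rf "$demo"
```

Sorting happens on the raw byte counts before the conversion to K, so `sort -n` orders the files correctly regardless of unit.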
Answer 0 (score: 2)
You can avoid parsing the output of the `ls` command and do the whole job with `find` and its `printf` action, for example:
find /tmp -type f -maxdepth 1 -size +4k 2>/dev/null -printf "%kKB %f\n" |
sort -nrk1,1
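In my example it outputs every file larger than 4 kilobytes. The problem is that the `find` command cannot print the size formatted in MB. Also, the numeric sort does not work for me with square brackets around the numbers, so I omitted them. In my test it yields: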
140KB +~JF7115171557203024470.tmp
140KB +~JF3757415404286641313.tmp
120KB +~JF8126196619419441256.tmp
120KB +~JF7746650828107924225.tmp
120KB +~JF7068968012809375252.tmp
120KB +~JF6524754220513582381.tmp
120KB +~JF5532731202854554147.tmp
120KB +~JF4394954996081723171.tmp
24KB +~JF8516467789156825793.tmp
24KB +~JF3941252532304626610.tmp
24KB +~JF2329724875703278852.tmp
16KB 578829321_2015-01-23_1708257780.pdf
12KB 575998801_2015-01-16_1708257780-1.pdf
8KB adb.log
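As a rough illustration of working around the MB limitation noted above, byte counts can be post-processed with awk so that each line gets its own unit. `humanize` is a hypothetical helper name, the input format `<bytes> <name>` is an assumption, and the two sample lines are made up for the demo.

```shell
#!/bin/sh
# humanize is a hypothetical helper: it reads "<bytes> <name>" lines and
# rewrites the first column as K or M using 1024-based units.
humanize() {
    awk '{
        bytes = $1
        # Remove the leading number and the blanks after it; the rest is the name.
        sub(/^[0-9]+[ \t]+/, "")
        if (bytes >= 1048576)
            printf "%.1fM %s\n", bytes / 1048576, $0
        else
            printf "%.0fK %s\n", bytes / 1024, $0
    }'
}

printf '%s\n' '1468006 card-88-082636.jpg' '8192 adb.log' | humanize
```

Any numeric sort should run on the byte counts before `humanize`, since mixed K and M values no longer sort correctly with `sort -n`.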
EDIT: I noticed that `%k` is not accurate enough, so you can use `%s` to print the size in bytes and use `awk` to convert it to KB or MB, like:
find /tmp -type f -maxdepth 1 -size +4k 2>/dev/null -printf "%sKB %f\n" |
sort -nrk1,1 |
awk '{ $1 = sprintf( "%.2f", $1 / 1024) } { print }'
It yields: