Bash script to sort millions of files by year is too slow

Time: 2019-01-18 01:09:55

Tags: bash cygwin

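I have a folder containing about 1.5 million small files. There are also duplicates of them in the parent directories, so it is a bit of a mess.

All of the files have a date in their filename.

I am trying to sort them into folders by year, two levels up from the current directory.

This is what I have so far, but it only handles about 3 files per second. Is there a way to do the same thing faster? I have SAS disks, 32 GB of RAM and a 3.2 GHz Xeon, running Windows 2012 R2.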

#!/bin/bash

for f in * ; do
    if [[ $f == *_D????98* ]]
    then
    if ! [[ -e ../../1998/$f ]]
    then
            mv $f ../../1998/$f
    fi
    elif [[ $f == *_D????99* ]]
    then
    if ! [[ -e ../../1999/$f ]]
    then
            mv $f ../../1999/$f
    fi
    elif [[ $f == *_D????00* ]]
    then
    if ! [[ -e ../../2000/$f ]]
    then
            mv $f ../../2000/$f
    fi
    elif [[ $f == *_D????01* ]]
    then
    if ! [[ -e ../../2001/$f ]]
    then
            mv $f ../../2001/$f
    fi
    elif [[ $f == *_D????02* ]]
    then
    if ! [[ -e ../../2002/$f ]]
    then
            mv $f ../../2002/$f
    fi
    elif [[ $f == *_D????03* ]]
    then
    if ! [[ -e ../../2003/$f ]]
    then
            mv $f ../../2003/$f
    fi
    elif [[ $f == *_D????04* ]]
    then
    if ! [[ -e ../../2004/$f ]]
    then
            mv $f ../../2004/$f
    fi
    elif [[ $f == *_D????05* ]]
    then
    if ! [[ -e ../../2005/$f ]]
    then
            mv $f ../../2005/$f
    fi
    elif [[ $f == *_D????06* ]]
    then
    if ! [[ -e ../../2006/$f ]]
    then
            mv $f ../../2006/$f
    fi
    elif [[ $f == *_D????07* ]]
    then
    if ! [[ -e ../../2007/$f ]]
    then
            mv $f ../../2007/$f
    fi
    elif [[ $f == *_D????08* ]]
    then
    if ! [[ -e ../../2008/$f ]]
    then
            mv $f ../../2008/$f
    fi
    elif [[ $f == *_D????09* ]]
    then
    if ! [[ -e ../../2009/$f ]]
    then
            mv $f ../../2009/$f
    fi
    elif [[ $f == *_D????10* ]]
    then
    if ! [[ -e ../../2010/$f ]]
    then
            mv $f ../../2010/$f
    fi
    elif [[ $f == *_D????11* ]]
    then
    if ! [[ -e ../../2011/$f ]]
    then
            mv $f ../../2011/$f
    fi
    elif [[ $f == *_D????12* ]]
    then
    if ! [[ -e ../../2012/$f ]]
    then
            mv $f ../../2012/$f
    fi
    elif [[ $f == *_D????13* ]]
    then
    if ! [[ -e ../../2013/$f ]]
    then
            mv $f ../../2013/$f
    fi
    elif [[ $f == *_D????14* ]]
    then
    if ! [[ -e ../../2014/$f ]]
    then
            mv $f ../../2014/$f
    fi
    elif [[ $f == *_D????15* ]]
    then
    if ! [[ -e ../../2015/$f ]]
    then
            mv $f ../../2015/$f
    fi
    elif [[ $f == *_D????16* ]]
    then
    if ! [[ -e ../../2016/$f ]]
    then
            mv $f ../../2016/$f
    fi
    elif [[ $f == *_D????17* ]]
    then
    if ! [[ -e ../../2017/$f ]]
    then
            mv $f ../../2017/$f
    fi
    elif [[ $f == *_D????18* ]]
    then
    if ! [[ -e ../../2018/$f ]]
    then
            mv $f ../../2018/$f
    fi
    fi
done

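1 answer:

Answer 0 (score: 0)

There seem to be several bottlenecks. The fragment for f in * builds the list of all files at once, which consumes some memory. The posted code could be improved somewhat, but we should not expect a major change, because bash is not a time-efficient language. I suggest considering another language. If Perl is available, try the following: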

perl -e '
opendir(D, ".") or die;
while ($f = readdir(D)) {                   # it reads file by file
    next if $f eq "." or $f eq "..";        # skip parent dir and current dir
    if ( -f $f && $f =~ /_D.{4}(\d{2})/) {  # find the matching file
        $yy = $1;                           # extract year
        $year = $yy + 1900;
        if ($yy < 70) {
            $year += 100;
        }
        if (-d "../../$year" && ! -e "../../$year/$f") {
            rename $f, "../../$year/$f";    # move to the desired destination
        }
    }
}'

In my benchmark environment it ran 50 to 100 times faster than the posted script. Hope this helps.
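
If you have to stay in bash, here is a rough sketch of the kind of cleanup the posted script allows: the year is taken from the filename with a regex instead of a 21-branch if/elif chain, using the same windowing as the Perl code (two-digit years below 70 are treated as 20xx). The function name sort_by_year and its dest_root argument are my own, not something from the question, and I am assuming the year folders already exist. It still starts one mv process per file, which I would expect to remain the main cost under Cygwin, so do not expect it to come close to the Perl version.

```bash
#!/bin/bash
# Sketch only: sort_by_year and dest_root are hypothetical names.
# Assumes the destination year folders (e.g. ../../1998) already exist.

sort_by_year() {
    local dest_root=${1:-../..}   # where the year folders live
    local f yy year
    for f in *; do
        # regular files whose name has "_D", four characters, then a 2-digit year
        [[ -f $f && $f =~ _D.{4}([0-9]{2}) ]] || continue
        yy=${BASH_REMATCH[1]}
        # 10# forces base 10 so that "08" and "09" are not read as octal
        if (( 10#$yy < 70 )); then
            year=$(( 2000 + 10#$yy ))
        else
            year=$(( 1900 + 10#$yy ))
        fi
        # skip if there is no folder for that year
        [[ -d $dest_root/$year ]] || continue
        # leave duplicates where they are, as the posted script does
        [[ -e $dest_root/$year/$f ]] || mv -- "$f" "$dest_root/$year/$f"
    done
}
```

To use it, cd into the folder with the files, source the script, and call sort_by_year ../.. (or pass another path if the year folders live somewhere else). Unlike the posted script it quotes the filenames, so names containing spaces are moved correctly.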