Parsing file names to copy files that match my criteria

Date: 2014-07-14 17:49:27

Tags: regex string bash

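I have files that need to be copied and deleted on different days and at different times, and the criteria are part of the file name. I am thinking of using Bash with a regular expression built from several variables, and then simply calling mv. But perhaps some kind of loop that parses the file names is a better idea.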

Say I have a file named *TuesdayThursdayMonday_1800-1900.txt*.

Now let's say $dayofweek is Monday.

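I want the criteria to be:

*$dayofweek* must be present before the _.

The current time must be greater than the time to the left of the dash (1800), and the current time must be less than the time to the right of the dash (1900).

If all of this is true, run mv on the file.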

1 answer:

Answer 0 (score: 3)

# Function checkfilename:
#   Usage: checkfilename filename dayofweek [time]
#     filename format: dayname..._timestart-timeend.extension
#     (Underscores can optionally appear between the daynames.)
#   Checks if filename contains dayofweek before the (last) underscore
#   and that time is within the time range after the (last) underscore.
#   If time is not given, the current time is used.
#   Code notes:
#     ${var#patt} Removes patt from beginning of $var.
#     ${var%patt} Removes patt from end of $var.
#     10#num interprets num as decimal even if it begins with a 0.

checkfilename() {
  local file day time days days2 times tstart tend

  file="$1"  # filename
  day="$2"   # day of week

  # Check if the first part of the filename contains day.
  days=${file%_*} # just the days
  days2=${days/$day/} # Remove day from the days.
  # If days == days2 then days didn't contain day; return failure.
  if [ "$days" == "$days2" ]; then return 1; fi

  # Get time from 3rd parameter or from date command
  if (($# >= 3)); then time=10#"$3"
  else time=10#$(date +%H%M); fi  # get time in HHMM format

  times=${file##*_}; times=${times%.*}   # just the times
  tstart=10#${times%-*}; tend=10#${times#*-}

  # If second time is less than first time, add 2400
  ((tend < tstart)) && ((tend+=2400))
  # If current time is less than first time, add 2400
  ((time < tstart)) && ((time+=2400))

  # Check if time is between tstart and tend; return result.
  ((tstart <= time && time <= tend))
  return $?
}

file="TuesdayThursdayMonday_2300-0018.txt"
dayofweek="Thursday"
checkfilename "$file" "$dayofweek" 0005 && echo yep
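
Since the goal in the question is to mv the files that pass the check, the test can be wrapped in a loop over a directory. The sketch below is one possible way to do that, not part of the original answer: `move_matching` and its arguments are hypothetical names, and the condensed `checkfilename` is a compact restatement of the function above, included only so the snippet runs by itself.

```bash
#!/usr/bin/env bash

# Condensed restatement of checkfilename (same checks as the function above).
checkfilename() {
  local file=$1 day=$2 time days times tstart tend
  days=${file%_*}                        # part before the last underscore
  [[ $days == *"$day"* ]] || return 1    # day name must appear in it
  if (($# >= 3)); then time=$((10#$3)); else time=$((10#$(date +%H%M))); fi
  times=${file##*_}; times=${times%.*}   # e.g. 1800-1900
  tstart=$((10#${times%-*})); tend=$((10#${times#*-}))
  ((tend < tstart)) && ((tend += 2400))  # range crosses midnight
  ((time < tstart)) && ((time += 2400))
  ((tstart <= time && time <= tend))
}

# move_matching SRC DEST DAY [TIME]   (hypothetical helper)
# Moves every file in SRC whose name passes checkfilename into DEST.
move_matching() {
  local src=$1 dest=$2 day=$3 time=${4:-$(date +%H%M)} path
  for path in "$src"/*_*-*.*; do
    [ -e "$path" ] || continue           # the glob matched nothing
    if checkfilename "${path##*/}" "$day" "$time"; then
      mv -- "$path" "$dest"/
    fi
  done
}
```

A call such as `move_matching ./incoming ./active "$(date +%A)"` (both directory names are placeholders) would use today's day name and the current time. Note that `date +%A` prints the day name in the current locale, so it only lines up with English file names when the locale is English.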

If the file name also has a prefix that needs to be extracted, it can be done like this:

file="1A_Monday_1800-1900.mp4"

ext=${file##*.}           # remove from front longest  string matching *.
file=${file%.*}           # remove from back  shortest string matching .*
prefix=${file%%_*}        # remove from back  longest  string matching _*
days=${file#*_}           # remove from front shortest string matching *_
days=${days%%_*}          # remove from back  longest  string matching _*
times=${file##*_}         # remove from front longest  string matching *_

echo $file
echo $ext
echo $prefix
echo $days
echo $times

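Note that in these matching patterns, "*" matches any number of any characters, "." matches a literal period, and "_" matches a literal underscore. The other pattern characters are "?", which matches any single character, [abcd], which matches any one of the listed characters, and [^abcd] (or [!abcd]), which matches any single character except the listed ones.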
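
To see these pattern characters in action, here is a small sketch that tests strings against glob patterns with `case`. The helper name `matches` is made up for this illustration and does not appear in the answer.

```bash
# matches STRING PATTERN: succeed if STRING matches the glob PATTERN.
# (Hypothetical helper, for illustration only.)
matches() {
  case $1 in
    $2) return 0 ;;   # $2 is left unquoted so it is treated as a pattern
    *)  return 1 ;;
  esac
}

matches "Monday_1800-1900.txt" "*_*.txt"   && echo "star matches any run of characters"
matches "1A_Monday"            "?A_*"      && echo "question mark matches exactly one character"
matches "b.txt"                "[abcd].*"  && echo "bracket list matches one listed character"
matches "x.txt"                "[!abcd].*" && echo "negated list matches one unlisted character"
matches "Monday"               "*_*"       || echo "no underscore, so no match"
```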

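${var#patt} expands to $var with the shortest match of patt removed from the front.
${var##patt} expands to $var with the longest match of patt removed from the front.
${var%patt} expands to $var with the shortest match of patt removed from the end.
${var%%patt} expands to $var with the longest match of patt removed from the end.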
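
The difference between the shortest and longest match is easiest to see when the pattern can match in more than one place. A short sketch, using a made-up sample name with two underscores and two periods:

```bash
var="1A_Monday_1800-1900.tar.gz"   # made-up sample name

echo "${var#*_}"    # shortest *_ removed from front -> Monday_1800-1900.tar.gz
echo "${var##*_}"   # longest  *_ removed from front -> 1800-1900.tar.gz
echo "${var%.*}"    # shortest .* removed from end   -> 1A_Monday_1800-1900.tar
echo "${var%%.*}"   # longest  .* removed from end   -> 1A_Monday_1800-1900
```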

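A completely different approach uses the IFS (internal field separator) shell variable instead of parameter expansion, splitting the name into an array of fields at the underscores and periods.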
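
A minimal sketch of what IFS-based splitting could look like, assuming the name has exactly the shape prefix_days_times.ext. The array name `fields` is an assumption made for this illustration; a name with extra underscores or periods would shift the indexes.

```bash
file="1A_Monday_1800-1900.mp4"

# Setting IFS only for this one read command leaves the shell's IFS untouched.
IFS='_.' read -r -a fields <<< "$file"

prefix=${fields[0]}   # 1A
days=${fields[1]}     # Monday
times=${fields[2]}    # 1800-1900
ext=${fields[3]}      # mp4

printf '%s\n' "$prefix" "$days" "$times" "$ext"
```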
