Can I cache the output of a command on Linux from the CLI?

Time: 2012-08-10 10:56:31

Tags: bash caching command-line command-line-interface

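I'm looking for an implementation of a 'cacheme' command that 'memoizes' the output (stdout) of whatever is in ARGV. If it has never run the command, it runs it and somehow memorizes the output. If it has already run it, it just copies the output from the cache file (or, even better, replays both output and error to &1 and &2 respectively).

Let's suppose someone wrote this command; it would work like this.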

$ time cacheme sleep 1    # first time it takes one sec
real   0m1.228s
user   0m0.140s
sys    0m0.040s

$ time cacheme sleep 1    # second time it looks for stdout in the cache (dflt expires in 1h)
#DEBUG# Cache version found! (1 minute old)

real   0m0.100s
user   0m0.100s
sys    0m0.040s

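This example is a bit silly because it produces no output. Ideally it would be tested on a script such as sleep-1-and-echo-hello-world.sh.

I created a small script that creates a file in /tmp/ named after a hash of the full command line and the username, but I'm pretty sure something like this already exists.

Are you aware of any such tool?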

7 answers:

Answer 0 (score: 4)

I improved the solution somewhat by adding the expiry age as an optional first argument.

#!/bin/sh
# save as e.g. $HOME/.local/bin/cacheme
# and then chmod u+x $HOME/.local/bin/cacheme
VERBOSE=false
PROG="$(basename "$0")"
DIR="${HOME}/.cache/${PROG}"
mkdir -p "${DIR}"
EXPIRY=600 # default to 10 minutes
# check if first argument is a number, if so use it as expiration (seconds)
[ "$1" -eq "$1" ] 2>/dev/null && EXPIRY=$1 && shift
[ "$VERBOSE" = true ] && echo "Using expiration $EXPIRY seconds"
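CMD="$*"
HASH=$(echo "$CMD" | md5sum | awk '{print $1}')
CACHE="$DIR/$HASH"
# re-run the command unless a cache file exists and is at most EXPIRY seconds old
# (date -r FILE prints the modification time of FILE)
test -f "${CACHE}" && [ $(( $(date +%s) - $(date -r "$CACHE" +%s) )) -le "$EXPIRY" ] || eval "$CMD" > "${CACHE}"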
cat "${CACHE}"
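
Joining the arguments into a single CMD string flattens the argument list, so `cacheme grep "a b" file` and `cacheme grep a b file` end up with the same hash. Below is a minimal sketch of a key function that keeps argument boundaries intact. It assumes md5sum is available, as in the script above, and cache_key is a hypothetical name that is not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper (not part of the script above): derive a cache key
# from the argument list without losing the boundaries between arguments.
cache_key() {
    # Prefix every argument with its length, so that the single argument
    # "a b" and the two arguments "a" "b" feed different input to md5sum.
    for arg in "$@"; do
        printf '%s:%s\n' "${#arg}" "$arg"
    done | md5sum | awk '{print $1}'
}
```

To use it in the script you would probably set HASH=$(cache_key "$@") and run "$@" instead of eval "$CMD", so that the command that is executed matches the key it is stored under.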

Answer 1 (score: 2)

How about this simple shell script (not tested)?

#!/bin/sh

mkdir -p cache

cachefile=cache/cache

for i in "$@"
do
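    # backslash-escape every character so argument boundaries stay unambiguous
    # (note: an argument containing "/" still yields an invalid file name)
    cachefile=${cachefile}_$(printf %s "$i" | sed 's/./\\&/g')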
done

test -f "$cachefile" || "$@" > "$cachefile"
cat "$cachefile"

Answer 2 (score: 1)

The solution I came up with in ruby is this. Does anyone see any optimizations?

#!/usr/bin/env ruby

VER = '1.2'
$time_cache_secs = 3600
$cache_dir = File.expand_path("~/.cacheme")

require 'rubygems'
begin
  require 'filecache'           # gem install filecache
rescue LoadError
  puts 'gem filecache requires installation, sorry. trying to install myself'
  system  'sudo gem install -r filecache'
  puts  'Try re-running the program now.'
  exit 1
end

=begin
  # create a new cache called "my-cache", rooted in /home/simon/caches
  # with an expiry time of 30 seconds, and a file hierarchy three
  # directories deep
=end
def main
  cache = FileCache.new("cache3", $cache_dir, $time_cache_secs, 3)
  cmd = ARGV.join(' ').to_s   # caching on full command, note that quotes are stripped
  cmd = 'echo give me an argument' if cmd.length < 1

  # caches the command and retrieves it
  if cache.get('output' + cmd)
    #deb "Cache found!(for '#{cmd}')"
  else
    #deb "Cache not found! Recalculating and setting for the future"
    cache.set('output' + cmd, `#{cmd}`)
  end
  #deb 'anyway calling the cache now'
  print(cache.get('output' + cmd))
end

main

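Answer 3 (score: 1)

I implemented a simple caching script for bash, because I wanted to speed up plotting from piped shell command in gnuplot. It can be used to cache the output of any command. The cache is used as long as the arguments are the same and the files passed as arguments have not changed. The system takes care of cleaning up, since the cache files live in /tmp.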

#!/bin/bash

# start the cache key with all arguments
KEY="$*"

# append the last-modified time of every argument that is an existing file
for arg in "$@"
do
  if [ -f "$arg" ]
  then
    KEY+=$(date -r "$arg" +\ %s)
  fi
done

# use a hash of the key as the name of the temporary file
FILE="/tmp/command_cache.$(echo -n "$KEY" | md5sum | cut -c -10)"

# use the cached file, or execute the command and cache its output
if [ -f "$FILE" ]
then
  cat "$FILE"
else
  "$@" | tee "$FILE"
fi

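You can name the script cache, set the executable flag and put it in your PATH. Then simply prefix any command with cache to use it.

Answer 4 (score: 1)

I created a memoization utility for Bash, which works exactly the way you describe. It is designed specifically for caching Bash functions, but obviously you can wrap calls to other commands in functions.

It handles a number of edge-case behaviors that simpler caching mechanisms miss. It reports the exit code of the original call, keeps stdout and stderr separate, and preserves any trailing whitespace in the output (a $() command substitution strips trailing newlines).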

Demo:

# Define function normally, then decorate it with bc::cache
$ maybe_sleep() {
  sleep "$@"
  echo "Did I sleep?"
} && bc::cache maybe_sleep

# Initial call invokes the function
$ time maybe_sleep 1
Did I sleep?

real    0m1.047s
user    0m0.000s
sys     0m0.020s

# Subsequent call uses the cache
$ time maybe_sleep 1
Did I sleep?

real    0m0.044s
user    0m0.000s
sys     0m0.010s

# Invocations with different arguments are cached separately
$ time maybe_sleep 2
Did I sleep?

real    0m2.049s
user    0m0.000s
sys     0m0.020s

There is also a benchmark function that shows the overhead of the caching:

$ bc::benchmark maybe_sleep 1
Original:       1.007
Cold Cache:     1.052
Warm Cache:     0.044

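So you can see that the read/write overhead (on my machine, which uses tmpfs) is roughly 1/20th of a second. This benchmark utility can help you decide whether a particular call is worth caching.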
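
The edge cases mentioned in this answer (exit code, separate stdout and stderr, trailing newlines) can also be illustrated with a plain shell function. This is only a sketch under my own assumptions, not how the utility above is implemented; run_cached and CACHE_ROOT are hypothetical names, and md5sum is assumed to be installed.

```shell
#!/bin/sh
# Sketch only: run_cached and CACHE_ROOT are hypothetical names.
CACHE_ROOT="${CACHE_ROOT:-${HOME}/.cache/run_cached}"

run_cached() {
    # One directory per distinct argument list; NUL separators keep
    # the boundaries between arguments unambiguous.
    key=$(printf '%s\0' "$@" | md5sum | awk '{print $1}')
    dir="$CACHE_ROOT/$key"

    # The status file is written last, so its presence marks a complete entry.
    if [ ! -f "$dir/status" ]; then
        mkdir -p "$dir" || return 1
        # Redirect straight into files: $(...) would strip trailing newlines.
        "$@" >"$dir/out" 2>"$dir/err"
        echo "$?" >"$dir/status"
    fi

    # Replay stdout, stderr and the exit code of the original call.
    cat "$dir/out"
    cat "$dir/err" >&2
    return "$(cat "$dir/status")"
}
```

Usage would look like `run_cached ls -l /etc`. Note that the sketch has no expiry and does not preserve the relative order in which the command wrote to stdout and stderr.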

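Answer 5 (score: 0)

There is an implementation here: https://bitbucket.org/sivann/runcached/src It caches the executable path, output, and exit code, and remembers arguments. The expiration time is configurable. It is implemented in bash, C, and python; choose whichever suits you.

Answer 6 (score: 0)

Improved upon the solution from error:

  • Pipes the output into the "tee" command, so it can be viewed in real time as well as stored in the cache.
  • Preserves colors (for example in commands like "ls --color") by using "script --flush --quiet /dev/null --command $CMD".
  • Avoids calling "exec" by using script as well
  • Uses bash and [[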
    #!/usr/bin/env bash

    CMD="$*"
    [[ -z $CMD ]] && echo "usage: EXPIRY=600 cache cmd arg1 ... argN" && exit 1

    # set -e -x

    VERBOSE=false
    PROG="$(basename "$0")"

    EXPIRY=${EXPIRY:-600}  # default to 10 minutes, can be overridden
    EXPIRE_DATE=$(date -Is -d "-$EXPIRY seconds")

    [[ $VERBOSE = true ]] && echo "Using expiration $EXPIRY seconds"

    HASH=$(echo "$CMD" | md5sum | awk '{print $1}')
    CACHEDIR="${HOME}/.cache/${PROG}"
    mkdir -p "${CACHEDIR}"
    CACHEFILE="$CACHEDIR/$HASH"

    if [[ -e $CACHEFILE ]] && [[ $(date -Is -r "$CACHEFILE") > $EXPIRE_DATE ]]; then
        cat "$CACHEFILE"
    else
        script --flush --quiet --return /dev/null --command "$CMD" | tee "$CACHEFILE"
    fi
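
The freshness check above compares two ISO 8601 timestamps as strings. An alternative that some may find easier to reason about is to compare epoch seconds, as Answer 0 does. Here is a minimal sketch, assuming a date command that supports `+%s` and `-r FILE` like the scripts above; is_fresh is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE exists and was last modified
# at most MAX_AGE seconds ago.
is_fresh() {
    file=$1
    max_age=$2
    [ -f "$file" ] || return 1
    now=$(date +%s)
    # date -r FILE prints the modification time of FILE
    mtime=$(date -r "$file" +%s) || return 1
    [ $((now - mtime)) -le "$max_age" ]
}
```

In the script above it could stand in for the two [[ ... ]] tests, for example `if is_fresh "$CACHEFILE" "$EXPIRY"; then cat "$CACHEFILE"; else ...; fi`.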