Question

I have a bash script like this:

data_generator_that_never_guits | while read data 
do
 an_expensive_process_with "$data"
done

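The first process continuously generates events (at irregular intervals), which need to be processed as they become available. One problem with this script is that read consumes only a single line of the output; since the processing is very expensive, I want it to consume all the data that is currently available. On the other hand, processing must start immediately when new data becomes available. In a nutshell, I want to do something like this: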

data_generator_that_never_guits | while read_all_available data 
do
 an_expensive_process_with "$data"
done

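Here the command read_all_available would wait if no data is available for consumption, or otherwise copy all the currently available data into the variable. It is perfectly fine if the data does not consist of complete lines. Basically, I am looking for an analog of read that reads the entire pipe buffer instead of a single line from the pipe.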

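For the curious among you, the background of the question is that I have a build script which needs to trigger a rebuild whenever a source file changes. I want to avoid triggering rebuilds too often. Please do not suggest that I use grunt, gulp or other available build systems; they do not work well for my purpose.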

Thanks!

Answer 1

I think I found the solution once I had a better understanding of how subshells work. This script appears to do what I need:

data_generator_that_never_guits | while true
do
 # wait until the next element becomes available; stop at end of input
 read -r LINE || break
 # consume any remaining elements; a small timeout ensures that
 # rapidly fired events are batched together
 while read -r -t 1 LINE; do true; done
 # the data buffer is empty, launch the process
 an_expensive_process
done

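It would be possible to collect all the lines that were read into a single batch, but at this point I do not really care about their contents, so I did not bother to figure that part out :)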
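
A minimal sketch of how the lines could be collected into one batch, assuming bash 4 or later. The names read_batch and BATCH are hypothetical and do not appear in the original script; the 1-second timeout is the same one used above.

```shell
#!/usr/bin/env bash
# read_batch (hypothetical helper): blocks until one line arrives, then keeps
# reading lines until the input has been quiet for 1 second.
# The collected lines are stored in the global array BATCH (also hypothetical).
read_batch() {
  local line
  BATCH=()
  # wait for the first line; return non-zero at end of input
  IFS= read -r line || return 1
  BATCH+=("$line")
  # drain whatever else arrives before the timeout expires
  while IFS= read -r -t 1 line; do
    BATCH+=("$line")
  done
  return 0
}

# Intended use (not executed here):
#   data_generator_that_never_guits | while read_batch
#   do
#     an_expensive_process_with "${BATCH[@]}"
#   done
```

Because read_batch is called directly in the loop condition rather than through a command substitution, the array it fills stays visible in the loop body.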

Added on 25 September 2014

Here is the final subroutine, in case it is useful to someone one day:

flushpipe() {
 local buffer line
 # wait until the next line becomes available; fail at end of input
 IFS= read -r buffer || return 1
 # consume any remaining lines; a small timeout ensures that
 # rapidly fired events are batched together
 while IFS= read -r -t 1 line; do buffer+=$'\n'"$line"; done
 # print the batch, one line per event
 printf '%s\n' "$buffer"
}

To be used like this:

data_generator_that_never_guits | while true
do
 # wait until data becomes available; stop if flushpipe fails
 data=$(flushpipe) || break
 # the pipe has been drained, launch the process
 an_expensive_process_with "$data"
done

Answer 2

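Something like read -N 4096 -t 1 might do the trick, or perhaps read -t 0 with some additional logic. See the Bash reference manual for details. Otherwise, you might have to move from Bash to, for example, Perl.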
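
A rough sketch of the read -N idea, assuming a bash version that supports -N and that keeps partial input in the variable when read times out (bash 4.1 or later should do). The names read_chunk and CHUNK are hypothetical, and the 4096-character limit and 1-second timeout are simply the values suggested above.

```shell
#!/usr/bin/env bash
# read_chunk (hypothetical helper): waits for data, then grabs up to 4096
# characters that arrive within 1 second, ignoring line boundaries.
# The result is stored in the global variable CHUNK (also hypothetical).
read_chunk() {
  local first rest
  # block until a single character is available; fail at end of input
  IFS= read -r -N 1 first || return 1
  # read up to 4095 more characters; on timeout or end of input,
  # whatever was read so far is still left in "rest"
  IFS= read -r -N 4095 -t 1 rest
  CHUNK=$first$rest
}

# Intended use (not executed here):
#   data_generator_that_never_guits | while read_chunk
#   do
#     an_expensive_process_with "$CHUNK"
#   done
```

Unlike a line-based loop, this variant can return data that ends in the middle of a line, which the question explicitly allows.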

Consuming a pipe asynchronously with bash
