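I'd like some help writing a loop that takes my .tar.gz files, extracts each one, and searches the files inside (extension .tlg) with grep -a, appending the results to output.text.
In output.text I need the matching data, along with the name of the file and the parent tar it came from.
Once an archive has been searched, I want to delete the extracted files and move on to the next tar file, until all the tars have been checked.
I can't untar everything at once because I don't have the disk space for it. Can anyone help?
Thanks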
Answer 0: (score: 0)
You can loop over the tars, extract them, and then grep them; something like this should work:
match="somestring"
mkdir -p out
for i in *.tar.gz; do
    mkdir -p "out/${i}"            # create outdir
    tar -C "out/${i}" -xf "${i}"   # extract to sub-dir with same name as tar;
                                   # this will show up in grep output
    # run grep from inside out/ so each hit starts with the tar's name
    (cd out && grep -r -e "${match}" "${i}") >> output.text
    rm -rf "out/${i}"              # delete untarred files
done
Be careful, because the contents of the $i variable are passed to rm -rf, which is capable of deleting things you did not intend.
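A more defensive variant of the loop above could look like the following sketch. It assumes GNU tar (for --wildcards and --no-anchored), and the function name grep_tlgs_extract and its argument order are made up for illustration. It extracts only the .tlg members into a scratch directory created by mktemp, so rm -rf only ever sees a path the script itself created, and it adds the -a flag the question asked for.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU tar. The function name and its argument
# order are hypothetical, not part of the answer above.
grep_tlgs_extract() {
    local pattern=$1 output=$2 tgz tmp line
    shift 2
    for tgz in "$@"; do
        # Fresh scratch directory per archive; rm -rf never sees user input.
        tmp=$(mktemp -d) || return 1
        # Extract only the .tlg members to keep disk usage low.
        if tar -C "$tmp" -xzf "$tgz" --wildcards --no-anchored '*.tlg'; then
            # -a: treat binary files as text; -r: recurse; -n: line numbers.
            (cd "$tmp" && grep -ran -e "$pattern" .) |
            while IFS= read -r line; do
                # Drop the leading "./" and prefix the parent tar's name.
                printf '%s:%s\n' "$tgz" "${line#./}"
            done >> "$output"
        fi
        rm -rf -- "$tmp"
    done
}
```

It could be called as grep_tlgs_extract somestring output.text *.tar.gz. Each line written to output.text then has the form parent-tar:member:line-number:matching-text.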
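Answer 1: (score: 0)
To avoid creating temporary files, you can use GNU tar's --to-stdout option.
The code below takes care to handle spaces and other characters in paths that might confuse the shell: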
#! /usr/bin/perl

use warnings;
use strict;

sub usage { "Usage: $0 pattern tar-gz-file ..\n" }

sub output_from {
  my($cmd,@args) = @_;
  my $pid = open my $fh, "-|";   # fork; the parent reads the child's STDOUT
  warn("$0: fork: $!"), return unless defined $pid;
  if ($pid) {
    my @lines = <$fh>;
    close $fh or warn "$0: $cmd @args exited " . ($? >> 8);
    wantarray ? @lines : join "" => @lines;
  }
  else {
    # child: exec with a list, so no shell ever parses the arguments
    exec $cmd, @args or die "$0: exec $cmd @args: $!\n";
  }
}

die usage unless @ARGV >= 2;

my $pattern = shift;
foreach my $tgz (@ARGV) {
  # table of contents of the archive, one member name per element
  chomp(my @toc = output_from "tar", "-ztf", $tgz);
  foreach my $tlg (grep /\.tlg\z/, @toc) {
    my $line = 0;
    # stream one member to stdout instead of extracting it to disk
    for (output_from "tar", "--to-stdout", "-zxf", $tgz, $tlg) {
      ++$line;
      print "$tlg:$line: $_" if /$pattern/o;
    }
  }
}
Sample runs:
$ ./grep-tlgs hello tlgs.tar.gz
tlgs/another.tlg:2: hello
tlgs/file1.tlg:2: hello
tlgs/file1.tlg:3: hello
tlgs/third.tlg:1: hello

$ ./grep-tlgs ^ tlgs.tar.gz
tlgs/another.tlg:1: blah blah
tlgs/another.tlg:2: hello
tlgs/another.tlg:3: howdy
tlgs/file1.tlg:1: whoah
tlgs/file1.tlg:2: hello
tlgs/file1.tlg:3: hello
tlgs/file1.tlg:4: good-bye
tlgs/third.tlg:1: hello
tlgs/third.tlg:2: howdy

$ ./grep-tlgs ^ xtlgs.tar.gz
tar: xtlgs.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: Child returned status 2
tar: Exiting with failure status due to previous errors
./grep-tlgs: tar -ztf xtlgs.tar.gz exited 2 at ./grep-tlgs line 14.
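
For comparison, the same --to-stdout idea can be sketched directly in shell. This is only an illustration, not a replacement for the Perl script: it assumes GNU tar and a grep that supports --label, the function name grep_tlgs_stream is made up, and member names containing newlines are not handled. Unlike the Perl version, it also prefixes each hit with the parent tar's name, which the question asked for.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU tar and a grep with --label support.
# The function name grep_tlgs_stream is hypothetical.
grep_tlgs_stream() {
    local pattern=$1 tgz member
    shift
    for tgz in "$@"; do
        # List the archive, keep only .tlg members, and stream each one.
        tar -tzf "$tgz" | grep '\.tlg$' |
        while IFS= read -r member; do
            # --to-stdout writes the member to the pipe instead of to disk;
            # -H with --label tags each hit with "parent-tar:member".
            tar -xzf "$tgz" --to-stdout "$member" |
                grep -a -n -H --label="$tgz:$member" -e "$pattern"
        done
    done
}
```

It could be used as grep_tlgs_stream hello *.tar.gz >> output.text. Nothing is written to disk apart from the output file, but tar rescans the archive once per .tlg member, so this may be slow for archives with many members.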