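zgrep doesn't terminate the script the way grep does

Time: 2017-04-25 17:49:40

Tags: bash ubuntu unix

I'm new to Bash scripting. I'm trying to get this script to read a.txt.gz line by line and check whether the second value on each line also exists in b.txt.gz.

I don't know why zgrep doesn't end the script: after reading a.txt.gz, it hangs on a blinking cursor.

Here is the code (test):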

zcat /home/tdq/Bash/a.txt.gz | while read p0 p1
do
    if zgrep -q -e "[A-Za-z0-9=;._|()\t]*${p1}[A-Za-z0-9=;._|()\t]*" /home/tdq/Bash/b.txt.gz; then
        echo "FOUND"
    fi
done

When I run time ./test, the results are what I expect, but the script never ends. This is the output:

FOUND
FOUND
FOUND

I also tried grep instead of zgrep. It doesn't print FOUND, but the script does end.

zcat /home/tdq/Bash/a.txt.gz | while read p0 p1
do
    if grep -q -e "[A-Za-z0-9=;._|()\t]*${p1}[A-Za-z0-9=;._|()\t]*" /home/tdq/Bash/b.txt.gz; then
        echo "FOUND"
    fi
done

The result when I run time ./test:

real    0m9.361s
user    0m6.660s
sys 0m2.196s
tdq@td:~/bash$

Can someone help me? Thanks a lot.

a.txt.gz (tab-separated)

1   rs367896724
2   rs540431307
3   rs555500075
4   rs548419688

b.txt.gz (tab-separated)

1   10177   rs367896724 A   AC  100 PASS    AC=2130;AF=0.425319;AN=5008;NS=2504;DP=103152;EAS_AF=0.3363;AMR_AF=0.3602;AFR_AF=0.4909;EUR_AF=0.4056;SAS_AF=0.4949;AA=|||unknown(NO_COVERAGE);VT=INDEL GT  1|0 0|1 0|1
2   10177   rs540431307 A   AC  100 PASS    AC=2130;AF=0.425319;AN=5008;NS=2504;DP=103152;EAS_AF=0.3363;AMR_AF=0.3602;AFR_AF=0.4909;EUR_AF=0.4056;SAS_AF=0.4949;AA=|||unknown(NO_COVERAGE);VT=INDEL GT  1|0 0|1 0|1
3   10177   rs555500075 A   AC  100 PASS    AC=2130;AF=0.425319;AN=5008;NS=2504;DP=103152;EAS_AF=0.3363;AMR_AF=0.3602;AFR_AF=0.4909;EUR_AF=0.4056;SAS_AF=0.4949;AA=|||unknown(NO_COVERAGE);VT=INDEL GT  1|0 0|1 0|1
4   10177   rs548419688 A   AC  100 PASS    AC=2130;AF=0.425319;AN=5008;NS=2504;DP=103152;EAS_AF=0.3363;AMR_AF=0.3602;AFR_AF=0.4909;EUR_AF=0.4056;SAS_AF=0.4949;AA=|||unknown(NO_COVERAGE);VT=INDEL GT  1|0 0|1 0|1

Basically, I have to check whether the rsxxxxx IDs in a.txt.gz and b.txt.gz match each other.

c.txt.gz

10084625    rs123
10026407    rs456

d.txt.gz (this is the original file)

514786698   10084625    491891820   4   12951   0.986   562 421
5221808     495944      1573768     4   664     0.261062   59   2
539535670   10026407    556933170   3   \N  \N  \N  \N

Output file (c.txt.gz + d.txt.gz = e.txt.gz)

514786698   10084625    491891820   4   12951   0.986   562 421
5221808 \N  \N  \N  \N
539535670   10026407    556933170   3   \N  \N  \N  \N

Expected output file (c.txt.gz + d.txt.gz = e.txt.gz)

514786698   10084625    491891820   4   12951   0.986   562 421
539535670   10026407    556933170   3   \N  \N  \N  \N

So it also wrote out the row of d.txt.gz whose key is not in c.txt.gz (the second row, 495944).

1 answer:

Answer 0: (score: 1)

Use awk and process substitution:
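
As a rough illustration of that idea (a sketch only, not necessarily the exact command this answer originally showed), awk can load the IDs from a.txt.gz once and then scan b.txt.gz in a single pass, rather than launching zgrep for every line. It assumes both files are tab-separated, that the ID is column 2 of a.txt.gz and column 3 of b.txt.gz as in the samples above, and that a.txt.gz is not empty. The temporary directory and the sample rows (including the made-up ID rs000000001) are only there to make the example self-contained.

```bash
#!/usr/bin/env bash
# Build small tab-separated sample files modeled on the question's data.
# (workdir and the sample rows are hypothetical, for illustration only.)
workdir=$(mktemp -d)
printf '1\trs367896724\n2\trs540431307\n3\trs000000001\n' | gzip > "$workdir/a.txt.gz"
printf '1\t10177\trs367896724\tA\tAC\n2\t10177\trs540431307\tA\tAC\n4\t10177\trs548419688\tA\tAC\n' | gzip > "$workdir/b.txt.gz"

# First input (NR == FNR): remember every ID found in column 2 of a.txt.gz.
# Second input: print FOUND and the ID for each b.txt.gz line whose
# column 3 is one of the remembered IDs.
found=$(awk -F'\t' '
    NR == FNR { ids[$2]; next }
    $3 in ids { print "FOUND", $3 }
' <(gzip -dc "$workdir/a.txt.gz") <(gzip -dc "$workdir/b.txt.gz"))

printf '%s\n' "$found"
rm -rf "$workdir"
```

Note that this compares the ID column exactly, whereas the regex in the question matches the ID anywhere on the line, so results can differ if an ID also appears inside another field. Each compressed file is decompressed only once.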


For your edited data and expected output:
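
A minimal sketch for this case, again assuming the layout shown in the question: the key is column 1 of c.txt.gz and column 2 of d.txt.gz, fields are separated by whitespace with no empty fields, and c.txt.gz is not empty. Only the rows of d.txt.gz whose key appears in c.txt.gz are kept, and they are written out whole. The temporary directory is hypothetical scaffolding so the example runs on its own.

```bash
#!/usr/bin/env bash
# Recreate the question's c.txt.gz and d.txt.gz samples (tab-separated here by assumption).
workdir=$(mktemp -d)
printf '10084625\trs123\n10026407\trs456\n' | gzip > "$workdir/c.txt.gz"
printf '514786698\t10084625\t491891820\t4\t12951\t0.986\t562\t421\n5221808\t495944\t1573768\t4\t664\t0.261062\t59\t2\n539535670\t10026407\t556933170\t3\t\\N\t\\N\t\\N\t\\N\n' | gzip > "$workdir/d.txt.gz"

# First input: collect the keys from column 1 of c.txt.gz.
# Second input: a bare pattern with no action prints the whole line,
# so only d.txt.gz rows whose column 2 is a collected key are written.
awk '
    NR == FNR { keep[$1]; next }
    $2 in keep
' <(gzip -dc "$workdir/c.txt.gz") <(gzip -dc "$workdir/d.txt.gz") | gzip > "$workdir/e.txt.gz"

filtered=$(gzip -dc "$workdir/e.txt.gz")
printf '%s\n' "$filtered"
rm -rf "$workdir"
```

With the sample data, the row keyed 495944 is dropped and the other two rows come through unchanged, which matches the expected e.txt.gz above.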
