How do I use a pipeline in Linux?

Time: 2016-04-15 15:00:34

Tags: linux

I have a program that runs on Linux, say test_sf:

test_sf --input test.fastq --output test.results

It writes its output to a file named test.results.

But I have test1.fastq.gz and test2.fastq.gz. How can I use these two files as input without unzipping them first?

zcat test1.fastq.gz | test_sf --input --output test1.results
zcat test2.fastq.gz | test_sf --input --output test2.results

These two commands did not work.

Note: This is just a toy command to illustrate my question.

1 answer:

Answer 0 (score: 2)

You have this command:

test_sf --input test.fastq --output test.results

You would ideally run this, but you can't because your program doesn't support compressed input:

test_sf --input test.fastq.gz --output test.results # probably fails

So you need to use zcat to decompress the file and pipe the result in. Some programs treat - as a special filename meaning stdin, in which case you can do this:

zcat test.fastq.gz | test_sf --input - --output test.results # might work
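
To see the - convention in isolation, here is a minimal sketch using cat, which treats the operand - as standard input. The file names sample.fastq.gz and sample.out are made up for this demo, and gzip -dc is used as an equivalent of zcat. Whether test_sf accepts - the same way is something only its documentation can confirm.

```shell
# Build a tiny gzip-compressed sample (file names are made up for this demo).
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > sample.fastq.gz

# cat treats the operand "-" as standard input, so the decompressed
# stream is read from the pipe rather than from a file on disk.
# gzip -dc does the same job as zcat.
gzip -dc sample.fastq.gz | cat - > sample.out

# Show what arrived on the other side of the pipe.
cat sample.out
```

If a program follows this convention, the compressed file never has to be unpacked to disk.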

If your program also does not understand - as special, you can use this in Bash:

test_sf --input <(zcat test.fastq.gz) --output test.results # should work

Bash then invokes your program with a command line that actually looks like this:

test_sf --input /dev/XXX --output test.results

Here XXX stands for a special filename (on Linux typically something like /dev/fd/63) that is actually a pipe to which zcat writes. So long as your program reads its input file serially and does not need random access to it, this will almost certainly work. The technique is described further here: https://unix.stackexchange.com/questions/101143/how-can-i-stream-data-to-a-program-that-expects-to-read-data-from-a-file-that-is
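
Applied to the two files from the question, the process substitution approach could look like the sketch below. Since test_sf is only a toy name, the script defines a hypothetical stand-in function that copies --input to --output and prints the path it was given. The sample data is invented, and gzip -dc is used as an equivalent of zcat. Process substitution needs Bash, so this will not run under a plain POSIX sh.

```shell
#!/usr/bin/env bash
# Process substitution, <( ... ), is a Bash feature, not plain POSIX sh.

# Hypothetical stand-in for test_sf.
# Fixed usage: test_sf --input FILE --output FILE
# It reports the path it received and copies the input to the output.
test_sf() {
    local input=$2 output=$4
    echo "reading from: $input" >&2
    cat "$input" > "$output"
}

# Two small compressed inputs, named as in the question (contents invented).
printf '@r1\nACGT\n+\nIIII\n' | gzip -c > test1.fastq.gz
printf '@r2\nTTGA\n+\nIIII\n' | gzip -c > test2.fastq.gz

# Bash replaces each <( ... ) with a path that is backed by a pipe,
# so the program reads decompressed data without a temporary file.
for n in 1 2; do
    test_sf --input <(gzip -dc "test$n.fastq.gz") --output "test$n.results"
done
```

The "reading from:" lines on stderr show the substituted path, which makes it easy to check what the program actually received.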