How to echo the actual filenames that * stands for in bash

Time: 2017-07-10 15:09:38

Tags: bash

Suppose I have a list of files, for example:

C92LDANXX_s8_1_A01_0337_SL152928.fastq.gz
C92LDANXX_s8_1_A02_0242_SL152929.fastq.gz
C92LDANXX_s8_2_A01_0337_SL152928.fastq.gz
C92LDANXX_s8_2_A02_0242_SL152929.fastq.gz

I have a for loop:

for sample in {0337,0242}
do
    f1=*_1_*_${sample}_*.fastq.gz
    f2=*_2_*_${sample}_*.fastq.gz
    echo $f1
    echo $f2
done

It echoes:

*_1_*_0337_*.fastq.gz
*_2_*_0337_*.fastq.gz
..

My question is: since $f1 and $f2 each identify a unique file, how do I echo their full, actual names, for example:

C92LDANXX_s8_1_A01_0337_SL152928.fastq.gz
C92LDANXX_s8_2_A01_0337_SL152928.fastq.gz 
..

1 answer:

Answer 0: (score: 1)

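Your own code should actually work as given for the specific filenames listed, provided those files are in the current directory: the assignment stores the pattern unexpanded, and the unquoted expansion in echo $f1 then performs the glob. However, it has several quoting-related bugs. The code below fixes those bugs and also creates empty files with your given names (inside a temporary directory) to demonstrate that the glob expansion really does work.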


tempdir=$(mktemp -d test.XXXXXX)
touch "$tempdir"/{C92LDANXX_s8_1_A01_0337_SL152928,C92LDANXX_s8_1_A02_0242_SL152929,C92LDANXX_s8_2_A01_0337_SL152928,C92LDANXX_s8_2_A02_0242_SL152929}.fastq.gz

cd "$tempdir" || { rm -rf "$tempdir"; exit 1; }

for sample in {0337,0242}; do
    f1=( *_1_*_"${sample}"_*.fastq.gz )
    if [ -e "${f1[0]}" ] || [ -L "${f1[0]}" ]; then
      echo "f1 matches for $sample:"
      printf '  %q\n' "${f1[@]}"
    else
      echo "No f1 matches for $sample found"
    fi

    f2=( *_2_*_"${sample}"_*.fastq.gz )
    if [ -e "${f2[0]}" ] || [ -L "${f2[0]}" ]; then
      echo "f2 matches for $sample:"
      printf '  %q\n' "${f2[@]}"
    else
      echo "No f2 matches for $sample found"
    fi
done

# tempdir is a relative path, so step back out of it before deleting it
cd .. && rm -rf -- "$tempdir"

...properly emits the output:

f1 matches for 0337:
  C92LDANXX_s8_1_A01_0337_SL152928.fastq.gz
f2 matches for 0337:
  C92LDANXX_s8_2_A01_0337_SL152928.fastq.gz
f1 matches for 0242:
  C92LDANXX_s8_1_A02_0242_SL152929.fastq.gz
f2 matches for 0242:
  C92LDANXX_s8_2_A02_0242_SL152929.fastq.gz
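
To see why the loop in the question can print either the pattern or the filenames, it helps to look at when the glob is actually expanded. The following is a minimal sketch, not part of the original answer: the function name demo_glob_timing, the scratch directory, and the file name run_1_a.txt are all made up for illustration. It assumes default shell options (no nullglob, failglob, or noglob).

```shell
#!/usr/bin/env bash
# Hypothetical demo: the function name, scratch directory and file name are illustrative only.
demo_glob_timing() {
    local dir pattern
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        touch run_1_a.txt

        pattern=*_1_*.txt                    # assignment: the pattern is stored literally
        printf 'quoted:   %s\n' "$pattern"   # quoted expansion: prints the pattern itself
        # Deliberately unquoted: pathname expansion happens here, at expansion time.
        printf 'unquoted: %s\n' $pattern

        pattern=*_9_*.txt                    # matches nothing in this directory
        printf 'no match: %s\n' $pattern     # an unmatched glob is left unchanged
    )
    rm -rf -- "$dir"
}

demo_glob_timing
```

With default options this prints the literal pattern for the quoted case, run_1_a.txt for the unquoted case, and the literal pattern again when nothing matches. One possible explanation for the output shown in the question is therefore that the loop was run in a directory that did not contain the files.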

Note:

  • All variable expansions are quoted. This means that they expand to their literal contents -- otherwise, filenames containing characters in IFS would be split into multiple pieces, and filenames containing glob character literals would be glob-expanded.
  • Glob results are stored in an array, and those arrays are expanded with "${arrayname[@]}". See the BashGuide on Arrays.
  • "${f1[0]}" expands to the first element in array f1. If this either exists (test -e) or is a link (test -L), then we know that the glob actually expanded, and thus that at least one match exists in the current directory.
  • printf '%q\n' prints filenames in eval-safe form (so the output could be copied and pasted back into the shell and would be treated as the filename's exact value). To print a name in literal form (followed by a newline), use printf '%s\n' instead. Note that most UNIX filesystems allow filenames to contain literal newlines, so storing names unescaped in newline-separated (as opposed to NUL-delimited) form is not ideal.
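
If the goal is simply to get each matching name into a variable, the points above can be condensed into a small helper. This is only a sketch under stated assumptions: first_match is a hypothetical name, and instead of the -e / -L test used in the answer it swaps in bash's nullglob option, which makes a glob with no matches expand to nothing. The function body runs in a subshell, so the option change does not leak into the caller.

```shell
#!/usr/bin/env bash
# first_match is a hypothetical helper, not part of the original answer.
# Usage: first_match READ SAMPLE
# Prints the first file in the current directory matching
# *_READ_*_SAMPLE_*.fastq.gz; returns 1 if there is no match.
first_match() (
    shopt -s nullglob                       # unmatched globs expand to nothing
    matches=( *_"$1"_*_"$2"_*.fastq.gz )    # quoted variables, unquoted wildcards
    (( ${#matches[@]} > 0 )) || exit 1      # empty array means no file matched
    printf '%s\n' "${matches[0]}"
)

# Example use with the sample IDs from the question.
for sample in 0337 0242; do
    if f1=$(first_match 1 "$sample") && f2=$(first_match 2 "$sample"); then
        printf '%s\n' "$f1" "$f2"
    else
        echo "Missing file for sample $sample" >&2
    fi
done
```

Because command substitution strips trailing newlines, this shortcut is only suitable for ordinary names like the ones in the question; for arbitrary filenames, keeping the results in an array as in the answer is the safer choice.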