Store txt files separately for each subcategory

Time: 2014-08-07 12:45:58

Tags: bash

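I have several experiments, and each experiment has several replicate files. I want to collect the paths of all these replicate files into text files in the following way.

Say there are 3 experiments, each with 2 replicate files. (The number of experiments and replicates can be larger than this.)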

/home/data/study1/EXP1_30/EXP1_replicate_1_30.txt
/home/data/study1/EXP1_30/EXP1_replicate_2_30.txt
/home/data/study1/EXP1_60/EXP1_replicate_1_60.txt
/home/data/study1/EXP1_60/EXP1_replicate_2_60.txt
/home/data/study1/EXP2_30/EXP2_replicate_1_30.txt
/home/data/study1/EXP2_30/EXP2_replicate_2_30.txt
/home/data/study1/EXP2_60/EXP2_replicate_1_60.txt
/home/data/study1/EXP2_60/EXP2_replicate_2_60.txt
/home/data/study1/EXP3_30/EXP3_replicate_1_30.txt
/home/data/study1/EXP3_30/EXP3_replicate_2_30.txt
/home/data/study1/EXP3_60/EXP3_replicate_1_60.txt
/home/data/study1/EXP3_60/EXP3_replicate_2_60.txt

The output file1.txt should look like

/home/data/study1/EXP1/EXP1_replicate_1_30.txt,/home/data/study1/EXP1/EXP1_replicate_2_30.txt \
/home/data/study1/EXP2/EXP2_replicate_1_30.txt,/home/data/study1/EXP2/EXP2_replicate_2_30.txt \
/home/data/study1/EXP3/EXP3_replicate_1_30.txt,/home/data/study1/EXP3/EXP3_replicate_2_30.txt

The output file2.txt should look like

/home/data/study1/EXP1/EXP1_replicate_1_60.txt,/home/data/study1/EXP1/EXP1_replicate_2_60.txt \
/home/data/study1/EXP2/EXP2_replicate_1_60.txt,/home/data/study1/EXP2/EXP2_replicate_2_60.txt \
/home/data/study1/EXP3/EXP3_replicate_1_60.txt,/home/data/study1/EXP3/EXP3_replicate_2_60.txt

....

My for-loop code:

ID=(30 60)
exp=("EXP1" "EXP2" "EXP3")

d=""
for  txtfile in /home/data/study1/${exp[0]}/${exp[0]}*_${ID[0]}.txt
do
    printf "%s%s" "$d" "$txtfile" 
    d=","
done
printf " \\" 
printf "\n" 

d=""
for txtfile in /home/data/study1/${exp[1]}/${exp[1]}*_${ID[0]}.txt
do

    printf "%s%s" "$d" "$txtfile" 
    d=","
done
printf " \\" 
printf "\n" 

d=""
for txtfile in /home/data/study1/${exp[2]}/${exp[2]}*_${ID[0]}.txt
do

    printf "%s%s" "$d" "$txtfile" 
    d=","
done          

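I am writing a separate loop with a hard-coded index for each experiment and replicate, which is very time-consuming. Is there a simpler way?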

3 answers:

Answer 0: (score: 1)

I think this does what you want:

#!/bin/bash

ids=( 30 60 )
dir=/home/data/study1

# join the glob matches with commas, then append a space and a backslash
# modified from http://stackoverflow.com/a/3436177/2088135
join() { local IFS=,; echo "$* "'\'; } #' <- to fix syntax highlighting

i=0
for id in "${ids[@]}"; do
    s=$(for exp in "$dir"/EXP*"$id"; do join "$exp/"*"$id".txt; done)
    # trim the final backslash and write the result to a file
    echo "${s%?}" > file$((++i)).txt
done

Output (note that I set dir=. when testing):

$ cat file1.txt 
./EXP1_30/EXP1_replicate_1_30.txt,./EXP1_30/EXP1_replicate_2_30.txt \
./EXP2_30/EXP2_replicate_1_30.txt,./EXP2_30/EXP2_replicate_2_30.txt \
./EXP3_30/EXP3_replicate_1_30.txt,./EXP3_30/EXP3_replicate_2_30.txt 
$ cat file2.txt 
./EXP1_60/EXP1_replicate_1_60.txt,./EXP1_60/EXP1_replicate_2_60.txt \
./EXP2_60/EXP2_replicate_1_60.txt,./EXP2_60/EXP2_replicate_2_60.txt \
./EXP3_60/EXP3_replicate_1_60.txt,./EXP3_60/EXP3_replicate_2_60.txt
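
The comma joining in the script above relies on how "$*" expands: the arguments are joined using the first character of IFS. A minimal sketch of just that step follows. The name join_by_comma and the sample file names are hypothetical, chosen for illustration, and the helper omits the trailing backslash that the answer's join adds.

```shell
#!/bin/bash
# join_by_comma is a hypothetical name; the answer above calls its helper `join`.
join_by_comma() {
    local IFS=,          # the changed IFS is scoped to this function call
    printf '%s\n' "$*"   # "$*" joins all arguments with the first character of IFS
}

joined=$(join_by_comma a.txt b.txt c.txt)
echo "$joined"   # prints a.txt,b.txt,c.txt
```

Because IFS is declared local, word splitting in the rest of the script is not affected.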

Answer 1: (score: 0)

You can use the following bash script:

#!/bin/bash 

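# files.txt is assumed to list one path per line.
# Sort on the 5th "_"-separated field (30.txt, 60.txt) so files with the same suffix are adjacent.
i=0; n=0; files=""
sort -t_ -k5 files.txt | while read -r line ; do
    files="$files $line"
    i=$((i+1))
    # Every 6 paths (3 experiments x 2 replicates) form one group.
    if [ $((i%6)) -eq 0 ] ; then
        n=$((n+1))
        # Concatenate the contents of the grouped files into 1.txt, 2.txt, ...
        cat $files > "$n.txt"
        files=""
    fi
done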
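
To see why the script sorts with -t_ -k5, it may help to run that sort on a few of the paths from the question. With "_" as the separator, the fifth field of each path is 30.txt or 60.txt, so paths with the same suffix end up next to each other. This is only a sketch: the variable name sorted is illustrative, and LC_ALL=C is set here just to make the ordering reproducible.

```shell
#!/bin/bash
# Four sample paths from the question, deliberately interleaved by suffix.
sorted=$(printf '%s\n' \
    /home/data/study1/EXP1_30/EXP1_replicate_1_30.txt \
    /home/data/study1/EXP1_60/EXP1_replicate_1_60.txt \
    /home/data/study1/EXP2_30/EXP2_replicate_1_30.txt \
    /home/data/study1/EXP2_60/EXP2_replicate_1_60.txt |
    LC_ALL=C sort -t_ -k5)   # key starts at field 5: "30.txt" or "60.txt"

# The two _30 paths are listed first, followed by the two _60 paths.
echo "$sorted"
```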

Answer 2: (score: 0)

You can also use a subshell and run the following from the command line (with the data in dat/experiment.txt):

$ ( first=0; cnt=0; grep 30 dat/experiment.txt | sort | while read line; do \
[ "$first" = 0 ] && first=1 || { [ "$cnt" = 0 ] && echo ' \'; }; echo -n $line; \
((cnt++)); [ "$cnt" = 1 ] && echo -n ","; [ "$cnt" = 2 ] && cnt=0; done; \
echo "" ) >outfile1.txt

$ ( first=0; cnt=0; grep 60 dat/experiment.txt | sort | while read line; do \
[ "$first" = 0 ] && first=1 || { [ "$cnt" = 0 ] && echo ' \'; }; echo -n $line; \
((cnt++)); [ "$cnt" = 1 ] && echo -n ","; [ "$cnt" = 2 ] && cnt=0; done; \
echo "" ) >outfile2.txt

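Admittedly, the one-liner ended up longer than originally expected in order to match your line continuations exactly. If you omit the line continuations in the output files, it reduces to (for example):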

$ (cnt=0; grep 30 dat/experiment.txt | sort | while read line; do echo -n $line; \
((cnt++)); [ "$cnt" = 1 ] && echo -n ","; [ "$cnt" = 2 ] && echo "" && cnt=0; \
done ) >outfile1.txt

Output:

$ cat outfile1.txt
/home/data/study1/EXP1_30/EXP1_replicate_1_30.txt,/home/data/study1/EXP1_30/EXP1_replicate_2_30.txt \
/home/data/study1/EXP2_30/EXP2_replicate_1_30.txt,/home/data/study1/EXP2_30/EXP2_replicate_2_30.txt \
/home/data/study1/EXP3_30/EXP3_replicate_1_30.txt,/home/data/study1/EXP3_30/EXP3_replicate_2_30.txt \

$ cat outfile2.txt
/home/data/study1/EXP1_60/EXP1_replicate_1_60.txt,/home/data/study1/EXP1_60/EXP1_replicate_2_60.txt \
/home/data/study1/EXP2_60/EXP2_replicate_1_60.txt,/home/data/study1/EXP2_60/EXP2_replicate_2_60.txt \
/home/data/study1/EXP3_60/EXP3_replicate_1_60.txt,/home/data/study1/EXP3_60/EXP3_replicate_2_60.txt \
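
As a possible alternative to the counter-based loops above, the pairing can be sketched with paste, which is a different technique from the one this answer uses. The sketch assumes exactly two replicates per experiment and one path per line on standard input. The name pair_lines is hypothetical, and the sed step that appends " \" to every line except the last is one way to reproduce the continuation markers.

```shell
#!/bin/bash
# pair_lines is a hypothetical helper, not part of any answer above.
# $1: suffix to select (for example 30); stdin: one path per line.
pair_lines() {
    grep "_$1\.txt\$" |        # keep only paths ending in _<suffix>.txt
        LC_ALL=C sort |        # put replicates of the same experiment together
        paste -d, - - |        # join each pair of consecutive lines with a comma
        sed '$!s/$/ \\/'       # append " \" to every line except the last
}

printf '%s\n' \
    /home/data/study1/EXP1_30/EXP1_replicate_1_30.txt \
    /home/data/study1/EXP1_30/EXP1_replicate_2_30.txt \
    /home/data/study1/EXP1_60/EXP1_replicate_1_60.txt \
    /home/data/study1/EXP2_30/EXP2_replicate_1_30.txt \
    /home/data/study1/EXP2_30/EXP2_replicate_2_30.txt |
    pair_lines 30
```

With more than two replicates per experiment, paste would need one "-" operand per replicate.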