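I am seeing strange behavior when reading a CSV file and selecting the rows whose float value in a given column exceeds a threshold.
Here is an excerpt of the input file: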
ben@truc:$ head summary.fasta.csv
scf7180000753635;170043549;XP_001849446.1;27.72;184;2e-13;74.7
scf7180000753636;340728919;XP_003402759.1;25.78;322;8e-19;93.6
scf7180000753642;328716306;XP_003245892.1;33.51;191;7e-27;119
scf7180000753642;512919417;XP_004929373.1;43.18;132;1e-23;108
scf7180000753642;512914080;XP_004928052.1;40.16;127;5e-21;94.7
scf7180000753664;328696819;XP_003240139.1;37.99;179;2e-23;107
scf7180000753664;328696819;XP_003240139.1;26.67;30;2e-23;25.4
scf7180000753664;328703138;XP_003242103.1;31.65;218;1e-20;99.4
scf7180000753669;383855900;XP_003703448.1;68.92;74;2e-23;102
scf7180000753669;380030611;XP_003698937.1;72.06;68;3e-22;99.8
Here is my shell script:
#!/bin/sh
echo "extracting the values"
# prepare output files
echo "" > "40_sequence_identity.csv"
echo "" > "60_sequence_identity.csv"
echo "" > "80_sequence_identity.csv"
while read -r line; do
#debug: check if line is correclty read
echo $line
#attribute each CSV column value to a variable
query=`echo $line | cut -d ';' -f1`
gi=`echo $line | cut -d ';' -f2`
refseq=`echo $line | cut -d ';' -f3`
seq_identity=`echo $line | cut -d ';' -f4`
align_length=`echo $line | cut -d ';' -f5`
evalue=`echo $line | cut -d ';' -f6`
score=`echo $line | -d ';' -f7`
#debug: check if cut command is OK
echo "seqidentity:"$seq_identity
# test float value of column 4, if superior to a threshold, write the line in a specific line
if [ $( echo "$seq_identity >= 40" | bc ) ]; then
echo "$line" >> "40_sequence_identity.csv"
fi
if [ $( echo "$seq_identity >= 60" | bc ) ]; then
echo "$line" >> "60_sequence_identity.csv"
fi
if [ $( echo "$seq_identity >= 80" | bc ) ]; then
echo "$line" >> "80_sequence_identity.csv"
fi
done < "summary.fasta.csv"
echo "DONE!"
And here is the strange output:
extracting the values
scf7180000753635;170043549;XP_001849446.1;27.72;184;2e-13;74.7
./create_project_directories.sh: 1: ./create_project_directories.sh: -d: not found
seqidentity:27.72
scf7180000753636;340728919;XP_003402759.1;25.78;322;8e-19;93.6
./create_project_directories.sh: 1: ./create_project_directories.sh: -d: not found
seqidentity:25.78
scf7180000753642;328716306;XP_003245892.1;33.51;191;7e-27;119
./create_project_directories.sh: 1: ./create_project_directories.sh: -d: not found
seqidentity:33.51
scf7180000753642;512919417;XP_004929373.1;43.18;132;1e-23;108
./create_project_directories.sh: 1: ./create_project_directories.sh: -d: not found
seqidentity:43.18
scf7180000753642;512914080;XP_004928052.1;40.16;127;5e-21;94.7
./create_project_directories.sh: 1: ./create_project_directories.sh: -d: not found
seqidentity:40.16
scf7180000753664;328696819;XP_003240139.1;37.99;179;2e-23;107
./create_project_directories.sh: 1: ./create_project_directories.sh: -d: not found
seqidentity:37.99
scf7180000753664;328696819;XP_003240139.1;26.67;30;2e-23;25.4
./create_project_directories.sh: 1: ./create_project_directories.sh: -d: not found
seqidentity:26.67
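First, the 3 output files (blast_summary_superior_40_sequence_identity.csv ...) contain all the lines, as if the test were not working. Second, the file parsing seems fine, but there is this strange message, -d: not found, coming out of nowhere. It appears just before the 'echo' that displays the value of $seq_identity, so it may be related to the cut commands.
Any idea why I get this output? The commands work when I run them manually in the console, but not when I execute the whole script.
Thanks for your help.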
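Answer 0 (score: 1)
You get the error -d: not found because the command on line 17 of the script is incomplete. The cut command name is missing, so the shell tries to run -d as a command: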
score=`echo $line | -d ';' -f7`
So it should be:
score=$(echo $line | cut -d ';' -f7)