I have data like this:
>sp|Q96A73|P33MX_HUMAN Putative monooxygenase p33MONOX OS=Homo sapiens OX=9606 GN=KIAA1191 PE=1 SV=1
RNDDDDTSVCLGTRQCSWFAGCTNRTWNSSAVPLIGLPNTQDYKWVDRNSGLTWSGNDTCLYSCQNQTKGLLYQLFRNLFCSYGLTEAHGKWRCADASITNDKGHDGHRTPTWWLTGSNLTLSVNNSGLFFLCGNGVYKGFPPKWSGRCGLGYLVPSLTRYLTLNASQITNLRSFIHKVTPHR
>sp|P13674|P4HA1_HUMAN Prolyl 4-hydroxylase subunit alpha-1 OS=Homo sapiens OX=9606 GN=P4HA1 PE=1 SV=2
VECCPNCRGTGMQIRIHQIGPGMVQQIQSVCMECQGHGERISPKDRCKSCNGRKIVREKKILEVHIDKGMKDGQKITFHGEGDQEPGLEPGDIIIVLDQKDHAVFTRRGEDLFMCMDIQLVEALCGFQKPISTLDNRTIVITSHPGQIVKHGDIKCVLNEGMPIYRRPYEKGRLIIEFKVNFPENGFLSPDKLSLLEKLLPERKEVEE
>sp|Q7Z4N8|P4HA3_HUMAN Prolyl 4-hydroxylase subunit alpha-3 OS=Homo sapiens OX=9606 GN=P4HA3 PE=1 SV=1
MTEQMTLRGTLKGHNGWVTQIATTPQFPDMILSASRDKTIIMWKLTRDETNYGIPQRALRGHSHFVSDVVISSDGQFALSGSWDGTLRLWDLTTGTTTRRFVGHTKDVLSVAFSSDNRQIVSGSRDKTIKLWNTLGVCKYTVQDESHSEWVSCVRFSPNSSNPIIVSCGWDKLVKVWNLANCKLK
>sp|P04637|P53_HUMAN Cellular tumor antigen p53 OS=Homo sapiens OX=9606 GN=TP53 PE=1 SV=4
IQVVSRCRLRHTEVLPAEEENDSLGADGTHGAGAMESAAGVLIKLFCVHTKALQDVQIRFQPQL
I am trying to count the K (and R) residues in each record, so the output I am after looks like this:
K R
Q96A73 7 11
P13674 17 13
Q7Z4N8 11 11
P04637 2 4
I have been trying to use
cat mydata.txt | grep -v '^>' | grep -i -e [k] |wc -l
For example, if we look at the first one:
K R KK RR
Q96A73 7 11 0 0
P13674 17 13 1 2
Q7Z4N8 11 11 0 1
P04637 2 4 0 0
Answer 0 (score: 2)
Using Perl:
perl -F"\|" -lne ' BEGIN{print "ID K R"} s/(K|R)/$kv{$1}++/ge; if(not /^>/ ) { print "$x $kv{K} $kv{R}" ;%kv=() } $x=$F[1] '
With the input:
$ cat KR.txt
>sp|Q96A73|P33MX_HUMAN Putative monooxygenase p33MONOX OS=Homo sapiens OX=9606 GN=KIAA1191 PE=1 SV=1
RNDDDDTSVCLGTRQCSWFAGCTNRTWNSSAVPLIGLPNTQDYKWVDRNSGLTWSGNDTCLYSCQNQTKGLLYQLFRNLFCSYGLTEAHGKWRCADASITNDKGHDGHRTPTWWLTGSNLTLSVNNSGLFFLCGNGVYKGFPPKWSGRCGLGYLVPSLTRYLTLNASQITNLRSFIHKVTPHR
>sp|P13674|P4HA1_HUMAN Prolyl 4-hydroxylase subunit alpha-1 OS=Homo sapiens OX=9606 GN=P4HA1 PE=1 SV=2
VECCPNCRGTGMQIRIHQIGPGMVQQIQSVCMECQGHGERISPKDRCKSCNGRKIVREKKILEVHIDKGMKDGQKITFHGEGDQEPGLEPGDIIIVLDQKDHAVFTRRGEDLFMCMDIQLVEALCGFQKPISTLDNRTIVITSHPGQIVKHGDIKCVLNEGMPIYRRPYEKGRLIIEFKVNFPENGFLSPDKLSLLEKLLPERKEVEE
>sp|Q7Z4N8|P4HA3_HUMAN Prolyl 4-hydroxylase subunit alpha-3 OS=Homo sapiens OX=9606 GN=P4HA3 PE=1 SV=1
MTEQMTLRGTLKGHNGWVTQIATTPQFPDMILSASRDKTIIMWKLTRDETNYGIPQRALRGHSHFVSDVVISSDGQFALSGSWDGTLRLWDLTTGTTTRRFVGHTKDVLSVAFSSDNRQIVSGSRDKTIKLWNTLGVCKYTVQDESHSEWVSCVRFSPNSSNPIIVSCGWDKLVKVWNLANCKLK
>sp|P04637|P53_HUMAN Cellular tumor antigen p53 OS=Homo sapiens OX=9606 GN=TP53 PE=1 SV=4
IQVVSRCRLRHTEVLPAEEENDSLGADGTHGAGAMESAAGVLIKLFCVHTKALQDVQIRFQPQL
$ perl -F"\|" -lne ' BEGIN{print "ID K R"} s/(K|R)/$kv{$1}++/ge; if(not /^>/ ) { print "$x $kv{K} $kv{R}" ;%kv=() } $x=$F[1] ' KR.txt
ID K R
Q96A73 8 11
P13674 17 13
Q7Z4N8 11 11
P04637 2 4
$
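The OP has updated the question, so please check the updated command below. Note that the first command also runs the substitution on header lines, so the K in GN=KIAA1191 is counted and Q96A73 shows 8 instead of 7; the command below only counts on sequence lines.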
$ perl -F"\|" -lne ' BEGIN{print "ID K R"} if(not /^>/) { s/(K|R)/$kv{$1}++;$1/ge;s/(KK|RR)/$kv{$1}++/ige; print "$x $kv{K} $kv{R} ",$kv{KK}?$kv{KK}:0," ",$kv{RR}?$kv{RR}:0 ;%kv=() } $x=$F[1] ' KR.txt
ID K R
Q96A73 7 11 0 0
P13674 17 13 1 2
Q7Z4N8 11 11 0 1
P04637 2 4 0 0
$
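For readers who do not use Perl, here is a minimal Python sketch of the same idea: remember the ID from each header line and count only on the sequence line that follows. It assumes UniProt-style headers (`>sp|ID|...`) and one sequence line per record, as in the sample. The names `count_residues` and `sample` are made up for illustration.

```python
def count_residues(lines):
    """Yield (accession, K, R, KK, RR) for every sequence line."""
    accession = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header such as ">sp|P04637|P53_HUMAN ..."; the ID is the 2nd |-field.
            accession = line.split("|")[1]
            continue
        seq = line.upper()
        # str.count counts non-overlapping occurrences.
        yield (accession, seq.count("K"), seq.count("R"),
               seq.count("KK"), seq.count("RR"))


sample = [
    ">sp|P04637|P53_HUMAN Cellular tumor antigen p53 OS=Homo sapiens OX=9606 GN=TP53 PE=1 SV=4",
    "IQVVSRCRLRHTEVLPAEEENDSLGADGTHGAGAMESAAGVLIKLFCVHTKALQDVQIRFQPQL",
]
print("ID K R KK RR")
for row in count_residues(sample):
    print(*row)  # prints: P04637 2 4 0 0
```

Because header lines are skipped before any counting happens, letters in the header (for example in the gene name) never leak into the totals.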
Answer 2 (score: 1)
Could you please try the following.
awk -F'|' '/^>/{val=$2;next} {print val,gsub(/[kK]/,""),gsub(/[rR]/,"")}' Input_file
If you also want a header in the output, try the following.
awk -F'|' 'BEGIN{print " K R"}/^>/{val=$2;next} {print val,gsub(/[kK]/,""),gsub(/[rR]/,"")}' Input_file
EDIT1: As per the OP's comment, if you want the number of occurrences of 2 consecutive KK or kk, then try the following.
awk -F'|' '/^>/{val=$2;next} {print val,gsub(/kk|KK/,""),gsub(/rr|RR/,"")}' Input_file
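One thing to keep in mind is that `gsub` counts non-overlapping matches, so a run such as `KKK` contributes 1 to the KK count. If overlapping pairs are what you need, a zero-width lookahead is one way to get them. Here is a small Python sketch of both conventions; `count_pairs` is a hypothetical helper, not something from the commands in this answer.

```python
import re


def count_pairs(seq, residue, overlapping=False):
    """Count doubled residues (e.g. KK) in seq, case-insensitively."""
    seq = seq.upper()
    pair = residue.upper() * 2
    if overlapping:
        # The lookahead consumes no characters, so matches may share a letter.
        return len(re.findall(f"(?={re.escape(pair)})", seq))
    # str.count skips past each match, like gsub does.
    return seq.count(pair)


print(count_pairs("AKKKA", "K"))                    # 1
print(count_pairs("AKKKA", "K", overlapping=True))  # 2
```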
EDIT2: To get the counts of k, kk, r and rr, use the following.
awk -F'|' '/^>/{val=$2;next} {line=$0;print val,gsub(/[kK]/,""),gsub(/[rR]/,""),gsub(/kk|KK/,"&",line),gsub(/rr|RR/,"&",line)}' Input_file
With a header:
awk -F'|' '
BEGIN{
  print " k/K\tr/R\tkk/KK\trr/RR"
}
/^>/{
  val=$2
  next
}
{
  line=$0
  # "&" puts each match back, so counting KK cannot create a new RR in line
  print val,gsub(/[kK]/,""),gsub(/[rR]/,""),gsub(/kk|KK/,"&",line),gsub(/rr|RR/,"&",line)
}' OFS="\t" Input_file
The output is as follows.
k/K r/R kk/KK rr/RR
Q96A73 7 11 0 0
P13674 17 13 1 2
Q7Z4N8 11 11 0 1
P04637 2 4 0 0
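The commands above print one row per sequence line, which is fine for the sample because each record's sequence sits on a single line. If your FASTA file wraps sequences across several lines, you would need to join them first, otherwise you get one row per line and a pair that spans a line break is missed. Here is a rough Python sketch under that assumption. `read_fasta`, `count_table` and the `X00001` record are invented for illustration, and the header format is assumed to be `>sp|ID|...`.

```python
def read_fasta(lines):
    """Yield (accession, sequence), joining sequence lines that were wrapped."""
    accession, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            if accession is not None:
                yield accession, "".join(chunks)
            # Assumes UniProt-style headers: ">sp|ACCESSION|NAME ...".
            accession, chunks = line.split("|")[1], []
        elif line:
            chunks.append(line.upper())
    if accession is not None:
        yield accession, "".join(chunks)


def count_table(lines):
    """Return rows of (accession, K, R, KK, RR); pairs are non-overlapping."""
    return [
        (acc, seq.count("K"), seq.count("R"), seq.count("KK"), seq.count("RR"))
        for acc, seq in read_fasta(lines)
    ]


# Hypothetical wrapped record: the KK pair spans the line break.
wrapped = [">sp|X00001|DEMO_HUMAN made-up entry", "MARK", "KRRA"]
for row in count_table(wrapped):
    print("\t".join(map(str, row)))
```

To run it on a real file, pass an open file handle to `count_table` instead of the `wrapped` list.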