What I want to do is estimate a score for each peptide, i.e. for each row.
My code is as follows:
import csv, math
def train_data(fname):
    #load csv training files
    peptide= []
    allele= []
    score = []
    with open (fname) as train:
        reader = csv.DictReader(train, delimiter='\t')
        for row in reader:
            peptide.append(row['peptide'])
            allele.append(row['allele'])
            score.append(row['score'])
    return [peptide, allele, score]
def ff():
    peptide, allele, score = train_data('sample.txt')
    p={'A':(0.074+0.077)/2, 'R':(0.052+0.053)/2, 'N':(0.045+0.044)/2, 'D':(0.054+0.051)/2, 'C':(0.025+0.022)/2, 'Q':(0.034+0.035)/2, 'E':(0.054+0.056)/2, 'G':(0.074+0.074)/2, 'H':(0.026+0.025)/2, 'I':(0.068+0.064)/2, 'L':(0.099+0.096)/2, 'K':(0.058+0.058)/2, 'M':(0.025+0.024)/2, 'F':(0.047+0.048)/2, 'P':(0.039+0.041)/2, 'S':(0.057+0.059)/2, 'T':(0.051+0.053)/2, 'W':(0.013+0.014)/2, 'Y':(0.032+0.033)/2, 'V':(0.073+0.072)/2}
    for i in range(len(peptide)):
        # peptide[i]=list(peptide[i])
        peptide.append(peptide[i])
        for j in range(len(peptide[i])):
            print(peptide[2][j])
            #est_score+=p[peptide[i][j]]
        print ('---')
    print(peptide[2][1])
if __name__=='__main__':
    ff()
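When I run this code, the print statement inside the loop outputs values for every peptide, i.e. peptide[i][j], but I only want the peptide[2][j] values. Outside the loop it works fine: print(peptide[2][1]) gives exactly the expected output, the value 'A'.
My csv file looks like this: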
peptide score allele
AAAGAEAGKATTEEQ 0.190842 DRB1_0101
AAAGAEAGKATTEEQ 0.006301 DRB1_0301
AAAGAEAGKATTEEQ 0.066851 DRB1_0401
AAAGAEAGKATTEEQ 0.006344 DRB1_0405
AAAGAEAGKATTEEQ 0.035130 DRB1_0701
AAAGAEAGKATTEEQ 0.006288 DRB1_0802
AAAGAEAGKATTEEQ 0.176268 DRB1_0901
AAAGAEAGKATTEEQ 0.042555 DRB1_1101
AAAGAEAGKATTEEQ 0.114855 DRB1_1302
AAAGAEAGKATTEEQ 0.006377 DRB1_1501
AAAGAEAGKATTEEQ 0.006296 DRB3_0101
AAAGAEAGKATTEEQ 0.006313 DRB4_0101
AAAGAEAGKATTEEQ 0.070413 DRB5_0101
What I want to do is estimate the score of each peptide separately, i.e. per row rather than across all rows, using: est_score += p[peptide[i][j]]
Answer 0 (score: 1)
import csv, math
p={'A':(0.074+0.077)/2, 'R':(0.052+0.053)/2, 'N':(0.045+0.044)/2, 'D':(0.054+0.051)/2, 'C':(0.025+0.022)/2, 'Q':(0.034+0.035)/2, 'E':(0.054+0.056)/2, 'G':(0.074+0.074)/2, 'H':(0.026+0.025)/2, 'I':(0.068+0.064)/2, 'L':(0.099+0.096)/2, 'K':(0.058+0.058)/2, 'M':(0.025+0.024)/2, 'F':(0.047+0.048)/2, 'P':(0.039+0.041)/2, 'S':(0.057+0.059)/2, 'T':(0.051+0.053)/2, 'W':(0.013+0.014)/2, 'Y':(0.032+0.033)/2, 'V':(0.073+0.072)/2}
def train_data(fname):
    #load csv training files
    peptide= []
    allele= []
    score = []
    with open (fname) as train:
        reader = csv.DictReader(train, delimiter='\t')
        for row in reader:
            peptide.append(row['peptide'])
            allele.append(row['allele'])
            score.append(row['score'])
    return [peptide, allele, score]
def ff():
    peptide, allele, score = train_data('peptide.txt')
    for i in range(len(peptide)):
        # reset the running total for every row
        est_score = 0
        # iterate over the residues of row i (not a fixed row)
        for char in peptide[i]:
            est_score += p[char]
        print("est_score: " + str(est_score), "\t: read_score: " + str(score[i]) )
        print ('---')
    print(peptide[2][1])
if __name__=='__main__':
    ff()
est_score is always the same because, in the file you provided, the peptide is identical on every row. This prints:
est_score: 0.9625000000000001 : read_score: 0.190842
---
est_score: 0.9625000000000001 : read_score: 0.006301
---
est_score: 0.9625000000000001 : read_score: 0.066851
---
est_score: 0.9625000000000001 : read_score: 0.006344
---
est_score: 0.9625000000000001 : read_score: 0.035130
---
est_score: 0.9625000000000001 : read_score: 0.006288
---
est_score: 0.9625000000000001 : read_score: 0.176268
---
est_score: 0.9625000000000001 : read_score: 0.042555
---
est_score: 0.9625000000000001 : read_score: 0.114855
---
est_score: 0.9625000000000001 : read_score: 0.006377
---
est_score: 0.9625000000000001 : read_score: 0.006296
---
est_score: 0.9625000000000001 : read_score: 0.006313
---
est_score: 0.9625000000000001 : read_score: 0.070413
---
A
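As a side note, the per-row sum can also be written with sum(). This is only a sketch: the names P_SUBSET, estimate_score and estimate_all are made up for illustration, and the table is cut down to the six residues that occur in the sample peptide (same values as in p above), so it will raise KeyError for any other residue.

```python
# Frequencies for the residues of the sample peptide only (assumption:
# a cut-down copy of the full table p, enough for this illustration).
P_SUBSET = {
    'A': (0.074 + 0.077) / 2,
    'G': (0.074 + 0.074) / 2,
    'E': (0.054 + 0.056) / 2,
    'K': (0.058 + 0.058) / 2,
    'T': (0.051 + 0.053) / 2,
    'Q': (0.034 + 0.035) / 2,
}


def estimate_score(peptide, freq):
    """Sum the per-residue frequencies of one peptide string."""
    return sum(freq[residue] for residue in peptide)


def estimate_all(peptides, freq):
    """Return one estimated score per row, in row order."""
    return [estimate_score(pep, freq) for pep in peptides]


# Stand-in for the 'peptide' column of the file (hypothetical data).
rows = ['AAAGAEAGKATTEEQ'] * 3
scores = estimate_all(rows, P_SUBSET)
print(scores)
```

Every row gets its own total here, so once the file contains different peptides the scores will differ per row.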
Answer 1 (score: 0)
For me it prints only peptide[2][j], but it prints it many times. Is that what you want?
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
A
A
G
A
E
A
G
K
A
T
T
E
E
Q
---
A
Both python2 and python3 give me the same result.
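If the goal is to see the peptide[2][j] values only once, one option is to drop the outer loop: the inner print never uses i, so every pass of the outer loop prints row 2 again. A minimal sketch, where residues_of_row and the stand-in list are hypothetical names and not part of the question's code:

```python
# Stand-in for the 'peptide' column read from the file (hypothetical data).
peptides = ['AAAGAEAGKATTEEQ', 'AAAGAEAGKATTEEQ', 'AAAGAEAGKATTEEQ']


def residues_of_row(peptides, row):
    """Return the residues of a single row as a list of characters."""
    return list(peptides[row])


# No outer loop over i, so row 2 is visited exactly once.
for residue in residues_of_row(peptides, 2):
    print(residue)
```

Note that the question's loop also appends peptide[i] back onto the list on each pass, which doubles the list without changing what is printed.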