My code does not give me the output I need:
import re
f = open("cub.txt")
cub = f.read()
f.close()
r = re.compile("([A-A]{1})\s([A-A,'A'])\s([-+]?[0-9]*\.?[0-9]*)")
matches = re.findall(r, cub)
print (matches)
Contents of the cub.txt file:
UUU F 0.45 16.8 ( 45768) UCU S 0.18 14.1 ( 38296) UAU Y 0.40 11.8 ( 32211) UGU C 0.40 8.8 ( 23851)
UUC F 0.55 20.2 ( 54936) UCC S 0.20 15.7 ( 42683) UAC Y 0.60 17.8 ( 48342) UGC C 0.60 13.3 ( 36075)
UUA L 0.08 7.0 ( 19129) UCA S 0.15 11.6 ( 31442) UAA * 0.41 0.8 ( 2046) UGA * 0.59 1.1 ( 2986)
UUG L 0.13 12.6 ( 34146) UCG S 0.07 5.2 ( 14079) UAG Q 0.01 0.5 ( 1281) UGG W 1.00 12.0 ( 32616)
CUU L 0.13 12.4 ( 33708) CCU P 0.27 15.3 ( 41672) CAU H 0.40 9.5 ( 25885) CGU R 0.10 5.4 ( 14682)
CUC L 0.18 16.8 ( 45753) CCC P 0.30 17.0 ( 46097) CAC H 0.60 14.4 ( 39081) CGC R 0.19 10.4 ( 28305)
CUA L 0.06 6.0 ( 16211) CCA P 0.28 15.7 ( 42767) CAA Q 0.27 12.1 ( 33018) CGA R 0.10 5.3 ( 14339)
CUG L 0.41 38.5 (104699) CCG P 0.14 7.8 ( 21091) CAG Q 0.72 32.6 ( 88743) CGG R 0.18 9.7 ( 26453)
AUU I 0.35 16.8 ( 45653) ACU T 0.25 13.3 ( 36078) AAU N 0.43 16.9 ( 46039) AGU S 0.14 11.2 ( 30390)
AUC I 0.46 22.0 ( 59906) ACC T 0.31 16.5 ( 44951) AAC N 0.57 22.5 ( 61099) AGC S 0.26 20.2 ( 54867)
AUA I 0.18 8.8 ( 23805) ACA T 0.30 16.1 ( 43884) AAA K 0.44 27.3 ( 74256) AGA R 0.22 12.2 ( 33289)
AUG M 1.00 23.2 ( 62972) ACG T 0.14 7.7 ( 20943) AAG K 0.56 34.3 ( 93393) AGG R 0.21 11.7 ( 31945)
GUU V 0.21 13.1 ( 35593) GCU A 0.29 20.8 ( 56528) GAU D 0.50 25.3 ( 68683) GGU G 0.18 11.4 ( 30898)
GUC V 0.22 13.6 ( 36917) GCC A 0.32 22.9 ( 62202) GAC D 0.50 24.9 ( 67783) GGC G 0.31 19.7 ( 53631)
GUA V 0.12 7.8 ( 21277) GCA A 0.26 19.0 ( 51713) GAA E 0.43 31.0 ( 84178) GGA G 0.27 17.6 ( 47765)
GUG V 0.45 28.2 ( 76624) GCG A 0.13 9.1 ( 24768) GAG E 0.57 40.9 (111123) GGG G 0.25 16.0 ( 43513)
Desired output:
{'A': {'GCA': '0.26', 'GCC': '0.32', 'GCU': '0.29', 'GCG': '0.13'}, 'C': {'UGC': '0.60', 'UGU': '0.40'}, 'E': {'GAG': '0.57', 'GAA': '0.43'}, 'D': {'GAU': '0.50', 'GAC': '0.50'}, 'G': {'GGU': '0.18', 'GGG': '0.25', 'GGA': '0.27', 'GGC': '0.31'}, 'F': {'UUU': '0.45', 'UUC': '0.55'}, 'I': {'AUA': '0.18', 'AUC': '0.46', 'AUU': '0.35'}, 'H': {'CAC': '0.60', 'CAU': '0.40'}, 'K': {'AAG': '0.56', 'AAA': '0.44'}, '*': {'UAA': '0.41', 'UGA': '0.59'}, 'M': {'AUG': '1.00'}, 'L': {'CUU': '0.13', 'CUG': '0.41', 'CUC': '0.18', 'CUA': '0.06', 'UUG': '0.13', 'UUA': '0.08'}, 'N': {'AAU': '0.43', 'AAC': '0.57'}, 'Q': {'CAA': '0.27', 'CAG': '0.72', 'UAG': '0.01'}, 'P': {'CCU': '0.27', 'CCG': '0.14', 'CCA': '0.28', 'CCC': '0.30'}, 'S': {'UCU': '0.18', 'AGC': '0.26', 'UCG': '0.07', 'UCC': '0.20', 'UCA': '0.15', 'AGU': '0.14'}, 'R': {'CGA': '0.10', 'CGC': '0.19', 'AGA': '0.22', 'AGG': '0.21', 'CGG': '0.18', 'CGU': '0.10'}, 'T': {'ACC': '0.31', 'ACA': '0.30', 'ACG': '0.14', 'ACU': '0.25'}, 'W': {'UGG': '1.00'}, 'V': {'GUC': '0.22', 'GUA': '0.12', 'GUG': '0.45', 'GUU': '0.21'}, 'Y': {'UAC': '0.60', 'UAU': '0.40'}}
Answer 0 (score: 0)
# Read the file and split on any whitespace (spaces and newlines alike),
# so the last item of one line is not glued to the first codon of the next.
with open('cub.txt', 'r') as myfile:
    data = myfile.read().split()
result = {}
for index, item in enumerate(data):
    # Check if item is a codon.
    if item.isalpha() and len(item) == 3:
        # Key is the next item after the codon.
        key = data[index+1]
        # Make an inner hash if not done already, and insert the result.
        if key not in result:
            result[key] = {}
        result[key][item] = data[index+2]
print(result)
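Keep in mind that the result is not sorted, because you want the result to be a hash (a plain dict), which does not sort its keys; since Python 3.7 a dict only preserves insertion order.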
To keep the keys in sorted order, you can use OrderedDict:
from collections import OrderedDict
# Read the file and split on any whitespace (spaces and newlines alike),
# so the last item of one line is not glued to the first codon of the next.
with open('cub.txt', 'r') as myfile:
    data = myfile.read().split()
result = {}
for index, item in enumerate(data):
    # Check if item is a codon.
    if item.isalpha() and len(item) == 3:
        # Key is the next item after the codon.
        key = data[index+1]
        # Make an inner hash if not done already, and insert the result.
        if key not in result:
            result[key] = {}
        result[key][item] = data[index+2]
# Order the outer keys (amino acids).
ordered_result = OrderedDict(sorted(result.items(), key=lambda t: t[0]))
# Order the inner hashes (codons).
for key, value in ordered_result.items():
    ordered_result[key] = OrderedDict(sorted(value.items(), key=lambda t: t[0]))
print(ordered_result)
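If you would rather stay with a regular expression, as in the question, the same nested dict can probably be built along the following lines. This is only a sketch under assumptions: the pattern `CODON_RE` and the helper name `parse_codon_table` are made up for illustration, and the pattern assumes every codon consists of exactly three of the letters A, C, G, U, followed by a one-letter amino acid code (or `*`) and then a decimal number. The sample text is a shortened excerpt of the rows shown in the question.

```python
import re
from collections import defaultdict

# Hypothetical pattern (an assumption, not taken from the question):
# group 1: codon, exactly three of A/C/G/U
# group 2: one-letter amino acid code, or '*' for a stop codon
# group 3: the first decimal number after the amino acid code
CODON_RE = re.compile(r"([ACGU]{3})\s+([A-Z*])\s+(\d*\.\d+)")


def parse_codon_table(text):
    """Return {amino_acid: {codon: fraction}} built from codon-usage text."""
    table = defaultdict(dict)
    for codon, amino_acid, fraction in CODON_RE.findall(text):
        # The fraction is kept as a string, as in the desired output.
        table[amino_acid][codon] = fraction
    return dict(table)


# Shortened excerpt of the rows from the question, for demonstration only.
sample = (
    "UUU F 0.45 16.8 ( 45768) UCU S 0.18 14.1 ( 38296)\n"
    "UUC F 0.55 20.2 ( 54936) UAA * 0.41 0.8 ( 2046)\n"
)
print(parse_codon_table(sample))
```

To run it on the real file, you would pass the file contents instead of `sample`, for example `parse_codon_table(open('cub.txt').read())`. Because `\s+` matches newlines as well as spaces, the text does not need to be joined into a single line first.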