I'm working on the CS50 problem set that asks us to match a person to their DNA.
Here is my code, which is almost finished:
import re, csv, sys


def main(argv):
    # Open the CSV file
    csv_file = open(sys.argv[1], 'r')
    people = csv.reader(csv_file)
    nucleotide = next(people)[1:]
    # Open the DNA sequence file
    txt_file = open(sys.argv[2], 'r')
    dna_file = txt_file.read()
    str_repeat = {}
    str_list = find_STRrepeats(str_repeat, nucleotide, dna_file)
    match_dna(people, str_list)


def find_STRrepeats(str_list, nucleotide, dna):
    for STR in nucleotide:
        groups = re.findall(rf'(?:{STR})+', dna)
        if len(groups) == 0:
            str_list[STR] = 0
        else:
            total = max(len(i)/len(STR) for i in groups)
            str_list[STR] = int(total)
    return str_list


def match_dna(people, str_list):
    for row in people:
        # Get the person's name from the CSV row
        person = row[0]
        # Get all of that person's DNA values
        data = row[1:]
        # If all values in the dict equal all values in data, print the person
        if str_list.values() == data:
            print(person)
            sys.exit(0)
    print("No match")


if __name__ == "__main__":
    main(sys.argv[1:])
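So I'm stuck on the match_dna function. I'm confused about how to compare the values in the dictionary str_list with the values in the list taken from each row of people (data):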
str_list = {'AGATC': 4, 'AATG': 1, 'TATC': 5}
data = ['4', '1', '5']
Is there anything in my code I should change? Or is there a simple way to compare these two different structures? Thanks.
Answer 0 (score: 0)
str_list = {'AGATC': 4, 'AATG': 1, 'TATC': 5}
data = ['4', '1', '5']
for data_item in data:
    for key, value in str_list.items():
        # The items in data ('4', '1', '5') are strings,
        # while the dictionary values (4, 1, 5) are integers,
        # so convert to the same type before comparing.
        if value == int(data_item):
            print(key)
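In the second snippet you provided, the items in the list data ('4', '1', '5') are strings, while the values in the dictionary str_list (4, 1, 5) are integers. You need to compare values of the same type, so convert the list items to integers first. You can use my code above as a reference.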
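If the goal is to check whether a whole row matches, rather than one value at a time, a minimal sketch could look like the following. The helper name counts_match is hypothetical and not part of the original code. It assumes every entry in data is a numeric string, and that the dictionary keys were inserted in the same order as the CSV columns (dictionaries keep insertion order in Python 3.7+). Note that str_list.values() returns a view object, which does not compare equal to a list even when the contents are the same, so it is converted to a list first.

```python
def counts_match(str_list, data):
    """Return True if the STR counts equal the CSV row values, in order."""
    # dict.values() is a view, not a list, so build a real list from it.
    # The CSV values are strings, so convert them to int before comparing.
    return list(str_list.values()) == [int(value) for value in data]


str_list = {'AGATC': 4, 'AATG': 1, 'TATC': 5}
print(counts_match(str_list, ['4', '1', '5']))  # True
print(counts_match(str_list, ['4', '1', '6']))  # False
```

Inside match_dna, the condition `if str_list.values() == data:` could then be replaced with `if counts_match(str_list, data):`.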