I have a bunch of lists, and a function that compares a reference list against all the other lists and reports how many items match.
ABCC = ['TRIM29', 'IGL@', 'DOCK6', 'SVEP1', 'S100A11', 'EPHA2', 'KLHL7', 'ANXA3', 'NAB1', 'CELF2', 'EDNRB', 'PLAGL1', 'IL6ST', 'S100A8', 'CKLF', 'TIPARP', 'CDH3', 'MAP3K8', 'LYST', 'LEPR', 'FHL2', 'ARL4C', 'IL1RN', 'ESR1', 'CD93', 'ATP2B4', 'KAT2B', 'ELOVL5', 'SCD', 'SPTBN1', 'AKAP13', 'LDLR', 'ADRB2', 'LTBP4', 'TGM2', 'TIMP3', 'RAN', 'LAMA3', 'ASPH', 'ID4', 'STX11', 'CNN2', 'EGR1']
ACC = ['GULP1', 'PREPL', 'FHL1', 'METTL7A', 'TRIM13', 'YPEL5', 'PTEN', 'FAM190B', 'GSN', 'UBL3', 'PTGER3', 'COBLL1', 'EPB41L3', 'KLF4', 'BCL2L2', 'CYLD', 'SLK', 'ENSA', 'SKAP2', 'NR3C2', 'MAF', 'NDEL1', 'EZR', 'PCDH9', 'KIAA0494', 'CITED2', 'MGEA5', 'RUFY3', 'ALDH3A2', 'N4BP2L2', 'EPS15', 'TSPAN5', 'SNRPN', 'SSBP2', 'ELOVL5', 'C5orf4', 'FOXN3', 'ABCA5', 'SEC62', 'PELI1', 'MYCBP2', 'USP15', 'TACC1', 'SHMT1', 'RNF103', 'CDC14B', 'SYNE1', 'NDN', 'PHKB', 'EIF1', 'TROVE2', 'MBD4', 'GAB1']
BEC1 = ['LMNA', 'NHP2L1', 'IDS', 'ATP6V0B', 'ENSA', 'TBCB', 'NDUFA13', 'TOLLIP', 'PLEKHB2', 'MBOAT7', 'C16orf13', 'PGAM1', 'MIF', 'ACTR1A', 'OAZ1', 'GNAS', 'ARF1', 'MAPKAPK3', 'LCMT1', 'ATP6V1D', 'FLOT1', 'PRR13', 'COX5B', 'PGP', 'CYB561', 'CNIH4', 'COX6B1', 'NDUFB2', 'PFDN2', 'GPR172A', 'RTN4', 'GAPDH', 'MAPK13', 'FKBP8', 'PTGER3', 'BSCL2', 'TUBG1', 'FAM162A', 'GDI1', 'SPTLC2', 'YWHAZ', 'BCAP31', 'OSBPL1A', 'ATP6AP1', 'CALM1', 'PEX16', 'MYCBP2']
ARN = ['NCAM1', 'SLC11A2', 'RPL35A', 'PDLIM5', 'RPL31', 'NFIB', 'GYG2', 'IGHG1', 'NAAA']
lists = ([("ABCC", ABCC), ("ACC", ACC), ("BEC1", BEC1), ("ARN", ARN)])
def sort_by_matches(ref, lists):
    reference = set(ref)
    lists = sorted([[len(reference.intersection(set(l))), name, l] for name, l in lists], key=lambda x: (x[0], -len(x[2])), reverse=True)
    for matches, name, a_list in lists:
        print("Matches {} in {}".format(matches, name))
How can I use .upper() to uppercase the name of the reference list, so that
def sort_by_matches(ACC, lists)
gives the same result as
def sort_by_matches(acc, lists)
I tried this, but it did not work:
def matches(ref, lists):
    ref = ref[0].upper()
    reference = set(ref)
    lists = sorted([[len(reference.intersection(set(l))), name, l] for name, l in lists], key=lambda x: (x[0], -len(x[2])), reverse=True)
    for matches, name, a_list in lists:
        print("Gene Matches {} in {}".format(matches, name))
NameError: name 'acc' is not defined
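Answer 0 (score: 2)
I think you are looking for eval(). However, you have to pass ref as a string. That said, a dictionary is always the better data structure for this.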
Output:
Matches 53 in ACC
Matches 3 in BEC1
Matches 1 in ABCC
Variables are case-sensitive.
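To make the eval() suggestion concrete, here is a minimal sketch. It assumes the lists are module-level variables with uppercase names; the three short lists are made up for illustration and are not the asker's data. Note that eval() executes whatever string it is given, so it should only ever see names you control, which is one reason the dictionary approach is usually preferable.

```python
# Tiny hypothetical gene lists, used only for illustration.
ABCC = ['TRIM29', 'ELOVL5', 'EGR1']
ACC = ['PTEN', 'ELOVL5', 'ENSA', 'MYCBP2']
BEC1 = ['LMNA', 'ENSA', 'MYCBP2']

lists = [("ABCC", ABCC), ("ACC", ACC), ("BEC1", BEC1)]


def sort_by_matches(ref_name, lists):
    """Return (matches, name) pairs, best match first."""
    # ref_name is a string such as 'acc'; upper() turns it into 'ACC',
    # and eval() then looks up the module-level variable with that name.
    reference = set(eval(ref_name.upper()))
    scored = [(len(reference & set(genes)), name, genes) for name, genes in lists]
    # Most matches first; among ties, the shorter list first.
    scored.sort(key=lambda x: (-x[0], len(x[2])))
    return [(matches, name) for matches, name, _ in scored]


for matches, name in sort_by_matches('acc', lists):
    print("Matches {} in {}".format(matches, name))
```

Calling sort_by_matches('ACC', lists) gives the same result, because the name is uppercased before the lookup.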
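Answer 1 (score: 2)
Here is a modified version of your code that lets you pass the list name as a string to sort_by_matches. To make the lists easy to look up, we put them into a dictionary.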
ABCC = ['TRIM29', 'IGL@', 'DOCK6', 'SVEP1', 'S100A11', 'EPHA2', 'KLHL7', 'ANXA3', 'NAB1', 'CELF2', 'EDNRB', 'PLAGL1', 'IL6ST', 'S100A8', 'CKLF', 'TIPARP', 'CDH3', 'MAP3K8', 'LYST', 'LEPR', 'FHL2', 'ARL4C', 'IL1RN', 'ESR1', 'CD93', 'ATP2B4', 'KAT2B', 'ELOVL5', 'SCD', 'SPTBN1', 'AKAP13', 'LDLR', 'ADRB2', 'LTBP4', 'TGM2', 'TIMP3', 'RAN', 'LAMA3', 'ASPH', 'ID4', 'STX11', 'CNN2', 'EGR1']
ACC = ['GULP1', 'PREPL', 'FHL1', 'METTL7A', 'TRIM13', 'YPEL5', 'PTEN', 'FAM190B', 'GSN', 'UBL3', 'PTGER3', 'COBLL1', 'EPB41L3', 'KLF4', 'BCL2L2', 'CYLD', 'SLK', 'ENSA', 'SKAP2', 'NR3C2', 'MAF', 'NDEL1', 'EZR', 'PCDH9', 'KIAA0494', 'CITED2', 'MGEA5', 'RUFY3', 'ALDH3A2', 'N4BP2L2', 'EPS15', 'TSPAN5', 'SNRPN', 'SSBP2', 'ELOVL5', 'C5orf4', 'FOXN3', 'ABCA5', 'SEC62', 'PELI1', 'MYCBP2', 'USP15', 'TACC1', 'SHMT1', 'RNF103', 'CDC14B', 'SYNE1', 'NDN', 'PHKB', 'EIF1', 'TROVE2', 'MBD4', 'GAB1']
BEC1 = ['LMNA', 'NHP2L1', 'IDS', 'ATP6V0B', 'ENSA', 'TBCB', 'NDUFA13', 'TOLLIP', 'PLEKHB2', 'MBOAT7', 'C16orf13', 'PGAM1', 'MIF', 'ACTR1A', 'OAZ1', 'GNAS', 'ARF1', 'MAPKAPK3', 'LCMT1', 'ATP6V1D', 'FLOT1', 'PRR13', 'COX5B', 'PGP', 'CYB561', 'CNIH4', 'COX6B1', 'NDUFB2', 'PFDN2', 'GPR172A', 'RTN4', 'GAPDH', 'MAPK13', 'FKBP8', 'PTGER3', 'BSCL2', 'TUBG1', 'FAM162A', 'GDI1', 'SPTLC2', 'YWHAZ', 'BCAP31', 'OSBPL1A', 'ATP6AP1', 'CALM1', 'PEX16', 'MYCBP2']
ARN = ['NCAM1', 'SLC11A2', 'RPL35A', 'PDLIM5', 'RPL31', 'NFIB', 'GYG2', 'IGHG1', 'NAAA']
lists = dict([("ABCC", ABCC), ("ACC", ACC), ("BEC1", BEC1), ("ARN", ARN)])
def sort_by_matches(ref, lists):
    reference = set(lists[ref.upper()])
    found = sorted([[len(reference.intersection(set(l))), name, l] for name, l in lists.items()],
        key=lambda x: (x[0], -len(x[2])), reverse=True)
    for matches, name, _ in found:
        print("Matches {} in {}".format(matches, name))

# test
for ref in ('ABCC', 'acc', 'bEc1', 'Arn'):
    print(ref)
    sort_by_matches(ref, lists)
Output
ABCC
Matches 43 in ABCC
Matches 1 in ACC
Matches 0 in ARN
Matches 0 in BEC1
acc
Matches 53 in ACC
Matches 3 in BEC1
Matches 1 in ABCC
Matches 0 in ARN
bEc1
Matches 47 in BEC1
Matches 3 in ACC
Matches 0 in ARN
Matches 0 in ABCC
Arn
Matches 9 in ARN
Matches 0 in ABCC
Matches 0 in BEC1
Matches 0 in ACC
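We can make this more efficient by storing the lists in the lists dictionary as sets, so they are not converted on every call. I won't repeat the list definitions here, since they are unchanged.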
lists = dict([("ABCC", set(ABCC)), ("ACC", set(ACC)), ("BEC1", set(BEC1)), ("ARN", set(ARN))])
def sort_by_matches(ref, lists):
    reference = lists[ref.upper()]
    found = sorted([[len(reference.intersection(l)), name, l] for name, l in lists.items()],
        key=lambda x: (x[0], -len(x[2])), reverse=True)
    for matches, name, _ in found:
        print("Matches {} in {}".format(matches, name))
If you don't want to print the lines with zero matches, we just need an if statement:
    for matches, name, _ in found:
        if matches:
            print("Matches {} in {}".format(matches, name))
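The same idea can also be sketched as a function that returns its results instead of printing them, and that reports an unknown name explicitly. The name rank_matches, the skip_zero parameter and the small gene sets below are assumptions made for this illustration; they are not part of the code above.

```python
# Hypothetical, shortened gene sets keyed by uppercase name.
gene_sets = {
    "ABCC": {'TRIM29', 'ELOVL5', 'EGR1'},
    "ACC": {'PTEN', 'ELOVL5', 'ENSA', 'MYCBP2'},
    "BEC1": {'LMNA', 'ENSA', 'MYCBP2'},
    "ARN": {'NCAM1', 'NFIB'},
}


def rank_matches(ref_name, gene_sets, skip_zero=False):
    """Return (matches, name) pairs for every set, best match first."""
    key = ref_name.upper()  # 'bEc1' -> 'BEC1'
    if key not in gene_sets:
        raise KeyError("unknown list name: {!r}".format(ref_name))
    reference = gene_sets[key]
    scored = [(len(reference & genes), name, len(genes))
              for name, genes in gene_sets.items()]
    # Most matches first; among ties, the smaller set first.
    scored.sort(key=lambda x: (-x[0], x[2]))
    # Optionally drop the sets that share nothing with the reference.
    return [(m, name) for m, name, _ in scored if m or not skip_zero]


for matches, name in rank_matches('bEc1', gene_sets, skip_zero=True):
    print("Matches {} in {}".format(matches, name))
```

Returning the pairs keeps the ranking separate from the printing, so the caller decides how to display or filter the results.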
Answer 2 (score: 0)
In Python, variable names are case-sensitive.
If you define ACC = ['some', 'list'], you cannot refer to it as acc.
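A quick way to see this for yourself is the minimal sketch below; the variable name message is just one chosen here for the demonstration.

```python
ACC = ['some', 'list']

try:
    acc  # a different name from ACC; it was never assigned
except NameError as err:
    message = str(err)

# The message names the lowercase variable that could not be found.
print(message)
```

This is the same NameError the question ran into when the lowercase name was used.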