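I wrote a script to parse some BLAST files from different samples. Because I want to know which genes are present in all of the samples, I created a list and a dictionary to count them. I also generated a JSON file from the dictionary. Now I want to remove the genes whose count is less than 100, since that is the number of samples, either from the dictionary or from the JSON file, but I don't know how. This is part of the code: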
import json

# Build a dictionary of the genes and how many times each one occurs
for extracted_gene in matches:
    if extracted_gene in matches_counts:
        matches_counts[extracted_gene] += 1
    else:
        matches_counts[extracted_gene] = 1
print(matches_counts)  # checkpoint
#if matches_counts[extracted_gene]==100:
    #print extracted_gene
# Write the dictionary to a text file, formatted as JSON
with open('my_gene_extraction_trial.txt', 'w') as file:
    json.dump(matches_counts, file, sort_keys=True, indent=2, separators=(',', ':'))
print('Parsing has finished')
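I have tried different approaches: a) leaving out the else statement, but that gives me an empty dict; b) printing only the genes whose value is 100, but nothing gets printed; c) reading the json documentation, but I could only see how to delete elements by object, not by value. Could someone help me with this? It is driving me crazy!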
Answer 0 (score: 1)
It should look something like this:
# matches (list) and matches_counts (dict) already defined
for extracted_gene in matches:
    if extracted_gene in matches_counts:
        matches_counts[extracted_gene] += 1
    else:
        matches_counts[extracted_gene] = 1
print(matches_counts)  # checkpoint

# Create a copy of the dict of matches to remove items from
counts_100 = matches_counts.copy()
for extracted_gene in matches_counts:
    if matches_counts[extracted_gene] < 100:
        del counts_100[extracted_gene]
print(counts_100)
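The same filter can also be written without copying and deleting. This is only a sketch, assuming `matches` is a flat list of gene names; the sample data and the `min_count` value of 3 are made up for illustration (the question would use 100):

```python
from collections import Counter

# Hypothetical data standing in for the real `matches` list.
matches = ["geneA", "geneB", "geneA", "geneC", "geneA", "geneB", "geneB"]
min_count = 3  # illustrative threshold; the question would use 100

# Counter does the same tallying as the if/else loop.
matches_counts = Counter(matches)

# Build a new dict holding only the genes seen at least `min_count` times.
kept = {gene: n for gene, n in matches_counts.items() if n >= min_count}

print(kept)
```

Building a new dict sidesteps the problem of changing a dictionary while looping over it, which is why the loop version works on a copy.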
Let me know if you still get errors.
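As for the JSON file: the json module only reads and writes data, so there is no call for deleting entries by value. The simplest route is probably to filter the dictionary before dumping it, or to load an existing file, filter it, and write it back. A rough sketch of the second route, where the file content, the temporary path, and the `min_count` threshold are all invented for the example:

```python
import json
import os
import tempfile

min_count = 3  # illustrative threshold; the question would use 100

# Write a small stand-in file so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), "my_gene_extraction_trial.txt")
with open(path, "w") as handle:
    json.dump({"geneA": 3, "geneB": 1, "geneC": 5}, handle)

# Load the counts back into a plain dict.
with open(path) as handle:
    counts = json.load(handle)

# Keep only the entries whose count reaches the threshold.
filtered = {gene: n for gene, n in counts.items() if n >= min_count}

# Overwrite the file with the filtered counts, using the original formatting options.
with open(path, "w") as handle:
    json.dump(filtered, handle, sort_keys=True, indent=2, separators=(",", ":"))
```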