Question

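I wrote a script to parse some BLAST files from different samples. Because I want to know which genes all the samples have in common, I created a list and a dictionary to count them. I also produced a JSON file from the dictionary. Now I want to remove the genes with a count of less than 100 (that is the number of samples), either from the dictionary or from the JSON file, but I don't know how. This is part of the code: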

# to produce a dictionary with the genes and their repetitions
for extracted_gene in matches:
    if extracted_gene in matches_counts:
        matches_counts[extracted_gene]+=1
    else:
        matches_counts[extracted_gene]=1
print matches_counts #check point
#if matches_counts[extracted_gene]==100:
    #print extracted_gene
#to convert a dictionary into a txt file and format it with json

with open('my_gene_extraction_trial.txt', 'w') as file:
    json.dump(matches_counts,file, sort_keys=True, indent=2, separators=(',',':'))

print 'Parsing has finished'

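I have tried different approaches: a) omitting the else statement, but that gives me an empty dict; b) printing only the genes whose value is 100, but nothing is printed; c) reading the json documentation, but I can only see how to remove elements by object, not by value. Could someone help me with this? It's driving me crazy!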

Answer 1

Something like this should work:

# matches (list) and matches_counts (dict) already defined
for extracted_gene in matches:
    if extracted_gene in matches_counts:
        matches_counts[extracted_gene] += 1
    else:
        matches_counts[extracted_gene] = 1

print matches_counts #check point

# Create a copy of the dict of matches to remove items from
counts_100 = matches_counts.copy()

for extracted_gene in matches_counts:
    if matches_counts[extracted_gene] < 100: 
        del counts_100[extracted_gene] 

print counts_100
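
The copy matters because deleting keys from a dict while iterating over that same dict raises a RuntimeError. The same filter can also be written as a dict comprehension that builds the new dict directly. Here is a minimal sketch, written for Python 3 (the code above is Python 2); `filter_by_min_count`, `min_count`, and the sample data are hypothetical names made up for illustration:

```python
def filter_by_min_count(counts, min_count):
    """Return a new dict holding only the entries whose count is >= min_count."""
    return {gene: n for gene, n in counts.items() if n >= min_count}


# Hypothetical sample data standing in for matches_counts
matches_counts = {"geneA": 100, "geneB": 37, "geneC": 100, "geneD": 99}

# Keep only the genes seen in all 100 samples; matches_counts is left untouched
counts_100 = filter_by_min_count(matches_counts, 100)
print(counts_100)
```

Passing `counts_100` instead of `matches_counts` to `json.dump` would then write only the filtered genes to the file.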

Let me know if you still get errors.
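
If the JSON file has already been written, one way to remove entries by value is to load it back into a dict, filter that dict, and write the result out again. The sketch below assumes the file holds a single JSON object mapping gene names to counts, as produced by `json.dump(matches_counts, ...)` in the question. It is written for Python 3, and `filter_json_counts` and the demo file names are hypothetical:

```python
import json
import os
import tempfile


def filter_json_counts(src_path, dst_path, min_count):
    """Load a JSON object of gene -> count, drop entries below min_count, save the rest."""
    with open(src_path) as src:
        counts = json.load(src)
    kept = {gene: n for gene, n in counts.items() if n >= min_count}
    with open(dst_path, "w") as dst:
        json.dump(kept, dst, sort_keys=True, indent=2, separators=(",", ":"))
    return kept


# Demo on a throwaway file; in the question the input would be
# 'my_gene_extraction_trial.txt'.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "counts.json")
dst_path = os.path.join(workdir, "counts_filtered.json")
with open(src_path, "w") as f:
    json.dump({"geneA": 100, "geneB": 12, "geneC": 100}, f)

kept = filter_json_counts(src_path, dst_path, 100)
print(kept)
```

Writing to a separate output path keeps the original counts available in case the threshold needs to change later.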
