Prolog logic gate aggregation: optimizing recursion

Date: 2014-06-09 12:40:43

Tags: prolog query-optimization aggregation

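I am trying to implement a logic-gate style aggregation operation, and I am having trouble writing an implementation that runs in a reasonable amount of time. I think my logic works, but it is slow and I don't think it needs to be. I think it should be possible to do this without so many findall calls, or by using cuts.

I have a table with about 10,000 columns and 70 rows. Rows correspond to samples and columns correspond to probes. Every value in the table is 1 or 0 (the state of the probe in the sample).

Multiple probes code for one protein (a many-to-one relationship), so I want to aggregate the probe columns into protein columns with a logical OR.

On top of this, some proteins are part of protein complexes or protein sets. Besides containing proteins, both protein complexes and protein sets can contain other protein complexes or protein sets, so the relationship can be recursive. I want to model protein sets as OR gates and protein complexes as AND gates. Collectively, I will refer to proteins, sets and complexes as "entities".

In summary, I want a fast predicate that I can ask whether a protein or entity is on or off in a sample.

If any of the other predicates are unclear, I can explain what they do.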

protein(Sample, Reactome_Id, State):-
    setof(Sample, Probe^samples(Sample, Probe, ProbeValue), Samples), 
    %samples/3 is a set of facts that correspond to the described table
    member(X, Samples), %used to generate Sample Id's %this seems wasteful 
    protein_reactome_Id_to_Uniprot_Id(Reactome_Id, UniprotId), % a set of facts matching two types of id
    %used to generate uniprot ids
    findall(Value, uniProt_Sample_Probes(UniprotId,X,_,Value),Vs),
    Vs = [_|_],     %Check list is not empty already
    delete(Vs,0,ListOfOnes),
    (ListOfOnes=[]-> (State is 0, write('OFF'));(State is 1,write('ON'))).
    %As this is an OR, I think I should be able to find a single 1 and cut for the on case, and otherwise say it is off.

%if a (simple) entity is a protein set and its state is on
%this is a base case where an entity does not have complexes or sets inside it
state_of_entity(Entity,State,Sample):-
    all_children_proteins(Entity), %checks that all children are of type protein
    type(Entity, protein_set),
    child_component(Entity,Child), %generates the children of an entity
    protein(Sample,Child,1),
    State is 1,!.

%if a (simple) entity is a protein set and its state is off
%this is a base case where an entity does not have complexes or sets inside it
%I find all proteins for a sample, this is a list of values, I delete all the
%zeros and the remaining list will unify with the empty list.
state_of_entity(Entity,State,Sample):-
    all_children_proteins(Entity),
    type(Entity, protein_set),
    child_component(Entity,Child),
    bagof(Value, Value^protein(Sample,Child,Value),Vs),
    delete(Vs,0,ListOfOnes),ListOfOnes=[],
    State is 0,!.

%if a (simple) entity is a complex and is off
%this is a base case where an entity does not have complexes or sets inside it
state_of_entity(Entity,State,Sample):-
    all_children_proteins(Entity),
    type(Entity, complex),
    child_component(Entity,Child),
    protein(Sample,Child,0),
    State is 0,!.

%if a (simple) entity is a complex and is on.
%this is a base case where an entity does not have complexes or sets inside it
%I find all proteins in a sample, this is a list of values, I delete all the
%ones and the remaining list will unify with the empty list.
state_of_entity(Entity,State,Sample):-
    all_children_proteins(Entity),
    type(Entity, complex),
    child_component(Entity,Child),
    bagof(Value, Value^protein(Sample,Child,Value),Vs),
    delete(Vs,1,ListOfZeros),ListOfZeros=[],
    State is 1,!.

%if a complex with components is off
%recursive case
state_of_entity(Entity,State,Sample):-
    type(Entity, complex),
    child_component(Entity,Child),
    (state_of_entity(Child,0,Sample);
    protein(Sample,Child,0)), %if it has any proteins as input as well as other components
    State is 0,!.

%if a complex with components is on
%recursive case
state_of_entity(Entity,State,Sample):-
    type(Entity, complex),
    child_component(Entity,Child),
    bagof(Value, Value^state_of_entity(Child,Value,Sample),Vs),%if it has component inputs
    bagof(Value2, Value2^protein(Sample,Child,Value2),Vs2),%if it has protein inputs
    append(Vs, Vs2, Vs3),
    delete(Vs3,1,ListOfZeros),ListOfZeros=[],%delete all the ones, the list of zeros will be empty if all inputs are on
    State is 1,!.

%if a protein set with components is on
%recursive case
state_of_entity(Entity,State,Sample):-
    type(Entity, protein_set),
    child_component(Entity,Child),
    (state_of_entity(Child,1,Sample);
    protein(Sample,Child,1)), %if it has any proteins as input as well as other entities
    State is 1,!.

%if a protein set with components is off
%recursive case
state_of_entity(Entity,State,Sample):-
    type(Entity, protein_set),
    child_component(Entity,Child),
    bagof(Value, Value^state_of_entity(Child,Value,Sample),Vs), %if it has entity inputs
    bagof(Value2, Value2^protein(Sample,Child,Value2),Vs2), %if it has protein inputs
    append(Vs, Vs2, Vs3), %join the list of inputs together
    delete(Vs3,0,ListOfOnes),ListOfOnes=[], %delete all the zeros, the list of 1's will be empty if all inputs are off
    State is 0,!.

Update: I ended up with this, as the protein part now works the way I want it to.

samples(Samples):-
    setof(Sample_in, Probe^samples(Sample_in, Probe, ProbeValue), Samples).
sample(Sample):-
    once(samples(Samples)), %why do I need this?!
    member(Sample, Samples).

protein_stack(Sample, Reactome_Id, State):-
        (
            protein_reactome_Id_to_Uniprot_Id(Reactome_Id, UniprotId),
            uniProt_Sample_Probes(UniprotId, Sample, Probe, 1),
            !,
            State is 1
        ;
            State is 0
        ).

protein_good(Sample, Reactome_Id,State):-
    sample(Sample), 
    protein_reactome_Id_to_Uniprot_Id(Reactome_Id, _),
    protein_stack(Sample, Reactome_Id,State).

1 Answer:

Answer 0 (score: 2)

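Let's take the first rule, protein/3.

  • Is the relationship between Reactome_Id and UniprotId unique? If so, move it before setof(Sample ...), member(X, Samples) and put a cut after it. Otherwise, you try to satisfy it for every result of setof(...), member(X, Samples). More importantly, green cuts help performance.

  • The rule has one purpose: to see whether at least one value in Vs is 1. You should not generate all members of Vs and then search for the value 1; instead, stop at the first match found while satisfying uniProt_Sample_Probes(UniprotId, X, _, Value).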

    protein(Sample, Reactome_Id, State):-
            (
                protein_reactome_Id_to_Uniprot_Id(Reactome_Id, UniprotId),
                setof(Sample, Probe^samples(Sample, Probe, ProbeValue), Samples), 
                member(X, Samples),
                uniProt_Sample_Probes(UniprotId, X, _, 1),
                !,
                State is 1, write('ON')
            ;
                State is 0, write('OFF')
            ).
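As a rough, self-contained sketch of both bullets, assuming the Reactome-to-UniProt mapping really is unique: the facts below are toy data invented for illustration, and sample/1 and protein_state/3 are hypothetical names rather than predicates from the question. The if-then-else is used as a local form of the same "commit on the first 1" idea. The setof/3 call quantifies both Probe and Value; in the question's version ProbeValue is left free, so setof/3 can return one list per distinct probe value on backtracking, which may be why once/1 seemed necessary there.

```prolog
:- use_module(library(lists)).

% Toy facts, invented for illustration only.
% samples(Sample, Probe, Value)
samples(s1, p1, 0).  samples(s1, p2, 1).  samples(s1, p3, 0).
samples(s2, p1, 0).  samples(s2, p2, 0).  samples(s2, p3, 1).

protein_reactome_Id_to_Uniprot_Id(r1, u1).
protein_reactome_Id_to_Uniprot_Id(r2, u2).

% uniProt_Sample_Probes(UniprotId, Sample, Probe, Value)
uniProt_Sample_Probes(u1, s1, p1, 0).
uniProt_Sample_Probes(u1, s1, p2, 1).
uniProt_Sample_Probes(u2, s1, p3, 0).
uniProt_Sample_Probes(u1, s2, p1, 0).
uniProt_Sample_Probes(u1, s2, p2, 0).
uniProt_Sample_Probes(u2, s2, p3, 1).

% sample(?Sample): hypothetical helper that enumerates each sample id once.
% Probe and Value are both existentially quantified, so setof/3
% produces a single sorted list rather than one list per Value.
sample(Sample) :-
    setof(S, Probe^Value^samples(S, Probe, Value), Samples),
    member(Sample, Samples).

% protein_state(?Sample, +Reactome_Id, -State): hypothetical name.
% State is 1 if any probe of the protein is 1 in Sample (logical OR), else 0.
protein_state(Sample, Reactome_Id, State) :-
    % Resolve the id first; once/1 is only valid if the mapping is unique.
    once(protein_reactome_Id_to_Uniprot_Id(Reactome_Id, UniprotId)),
    sample(Sample),
    (   uniProt_Sample_Probes(UniprotId, Sample, _, 1)
    ->  State = 1       % stops at the first probe that is on
    ;   State = 0
    ).
```

With these toy facts, the query `?- protein_state(s1, r1, State).` gives State = 1, and `?- protein_state(s2, r1, State).` gives State = 0. Unlike the original rule, this sketch reports 0 for a protein that has no probes at all in the sample; add an explicit check if that case should fail instead.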
    

The other rules can be optimized using the same pattern:

state_of_x(X, State) :- Goal, !, State = 1.
state_of_x(X, State) :- State = 0.

Or, more concisely,

state_of_x(X, 1) :- Goal, !.
state_of_x(X, 0).
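One way this pattern might carry over to the recursive entity rules is sketched below. It is not the asker's code: entity_state/3 and gate_state/4 are hypothetical names, the facts are toy data, type(Entity, protein) is an assumed fact for leaf proteins, and protein/3 is assumed to hold already aggregated protein states. The sketch unifies State after the cut, as in the first form above, because the concise form would also succeed for a call with State already bound to 0 even when Goal holds.

```prolog
% Toy facts, invented for illustration only.
type(pa, protein).
type(pb, protein).
type(pc, protein).
type(set1, protein_set).
type(cplx1, complex).

child_component(set1, pa).
child_component(set1, pb).
child_component(cplx1, set1).
child_component(cplx1, pc).

% protein(Sample, Protein, State): assumed to be aggregated from probes already.
protein(s1, pa, 0).  protein(s1, pb, 1).  protein(s1, pc, 0).
protein(s2, pa, 1).  protein(s2, pb, 0).  protein(s2, pc, 1).

% entity_state(+Entity, +Sample, ?State): hypothetical stand-in for state_of_entity/3.
entity_state(Entity, Sample, State) :-
    type(Entity, Type),
    gate_state(Type, Entity, Sample, State).

% Leaf: look the protein state up directly.
gate_state(protein, Entity, Sample, State) :-
    protein(Sample, Entity, State).

% OR gate: on as soon as one child is on, otherwise off.
gate_state(protein_set, Entity, Sample, State) :-
    child_component(Entity, Child),
    entity_state(Child, Sample, 1),
    !,
    State = 1.
gate_state(protein_set, _, _, 0).

% AND gate: off as soon as one child is off, otherwise on.
gate_state(complex, Entity, Sample, State) :-
    child_component(Entity, Child),
    entity_state(Child, Sample, 0),
    !,
    State = 0.
gate_state(complex, _, _, 1).
```

With the toy facts, `?- entity_state(set1, s1, State).` gives State = 1 and `?- entity_state(cplx1, s1, State).` gives State = 0, without building any intermediate lists. Note that under this sketch a complex with no children counts as on and a set with no children counts as off, which may or may not be the behaviour you want.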