Question

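I am running into some problems while trying to extract information from sentences annotated with Stanford NLP. I am new to NLP libraries, as well as to Protégé! This is the text I am parsing:

String myTranslation = "Patient coming from the emergency room, already ventilated by garat (PS18, PEEP5, FiO2 0.6), is admitted. Monitoring blood pressure 77/40, heart rate 113 and undetectable SPO2. Ventilated in vacuum and then proceeded to endotracheal intubation after administration of: Midazolan 8mg, Fentanest 100, Nimbex 10mg. Tube diameter 7,5 without holes. Aspirated secretions from TET, placed on mechanical ventilation VT 450ml, FR 14, FiO2 0.6. Proceeded to ultrasound-guided cannulation of the right internal jugular vein and a chest X-ray was requested.";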

This is the algorithm I am using right now:

List<CoreMap> sentences = processor.getAnnotatedSentences(text);
    if (sentences != null && ! sentences.isEmpty()) {
        for(CoreMap sentence : sentences)
        {
            System.out.println("The first sentence is:");
            System.out.println(sentence.toShorterString());
            SemanticGraph sg = sentence.get(SemanticGraphCoreAnnotations.BasicDependenciesAnnotation.class);
            for(IndexedWord iWord: sg.vertexListSorted())
            {
                String value = iWord.originalText();
                String pos = iWord.get(CoreAnnotations.PartOfSpeechAnnotation.class);
                //When creating an entity frame I consider only verbs and nouns
                if(!(pos.startsWith("NN")||pos.startsWith("VB")))
                    continue;
                //Here I check whether the term I am analyzing is a medical term (by looking it up in the UMLS Metathesaurus)
                List<Entity> entities = matcher.getEntities(value);
                Set<String> set = null;
                //for (Ev ev: entities.get(0).getEvSet()) {
                //   set = el.getSemanticTypeSet(ev.getConceptInfo().getCUI());
                //}
                //String semanticType = set.toString();
                if(entities.isEmpty())
                    continue;
                else
                {
                    //then I collect the typed dependencies
                    Collection<TypedDependency> deps = sg.typedDependencies();
                    for(TypedDependency dep: deps)
                    {
                        System.out.println(dep.toString());
                    }
                    Frame frame = new Frame(value,"",pos);
                    //And fetch the other words I think I should relate to the one I am analyzing: for instance, if I am analyzing "patient", looking at the sentence, I should associate with it: emergency room, ventilated, garat(...)
                    Set<IndexedWord> descWords = sg.descendants(iWord);
                    //Here I try to merge compound words like "emergency" and "room" into a single one
                    addCompoundNamesToFrame("compound", frame, descWords, deps);
                    //addCompoundNamesToFrame("nummod", frame, descWords, deps);
                    for(IndexedWord desc : descWords)
                    {
                        if(desc.originalText().equals(value))
                            continue;
                        //"Stopword removal"
                        if(!filter(desc))
                        {
                            /*List<Entity> subEntities = matcher.getEntities(desc.word());
                            if(subEntities.isEmpty())
                                continue;*/
                            // I add the extracted information to the frame
                            frame.addInfo(desc.originalText(), null);
                        }
                    }
                    System.out.println("Frame:"+frame);
                }

             }
        }
    }
private void addCompoundNamesToFrame(String relName, Frame frame, Set<IndexedWord> descWords, Collection<TypedDependency> deps)
{
    for(TypedDependency dep : deps)
    {
        if(dep.reln().getShortName().equals(relName))
        {
            IndexedWord d = dep.dep();
            IndexedWord g = dep.gov();
            if(!filter(g)&&!filter(d))
            {
                frame.addInfo(d.originalText()+" "+g.originalText(), null);
            }
            descWords.remove(d);
            descWords.remove(g);
        }
    }
}

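Now, let me show my problem using the output for the first sentence: Frame: Term: patient, [Info: emergency room, Info: coming, Info: ventilated, Info: garat, Info: (, Info: PS18, Info: PEEP5, Info: FiO2, Info: 0.6, Info: )] Frame: Term: emergency, [Info: emergency room]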

My first problem is that I would like garat (PS18, PEEP5, FiO2 0.6) to be treated as a single word in the dependencies, and I don't know how to do that.
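One possible direction for this, as a minimal sketch only: instead of changing the parse, fold each parenthesised run into the token that precedes it after the words have been collected. The class `ParenGrouper` and its method `group` are hypothetical names, and the sketch works on plain strings rather than on `IndexedWord` objects; it assumes the token texts contain literal "(" and ")" in sentence order, as the frame output above suggests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CoreNLP.
final class ParenGrouper {
    // Folds each "( ... )" run into the token that precedes it.
    // Nested parentheses are not handled: the first ")" closes the group.
    static List<String> group(List<String> tokens) {
        List<String> out = new ArrayList<>();
        StringBuilder buf = null; // non-null while inside parentheses
        for (String t : tokens) {
            if (buf == null && t.equals("(") && !out.isEmpty()) {
                // Start a group and pull the previous token into it
                buf = new StringBuilder(out.remove(out.size() - 1)).append(" (");
            } else if (buf != null && t.equals(")")) {
                out.add(buf.append(")").toString());
                buf = null;
            } else if (buf != null) {
                // No space right after "(" and none before ","
                char last = buf.charAt(buf.length() - 1);
                if (last != '(' && !t.equals(",")) {
                    buf.append(' ');
                }
                buf.append(t);
            } else {
                out.add(t);
            }
        }
        if (buf != null) {
            out.add(buf.toString()); // unbalanced "(": keep what was collected
        }
        return out;
    }
}
```

Applied to the tokens of the first sentence, this would turn garat, (, PS18, ",", PEEP5, ",", FiO2, 0.6, ) into the single entry "garat (PS18, PEEP5, FiO2 0.6)", which could then be added to the frame as one piece of information. Whether the words need to be sorted by index first depends on how they were collected, since a set of descendants has no guaranteed order.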

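Now, for the second sentence, I get this: Frame: Term: intubation, [Info: endotracheal] Frame: Term: administration, [Info: :, Info: Midazolan, Info: ,, Info: 8mg, Info: Fentanest, Info: 100, Info: Nimbex, Info: 10mg] Frame: Term: Fentanest, [Info: 100] Frame: Term: Nimbex, [Info: 10mg]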

From these dependencies:

 BasicDependencies=-> Ventilated/VBN (root)
  -> vacuum/NN (nmod)
    -> in/IN (case)
  -> and/CC (cc)
  -> proceeded/VBD (conj)
    -> then/RB (advmod)
    -> intubation/NN (nmod)
      -> to/TO (case)
      -> endotracheal/JJ (amod)
    -> administration/NN (nmod)
      -> after/IN (case)
      -> 8mg/NN (nmod)
        -> of/IN (case)
        -> :/: (punct)
        -> Midazolan/JJ (amod)
        -> ,/, (punct)
        -> Fentanest/NN (conj)
          -> 100/CD (nummod)
        -> ,/, (punct)
        -> Nimbex/NNP (conj)
          -> 10mg/CD (nummod)
  -> ./. (punct)

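What I would like to get instead is: Frame: Term: intubation, [Info: endotracheal intubation, Info: Midazolan 8mg, Info: Fentanest 100, Nimbex 10mg]. So I would like to combine the names of the drugs with their dosages and treat them as relevant terms for the appropriate word! Also, the way I combine compound words works fine, but sometimes I need to combine a compound with a nummod, and I don't know how to re-process the merged word and add it back to the collection as a single word so that I can attach more information to it. For example: Frame: Term: pressure, [Info: blood pressure, Info: heart rate, Info: 77/40, Info: ,, Info: 113, Info: and, Info: undetectable, Info: SPO2, Info: Monitoring] "Blood pressure" is the combination of the two tokens "blood" and "pressure"; after I merge them to add a single piece of information to my frame, I need to re-process it to add the related 77/40 nummod information, but I don't know how!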
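As a rough illustration of the kind of merge being asked for, here is a minimal sketch that builds one phrase per head word by collecting its direct dependents over several relations at once, so that a compound and a nummod end up in the same phrase without a second pass. `Token`, `Dep`, `PhraseMerger`, and `phraseFor` are hypothetical stand-ins for CoreNLP's `IndexedWord` and `TypedDependency`, used only so the example runs without the library; the choice of `compound`, `amod`, and `nummod` as the relations to merge is also an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in for IndexedWord: position in the sentence plus text.
final class Token {
    final int index;
    final String text;
    Token(int index, String text) { this.index = index; this.text = text; }
}

// Hypothetical stand-in for TypedDependency: governor, dependent, short relation name.
final class Dep {
    final Token gov;
    final Token dep;
    final String rel;
    Dep(Token gov, Token dep, String rel) { this.gov = gov; this.dep = dep; this.rel = rel; }
}

final class PhraseMerger {
    // Relations whose dependents are folded into the head's phrase (an assumption).
    static final Set<String> MERGE_RELS =
            new HashSet<>(Arrays.asList("compound", "amod", "nummod"));

    // Returns the head plus its direct dependents over MERGE_RELS, in sentence order.
    // Indices of merged dependents are recorded in `consumed` so the caller can
    // skip them when adding the remaining descendants to the frame.
    static String phraseFor(Token head, List<Dep> deps, Set<Integer> consumed) {
        TreeMap<Integer, String> parts = new TreeMap<>(); // sorted by token index
        parts.put(head.index, head.text);
        for (Dep d : deps) {
            if (d.gov.index == head.index && MERGE_RELS.contains(d.rel)) {
                parts.put(d.dep.index, d.dep.text);
                consumed.add(d.dep.index);
            }
        }
        return String.join(" ", parts.values());
    }
}
```

With the dependencies listed above, calling `phraseFor` on 8mg (amod Midazolan) would give "Midazolan 8mg", on Fentanest (nummod 100) "Fentanest 100", and on intubation (amod endotracheal) "endotracheal intubation". If the parser attaches 77/40 to pressure as a nummod, a single call on pressure would give "blood pressure 77/40", which avoids re-processing the merged compound. Linking the drug phrases to intubation would still require walking the nmod and conj edges, which this sketch does not attempt.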

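Now, I do realize that this is not trivial at all. I have shown you my example so that you can understand my problem; I am not asking you to solve it completely, but even some references, examples, or pointers would be very useful!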

Building a Protégé frame from NLP data

0 answers: