Wordnet query to return example sentences

Time: 2016-02-24 11:34:15

Tags: mysql sql words wordnet lexicon

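I have a use case where I have a word and need to know the following:

  1. The synonyms of the word (just the synonyms are sufficient)
  2. All senses of the word, where each sense contains: the synonyms matching the word in that sense, example sentences in that sense (if any), and the part of speech for that sense.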
Example: this query link, which shows the results for the word carry.

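For each sense, we have the part of speech (like V), the synonyms matching that sense (like transport in the first sense, pack, take in the second sense, and so on), and example sentences containing the word in that sense (This train is carrying nuclear waste, carry the suitcase to the car, etc. in the first sense, I always carry money, etc. in the second sense, and so on).

How do I do this from a Wordnet MySQL database? I ran this query, which returns the list of meanings for the word:

SELECT a.lemma, c.definition
FROM words a
INNER JOIN senses b ON a.wordid = b.wordid
INNER JOIN synsets c ON b.synsetid = c.synsetid
WHERE a.lemma = 'carry';

How do I get, for each sense, the example sentences, the part of speech, and the synonyms specific to that sense? I queried the vframesentences and vframesentencemaps tables, saw example sentences with placeholders like %s, and tried to match them with the words table based on the wordid column, but got wrong results.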

Edit:

For the word carry, if I run these queries, I correctly get the synonyms and the sense meanings:

1. select * from words where lemma='carry'; -- yields wordid 21354
2. select * from senses where wordid=21354; -- yields 41 synsetids, such as 201062889
3. select * from synsets where synsetid=201062889; -- yields the definition "serve as a means for expressing something"
4. select * from senses where synsetid=201062889; -- yields all matching synonyms for that sense as wordids, including "carry" itself, such as 21354, 29630, 45011
5. select * from words where wordid=29630; -- yields 'convey'

So what I need now is a way to find the example sentences for the word carry in each of the 41 senses. How do I do that?

1 answer:

Answer 0 (score: 2)

You can get the sentences from the samples table. E.g.:

SELECT sample FROM samples WHERE synsetid = 201062889;

yields:

  The painting of Mary carries motherly love

  His voice carried a lot of anger

So you could extend your query as follows:

SELECT 
    a.lemma AS `word`,
    c.definition,
    c.pos AS `part of speech`,
    d.sample AS `example sentence`,
    (SELECT 
            GROUP_CONCAT(a1.lemma)
        FROM
            words a1
                INNER JOIN
            senses b1 ON a1.wordid = b1.wordid
        WHERE
            b1.synsetid = b.synsetid
                AND a1.lemma <> a.lemma
        GROUP BY b.synsetid) AS `synonyms`
FROM
    words a
        INNER JOIN
    senses b ON a.wordid = b.wordid
        INNER JOIN
    synsets c ON b.synsetid = c.synsetid
        INNER JOIN
    samples d ON b.synsetid = d.synsetid
WHERE
    a.lemma = 'carry'
ORDER BY a.lemma , c.definition , d.sample;

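Note: The subselect with GROUP_CONCAT returns the synonyms of each sense as a comma-separated list in a single row, in order to cut down on the number of rows. You could consider returning these in a separate query instead (or as part of this query, but with everything else duplicated) if you prefer.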
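The "separate query" option mentioned in the note could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses Python's sqlite3 module with a tiny in-memory stand-in for the words, senses, synsets and samples tables (the IDs, the second definition and the third sense are made up, and the real tables have more columns), not the actual WordNet MySQL database. The function names senses_with_samples and synonyms_by_synset are hypothetical. One deliberate difference from the query above: samples is attached with a LEFT JOIN, on the assumption that senses without any example sentence should still be listed.

```python
import sqlite3

# Toy stand-in for the WordNet tables (hypothetical IDs and rows, SQLite not MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words   (wordid INTEGER PRIMARY KEY, lemma TEXT);
CREATE TABLE synsets (synsetid INTEGER PRIMARY KEY, pos TEXT, definition TEXT);
CREATE TABLE senses  (wordid INTEGER, synsetid INTEGER);
CREATE TABLE samples (synsetid INTEGER, sample TEXT);

INSERT INTO words VALUES (1, 'carry'), (2, 'convey'), (3, 'express'), (4, 'transport');
INSERT INTO synsets VALUES
    (100, 'v', 'serve as a means for expressing something'),
    (200, 'v', 'move while supporting'),
    (300, 'v', 'a made-up sense with no example sentence');
INSERT INTO senses VALUES (1, 100), (2, 100), (3, 100), (1, 200), (4, 200), (1, 300);
INSERT INTO samples VALUES
    (100, 'His voice carried a lot of anger'),
    (200, 'This train is carrying nuclear waste');
""")


def senses_with_samples(conn, lemma):
    """Query 1: one row per (sense, example sentence); sample is None if a sense has none."""
    sql = """
        SELECT b.synsetid, c.pos, c.definition, d.sample
        FROM words a
        INNER JOIN senses b  ON a.wordid = b.wordid
        INNER JOIN synsets c ON b.synsetid = c.synsetid
        LEFT JOIN samples d  ON b.synsetid = d.synsetid
        WHERE a.lemma = ?
        ORDER BY b.synsetid, d.sample
    """
    return conn.execute(sql, (lemma,)).fetchall()


def synonyms_by_synset(conn, lemma):
    """Query 2: map each synsetid of the word to the other lemmas in that synset."""
    sql = """
        SELECT b.synsetid, a1.lemma
        FROM words a
        INNER JOIN senses b  ON a.wordid = b.wordid
        INNER JOIN senses b1 ON b1.synsetid = b.synsetid
        INNER JOIN words a1  ON a1.wordid = b1.wordid
        WHERE a.lemma = ? AND a1.lemma <> a.lemma
        ORDER BY b.synsetid, a1.lemma
    """
    grouped = {}
    for synsetid, synonym in conn.execute(sql, (lemma,)):
        grouped.setdefault(synsetid, []).append(synonym)
    return grouped


# Combine the two result sets in application code, keyed by synsetid.
synonyms = synonyms_by_synset(conn, "carry")
for synsetid, pos, definition, sample in senses_with_samples(conn, "carry"):
    print(pos, definition, sample, synonyms.get(synsetid, []))
```

Keeping the two result sets apart means neither the sentences nor the synonyms are repeated; the cost is a second round trip and a small merge step in the calling code.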

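UPDATE: If you really need the synonyms as rows in the result, the following will do it, but I wouldn't recommend it: synonyms and example sentences both belong to a particular definition, so the whole set of synonyms is repeated for each example sentence. E.g. if a particular definition has 4 example sentences and 5 synonyms, the result will contain 4 x 5 = 20 rows for that definition alone.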

SELECT 
    a.lemma AS `word`,
    c.definition,
    c.pos AS `part of speech`,
    d.sample AS `example sentence`,
    subq.lemma AS `synonym`
FROM
    words a
        INNER JOIN
    senses b ON a.wordid = b.wordid
        INNER JOIN
    synsets c ON b.synsetid = c.synsetid
        INNER JOIN
    samples d ON b.synsetid = d.synsetid
        LEFT JOIN
    (SELECT 
        a1.lemma, b1.synsetid
    FROM
        senses b1
    INNER JOIN words a1 ON a1.wordid = b1.wordid) subq ON subq.synsetid = b.synsetid
        AND subq.lemma <> a.lemma
WHERE
    a.lemma = 'carry'
ORDER BY a.lemma , c.definition , d.sample;
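
To make the row multiplication concrete, here is a small sketch that runs essentially the same join against a toy data set. The assumptions are the same kind as before: an in-memory SQLite database via Python's sqlite3 module stands in for the WordNet MySQL database, the IDs are made up, and 'transmit' is a hypothetical third synonym added only to make the counts differ. With 2 example sentences and 3 synonyms for one definition, the query returns 2 x 3 = 6 rows, which then have to be de-duplicated in application code.

```python
import sqlite3

# Toy data (hypothetical): one definition with 2 example sentences and 3 synonyms.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words   (wordid INTEGER PRIMARY KEY, lemma TEXT);
CREATE TABLE synsets (synsetid INTEGER PRIMARY KEY, pos TEXT, definition TEXT);
CREATE TABLE senses  (wordid INTEGER, synsetid INTEGER);
CREATE TABLE samples (synsetid INTEGER, sample TEXT);

INSERT INTO words VALUES (1, 'carry'), (2, 'convey'), (3, 'express'), (4, 'transmit');
INSERT INTO synsets VALUES (100, 'v', 'serve as a means for expressing something');
INSERT INTO senses VALUES (1, 100), (2, 100), (3, 100), (4, 100);
INSERT INTO samples VALUES
    (100, 'The painting of Mary carries motherly love'),
    (100, 'His voice carried a lot of anger');
""")

# Same shape as the query above: every sample row is paired with every synonym row.
rows = conn.execute("""
    SELECT c.definition, d.sample, subq.lemma
    FROM words a
    INNER JOIN senses b  ON a.wordid = b.wordid
    INNER JOIN synsets c ON b.synsetid = c.synsetid
    INNER JOIN samples d ON b.synsetid = d.synsetid
    LEFT JOIN (SELECT a1.lemma, b1.synsetid
               FROM senses b1
               INNER JOIN words a1 ON a1.wordid = b1.wordid) subq
           ON subq.synsetid = b.synsetid AND subq.lemma <> a.lemma
    WHERE a.lemma = 'carry'
    ORDER BY c.definition, d.sample, subq.lemma
""").fetchall()

print(len(rows))  # 2 samples x 3 synonyms = 6 rows for a single definition

# Undo the duplication on the client side.
distinct_samples = sorted({sample for _, sample, _ in rows})
distinct_synonyms = sorted({lemma for _, _, lemma in rows})
print(distinct_samples)
print(distinct_synonyms)
```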