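SQL: how to query a set of data and count how many strings match a given list of strings

Time: 2015-10-29 01:44:40

Tags: sql postgresql

I built a simple question-and-answer system.

In my database, there are three tables: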

question (
  id         int
  question   varchar(200)
  answer_id  int  /* foreign key mapping to answer.id */
);

answer (
  id      int
  answer  varchar(500)
)

question_elements (
    id           int
    seq          int          /* position of the word within the question */
    question_id  int          /* foreign key mapping to question.id */
    vocabulary   varchar(40)
)

Now I have a question:

What approach should a company adopt when its debt ratio is higher than 50% and wanna continue to get funding ?

So in the question table, the record is:

question {
  id: 1,
  question:"What approach should a company adopt when its debt ratio is higher than 50% and wanna continue to get funding ?",
  answer_id:1
}

And in the question_elements table, the records are:

question_elements [
  {
    id: 1,
    seq: 1,
    question_id: 1,
    vocabulary: "what"
  },
  {
    id: 2,
    seq: 2,
    question_id: 1,
    vocabulary: "approach"
  },
  {
    id: 3,
    seq: 3,
    question_id: 1,
    vocabulary: "should"
  },
  {
    id: 4,
    seq: 4,
    question_id: 1,
    vocabulary: "a"
  },
  {
    id: 5,
    seq: 5,
    question_id: 1,
    vocabulary: "company"
  },
  {
    id: 6,
    seq: 6,
    question_id: 1,
    vocabulary: "adopt"
  },
  {
    id: 7,
    seq: 7,
    question_id: 1,
    vocabulary: "when"
  },
  ....
  ....
  {
    id: 19,
    seq: 19,
    question_id: 1,
    vocabulary: "get"
  },
  {
    id: 20,
    seq: 20,
    question_id: 1,
    vocabulary: "funding"
  }
]

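My idea is to split the sentence the user enters into a list of strings, then run a SQL query that counts how many of those strings match entries in the question_elements table.

What would the SQL statement be in PostgreSQL?

2 answers:

Answer 0: (score: 0)

If I understand correctly, you want something like this: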

WITH answer AS (
    SELECT 
        'What action does a company should do when it wanna get more funding' AS a
),
question AS (
    SELECT 'what' AS q
    UNION ALL SELECT 'should'
    UNION ALL SELECT 'a'
    UNION ALL SELECT 'company'
    UNION ALL SELECT 'do'
    UNION ALL SELECT 'when'
)
SELECT COUNT(result)
FROM (
    SELECT unnest(string_to_array(CAST(a AS VARCHAR),' ')) AS result
    FROM answer
) AS tbaux
WHERE result IN (select CAST(q AS VARCHAR) FROM question);

The same query ignoring letter case, with some explanation:

-- relies on the same WITH answer AS (...), question AS (...) definitions as the query above
SELECT COUNT(result)                                                        -- count the rows of the subquery that pass the WHERE filter
FROM (
    SELECT unnest(string_to_array(CAST(a AS VARCHAR),' ')) AS result        -- split the user input on spaces, one word per row
    FROM answer
) AS tbaux                                                                  -- alias for the subquery
WHERE upper(result) IN (select upper(CAST(q AS VARCHAR)) FROM question);    -- upper() makes the comparison case-insensitive; only matching rows reach COUNT()

This counts how many words from the user input appear in the question table (question_elements in your case).

http://sqlfiddle.com/#!15/9eecb7db59d16c80417c72d1e1f4fbf1/4095/0
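The same idea might be applied directly to question_elements along the following lines. This is only a sketch under assumptions: the temporary table stands in for the real one, it holds just the nine rows shown in the question (the elided rows are left out), the input sentence is hard-coded where real code would bind a parameter, and the names input_words, raw_word and matched_words are made up for illustration.

```sql
-- Stand-in for the real table, loaded only with the rows shown in the question.
CREATE TEMP TABLE question_elements (
    id          int PRIMARY KEY,
    seq         int,
    question_id int,
    vocabulary  varchar(40)
);

INSERT INTO question_elements (id, seq, question_id, vocabulary) VALUES
    (1,  1,  1, 'what'),
    (2,  2,  1, 'approach'),
    (3,  3,  1, 'should'),
    (4,  4,  1, 'a'),
    (5,  5,  1, 'company'),
    (6,  6,  1, 'adopt'),
    (7,  7,  1, 'when'),
    (19, 19, 1, 'get'),
    (20, 20, 1, 'funding');

-- Split the (hard-coded, illustrative) user input on spaces,
-- lowercase each word and drop duplicates and empty strings.
WITH input_words AS (
    SELECT DISTINCT lower(raw_word) AS word
    FROM unnest(string_to_array(
        'What action does a company should do when it wanna get more funding',
        ' '
    )) AS t(raw_word)
    WHERE raw_word <> ''
)
-- For every stored question, count the distinct words that also occur in the input.
SELECT qe.question_id,
       COUNT(DISTINCT lower(qe.vocabulary)) AS matched_words
FROM question_elements AS qe
JOIN input_words AS iw
  ON iw.word = lower(qe.vocabulary)
GROUP BY qe.question_id
ORDER BY matched_words DESC;
```

With only these nine sample rows loaded, the query returns one row: question_id 1 with matched_words 7 (what, should, a, company, when, get, funding). Grouping by question_id means the best-matching question comes first once more questions are stored.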

Answer 1: (score: 0)

There is no need for the question_elements table.
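One way to do without question_elements, sketched here under assumptions rather than taken from this answer, is to split question.question into words at query time. The temporary table and its single row mirror the record shown in the question; the input sentence is hard-coded for illustration, the names input_words, question_words, raw_word and matched_words are hypothetical, and LATERAL requires PostgreSQL 9.3 or later.

```sql
-- Stand-in for the real question table, holding the one record from the question.
CREATE TEMP TABLE question (
    id        int PRIMARY KEY,
    question  varchar(200),
    answer_id int
);

INSERT INTO question (id, question, answer_id) VALUES
    (1, 'What approach should a company adopt when its debt ratio is higher than 50% and wanna continue to get funding ?', 1);

WITH input_words AS (
    -- Words of the (illustrative) user input, lowercased and de-duplicated.
    SELECT DISTINCT lower(raw_word) AS word
    FROM unnest(string_to_array(
        'What action does a company should do when it wanna get more funding',
        ' '
    )) AS t(raw_word)
    WHERE raw_word <> ''
),
question_words AS (
    -- Words of every stored question, split on the fly instead of read from question_elements.
    SELECT DISTINCT q.id AS question_id, lower(w.raw_word) AS word
    FROM question AS q
    CROSS JOIN LATERAL unnest(string_to_array(q.question, ' ')) AS w(raw_word)
    WHERE w.raw_word <> ''
)
-- Count the distinct words each question shares with the input.
SELECT qw.question_id,
       COUNT(*) AS matched_words
FROM question_words AS qw
JOIN input_words AS iw
  ON iw.word = qw.word
GROUP BY qw.question_id
ORDER BY matched_words DESC;
```

Splitting on a single space is naive: tokens such as "50%" and "?" are treated as words, and punctuation attached to a word would prevent a match. Splitting at query time also rescans every question text on each call, which is the cost of dropping the precomputed table.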