"question_id": 58640
"tags": ["polls", "fun", "quotes"]
"title": "Great programming quotes"
"question_id": 184618
"tags": ["polls", "fun", "comment"]
"title": "What is the best comment in source code you have ever encountered?"
"question_id": 3734102
"tags": ["c++", "linux", "exit-code"]
"title": "Why cant' I return bigger values from main function ?"
"question_id": 2349378
"tags": ["communication", "terminology", "vocabulary"]
"title": "New programming jargon you coined?"
"question_id": 3723817
"tags": ["open-source", "project-management", "failure", "fail"]
"title": "How to make an open source project fail"
"question_id": 3699150
"tags": ["testing", "interview-questions", "job-interview"]
"title": "Interview question please help"
这只是一个文本提取我使用SO API得到的一些问题。
为了使此查询能够,我想使用SQLite来存储数据。
我应该如何存储标记列?
由于SO的限制是五个标签,我可以使用五列tag1,tag2 ......,但我认为可以做更优雅的事情。可以扩展到任意数量标签的东西,也可以处理基本查询,如
select title from table where tag has "c++" and "boost" but not "c"
答案 0 :(得分:2)
这是一个多对多的关系:问题有多个标签,标签可以出现在多个问题中。这意味着您必须创建三个表,一个用于问题,一个用于标记,另一个用于表格之间的链接。生成的查询如下所示:
SELECT title FROM question
INNER JOIN question_tag_link USING (question_id)
INNER JOIN tag USING (tag_id)
WHERE tag_name IN('c++', 'boost')
AND NOT EXISTS(
SELECT * FROM tag t1
WHERE t1.tag_name = 'c'
AND t1.question_id = question.question_id);
不是那么简单,但我认为如果你不想受到限制,这是付出的代价。如果少于64个不同的标签,您可以使用SET字段类型,但是您会失去很大的灵活性(很难添加新标签)。
答案 1 :(得分:1)
select distinct a.QuestionTitle
from
(
select q.QuestionID, QuestionTitle, TagName
from QuestionTags as x
join Question as q on q.QuestionID = x.QuestionID
join Tag as t on t.TagID = x.TagID
where TagName in ('c++', 'boost')
) as a
left join
(
select q.QuestionID, QuestionTitle, TagName
from QuestionTags as x
join Question as q on q.QuestionID = x.QuestionID
join Tag as t on t.TagID = x.TagID
where TagName = 'c'
) as b on b.QuestionID = a.QuestionID
where b.QuestionTitle is null
order by a.QuestionTitle ;