I analyzed 3 books using the Stanford NLP library. I ran the analysis page by page, and for each book this is the output I get:
// An array of length P, where P is the total number of pages in the book,
// so that pageSentiment[0] is the sentiment of page 1.
float[] pageSentiment;
// An array of length P, where P is the total number of pages in the book,
// so that pageWords[0] is the number of words on page 1.
int[] pageWords;
// An array of length W, where W is the number of unique words in the book.
// Each row of data is {chapter, page, count}. For example, bookWords[0] could hold:
// word = "then"
// data[0] = {1, 1, 2} => the word "then" occurs 2 times on page 1 (chapter 1)
// data[1] = {1, 2, 1} => the word "then" occurs 1 time on page 2 (chapter 1)
// data[2] = {1, 3, 0} => the word "then" occurs 0 times on page 3 (chapter 1)
// data[3] = {1, 4, 0} => the word "then" occurs 0 times on page 4 (chapter 1)
// data[4] = {2, 5, 3} => the word "then" occurs 3 times on page 5 (chapter 2)
// data[5] = ...
// data is a jagged array (one 3-element row per page), not a 3-D array.
struct WordData { public string word; public int[][] data; }
WordData[] bookWords;
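Now... I have to store all these results in an SQL database so that I can read them back to draw charts and statistics tables on a web page. What I'm trying to figure out is the right way to store all these values flexibly, so that I can easily send different queries to the database and get different outputs that fit my current needs. For example... I need to be able to: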
Any suggestions for the SQL table structure?
Answer 0 (score: 1)
Just 3 tables
book
---
book_id
title
...
word
---
word_id
text
...
and a many-to-many table that holds the results
word_2_book
---
word_id
book_id
page_no
chapter_no
word_count
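The three tables above can be sketched as runnable DDL. This is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The column types, the composite primary key, the `create_schema` and `load_word` helpers, and the sample book title are all assumptions for illustration; only the table and column names come from the answer.

```python
import sqlite3


def create_schema(conn):
    """Create the book, word and word_2_book tables (types are assumed)."""
    conn.executescript("""
        CREATE TABLE book (
            book_id INTEGER PRIMARY KEY,
            title   TEXT NOT NULL
        );
        CREATE TABLE word (
            word_id INTEGER PRIMARY KEY,
            text    TEXT NOT NULL UNIQUE
        );
        CREATE TABLE word_2_book (
            word_id    INTEGER NOT NULL REFERENCES word(word_id),
            book_id    INTEGER NOT NULL REFERENCES book(book_id),
            page_no    INTEGER NOT NULL,
            chapter_no INTEGER NOT NULL,
            word_count INTEGER NOT NULL,
            PRIMARY KEY (word_id, book_id, page_no)  -- assumed: one row per word per page
        );
    """)


def load_word(conn, book_id, word, data):
    """Store one WordData entry; data holds (chapter, page, count) triples."""
    conn.execute("INSERT OR IGNORE INTO word (text) VALUES (?)", (word,))
    word_id = conn.execute(
        "SELECT word_id FROM word WHERE text = ?", (word,)
    ).fetchone()[0]
    conn.executemany(
        "INSERT INTO word_2_book (word_id, book_id, page_no, chapter_no, word_count) "
        "VALUES (?, ?, ?, ?, ?)",
        [(word_id, book_id, page, chapter, count) for chapter, page, count in data],
    )
    return word_id


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO book (book_id, title) VALUES (1, 'Example Book')")

# The "then" example from the question: (chapter, page, count).
then_id = load_word(
    conn, 1, "then", [(1, 1, 2), (1, 2, 1), (1, 3, 0), (1, 4, 0), (2, 5, 3)]
)

rows = conn.execute(
    "SELECT page_no, chapter_no, word_count FROM word_2_book wb "
    "WHERE wb.book_id = ? AND wb.word_id = ? ORDER BY page_no",
    (1, then_id),
).fetchall()
print(rows)
```

Storing the zero-count rows mirrors the question's array exactly. Skipping them would shrink the table, at the cost of treating a missing row as a count of 0 in every query.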
Then simply
select *
from word_2_book wb
where wb.book_id=? and wb.word_id=?
You can apply any aggregate function to it.
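As an illustration of what such aggregates might look like, here is a small self-contained sketch using Python's sqlite3 module. The column types and the sample rows (the "then" counts from the question, with `word_id = 1` and `book_id = 1` chosen arbitrarily) are assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE word_2_book ("
    "word_id INTEGER, book_id INTEGER, page_no INTEGER, "
    "chapter_no INTEGER, word_count INTEGER)"
)
# Columns: (word_id, book_id, page_no, chapter_no, word_count)
conn.executemany(
    "INSERT INTO word_2_book VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 1, 2), (1, 1, 2, 1, 1), (1, 1, 3, 1, 0),
     (1, 1, 4, 1, 0), (1, 1, 5, 2, 3)],
)

params = (1, 1)  # book_id, word_id

# Total occurrences of the word in the whole book.
total = conn.execute(
    "SELECT SUM(word_count) FROM word_2_book wb "
    "WHERE wb.book_id = ? AND wb.word_id = ?",
    params,
).fetchone()[0]

# Occurrences per chapter.
per_chapter = conn.execute(
    "SELECT chapter_no, SUM(word_count) FROM word_2_book wb "
    "WHERE wb.book_id = ? AND wb.word_id = ? "
    "GROUP BY chapter_no ORDER BY chapter_no",
    params,
).fetchall()

# Number of pages on which the word appears at least once.
pages_with_word = conn.execute(
    "SELECT COUNT(*) FROM word_2_book wb "
    "WHERE wb.book_id = ? AND wb.word_id = ? AND wb.word_count > 0",
    params,
).fetchone()[0]

print(total, per_chapter, pages_with_word)
```

Because chapter and page live on every row, the same table serves page-level, chapter-level, and book-level statistics just by changing the `GROUP BY` clause.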