I have a dataset of word triples and their frequencies, for example:
w1 w2 w3 freq
a a a 4
a a and 3
a a band 1
a a well 1
a and a 2
For each observation, I want to derive counts according to the following table:
              (w3)    not(w3)
(w1,w2)       n1      n2
not(w1,w2)    n3      n4
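where n1, ..., n4 are the sums of the frequencies of the observations that satisfy each condition. For example, the first observation has w1 = a, w2 = a, w3 = a. We first look at all observations with w1 = a, w2 = a, w3 = a; only one observation matches, and its frequency is 4. Next we take w1 = a, w2 = a, w3 != a, which matches the observations with frequencies 3, 1, 1, for a sum of 5. Then w1 != a, w2 != a, w3 = a gives 0, and w1 != a, w2 != a, w3 != a gives 0.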
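To make the four conditions concrete, here is a minimal Python sketch of the sums described above. The helper name `cell_sums` is hypothetical, and the conditions for n3 and n4 follow the wording above literally (w1 and w2 both differ).

```python
# Sample rows from the dataset above: (w1, w2, w3, freq).
rows = [
    ("a", "a", "a", 4),
    ("a", "a", "and", 3),
    ("a", "a", "band", 1),
    ("a", "a", "well", 1),
    ("a", "and", "a", 2),
]

def cell_sums(rows, w1, w2, w3):
    """Return (n1, n2, n3, n4) for the triple (w1, w2, w3)."""
    n1 = sum(f for a, b, c, f in rows if a == w1 and b == w2 and c == w3)
    n2 = sum(f for a, b, c, f in rows if a == w1 and b == w2 and c != w3)
    # Assumption: "not(w1,w2)" is read as w1 != w1' AND w2 != w2', as in the text.
    n3 = sum(f for a, b, c, f in rows if a != w1 and b != w2 and c == w3)
    n4 = sum(f for a, b, c, f in rows if a != w1 and b != w2 and c != w3)
    return n1, n2, n3, n4

print(cell_sums(rows, "a", "a", "a"))  # (4, 5, 0, 0)
```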
I would like the output to be a table like this:
w1 w2 w3 freq n1 n2 n3 n4
a a a 4 4 5 0 0
a a and 3 3 6 0 0
a a band 1
a a well 1
a and a 2
etc.
How can I achieve this with sqlite3?
Answer 0 (score: 1)
This can be done with correlated scalar subqueries:
SELECT w1,
       w2,
       w3,
       freq,
       (SELECT SUM(freq)
        FROM MyLittleTable AS T2
        WHERE T2.w1 = T1.w1
          AND T2.w2 = T1.w2
          AND T2.w3 = T1.w3
       ) AS n1,
       (SELECT SUM(freq)
        FROM MyLittleTable AS T2
        WHERE T2.w1 = T1.w1
          AND T2.w2 = T1.w2
          AND T2.w3 != T1.w3
       ) AS n2,
       ...
FROM MyLittleTable AS T1
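
For anyone who wants to try this end to end, here is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the question. The n3 and n4 subqueries are an assumption: the query above elides them, so they are written here to follow the question's wording (w1 and w2 both differ). The COALESCE wrapper is also an addition, because SUM returns NULL when no row matches and the desired output shows 0. The function name `contingency_rows` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (w1, w2, w3, freq).
DATA = [
    ("a", "a", "a", 4),
    ("a", "a", "and", 3),
    ("a", "a", "band", 1),
    ("a", "a", "well", 1),
    ("a", "and", "a", 2),
]

# One correlated scalar subquery per cell; COALESCE turns "no match" into 0.
QUERY = """
SELECT w1, w2, w3, freq,
       COALESCE((SELECT SUM(freq) FROM MyLittleTable AS T2
                  WHERE T2.w1 = T1.w1 AND T2.w2 = T1.w2
                    AND T2.w3 = T1.w3), 0) AS n1,
       COALESCE((SELECT SUM(freq) FROM MyLittleTable AS T2
                  WHERE T2.w1 = T1.w1 AND T2.w2 = T1.w2
                    AND T2.w3 != T1.w3), 0) AS n2,
       -- Assumed conditions for n3 and n4, taken from the question's wording.
       COALESCE((SELECT SUM(freq) FROM MyLittleTable AS T2
                  WHERE T2.w1 != T1.w1 AND T2.w2 != T1.w2
                    AND T2.w3 = T1.w3), 0) AS n3,
       COALESCE((SELECT SUM(freq) FROM MyLittleTable AS T2
                  WHERE T2.w1 != T1.w1 AND T2.w2 != T1.w2
                    AND T2.w3 != T1.w3), 0) AS n4
  FROM MyLittleTable AS T1
 ORDER BY T1.rowid
"""

def contingency_rows(data):
    """Load data into an in-memory table and return rows with n1..n4 appended."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyLittleTable (w1 TEXT, w2 TEXT, w3 TEXT, freq INTEGER)"
    )
    conn.executemany("INSERT INTO MyLittleTable VALUES (?, ?, ?, ?)", data)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

rows = contingency_rows(DATA)
for row in rows:
    print(*row)
```

With the sample data, the first two rows print as `a a a 4 4 5 0 0` and `a a and 3 3 6 0 0`, matching the desired output. If not(w1,w2) is instead meant as "any other (w1, w2) pair", the n3 and n4 conditions would become `NOT (T2.w1 = T1.w1 AND T2.w2 = T1.w2)`, which also counts the row `a and a 2` toward n3 of the first row.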