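First question here, so please be gentle :)
I am working on extracting information from our OLTP databases, which store several types of information, including multiple-choice answers to given questions. These answers give us valuable insight, so we want to store them in our data warehouse.
The challenge is that all ticked checkboxes of an answer are stored as a single integer value. That may be an elegant solution from a programming point of view, and it works well for displaying values in real time, but it is far less useful when processing the data for a data warehouse.
This is how the answer data is stored: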
customer    question    answer
----------- ----------- -----------
1           1           6
2           1           2
3           1           62
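After a while I figured out that it stores the SUM() of the ticked answers, where each ticked answer contributes 2 ^ position, as shown in the example below: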
question    answer_desc          position    answer_value
----------- -------------------- ----------- ------------
1           a                    1           2
1           b                    2           4
1           c                    3           8
1           d                    4           16
1           e                    5           32
This gives the following answers:
customer 1 will have answers 'a' and 'b' to question 1
customer 2 will have answer 'a' to question 1
customer 3 will have answers 'a', 'b', 'c', 'd' and 'e' to question 1
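To make the encoding concrete, here is a small sketch that rebuilds the stored value from the ticked positions. The table variable @ticked and its rows are made up for illustration and simply mirror the three customers above; nothing like it is known to exist in the real system.

```sql
-- Hypothetical sample: one row per ticked checkbox (illustration only).
DECLARE @ticked TABLE (customer int, question int, position int);

INSERT INTO @ticked (customer, question, position)
VALUES (1, 1, 1), (1, 1, 2),                                   -- customer 1 ticked 'a' and 'b'
       (2, 1, 1),                                              -- customer 2 ticked 'a'
       (3, 1, 1), (3, 1, 2), (3, 1, 3), (3, 1, 4), (3, 1, 5);  -- customer 3 ticked 'a' to 'e'

-- Each ticked position contributes 2 ^ position; the stored answer is the sum.
SELECT customer, question, SUM(POWER(2, position)) AS answer
FROM @ticked
GROUP BY customer, question
ORDER BY customer;
-- Returns 6, 2 and 62 for customers 1, 2 and 3.
```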
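I have come up with a mathematical approach that extracts the highest possible 2 ^ n value from the answer, subtracts it from the total, and repeats until every ticked checkbox has been found. I put this into a function: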
ALTER FUNCTION [dbo].[ZZ_answers] (@input1 VARCHAR(50), @input2 BIGINT)
RETURNS @Table TABLE
(
    inputwaarde VARCHAR(50) NULL,
    waarde      INT NOT NULL,
    exponent    INT NOT NULL
)
AS
BEGIN
    DECLARE @exponent INTEGER
    DECLARE @waarde BIGINT

    -- Remaining value that still has to be decomposed
    SET @waarde = @input2

    WHILE @waarde > 0
    BEGIN
        -- Exponent of the highest power of 2 that fits in the remaining value
        SET @exponent = FLOOR(LOG(@waarde) / LOG(2))

        -- Subtract that power of 2 from the remaining value
        SET @waarde = @waarde - POWER(2, @exponent)

        -- Emit one row per ticked position
        INSERT @Table
        SELECT RTRIM(@input1), @input2, @exponent;
    END

    RETURN
END
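For reference, calling the function over a set of rows might look like the sketch below. It assumes dbo.ZZ_answers exists as defined above; the inline VALUES list is only a stand-in for the real answers table, whose name and columns are not shown here.

```sql
-- Sample rows from the question; the derived table "s" is a hypothetical stand-in.
SELECT s.customer, s.answer, f.exponent AS position
FROM (VALUES (1, 1, 6),
             (2, 1, 2),
             (3, 1, 62)) AS s (customer, question, answer)
CROSS APPLY dbo.ZZ_answers(CAST(s.question AS varchar(50)), s.answer) AS f
ORDER BY s.customer, f.exponent;
-- Customer 1 yields positions 1 and 2, customer 2 yields position 1,
-- and customer 3 yields positions 1 through 5.
```

Because the function is a multi-statement table-valued function with a loop, it is evaluated once per input row, which is consistent with the performance concern raised in the question.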
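I would like to ask what the best way is to use this when populating my data warehouse.
At the moment I have two approaches in mind:
1) Select all distinct answer values from my answers table and use the function above to generate every answer combination currently in use in the system. I would implement this as part of the ETL process in the SSIS package that populates the data warehouse. Although this gives accurate results, processing every result on the fly is heavy on performance and slows down the data warehouse load. Our answers table has about 11 million records and is still growing.
2) Generate a new table, based on the questions table, that contains all possible answer values. I would have to loop through every question and every possible combination of checkboxes and produce the correct answer value for each specific combination. Needless to say, generating all x ^ y answers for every possible combination is a heavy operation. It would, however, result in a table that can then be used when loading the data warehouse. Whenever a new question is created we would need to regenerate the answers table, but that is unlikely to happen often.
Which of the two would you recommend? If I go for option 2, how do I loop through my questions as efficiently as possible? Are there any other options that I am not seeing?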
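Answer 0 (score: 1)
Here is an example that uses the bitwise AND operator to extract what you need. You can then pivot and process the result as required: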
SELECT Customer, answer,
       answer & POWER(2,0) pos1,
       answer & POWER(2,1) pos2,
       answer & POWER(2,2) pos3,
       answer & POWER(2,3) pos4,
       answer & POWER(2,4) pos5,
       answer & POWER(2,5) pos6,
       answer & POWER(2,6) pos7
FROM (
    -- Sample data from the question: customers 1, 2 and 3
    SELECT 1 Customer, 6 answer
    UNION ALL
    SELECT 2 Customer, 2 answer
    UNION ALL
    SELECT 3 Customer, 62 answer
) F
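If the answer definitions are available as a table, the same bit test can also serve as a join condition, which yields one row per ticked answer without pivoting. This is a minimal sketch, assuming hypothetical table variables @answers and @answer_defs that mirror the sample data in the question; the real table names are not given.

```sql
-- Hypothetical stand-ins for the stored answers and the answer definitions.
DECLARE @answers TABLE (customer int, question int, answer int);
DECLARE @answer_defs TABLE (question int, answer_desc varchar(20),
                            position int, answer_value int);

INSERT INTO @answers VALUES (1, 1, 6), (2, 1, 2), (3, 1, 62);
INSERT INTO @answer_defs VALUES
    (1, 'a', 1, 2), (1, 'b', 2, 4), (1, 'c', 3, 8), (1, 'd', 4, 16), (1, 'e', 5, 32);

-- A definition matches when its bit is set in the stored answer.
SELECT a.customer, a.question, d.answer_desc, d.position
FROM @answers AS a
JOIN @answer_defs AS d
  ON d.question = a.question
 AND (a.answer & d.answer_value) <> 0
ORDER BY a.customer, d.position;
-- Customer 1 gets 'a' and 'b', customer 2 gets 'a', customer 3 gets 'a' to 'e'.
```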
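Answer 1 (score: 0)
The following solved my problem. It is based on the accepted answer.
Using the accepted answer, I was able to create a pivoted table with one column for each possible answer: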
Select * FROM(
SELECT itemid, answer,
answer & POWER(2,0) pos01,
answer & POWER(2,1) pos02,
answer & POWER(2,2) pos03,
answer & POWER(2,3) pos04,
answer & POWER(2,4) pos05,
answer & POWER(2,5) pos06,
answer & POWER(2,6) pos07,
answer & POWER(2,7) pos08,
answer & POWER(2,8) pos09,
answer & POWER(2,9) pos10,
answer & POWER(2,10) pos11,
answer & POWER(2,11) pos12,
answer & POWER(2,12) pos13,
answer & POWER(2,13) pos14,
answer & POWER(2,14) pos15,
answer & POWER(2,15) pos16,
answer & POWER(2,16) pos17,
answer & POWER(2,17) pos18,
answer & POWER(2,18) pos19,
answer & POWER(2,19) pos20,
answer & POWER(2,20) pos21,
answer & POWER(2,21) pos22,
answer & POWER(2,22) pos23,
answer & POWER(2,23) pos24,
answer & POWER(2,24) pos25,
answer & POWER(2,25) pos26,
answer & POWER(2,26) pos27,
answer & POWER(2,27) pos28,
answer & POWER(2,28) pos29,
answer & POWER(2,29) pos30
from (
SELECT itemid,data1 as answer from Dossieritems where data1 != 0
) P ) pvt
Result:
itemid answer pos01 pos02 pos03 pos04 pos05 pos06 pos07 pos08 pos09 pos10 pos11 pos12 pos13 pos14 pos15 pos16 pos17 pos18 pos19 pos20 pos21 pos22 pos23 pos24 pos25 pos26 pos27 pos28 pos29 pos30
----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
498 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
499 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
500 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
501 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
502 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
503 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
520 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
548 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
549 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1330 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1331 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1332 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1366 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1422 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1238 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1240 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1300 512 0 0 0 0 0 0 0 0 0 512 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
259234 333405704 0 0 0 8 0 0 0 0 0 512 1024 2048 4096 0 16384 0 65536 131072 262144 524288 1048576 0 4194304 8388608 16777216 33554432 0 0 268435456 0
259237 536829448 0 0 0 8 0 0 0 0 0 512 1024 2048 4096 0 16384 0 65536 131072 262144 524288 1048576 2097152 4194304 8388608 16777216 33554432 67108864 134217728 268435456 0
259238 333102366 0 2 4 8 16 0 0 0 256 0 1024 2048 4096 8192 0 32768 0 131072 0 524288 1048576 0 4194304 8388608 16777216 33554432 0 0 268435456 0
259239 400211226 0 2 0 8 16 0 0 0 256 0 1024 2048 4096 8192 0 32768 0 131072 0 524288 1048576 0 4194304 8388608 16777216 33554432 67108864 0 268435456 0
259240 333102366 0 2 4 8 16 0 0 0 256 0 1024 2048 4096 8192 0 32768 0 131072 0 524288 1048576 0 4194304 8388608 16777216 33554432 0 0 268435456 0
259245 400211226 0 2 0 8 16 0 0 0 256 0 1024 2048 4096 8192 0 32768 0 131072 0 524288 1048576 0 4194304 8388608 16777216 33554432 67108864 0 268435456 0
259257 333102366 0 2 4 8 16 0 0 0 256 0 1024 2048 4096 8192 0 32768 0 131072 0 524288 1048576 0 4194304 8388608 16777216 33554432 0 0 268435456 0
259263 390741246 0 2 4 8 16 32 64 128 0 0 1024 2048 4096 8192 0 0 0 131072 0 524288 0 0 4194304 0 16777216 33554432 67108864 0 268435456 0
259270 333102366 0 2 4 8 16 0 0 0 256 0 1024 2048 4096 8192 0 32768 0 131072 0 524288 1048576 0 4194304 8388608 16777216 33554432 0 0 268435456 0
259277 333102366 0 2 4 8 16 0 0 0 256 0 1024 2048 4096 8192 0 32768 0 131072 0 524288 1048576 0 4194304 8388608 16777216 33554432 0 0 268435456 0
259279 333102366 0 2 4 8 16 0 0 0 256 0 1024 2048 4096 8192 0 32768 0 131072 0 524288 1048576 0 4194304 8388608 16777216 33554432 0 0 268435456 0
259280 333102366 0 2 4 8 16 0 0 0 256 0 1024 2048 4096 8192 0 32768 0 131072 0 524288 1048576 0 4194304 8388608 16777216 33554432 0 0 268435456 0
259286 390741246 0 2 4 8 16 32 64 128 0 0 1024 2048 4096 8192 0 0 0 131072 0 524288 0 0 4194304 0 16777216 33554432 67108864 0 268435456 0
The next step was unpivoting the result and (in my case) converting the column name (pos01 to pos30) into an integer that can be joined on:
Select *, CONVERT(int,SUBSTRING(pos,4,2)) as positie FROM(
SELECT itemid, answer,
answer & POWER(2,0) pos01,
answer & POWER(2,1) pos02,
answer & POWER(2,2) pos03,
answer & POWER(2,3) pos04,
answer & POWER(2,4) pos05,
answer & POWER(2,5) pos06,
answer & POWER(2,6) pos07,
answer & POWER(2,7) pos08,
answer & POWER(2,8) pos09,
answer & POWER(2,9) pos10,
answer & POWER(2,10) pos11,
answer & POWER(2,11) pos12,
answer & POWER(2,12) pos13,
answer & POWER(2,13) pos14,
answer & POWER(2,14) pos15,
answer & POWER(2,15) pos16,
answer & POWER(2,16) pos17,
answer & POWER(2,17) pos18,
answer & POWER(2,18) pos19,
answer & POWER(2,19) pos20,
answer & POWER(2,20) pos21,
answer & POWER(2,21) pos22,
answer & POWER(2,22) pos23,
answer & POWER(2,23) pos24,
answer & POWER(2,24) pos25,
answer & POWER(2,25) pos26,
answer & POWER(2,26) pos27,
answer & POWER(2,27) pos28,
answer & POWER(2,28) pos29,
answer & POWER(2,29) pos30
from (
SELECT itemid,data1 as answer from Dossieritems where data1 != 0
) P ) pvt
UNPIVOT
(ans for pos IN
(pos01,pos02,pos03,pos04,pos05,pos06,pos07,pos08,pos09,pos10,pos11,pos12,pos13,pos14,pos15,pos16,pos17,pos18,pos19,pos20,pos21,pos22,pos23,pos24,pos25,pos26,pos27,pos28,pos29,pos30)
)AS unpvt --INNER JOIN Diagsoorten DS ON (DS.Positie = CONVERT(int,SUBSTRING(pos,4,2)))
Where ans != 0
This code was able to process 9 million results in under a minute.
The final result looks like this:
itemid      answer      ans         pos   positie
----------- ----------- ----------- ----- -----------
498         512         512         pos10 10
499         512         512         pos10 10
500         512         512         pos10 10
501         512         512         pos10 10
502         512         512         pos10 10
503         512         512         pos10 10
520         512         512         pos10 10
548         512         512         pos10 10
549         512         512         pos10 10
1330        512         512         pos10 10
1331        512         512         pos10 10
1332        512         512         pos10 10
1366        512         512         pos10 10
1422        512         512         pos10 10
1238        512         512         pos10 10
1240        512         512         pos10 10
1300        512         512         pos10 10
259234      333405704   8           pos04 4
259234      333405704   512         pos10 10
259234      333405704   1024        pos11 11
259234      333405704   2048        pos12 12
259234      333405704   4096        pos13 13
259234      333405704   16384       pos15 15
259234      333405704   65536       pos17 17
259234      333405704   131072      pos18 18
259234      333405704   262144      pos19 19
259234      333405704   524288      pos20 20
259234      333405704   1048576     pos21 21
259234      333405704   4194304     pos23 23
259234      333405704   8388608     pos24 24
259234      333405704   16777216    pos25 25
259234      333405704   33554432    pos26 26
259234      333405704   268435456   pos29 29
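As a possible variation, the thirty hard-coded columns and the UNPIVOT step could be replaced by a cross join against a small list of bit numbers. This is only a sketch and not the query that was timed above: the table variable @items is a hypothetical stand-in for Dossieritems, and its behaviour on millions of rows has not been measured.

```sql
-- Hypothetical stand-in for Dossieritems (itemid, data1), using two values from the output above.
DECLARE @items TABLE (itemid int, data1 int);
INSERT INTO @items VALUES (498, 512), (259234, 333405704);

WITH positions AS (
    -- Bit numbers 0..29, corresponding to pos01..pos30.
    SELECT 0 AS bit_no
    UNION ALL
    SELECT bit_no + 1 FROM positions WHERE bit_no < 29
)
SELECT i.itemid,
       i.data1 AS answer,
       i.data1 & POWER(2, p.bit_no) AS ans,
       p.bit_no + 1 AS positie
FROM @items AS i
CROSS JOIN positions AS p
WHERE (i.data1 & POWER(2, p.bit_no)) <> 0   -- keep only the bits that are set
ORDER BY i.itemid, positie;
```

For item 498 this returns a single row with ans 512 and positie 10, and for item 259234 it returns the same sixteen positions listed in the table above. In a real load, a permanent numbers table would likely be a better source of bit numbers than a recursive CTE.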
Thanks, ElectricLlama!