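Can someone tell me a general, step-by-step method for finding the non-aggregable measures in a data mart? Here is an example I found:
Note: italics mark a key, and bold marks the alias used for a table in the 'column:reference' notation.
Relational schema: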
**CAL**L(*COD*, DATE, FROM:S, TO:S, LEN)
**SIM**(*SIM*, USER:USER, TARIFF:T, BONUS)
**TAR**IFF(*TARIFF*, CARRIER:CAR)
**USE**R(*USER*, TOWN:TOW, LAST_TARIFF:TAR)
**R**OAMING_**CA**LL(*COD:CAL*, FOREIGN_CARRIER:CAR)
**P**ROMO_**CA**LL(*COD:CAL*, PROMO_TARIFF:P_TA)
**P**ROMO_**TA**RIFF(*TARIFF:TAR*)
**TOW**N(*TOWN*, NATION)
**CAR**RIER(*CARRIER*, NATION)
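Request: build a fact schema for CALL with the following
dimensions: DATE, SIM_FROM, CALLED_CARRIER, FOREIGN_CARRIER, PROMO_TARIFF, and
measures: AVG_CALL_LENGTH, NUM_OUTGOING_SIM (as count distinct of FROM), NUM_INCOMING_SIM (as count distinct of TO)
I can draw the fact schema, but I am having trouble working out which measures can be aggregated along which dimensions.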
Edit: this is a PDF of the fact schema I have (sorry for not using strict syntax, but reading notes are included).
Measures:
Standard [obtained by the operational schema]:
NUM_INCOMING_CALLS = COUNT DISTINCT (TO)
UN-AGGREGABILITIES ->*THIS IS MY ISSUE*
Calculated [obtained by the operational schema, need partial data to add properly]:
AVG_CALL_LENGTH = CL_SUM/CL_COUNT
where
CL_SUM = SUM (LENGTH), CL_COUNT = COUNT(LENGTH)
UN-AGGREGABILITIES ->*THIS IS MY ISSUE*
Derived [can be found as a dimension]:
NUM_OUTGOING_CALLS = COUNT DISTINCT ( FROM )
UN-AGGREGABILITIES ->*THIS IS MY ISSUE*
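To make the "need partial data to add properly" note concrete, here is a minimal Python sketch of AVG_CALL_LENGTH = CL_SUM/CL_COUNT. The dates and call lengths are made up for illustration, and the helper names `partials` and `avg_from_partials` are hypothetical. It only shows that sums and counts can be added across groups, while averaging the per-group averages gives a different number.

```python
# Hypothetical call lengths (seconds) grouped by DATE; the values are made up.
calls_by_day = {
    "2024-01-01": [60, 120],
    "2024-01-02": [30],
}


def partials(lengths):
    """Return the additive support measures (CL_SUM, CL_COUNT) for one group."""
    return sum(lengths), len(lengths)


def avg_from_partials(parts):
    """Roll up by adding the sums and the counts, then divide once."""
    total_sum = sum(s for s, _ in parts)
    total_count = sum(c for _, c in parts)
    return total_sum / total_count


day_parts = [partials(v) for v in calls_by_day.values()]

# Correct roll-up over DATE: (180 + 30) / (2 + 1)
rolled_up_avg = avg_from_partials(day_parts)

# Naive roll-up: average of the daily averages, which weights days equally
avg_of_avgs = sum(s / c for s, c in day_parts) / len(day_parts)

print(rolled_up_avg, avg_of_avgs)
```

The two results differ because the second day has fewer calls, which is why the fact table would need to store CL_SUM and CL_COUNT rather than the average itself.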
Answer 0 (score: -1)
OK, I went and asked my teacher, and he gave me a simple algorithm:
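Given a schema with dimensions D = {D1, D2, D3, ..., Dn} and a measure M = COUNT DISTINCT A,
take X as a subset of D and check, for each Di where the dependency is not trivial, whether the functional dependency X U A -> Di holds:
X U A -> D1 (true)
X U A -> D2 (false)
X U A -> D3 (true)
...
X U A -> Dn-1 (false)
So here I have NA = {D2, Dn-1},
where NA is the set of non-aggregabilities: the dimensions for which the dependency is false.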
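As a rough illustration of the check above, the following Python sketch tests the dependency X U A -> Di against sample rows and collects the dimensions where it fails. The function names `fd_holds` and `non_aggregable` and the sample rows are assumptions made for this example, not part of the teacher's algorithm. My reading of why the test works is that when A determines Di, every distinct value of A falls under exactly one value of Di, so the partial distinct counts can be summed without counting anything twice. Note that sample data can only refute a dependency; whether it really holds has to come from the meaning of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if equal values on the `lhs` columns always imply an equal `rhs` value."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        # Remember the first rhs value seen for this key and compare later rows to it.
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def non_aggregable(rows, dimensions, a, x=()):
    """Return NA: the dimensions Di (outside X) for which X U {A} -> Di fails."""
    lhs = tuple(x) + (a,)
    return [d for d in dimensions if d not in x and not fd_holds(rows, lhs, d)]


# Hypothetical sample of the CALL fact; the values are made up.
calls = [
    {"FROM": "s1", "SIM_FROM": "s1", "DATE": "d1", "CALLED_CARRIER": "c1"},
    {"FROM": "s1", "SIM_FROM": "s1", "DATE": "d2", "CALLED_CARRIER": "c2"},
    {"FROM": "s2", "SIM_FROM": "s2", "DATE": "d1", "CALLED_CARRIER": "c2"},
]

# Measure: COUNT DISTINCT FROM, with X empty.
na = non_aggregable(calls, ["DATE", "SIM_FROM", "CALLED_CARRIER"], "FROM")
print(na)
```

In this sample, SIM s1 places calls on two dates and to two carriers, so summing the distinct count of FROM over DATE or CALLED_CARRIER would count s1 twice, while SIM_FROM is determined by FROM and stays out of NA.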