I have the following code:
SELECT ta.application as koekkoek, ta.ipc, ipc_count/ipc_tot as ipc_share, t3.sfields FROM (
select t1.appln_id as application, t1.ipc_subclass_symbol as ipc, count(t2.appln_id) as ipc_count, sum(ipc_count) over (PARTITION BY application) as ipc_tot
FROM temp.tls209_small t1
CROSS JOIN
(SELECT appln_id, FROM temp.tls209_small group by appln_id ) t2
where t1.appln_id = t2.appln_id
GROUP BY application, ipc
) as ta
CROSS JOIN thesis.ifris_ipc_concordance t3
WHERE ta.ipc LIKE t3.ipc+'%'
AND ta.ipc NOT LIKE t3.not_ipc+'%'
AND t3.not_appln_id NOT IN
(SELECT ipc_subclass_symbol from temp.tls209_small t5 where t5.appln_id = ta.application)
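This gives the following error:
Field 'ta.application' not found.
I have tried many notations for the field, but BigQuery does not seem to recognize any reference to the other tables from inside the subquery.
The purpose of the code is to assign new technology classifications to records based on a concordance table.
I have two tables:
A large table, tls209_small, containing application IDs, classifications, and some other columns.
An index table, ifris_ipc_concordance, containing some exception rules.
In the end, I need to assign an sfields label to each row in tls209 (300 million rows). The rule is that ipc_class_symbol in the first table should be LIKE ipc+'%' from the second table, but NOT LIKE not_ipc+'%'.
In addition, the not_appln_id value, if present, must not be associated with the same appln_id in the first table.
As a small example, say this is the input of the query:
appln_id | ipc_class_symbol
1 | A1
1 | A2
1 | A3
1 | C3
sfields | ipc | not_ipc | not_appln_id
X | A | A2 | null
Y | A | null | A3
appln_id 1 should get sfields X twice, because A1 and A3 match ipc = A and are not excluded by not_ipc (only A2 is). Y should not be assigned, because A3 occurs in appln_id 1.
In the results, I also need the share of each single ipc_class_symbol within an application (1 for 328100001, 0.5 for 32100009, etc.).
Without the last condition (the NOT IN subquery on t3.not_appln_id), the query works fine.
Any suggestions on how to make the subquery recognize the application ID (ta.application), or on other ways to bring the last condition into the query?
I realize my explanation of the problem may not be very straightforward, so if anything is unclear, please say so and I will do my best to clarify.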
Answer 0 (score: 1)
The query you are running performs an anti-join. You can rewrite it as an explicit join, although that is a bit more verbose.
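As a rough illustration of the anti-join idea on the question's small example, here is a sketch that runs in SQLite through Python's sqlite3 module rather than in BigQuery, so the syntax is an assumption to adapt (SQLite concatenates with || and has no EACH modifier). The table names appln_ipc and concordance and the helper names are made up for this example. It matches rules by prefix, then LEFT JOINs the excluded class back onto the same application and keeps a row only when no such class was found.

```python
import sqlite3


def build_example():
    """Create an in-memory database holding the question's toy data.

    Table names (appln_ipc, concordance) are hypothetical stand-ins for
    tls209_small and ifris_ipc_concordance.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE appln_ipc (appln_id INTEGER, ipc_class_symbol TEXT);
        INSERT INTO appln_ipc VALUES (1,'A1'),(1,'A2'),(1,'A3'),(1,'C3');
        CREATE TABLE concordance (
            sfields TEXT, ipc TEXT, not_ipc TEXT, not_appln_id TEXT);
        INSERT INTO concordance VALUES
            ('X','A','A2',NULL),
            ('Y','A',NULL,'A3');
    """)
    return conn


def assign_sfields(conn):
    """Return (appln_id, ipc_class_symbol, sfields) using an explicit anti-join."""
    query = """
        SELECT a.appln_id, a.ipc_class_symbol, c.sfields
        FROM appln_ipc AS a
        JOIN concordance AS c
          ON a.ipc_class_symbol LIKE c.ipc || '%'
         AND (c.not_ipc IS NULL
              OR a.ipc_class_symbol NOT LIKE c.not_ipc || '%')
        -- Anti-join part 1: look for the excluded class on the same application.
        LEFT JOIN appln_ipc AS x
          ON x.appln_id = a.appln_id
         AND x.ipc_class_symbol = c.not_appln_id
        -- Anti-join part 2: keep the row only when nothing was found.
        WHERE x.appln_id IS NULL
        ORDER BY a.appln_id, a.ipc_class_symbol, c.sfields
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in assign_sfields(build_example()):
        print(row)
```

On the toy data this keeps X for A1 and A3 of appln_id 1 and drops every Y row, because A3 is present on that application. When not_appln_id is NULL the LEFT JOIN never matches, so rules without that exception pass through unchanged.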
Answer 1 (score: 1)
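I arrived at a working solution by first generating a table in which I match only ipc_class_symbol from the first table against the ipc column of the second table, while also carrying along the not_ipc and not_appln_id columns from the second table. In addition, a list of all IPC class labels assigned to each appln_id was added using GROUP_CONCAT.
Finally, with the help of Pentium10, the resulting table was filtered according to the exception rules, which is also discussed in this question.
In the final query, the GROUP BY and JOIN clauses need the EACH modifier to allow processing of large tables: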
SELECT application as appln_id, ipc as ipc_class, ipc_share, sfields as ifris_class FROM (
SELECT * FROM (
SELECT ta.application as application, ta.ipc as ipc, ipc_count/ipc_tot as ipc_share, t3.sfields as sfields, t3.ipc as yes_ipc, t3.not_ipc as not_ipc, t3.not_appln_id as exclude, t4.classes as other_classes FROM (
SELECT t1.appln_id as application, t1.ipc_class_symbol as ipc, count(t2.appln_id) as ipc_count, sum(ipc_count) over (PARTITION BY application) as ipc_tot
FROM thesis.tls209_appln_ipc t1
FULL OUTER JOIN EACH
(SELECT appln_id, FROM thesis.tls209_appln_ipc GROUP EACH BY appln_id ) t2
ON t1.appln_id = t2.appln_id
GROUP EACH BY application, ipc
) AS ta
LEFT JOIN EACH (
SELECT appln_id, GROUP_CONCAT(ipc_class_symbol) as classes FROM [thesis.tls209_appln_ipc]
GROUP EACH BY appln_id) t4
ON ta.application = t4.appln_id
CROSS JOIN thesis.ifris_ipc_concordance t3
WHERE ta.ipc CONTAINS t3.ipc
) as tx
WHERE (not ipc contains not_ipc or not_ipc is null)
AND (not other_classes contains exclude or exclude is null or other_classes is null)
)
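To see the two steps on the question's small example, here is a sketch in SQLite via Python's sqlite3 module, not BigQuery. Several things are assumptions made for this illustration: the table names appln_ipc and concordance, the helper names, the use of instr() for a prefix test where the query above uses CONTAINS, and the comma-wrapped comparison against the concatenated class list (stricter than a plain substring test). The share is computed as rows per class divided by rows per application, which is one reading of ipc_count/ipc_tot.

```python
import sqlite3

QUERY = """
WITH per_class AS (      -- occurrences of each class within an application
    SELECT appln_id, ipc_class_symbol AS ipc, COUNT(*) AS ipc_count
    FROM appln_ipc
    GROUP BY appln_id, ipc_class_symbol
),
per_appln AS (           -- total rows and the full class list per application
    SELECT appln_id,
           COUNT(*) AS ipc_tot,
           GROUP_CONCAT(ipc_class_symbol) AS classes
    FROM appln_ipc
    GROUP BY appln_id
),
matched AS (             -- step 1: match on the ipc prefix only
    SELECT p.appln_id, p.ipc,
           p.ipc_count * 1.0 / t.ipc_tot AS ipc_share,
           c.sfields, c.not_ipc, c.not_appln_id AS exclude, t.classes
    FROM per_class AS p
    JOIN per_appln AS t ON t.appln_id = p.appln_id
    JOIN concordance AS c ON instr(p.ipc, c.ipc) = 1
)
-- step 2: filter the matched rows by the exception rules
SELECT appln_id, ipc, ipc_share, sfields
FROM matched
WHERE (not_ipc IS NULL OR instr(ipc, not_ipc) <> 1)
  AND (exclude IS NULL
       OR instr(',' || classes || ',', ',' || exclude || ',') = 0)
ORDER BY appln_id, ipc, sfields
"""


def build_example():
    """Create an in-memory database with the toy tables (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE appln_ipc (appln_id INTEGER, ipc_class_symbol TEXT);
        INSERT INTO appln_ipc VALUES (1,'A1'),(1,'A2'),(1,'A3'),(1,'C3');
        CREATE TABLE concordance (
            sfields TEXT, ipc TEXT, not_ipc TEXT, not_appln_id TEXT);
        INSERT INTO concordance VALUES
            ('X','A','A2',NULL),
            ('Y','A',NULL,'A3');
    """)
    return conn


def classify(conn):
    """Return (appln_id, ipc, ipc_share, sfields) rows after both steps."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in classify(build_example()):
        print(row)
```

Splitting the work into a match step and a filter step avoids the correlated subquery that triggered the error in the question: the class list travels with each row, so the exception check needs no reference to an outer table.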