Good morning. I have the following query:
SELECT DISTINCT c.cname AS component,
Sum(w.timeworked / 3600) OVER () AS sum_tipo,
Sum(w.timeworked / 3600) OVER (partition BY c.cname) AS sum_by_component
FROM jira.jiraissue j,
jira.worklog w,
jira.project p,
jira.issuetype t,
jira.component c,
jira.nodeassociation na ,
jira.cwd_user u
WHERE w.issueid=j.id
AND j.project=p.id
AND na.source_node_id = j.id
AND na.source_node_entity = 'Issue'
AND na.sink_node_id=c.id
AND t.id=j.issuetype
And w.author= u.lower_user_name
AND w.author in ( select distinct author from jira.worklog where author in (select distinct lower_user_name from jira.cwd_user where display_name in ('Ilanas ejemplo')))
AND p.pname= 'Area Económica'
AND t.pname= 'Peticion'
AND w.startdate >='01/01/2018'
AND w.startdate <='17/10/2018'
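I have checked, and the totals this query returns do not balance for a small part of the data.
After some investigation of the tables, I found the following:
There should not be several rows with the same source_node_id (this is a user error). I would like to do a DISTINCT or something similar in SQL, so that when several rows share the same source_node_id only one of them is counted.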
Answer 0 (score: 0)
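Your problem is that this cannot be done without making some "compromise" on the data. In your table, the "duplicate rows" have different sink_node_id values (10328 and 10320 or 10326; I cannot read the number clearly).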
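Before accepting any compromise, it may help to see how many issues are actually affected. The following is only a sketch built from the table and column names in the question; the aliases association_count, lowest_sink_node_id and highest_sink_node_id are made up for illustration. The question does not show whether jira.nodeassociation also stores other kinds of links, so if it does, this would need an extra filter.

```sql
-- List every issue that has more than one association row,
-- together with the smallest and largest sink_node_id it points to.
SELECT source_node_id,
       COUNT(*)          AS association_count,    -- illustrative alias
       MIN(sink_node_id) AS lowest_sink_node_id,  -- illustrative alias
       MAX(sink_node_id) AS highest_sink_node_id  -- illustrative alias
FROM jira.nodeassociation
WHERE source_node_entity = 'Issue'
GROUP BY source_node_id
HAVING COUNT(*) > 1
ORDER BY source_node_id;
```

If this returns only a handful of rows, correcting the user input by hand may be a realistic alternative to changing the report.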
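If you are prepared to make that compromise, here is a query with the absolute minimum change to yours: when a source_node_id is duplicated, it keeps only the row with the largest sink_node_id.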
SELECT DISTINCT c.cname AS component,
Sum(w.timeworked / 3600) OVER () AS sum_tipo,
Sum(w.timeworked / 3600) OVER (partition BY c.cname) AS sum_by_component
FROM jira.jiraissue j,
jira.worklog w,
jira.project p,
jira.issuetype t,
jira.component c,
(SELECT source_node_id, source_node_entity
,max(sink_node_id) as sink_node_id
FROM jira.nodeassociation
GROUP BY source_node_id, source_node_entity) na,
jira.cwd_user u
WHERE w.issueid=j.id
AND j.project=p.id
AND na.source_node_id = j.id
AND na.source_node_entity = 'Issue'
AND na.sink_node_id=c.id
AND t.id=j.issuetype
And w.author= u.lower_user_name
AND w.author in ( select distinct author
from jira.worklog
where author
in (select distinct lower_user_name
from jira.cwd_user
where display_name in ('Ilanas ejemplo')))
AND p.pname= 'Area Económica'
AND t.pname= 'Peticion'
AND w.startdate >='01/01/2018'
AND w.startdate <='17/10/2018'
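Keep in mind, though, that this "trick" can produce wrong data. However, if you are in a hurry and cannot correct the bad user input, you can run this query once with the MIN and once with the MAX aggregate function, compare the two reports, and decide whether the result is good enough.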
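One possible way to do that MIN versus MAX comparison in a single pass is sketched below. It is a rough illustration, not a drop-in replacement: it leaves out the project, issue type, author and date filters from the original query (they would have to be added back), and the names min_sink, max_sink, hours_worked, hours_using_max and hours_using_min are invented here. It also assumes your database supports common table expressions.

```sql
WITH na AS (
    -- One row per issue, carrying both candidate components.
    SELECT source_node_id,
           MIN(sink_node_id) AS min_sink,
           MAX(sink_node_id) AS max_sink
    FROM jira.nodeassociation
    WHERE source_node_entity = 'Issue'
    GROUP BY source_node_id
),
hours AS (
    -- Hours logged per issue, using the same expression as the question.
    SELECT w.issueid,
           SUM(w.timeworked / 3600) AS hours_worked
    FROM jira.worklog w
    GROUP BY w.issueid
)
SELECT c.cname AS component,
       -- Hours credited to this component if MAX(sink_node_id) is chosen.
       SUM(CASE WHEN na.max_sink = c.id THEN h.hours_worked ELSE 0 END) AS hours_using_max,
       -- Hours credited to this component if MIN(sink_node_id) is chosen.
       SUM(CASE WHEN na.min_sink = c.id THEN h.hours_worked ELSE 0 END) AS hours_using_min
FROM na
JOIN hours h ON h.issueid = na.source_node_id
JOIN jira.component c ON c.id IN (na.min_sink, na.max_sink)
GROUP BY c.cname
ORDER BY c.cname;
```

Components where hours_using_max and hours_using_min differ are the ones touched by the duplicated rows, so the size of those differences gives a feel for how much the compromise distorts the report.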