Question

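Edit (as requested): I have updated the sample data to show all of the REAL data I get when I run the SELECT against the database. I can confirm that the data is bad: it contains duplicate records. There was a bug in the application, and the database has no unique constraint on (question, attempt, track_number). I am trying to clean up the bad data, i.e. the duplicate records. To do that, I need the tbl_survey.id (PK) values of those bad records.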

Table:

CREATE TABLE tbl_survey(
    id [bigint] IDENTITY(1,1) NOT NULL,
    question [bigint] NOT NULL,
    attempt [bigint] NOT NULL,
    track_number [int] NOT NULL,
CONSTRAINT tbl_survey_id_pk PRIMARY KEY CLUSTERED ([id] ASC)
)

Data:

question  attempt  track_number  track_number_count
315 8418    2   2
316 8418    1   2
317 8418    2   2
318 8418    2   2
319 8418    1   2
320 8418    1   2
321 8418    1   2
323 8418    1   2
324 8418    1   2
325 8418    1   2
326 8418    1   2
327 8418    2   2
328 8418    1   2
329 8418    1   2
330 8418    1   2
331 8418    1   2
332 8418    1   2
333 8418    1   2
334 8418    1   2
335 8418    1   2
336 8418    1   2
337 8418    1   2
338 8418    1   2
339 8418    1   2
340 8418    1   2
341 8418    1   2
342 8418    1   2
343 8418    1   2
344 8418    1   2
345 8418    1   2
346 8418    1   2
347 8418    1   2
348 8418    1   2
349 8418    1   2
350 8418    2   2
351 8418    1   2
352 8418    2   2
353 8418    1   2
355 8418    1   2
357 8418    1   2
358 8418    1   2
359 8418    1   2
360 8418    1   2
361 8418    1   2
362 8418    1   2
363 8418    1   2
364 8418    1   2
365 8418    1   2
366 8418    1   2
367 8418    1   2
368 8418    1   2
369 8418    1   2
370 8418    1   2
371 8418    1   2
372 8418    1   2
373 8418    1   2
375 8418    1   2
376 8418    1   2
377 8418    2   2
378 8418    1   2
379 8418    2   2

Using the MSSQL 2008 R2 table and data above, this query restricts the retrieved data to the rows I want (i.e. the data above):

SELECT
    question,
    attempt,
    track_number,
    COUNT (track_number) AS track_number_count
FROM tbl_survey
WHERE attempt = 8418
GROUP BY
    question,
    attempt,
    track_number
HAVING (COUNT(track_number) > 1)
ORDER BY attempt, question;

How can I change the SELECT query so that it also returns the table's "id" column for each returned row?

Currently I get:

question  attempt  track_number  track_number_count
315       8418     2             2
317       8418     1             2

I would like the additional id column:

id      question  attempt  track_number  track_number_count
476585  315       8418     2             2
476606  317       8418     1             2

What am I doing wrong? How can I display the id column?

Thanks.

Answer 1

I think that once you have the details you need, you can join them back to the table to get the IDs of the matching rows:

SELECT
    ts.id,
    ts.question,
    ts.attempt,
    ts.track_number,
    matching.track_number_count
FROM tbl_survey AS ts
INNER JOIN (
    -- groups that occur more than once
    SELECT
        question,
        attempt,
        track_number,
        COUNT(track_number) AS track_number_count
    FROM tbl_survey
    WHERE attempt = 8418
    GROUP BY
        question,
        attempt,
        track_number
    HAVING COUNT(track_number) > 1
) AS matching
    ON ts.question = matching.question
    AND ts.attempt = matching.attempt
    AND ts.track_number = matching.track_number
ORDER BY ts.attempt, ts.question;

Something along those lines, anyway, though I am not 100% sure it even makes sense.
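
As an alternative to joining the grouped result back to the table, a windowed count can carry the group size on every row, so each id in a duplicated group is listed directly. This is only a sketch under the assumption that SQL Server 2005 or later is in use (the question mentions 2008 R2); the derived-table alias counted is a made-up name.

```sql
-- Sketch: list every row that belongs to a duplicated
-- (question, attempt, track_number) group, together with its id.
SELECT id, question, attempt, track_number, track_number_count
FROM (
    SELECT
        id,
        question,
        attempt,
        track_number,
        -- number of rows sharing this (question, attempt, track_number)
        COUNT(*) OVER (
            PARTITION BY question, attempt, track_number
        ) AS track_number_count
    FROM tbl_survey
    WHERE attempt = 8418
) AS counted          -- hypothetical alias
WHERE track_number_count > 1
ORDER BY attempt, question, id;
```

The derived table is needed because a window function cannot be used directly in a WHERE clause.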

Answer 2

This works for me:

SELECT
    MAX(id) AS id,
    question,
    attempt,
    track_number,
    COUNT(track_number) AS track_number_count
FROM
    tbl_survey
WHERE
    attempt = 8418
GROUP BY
    question,
    attempt,
    track_number
HAVING
    (COUNT(track_number) > 1)
ORDER BY
    attempt,
    question;
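
Note that MAX(id) returns a single id per group, so if a group ever holds three or more copies, only the newest one is reported. Since the stated goal is to clean up the duplicates, a ROW_NUMBER() based query may be a closer fit. The sketch below assumes the row with the lowest id in each group is the one to keep; that choice, and the names ranked and copy_number, are illustrative assumptions rather than anything stated in the question.

```sql
-- Sketch: list the ids of the surplus copies in each
-- (question, attempt, track_number) group, keeping the lowest id.
WITH ranked AS (      -- hypothetical CTE name
    SELECT
        id,
        question,
        attempt,
        track_number,
        ROW_NUMBER() OVER (
            PARTITION BY question, attempt, track_number
            ORDER BY id
        ) AS copy_number   -- 1 = row to keep, 2 and up = surplus copies
    FROM tbl_survey
    WHERE attempt = 8418
)
SELECT id, question, attempt, track_number
FROM ranked
WHERE copy_number > 1
ORDER BY question, id;
```

After checking the listed ids, the final SELECT could be swapped for DELETE FROM ranked WHERE copy_number > 1, since SQL Server allows deleting through a CTE over a single table. Running that inside a transaction first would be the cautious route.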

Answer 3

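I am not sure, since I could not get the query to run on SQL Fiddle, but it looks like you made your grouping finer-grained, which produces lower counts.

The lower counts are not > 1, so you are missing rows.

Try changing the last line here to >= 1: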

HAVING (COUNT(track_number) >= 1)
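
To illustrate the granularity point, here is a sketch of what presumably happens if id is simply added to the SELECT list and the GROUP BY (an assumption about what was tried; the question does not show that attempt). Because id is the primary key, every group then contains exactly one row.

```sql
-- Sketch: grouping by the primary key makes every group a single row,
-- so the count is always 1.
SELECT
    id,
    question,
    attempt,
    track_number,
    COUNT(track_number) AS track_number_count
FROM tbl_survey
WHERE attempt = 8418
GROUP BY
    id,               -- unique per row, so each group has one member
    question,
    attempt,
    track_number
HAVING COUNT(track_number) > 1;   -- never true here: no rows come back
```

Relaxing the filter to >= 1 brings the rows back, but it then returns every row for attempt 8418 with a count of 1, so it no longer distinguishes duplicates from non-duplicates.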

Get the ID of each record in a HAVING (COUNT) query
