How can I use PL/SQL to filter on duplicate values in one column and consecutive values in another?

Time: 2015-05-16 07:24:48

Tags: plsql count duplicates

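I have a database table in which some rows are duplicated in one particular column. I want to display only those rows, and only when an adjacent column holds consecutive numbers within each group of matching duplicates. The image below illustrates this: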

[Image: Filter for Duplicates]

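Here is what I have come up with so far (the column names below differ from those in the image above, to avoid clashing with built-in SQL functions):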

SELECT BIZ_DATE, AMT, COUNT(*)
FROM MY_TABLE
WHERE TRAN_DATE = '03-APR-2000'
GROUP BY AMT, BIZ_DATE
HAVING COUNT(*) > 1;

This seems to work well for finding the dupes in the amt column.

Now, how do I make it consider only consecutive values in the trans_id column?

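2 Answers:

Answer 0: (score: 1)

PL/SQL is not strictly required to find data that share one attribute but have consecutive values on another.

Using your general table structure, the following will find any pairs of transactions that share the same BIZ_DATE and AMT and have adjacent TRANS_IDs.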

First create and load a table; adjacent values can then be compared with an analytic function.

Table definition and sample data:

    CREATE TABLE MY_TABLE (
      BIZ_DATE DATE          NOT NULL,
      NAME     VARCHAR2(200) NOT NULL,
      AMT      NUMBER        NOT NULL,
      TRANS_ID NUMBER        NOT NULL
    );

    INSERT INTO MY_TABLE
    (BIZ_DATE, NAME, TRANS_ID, AMT)
    VALUES (TO_DATE('17-MAY-2015', 'DD-MON-YYYY'), 'BOB', 8086, 159);

    INSERT INTO MY_TABLE
    (BIZ_DATE, NAME, TRANS_ID, AMT)
    VALUES (TO_DATE('17-MAY-2015', 'DD-MON-YYYY'), 'BOB', 8085, 159);

    INSERT INTO MY_TABLE
    (BIZ_DATE, NAME, TRANS_ID, AMT)
    VALUES (TO_DATE('17-MAY-2015', 'DD-MON-YYYY'), 'BOB', 9088, 159);

    INSERT INTO MY_TABLE
    (BIZ_DATE, NAME, TRANS_ID, AMT)
    VALUES (TO_DATE('17-MAY-2015', 'DD-MON-YYYY'), 'BOB', 9087, 159);

    INSERT INTO MY_TABLE
    (BIZ_DATE, NAME, TRANS_ID, AMT)
    VALUES (TO_DATE('17-MAY-2015', 'DD-MON-YYYY'), 'BOB', 1111, 159);

    INSERT INTO MY_TABLE
    (BIZ_DATE, NAME, TRANS_ID, AMT)
    VALUES (TO_DATE('17-APR-2015', 'DD-MON-YYYY'), 'BOB', 5903, 159);

    INSERT INTO MY_TABLE
    (BIZ_DATE, NAME, TRANS_ID, AMT)
    VALUES (TO_DATE('17-MAR-2015', 'DD-MON-YYYY'), 'BOB', 5904, 160);

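The query, which uses the LAG analytic function to compare each TRANS_ID with the previous one in its BIZ_DATE/AMT group: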

SELECT
  BIZ_DATE,
  AMT,
  TRANS_ID,
  PRIOR_TRANS_ID
FROM
  (SELECT
     BIZ_DATE,
     AMT,
     TRANS_ID,
     LAG(TRANS_ID, 1, TRANS_ID)
     OVER (PARTITION BY BIZ_DATE, AMT
       ORDER BY TRANS_ID ASC)
       AS PRIOR_TRANS_ID
   FROM MY_TABLE
   WHERE BIZ_DATE = TO_DATE('17-MAY-2015', 'DD-MON-YYYY'))
WHERE (TRANS_ID - PRIOR_TRANS_ID) = 1;
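
Note that the query above reports only the later row of each consecutive pair: for the sample rows, TRANS_ID 8086 (PRIOR_TRANS_ID 8085) and TRANS_ID 9088 (PRIOR_TRANS_ID 9087). If every row of a consecutive run should be listed instead, one possible sketch subtracts ROW_NUMBER() from TRANS_ID, which gives the same value for all members of a run. RUN_KEY and RUN_LENGTH are names made up for this illustration, and the sketch assumes TRANS_ID is unique within each BIZ_DATE/AMT group:

```sql
-- Sketch only: list every row that belongs to a run of two or more
-- consecutive TRANS_ID values within the same BIZ_DATE and AMT.
-- Assumption: TRANS_ID is unique within each (BIZ_DATE, AMT) group.
SELECT BIZ_DATE, AMT, TRANS_ID
FROM (
  SELECT BIZ_DATE, AMT, TRANS_ID,
         -- number of rows sharing the same run key
         COUNT(*) OVER (PARTITION BY BIZ_DATE, AMT, RUN_KEY) AS RUN_LENGTH
  FROM (
    SELECT BIZ_DATE, AMT, TRANS_ID,
           -- TRANS_ID minus its position is constant across a consecutive run
           TRANS_ID - ROW_NUMBER() OVER (PARTITION BY BIZ_DATE, AMT
                                         ORDER BY TRANS_ID) AS RUN_KEY
    FROM MY_TABLE
    WHERE BIZ_DATE = TO_DATE('17-MAY-2015', 'DD-MON-YYYY')
  )
)
WHERE RUN_LENGTH > 1
ORDER BY BIZ_DATE, AMT, TRANS_ID;
```

For the sample rows above this should list TRANS_ID 8085, 8086, 9087 and 9088; 1111 is dropped because its run has length 1.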

Answer 1: (score: 0)

Depending on how you want to retrieve the "duplicate" rows, something like this could do the trick:

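DECLARE
  -- Sort so that rows sharing AMT and NAME are adjacent
  CURSOR c IS SELECT * FROM T ORDER BY AMT, NAME, TRANS_ID;
  -- First row of the current group; its fields are NULL until assigned
  base_rec c%ROWTYPE;
BEGIN
  -- curr_rec is declared implicitly by the cursor FOR loop
  FOR curr_rec IN c
  LOOP
    IF curr_rec.AMT = base_rec.AMT AND curr_rec.NAME = base_rec.NAME
    THEN
      -- Same group as base_rec: print the pair of TRANS_IDs
      DBMS_OUTPUT.PUT(base_rec.TRANS_ID);
      DBMS_OUTPUT.PUT(' ');
      DBMS_OUTPUT.PUT_LINE(curr_rec.TRANS_ID);
    ELSE
      -- First row overall or start of a new group: remember it as the base
      base_rec := curr_rec;
    END IF;
  END LOOP;
END;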

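This is a simple loop over the sorted rows that finds pseudo-duplicate rows according to the condition curr_rec.AMT = base_rec.AMT AND curr_rec.NAME = base_rec.NAME (or whatever else you need, as long as it matches a prefix of the cursor's ORDER BY clause).

Given your sample data, the output is: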

8085 8086
8085 8087
8085 8088
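
As written, the loop pairs every later row of a group with the group's first row and does not test whether the TRANS_ID values are consecutive. A possible variant, sketched here under the assumption that "consecutive" means a difference of exactly 1, compares each row with the row immediately before it (prev_rec is a name introduced for this illustration):

```sql
-- Sketch only: print pairs of rows that share AMT and NAME and whose
-- TRANS_ID values differ by exactly 1 (assumed meaning of "consecutive").
DECLARE
  CURSOR c IS SELECT AMT, NAME, TRANS_ID FROM T ORDER BY AMT, NAME, TRANS_ID;
  -- Previous row in sort order; its fields are NULL before the first iteration,
  -- so the comparison below is not TRUE for the first row.
  prev_rec c%ROWTYPE;
BEGIN
  FOR curr_rec IN c
  LOOP
    IF curr_rec.AMT = prev_rec.AMT
       AND curr_rec.NAME = prev_rec.NAME
       AND curr_rec.TRANS_ID = prev_rec.TRANS_ID + 1
    THEN
      DBMS_OUTPUT.PUT_LINE(prev_rec.TRANS_ID || ' ' || curr_rec.TRANS_ID);
    END IF;
    -- Always advance, so each row is compared with its direct predecessor
    prev_rec := curr_rec;
  END LOOP;
END;
/
```

In SQL*Plus or a similar client, SET SERVEROUTPUT ON is needed beforehand for the DBMS_OUTPUT lines to be shown.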