How to delete redundant records with different dates from a MySQL table

Time: 2013-09-15 01:42:28

Tags: mysql sql database redundancy

I have a table in a MySQL database with the following columns:

itemID      bigint(11)
itemDate    datetime    
attributeID smallint(6)
value       int(9)

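Edit: This table stores attributes of items that are uniquely defined in a separate table whose primary key is itemID.

What is the best SQL query to delete (starting from the most recent records and working back to the oldest):

  • every record in this table with value = 0, if there exists another record with the same itemID, the same attributeID, a value > 5, and an itemDate that is either older (but the closest preceding one) or the same

  • every record in this table, if there exists another record with the same itemID, the same attributeID, the same value, and an itemDate that is either older (but the closest preceding one) or the same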

See also the code at the end.
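
As a rough sketch of the first rule, the query below only lists the zero-valued rows that the rule would match, without deleting anything. It assumes the table is called `item_attributes` (a hypothetical name, since the real one is not given here) and reads "older or the same" as `<=`. The `value > 5` condition already keeps a zero row from matching itself.

```sql
-- Preview only: list zero-valued rows for which another record exists with
-- the same itemID and attributeID, a value above 5, and an itemDate that is
-- not newer. `item_attributes` is a hypothetical table name.
SELECT z.itemID, z.itemDate, z.attributeID, z.value
FROM item_attributes AS z
WHERE z.value = 0
  AND EXISTS (
        SELECT 1
        FROM item_attributes AS p
        WHERE p.itemID = z.itemID
          AND p.attributeID = z.attributeID
          AND p.value > 5
          AND p.itemDate <= z.itemDate
      )
ORDER BY z.itemDate DESC;
```

Running a SELECT like this first makes it possible to compare the matched rows with the expected result before turning the condition into a DELETE.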

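I am using this in a PHP script.

Basically, I have redundant data because I did not catch a bug until it had already written about 100k entries. Here is a very small example: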

itemID  itemDate        attributeID value
28  11.09.2013 2:00     4           0
28  11.09.2013 2:00     5           0
28  11.09.2013 2:01     1           0
28  11.09.2013 2:01     2           0
28  11.09.2013 2:01     3           0
28  11.09.2013 2:01     4           0
28  11.09.2013 2:01     5           0
28  11.09.2013 2:02     1           21
28  11.09.2013 2:02     2           11
28  11.09.2013 2:02     3           4
28  11.09.2013 2:02     1           21
28  11.09.2013 2:02     2           11
28  11.09.2013 2:02     3           4
28  11.09.2013 2:02     1           21
28  11.09.2013 2:02     2           12
28  11.09.2013 2:02     3           4
28  13.09.2013 18:54    1           0
28  13.09.2013 18:54    2           0
28  13.09.2013 18:54    3           0
28  13.09.2013 18:55    1           21
28  13.09.2013 18:55    2           12
28  13.09.2013 18:55    3           6

The above should become (after several iterations of the deletion algorithm):

itemID  itemDate        attributeID value
28  11.09.2013 2:00     4           0
28  11.09.2013 2:00     5           0
28  11.09.2013 2:01     1           0
28  11.09.2013 2:01     2           0
28  11.09.2013 2:01     3           0
28  11.09.2013 2:02     1           21
28  11.09.2013 2:02     2           11
28  11.09.2013 2:02     3           4
28  11.09.2013 2:02     2           12
28  13.09.2013 18:55    3           6

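I hope I have explained the problem clearly; if anything needs clarifying, please let me know. Thank you!

Update

I managed to find a solution that combines SQL with PHP, but I don't really like it. I believe the same result can be achieved with two proper SQL queries, so although I'm glad to have a way to clean up the database, the question remains: how can the code below be converted into pure SQL queries?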

// Properties
$item_found_count = $item_valid_count = $item_duplicates_count = 0;    // Initialise every counter used below

// Find zero value entries
$query = "SELECT * FROM $db_fb WHERE value = '0'";

if ($result = mysqli_query($connection, $query)) {

    // for each record found
    while($row = $result->fetch_array()) {

        $item_found_count++;    // Count all items found
        $t_itemID = $row['itemID']; $t_itemDate = $row['itemDate']; $t_attributeID = $row['attributeID'];   // Record this data just in case we need it as a 'pointer' to delete the record

        //echo "Entry found: " . $row['itemID'] . " " . $row['itemDate'];

        $query = "SELECT * FROM $db_fb WHERE itemID = $t_itemID AND itemDate < '$t_itemDate' AND attributeID = '$t_attributeID' AND value > '5' ORDER BY itemDate DESC LIMIT 1";
        // If there is such an entry, the current one must be deleted.
        if ($SecondResult = mysqli_query($connection, $query)) {

            while($rowSpec = $SecondResult->fetch_array()) {
                $item_valid_count++;    // Count all items actually deleted

                //echo "<br>-> mark;"; print_r($rowSpec); echo "<br>";

                // Delete if ID, itemDate, attributeID and VALUE coincide
                $q_del = "DELETE FROM $db_fb WHERE itemID = $t_itemID AND itemDate = '$t_itemDate' AND attributeID = '$t_attributeID' AND value = '0'";
                $deleteRes = mysqli_query($connection, $q_del);

            }

        }

        //echo "--------------------------<br><br>";
    }

}

// Select from table where values are identical, attributeID identical, ID identical, itemDates immediately consecutive LIMIT by 2. Delete most recent entry.
$query = "SELECT MAX(itemDate) as itemDate, itemID, attributeID, value, count(*) FROM $db_fb GROUP BY itemID, attributeID, value HAVING count(*) > 1 ORDER BY itemDate DESC";

if ($ThirdResult = mysqli_query($connection, $query)) {

    while($rowSpec = $ThirdResult->fetch_array()) {

        $item_duplicates_count++;   // Count all items actually deleted

        $t_itemID = $rowSpec['itemID']; $t_itemDate = $rowSpec['itemDate']; $t_attributeID = $rowSpec['attributeID'];   $t_value = $rowSpec['value']; // Record this data just in case we need it as a 'pointer' to delete the record

        //echo "<br>-> mark;"; print_r($rowSpec); echo "<br>";
        $q_del = "DELETE FROM $db_fb WHERE itemID = '$t_itemID' AND itemDate = '$t_itemDate' AND attributeID = '$t_attributeID' AND value = '$t_value'";
        $deleteRes = mysqli_query($connection, $q_del);

    }

}

echo "Zeroed found: " . $item_found_count . "<br>";
echo "Zeroed valid for deletion: " . $item_valid_count . "<br>";
echo "Zeroed remaining: " . ($item_found_count - $item_valid_count) . "<br>";

echo "Consecutive duplicates: " . $item_duplicates_count;
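
One way the two PHP loops above might be expressed in SQL alone is with MySQL's multi-table DELETE syntax, which allows a table to be joined to itself. This is an untested sketch: `item_attributes` is a hypothetical stand-in for the table behind `$db_fb`, and the statements mirror what the PHP does (a strictly older `itemDate` in the first pass, the newest row of each duplicate group in the second) rather than the written rules.

```sql
-- Pass 1: delete zero-valued rows that have a strictly older record
-- with the same itemID and attributeID and a value above 5.
-- `item_attributes` is a hypothetical table name.
DELETE z
FROM item_attributes AS z
JOIN item_attributes AS p
  ON p.itemID = z.itemID
 AND p.attributeID = z.attributeID
 AND p.itemDate < z.itemDate
 AND p.value > 5
WHERE z.value = 0;

-- Pass 2: for every (itemID, attributeID, value) group with more than one
-- row, delete the rows carrying the newest itemDate of that group.
-- Wrapping the grouped query in a derived table is the usual way to read
-- from the same table that is being deleted from.
DELETE d
FROM item_attributes AS d
JOIN (
    SELECT itemID, attributeID, value, MAX(itemDate) AS itemDate
    FROM item_attributes
    GROUP BY itemID, attributeID, value
    HAVING COUNT(*) > 1
) AS g
  ON g.itemID = d.itemID
 AND g.attributeID = d.attributeID
 AND g.value = d.value
 AND g.itemDate = d.itemDate;
```

Like the PHP version, pass 2 removes only one timestamp per group each time it runs, so it would have to be repeated until it affects no rows. Rows that share a timestamp are deleted together, so exact duplicates recorded at the same moment can disappear entirely, and a value that recurs after a different value in between is still treated as a duplicate. Trying the statements on a copy of the table first seems advisable.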

0 answers:

No answers