I have a table "monthly" that contains the columns "filename", "sheetname", "project", "task", "owner", "hours", and "percentage":
+----------+-----------+---------+---------+-------+-------+------------+
| filename | sheetname | project | task    | owner | hours | percentage |
+----------+-----------+---------+---------+-------+-------+------------+
| file1    | IBM       | Website | develop | sam   | 5     | 25         |
| file1    | IBM       | website | test    | sam   | 7     | 20         |
| file1    | IBM       | support | design  | ivan  | 2     | 7          |
| file1    | DELL      | android | config  | peter | 9     | 30         |
| file2    | IBM       | Website | develop | sam   | 9     | 45         |
| file2    | DELL      | android | config  | josef | 4     | 50         |
| file2    | DELL      | android | config  | peter | 3     | 70         |
| file2    | DELL      | android | test    | mark  | 8     | 70         |
| file2    | HP        | webapp  | code    | jack  | 10    | 65         |
| file3    | IBM       | website | test    | sam   | 7     | 20         |
| file3    | HP        | webapp  | code    | jack  | 10    | 65         |
| file4    | IBM       | Website | develop | sam   | 9     | 45         |
+----------+-----------+---------+---------+-------+-------+------------+
I want to remove duplicated rows. Two rows are duplicates when sheetname, project, task, owner, hours, and percentage are all the same and only the filename differs. In that case the first row is kept and the second row is removed.
Example:
| file1    | IBM       | Website | develop | sam   | 5     | 25         |
| file2    | IBM       | Website | develop | sam   | 9     | 45         |
| file4    | IBM       | Website | develop | sam   | 9     | 45         |
file1 and file2 have different values in hours and percentage, so both rows are kept. file2 and file4 have the same values in every other column, so the entire file4 row is removed.
Thank you for your help.
Answer 0: (score: 0)
Here is how you would do it in T-SQL; I'm sure it will be very similar in other dialects of SQL:
Sample data:
IF OBJECT_ID('tempdb..#temp') IS NOT NULL
    DROP TABLE #temp;

CREATE TABLE #temp
(
    fielname   VARCHAR(20),
    sheetname  VARCHAR(20),
    project    VARCHAR(20),
    task       VARCHAR(20),
    owner      VARCHAR(20),
    hours      VARCHAR(20),
    percentage VARCHAR(20)
);

INSERT INTO #temp
VALUES
    ('file1', 'IBM',  'Website', 'develop', 'sam',   '5',  '25'),
    ('file1', 'IBM',  'website', 'test',    'sam',   '7',  '20'),
    ('file1', 'IBM',  'support', 'design',  'ivan',  '2',  '7'),
    ('file1', 'DELL', 'android', 'config',  'peter', '9',  '30'),
    ('file2', 'IBM',  'Website', 'develop', 'sam',   '9',  '45'),
    ('file2', 'DELL', 'android', 'config',  'josef', '4',  '50'),
    ('file2', 'DELL', 'android', 'config',  'peter', '3',  '70'),
    ('file2', 'DELL', 'android', 'test',    'mark',  '8',  '70'),
    ('file2', 'HP',   'webapp',  'code',    'jack',  '10', '65'),
    ('file3', 'IBM',  'website', 'test',    'sam',   '7',  '20'),
    ('file3', 'HP',   'webapp',  'code',    'jack',  '10', '65'),
    ('file4', 'IBM',  'Website', 'develop', 'sam',   '9',  '45');
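Show sample data:
SELECT * FROM #temp;
Removing duplicates using a Common Table Expression and the ROW_NUMBER() windowing function, assuming we do not use the filename field in PARTITION BY. Rows that differ only in filename then land in the same partition, and only the first one (rn = 1) is kept. After the DELETE below, SELECT * FROM #temp returns the data set without duplicates: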
;WITH CTE AS (
    SELECT fielname, sheetname, project, task, owner, hours, percentage,
           -- Number the rows within each group of identical values;
           -- fielname is deliberately left out of PARTITION BY.
           ROW_NUMBER() OVER (
               PARTITION BY sheetname, project, task, owner, hours, percentage
               -- The other columns are constant within a partition,
               -- so ordering by fielname alone is enough.
               ORDER BY fielname
           ) AS rn
    FROM #temp
)
-- Deleting through the CTE removes the underlying #temp rows with rn > 1.
DELETE FROM CTE
WHERE rn > 1;
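To check which rows the DELETE would remove before running it, the same ROW_NUMBER() logic can be used in a plain SELECT. The sketch below is self-contained and only an illustration: sample_rows and numbered are hypothetical CTE names, and the three inline rows are the example rows from the question.

```sql
-- Preview only: nothing is deleted here.
-- sample_rows and numbered are hypothetical names used for illustration.
;WITH sample_rows AS (
    SELECT *
    FROM (VALUES
        ('file1', 'IBM', 'Website', 'develop', 'sam', '5', '25'),
        ('file2', 'IBM', 'Website', 'develop', 'sam', '9', '45'),
        ('file4', 'IBM', 'Website', 'develop', 'sam', '9', '45')
    ) AS v (fielname, sheetname, project, task, owner, hours, percentage)
),
numbered AS (
    SELECT *,
           -- Same numbering as in the DELETE: fielname is not part of the partition.
           ROW_NUMBER() OVER (
               PARTITION BY sheetname, project, task, owner, hours, percentage
               ORDER BY fielname
           ) AS rn
    FROM sample_rows
)
-- Rows with rn > 1 are the ones the DELETE would remove.
SELECT fielname, sheetname, project, task, owner, hours, percentage
FROM numbered
WHERE rn > 1;
```

With these three rows, only the file4 row comes back, since it matches file2 in every column except fielname. To preview against the real data, read from #temp (or the monthly table) instead of sample_rows.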