How to append text to entries in a database using SQL

Time: 2014-08-04 02:28:36

Tags: sql postgresql

How can I find duplicate rows and add a random number to them so that they are no longer duplicates?

Sample table:

primary_id, student_id, student_name
1           80          John Terry
2           81          Didier Drogba
3           80          John Terry
4           82          Frank Lampard
5           80          John Terry

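I want to eliminate the duplicates by adding a random number to the name of each duplicate row. For example, in the scenario above, I want to rename the student_name in row 3 to 112233_DUP_John Terry and in row 5 to 668877_DUP_John Terry. Note that the first entry in each set of duplicates stays unchanged; here, row 1 stays as it is.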

The rename format is: 6_digit_random_number + _DUP_ + Existing Student Name

So far, I can get the duplicates with the SQL below:

SELECT student_id, student_name FROM (select student_id, student_name, count(*) from student
          group by student_id, student_name
          HAVING count(*) > 1 order by count DESC) AS duplicates

I know I can also generate a random number with SQL, but I can't figure out how to attach it to the duplicate entries.

I'm running a PostgreSQL database.

3 Answers:

Answer 0 (score: 3)

First, get the duplicate rows using a window function rather than the GROUP BY approach, for example:

SELECT
  primary_id, student_id, student_name
FROM 
(
  SELECT
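    -- ORDER BY primary_id makes the lowest primary_id the row that is kept as the original
    row_number() OVER (PARTITION BY student_id, student_name ORDER BY primary_id) AS dup_no,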
    primary_id, student_id, student_name
  FROM students
) dup
WHERE dup.dup_no > 1; 
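
To try the query above, a small setup along the following lines may help. It is only a sketch: the table name students follows this answer, the rows come from the question's sample table, and the column types are assumptions, since the question does not state them.

```sql
-- Hypothetical setup: the column types are assumed, not given in the question.
CREATE TABLE students (
  primary_id   integer PRIMARY KEY,
  student_id   integer NOT NULL,
  student_name text    NOT NULL
);

-- Sample rows copied from the question.
INSERT INTO students (primary_id, student_id, student_name) VALUES
  (1, 80, 'John Terry'),
  (2, 81, 'Didier Drogba'),
  (3, 80, 'John Terry'),
  (4, 82, 'Frank Lampard'),
  (5, 80, 'John Terry');
```

With this data the duplicate query should return two of the three John Terry rows, because one row per (student_id, student_name) group gets dup_no = 1 and is filtered out.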

Then combine this with UPDATE ... FROM to update only the duplicates:

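UPDATE students
-- 'FM' suppresses the leading space that to_char otherwise reserves for the sign
SET student_name = to_char(dupstudents.dup_no, 'FM000000') || '_DUP_' || students.student_name
FROM (
  SELECT
    -- ORDER BY primary_id keeps the lowest primary_id in each group untouched
    row_number() OVER (PARTITION BY student_id, student_name ORDER BY primary_id) AS dup_no,
    primary_id, student_id, student_name
  FROM students
) dupstudents
WHERE students.primary_id = dupstudents.primary_id
  AND dupstudents.dup_no > 1;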

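e.g. http://sqlfiddle.com/#!15/5b1b8/9

I haven't done the "random ID" bit; I just used the duplicate's offset position within its group. You can replace it with a suitable call to (random()*10^6)::integer or whatever, but beware of random value collisions.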
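
One possible way to swap the duplicate offset for a random prefix, as suggested above, is sketched here. It assumes the students table and columns used in this answer. Using floor and lpad to force exactly six digits is a choice made for this sketch, not part of the original answer, and nothing in it prevents two rows from drawing the same number.

```sql
-- Sketch: prefix every duplicate except the first with a random 6-digit number.
-- Assumes the students(primary_id, student_id, student_name) table used above.
UPDATE students
SET student_name = lpad(floor(random() * 1000000)::int::text, 6, '0')
                   || '_DUP_' || students.student_name
FROM (
  SELECT
    primary_id,
    -- dup_no = 1 marks the row that keeps its original name
    row_number() OVER (PARTITION BY student_id, student_name
                       ORDER BY primary_id) AS dup_no
  FROM students
) dupstudents
WHERE students.primary_id = dupstudents.primary_id
  AND dupstudents.dup_no > 1;
```

random() is volatile, so it is evaluated once per updated row. floor keeps the value in the range 0 to 999999, and lpad restores any leading zeros.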

Answer 1 (score: 0)

Try this:

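-- Partition on both columns (the question's definition of a duplicate) and order by
-- primary_id so that the first row of each group is the one left unchanged.
select student_id, R_N, student_name,
  CASE WHEN R_N <> 1 THEN to_char(R_N, 'FM000000') || '_DUP_' ELSE '' END || student_name AS new_name
  FROM (SELECT *,
    row_number() OVER (PARTITION BY student_id, student_name ORDER BY primary_id) AS R_N FROM student) AS T1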

With a random number:

-- floor() keeps the value below 1000000; 'FM' drops to_char's leading sign space
select student_id, R_N, student_name,
  CASE WHEN R_N <> 1 THEN to_char(floor(random() * 1000000), 'FM000000') || '_DUP_' ELSE '' END || student_name AS new_name
  FROM (SELECT *,
    row_number() OVER (PARTITION BY student_id, student_name ORDER BY primary_id) AS R_N FROM student) AS T1

In a single statement without a subquery:

select student_id,
       row_number() OVER (PARTITION BY student_id, student_name ORDER BY primary_id) AS r_n,
       student_name,
       CASE WHEN row_number() OVER (PARTITION BY student_id, student_name ORDER BY primary_id) <> 1
       THEN to_char(floor(random() * 1000000), 'FM000000') || '_DUP_' ELSE '' END || student_name AS new_name
from student
;

Answer 2 (score: 0)

Following up on Craig Ringer's answer:

WITH cte AS (
  -- every row after the first in each (student_id, student_name) group
  SELECT primary_id
  FROM (
    SELECT
      row_number() OVER (PARTITION BY student_id, student_name ORDER BY primary_id) AS dup_no,
      primary_id
    FROM student
  ) dup
  WHERE dup.dup_no > 1
), cte2 AS (
  -- new name: random 6-digit prefix + _DUP_ + existing name
  SELECT
    to_char(floor(random() * 1000000), 'FM000000') || '_DUP_' || student_name AS duplicate_student_name,
    primary_id
  FROM student
  WHERE primary_id IN (SELECT primary_id FROM cte)
)
UPDATE student AS v
SET student_name = s.duplicate_student_name
FROM cte2 AS s
WHERE v.primary_id = s.primary_id;
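
Since the prefixes are random, two rows in the same group could in principle receive the same number. A quick check after the update, sketched here against the student table used in this answer, is to re-run the grouping query from the question:

```sql
-- Sketch: list any (student_id, student_name) pairs that still occur more than once.
-- An empty result means no duplicates remain after the update.
SELECT student_id, student_name, count(*) AS occurrences
FROM student
GROUP BY student_id, student_name
HAVING count(*) > 1
ORDER BY occurrences DESC;
```

If any rows come back, the update can be run again. Because it only touches rows whose dup_no is greater than 1, only the still-colliding rows get a further prefix.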