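Multiple string replacements in a text field

Asked: 2017-10-23 13:08:32

Tags: sql postgresql replace

I have a table containing HTML content. This content may contain one or more URLs. I also have a mapping table that holds URLs together with their rewrites. I need to replace every URL in each piece of HTML content with its rewrite, where one exists.

Use case (Postgres 9.5):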

TABLE some_content (content_id int4, content text)
row1: 1, 'A BA BLAH PIKA',
row2: 2, 'B AB',
row3: 3, 'C PIKA NOTA CA'

TABLE rewrite (rule_id int4, old_string text, new_string text)
row1: 1, 'PIKA', 'CHU',
row2: 2, 'BLAH', 'POM'
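To try this out, the two tables above could be created roughly as follows. This is only a setup sketch built from the rows listed in the question; no keys or constraints are assumed, since none are given.

```sql
-- Sample data exactly as listed in the question (no keys or constraints assumed).
CREATE TABLE some_content (content_id int4, content text);
INSERT INTO some_content (content_id, content) VALUES
  (1, 'A BA BLAH PIKA')
, (2, 'B AB')
, (3, 'C PIKA NOTA CA');

CREATE TABLE rewrite (rule_id int4, old_string text, new_string text);
INSERT INTO rewrite (rule_id, old_string, new_string) VALUES
  (1, 'PIKA', 'CHU')
, (2, 'BLAH', 'POM');
```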

The query should output the following set:

row1: 1, 'A BA POM CHU'
row2: 2, 'B AB'
row3: 3, 'C CHU NOTA CA'

Adding a new row to the rewrite table, such as:

row3: 3, 'NOTA', 'ISB'

would then change the result set (int4, text) to:

row1: 1, 'A BA POM CHU'
row2: 2, 'B AB'
row3: 3, 'C CHU ISB CA'

Any hints?

1 Answer:

Answer 0: (score: 1)

Each replacement depends on the result of the previous one, so you need some kind of loop. You also need a deterministic order for the replacements. I'll assume rule_id in ascending order, and that you want to replace any match, not just whole words (easy to adapt).
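If whole-word matching is wanted instead, one possible adaptation is to swap replace() for regexp_replace() with the \y word-boundary escape. This is only a sketch: it assumes old_string contains no regular-expression metacharacters and new_string contains no backslashes. Real URLs usually contain characters such as . and ?, which would have to be escaped first.

```sql
-- Whole-word variant of a single replacement step.
-- \y matches only at the beginning or end of a word; 'g' replaces all matches.
SELECT regexp_replace('PIKACHU PIKA', '\yPIKA\y', 'CHU', 'g');
-- returns 'PIKACHU CHU': the PIKA inside PIKACHU is left alone

-- Applied to one rule from the rewrite table (assumes metacharacter-free strings):
SELECT s.content_id
     , regexp_replace(s.content, '\y' || r.old_string || '\y', r.new_string, 'g') AS content
FROM   some_content s
JOIN   rewrite r ON r.rule_id = 1;
```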

You could loop in a plpgsql function, which is probably faster.
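A minimal sketch of such a function, assuming the table and column names from the question. The function name apply_rewrites is made up for illustration; it is not from the original answer.

```sql
-- Hypothetical helper: applies every rule in rule_id order to one text value.
CREATE OR REPLACE FUNCTION apply_rewrites(_content text)
  RETURNS text
  LANGUAGE plpgsql STABLE AS
$func$
DECLARE
   _result text := _content;
   _rule   record;
BEGIN
   FOR _rule IN
      SELECT old_string, new_string
      FROM   rewrite
      ORDER  BY rule_id              -- deterministic order of replacements
   LOOP
      -- each step works on the result of the previous step
      _result := replace(_result, _rule.old_string, _rule.new_string);
   END LOOP;

   RETURN _result;
END
$func$;

SELECT content_id, apply_rewrites(content) AS content
FROM   some_content
ORDER  BY content_id;
```

One difference worth noting: if the rewrite table is empty, this function returns the content unchanged, whereas a query that joins some_content to the first rule returns no rows at all.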

Or, for pure SQL, try this recursive CTE:

WITH RECURSIVE cte AS (
   SELECT s.content_id, r.rule_id
        , replace(s.content, r.old_string, r.new_string) AS content
   FROM   some_content s
   CROSS  JOIN (
      SELECT rule_id, old_string, new_string
      FROM   rewrite
      ORDER  BY rule_id  -- order of rows is relevant!
      LIMIT  1
      ) r

   UNION ALL
   SELECT c.content_id, r.rule_id
        , replace(c.content, r.old_string, r.new_string) AS content
   FROM   cte c
        , LATERAL (
      SELECT rule_id, old_string, new_string
      FROM   rewrite
      WHERE  rule_id > c.rule_id
      ORDER  BY rule_id  -- order of rows is relevant!
      LIMIT  1
      ) r
   )
SELECT DISTINCT ON (content_id) content
FROM   cte
ORDER  BY content_id, rule_id DESC;

The LATERAL join works around the error invalid reference to FROM-clause entry for table "c", which you get with a direct subquery referencing the CTE.

Or, like you commented, generate serial numbers without gaps with row_number():

WITH RECURSIVE r AS (
   SELECT old_string, new_string
        , row_number() OVER (ORDER BY rule_id) AS rn  -- your ORDER BY expression?
   FROM   rewrite
   )
, cte AS (
   SELECT s.content_id, r.rn
        , replace(s.content, r.old_string, r.new_string) AS content
   FROM   some_content s
   JOIN   r ON r.rn = 1

   UNION ALL
   SELECT s.content_id, r.rn
        , replace(s.content, r.old_string, r.new_string) AS content
   FROM   cte s
   JOIN   r ON r.rn = s.rn + 1
   )
SELECT DISTINCT ON (content_id) content
FROM   cte
ORDER  BY content_id, rn DESC;

dbfiddle here

Often overlooked: plain CTEs can still be added after WITH RECURSIVE.

About DISTINCT ON:
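As a rough illustration of the final step in the queries above: the recursive CTE yields one row per content row and applied rule, and DISTINCT ON (content_id) combined with a descending sort on the rule number keeps only the row produced after the last rule. The inline rows below are hand-written stand-ins for what the CTE would return for the first two content rows; they are not output copied from a real run.

```sql
-- Keep only the latest step per content_id.
SELECT DISTINCT ON (content_id) content_id, content
FROM  (
   VALUES
     (1, 1, 'A BA BLAH CHU')   -- content 1 after rule 1 (PIKA -> CHU)
   , (1, 2, 'A BA POM CHU')    -- content 1 after rule 2 (BLAH -> POM)
   , (2, 1, 'B AB')            -- content 2, unchanged by rule 1
   , (2, 2, 'B AB')            -- content 2, unchanged by rule 2
   ) AS steps (content_id, rule_id, content)
ORDER  BY content_id, rule_id DESC;
-- returns (1, 'A BA POM CHU') and (2, 'B AB')
```

The leading ORDER BY expressions have to match the DISTINCT ON expressions; the remaining sort key decides which row of each group is kept.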