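I have a table containing HTML content. Each content may contain one or more URLs. I also have a mapping table that pairs URLs with their rewrites. I need to replace every URL in each HTML content with its rewrite, wherever one exists.
Use case (Postgres 9.5):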
TABLE some_content (content_id int4, content text)
row1: 1, 'A BA BLAH PIKA',
row2: 2, 'B AB',
row3: 3, 'C PIKA NOTA CA'
TABLE rewrite (rule_id int4, old_string text, new_string text)
row1: 1, 'PIKA', 'CHU',
row2: 2, 'BLAH', 'POM'
The query should output the following set:
row1: 1, 'A BA POM CHU'
row2: 2, 'B AB'
row3: 3, 'C CHU NOTA CA'
Adding a new row to the rewrite table, like:
row3: 3, 'NOTA', 'ISB'
would then change the result set to (int4, text):
row1: 1, 'A BA POM CHU'
row2: 2, 'B AB'
row3: 3, 'C CHU ISB CA'
Any hints?
Answer 0: (score: 1)
Each replacement depends on the result of the previous one, so you need some kind of loop. You also need a deterministic order in which the replacements are applied. Assuming rule_id in ascending order. Assuming you want to replace any match, not just whole words (easy to adapt).
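As a minimal sketch of the whole-word adaptation mentioned above, regexp_replace() with the word-boundary escapes \m and \M could stand in for replace(). This is an assumption about what "easy to adapt" might look like, not something spelled out in the answer, and the literal strings are made up for illustration. Note that old_string would then be interpreted as a regular expression and backslashes in new_string have special meaning, so real URLs (which contain characters like . and ?) would need escaping first.

```sql
-- Sketch only: whole-word replacement instead of replace().
-- \m matches at the start of a word, \M at the end of a word.
-- 'PIKA' and 'CHU' stand in for rewrite.old_string / rewrite.new_string.
SELECT regexp_replace('C PIKA NOTA PIKACHU'
                    , '\m' || 'PIKA' || '\M'
                    , 'CHU'
                    , 'g') AS content;   -- 'g' replaces all matches, not just the first
```

Here the standalone word PIKA is replaced while PIKACHU is left alone, because \M does not match in the middle of a word.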
You could loop in a plpgsql function, which is probably faster.
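A minimal sketch of such a plpgsql loop, assuming the tables and sample rows from the question. The function name f_apply_rewrites is hypothetical, and the temporary tables are only there to make the example runnable on its own.

```sql
-- Sample data from the question, as temp tables (setup for illustration only).
CREATE TEMP TABLE some_content (content_id int4, content text);
CREATE TEMP TABLE rewrite (rule_id int4, old_string text, new_string text);

INSERT INTO some_content VALUES
  (1, 'A BA BLAH PIKA')
, (2, 'B AB')
, (3, 'C PIKA NOTA CA');

INSERT INTO rewrite VALUES
  (1, 'PIKA', 'CHU')
, (2, 'BLAH', 'POM');

-- Hypothetical helper: applies every rule to one string, in rule_id order.
CREATE OR REPLACE FUNCTION f_apply_rewrites(_content text)
  RETURNS text
  LANGUAGE plpgsql STABLE AS
$func$
DECLARE
   _result text := _content;
   _rule   record;
BEGIN
   FOR _rule IN
      SELECT old_string, new_string
      FROM   rewrite
      ORDER  BY rule_id              -- deterministic order of replacements
   LOOP
      -- each step works on the result of the previous step
      _result := replace(_result, _rule.old_string, _rule.new_string);
   END LOOP;

   RETURN _result;
END
$func$;

SELECT content_id, f_apply_rewrites(content) AS content
FROM   some_content
ORDER  BY content_id;
```

With the sample rows this should produce the set requested in the question. Unlike the recursive CTE, the loop keeps only one intermediate string per row, which is one plausible reason it tends to be faster.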
Or, for pure SQL, try this recursive CTE:
WITH RECURSIVE cte AS (
   SELECT s.content_id, r.rule_id
        , replace(s.content, r.old_string, r.new_string) AS content
   FROM   some_content s
   CROSS  JOIN (
      SELECT rule_id, old_string, new_string
      FROM   rewrite
      ORDER  BY rule_id  -- order of rows is relevant!
      LIMIT  1
      ) r
   UNION ALL
   SELECT c.content_id, r.rule_id
        , replace(c.content, r.old_string, r.new_string) AS content
   FROM   cte c
        , LATERAL (
      SELECT rule_id, old_string, new_string
      FROM   rewrite
      WHERE  rule_id > c.rule_id
      ORDER  BY rule_id  -- order of rows is relevant!
      LIMIT  1
      ) r
   )
SELECT DISTINCT ON (content_id) content
FROM   cte
ORDER  BY content_id, rule_id DESC;
The error invalid reference to FROM-clause entry for table "c", which you got with a direct subquery referencing the CTE, is resolved with a LATERAL join.
Or, generate a serial number without gaps with row_number(), like you commented:
WITH RECURSIVE r AS (
   SELECT old_string, new_string
        , row_number() OVER (ORDER BY rule_id) AS rn  -- your ORDER BY expression?
   FROM   rewrite
   )
, cte AS (
   SELECT s.content_id, r.rn
        , replace(s.content, r.old_string, r.new_string) AS content
   FROM   some_content s
   JOIN   r ON r.rn = 1
   UNION ALL
   SELECT s.content_id, r.rn
        , replace(s.content, r.old_string, r.new_string) AS content
   FROM   cte s
   JOIN   r ON r.rn = s.rn + 1
   )
SELECT DISTINCT ON (content_id) content
FROM   cte
ORDER  BY content_id, rn DESC;
It is often overlooked that plain CTEs can still be added after WITH RECURSIVE.
In both queries, DISTINCT ON (content_id) combined with the descending ORDER BY keeps only the row from the last iteration for each content_id, that is, the content after all rules have been applied.