Postgres UPDATE with to_tsvector sets every row to the same value

Date: 2017-02-03 03:43:18

Tags: postgresql

I want to set the to_tsvector language (e.g. 'french') so that the correct dictionary is used when the FTS vector is built.
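As a quick illustration of what the language argument changes: the two-argument form of to_tsvector takes a text search configuration as its first parameter, and each configuration brings its own stop-word list and stemmer. A minimal sketch, assuming the built-in 'english' and 'french' configurations are available (they normally are in a stock PostgreSQL install):

```sql
-- Same string, two configurations; compare the two vectors side by side.
SELECT to_tsvector('english', 'Le french') AS english_vector,
       to_tsvector('french',  'Le french') AS french_vector;

-- The one-argument form to_tsvector(text) uses the session default shown here.
SHOW default_text_search_config;
```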

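The messages table has a locale_id column that references the locales table. I then need to join locales to the languages table on language_id to get the actual language name.

This UPDATE should go through every row in messages and set the vector column = to_tsvector(joined language name, message), but instead it updates every row to the same value, built with the same language dictionary (e.g. to_tsvector('french', stringX)).

Why is that? Each row has a different message string and a different locale_id (and therefore a different language name).

So, if I simply leave out the language configuration and run: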

  UPDATE messages 
  SET vector = to_tsvector(message);

Resulting table:

messages:

message   | locale_id | vector
-----------------------------
Hi there  | 1         | 'hi':1
Is a test | 2         | 'test':3
Le french | 3         | 'french':2 'le':1 --'le' SHOULD BE omitted since it's a stop word in French pg_dictionary

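This works, in that each row gets its own vector, but it obviously does not load the correct language dictionary for each row. Doing the following, however, produces the same result for every row: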

  UPDATE messages 
  SET vector = to_tsvector(messages_languages.language::regconfig, messages_languages.message) 
  FROM (
    select t3.language, t1.message 
    from messages as t1 
    inner join locales as t2 on (t1.locale_id = t2.id) 
    inner join languages as t3 on (t2.language_id = t3.id)
  ) messages_languages;

I also tried WITH, with the same result:

  WITH messages_languages as (
    select t3.language, t1.message 
    from messages as t1 
    inner join locales as t2 on (t1.locale_id = t2.id) 
    inner join languages as t3 on (t2.language_id = t3.id)  
  )
  UPDATE messages
  SET vector = to_tsvector(messages_languages.language::regconfig, messages_languages.message) 
  FROM messages_languages;

Resulting table:

messages:

message   | locale_id | vector
-----------------------------
Hi there  | 1         | 'french':2
Is a test | 2         | 'french':2
Le french | 3         | 'french':2  --'le' omitted correctly in french pg_dictionary as it's a STOP word

'Le french', whose pg_dictionary_name is 'french', should be the only row in this table with the vector 'french':2, yet all rows are identical.

locales:

id        | language_id    
------------------
1         | 4         
2         | 5       
3         | 6  

languages:

id        | language    
------------------
4         | 'English'         
5         | 'German'       
6         | 'French'     
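For anyone who wants to reproduce this, the three tables above could be set up roughly as follows. This is only a sketch: the column types, the constraints, and the messages.id key are assumptions, since only the column names and sample values are given here.

```sql
-- Hypothetical DDL; types, keys and constraints are assumed, not taken from the real schema.
CREATE TABLE languages (
    id       integer PRIMARY KEY,
    language text NOT NULL   -- later cast to regconfig, so it must name a text search configuration
);

CREATE TABLE locales (
    id          integer PRIMARY KEY,
    language_id integer NOT NULL REFERENCES languages (id)
);

CREATE TABLE messages (
    id        serial PRIMARY KEY,   -- assumed surrogate key
    message   text NOT NULL,
    locale_id integer NOT NULL REFERENCES locales (id),
    vector    tsvector
);

-- Sample rows copied from the tables shown above.
INSERT INTO languages (id, language) VALUES (4, 'English'), (5, 'German'), (6, 'French');
INSERT INTO locales (id, language_id) VALUES (1, 4), (2, 5), (3, 6);
INSERT INTO messages (message, locale_id) VALUES
    ('Hi there', 1),
    ('Is a test', 2),
    ('Le french', 3);
```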

2 Answers:

Answer 0: (score: 1)

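  • You don't need the subquery
  • You don't need to re-select messages (the target table is already in the range table)
  • You need to correlate the source query with the rows being updated

  UPDATE messages msg
  SET vector = to_tsvector(lang.language::regconfig, msg.message)
  FROM locales as loco
  JOIN languages as lang ON loco.language_id = lang.id
  WHERE msg.locale_id = loco.id;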
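To see why the correlation matters: when an UPDATE ... FROM has no condition linking the target table to the FROM items, every target row is paired with every source row. The PostgreSQL documentation for UPDATE notes that if a target row joins to more than one source row, only one of them is used, and which one is not readily predictable. A rough way to observe the difference, using the table names from the question (the aliases are just for illustration):

```sql
-- No condition ties msg to the joined rows, so each message is paired
-- with every (locale, language) combination.
SELECT msg.message, count(*) AS candidate_rows
FROM messages AS msg
CROSS JOIN (locales AS loco JOIN languages AS lang ON loco.language_id = lang.id)
GROUP BY msg.message;

-- With the join on locale_id, each message is paired only with its own language.
SELECT msg.message, lang.language
FROM messages AS msg
JOIN locales AS loco ON msg.locale_id = loco.id
JOIN languages AS lang ON loco.language_id = lang.id;
```

If the first query reports more than one candidate row per message, the uncorrelated UPDATE has several possible source rows to choose from for each target row.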

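Answer 1: (score: 0)

It turns out you have to match the ID from the aliased subquery against the row being updated, because the UPDATE and the subquery both read from the same table: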

  UPDATE messages 
  SET vector = to_tsvector(messages_languages.language::regconfig, messages_languages.message) 
  FROM (
    select t1.id, t3.language, t1.message 
    from messages as t1 
    inner join locales as t2 on (t1.locale_id = t2.id) 
    inner join languages as t3 on (t2.language_id = t3.id)
  ) messages_languages
  -- Need to make sure you're referencing the same row in the subquery by comparing IDs
  WHERE messages.id = messages_languages.id;
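After running the UPDATE, one way to check that each row really was built with its own language is to recompute the expected vector and compare it with the stored one. This is only a sketch and assumes messages has an id column, as the query above does:

```sql
-- Recompute the vector per row from its joined language and flag any mismatch.
SELECT m.id,
       m.message,
       l.language,
       m.vector,
       m.vector IS NOT DISTINCT FROM
           to_tsvector(l.language::regconfig, m.message) AS matches
FROM messages AS m
JOIN locales AS lo ON m.locale_id = lo.id
JOIN languages AS l ON lo.language_id = l.id
ORDER BY m.id;
```

Any row where matches is false was built with a different configuration or a different message than the one it is joined to.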