Filter rows in a PostgreSQL table based on column match conditions

Time: 2020-07-13 01:01:01

Tags: postgresql

I have the following table in PostgreSQL 11.0:

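col1    col2      col3      col4
1       a         a         a
1       a         a         a_1
1       a         a         a_2
1       b         b         c
2       d         d         c
3       e         d         e

I want to filter the table above so that, when col2 and col4 are equal, only that matching row is selected and the two rows below it are excluded. When col2 and col4 are not equal, the rows where col2 = col3 should be kept.

The desired output is:

col1    col2      col3      col4
1       a         a         a
1       b         b         c
2       d         d         c
3       e         d         e

I am trying the following query, without success so far:

select * from table1
where col2 = col4
union
select * from table1
where col2 != col4 and col2 = col3

However, this also returns rows for which a match already exists, and I want those rows excluded from the final output.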
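
To see why the UNION attempt does not filter anything here, the following is a minimal Python sketch that applies the same two predicates to the six sample rows from the question. The tuple layout (col1, col2, col3, col4) and the variable names are assumptions made for illustration; the sample data contains no NULLs, so plain `==` and `!=` stand in for the SQL comparisons.

```python
# Sample rows from the question, as (col1, col2, col3, col4) tuples.
ROWS = [
    (1, "a", "a", "a"),
    (1, "a", "a", "a_1"),
    (1, "a", "a", "a_2"),
    (1, "b", "b", "c"),
    (2, "d", "d", "c"),
    (3, "e", "d", "e"),
]

# First branch of the UNION: WHERE col2 = col4
matches = {r for r in ROWS if r[1] == r[3]}

# Second branch: WHERE col2 != col4 AND col2 = col3
fallback = {r for r in ROWS if r[1] != r[3] and r[1] == r[2]}

# UNION removes duplicates, which a set union mirrors.
union = matches | fallback

for row in sorted(union):
    print(row)
```

Each branch is evaluated row by row, so the second branch has no way of knowing that another row with the same col2 value already matched in the first branch. On this data the union therefore contains every row of the table.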

2 answers:

Answer 0: (score: 1)

I would use:

SELECT DISTINCT ON (col2) *
FROM table1
WHERE col2 = col4 OR col2 = col3
ORDER BY col2, col2 IS DISTINCT FROM col4;

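This relies on FALSE < TRUE: for rows where col2 = col4, the expression col2 IS DISTINCT FROM col4 is FALSE, so those rows sort first within each col2 group and DISTINCT ON keeps them.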
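
As a rough illustration of what the DISTINCT ON query does, here is a minimal Python sketch that replays its three steps (filter, sort, keep the first row per col2) on the sample rows from the question. The function name `distinct_on_col2` and the tuple layout are assumptions for this example. Because the sample data has no NULLs, `!=` is used in place of IS DISTINCT FROM; with NULLs the two would differ.

```python
from itertools import groupby

# Sample rows from the question, as (col1, col2, col3, col4) tuples.
ROWS = [
    (1, "a", "a", "a"),
    (1, "a", "a", "a_1"),
    (1, "a", "a", "a_2"),
    (1, "b", "b", "c"),
    (2, "d", "d", "c"),
    (3, "e", "d", "e"),
]


def distinct_on_col2(rows):
    """Mimic the DISTINCT ON (col2) query on a list of tuples."""
    # WHERE col2 = col4 OR col2 = col3
    kept = [r for r in rows if r[1] == r[3] or r[1] == r[2]]
    # ORDER BY col2, col2 IS DISTINCT FROM col4
    # (False sorts before True in Python as well)
    kept.sort(key=lambda r: (r[1], r[1] != r[3]))
    # DISTINCT ON (col2): keep the first row of each col2 group
    return [next(group) for _, group in groupby(kept, key=lambda r: r[1])]


result = distinct_on_col2(ROWS)
for row in result:
    print(row)
```

Note that this keeps exactly one row per col2 value. If several rows share a col2 value and none of them has col2 = col4, PostgreSQL returns just one of them, and which one is not determined unless further ORDER BY columns are added.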

Answer 1: (score: 0)

As I understand it, you want distinct col2 values under the given conditions.

Try this:

with cte as (
    select *,
           case
               when col2 = col4 then 2
               when col2 = col3 then 1
               else 0
           end as "flag"
    from table1
)
select distinct on (col2) col1, col2, col3, col4
from cte
where "flag" > 0
order by col2, "flag" desc;

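
The flag idea can also be tried without a PostgreSQL server. The sketch below is an assumption-laden adaptation, not the answer's exact query: it uses Python's built-in sqlite3 module, and since SQLite has no DISTINCT ON, it swaps in ROW_NUMBER() over a col2 partition to pick the highest-flag row per group. This needs an SQLite build with window function support (3.25 or later). The helper name `run_flag_query` and the column types are hypothetical.

```python
import sqlite3

# Sample rows from the question, as (col1, col2, col3, col4) tuples.
ROWS = [
    (1, "a", "a", "a"),
    (1, "a", "a", "a_1"),
    (1, "a", "a", "a_2"),
    (1, "b", "b", "c"),
    (2, "d", "d", "c"),
    (3, "e", "d", "e"),
]

# Same flag logic as the CTE above; ROW_NUMBER() replaces DISTINCT ON.
QUERY = """
WITH cte AS (
    SELECT *,
           CASE
               WHEN col2 = col4 THEN 2
               WHEN col2 = col3 THEN 1
               ELSE 0
           END AS flag
    FROM table1
),
ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY col2 ORDER BY flag DESC) AS rn
    FROM cte
    WHERE flag > 0
)
SELECT col1, col2, col3, col4
FROM ranked
WHERE rn = 1
ORDER BY col2
"""


def run_flag_query():
    """Load the sample rows into an in-memory database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE table1 (col1 INTEGER, col2 TEXT, col3 TEXT, col4 TEXT)"
        )
        conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = run_flag_query()
for row in result:
    print(row)
```

The ROW_NUMBER() form is more verbose than DISTINCT ON, but it is portable to other databases that support window functions.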