Retrieving similar data in Oracle

Time: 2018-01-08 09:13:42

Tags: sql oracle

Suppose I have the following table:

id  name
---------
1   Matt
2   Ryan
3   Joseph
4   Matt1
5   5Joseph
6   David
7   Matt_43

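We can see that Matt and Joseph each appear more than once: Matt shows up as Matt, Matt1 and Matt_43, and Joseph shows up twice, as Joseph and 5Joseph.

Is there a way to retrieve this kind of data?

5 answers:

Answer 0: (score: 2)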

You can self-join the table and compare the names with LIKE.

For example:

select t1.id as id1, t1.name as name1, t2.id as id2, t2.name as name2
from your_table t1
join your_table t2
  on (upper(t1.name) like '%'|| upper(t2.name) ||'%' and t1.id <> t2.id)
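One caveat with building the LIKE pattern from another row: a name such as Matt_43 contains `_`, which LIKE treats as a single-character wildcard (and `%` as a multi-character one). A minimal sketch that avoids this by using Oracle's INSTR for a plain substring test, assuming the same placeholder table name your_table as above:

```sql
-- Sketch only: your_table is the placeholder name from the answer above.
-- INSTR performs a literal substring search, so '_' and '%' inside a name
-- are compared as ordinary characters rather than as LIKE wildcards.
select t1.id as id1, t1.name as name1, t2.id as id2, t2.name as name2
from your_table t1
join your_table t2
  on instr(upper(t1.name), upper(t2.name)) > 0
 and t1.id <> t2.id;
```

On the sample data this should pair Matt1 and Matt_43 with Matt, and 5Joseph with Joseph.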

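Answer 1: (score: 1)

You can use the LIKE operator with wildcards. A pattern such as '%matt%' returns every value that contains matt, whatever comes before or after it.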
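A minimal sketch of such a query, assuming the table is called your_table (a placeholder borrowed from Answer 0, not a name given in this answer). Because Oracle compares strings case-sensitively, the column is lower-cased so that the pattern '%matt%' also matches Matt:

```sql
-- Sketch only: your_table and the search term 'matt' are illustrative.
-- LOWER() normalizes the column so the lowercase pattern matches 'Matt'.
select id, name
from your_table
where lower(name) like '%matt%';
```

This handles one known search term at a time; finding every group of similar names without knowing them in advance needs a self-join like the one in Answer 0.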


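Answer 2: (score: 1)

Oracle compares strings case-sensitively, so two words that are otherwise identical are treated as different values if one starts with an uppercase letter and the other with a lowercase letter.

To match both in your search, convert both sides to the same case:  select name from emp where UPPER(name) like '%' || UPPER('Ryan') || '%';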

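Answer 3: (score: 1)

Assuming you need to retrieve rows whose name column contains similar text, you can join the table to itself, use like in the join condition, and return distinct records, as shown below.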

select distinct t1.id as id, 
       t1.name as name
from table1 t1
join table1 t2
  on ((t1.name like '%'|| t2.name ||'%' or t2.name like '%'|| t1.name ||'%') 
      and t1.id <> t2.id);

Result:

+----+---------+
| ID |  NAME   |
+----+---------+
|  4 | Matt1   |
|  7 | Matt_43 |
|  1 | Matt    |
|  3 | Joseph  |
|  5 | 5Joseph |
+----+---------+

Update:

If you don't want the comparison to be case-sensitive, use upper:

select distinct t1.id as id, 
       t1.name as name
from table1 t1
join table1 t2
  on ((upper(t1.name) like '%'|| upper(t2.name) ||'%' or upper(t2.name) like '%'|| upper(t1.name) ||'%') 
      and t1.id <> t2.id)

Answer 4: (score: 1)

Depending on how you define 'similar', you could look at soundex() or the utl_match package.
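A rough sketch of how these could be combined in a self-join. The table name your_table and the similarity threshold of 80 are assumptions chosen for illustration, not values from this answer. soundex() compares phonetic codes, utl_match.edit_distance counts character edits, and utl_match.jaro_winkler_similarity returns a score from 0 to 100:

```sql
-- Sketch only: your_table is a placeholder and the threshold 80 is arbitrary.
-- t1.id < t2.id lists each pair once instead of in both directions.
select t1.name as name1,
       t2.name as name2,
       utl_match.edit_distance(t1.name, t2.name)           as edit_dist,
       utl_match.jaro_winkler_similarity(t1.name, t2.name) as jw_score
from your_table t1
join your_table t2
  on t1.id < t2.id
where soundex(t1.name) = soundex(t2.name)
   or utl_match.jaro_winkler_similarity(t1.name, t2.name) >= 80;
```

The threshold will likely need tuning against real data: set too low it pairs unrelated names, set too high it misses variants such as 5Joseph.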
