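Basically, I have a street address column (e.g. 321 North Laredo Road) with more than 10,000 records. Some records are not abbreviated at all, some are partially abbreviated, and some are fully abbreviated. The punctuation (periods) is inconsistent; it was that way before I got access to the data. I need to convert the directionals (N, S, E, W) and street types (Road, Blvd, Ave, etc.) in every address to their abbreviated forms and store the result in a second column. Any suggestions on methodology, or an already written solution, would be appreciated. I am getting mixed results with the following test code: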
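/* First copy the original address, then abbreviate directionals in the copy
   (replacing from street each time would overwrite the previous replacement) */
UPDATE hospital
set tiger_address = street
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'North', 'N')
where tiger_address LIKE "%North%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'South', 'S')
where tiger_address LIKE "%South%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'East', 'E')
where tiger_address LIKE "%East%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'West', 'W')
where tiger_address LIKE "%West%"
;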
/* Second look for streets to abbreviate */
UPDATE hospital
set tiger_address = replace(tiger_address, 'Alley', 'Aly')
where tiger_address LIKE "%Alley%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'Anex', 'Anx')
where tiger_address LIKE "%Anex%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'Arcade', 'Arc')
where tiger_address LIKE "%Arcade%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'Avenue', 'Ave')
where tiger_address LIKE "%Avenue%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'Bayou', 'Byu')
where tiger_address LIKE "%Bayou%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'Beach', 'Bch')
where tiger_address LIKE "%Beach%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'Bend', 'Bnd')
where tiger_address LIKE "%Bend%"
;
/* Handle the plural first so that 'Bluffs' is matched as a whole word */
UPDATE hospital
set tiger_address = replace(tiger_address, 'Bluffs', 'Blfs')
where tiger_address LIKE "%Bluffs%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'Bluff', 'Blf')
where tiger_address LIKE "%Bluff%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'Street', 'St')
where tiger_address LIKE "%street%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'Road', 'Rd')
where tiger_address LIKE "%road%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'Drive', 'Dr')
where tiger_address LIKE "%drive%"
;
UPDATE hospital
set tiger_address = replace(tiger_address, 'Boulevard', 'Blvd')
where tiger_address LIKE "%boulevard%"
;
Surely there must be a better way. Thanks.
Answer 0: (score: 0)
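With MariaDB's regexp_replace and positive look-ahead and look-behind assertions, you can remove a chunk from the middle of a word in a single pass. For Bayou and Boulevard, where the abbreviation cannot be produced by deleting one contiguous chunk, I use replace.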
select replace(replace(
regexp_replace(
'North East South West Alley Anex Arcade Avenue '
'Bayou Beach Bend Bluff Bluffs Street Road Drive Boulevard',
'(?<=N)orth|(?<=S)outh|(?<=E)ast|(?<=W)est|(?<=Al)le(?=y)|'
'(?<=An)e(?=x)|(?<=Arc)ade|(?<=Ave)nue|(?<=B)ea(?=ch)|(?<=B)e(?=nd)|'
'(?<=Bl)uf(?=f)|(?<=St)reet|(?<=R)oa(?=d)|(?<=Dr)ive',
''),
'Bayou', 'Byu'),
'Boulevard', 'Blvd')
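
One caveat with the pattern above is that it also matches inside longer words: 'Northern', for instance, would become 'Nern'. The following is a minimal sketch of how the same idea might be applied to the table from the question. It assumes the hospital table with its street and tiger_address columns, a MariaDB version that provides REGEXP_REPLACE (10.0.5 or later), and the default sql_mode, so that a backslash must be doubled inside a string literal. The `\\b` word boundaries are my own addition, not part of the original query, and only a subset of the street types is shown.

```sql
-- Sketch only: table and column names are taken from the question.
-- \\b is a PCRE word boundary (backslash doubled for the SQL string literal);
-- it keeps words such as 'Northern' or 'Roadside' from being altered.
UPDATE hospital
SET tiger_address =
    REPLACE(REPLACE(
        REGEXP_REPLACE(
            street,
            '(?<=\\bN)orth\\b|(?<=\\bS)outh\\b|(?<=\\bE)ast\\b|(?<=\\bW)est\\b|'
            '(?<=\\bSt)reet\\b|(?<=\\bR)oa(?=d\\b)|(?<=\\bDr)ive\\b|(?<=\\bAve)nue\\b',
            ''),
        'Bayou', 'Byu'),
    'Boulevard', 'Blvd');
```

Because the statement reads from street and writes to tiger_address in one pass, it can be rerun safely and leaves the original column untouched. It is worth checking the result on a sample of rows first, since REGEXP_REPLACE follows the case sensitivity of the column's collation.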