SQL: parse an address string with spaces and commas into separate fields

Time: 2016-05-16 19:12:37

Tags: sql string parsing sql-server-2012

I have this list of data:

    ADDRESS
    '204 W 8th St, ABC, New York, NY 12345-6789'
    '222 N Barley St, Pittsburh, Pennsylvania, PA 98765-4321'
    '1 S Main St, Good Day, Washington, PA 13579-2468'
    '232 Justin Blvd, Sacramento, CA 86420-7531'

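I want to parse it into 5 fields: mailing address 1, mailing address 2, city, state, and ZIP code. I was able to parse some of the individual pieces, though not correctly, for example: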

    select distinct StreetName =
    substring(ADDRESS, CHARINDEX(',', ADDRESS+',', 1) +1,
    CHARINDEX(',', ADDRESS+',', CHARINDEX(',', ADDRESS+',', 1) +1) -
    CHARINDEX(',', ADDRESS+',', 1) -1)
    from Bills
    where ISNUMERIC(LEFT(ADDRESS,1))=1
    AND LEN(ADDRESS) > 1

The main problem is mailing address 2. What should I do to split the string into 5 columns?

1 answer:

Answer 0: (score: 1)

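Parsing addresses can be quite a tricky task, and I really don't know all the rules. I'll show how to do the computation step by step in SQL, the way you would in any other programming language, using CROSS / OUTER APPLY.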

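    select sd.addr, t.*, tt.*, ttt.*
    from (
        select '204 W 8th St, ABC, New York, NY 12345-6789' as addr
    ) sd
    -- Step 1: count the commas and locate the first and the last one.
    cross apply (
        select nParts = len(addr) - len(replace(addr, ',', ''))   -- number of commas
        , lastPos = len(addr) - charindex(',', reverse(addr), 1) + 1   -- last comma
        , secondPos = charindex(',', addr, 1)   -- first comma, where the second part begins
        ) t
    -- Step 2: cut the string into first part, middle, and last part.
    cross apply (
        select first = left(addr, secondPos - 1)
        , middle = substring(addr, secondPos + 1, lastPos - secondPos - 1)
        , last = right(addr, len(addr) - lastPos)
        ) tt
    -- Step 3: only when there are more than 3 commas, locate the commas inside middle.
    -- Positions are relative to middle; the columns are NULL when the filter fails.
    outer apply (
        select thirdPos = charindex(',', middle, 1)
        , fourthPos = len(middle) - charindex(',', reverse(middle), 1) + 1
        where nParts > 3
        ) ttt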

And so on. Add steps and logic as needed.
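As a rough illustration of where those extra steps could lead, here is a minimal sketch that carries the same CROSS APPLY chain through to five columns. It rests on assumptions that are not in the question: every address has at least two commas, the last comma-separated part is always "state, space, ZIP", the part before it is the city, and anything between the first part and the city is mailing address 2. The step aliases (c, p, r, m) and the output column names are made up for the example.

```sql
-- Sketch only: assumes the shape 'addr1[, addr2], city, ST zip' with at least two commas.
select sd.addr
     , MailingAddress1 = p.firstPart
     , MailingAddress2 = m.addr2      -- NULL when the address has only three parts
     , City            = m.city
     , StateCode       = left(p.stateZip, charindex(' ', p.stateZip + ' ') - 1)
     , ZipCode         = ltrim(substring(p.stateZip, charindex(' ', p.stateZip + ' ') + 1, len(p.stateZip)))
from (values
      ('204 W 8th St, ABC, New York, NY 12345-6789')
    , ('232 Justin Blvd, Sacramento, CA 86420-7531')
) sd(addr)
-- Step 1: positions of the first and the last comma.
cross apply (
    select firstComma = charindex(',', sd.addr)
         , lastComma  = len(sd.addr) - charindex(',', reverse(sd.addr)) + 1
) c
-- Step 2: first part, middle, and trailing 'ST zip' part, each trimmed.
cross apply (
    select firstPart = ltrim(rtrim(left(sd.addr, c.firstComma - 1)))
         , middle    = ltrim(rtrim(substring(sd.addr, c.firstComma + 1, c.lastComma - c.firstComma - 1)))
         , stateZip  = ltrim(rtrim(substring(sd.addr, c.lastComma + 1, len(sd.addr))))
) p
-- Step 3: distance of the last comma from the end of middle (0 when middle has no comma).
cross apply (
    select revComma = charindex(',', reverse(p.middle))
) r
-- Step 4: split middle at its last comma into address 2 and city.
cross apply (
    select addr2 = case when r.revComma > 0
                        then rtrim(left(p.middle, len(p.middle) - r.revComma)) end
         , city  = case when r.revComma > 0
                        then ltrim(right(p.middle, r.revComma - 1))
                        else p.middle end
) m
```

For the two sample rows this should give ABC / New York / NY / 12345-6789 and NULL / Sacramento / CA / 86420-7531 for the last four columns. Addresses with fewer than two commas would make LEFT or SUBSTRING fail with an invalid length error, so real data would need additional CASE guards or a pre-filter.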