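Format an address string

Time: 2015-01-30 09:00:18

Tags: sql-server tsql

Using T-SQL, I need to know the best way to split a string containing an address into separate fields so that I can reformat them. It must be able to handle different input formats.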

AddressIn:           AddressOut:
                     StreetNumber:     StreetName:
'25 Main Street'   | '25'              'Main Street'
'25 MainStreet'    | '25'              'MainStreet'
'Main Street 25'   | '25'              'Main Street'
'MainStreet 25'    | '25'              'MainStreet'
'25B Main Street'  | '25B'             'Main Street'
'25B MainStreet'   | '25B'             'MainStreet'
'Main Street 25B'  | '25B'             'Main Street'
'MainStreet 25B'   | '25B'             'MainStreet'

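Any help is greatly appreciated.

3 answers:

Answer 0: (score: 2)

This does what you want, but I agree with the comments above: SQL is not an ideal tool for this kind of parsing.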

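select
-- StreetNumber: the token that starts at the first digit in the string
CASE 
    WHEN patindex('%[0-9]%',theAddress)=1 
    -- number comes first: take everything up to the first space, then trim that space
    THEN rtrim(substring(theAddress,1,charindex(' ',theAddress) )) 
    -- number comes last: take everything from the first digit onward
    ELSE rtrim(substring(theAddress,patindex('%[0-9]%',theAddress),99) )
END as StreetNumber,
-- StreetName: whatever is left on the other side of the number
CASE 
    WHEN patindex('%[0-9]%',theAddress)=1 
    THEN substring(theAddress,charindex(' ',theAddress)+1,99) 
    -- trim the space that separated the name from the trailing number
    ELSE rtrim(substring(theAddress,1,patindex('%[0-9]%',theAddress)-1))
END as StreetName
FROM <yourtable>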

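Caveats, as you can see from the comments:

  • Performance will not be great, especially if your table is large.
  • Address formats other than the ones in your examples may break the code.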
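
To make the second caveat concrete, here is a minimal sketch of how the same PATINDEX/CHARINDEX approach could be guarded against inputs that contain no digit or no space. The table variable @t (standing in for <yourtable>), its sample rows, the aliases digitPos and spacePos, and the choice to return NULL when no number is found are all assumptions made for illustration, not part of the answer above.

```sql
-- Hypothetical sample data; @t stands in for <yourtable>.
DECLARE @t TABLE (theAddress NVARCHAR(200));
INSERT INTO @t VALUES
    (N'25 Main Street'),
    (N'Main Street 25B'),
    (N'Main Street'),   -- no digit at all
    (N'25');            -- a number but no space

SELECT
    t.theAddress,
    CASE
        WHEN p.digitPos = 0 THEN NULL                          -- no number found
        WHEN p.digitPos = 1 AND p.spacePos > 0
            THEN LEFT(t.theAddress, p.spacePos - 1)            -- leading number
        ELSE RTRIM(SUBSTRING(t.theAddress, p.digitPos, 200))   -- trailing number
    END AS StreetNumber,
    CASE
        WHEN p.digitPos = 0 THEN t.theAddress                  -- whole string is the name
        WHEN p.digitPos = 1 AND p.spacePos > 0
            THEN SUBSTRING(t.theAddress, p.spacePos + 1, 200)
        ELSE RTRIM(LEFT(t.theAddress, p.digitPos - 1))
    END AS StreetName
FROM @t AS t
CROSS APPLY (
    -- compute both positions once per row instead of repeating the calls
    SELECT PATINDEX('%[0-9]%', t.theAddress) AS digitPos,
           CHARINDEX(' ', t.theAddress)      AS spacePos
) AS p;
```

This still assumes the number is the first or last token; an address such as 'Main 25B Street' would need further handling.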

Answer 1: (score: 1)

Follow these steps:

Step_1: (split the string and extract only the part that contains the street number)

CREATE FUNCTION [dbo].[splitAddress_1] 
( 
    @string NVARCHAR(MAX)     
) 
RETURNS @output_2 TABLE(street_number NVARCHAR(50),street_name NVARCHAR(100))
BEGIN 
    DECLARE @delimiter CHAR(1)
    SET @delimiter=' '
    DECLARE @output TABLE(splitdata NVARCHAR(MAX))
    DECLARE @start INT, @end INT 
    DECLARE @CHKStr VARCHAR(50)
    SELECT @start = 1, @end = CHARINDEX(@delimiter, @string) 
    WHILE @start < LEN(@string) + 1 BEGIN 
        IF @end = 0  
            SET @end = LEN(@string) + 1

        SET @CHKStr=SUBSTRING(@string, @start, @end - @start)
        IF @CHKStr LIKE '%[0-9]%'BEGIN
        INSERT INTO @output_2(street_number) VALUES(@CHKStr)
        END
        ELSE
        BEGIN
        INSERT INTO @output (splitdata)  
        VALUES(@chksTR)
        END 
        SET @start = @end + 1 
        SET @end = CHARINDEX(@delimiter, @string, @start)

    END 
    UPDATE @output_2
    SET street_name=
    (SELECT STUFF((SELECT ' ' + splitdata FROM @output FOR XML PATH('')),1,1,''))
    RETURN 
END

And use it:

select * from dbo.splitAddress_1('25B Main Street')

Output: '25B' 'Main Street'

Step_2: (Step_1 is the main idea and the most important part. The function below only makes the selection easier; it could be written in other ways.)

CREATE TABLE address_table (AddressIn  NVARCHAR(200))
INSERT INTO address_table VALUES
    ('25 Main Street'),
    ('25 MainStreet'),
    ('Main Street 25'),
    ('MainStreet 25'),
    ('25B Main Street'),
    ('25B MainStreet'),
    ('Main Street 25B'),
    ('MainStreet 25B'),
    ('Main 25B Street')

Then:

CREATE FUNCTION [dbo].[splitAddress_2]()
RETURNS @output TABLE(street_number NVARCHAR(50),street_name NVARCHAR(100))
BEGIN 
 DECLARE @adr varchar(200)
 DECLARE cr CURSOR
    FOR SELECT AddressIn FROM address_table
OPEN cr
FETCH NEXT FROM cr into @adr;
while(@@fetch_status=0)
begin
insert into @output(street_number,street_name)
select * from dbo.splitAddress_1(@adr)
FETCH NEXT FROM cr into @adr;
end
close cr
deallocate cr
 RETURN 
END

Finally, run this query: select * from dbo.splitAddress_2()

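Note: it does not matter whether the number part is at the beginning, in the middle, or at the end of the address string. For example, this solution also works for an address like 'Main 25B Street'.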
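
Since Step_2 only exists to apply the splitter to every row, the cursor could probably be replaced by a set-based query. The sketch below assumes dbo.splitAddress_1 and address_table exist exactly as defined above; the aliases a and s are just illustrative names.

```sql
-- Assumes dbo.splitAddress_1 and address_table were created as shown above.
-- CROSS APPLY calls the table-valued function once per row of address_table.
SELECT a.AddressIn,
       s.street_number,
       s.street_name
FROM address_table AS a
CROSS APPLY dbo.splitAddress_1(a.AddressIn) AS s;
```

Because splitAddress_1 returns no row when an address contains no digit, CROSS APPLY drops such addresses from the result, just as the cursor version does. Switching to OUTER APPLY would keep them, with NULL in both output columns.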

Answer 2: (score: 0)

@Sparky's answer, modified for robustness.

DECLARE @Addresses TABLE (Address  NVARCHAR(200))
INSERT INTO @Addresses VALUES
    ('25 Main Street'),
    ('25 MainStreet'),
    ('Main Street 25'),
    ('MainStreet 25'),
    ('25B Main Street'),
    ('25B MainStreet'),
    ('Main Street 25B'),
    ('MainStreet 25B')
select
CASE 
    WHEN patindex('%[0-9]%',Address)=1 
    THEN substring(Address,1,charindex(' ',Address) ) 
    ELSE rtrim(substring(Address,patindex('%[0-9]%',Address),99) )
END as StreetNumber,
CASE 
    WHEN patindex('%[0-9]%',Address)=1 
    THEN REPLACE(Address,substring(Address,1,charindex(' ',Address) ),'')
    ELSE REPLACE(Address,rtrim(substring(Address,patindex('%[0-9]%',Address),99) ),'')
END as StreetName
FROM @Addresses

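This code should handle more types of addresses.

This version handles an additional case, where the number is in the middle of the address: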

DECLARE @Addresses TABLE (Address  NVARCHAR(200))
INSERT INTO @Addresses VALUES
    ('25 Main Street'),
    ('25 MainStreet'),
    ('Main Street 25'),
    ('MainStreet 25'),
    ('25B Main Street'),
    ('25B MainStreet'),
    ('Main Street 25B'),
    ('MainStreet 25B'),
    ('Main 25B Street')
select
CASE
    WHEN CHARINDEX(' ',Address,patindex('%[0-9]%',Address))-patindex('%[0-9]%',Address) > 0
    THEN substring(Address, patindex('%[0-9]%',Address), CHARINDEX(' ',Address,patindex('%[0-9]%',Address))-patindex('%[0-9]%',Address))
    WHEN patindex('%[0-9]%',Address)=1 
    THEN substring(Address,1,charindex(' ',Address) ) 
    ELSE rtrim(substring(Address,patindex('%[0-9]%',Address),99) )
END as StreetNumber,
CASE
    WHEN CHARINDEX(' ',Address,patindex('%[0-9]%',Address))-patindex('%[0-9]%',Address) > 0
    THEN REPLACE(Address,substring(Address, patindex('%[0-9]%',Address), CHARINDEX(' ',Address,patindex('%[0-9]%',Address))-patindex('%[0-9]%',Address)),'')
    WHEN patindex('%[0-9]%',Address)=1 
    THEN REPLACE(Address,substring(Address,1,charindex(' ',Address) ),'')
    ELSE REPLACE(Address,rtrim(substring(Address,patindex('%[0-9]%',Address),99) ),'')
END as StreetName
FROM @Addresses
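
One thing to watch with the REPLACE-based StreetName: removing the number leaves its neighbouring space behind, so '25 Main Street' yields ' Main Street', 'Main Street 25' yields 'Main Street ', and 'Main 25B Street' yields 'Main  Street' with a double space. A minimal cleanup sketch follows; the table variable @Names and the column alias CleanStreetName are hypothetical and only stand in for the StreetName expression above.

```sql
-- Hypothetical leftovers produced by the REPLACE-based StreetName expression.
DECLARE @Names TABLE (StreetName NVARCHAR(200));
INSERT INTO @Names VALUES
    (N' Main Street'),   -- leading space (number was first)
    (N'Main Street '),   -- trailing space (number was last)
    (N'Main  Street');   -- double space (number was in the middle)

-- Collapse one double space, then trim both ends.
SELECT LTRIM(RTRIM(REPLACE(StreetName, N'  ', N' '))) AS CleanStreetName
FROM @Names;
-- each of the three rows comes back as 'Main Street'
```

A single REPLACE is enough here because only one token is removed per address; longer runs of spaces would need a different approach.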