Split an Address column into separate columns in a SQL view

Asked: 2010-02-05 16:44:25

Tags: sql sql-server database sql-server-2005

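I have an Address column in a table that I need to split into multiple columns in a view in SQL Server 2005. I need to split the column on the line feed character, CHAR(10), and the column can hold anywhere from 1 to 4 lines (0 to 3 line feeds). Below are a couple of examples of what I need to do. What is the simplest way to achieve this?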

Examples:

Address                 Address1       Address2       Address3            Address4
------------        =   ------------   ------------   ------------------  ---------
My Company              My Company     123 Main St.   Somewhere,NY 12345
123 Main St.
Somewhere,NY 12345

Address                 Address1       Address2      Address3      Address4
------------        =   ------------   ----------    -----------   ---------
123 Main St.            123 Main St.

3 answers:

Answer 0 (score: 3)

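This splits the address with the PARSENAME function. Periods already in the data are first swapped for a placeholder (^), each line feed is then turned into a period, and PARSENAME picks out the individual parts. A CASE expression works out how many lines there are so that each part lands in the right column.

This method will not work if you have more than 4 lines, because PARSENAME handles at most four parts.

Edit: added the code to reverse the order.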

    create table #test (address varchar(1000))

    --test data
    insert #test values('My Company
    123 Main St.         
    Somewhere,NY 12345')

    insert #test values('My Company2
    666 Main St.  
    Bla Bla       
    Somewhere,NY 12345')

    insert #test values('My Company2')

    --split happens here
    select
        -- PARSENAME numbers parts from the right, so line 1 is part ParseLen + 1;
        -- the outer REPLACE turns the ^ placeholders back into periods
        replace(parsename(address, ParseLen + 1), '^', '.') as Address1,
        replace(parsename(address, ParseLen),     '^', '.') as Address2,
        replace(parsename(address, ParseLen - 1), '^', '.') as Address3,
        replace(parsename(address, ParseLen - 2), '^', '.') as Address4
    from (
        select
            -- protect real periods as ^, then turn line feeds into periods
            -- (a single trailing line feed is dropped first)
            case ascii(right(address, 1)) when 10 then
                replace(replace(left(address, len(address) - 1), '.', '^'), char(10), '.')
            else
                replace(replace(address, '.', '^'), char(10), '.')
            end as address,
            -- ParseLen = number of line feeds that separate two lines
            case ascii(right(address, 1)) when 10 then
                len(replace(replace(address, '.', '^'), char(10), '.')) -
                len(replace(replace(address, '.', '^'), char(10), '')) - 1
            else
                len(replace(replace(address, '.', '^'), char(10), '.')) -
                len(replace(replace(address, '.', '^'), char(10), ''))
            end as ParseLen
        from #test
    ) x

Answer 1 (score: 1)

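This is pretty nasty... If you want to deal with each address line separately, I strongly recommend storing it that way in the first place. Rather than keep doing what you're doing, add the additional columns, fix the existing data once (instead of "fixing" it every time you run a query), and then adjust the stored procedure that does the insert/update so that it knows to use the other columns.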

    DECLARE @Address TABLE(id INT IDENTITY(1,1), ad VARCHAR(MAX));

    INSERT @Address(ad) SELECT 'line 1
    line 2
    line 3
    line 4'
    UNION ALL SELECT 'row 1
    row 2
    row 3'
    UNION ALL SELECT 'address 1
    address 2'
    UNION ALL SELECT 'only 1 entry here'
    UNION ALL SELECT 'let us try 5 lines
    line 2
    line 3
    line 4 
    line 5';

    -- Each level peels off one line (up to and including its line feed) and
    -- passes the remainder down as Rest; the outer query strips CR/LF characters.
    SELECT
        id,
        Line1 = REPLACE(REPLACE(COALESCE(Line1, ''), CHAR(10), ''), CHAR(13), ''),
        Line2 = REPLACE(REPLACE(COALESCE(Line2, ''), CHAR(10), ''), CHAR(13), ''),
        Line3 = REPLACE(REPLACE(COALESCE(SUBSTRING(Rest, 1, COALESCE(NULLIF(CHARINDEX(CHAR(10), Rest), 0), LEN(Rest))), ''), CHAR(10), ''), CHAR(13), ''),
        -- anything after the third line feed ends up in Line4
        Line4 = REPLACE(REPLACE(COALESCE(SUBSTRING(Rest, NULLIF(CHARINDEX(CHAR(10), Rest) + 1, 1), LEN(Rest)), ''), CHAR(10), ''), CHAR(13), '')
    FROM
    (
        SELECT 
            id,
            ad,
            Line1,
            Line2 = SUBSTRING(Rest, 1, COALESCE(NULLIF(CHARINDEX(CHAR(10), Rest), 0), LEN(Rest))),
            Rest = SUBSTRING(Rest, NULLIF(CHARINDEX(CHAR(10), Rest) + 1, 1), LEN(Rest))
        FROM
        (
            SELECT
                id,
                ad,
                Line1 = SUBSTRING(ad, 1, COALESCE(NULLIF(CHARINDEX(CHAR(10), ad), 0), LEN(ad))),
                Rest = SUBSTRING(ad, NULLIF(CHARINDEX(CHAR(10), ad) + 1, 1), LEN(ad))
            FROM
                @Address
        ) AS x
    ) AS y
    ORDER BY id;

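Denis's PARSENAME() trick is certainly a lot tidier, but you have to be very careful to use a replacement character that really cannot appear naturally in the data. The caret (^) is probably a good choice, but like I said, you need to be careful.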

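There are also software packages that are very good at cleaning up addresses and other demographic data. But cleaning up the data entry is the most important thing, and I'll keep stressing it... if each address line needs to be dealt with separately, then store them that way.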

Answer 2 (score: 0)

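Parsing text in SQL is no fun. If I had to do something like this, I would export the column to a CSV text file and parse it with a scripting language (such as Perl, PHP, or Python). That way I can take advantage of the scripting language's built-in string functions and regular expressions.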
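As a rough illustration of the scripting approach, here is a minimal Python sketch of the split itself. The function name `split_address` and the `max_lines` parameter are made up for this example, and it assumes that blank lines (for instance from a trailing line feed) can simply be dropped and that surrounding whitespace on each line is not wanted.

```python
def split_address(address, max_lines=4):
    """Split an address on line feeds into exactly max_lines columns.

    Hypothetical helper for illustration only; not part of any library.
    """
    # Treat CRLF like a plain line feed, then split on CHAR(10)
    lines = address.replace("\r\n", "\n").split("\n")
    # Trim each line and drop empty ones (assumption: blank lines carry no meaning)
    lines = [line.strip() for line in lines if line.strip()]
    if len(lines) > max_lines:
        raise ValueError("address has more than %d lines" % max_lines)
    # Pad with empty strings so every row yields the same number of columns
    return lines + [""] * (max_lines - len(lines))


print(split_address("My Company\n123 Main St.\nSomewhere,NY 12345"))
# → ['My Company', '123 Main St.', 'Somewhere,NY 12345', '']
```

Raising on a fifth line is just one possible choice; folding the extra lines into the last column would work as well, depending on what the data needs.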