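I have a table with 500k rows in which the address is stored in a single field, with the parts separated by CHAR(13) + CHAR(10). I have added 5 fields to the table and want to split the address into them.
I found this split function online and it seems to perform well. I cannot use PARSENAME, because I have 5 parts (PARSENAME handles at most 4) and a "." may also appear in the field.
The split function is a table-valued function, so I would have to loop over the rows and update the records. In the past I would have used a cursor, a SQL WHILE loop, or possibly even C# to do this, but I feel there must be a CTE or set-based answer.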
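Answer 0: (score: 3)
You have a few options:
You can create a temporary table, parse the addresses into it, and then update the original table by joining it to the temporary table.
Or
You can write your own T-SQL functions and use them in the UPDATE statement, like this: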
-- Scalar user-defined functions must be called with a schema prefix (dbo. here).
UPDATE myTable
SET address1 = dbo.myGetAddress1Function(address),
    address2 = dbo.myGetAddress2Function(address)....
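The first option (parse into a temporary table, then update through a join) could look roughly like the sketch below. It is only an illustration under stated assumptions: the stand-in table #myTable, its key column id, and the five target columns are made up for the example, and the parsing uses chained CHARINDEX calls rather than any particular split function.

```sql
-- Stand-in for the real table; the name #myTable and the key column "id" are assumptions.
CREATE TABLE #myTable
(
    id       INT IDENTITY(1,1) PRIMARY KEY,
    address  VARCHAR(255),
    address1 VARCHAR(255),
    address2 VARCHAR(255),
    address3 VARCHAR(255),
    address4 VARCHAR(255),
    address5 VARCHAR(255)
);

INSERT #myTable (address)
VALUES ('1 Main St' + CHAR(13) + CHAR(10) + 'Springfield');

DECLARE @d VARCHAR(2) = CHAR(13) + CHAR(10);

-- Step 1: parse every address once into a temp table.
-- Padding with five delimiters guarantees that each CHARINDEX call finds a match;
-- parts that do not exist come back as empty strings and are turned into NULL.
SELECT t.id,
       part1 = NULLIF(SUBSTRING(s.padded, 1, p1.pos - 1), ''),
       part2 = NULLIF(SUBSTRING(s.padded, p1.pos + 2, p2.pos - p1.pos - 2), ''),
       part3 = NULLIF(SUBSTRING(s.padded, p2.pos + 2, p3.pos - p2.pos - 2), ''),
       part4 = NULLIF(SUBSTRING(s.padded, p3.pos + 2, p4.pos - p3.pos - 2), ''),
       part5 = NULLIF(SUBSTRING(s.padded, p4.pos + 2, p5.pos - p4.pos - 2), '')
INTO #parsed
FROM #myTable AS t
CROSS APPLY (SELECT padded = t.address + REPLICATE(@d, 5)) AS s
CROSS APPLY (SELECT pos = CHARINDEX(@d, s.padded)) AS p1
CROSS APPLY (SELECT pos = CHARINDEX(@d, s.padded, p1.pos + 2)) AS p2
CROSS APPLY (SELECT pos = CHARINDEX(@d, s.padded, p2.pos + 2)) AS p3
CROSS APPLY (SELECT pos = CHARINDEX(@d, s.padded, p3.pos + 2)) AS p4
CROSS APPLY (SELECT pos = CHARINDEX(@d, s.padded, p4.pos + 2)) AS p5;

-- Step 2: update the original table by joining it to the temp table.
UPDATE t
SET address1 = p.part1,
    address2 = p.part2,
    address3 = p.part3,
    address4 = p.part4,
    address5 = p.part5
FROM #myTable AS t
INNER JOIN #parsed AS p ON p.id = t.id;

SELECT id, address1, address2, address3, address4, address5
FROM #myTable;
```

Anything after a fifth line break is ignored by this sketch, so addresses with more than five parts would need separate handling.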
Answer 1: (score: 3)
So, given some source data:
CREATE TABLE dbo.Addresses
(
    AddressID INT IDENTITY(1,1),
    [Address] VARCHAR(255),
    Address1  VARCHAR(255),
    Address2  VARCHAR(255),
    Address3  VARCHAR(255),
    Address4  VARCHAR(255),
    Address5  VARCHAR(255)
);
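-- Build the line breaks explicitly so the sample data always contains CR + LF,
-- regardless of the line endings the script itself was saved with.
INSERT dbo.Addresses([Address])
SELECT 'foo' + CHAR(13) + CHAR(10) + 'bar'
UNION ALL SELECT 'add1' + CHAR(13) + CHAR(10) + 'add2' + CHAR(13) + CHAR(10)
    + 'add3' + CHAR(13) + CHAR(10) + 'add4' + CHAR(13) + CHAR(10) + 'add5';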
Let's create a function that returns the address parts in order:
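CREATE FUNCTION dbo.SplitAddressOrdered
(
    @AddressID INT,
    @List      VARCHAR(MAX),
    @Delimiter VARCHAR(32)
)
RETURNS TABLE
AS
    RETURN
    (
        SELECT
            AddressID   = @AddressID,
            rn          = ROW_NUMBER() OVER (ORDER BY Number),  -- position of the part within the address
            AddressItem = Item
        FROM
        (
            -- Number is the character position in @List where each part starts.
            SELECT Number, Item = LTRIM(RTRIM(SUBSTRING(@List, Number,
                CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)))
            FROM
            (
                -- Ad hoc numbers table; sys.all_objects normally holds far more rows
                -- than a VARCHAR(255) address has characters.
                SELECT ROW_NUMBER() OVER (ORDER BY [object_id])
                FROM sys.all_objects
            ) AS n(Number)
            WHERE Number <= CONVERT(INT, LEN(@List))
              -- A part starts wherever the delimiter-prefixed list contains the delimiter.
              AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
        ) AS y
    );
GO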
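To see what the function returns before wiring it into an update, it can be called directly on a single value. This is a small usage sketch that assumes dbo.SplitAddressOrdered has been created as above; the variable names and the ID value 42 are made up for the example.

```sql
-- Assumes dbo.SplitAddressOrdered from above already exists.
DECLARE @crlf VARCHAR(2)   = CHAR(13) + CHAR(10);
DECLARE @addr VARCHAR(MAX) = 'add1' + @crlf + 'add2' + @crlf + 'add3';

-- 42 is an arbitrary ID; the function simply echoes it back on every row.
SELECT s.AddressID, s.rn, s.AddressItem
FROM dbo.SplitAddressOrdered(42, @addr, @crlf) AS s
ORDER BY s.rn;

-- Expected rows:
-- 42, 1, add1
-- 42, 2, add2
-- 42, 3, add3
```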
Now you can do this (the query has to run 5 times, once per target column):
DECLARE
    @i   INT = 1,
    @sql NVARCHAR(MAX),
    -- Updatable CTE: each base row joined to its @i-th address part.
    @src NVARCHAR(MAX) = N';WITH x AS
(
    SELECT a.*, Original = s.AddressID, s.rn, s.AddressItem
    FROM dbo.Addresses AS a
    CROSS APPLY dbo.SplitAddressOrdered(a.AddressID, a.Address,
        CHAR(13) + CHAR(10)) AS s WHERE rn = @i
)';

-- One pass per target column: Address1 through Address5.
WHILE @i <= 5
BEGIN
    -- A column name cannot be passed as a parameter, so it is concatenated into the statement.
    SET @sql = @src + N'UPDATE x SET Address' + RTRIM(@i)
        + ' = CASE WHEN AddressID = Original AND rn = '
        + RTRIM(@i) + ' THEN AddressItem END;';

    EXEC sp_executesql @sql, N'@i INT', @i;
    SET @i += 1;
END
Then you can drop the Address column:
ALTER TABLE dbo.Addresses DROP COLUMN [Address];
The table then contains:
AddressID  Address1  Address2  Address3  Address4  Address5
---------  --------  --------  --------  --------  --------
1          foo       bar       NULL      NULL      NULL
2          add1      add2      add3      add4      add5
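I am sure someone more clever than I am will show how to use that function without having to loop.
I could also imagine a slight change to the function that would let you simply pull out a certain element... stand by...
EDIT
Here is a scalar function that is more expensive on its own, but lets you make one pass instead of 5: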
CREATE FUNCTION dbo.ElementFromOrderedList
(
    @List      VARCHAR(MAX),
    @Delimiter VARCHAR(32),
    @Index     SMALLINT
)
RETURNS VARCHAR(255)
AS
BEGIN
    RETURN
    (
        -- Same split logic as dbo.SplitAddressOrdered, keeping only the part at position @Index.
        SELECT Item
        FROM
        (
            SELECT rn = ROW_NUMBER() OVER (ORDER BY Number),
                   Item = LTRIM(RTRIM(SUBSTRING(@List, Number,
                       CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)))
            FROM (SELECT ROW_NUMBER() OVER (ORDER BY [object_id])
                  FROM sys.all_objects) AS n(Number)
            WHERE Number <= CONVERT(INT, LEN(@List))
              AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
        ) AS y
        WHERE rn = @Index  -- yields NULL when the list has fewer than @Index parts
    );
END
GO
Now the update, given the table above (before the update and before the drop), is simply:
UPDATE dbo.Addresses
SET Address1 = dbo.ElementFromOrderedList([Address], CHAR(13) + CHAR(10), 1),
    Address2 = dbo.ElementFromOrderedList([Address], CHAR(13) + CHAR(10), 2),
    Address3 = dbo.ElementFromOrderedList([Address], CHAR(13) + CHAR(10), 3),
    Address4 = dbo.ElementFromOrderedList([Address], CHAR(13) + CHAR(10), 4),
    Address5 = dbo.ElementFromOrderedList([Address], CHAR(13) + CHAR(10), 5);
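The table-valued function can probably also be used in a single pass, without the loop or the scalar function, by pivoting its rows with conditional aggregation. The following is a sketch under the assumption that dbo.Addresses and dbo.SplitAddressOrdered exist as defined above and that the [Address] column has not been dropped yet; the CTE name parts and the aliases p1 to p5 are made up for the example.

```sql
-- Assumes dbo.Addresses and dbo.SplitAddressOrdered exist as defined above,
-- and that the [Address] column is still present.
;WITH parts AS
(
    -- Split each address once, then fold the rows back into one row per AddressID.
    SELECT
        a.AddressID,
        p1 = MAX(CASE WHEN s.rn = 1 THEN s.AddressItem END),
        p2 = MAX(CASE WHEN s.rn = 2 THEN s.AddressItem END),
        p3 = MAX(CASE WHEN s.rn = 3 THEN s.AddressItem END),
        p4 = MAX(CASE WHEN s.rn = 4 THEN s.AddressItem END),
        p5 = MAX(CASE WHEN s.rn = 5 THEN s.AddressItem END)
    FROM dbo.Addresses AS a
    CROSS APPLY dbo.SplitAddressOrdered(a.AddressID, a.[Address],
        CHAR(13) + CHAR(10)) AS s
    GROUP BY a.AddressID
)
UPDATE a
SET Address1 = p.p1,
    Address2 = p.p2,
    Address3 = p.p3,
    Address4 = p.p4,
    Address5 = p.p5
FROM dbo.Addresses AS a
INNER JOIN parts AS p ON p.AddressID = a.AddressID;
```

Rows whose [Address] is NULL produce no parts and are left untouched. Whether this outperforms the scalar-function update on 500k rows has not been measured here and would need testing against the real data.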