我想要加入两个表。 两个表都包含一些用于连接它们的varchar列。 但是,运行查询以使用varchar列进行计算以进行连接是一个缓慢的过程。 所以,我想将这些varchar列转换为唯一的整数id,以便比较更快。
SELECT /*do calculations*/
FROM [dbo].[messages] m WITH (NOLOCK)
JOIN [dbo].[jointable] j ON j.address = m.orig OR j.address = m.recip
地址,orig和recip是具有字符串的列,并且最好使id具有更快的性能。
我意识到部分ON j.address = m.orig OR j.address = m.recip
会降低性能。
我想加入的表具有以下结构:
CREATE TABLE [dbo].[jointable](
[displayname] [nvarchar](256) NULL,
[alias] [nvarchar](129) NULL,
[firstname] [nvarchar](129) NULL,
[lastname] [nvarchar](129) NULL,
[address] [nvarchar](256) NULL,
[company] [nvarchar](129) NULL,
[department] [nvarchar](129) NULL,
[office] [nvarchar](129) NULL)
CREATE TABLE [dbo].[messages](
[messageid] [bigint] NOT NULL,
[message] [varchar](150) NULL,
[orig] [nvarchar](256) NULL,
[recip] [nvarchar](256) NULL)
我该怎么做?是否有任何函数可以将id从varchar转换为整数?提前谢谢。
答案 0 :(得分:3)
答案 1 :(得分:2)
您可以 从VARCHAR 生成GUID,但我怀疑,您对此感到满意(因此您需要某种映射表,如其他答案所示) )。只是为了表明原则:
如果您的字符串很短且在16字节内是唯一的,这可能对您有用:
DECLARE @tbl TABLE(SomeString VARCHAR(100),TheGUID UNIQUEIDENTIFIER);
--a GUID is a 16-Byte(128 bit) sized type
INSERT INTO @tbl(SomeString) VALUES
('test1')
,('Some short text')
,('Some very very very very long text')
,('Some very very very very long text which is the same as the other one in the first 16 bytes');
UPDATE @tbl SET TheGUID=CAST(CAST(SomeString AS VARBINARY(16)) AS UNIQUEIDENTIFIER);
SELECT SomeString
,TheGUID
,CAST(CAST(TheGUID AS VARBINARY(16)) AS VARCHAR(16))
FROM @tbl;
结果(滚动到一边)
+---------------------------------------------------------------------------------------------+--------------------------------------+--------------------+
| SomeString | TheGUID | (Kein Spaltenname) |
+---------------------------------------------------------------------------------------------+--------------------------------------+--------------------+
| test1 | 74736574-0031-0000-0000-000000000000 | test1 |
+---------------------------------------------------------------------------------------------+--------------------------------------+--------------------+
| Some short text | 656D6F53-7320-6F68-7274-207465787400 | Some short text |
+---------------------------------------------------------------------------------------------+--------------------------------------+--------------------+
| Some very very very very long text | 656D6F53-7620-7265-7920-766572792076 | Some very very v |
+---------------------------------------------------------------------------------------------+--------------------------------------+--------------------+
| Some very very very very long text which is the same as the other one in the first 16 bytes | 656D6F53-7620-7265-7920-766572792076 | Some very very v |
+---------------------------------------------------------------------------------------------+--------------------------------------+--------------------+
答案 2 :(得分:1)
首先尝试在列上添加索引(即使它们是VARCHAR
)。如果您仍在努力提高性能,可以使用以下内容按整数值连接。
-- Create a table to link a varchar with an integer
CREATE TABLE WordIndex(
WordID INT IDENTITY PRIMARY KEY,
Word VARCHAR(500))
CREATE NONCLUSTERED INDEX NCI_WordIndex_Word ON WordIndex (Word)
GO
-- Load the table with all available words
INSERT INTO WordIndex (
Word)
SELECT DISTINCT
YourVarcharColumn
FROM
YourTable
UNION
SELECT DISTINCT
YourOtherVarcharColumn
FROM
YourSecondTable
GO
-- Add the integer ID to your tables
ALTER TABLE YourTable ADD WordID INT
ALTER TABLE YourSecondTable ADD WordID INT
ALTER TABLE YourTable ADD FOREIGN KEY (WordID) REFERENCES WordIndex (WordID)
ALTER TABLE YourSecondTable ADD FOREIGN KEY (WordID) REFERENCES WordIndex (WordID)
GO
-- Optionally (but recommended) add indexes on the ID
CREATE NONCLUSTERED INDEX NCI_YourTable_WordID ON YourTable (WordID)
CREATE NONCLUSTERED INDEX NCI_YourSecondTable_WordID ON YourSecondTable (WordID)
GO
-- Update the integer ID
UPDATE T SET
WordID = W.WordID
FROM
YourTable AS T
INNER JOIN WordIndex AS W ON T.Word = W.Word
UPDATE T SET
WordID = W.WordID
FROM
YourSecondTable AS T
INNER JOIN WordIndex AS W ON T.Word = W.Word
GO
-- Join by integer
SELECT
1
FROM
YourTable AS T
INNER JOIN YourSecondTable AS N ON T.WordID = N.WordID
使用这种方法需要维护单词索引表。
答案 3 :(得分:0)
需要查看表格设计才能正确回答。
如果你只是为关系添加int并正确分配int,问题是如果你以后更改数据那么int关系是错误的。
如果索引这些列,varchar连接并不比int快得多。
可能是相同的查询计划,但
SELECT /*do calculations*/
FROM [dbo].[messages] m
JOIN [dbo].[jointable] j ON j.address in (m.orig, m.recip)