我们说我的下面有粘贴描述的行:
DAS0_CIP.TRA_HVAC01_073.TRA_HVAC_BCK.TRA_HVAC_CLX.Program:AHU_5.
DAS0_CIP.TRA_HVAC01_072.TRA_HVAC_BCK.TRA_HVAC_CLX.Program:AHU_5.
这些非常相似但不完全相同。不幸的是,Soundex不打算将它们分组,我不认为我可以使用两个记录比较功能,因为我有一个包含700万条记录的表,我需要根据描述进行分组。我希望类似的描述可以组合在一起。有没有办法对行的相似性进行评分或排名?我如何以更好的方式将这些组合在一起?
现在,我在左边([描述],50)或类似的东西进行分组以获得粗略的分组,但它并不是那么准确。我认为必须有更好的方法。
扩展数据示例:
[GREDS2UA1381TFT][MR_Data_2]MRSaveData Script: Sproc failed to Update
!ERR Invalid hpItemArray in AlmLinkedList::AddItemAtEnd
!ERR Invalid hpItemArray in AlmLinkedList::GetItemPtrAt
!ERR Semaphore wait failed in CBThrd_CallbackThreadMain [hCBThreadSemaphore=000001AC]
!WNAL_Register(AlarmPriority10b) - failed(501)
!WNALderegister AlarmMgr not running
!WNAPalarm AlarmMgr not running
.Initialize(Label Details ToteStacker Manual): EntID or AreaName is bad: EntID= 0| AreaName=
.RefreshGrid(Reprint): No production labels found with filters
.RefreshMaterialsGrid [PRODUCE]: - Fact.Common.XMLReadBuilder
{1a} MES_CheeseMakeVatL1_AppEngine: StopHostedObjects() - m_AccessManagerProvider->UnregisterAutomationObjectNoUpdate( objectId = 16 ) failed, hr = 0x8000FFFF
{1a} MES_CheeseMakeVatL1_AppEngine: StopHostedObjects() - m_AccessManagerProvider->UnregisterAutomationObjectNoUpdate( objectId = 3 ) failed, hr = 0x80070057
[GRE_CM_REC_OS3][MR_Data_2]MRSaveData Script: objResult Not Successful Proxy failed to connect to Middleware Server service endpoint. This can happen if your Middleware Server is not available or if the service endpoint portqueue is not accessible. The endpoint is [net.tcp:GRECMESDB1:8099FactMES.Server.HostEventBrokerSvc].
{1a} MES_StarterL1_AppEngine: StopHostedObjects() - m_AccessManagerProvider->UnregisterAutomationObjectNoUpdate( objectId = 8 ) failed, hr = 0x80070057
{1a} MES_StarterL1_AppEngine: StopHostedObjects() - m_AccessManagerProvider->UnregisterAutomationObjectNoUpdate( objectId = 9 ) failed, hr = 0x8000FFFF
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2542, 20160313 03:31:23.009, 20160313 03:31:23.008) [aahMDASSF.cpp; 3315; 1]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2560, 20160313 03:22:55.884, 20160313 03:22:55.883) [aahMDASSF.cpp; 3315; 1]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2560, 20160313 03:22:55.884, 20160313 03:22:55.883) [aahMDASSF.cpp; 3315; 3782]
[GREDS2UA1381TFT][MR_Data_2]MRSaveData Script: Unable to get session instance
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:27:09.509, 20160313 03:27:09.508) [aahMDASSF.cpp; 3315; 3651]
{1a} MES_StarterL1_AppEngine: StopHostedObjects() - m_AccessManagerProvider->UnregisterAutomationObjectNoUpdate( objectId = 10 ) failed, hr = 0x80070057
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:35:36.037, 20160313 03:35:36.036) [aahMDASSF.cpp; 3315; 2380]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:39:50.371, 20160313 03:39:50.371) [aahMDASSF.cpp; 3315; 1]
{1a} MES_StarterL1_AppEngine: StopHostedObjects() - m_AccessManagerProvider->UnregisterAutomationObjectNoUpdate( objectId = 3 ) failed, hr = 0x80070057
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:52:29.264, 20160313 03:52:29.264) [aahMDASSF.cpp; 3315; 3598]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:56:41.972, 20160313 03:56:41.971) [aahMDASSF.cpp; 3315; 776]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2588, 20160313 03:44:03.194, 20160313 03:44:03.194) [aahMDASSF.cpp; 3315; 1]
{1a} MES_SweetWheyL1_AppEngine: StopHostedObjects() - m_AccessManagerProvider->UnregisterAutomationObjectNoUpdate( objectId = 7 ) failed, hr = 0x8000FFFF
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2588, 20160313 03:44:03.194, 20160313 03:44:03.194) [aahMDASSF.cpp; 3315; 3199]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:27:07.759, 20160313 03:27:07.758) [aahMDASSF.cpp; 3315; 1]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2542, 20160313 03:31:23.009, 20160313 03:31:23.008) [aahMDASSF.cpp; 3315; 3472]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:27:09.509, 20160313 03:27:09.508) [aahMDASSF.cpp; 3315; 1]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:35:36.037, 20160313 03:35:36.036) [aahMDASSF.cpp; 3315; 1]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:27:07.759, 20160313 03:27:07.758) [aahMDASSF.cpp; 3315; 332]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:31:20.290, 20160313 03:31:20.290) [aahMDASSF.cpp; 3315; 1]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:35:35.019, 20160313 03:35:35.018) [aahMDASSF.cpp; 3315; 1]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:35:35.019, 20160313 03:35:35.018) [aahMDASSF.cpp; 3315; 233]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:39:50.959, 20160313 03:39:50.959) [aahMDASSF.cpp; 3315; 253]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:44:04.936, 20160313 03:44:04.936) [aahMDASSF.cpp; 3315; 1]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:44:04.936, 20160313 03:44:04.936) [aahMDASSF.cpp; 3315; 289]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:52:30.967, 20160313 03:52:30.967) [aahMDASSF.cpp; 3315; 1]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:39:50.371, 20160313 03:39:50.371) [aahMDASSF.cpp; 3315; 2725]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:52:30.967, 20160313 03:52:30.967) [aahMDASSF.cpp; 3315; 326]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:48:17.312, 20160313 03:48:17.312) [aahMDASSF.cpp; 3315; 1]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:48:17.312, 20160313 03:48:17.312) [aahMDASSF.cpp; 3315; 3683]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:52:29.264, 20160313 03:52:29.264) [aahMDASSF.cpp; 3315; 1]
0:A2C Attempt to store values in the future; timestamps were overwritten with current time (2579, 20160313 03:56:41.972, 20160313 03:56:41.971) [aahMDASSF.cpp; 3315; 1]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:22:55.228, 20160313 03:22:55.227) [aahMDASSF.cpp; 3315; 1]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:22:55.228, 20160313 03:22:55.227) [aahMDASSF.cpp; 3315; 331]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:31:20.290, 20160313 03:31:20.290) [aahMDASSF.cpp; 3315; 325]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:39:50.959, 20160313 03:39:50.959) [aahMDASSF.cpp; 3315; 1]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:48:18.899, 20160313 03:48:18.899) [aahMDASSF.cpp; 3315; 1]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:48:18.899, 20160313 03:48:18.899) [aahMDASSF.cpp; 3315; 316]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:56:44.589, 20160313 03:56:44.588) [aahMDASSF.cpp; 3315; 1]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2100, 20160313 03:35:40.381, 20160313 03:35:40.380) [aahMDASSF.cpp; 3315; 1]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2101, 20160313 03:22:59.993, 20160313 03:22:59.993) [aahMDASSF.cpp; 3315; 1]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2101, 20160313 03:22:59.993, 20160313 03:22:59.993) [aahMDASSF.cpp; 3315; 1097]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:27:13.618, 20160313 03:27:13.618) [aahMDASSF.cpp; 3315; 1]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:27:13.618, 20160313 03:27:13.618) [aahMDASSF.cpp; 3315; 1092]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:31:27.118, 20160313 03:31:27.118) [aahMDASSF.cpp; 3315; 992]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:39:53.131, 20160313 03:39:53.131) [aahMDASSF.cpp; 3315; 1]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:39:53.131, 20160313 03:39:53.131) [aahMDASSF.cpp; 3315; 798]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:44:07.930, 20160313 03:44:07.930) [aahMDASSF.cpp; 3315; 1008]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:48:21.578, 20160313 03:48:21.577) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:22:55.884, 20160313 03:22:55.883) [aahMDASSF.cpp; 3315; 419]
0:A58 Attempt to store values in the future; timestamps were overwritten with current time (2233, 20160313 03:56:44.589, 20160313 03:56:44.588) [aahMDASSF.cpp; 3315; 66]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:31:23.009, 20160313 03:31:23.008) [aahMDASSF.cpp; 3315; 1]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2100, 20160313 03:35:40.381, 20160313 03:35:40.380) [aahMDASSF.cpp; 3315; 725]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:35:37.598, 20160313 03:35:37.597) [aahMDASSF.cpp; 3315; 1]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:31:27.118, 20160313 03:31:27.118) [aahMDASSF.cpp; 3315; 1]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:44:07.930, 20160313 03:44:07.930) [aahMDASSF.cpp; 3315; 1]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:48:21.578, 20160313 03:48:21.577) [aahMDASSF.cpp; 3315; 1143]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:52:37.035, 20160313 03:52:37.034) [aahMDASSF.cpp; 3315; 1]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:52:37.035, 20160313 03:52:37.034) [aahMDASSF.cpp; 3315; 1161]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:39:53.494, 20160313 03:39:53.493) [aahMDASSF.cpp; 3315; 306]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:56:50.535, 20160313 03:56:50.535) [aahMDASSF.cpp; 3315; 1]
0:A5C Attempt to store values in the future; timestamps were overwritten with current time (2103, 20160313 03:56:50.535, 20160313 03:56:50.535) [aahMDASSF.cpp; 3315; 183]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:44:05.698, 20160313 03:44:05.698) [aahMDASSF.cpp; 3315; 355]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:22:55.884, 20160313 03:22:55.883) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:48:18.453, 20160313 03:48:18.452) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:27:09.509, 20160313 03:27:09.508) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:48:18.453, 20160313 03:48:18.452) [aahMDASSF.cpp; 3315; 390]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:27:09.509, 20160313 03:27:09.508) [aahMDASSF.cpp; 3315; 416]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:31:23.009, 20160313 03:31:23.008) [aahMDASSF.cpp; 3315; 393]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:56:48.585, 20160313 03:56:48.584) [aahMDASSF.cpp; 3315; 71]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:35:37.598, 20160313 03:35:37.597) [aahMDASSF.cpp; 3315; 291]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:39:53.494, 20160313 03:39:53.493) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:44:05.698, 20160313 03:44:05.698) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2288, 20160313 03:42:55.241, 20160313 03:42:55.240) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:52:32.720, 20160313 03:52:32.719) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2288, 20160313 03:42:55.241, 20160313 03:42:55.240) [aahMDASSF.cpp; 3315; 639]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2294, 20160313 03:38:43.218, 20160313 03:38:43.218) [aahMDASSF.cpp; 3315; 442]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:52:32.720, 20160313 03:52:32.719) [aahMDASSF.cpp; 3315; 401]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2249, 20160313 03:56:48.585, 20160313 03:56:48.584) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2338, 20160313 03:55:38.184, 20160313 03:55:38.184) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2294, 20160313 03:38:43.218, 20160313 03:38:43.218) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2336, 20160313 03:51:24.094, 20160313 03:51:24.094) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2344, 20160313 03:34:28.353, 20160313 03:34:28.352) [aahMDASSF.cpp; 3315; 455]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2344, 20160313 03:59:50.704, 20160313 03:59:50.703) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2336, 20160313 03:51:24.094, 20160313 03:51:24.094) [aahMDASSF.cpp; 3315; 841]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2338, 20160313 03:55:38.184, 20160313 03:55:38.184) [aahMDASSF.cpp; 3315; 670]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2343, 20160313 03:47:09.092, 20160313 03:47:09.091) [aahMDASSF.cpp; 3315; 1]
0:A6C Attempt to store values in the future; timestamps were overwritten with current time (2343, 20160313 03:47:09.092, 20160313 03:47:09.091) [aahMDASSF.cpp; 3315; 658]
A write to InTouch was skipped with a value of because the data quality was bad or initializing.
A write to InTouch was skipped with a value of CM__Ctrl because the data quality was bad or initializing.
A write to InTouch was skipped with a value of false because the data quality was bad or initializing.
答案 0 :(得分:1)
一种方法可能就是这样:它将测试,有多少部分是相同的...你可以继续将这个想法分成每个部分,将其分成_
。
CREATE FUNCTION dbo.TestSimilarity(@s1 VARCHAR(100),@s2 VARCHAR(100))
RETURNS INT
AS
BEGIN
DECLARE @x1 XML=CAST('<x>' + REPLACE(@s1,'.','</x><x>') + '</x>' AS XML);
DECLARE @x2 XML=CAST('<x>' + REPLACE(@s2,'.','</x><x>') + '</x>' AS XML);
DECLARE @retVal INT=0;
IF @x1.value('/x[1]','varchar(100)')= @x2.value('/x[1]','varchar(100)') SET @retVal=@retVal+1;
IF @x1.value('/x[2]','varchar(100)')= @x2.value('/x[2]','varchar(100)') SET @retVal=@retVal+1;
IF @x1.value('/x[3]','varchar(100)')= @x2.value('/x[3]','varchar(100)') SET @retVal=@retVal+1;
IF @x1.value('/x[4]','varchar(100)')= @x2.value('/x[4]','varchar(100)') SET @retVal=@retVal+1;
IF @x1.value('/x[5]','varchar(100)')= @x2.value('/x[5]','varchar(100)') SET @retVal=@retVal+1;
RETURN @retVal;
END;
GO
SELECT dbo.TestSimilarity('DAS0_CIP.TRA_HVAC01_073.TRA_HVAC_BCK.TRA_HVAC_CLX.Program:AHU_5.','DAS0_CIP.TRA_HVAC01_072.TRA_HVAC_BCK.TRA_HVAC_CLX.Program:AHU_5.');
GO
DROP FUNCTION dbo.TestSimilarity;
结果将是4 ......