From SQL

Time: 2015-12-18 17:28:33

Tags: sql sql-server

I have a table with the following data:

Table A

Text,id,Cid,CName,Aid,AName
Acc.sa is very Acc.pa and Acc.ba is awesome, 1,2,AB,1,CC
Acc.aa is awesome and Acc.sas is great,2,3,CC,1,CC
Acc.ee is not only great but Acc.sew is best,4,3,FF,1,CC

It should extract every word that starts with Acc., so the result should be:

 Did,id,Cid,CName,Aid,AName
 Acc.sa,1,2,AB,1,CC
 Acc.pa,1,2,AB,1,CC
 Acc.ba,1,2,AB,1,CC
 Acc.aa,2,3,CC,1,CC
 Acc.sas,2,3,CC,1,CC
 Acc.ee,4,3,FF,1,CC
 Acc.sew,4,3,FF,1,CC

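That is, each match should produce a new row.

I tried CHARINDEX and SUBSTRING, but I don't know how to go on and use them in a SELECT statement. Any help is much appreciated.

3 answers:

Answer 0: (score: 1)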

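-- Digits 0-9, used to build a small numbers table.
with base10 as (
    select n
    from (values (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) v(n)
), k as (
    -- All integers from 1 to 1000: one candidate start position each.
    select d2.n * 100 + d1.n * 10 + d0.n + 1 as n
    from base10 d0 cross join base10 d1 cross join base10 d2
)
select
    -- Take everything from the start of 'Acc.' up to the next space.
    substring(a.Text, k.n, charindex(' ', a.Text, k.n) - k.n) as Did,
    Id, Cid, CName, Aid, AName
from TableA a inner join k
    -- Keep only positions where an 'Acc.' token begins.
    on substring(a.Text, k.n, 4) = 'Acc.'
        and k.n < len(a.Text) /* not necessary but optimizer might use it??? */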

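http://sqlfiddle.com/#!6/a0f87/4 (outputs a few extra columns)

This only handles strings of up to 1,000 characters, because k generates the positions 1 through 1000. I suspect that is probably enough, and if the query is slow you may even want to shorten the search length.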
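If 1,000 characters turns out not to be enough, one more cross join extends the numbers table to 10,000 positions. A minimal sketch, where the d3 alias and the summary column names are mine and only there for illustration:

```sql
with base10 as (
    select n
    from (values (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) v(n)
), k as (
    -- A fourth digit extends the range from 1..1000 to 1..10000.
    select d3.n * 1000 + d2.n * 100 + d1.n * 10 + d0.n + 1 as n
    from base10 d0 cross join base10 d1 cross join base10 d2 cross join base10 d3
)
-- Sanity check on the generated range.
select min(n) as FirstN, max(n) as LastN, count(*) as HowMany
from k;
```

Going the other way, dropping d2 from the expression and the join limits the search to the first 100 characters.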

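I'm assuming a space immediately follows each of your "Acc." values, which means one cannot appear at the very end of a row's text. That case can be handled if necessary.

Since you are seeing errors, it appears that your input rows differ from the format you specified. Apart from the space delimiter I mentioned, I don't see any other assumption.

For debugging, you can replace the entire substring() line with the output below to get a better idea of what is happening. Also add a where clause to limit the result to the rows that could be causing the error: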
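One way to handle an Acc. value at the end of the text is to append a space before searching, so charindex() always finds a terminator. A minimal sketch, assuming a made-up table variable with two sample rows (only the Id column is carried along to keep it short):

```sql
-- Hypothetical sample data; the second row ends with an Acc. token.
declare @TableA table ([Text] varchar(1000), Id int);
insert into @TableA values
    ('Acc.sa is very Acc.pa and Acc.ba is awesome', 1),
    ('this one ends with Acc.zz', 2);

with base10 as (
    select n
    from (values (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) v(n)
), k as (
    select d2.n * 100 + d1.n * 10 + d0.n + 1 as n
    from base10 d0 cross join base10 d1 cross join base10 d2
)
select
    -- The appended space guarantees a delimiter after the last token.
    substring(a.[Text], k.n, charindex(' ', a.[Text] + ' ', k.n) - k.n) as Did,
    a.Id
from @TableA a inner join k
    on substring(a.[Text], k.n, 4) = 'Acc.'
order by a.Id, k.n;
```

With these rows I would expect Acc.sa, Acc.pa and Acc.ba for Id 1 and Acc.zz for Id 2. Punctuation directly after a token (a trailing comma, say) would still be included, so strip that separately if your data has it.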

select
    'Bad offset' as Msg,
    a.Text, k.n as StartOfAccBlock, charindex(' ', a.Text, k.n) as EndOfAccBlock
from ...
where
    /* a negative length is what makes substring() fail */
    charindex(' ', a.Text, k.n) - k.n < 0

Answer 1: (score: 1)

The xml nodes method is a convenient way to parse a cell value into rows like this. E.g.:

select n.value('@s[1]', 'varchar(max)'), id, Cid, CName, Aid, AName
from Table_A ta
cross apply (select convert(xml, '<x s="' + replace(replace(replace(replace(replace(replace(ta.[Text],'&','&amp;'),'>','&gt;'),'<','&lt;'),'''','&apos;'),'"','&quot;'),' ','"/><x s="') + '"/>') xval) r
cross apply r.xval.nodes('*') x(n)
where n.value('@s[1]', 'varchar(max)') like 'Acc.%'

SqlFiddle

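Edit: escaped the 5 characters that are invalid in XML.

Answer 2: (score: 0)

You need a recursive CTE to pull out each occurrence of the substring. I'm busy at work this morning, so I can't put together something that fits your specific pattern, but here is an example I've used in the past to extract all PDF file names from a string. The commented-out lines are there to show the building blocks I used to arrive at the PDF file name I needed to extract. You will need to modify the pattern search to fit your scenario. Here is the SQL Fiddle: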
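To see what the conversion is doing, it may help to run it on a single string first. This is just a sketch: the variable names are mine, and the escaping replace() calls are left out because this sample string contains none of the five special characters.

```sql
declare @s varchar(100) = 'Acc.sa is very Acc.pa';

-- Each space becomes an element boundary, so every word ends up
-- as the s attribute of its own <x> element.
declare @x xml = convert(xml, '<x s="' + replace(@s, ' ', '"/><x s="') + '"/>');

-- The intermediate XML: one <x> element per word.
select @x as xval;

-- nodes() turns each <x> element into a row; value() reads its attribute.
select n.value('@s[1]', 'varchar(100)') as word
from @x.nodes('/x') x(n);
```

The second query should return one row per word, and the like 'Acc.%' filter in the full query then keeps only the rows you want.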

DECLARE @f TABLE (fieldName VARCHAR(255), IDField int)
INSERT INTO @f VALUES('>>>>>>1.pdf test> >b>c>xyz.pdf bob >hello world.pdf foo >womp womp.pdf>', 1)
INSERT INTO @f VALUES('>2.pdf other unnecssary stuff > bar.pdf', 2)

; WITH cte2 AS (
   SELECT 
       IDField,
       --PATINDEX('%.pdf%', fieldName) + 3 AS PDFLocation,
       --SUBSTRING(fieldName, 1, (PATINDEX('%.pdf%', fieldName) + 3)) AS PDFSubstring,
       --REVERSE(SUBSTRING(fieldName, 1, (PATINDEX('%.pdf%', fieldName) + 3))) AS PDFSubstringReverse,
       --PATINDEX('%>%', REVERSE(SUBSTRING(fieldName, 1, (PATINDEX('%.pdf%', fieldName) + 3)))) AS ReverseSymbolLocationBeforePDF,
       --LEN(SUBSTRING(fieldName, 1, (PATINDEX('%.pdf%', fieldName) + 3))) - PATINDEX('%>%', REVERSE(SUBSTRING(fieldName, 1, (PATINDEX('%.pdf%', fieldName) + 3)))) + 2 AS SymbolLocationBeforePDF,

       CONVERT(VARCHAR(255), SUBSTRING(fieldName, 
           LEN(SUBSTRING(fieldName, 1, (PATINDEX('%.pdf%', fieldName) + 3))) - PATINDEX('%>%', REVERSE(SUBSTRING(fieldName, 1, (PATINDEX('%.pdf%', fieldName) + 3)))) + 2, 
           PATINDEX('%.pdf%', fieldName) + 3 - (LEN(SUBSTRING(fieldName, 1, (PATINDEX('%.pdf%', fieldName) + 3))) - PATINDEX('%>%', REVERSE(SUBSTRING(fieldName, 1, (PATINDEX('%.pdf%', fieldName) + 3)))) + 2) + 1
       )) AS PDFName,

       CONVERT(VARCHAR(255), STUFF(fieldName, 1, PATINDEX('%.pdf%', fieldName) + 3, '')) AS strWhatsLeft
   FROM @f

   UNION ALL

   SELECT 
       IDField,
       --PATINDEX('%.pdf%', strWhatsLeft) + 3 AS PDFLocation,
       --SUBSTRING(strWhatsLeft, 1, (PATINDEX('%.pdf%', strWhatsLeft) + 3)) AS PDFSubstring,
       --REVERSE(SUBSTRING(strWhatsLeft, 1, (PATINDEX('%.pdf%', strWhatsLeft) + 3))) AS PDFSubstringReverse,
       --PATINDEX('%>%', REVERSE(SUBSTRING(strWhatsLeft, 1, (PATINDEX('%.pdf%', strWhatsLeft) + 3)))) AS ReverseSymbolLocationBeforePDF,
       --LEN(SUBSTRING(strWhatsLeft, 1, (PATINDEX('%.pdf%', strWhatsLeft) + 3))) - PATINDEX('%>%', REVERSE(SUBSTRING(strWhatsLeft, 1, (PATINDEX('%.pdf%', strWhatsLeft) + 3)))) + 2 AS SymbolLocationBeforePDF,

       CONVERT(VARCHAR(255), SUBSTRING(strWhatsLeft, 
           LEN(SUBSTRING(strWhatsLeft, 1, (PATINDEX('%.pdf%', strWhatsLeft) + 3))) - PATINDEX('%>%', REVERSE(SUBSTRING(strWhatsLeft, 1, (PATINDEX('%.pdf%', strWhatsLeft) + 3)))) + 2, 
           PATINDEX('%.pdf%', strWhatsLeft) + 3 - (LEN(SUBSTRING(strWhatsLeft, 1, (PATINDEX('%.pdf%', strWhatsLeft) + 3))) - PATINDEX('%>%', REVERSE(SUBSTRING(strWhatsLeft, 1, (PATINDEX('%.pdf%', strWhatsLeft) + 3)))) + 2) + 1
       )) AS PDFName,

       CONVERT(VARCHAR(255), STUFF(strWhatsLeft, 1, PATINDEX('%.pdf%', strWhatsLeft) + 3, '')) AS strWhatsLeft
   FROM cte2
   WHERE strWhatsLeft LIKE '%.pdf%'
)

SELECT * FROM cte2 ORDER BY IDField
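
Adapted to the Acc. pattern from the question, the same recursive CTE idea could look roughly like the following. Treat it as a sketch under a few assumptions: the table variable and sample rows are made up, tokens are delimited by a single space, and only the Id column is carried along. Instead of chopping the string down on each pass, it keeps the string intact and carries the position of the current match.

```sql
-- Hypothetical sample data based on the rows in the question.
declare @TableA table ([Text] varchar(1000), Id int);
insert into @TableA values
    ('Acc.sa is very Acc.pa and Acc.ba is awesome', 1),
    ('Acc.aa is awesome and Acc.sas is great', 2);

with cte as (
    -- Anchor: position of the first 'Acc.' in each row (0 if there is none).
    -- A trailing space is appended so the last token always has a delimiter.
    select Id,
           convert(varchar(1001), [Text] + ' ') as Txt,
           charindex('Acc.', [Text]) as Pos
    from @TableA

    union all

    -- Recursive step: search again, starting just past the current match.
    select Id, Txt, charindex('Acc.', Txt, Pos + 4)
    from cte
    where Pos > 0
)
select substring(Txt, Pos, charindex(' ', Txt, Pos) - Pos) as Did, Id
from cte
where Pos > 0
order by Id, Pos;
```

Keep in mind that a recursive CTE stops with an error after 100 levels by default, so a row with more than about 100 matches would need an OPTION (MAXRECURSION n) hint on the final select.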