I have a SQL table that contains the following two columns:
FORMAT Sample
GT:AD:DP:GQ:PL 0/0:233,0:233:99:0,120,1800
GT:AD:DP:GQ:PL 0/1:101,61:220:99:835,0,1859
GT:AD:DP:GQ:PL 0/0:172,0:172:99:0,120,1800
GT:AD:DP:GQ:PL 0/0:216,0:216:99:0,120,1800
GT:AD:DP:GQ:PL 0/0:216,0:216:99:0,120,1800
GT:AD:DP:GQ:PGT:PID:PL 0/1:185,232:417:99:0|1:8029494_T_G:8670,0,6429
GT:AD:DP:GQ:PL 0/0:367,0:367:99:0,120,1800
GT:AD:DP:GQ:PGT:PID:PL 0/1:150,198:348:99:0|1:8029494_T_G:7930,0,5677
GT:AD:DP:GQ:PGT:PID:PL 0/1:148,196:344:99:0|1:8029494_T_G:7876,0,5652
GT:AD:DP:GQ:PGT:PID:PL 0/0:148,0:344:99:0|1:8029494_T_G:7876,8334,14591
GT:AD:DP:GQ:PGT:PID:PL 0/0:148,0:344:99:0|1:8029494_T_G:7876,8334,14591
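The FORMAT column gives the IDs of the fields found in the Sample column next to it; in both columns the fields are separated by ":".
I want to extract specific fields from the second column based on their ID/position in the FORMAT column, i.e. AD (second), DP (third) or GQ (fourth).
I was able to extract the AD field with the following code: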
SELECT SUBSTRING(Sample, CHARINDEX(':',Sample)+1, CHARINDEX(':',Sample,5)-5) FROM Table 1;
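The problem is that I can't extract the DP or GQ fields: the fields are not always the same length, so I can't specify the starting position from which to search for the next ":".
I also tried the split function from this site: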
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=50648
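The problem there is that I don't know how to pass a column in place of the declared variable, so that I can extract the required fields for every row of the table.
The desired output for the [Sample] column should look like this: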
GT AD DP GQ
0/0 233,0 233 99
0/1 101,61 220 99
0/0 172,0 172 99
0/0 216,0 216 99
0/0 216,0 216 99
0/1 185,232 417 99
0/0 367,0 367 99
0/1 150,198 348 99
0/1 148,196 344 99
0/0 148,0 344 99
0/0 148,0 344 99
Any help would be greatly appreciated,
Thanks,
Answer 0: (score: 1)
Perhaps with a little XML as the parser
Example
Select A.Format
,B.*
From YourTable A
Cross Apply (
Select Pos2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
,Pos3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
,Pos4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
From (Select Cast('<x>' + replace((Select replace(A.Format,':','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) as xDim) as A
) B
Returns
Format Pos2 Pos3 Pos4
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PGT:PID:PL AD DP GQ
GT:AD:DP:GQ:PL AD DP GQ
GT:AD:DP:GQ:PGT:PID:PL AD DP GQ
GT:AD:DP:GQ:PGT:PID:PL AD DP GQ
GT:AD:DP:GQ:PGT:PID:PL AD DP GQ
GT:AD:DP:GQ:PGT:PID:PL AD DP GQ
Or the simple version
Select A.Format
,Pos2 = Cast('<x>' + replace(Format,':','</x><x>')+'</x>' as xml).value('/x[2]','varchar(max)')
,Pos3 = Cast('<x>' + replace(Format,':','</x><x>')+'</x>' as xml).value('/x[3]','varchar(max)')
,Pos4 = Cast('<x>' + replace(Format,':','</x><x>')+'</x>' as xml).value('/x[4]','varchar(max)')
From YourTable A
Or if you are open to a UDF
Take a look at TSQL/SQL Server - table function to parse/split delimited string to multiple/separate columns
EDIT - Updated for Sample
Select A.Format
,GT = Cast('<x>' + replace(Sample,':','</x><x>')+'</x>' as xml).value('/x[1]','varchar(max)')
,AD = Cast('<x>' + replace(Sample,':','</x><x>')+'</x>' as xml).value('/x[2]','varchar(max)')
,DP = Cast('<x>' + replace(Sample,':','</x><x>')+'</x>' as xml).value('/x[3]','varchar(max)')
,GQ = Cast('<x>' + replace(Sample,':','</x><x>')+'</x>' as xml).value('/x[4]','varchar(max)')
From YourTable A
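The queries above pick fields by fixed position. If the aim is to look fields up by the ID listed in FORMAT instead, one possible variation is to split both columns and pair the pieces by index. What follows is only a sketch, not part of the original answer: it assumes SQL Server 2016 or later (compatibility level 130+) for OpenJson and String_Escape, and the table variable @YourTable and the aliases F, S and P are made up for illustration. The two rows are copied from the question.

```sql
-- Hypothetical stand-in for YourTable, holding two rows from the question
Declare @YourTable table (Format varchar(100), Sample varchar(200))
Insert Into @YourTable values
 ('GT:AD:DP:GQ:PL'        ,'0/1:101,61:220:99:835,0,1859')
,('GT:AD:DP:GQ:PGT:PID:PL','0/1:185,232:417:99:0|1:8029494_T_G:8670,0,6429')

Select A.Format
      ,P.*
 From  @YourTable A
 Cross Apply (
        -- 'a:b:c' becomes the JSON array ["a","b","c"]. OpenJson returns one
        -- row per element, with [key] holding the 0-based index, so joining
        -- on [key] pairs each FORMAT ID with the Sample value at that position.
        Select GT = max(case when F.[value] = 'GT' then S.[value] end)
              ,AD = max(case when F.[value] = 'AD' then S.[value] end)
              ,DP = max(case when F.[value] = 'DP' then S.[value] end)
              ,GQ = max(case when F.[value] = 'GQ' then S.[value] end)
         From  OpenJson('["' + replace(String_Escape(A.Format,'json'),':','","') + '"]') F
         Join  OpenJson('["' + replace(String_Escape(A.Sample,'json'),':','","') + '"]') S
           On  S.[key] = F.[key]
       ) P
```

Because the lookup goes by ID rather than by ordinal, a row whose FORMAT lists the fields in a different order would presumably still land in the right output columns, and an ID missing from FORMAT would come back as NULL.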