Chromosome  Locus     Variant_A  Variant_B  Variant  Strain_ID   Family  Parent1_Name  Parent1_Marker  Parent2_Name  Parent2_Marker  Line  Marker  Gid
----------  --------  ---------  ---------  -------  ----------  ------  ------------  --------------  ------------  --------------  ----  ------  ---
Gm09        40907915  G          A          GA       DS11.46096  46      IA3023        AA              PI507.681B*   BB              96    BB      2
Gm09        422384    G          A          GA       DS11.46096  46      IA3023        AA              PI507.681B*   BB              96    AA      4
Gm09        422720    A          G          AG       DS11.46096  46      IA3023        BB              PI507.681B*   AA              96    BB      5
Gm09        424439    C          A          CA       DS11.46096  46      IA3023        AA              PI507.681B*   BB              96    AA      7
Gm09        425375    G          T          GT       DS11.46096  46      IA3023        AA              PI507.681B*   BB              96    AA      9
Gm09        425581    T          C          TC       DS11.46096  46      IA3023        BB              PI507.681B*   AA              96    BB      10
Gm09        43921862  C          A          CA       DS11.46096  46      IA3023        BB              PI507.681B*   AA              96    AA      12
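The table above is a sample of my data. The individual IDs in this table are in Strain_ID, and each ID has multiple markers at different locations. I want to pivot the loci, aggregating on Marker, so that each Strain_ID is a row and every locus is a column.
This is the script I am running in SQL Server 2012 Management Studio: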
declare @cols varchar(Max)
declare @cols1 varchar(Max)
set @cols ='[' ;
select @cols += 'D.'+QUOTENAME (Locus) + ','
from(
select distinct Locus from genotypeQA where Chromosome IN ('Gm01')
) as X
set @cols= stuff(replace(@cols,'D.[','['),1,1,'')
print @cols
set @cols1 = SUBSTRING(@cols,1,LEN(@cols)-1)
print @cols1
select *
from (
select
genotypeQA.Strain_ID,
genotypeQA.Family,
'+ @cols +',
genotypeQA.Marker
from genotypeQA
where
genotypeQA.Family IN ('10')
AND genotypeQA.Chromosome IN ('Gm01')
) as D
Pivot(
MAX(Marker)
For Locus IN ('+ @cols +')) as p
I get the following error:
Msg 102, Level 15, State 1, Line 31
Incorrect syntax near '+ @cols +'.
I expect output in the following format (shown here for part of the table):
| Strain     |           | Gm09_40907915 | Gm09_422384 | Gm09_422720 | Gm09_424439 | |
|------------|-----------|---------------|-------------|-------------|-------------|-|
| DS11.46096 | Variant_A | G             | G           | A           | C           | |
| DS11.46096 | Variant_B | A             | A           | G           | A           | |
| DS11.46096 | Variant   | GA            | GA          | AG          | CA          | |
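Answer 0 (score: 0)
The IN list of PIVOT only accepts literal column names, not a variable or a string expression, so you have to put the whole SQL statement into a variable and then execute it dynamically with EXEC(@sqlCode). For example, DECLARE @sqlCode AS VARCHAR(MAX) = 'sql pivot query here'. Then call EXEC(@sqlCode) to produce the result. Like this: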
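DECLARE @sqlCode AS VARCHAR(MAX);
-- @cols1 is the column list from the question's script with the trailing comma removed.
-- The inner query selects Locus itself; the column list belongs only in the PIVOT IN clause.
-- Quotes inside the string literal are doubled ('' instead of ').
SET @sqlCode = 'select *
from (
select genotypeQA.Strain_ID, genotypeQA.Family, genotypeQA.Locus, genotypeQA.Marker from genotypeQA where genotypeQA.Family IN (''10'') AND genotypeQA.Chromosome IN (''Gm01'')
) as D
Pivot( MAX(Marker) For Locus IN (' + @cols1 + ')) as p'
EXEC (@sqlCode)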
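For reference, here is a minimal end-to-end sketch of the dynamic approach. It assumes a throwaway temp table named #genotypeQA_demo (a hypothetical name, not the real genotypeQA table) loaded with three rows from the question's sample. Only the column list is concatenated into the statement text; the Family and Chromosome filters are passed as parameters through sp_executesql, which is a substitution on my part for the plain EXEC used above.

```sql
-- Hypothetical demo table; the columns are a subset of the question's table.
CREATE TABLE #genotypeQA_demo (
    Chromosome VARCHAR(10),
    Locus      INT,
    Strain_ID  VARCHAR(20),
    Family     VARCHAR(10),
    Marker     CHAR(2)
);

INSERT INTO #genotypeQA_demo (Chromosome, Locus, Strain_ID, Family, Marker)
VALUES ('Gm09', 422384, 'DS11.46096', '46', 'AA'),
       ('Gm09', 422720, 'DS11.46096', '46', 'BB'),
       ('Gm09', 424439, 'DS11.46096', '46', 'AA');

-- Build a bracketed, comma-separated column list such as [422384],[422720],[424439].
-- The order of the items is not guaranteed by this form of concatenation.
DECLARE @cols NVARCHAR(MAX);
SELECT @cols = COALESCE(@cols + ',', '') + QUOTENAME(CAST(Locus AS VARCHAR(20)))
FROM (SELECT DISTINCT Locus FROM #genotypeQA_demo WHERE Chromosome = 'Gm09') AS X;

-- Only the column list is concatenated; the filter values stay parameters.
DECLARE @sql NVARCHAR(MAX) = N'
SELECT *
FROM (
    SELECT Strain_ID, Family, Locus, Marker
    FROM #genotypeQA_demo
    WHERE Family = @family AND Chromosome = @chromosome
) AS D
PIVOT (MAX(Marker) FOR Locus IN (' + @cols + N')) AS p;';

EXEC sp_executesql @sql,
     N'@family VARCHAR(10), @chromosome VARCHAR(10)',
     @family = '46', @chromosome = 'Gm09';

DROP TABLE #genotypeQA_demo;
```

With this sample the query should return a single row for DS11.46096 with one column per locus. Printing @sql before executing it is a quick way to check the generated statement.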
Answer 1 (score: 0)
Edit:
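First, you need to bring the Variant, Variant_A and Variant_B values into a single column, with the name of the column each value came from in a second column; in other words, you have to UNPIVOT. I have used CROSS APPLY here instead of UNPIVOT.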
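As a small illustration of that unpivot step on its own, the sketch below applies CROSS APPLY (VALUES ...) to one row from the question's sample. The table variable @sample and the aliases s and u are assumptions made for the example; the Variants and COLNAMES column names follow the query further down.

```sql
-- One row taken from the question's sample, held in a table variable (hypothetical name).
DECLARE @sample TABLE (
    Chromosome VARCHAR(10),
    Locus      INT,
    Variant_A  VARCHAR(2),
    Variant_B  VARCHAR(2),
    Variant    VARCHAR(2),
    Strain_ID  VARCHAR(20)
);

INSERT INTO @sample (Chromosome, Locus, Variant_A, Variant_B, Variant, Strain_ID)
VALUES ('Gm09', 422384, 'G', 'A', 'GA', 'DS11.46096');

-- Each source row becomes three rows: one per (value, original column name) pair.
SELECT s.Strain_ID, s.Chromosome, s.Locus, u.COLNAMES, u.Variants
FROM @sample AS s
CROSS APPLY (VALUES (s.Variant_A, 'Variant_A'),
                    (s.Variant_B, 'Variant_B'),
                    (s.Variant,   'Variant')) AS u (Variants, COLNAMES);
```

The single input row comes back as three rows, one each for Variant_A, Variant_B and Variant, which is the shape PIVOT needs in order to produce one output row per strain and variant type.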
Declare a variable to hold the column names for the pivot:
DECLARE @cols NVARCHAR (MAX)
-- Builds a list such as [Gm09_422384],[Gm09_422720] from the distinct chromosome/locus pairs
SELECT @cols = COALESCE (@cols + ',[' + ChrLocus + ']', '[' + ChrLocus + ']')
FROM
(
SELECT DISTINCT Chromosome+'_'+ CAST(Locus AS VARCHAR(10)) AS ChrLocus
FROM #TEMP
) PV
ORDER BY ChrLocus
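If the order of the generated columns matters, one alternative that works on SQL Server 2012 is to concatenate with FOR XML PATH, where the ORDER BY applies to the concatenated rows. This is a sketch of a different technique, not the code used in this answer; the table variable @loci is a hypothetical stand-in for #TEMP with only the two columns needed here.

```sql
-- Hypothetical stand-in for #TEMP, including a duplicate locus to show DISTINCT at work.
DECLARE @loci TABLE (Chromosome VARCHAR(10), Locus INT);
INSERT INTO @loci (Chromosome, Locus)
VALUES ('Gm09', 422384), ('Gm09', 422720), ('Gm09', 422384);

DECLARE @cols NVARCHAR(MAX);

-- FOR XML PATH('') concatenates the rows in ORDER BY order;
-- STUFF removes the leading comma; QUOTENAME adds the brackets and escapes any ']'.
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(ChrLocus)
    FROM (SELECT DISTINCT Chromosome + '_' + CAST(Locus AS VARCHAR(10)) AS ChrLocus
          FROM @loci) AS PV
    ORDER BY ChrLocus
    FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)'), 1, 1, '');

PRINT @cols;  -- prints [Gm09_422384],[Gm09_422720]
```

Note that ChrLocus is a string, so the columns sort as text rather than by numeric locus position.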
Now pivot the result. I have explained the logic in comments inside the query:
DECLARE @query NVARCHAR(MAX)
SET @query = '-- This outer query forms your pivoted result
SELECT * FROM
(
-- Source data for pivoting
SELECT DISTINCT Chromosome+''_''+ CAST(Locus AS VARCHAR(10))ChrLocus,Strain_ID,
Variants,COLNAMES
FROM #TEMP
CROSS APPLY(VALUES (Variant_A,''Variant_A''),(Variant_B,''Variant_B''),(Variant,''Variant''))
AS COLUMNNAMES(Variants,COLNAMES)
) x
PIVOT
(
--Defines the values in each dynamic columns
MIN(Variants)
-- Get the names from the @cols variable to show as column
FOR ChrLocus IN (' + @cols + ')
) p
ORDER BY Strain_ID;'
EXEC SP_EXECUTESQL @query