I have a table like this:
+--------+-------+--------+-------+
| attr1  | attr2 | attr3  | attr4 |
+--------+-------+--------+-------+
| purple | wine  | clear  | 10.0  |
| red    | wine  | solid  | 20.0  |
| red    | beer  | cloudy | 10.0  |
| purple | ale   | clear  | 34.0  |
| blue   | ale   | solid  | 16.0  |
+--------+-------+--------+-------+
I would like to transform it like this:
+--------+-------+-------+-------+-------+
|        | attr1 | attr2 | attr3 | attr4 |
+--------+-------+-------+-------+-------+
| purple | 2     |       |       |       |
| red    | 2     |       |       |       |
| blue   | 1     |       |       |       |
| wine   |       | 2     |       |       |
| beer   |       | 1     |       |       |
| ale    |       | 2     |       |       |
| clear  |       |       | 2     |       |
| solid  |       |       | 2     |       |
| cloudy |       |       | 1     |       |
| 10.0   |       |       |       | 2     |
| 20.0   |       |       |       | 1     |
| 34.0   |       |       |       | 1     |
| 16.0   |       |       |       | 1     |
+--------+-------+-------+-------+-------+
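This pivot table, or crosstab, would show me the count of each attribute value in its respective column.
How can I produce such a crosstab using the Google Query language?
Answer 0 (score: 2)
Well, it would be straightforward if the data were in two columns, e.g. for something like this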
Attrib Column
Red 1
Red 1
Green 1
Blue 1
Beer 2
Ale 2
Ale 2
you can use a query like
=query(A:B,"select A,count(A) where A<>'' group by A pivot B")
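Not Sheets, but in case it helps to see what that query computes, here is a rough Python sketch of the same "count, group by value, pivot by column" logic on the two-column example above. The function name `count_pivot` is made up for illustration, and empty cells are shown as `None`; this is only meant to mirror the idea, not how the query engine works internally.

```python
from collections import Counter

# Two-column example from above: (attribute value, column number).
rows = [("Red", 1), ("Red", 1), ("Green", 1), ("Blue", 1),
        ("Beer", 2), ("Ale", 2), ("Ale", 2)]


def count_pivot(pairs):
    """Count each (value, column) pair, skipping blank values.

    Returns the sorted pivot columns and a dict mapping each value to
    one count per pivot column (None where the pair never occurs).
    """
    counts = Counter((value, col) for value, col in pairs if value != "")
    values = sorted({value for value, _ in counts})
    columns = sorted({col for _, col in counts})
    table = {v: [counts.get((v, c)) for c in columns] for v in values}
    return columns, table


columns, table = count_pivot(rows)
for value, cells in table.items():
    print(value, cells)
```

Each value ends up with a count only under the column it came from, which is the shape the OP asked for.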
So the problem is getting the OP's data into two columns.
This can be done with the by now fairly standard split / join / transpose technique
=ArrayFormula(split(transpose(split(textjoin("|",true,if(A2:D="","",A2:D&" "&column(A2:D))),"|"))," "))
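which gives a two-column list: each non-blank value alongside the number of the column it came from.
You can either run a query on this result, or combine the two like this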
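For anyone who finds the nested split / textjoin / transpose hard to read, the step amounts to flattening the grid into (value, column number) pairs. A minimal Python sketch of that idea, using the OP's data; the name `unpivot` is hypothetical and the values are kept as text here for simplicity:

```python
# The OP's data rows (header row left out), all values as text.
grid = [
    ["purple", "wine", "clear", "10.0"],
    ["red", "wine", "solid", "20.0"],
    ["red", "beer", "cloudy", "10.0"],
    ["purple", "ale", "clear", "34.0"],
    ["blue", "ale", "solid", "16.0"],
]


def unpivot(rows):
    """Flatten a grid into (value, column_number) pairs, skipping blanks."""
    pairs = []
    for row in rows:
        # Column numbers start at 1, as column() does for column A.
        for col_number, cell in enumerate(row, start=1):
            if cell != "":
                pairs.append((cell, col_number))
    return pairs


pairs = unpivot(grid)
print(pairs[:4])
```

The resulting pairs are in the two-column shape that the count / pivot query needs.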
=ArrayFormula(query({"Attrib","Number";split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" "&column(A2:D))),"|"))," ")},"Select Col1,count(Col1) group by Col1 pivot Col2"))
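I have prefixed each attribute with its column number, e.g. 1-blue, so that the rows sort in the right order. If you don't like it, you can get rid of it with regexreplace.
Edit
A slightly shorter formula; I didn't need to put the headers in separately: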
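To show why the prefix matters, here is a small Python sketch, assuming the grouped rows come back in plain text sort order. The pattern `^\d+-` is just one way the prefix could be stripped afterwards; it is my assumption, not something taken from the formulas above.

```python
import re

# Labels as built above: column number, a hyphen, then the value.
labels = ["1-purple", "1-red", "1-blue", "2-wine", "2-beer", "2-ale"]

# Without the prefix, values from different columns interleave when sorted.
without_prefix = sorted(label.split("-", 1)[1] for label in labels)

# With the prefix, rows stay grouped by their source column.
with_prefix = sorted(labels)

# Stripping the prefix afterwards keeps that grouped order.
stripped = [re.sub(r"^\d+-", "", label) for label in with_prefix]

print(without_prefix)
print(stripped)
```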
=ArrayFormula(query(split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" Attr"&column(A2:D))),"|"))," "),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))
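Edit 2
I was being a bit thick there; I should have used the first row of the OP's data as the attribute labels rather than the column numbers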
=ArrayFormula(query(split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" "&A1:D1)),"|"))," "),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))
Edit 3
I should have chosen a better pair of delimiters
=ArrayFormula(query(split(transpose(split(textjoin("",true,if(A2:D="","",column(A2:D)&"-"&A2:D&""&A1:D1)),"")),""),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))
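The reason the delimiters matter: if a value contains the character used to separate the fields, it gets cut in two. A quick Python sketch of the problem, with a made-up value `pale ale` and a hypothetical separator (the ASCII unit separator) that the data is assumed never to contain:

```python
# Hypothetical value containing a space, plus its attribute label.
value, label = "pale ale", "attr2"

# Space as the field delimiter: the value itself is split as well.
naive = f"{value} {label}".split(" ")

# Hypothetical alternative: a character assumed never to occur in the data.
FIELD_SEP = "\x1f"
safe = f"{value}{FIELD_SEP}{label}".split(FIELD_SEP)

print(naive)
print(safe)
```

The same applies to the record delimiter used in the join: any character that can turn up in the data will break the rows apart in the wrong places.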