Find values that appear in multiple columns in Excel

Time: 2018-02-23 18:11:33

Tags: excel

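I have a set of gene probes that are upregulated when placed under different chemical stresses. Each column contains all of the upregulated gene probes. I have 12 columns; how can I get a list of the gene probes that appear in all 12 columns?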

I have been able to find the values common to two columns using the formula
 =IF(ISERROR(MATCH(A2,$C$2:$C$21473,0)),"",A2)

but I can't work out how to adapt it to 12 columns.
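For readers who want to check the two-column formula's logic outside Excel, here is a minimal sketch in Python. The lists `col_a` and `col_c` and the helper `match_or_blank` are hypothetical stand-ins for column A and the range `$C$2:$C$21473`; they are not part of the original question. One difference to keep in mind: Excel's `MATCH` ignores letter case, while this comparison is case-sensitive.

```python
# Hypothetical stand-ins for column A and the lookup range in column C.
col_a = ["GENE:1", "GENE:2", "GENE:3", "GENE:4"]
col_c = ["GENE:3", "GENE:9", "GENE:1"]


def match_or_blank(value, lookup):
    """Mimic IF(ISERROR(MATCH(value, lookup, 0)), "", value)."""
    return value if value in lookup else ""


# A set makes each membership test fast, which matters at ~21K rows.
lookup_set = set(col_c)
shared = [match_or_blank(v, lookup_set) for v in col_a]
print(shared)  # ['GENE:1', '', 'GENE:3', '']
```

As in the worksheet, the result keeps one entry per row of column A, with an empty string where the value is missing from column C.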

G.Ac  G.As  G.At  G.Ac.At  G.As.Ac  G.As.At  G.Cd  G.Cu  G.Ni  G.Cd.Cu  G.Cd.Ni  G.Ni.Cu
GENE:JGI_V11_3346220103  GENE:JGI_V11_2653050203  GENE:JGI_V11_3299790103  GENE:JGI_V11_359040103  GENE:JGI_V11_2228010103  GENE:JGI_V11_2662750203  GENE:JGI_V11_1926920303  GENE:JGI_V11_3134270303  GENE:JGI_V11_3119540303  GENE:JGI_V11_3134270203  GENE:JGI_V11_1926920303  GENE:JGI_V11_3134270303
GENE:JGI_V11_3164760203  GENE:JGI_V11_565470303  GENE:JGI_V11_2296170203  GENE:JGI_V11_2045300203  GENE:JGI_V11_2421620203  GENE:JGI_V11_2228010303  GENE:JGI_V11_2196580303  GENE:JGI_V11_3134270203  GENE:JGI_V11_3119540203  GENE:JGI_V11_1926920103  GENE:JGI_V11_1926920103  GENE:JGI_V11_1014720202
GENE:JGI_V11_478830203  GENE:JGI_V11_3168730303  GENE:JGI_V11_3311070202  GENE:JGI_V11_3216620102  GENE:JGI_V11_2653050303  GENE:JGI_V11_3300140202  GENE:JGI_V11_2653050303  GENE:JGI_V11_1159220202  GENE:JGI_V11_2024180303  GENE:JGI_V11_1926920303  GENE:JGI_V11_2196580303  GENE:JGI_V11_1159220202
GENE:JGI_V11_3164760303  GENE:JGI_V11_2228010203  GENE:JGI_V11_2341670203  GENE:JGI_V11_1938910303  GENE:JGI_V11_3026230203  GENE:JGI_V11_2449230203  GENE:JGI_V11_3134270303  GENE:JGI_V11_2235750203  GENE:JGI_V11_1981410203  GENE:JGI_V11_3251310202  GENE:JGI_V11_977750103  GENE:JGI_V11_954070203
GENE:JGI_V11_2267320203  GENE:JGI_V11_2268000303  GENE:JGI_V11_2226270101  GENE:JGI_V11_3003640303  GENE:JGI_V11_223520203  GENE:JGI_V11_2662750103  GENE:JGI_V11_2228010103  GENE:JGI_V11_3251310202  GENE:JGI_V11_3198630203  GENE:JGI_V11_3134270303  GENE:JGI_V11_1926920203  GENE:JGI_V11_287750103
GENE:JGI_V11_465160203  GENE:JGI_V11_2268000203  GENE:JGI_V11_2473230303  GENE:JGI_V11_3192220102  GENE:JGI_V11_3026230303  GENE:JGI_V11_3039310303  GENE:JGI_V11_1926920103  GENE:JGI_V11_1159220102  GENE:JGI_V11_3052790202  GENE:JGI_V11_3075830303  GENE:JGI_V11_2196580203  GENE:JGI_V11_3134280203
GENE:JGI_V11_3142970303  GENE:JGI_V11_503720303  GENE:JGI_V11_2236410103  GENE:JGI_V11_3042230103  GENE:JGI_V11_2228010203  GENE:JGI_V11_3028210101  GENE:JGI_V11_2105710303  GENE:JGI_V11_1926920303  GENE:JGI_V11_2131620103  GENE:JGI_V11_1002840203  GENE:JGI_V11_2088480203  GENE:JGI_V11_3196120102

These are the first 8 rows of the 12 columns. There are 21473 rows in total.

Thanks

1 Answer:

Answer 0: (score: 0)

You can use an array formula like this to count the number of columns in which a particular gene probe appears:
=SUM(--(MMULT(TRANSPOSE(ROW(A$2:L$10000)^0),N(A$2:L$10000=A2))>0))

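This is the standard way to get the column totals of a 2D array. Here the 2D array holds true/false values recording whether each cell in the range equals A2, and the outer SUM counts the columns whose total is greater than zero.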
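The column-total idea can be traced step by step outside Excel. The following is a minimal Python sketch of the same logic, not the Excel implementation itself; `grid` is a hypothetical 4-row, 3-column sheet (the real data has 12 columns) and `columns_containing` is a made-up helper name.

```python
# Hypothetical 4-row, 3-column sheet; the real data has 12 columns.
grid = [
    ["A", "C", "B"],
    ["C", "D", "C"],
    ["B", "A", "E"],
    ["C", "F", "G"],
]


def columns_containing(grid, target):
    """Count the columns in which `target` appears at least once."""
    # N(range = target): a 0/1 matrix marking the cells equal to the target.
    hits = [[1 if cell == target else 0 for cell in row] for row in grid]
    # TRANSPOSE(ROW(range)^0) is a row vector of ones. Multiplying it by
    # `hits`, which is what MMULT does, yields one total per column.
    ones = [1] * len(grid)
    n_cols = len(grid[0])
    column_totals = [
        sum(ones[r] * hits[r][c] for r in range(len(grid)))
        for c in range(n_cols)
    ]
    # SUM(--(totals > 0)): count the columns with at least one hit.
    return sum(1 for total in column_totals if total > 0)


print(columns_containing(grid, "C"))  # 3
print(columns_containing(grid, "A"))  # 2
```

A value that occurs twice in one column, such as "C" in the first column here, still counts that column only once, which is why the formula tests the totals with `>0` before summing.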

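This is a brute-force approach: each row needs ~120K multiplications. If the formula is copied down ~10K rows, there is a delay of about 100 seconds on my computer while Excel calculates the result.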
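The ~120K figure follows from the size of the range in the formula. A quick arithmetic check in Python, where the fill-down count of 10,000 is an assumed round number taken from the "~10K rows" estimate:

```python
rows_in_range = 10000 - 2 + 1   # A$2:L$10000 spans rows 2 through 10000
columns = 12                    # columns A through L
per_formula = rows_in_range * columns

copies = 10_000                 # assumed number of rows the formula is filled down
total = per_formula * copies

print(per_formula)  # 119988
print(total)        # 1199880000
```

So one copy of the formula performs roughly 120 thousand multiplications, and filling it down about ten thousand rows brings the total to roughly 1.2 billion.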

It must be entered as an array formula using Ctrl+Shift+Enter.


In the dummy data I tested with (shown in the original screenshot), C is the only value that appears in all 12 columns.
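A probe appears in all 12 columns exactly when its column count equals 12, so the final list is the set of values with that count. As a cross-check outside Excel, the same list can be obtained as a set intersection. This is a minimal Python sketch under stated assumptions: the column names come from the question's header row, but the cell values and the helper `in_every_column` are invented for illustration, and only 3 of the 12 columns are shown.

```python
# Hypothetical data: 3 of the 12 columns, with made-up values.
# Columns may have different lengths; empty strings stand for blank cells.
columns = {
    "G.Ac": ["A", "C", "B"],
    "G.As": ["C", "D", ""],
    "G.At": ["E", "C", "A"],
}


def in_every_column(columns):
    """Return the sorted values that occur in every column, ignoring blanks."""
    sets = [set(v for v in values if v) for values in columns.values()]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


print(in_every_column(columns))  # ['C']
```

With the real sheet, each of the 12 columns would be loaded into its own list before calling the helper.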