Get the first and second highest frequencies from multiple columns

Date: 2018-02-15 13:48:21

Tags: r sapply

I have a data frame with 52 rows and 161 columns. Its structure is shown below.

>str(CEPH)
'data.frame':   52 obs. of  161 variables:
$ id         : chr  "85" "86" "94" "00" ...
$ subgroup   : chr  "AAA" "AAA" "AAA" "AAA" ...
$ A1_A   : chr  "3:01" "3:01" "2:01" "2:01" ...
$ A1_B   : chr  "" "" "" "" ...
$ A2_A   : chr  "2:01" "32:01:01" "32:01:01" "68:01:02" ...
$ A2_B   : chr  "" "32:01:02" "32:01:02" "" ...
$ A2_C   : chr  "" "" "" "" ...
$ B1_A   : chr  "7:02:01" "44:03:01" "40:02:00" "44:02:00" ...
...

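Some of the columns contain mostly NA values. For each column, I need to find the most frequent and second most frequent values. I tried the code below, but there are more than 50 columns, so passing them one at a time is not practical. Is there a way to do this with sapply?

Input data: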

   id   subgroup A1_A     A1_B   A1_C A1_D   A1_E A1_F A1_G
1  85   AAA      3:01     ""     ""   ""     ""   ""
2  86   AAA      3:01     05:01  ""   07:08  ""   ""
3  94   AAA      2:01     05:01  ""   ""     ""   ""
4  000  AAA      2:01     06:07  ""   ""     ""   ""
5  37   AAA      30:01:00 07:08  ""   ""     ""   ""
6  48   AAA      2:01     01:01  ""   ""     ""   ""

# Return the two most frequent values in one column of a data frame
fre <- function(CEPH, col) {
  q <- sort(table(CEPH[, col]), decreasing = TRUE)[1:2]
  return(q)
}
fre(AAA, 4)

The output I get does not include the column name:

  NA      32:01:02 
  49        2 

Desired output:

Types   Frequent_Type    Highest_Frequency       
A1_A     2:01            20
A1_A     NA               5
A1_B     NA              49    
A1_B     3:01:01         5  
A1_C     2:01            20
A1_C     05:02            2

1 Answer:

Answer 0: (score: 0)

This may not be an exact solution, but I managed to get the two frequencies separately and merge them afterwards.
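As a minimal sketch of one way to get both frequencies for every column in a single pass, the code below applies a per-column helper with lapply and stacks the results into the long format shown in the desired output. The toy data frame `df`, the helper name `top_n_freq`, and the choice to treat empty strings as NA are all assumptions made for illustration; they are not taken from the original data or code. lapply is used instead of sapply because each column yields a small data frame rather than a single value.

```r
# Hypothetical toy data standing in for CEPH; "" marks a missing type.
df <- data.frame(
  id       = c("85", "86", "94", "00", "37", "48"),
  subgroup = "AAA",
  A1_A     = c("3:01", "3:01", "2:01", "2:01", "30:01:00", "2:01"),
  A1_B     = c("", "05:01", "05:01", "06:07", "07:08", "01:01"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the n most frequent values of one vector.
top_n_freq <- function(x, n = 2) {
  x[!is.na(x) & x == ""] <- NA                  # assumption: "" means missing
  counts <- sort(table(x, useNA = "ifany"), decreasing = TRUE)
  counts <- counts[seq_len(min(n, length(counts)))]
  data.frame(
    Frequent_Type     = names(counts),          # NA is kept as its own category
    Highest_Frequency = as.integer(counts),
    stringsAsFactors  = FALSE
  )
}

# Apply the helper to every column except the identifier columns,
# tagging each block of rows with the column it came from.
type_cols <- setdiff(names(df), c("id", "subgroup"))
per_col <- lapply(type_cols, function(col) {
  data.frame(Types = col, top_n_freq(df[[col]]), stringsAsFactors = FALSE)
})

# Stack the per-column results into one long data frame.
result <- do.call(rbind, per_col)
print(result)
```

Carrying the column name in a Types column is what restores the information missing from the unnamed table output. If NA should not be counted as a category, dropping `useNA = "ifany"` from the `table()` call excludes it.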