Function in R to summarize the entries in a table column

Time: 2014-07-22 08:34:32

Tags: r read.table

I have this input in R:

> table2[2]
   Describe.the.color.shown.in.the.image.below.
1                                  soft crimson
2                                     dark pink
3                                    watermelon
4                                     Light Red
5                                    dark coral
6                                          Rose
7                                         peach
8                               strawberry pink
9                                     light red
10                                         pink
11                                    light red
12                                       salmon
13                                    light red
14                                    light red
15                                         pink
16                                         pink
17                        light and unclear red
18                                   velvet red
19                                    light red
20                                       orange
21                                    light red
22                                   light  red
23                                    light red
24                                    dark pink
25                                   red orange
26                                         pink

What I need to get is, for example:

light red = 8/26
pink = 4/26
orange = 1/26
salmon = 1/26
rose = 1/26

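In other words, I want to automatically produce a complete summary of table2[2], meaning either cluster the colors or simply count them.

Any ideas? Thanks a lot.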

1 answer:

Answer 0: (score: 1)

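Your example data has uneven spacing inside some entries (for example `light  red` with two spaces), and the same color also appears with different capitalization (`light red`, `Light Red`, etc.). Converting everything to one case and cleaning up the whitespace merges these variants before counting. If your actual dataset has no spacing problems, `clean` and `str_trim` are not needed.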

 table1 <- structure(list(val = 1:26, V1 = c("soft crimson", "dark pink", 
 "watermelon", "Light Red", "dark coral", "Rose", "peach", "strawberry pink", 
 "light red", "pink", "light red", "salmon", "light red", "light red", 
 "pink", "pink", "light and unclear red", "velvet red", "light red", 
 "orange", "light red", "light  red", "light red", "dark pink", 
 "red orange", "pink")), .Names = c("val", "V1"), row.names = c(NA, 
 -26L), class = "data.frame")



 library(qdap)
 library(stringr)


 100*round(prop.table(table(clean(str_trim(toupper(table1[,2]))))),2) #in the above dataset, `str_trim` is not needed though

    #            DARK CORAL             DARK PINK LIGHT AND UNCLEAR RED
    #                     4                     8                     4
    #             LIGHT RED                ORANGE                 PEACH
    #                    35                     4                     4
    #                  PINK            RED ORANGE                  ROSE
    #                    15                     4                     4
    #                SALMON          SOFT CRIMSON       STRAWBERRY PINK
    #                     4                     4                     4
    #            VELVET RED            WATERMELON
    #                     4                     4
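
If you would rather not depend on `qdap` and `stringr`, the same normalize-then-count idea can be sketched in base R. This is only a minimal sketch: the vector name `colors` and the helper variables are made up for illustration, `trimws` requires R 3.2.0 or later, and the `gsub` call is assumed to be an adequate stand-in for `clean` on this particular data (it only collapses runs of whitespace). It also prints raw fractions such as `9/26`, in the form used in the question, instead of rounded percentages.

```r
# Color descriptions copied from the question
# (note "Light Red" and the double space in "light  red").
colors <- c("soft crimson", "dark pink", "watermelon", "Light Red",
            "dark coral", "Rose", "peach", "strawberry pink", "light red",
            "pink", "light red", "salmon", "light red", "light red",
            "pink", "pink", "light and unclear red", "velvet red",
            "light red", "orange", "light red", "light  red", "light red",
            "dark pink", "red orange", "pink")

# Normalize: lower-case, strip leading/trailing whitespace,
# then collapse internal runs of whitespace to a single space.
normalized <- gsub("\\s+", " ", trimws(tolower(colors)))

# Count each distinct description, most frequent first.
counts <- sort(table(normalized), decreasing = TRUE)

# Express each count as "count/total", as in the question.
fractions <- setNames(paste0(counts, "/", length(normalized)), names(counts))
print(fractions)
```

Once the case and spacing variants are merged, `light red` is counted 9 times out of 26, which agrees with the 35% in the output above.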