How to collapse string data with identical values in a data frame column in R

Time: 2014-10-02 06:06:00

Tags: r string dataframe

I have a data frame with a string column, and I need to collapse all equal values into a single value. An example of my data set:

                      CommonName Month                      Site season period
23                Gambel's Quail   Oct McDowell Sonoran Preserve Autumn      4
24              American Kestrel   Nov McDowell Sonoran Preserve Autumn      4
25        Black-throated Sparrow   Nov McDowell Sonoran Preserve Autumn      4
26              Brewer's Sparrow   Nov McDowell Sonoran Preserve Autumn      4
27                  Common Raven   Nov McDowell Sonoran Preserve Autumn      4
28                Gilded Flicker   Nov McDowell Sonoran Preserve Autumn      4
29             Loggerhead Shrike   Nov McDowell Sonoran Preserve Autumn      4
30             Loggerhead Shrike   Nov McDowell Sonoran Preserve Autumn      4
31          Northern Mockingbird   Nov McDowell Sonoran Preserve Autumn      4
32               Red-tailed Hawk   Nov McDowell Sonoran Preserve Autumn      4
33         White-crowned Sparrow   Nov McDowell Sonoran Preserve Autumn      4
107             Acorn Woodpecker   Oct McDowell Sonoran Preserve Autumn      4
108                 Say's Phoebe   Nov McDowell Sonoran Preserve Autumn      4
236               Abert's Towhee   Nov        Brown's Ranch Wash Autumn      4
237                  Cactus Wren   Nov        Brown's Ranch Wash Autumn      4
238                Canyon Towhee   Nov        Brown's Ranch Wash Autumn      4
239        Curve-billed Thrasher   Nov        Brown's Ranch Wash Autumn      4
240               Gambel's Quail   Nov        Brown's Ranch Wash Autumn      4

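This data spans multiple years, so a species may be listed more than once. I want to avoid that, because I only want to determine the occurrence of species within each site and season. So in this example, I would like only one data point each for Loggerhead Shrike and Gambel's Quail, with everything else left unchanged. I appreciate your help. I have not been able to find a similar question, but I don't know exactly what this process would be called.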

1 answer:

Answer 0 (score: 0)

To "determine the occurrence of species within each site and season", try the following:

> with(ddf, table(CommonName, Site, Season))

, , Season = Autumn

                        Site
CommonName               BrownsRanch McDowell
  Aberts Towhee                    1        0
  Acorn Woodpecker                 0        1
  American Kestrel                 0        1
  Black-throated Sparrow           0        1
  Brewers Sparrow                  0        1
  Cactus Wren                      1        0
  Canyon Towhee                    1        0
  Common Raven                     0        1
  Curve-billed Thrasher            1        0
  Gambels Quail                    1        1
  Gilded Flicker                   0        1
  Loggerhead Shrike                0        2
  Northern Mockingbird             0        1
  Red-tailed Hawk                  0        1
  Says Phoebe                      0        1
  White-crowned Sparrow            0        1

Or:

> with(ddf, table(CommonName, Season, Site))
, , Site = BrownsRanch

                        Season
CommonName               Autumn
  Aberts Towhee               1
  Acorn Woodpecker            0
  American Kestrel            0
  Black-throated Sparrow      0
  Brewers Sparrow             0
  Cactus Wren                 1
  Canyon Towhee               1
  Common Raven                0
  Curve-billed Thrasher       1
  Gambels Quail               1
  Gilded Flicker              0
  Loggerhead Shrike           0
  Northern Mockingbird        0
  Red-tailed Hawk             0
  Says Phoebe                 0
  White-crowned Sparrow       0

, , Site = McDowell

                        Season
CommonName               Autumn
  Aberts Towhee               0
  Acorn Woodpecker            1
  American Kestrel            1
  Black-throated Sparrow      1
  Brewers Sparrow             1
  Cactus Wren                 0
  Canyon Towhee               0
  Common Raven                1
  Curve-billed Thrasher       0
  Gambels Quail               1
  Gilded Flicker              1
  Loggerhead Shrike           2
  Northern Mockingbird        1
  Red-tailed Hawk             1
  Says Phoebe                 1
  White-crowned Sparrow       1

Here I changed some of the Season entries to Spring:

> with(ddf, table(CommonName, Season, Site))
, , Site = BrownsRanch

                        Season
CommonName               Autumn Spring
  Aberts Towhee               0      1
  Acorn Woodpecker            0      0
  American Kestrel            0      0
  Black-throated Sparrow      0      0
  Brewers Sparrow             0      0
  Cactus Wren                 0      1
  Canyon Towhee               1      0
  Common Raven                0      0
  Curve-billed Thrasher       1      0
  Gambels Quail               1      0
  Gilded Flicker              0      0
  Loggerhead Shrike           0      0
  Northern Mockingbird        0      0
  Red-tailed Hawk             0      0
  Says Phoebe                 0      0
  White-crowned Sparrow       0      0

, , Site = McDowell

                        Season
CommonName               Autumn Spring
  Aberts Towhee               0      0
  Acorn Woodpecker            0      1
  American Kestrel            1      0
  Black-throated Sparrow      1      0
  Brewers Sparrow             1      0
  Cactus Wren                 0      0
  Canyon Towhee               0      0
  Common Raven                1      0
  Curve-billed Thrasher       0      0
  Gambels Quail               1      0
  Gilded Flicker              1      0
  Loggerhead Shrike           0      2
  Northern Mockingbird        0      1
  Red-tailed Hawk             0      1
  Says Phoebe                 0      1
  White-crowned Sparrow       0      1
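
The tables above report counts, so a species recorded twice in the same site and season (Loggerhead Shrike here) shows up as 2. If only occurrence matters, the counts can be turned into presence/absence values. The following is a minimal sketch on a hypothetical four-row miniature of `ddf` (not the full data set), using the answer's column names `CommonName`, `Site`, and `Season`; the names `counts` and `presence` are made up for illustration:

```r
# Hypothetical miniature of the answer's `ddf`, for illustration only
ddf <- data.frame(
  CommonName = c("Loggerhead Shrike", "Loggerhead Shrike",
                 "Gambels Quail", "Gambels Quail"),
  Site       = c("McDowell", "McDowell", "McDowell", "BrownsRanch"),
  Season     = c("Autumn", "Autumn", "Autumn", "Autumn"),
  stringsAsFactors = FALSE
)

# Counts per species / site / season, as in the answer
counts <- with(ddf, table(CommonName, Site, Season))

# Presence/absence: 1 if the species was recorded at least once, else 0
presence <- (counts > 0) * 1L

presence
```

Because the comparison keeps the dimensions and dimnames of the table, `presence` can be indexed in the same way as `counts`, for example `presence["Loggerhead Shrike", "McDowell", "Autumn"]`.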

To remove the extra rows where name, site, and season are the same:

> ddf[!duplicated(paste(ddf$CommonName,ddf$Site,ddf$Season)),]
               CommonName Month        Site Season period
1           Gambels Quail   Oct    McDowell Autumn      4
2        American Kestrel   Nov    McDowell Autumn      4
3  Black-throated Sparrow   Nov    McDowell Autumn      4
4         Brewers Sparrow   Nov    McDowell Autumn      4
5            Common Raven   Nov    McDowell Autumn      4
6          Gilded Flicker   Nov    McDowell Autumn      4
7       Loggerhead Shrike   Nov    McDowell Autumn      4
9    Northern Mockingbird   Nov    McDowell Autumn      4
10        Red-tailed Hawk   Nov    McDowell Autumn      4
11  White-crowned Sparrow   Nov    McDowell Autumn      4
12       Acorn Woodpecker   Oct    McDowell Autumn      4
13            Says Phoebe   Nov    McDowell Autumn      4
14          Aberts Towhee   Nov BrownsRanch Autumn      4
15            Cactus Wren   Nov BrownsRanch Autumn      4
16          Canyon Towhee   Nov BrownsRanch Autumn      4
17  Curve-billed Thrasher   Nov BrownsRanch Autumn      4
18          Gambels Quail   Nov BrownsRanch Autumn      4
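
The same de-duplication can also be written without pasting the key columns into one string, by passing the key columns to `duplicated()` as a data frame. This is only a sketch on a hypothetical four-row miniature of `ddf`; the names `keys` and `dedup` are made up for illustration. One possible advantage is that it avoids accidental collisions that a space-separated `paste()` could in principle produce (for example, "A B" + "C" versus "A" + "B C").

```r
# Hypothetical miniature of the answer's `ddf`, for illustration only
ddf <- data.frame(
  CommonName = c("Loggerhead Shrike", "Loggerhead Shrike",
                 "Gambels Quail", "Gambels Quail"),
  Month      = c("Nov", "Nov", "Oct", "Nov"),
  Site       = c("McDowell", "McDowell", "McDowell", "BrownsRanch"),
  Season     = c("Autumn", "Autumn", "Autumn", "Autumn"),
  stringsAsFactors = FALSE
)

# Columns that together define "the same record"
keys <- c("CommonName", "Site", "Season")

# Keep the first row of every CommonName / Site / Season combination
dedup <- ddf[!duplicated(ddf[keys]), ]

dedup
```

If the other columns (such as `Month`) are not needed, `unique(ddf[keys])` gives the distinct combinations directly. Note that the question's data uses a lowercase `season` column, so `keys` would need to match the actual column names.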

Note that quote characters (as in Gambel's) do not behave well in string entries and have been removed from the data above.
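
If the apostrophes should be kept, one option when the data is read with `read.table()` is to restrict the `quote` argument, since its default treats both `"` and `'` as quoting characters. The sketch below assumes a comma-separated input; the inline text `txt` and the name `birds` are hypothetical and only stand in for the real file:

```r
# Hypothetical comma-separated input containing apostrophes
txt <- "CommonName,Site
Gambel's Quail,McDowell Sonoran Preserve
Say's Phoebe,Brown's Ranch Wash"

# Treat only the double quote as a quoting character,
# so apostrophes are read as ordinary text
birds <- read.table(text = txt, header = TRUE, sep = ",",
                    quote = "\"", stringsAsFactors = FALSE)

birds
```

With the names read intact, the `table()` and `duplicated()` calls above work the same way on the original species and site names.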