Question

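I'm basically trying to mutate a dataset and add a column based on the value of another column in that dataset. How can I do this?

Suppose I have a dataset that looks like this: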

movies
# A tibble: 651 x 32
                    title   title_type       genre runtime mpaa_rating                   studio
                    <chr>       <fctr>      <fctr>   <dbl>      <fctr>                   <fctr>
 1            Filly Brown Feature Film       Drama      80           R      Indomina Media Inc.
 2               The Dish Feature Film       Drama     101       PG-13    Warner Bros. Pictures
 3    Waiting for Guffman Feature Film      Comedy      84           R   Sony Pictures Classics
 4   The Age of Innocence Feature Film       Drama     139          PG        Columbia Pictures
 ... (more columns and more rows than shown)

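Suppose it has a column named thtr_release_month (not shown) whose possible values are months of the year, such as "October" or "January".

I want to add a column named oscar_season whose value is yes if the movie was released in November or December, and no otherwise. How can I do that? I feel like this is close: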

movies_with_oscar_season <- movies %>% mutate(oscar_season = ifelse(movies$thtr_release_month == 'November' | movies$thtr_release_month == 'December', 'yes', 'no'))

What am I missing? How can I improve the code above?

I'm actually getting an error:

Column oscar_season must be length 651 (the number of rows) or one, not 0 Calls: <Anonymous> ... <Anonymous> -> mutate -> mutate.tbl_df

What am I doing wrong?

Also, is there a shorter way to write that long or expression?

Answer 1

You can create a new vector that holds the result of evaluating your condition:

oscar_season <- ifelse(movies$thtr_release_month %in% c('November', 'December'), "yes", "no")

Edit: per the comments, the result needs to be "yes" or "no" when the condition is TRUE or FALSE, respectively.

Then call mutate with the new column:
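To illustrate how %in% shortens the long or expression, here is a minimal base R sketch. The release_month vector is made-up sample data standing in for movies$thtr_release_month, and is_oscar_month is an illustrative helper name not used in the question.

```r
# Hypothetical sample of release months (not taken from the real dataset)
release_month <- c("October", "November", "January", "December")

# %in% tests membership in a set, replacing a chain of `==` joined by `|`
is_oscar_month <- release_month %in% c("November", "December")

# ifelse() maps TRUE to "yes" and FALSE to "no", element by element
oscar_season <- ifelse(is_oscar_month, "yes", "no")
print(oscar_season)
```

The result has the same length as the input vector, which is what mutate expects for a new column.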

movies_oscar_season <- mutate(movies, oscar_season = oscar_season)

This should give you the original dataset with the oscar_season column added.
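As an alternative sketch, the condition can also be written directly inside mutate, referring to the column by its bare name rather than through movies$. The movies_demo tibble below is a hypothetical stand-in with invented titles, since the full dataset is not shown.

```r
library(dplyr)

# Hypothetical stand-in for the `movies` tibble; only the columns needed here
movies_demo <- tibble(
  title = c("Film A", "Film B", "Film C"),
  thtr_release_month = c("November", "June", "December")
)

# Inside mutate(), columns are referenced by bare name (no `movies$` prefix)
movies_with_oscar_season <- movies_demo %>%
  mutate(oscar_season = ifelse(
    thtr_release_month %in% c("November", "December"), "yes", "no"
  ))

print(movies_with_oscar_season)
```

One possible explanation for the "not 0" error in the question: on a tibble, movies$some_name returns NULL when no column has exactly that name, and comparing NULL yields a zero-length result. If that is the cause here, checking names(movies) for the exact spelling of thtr_release_month would confirm it. With a bare column name inside mutate, a misspelled column fails with an "object not found" error instead, which is easier to diagnose.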

Mutating a column based on the value of another column in R
