Mixing `stringsAsFactors` in a data frame

Date: 2016-08-11 11:31:12

Tags: r dataframe

I am trying to create an R data frame in which some columns are treated as factors and others as strings.

fruits <- data.frame(fruit = character(), descr = character())
fruits <- rbind(fruits, data.frame(fruit = "apple", descr = "jjrkgnser"))
fruits <- rbind(fruits, data.frame(fruit = "apple", descr = "aprtgh"))
fruits <- rbind(fruits, data.frame(fruit = "pear", descr = "akjreg"))

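Both columns are treated as factors, and the descr column ends up with as many factor levels as there are rows in the fruits data frame.

How can I have descr treated as a string and fruit as a factor? If I use stringsAsFactors = FALSE, it applies to all columns!

Edit

I hacked together this solution, which is not very elegant: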

fruits <- data.frame(fruit = factor(), path = character(), stringsAsFactors = FALSE)
fruits <- rbind(fruits, data.frame(fruit = factor("apple"), path = "jjrkgnser", stringsAsFactors = FALSE))
fruits <- rbind(fruits, data.frame(fruit = factor("apple"), path = "aprtgh", stringsAsFactors = FALSE))
fruits <- rbind(fruits, data.frame(fruit = factor("pear"), path = "akjreg", stringsAsFactors = FALSE))

With this, the output of

> str(fruits)
'data.frame':   3 obs. of  2 variables:
 $ fruit: Factor w/ 2 levels "apple","pear": 1 1 2
 $ path : chr  "jjrkgnser" "aprtgh" "akjreg"

is what I need. Is there a better way?

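2 answers:

Answer 0: (score: 1)

I'm not sure whether you are using rbind as an illustrative example or as your actual use case (growing a data frame this way is very memory-inefficient). Assuming it is necessary, you can cut down on typing by using data_frame from the tibble package (part of the dplyr ecosystem), which never converts strings to factors: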

library(tibble)
fruits <- data_frame(fruit = factor(), descr = character())
fruits <- rbind(fruits, data_frame(fruit = factor("apple"), descr = "jjrkgnser"))
fruits <- rbind(fruits, data_frame(fruit = factor("apple"), descr = "aprtgh"))
fruits <- rbind(fruits, data_frame(fruit = factor("pear"), descr = "akjreg"))
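On the memory-efficiency point above: if the rows really do arrive one at a time, a common base R pattern is to collect them in a list and bind once at the end. The following is only a minimal sketch under that assumption; the `rows` list is a hypothetical stand-in for however the rows are actually produced, and the values are the ones from the question.

```r
# Hypothetical input: one-row data frames gathered in a list.
# stringsAsFactors = FALSE keeps every string column as character.
rows <- list(
  data.frame(fruit = "apple", descr = "jjrkgnser", stringsAsFactors = FALSE),
  data.frame(fruit = "apple", descr = "aprtgh",    stringsAsFactors = FALSE),
  data.frame(fruit = "pear",  descr = "akjreg",    stringsAsFactors = FALSE)
)

# Bind all rows in a single call instead of growing the data frame step by step.
fruits <- do.call(rbind, rows)

# Convert only the categorical column to a factor.
fruits$fruit <- factor(fruits$fruit)

str(fruits)
```

Converting fruit after binding also avoids having to merge factor levels on every rbind call.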

Answer 1: (score: 1)

# create the dataset in one call (avoid rbind if possible);
# string columns become factors by default here (R < 4.0)
fruits <- data.frame(fruit = c("apple", "apple", "pear"),
                     path = c("jjrkgnser", "aprtgh", "akjreg"))

# convert this column back to a character vector
fruits$path <- as.character(fruits$path)
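A variant that does not depend on the default value of stringsAsFactors (which changed to FALSE in R 4.0.0) is to go the other way: keep all strings as character and convert only fruit. This is a sketch using the same example data, not part of the original answer.

```r
# Keep every string column as character, whatever the R version's default is.
fruits <- data.frame(fruit = c("apple", "apple", "pear"),
                     path = c("jjrkgnser", "aprtgh", "akjreg"),
                     stringsAsFactors = FALSE)

# Convert only the column that should be categorical.
fruits$fruit <- factor(fruits$fruit)

str(fruits)
```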