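Factors and subfactors in R

Time: 2014-06-25 09:42:32

Tags: r

Suppose I have the following data frame of factors and their subfactors. How can I extract, for example, the foods eaten by the dog and the foods eaten by the cat?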

factors subfactors
dog          biscuit
dog          chicken
dog          chicken
cat          milk
cat          soup
dog          pedigree

Expected output:

dog: biscuit,chicken,pedigree
cat: milk,soup

4 answers:

Answer 0 (score: 2)

You can do this with data.table:

library(data.table)
setDT(df)[, list(Ate = paste(unique(subfactors), collapse = ", ")), by = factors]


##    factors                        Ate
## 1:     dog biscuit, chicken, pedigree
## 2:     cat                 milk, soup

A more idiomatic data.table way to do this is

setDT(df)[, lapply(.SD, function(x) paste(unique(x), collapse = ", ")), by = factors]

You can also do this in base R (though for a large data set, data.table is usually faster and therefore preferable)

aggregate(subfactors ~ factors, df, function (x) paste(unique(x), collapse = ", "))

##   factors                 subfactors
## 1     cat                 milk, soup
## 2     dog biscuit, chicken, pedigree

Answer 1 (score: 1)

Do you mean basic indexing?

df <- read.table(header = TRUE, text = 'factors subfactors
dog          biscuit
dog          chicken
cat          milk
cat          soup
dog          pedigree')

df[df$factors == 'dog', "subfactors"]
df[df$factors == 'cat', "subfactors"]

Or perhaps split it into a list:

split(df$subfactors, df$factors)
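To get from that list to the format shown in the question, one possible sketch is to drop duplicates within each group and paste the rest together. This uses base R only and rebuilds the question's data (including the repeated chicken row) with as.is = TRUE so the columns are character vectors; the names foods and collapsed are just illustrative.

```r
# Sample data from the question; as.is = TRUE keeps the columns as character
df <- read.table(header = TRUE, as.is = TRUE, text = "factors subfactors
dog biscuit
dog chicken
dog chicken
cat milk
cat soup
dog pedigree")

# Split subfactors by factors, then drop duplicates within each group
foods <- lapply(split(df$subfactors, df$factors), unique)

# Collapse each group into a single comma-separated string
collapsed <- vapply(foods, paste, character(1), collapse = ",")

# Print one "name: items" line per group
cat(paste0(names(collapsed), ": ", collapsed), sep = "\n")
# cat: milk,soup
# dog: biscuit,chicken,pedigree
```

Note that split() orders the groups alphabetically, so cat comes before dog here rather than in the order they appear in the data.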

Answer 2 (score: 1)

Using sqldf:

#reproducible data
df <- read.table(text = "factors subfactors
dog          biscuit
dog          chicken
dog          chicken
cat          milk
cat          soup
dog          pedigree",header=TRUE,as.is=TRUE)

library(sqldf)

sqldf("SELECT factors, GROUP_CONCAT(subfactors) AS Food
       FROM (SELECT DISTINCT factors, subfactors
              FROM df)
       GROUP BY factors")

#output
#   factors                     Food
# 1     cat                milk,soup
# 2     dog biscuit,chicken,pedigree

Answer 3 (score: 0)

For list output of subfactors:

aggregate(subfactors ~ factors, df, FUN = unique)

##   factors                 subfactors
## 1     cat                 milk, soup
## 2     dog biscuit, chicken, pedigree

Or using dplyr:

library(dplyr)
df %>%
  group_by(factors) %>% # grouping variable
  summarise(subfactors = paste(unique(subfactors), collapse = ","))