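Suppose I have the following data frame of factors and their subfactors. How can I extract information such as the foods the dog eats and the foods the cat eats?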
factors subfactors
dog biscuit
dog chicken
dog chicken
cat milk
cat soup
dog pedigree
Expected output:
dog: biscuit,chicken,pedigree
cat: milk,soup
Answer 0 (score: 2)
You could do
library(data.table)
setDT(df)[, list(Ate = paste(unique(subfactors), collapse = ", ")), by = factors]
## factors Ate
## 1: dog biscuit, chicken, pedigree
## 2: cat milk, soup
A more data.table-idiomatic way of doing this is
setDT(df)[, lapply(.SD, function(x) paste(unique(x), collapse = ", ")), by = factors]
You can also do this in base R (though data.table is generally preferable if you have a large data set)
aggregate(subfactors ~ factors, df, function (x) paste(unique(x), collapse = ", "))
## factors subfactors
## 1 cat milk, soup
## 2 dog biscuit, chicken, pedigree
Answer 1 (score: 1)
Do you mean basic indexing?
df <- read.table(header = TRUE, text = 'factors subfactors
dog biscuit
dog chicken
cat milk
cat soup
dog pedigree')
df[df$factors == 'dog', "subfactors"]
df[df$factors == 'cat', "subfactors"]
Or perhaps split into a list:
split(df$subfactors, df$factors)
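The split() result still contains duplicates when the data does (the question's data lists "dog chicken" twice), and it is a list rather than the one-line-per-animal text the question asks for. A minimal sketch of one way to bridge that gap, assuming the question's data with character columns; the names foods and out are only illustrative:

```r
# The question's data, including the duplicated "dog chicken" row
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "factors subfactors
dog biscuit
dog chicken
dog chicken
cat milk
cat soup
dog pedigree")

# Split subfactors by factors, then drop duplicates within each group
foods <- lapply(split(df$subfactors, df$factors), unique)

# Collapse each group into a single "name: a,b,c" string
out <- sprintf("%s: %s", names(foods),
               vapply(foods, paste, character(1), collapse = ","))
writeLines(out)
```

Note that split() orders the groups by sorted factor level, so cat is printed before dog here.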
Answer 2 (score: 1)
Using sqldf:
#reproducible data
df <- read.table(text = "factors subfactors
dog biscuit
dog chicken
dog chicken
cat milk
cat soup
dog pedigree",header=TRUE,as.is=TRUE)
library(sqldf)
sqldf("SELECT factors, GROUP_CONCAT(subfactors) AS Food
FROM (SELECT DISTINCT factors, subfactors
FROM df)
GROUP BY factors")
#output
# factors Food
# 1 cat milk,soup
# 2 dog biscuit,chicken,pedigree
Answer 3 (score: 0)
List output of subfactors:
aggregate(subfactors~factors,df,FUN=unique)
factors subfactors
1 cat milk, soup
2 dog biscuit, chicken, pedigree
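Although the printed result looks like comma-separated text, the subfactors column here is expected to be a list of vectors rather than a single string per row, because the groups have different lengths. A small sketch, assuming the question's data with character columns, that checks this and collapses the list into strings; the names res and Food are only illustrative:

```r
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "factors subfactors
dog biscuit
dog chicken
dog chicken
cat milk
cat soup
dog pedigree")

res <- aggregate(subfactors ~ factors, df, FUN = unique)

# The groups differ in length, so the column is kept as a list
is.list(res$subfactors)

# Collapse each list element into one comma-separated string
res$Food <- vapply(res$subfactors, paste, character(1), collapse = ",")
res[, c("factors", "Food")]
```

Keeping the list column is handy when the values are needed individually later; collapsing is only necessary for display.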
Or using dplyr:
library(dplyr)
df %>%
group_by(factors) %>% #grouping variable
summarise(subfactors=paste(unique(subfactors),collapse=","))