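I need to find out how many times a value occurs in a column of a data frame.
Specifically, I want the number of occurrences of each string in one column, grouped by the value of another column.
For example: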
df <- data.frame(
  fruits   = c("apples", "apples", "orange", "pears", "apples", "pears", "pears", "papaya", "papaya"),
  veggies  = c("beans", "carrots", "carrots", "carrots", "brinjal", "carrots", "brinjal", "brinjal", "beans"),
  branches = c("Area1", "Area1", "Area1", "Area2", "Area2", "Area2", "Area2", "Area3", "Area3"))
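This is my data frame. I need to know the count of each fruit or veggie per branch, based on the branches column.
When I use table(df$fruits)
the output is:
apples-3 orange-1 papaya-2 pears-3
This gives the total for apples and the other fruits across all branches. I need the exact count for each branch.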
My desired output, grouped by the column df$branches, is:
for Area1
apples-2 orange-1
for Area2
pears-3 apples-1
for Area3
papaya-2
Answer 0 (score: 1)
Try this:
library(data.table)
setDT(df)[,list(count=.N),list(branches, fruits)]
#    branches fruits count
# 1:    Area1 apples     2
# 2:    Area1 orange     1
# 3:    Area2  pears     3
# 4:    Area2 apples     1
# 5:    Area3 papaya     2
Answer 1 (score: 1)
Maybe just use ftable:
> ftable(fruits ~ branches, data = df)
         fruits apples orange papaya pears
branches
Area1                2      1      0     0
Area2                1      0      0     3
Area3                0      0      2     0
> ftable(veggies ~ branches, data = df)
         veggies beans brinjal carrots
branches
Area1                1       0       2
Area2                0       2       2
Area3                1       1       0
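For comparison, base R's table() also accepts two vectors and returns a two-way contingency table, so no extra package is needed. The following is only a minimal sketch using the sample data from the question; the names fruit_counts and long are made up for illustration.

```r
# Sample data from the question
df <- data.frame(
  fruits   = c("apples", "apples", "orange", "pears", "apples",
               "pears", "pears", "papaya", "papaya"),
  veggies  = c("beans", "carrots", "carrots", "carrots", "brinjal",
               "carrots", "brinjal", "brinjal", "beans"),
  branches = c("Area1", "Area1", "Area1", "Area2", "Area2",
               "Area2", "Area2", "Area3", "Area3"))

# Two-way contingency table: rows are branches, columns are fruits
fruit_counts <- table(df$branches, df$fruits)
print(fruit_counts)

# Counts for a single branch
print(fruit_counts["Area1", ])

# Long format (one row per branch/fruit pair), dropping zero counts.
# 'long' and its column names are illustrative choices.
long <- as.data.frame(fruit_counts, stringsAsFactors = FALSE)
names(long) <- c("branches", "fruits", "count")
long <- long[long$count > 0, ]
print(long)
```

Swapping df$fruits for df$veggies gives the same breakdown for vegetables.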
Answer 2 (score: 0)
I'm not sure what output format you expect, but you can get the counts with the dplyr package.
For example:
library(dplyr)
df %>% count(fruits, branches)
# OR
count(df, fruits, branches)
Output:
Source: local data frame [5 x 3]
Groups: fruits

  fruits branches n
1 apples    Area1 2
2 apples    Area2 1
3 orange    Area1 1
4 papaya    Area3 2
5  pears    Area2 3
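If the goal is the exact "fruit-count" layout from the question, one possible approach is to split the fruits by branch and tabulate each group separately. This is a rough sketch in base R, not the method of any answer above; by_branch and summary_lines are hypothetical names, and fruits come out in alphabetical order within each branch rather than in the order the question lists them.

```r
# Sample data from the question (only the columns needed here)
df <- data.frame(
  fruits   = c("apples", "apples", "orange", "pears", "apples",
               "pears", "pears", "papaya", "papaya"),
  branches = c("Area1", "Area1", "Area1", "Area2", "Area2",
               "Area2", "Area2", "Area3", "Area3"))

# Split fruits by branch, then count within each branch.
# as.character() avoids zero-count entries if fruits is a factor.
by_branch <- lapply(split(as.character(df$fruits), df$branches), table)

# Collapse each per-branch table into a "name-count name-count" string
summary_lines <- vapply(
  by_branch,
  function(tab) paste0(names(tab), "-", as.vector(tab), collapse = " "),
  character(1))

# Print one line per branch
cat(paste0("for ", names(summary_lines), ": ", summary_lines), sep = "\n")
```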