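Counting values within levels

Time: 2017-03-01 21:02:20

Tags: r intervals r-factor

I am using cut in R to generate a set of levels, for example for fractional values between 0 and 1, split into intervals of width 0.1: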

> frac <- cut(c(0, 1), breaks=10)
> levels(frac)
[1] "(-0.001,0.1]" "(0.1,0.2]"    "(0.2,0.3]"    "(0.3,0.4]"    "(0.4,0.5]"
[6] "(0.5,0.6]"    "(0.6,0.7]"    "(0.7,0.8]"    "(0.8,0.9]"    "(0.9,1]"

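If a vector v contains continuous values in [0.0, 1.0], how do I compute the frequency of the elements of v that fall into each level of levels(frac)?

I may customize the number of breaks and/or the interval over which I build the levels, so I am looking for a way to do this with standard R commands, so that I can build a two-column data frame: one column with the levels as a factor, and a second column with the fraction or percentage of the total elements of v that fall in each level.

Note: the following does not work: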

> table(frac)
frac
(-0.001,0.1]    (0.1,0.2]    (0.2,0.3]    (0.3,0.4]    (0.4,0.5]    (0.5,0.6]
           1            0            0            0            0            0
   (0.6,0.7]    (0.7,0.8]    (0.8,0.9]      (0.9,1]
           0            0            0            1

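If I use cut directly on v, I do not get the same levels when I run cut on different vectors, because the range of values (their minimum and maximum) differs between arbitrary vectors. So although I may have the same number of breaks, the level intervals will not be the same.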
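As a small illustration of this problem (the vectors a and b are made up for the example and are not from the original post), cut with a fixed number of breaks derives its intervals from each vector's own range:

```r
# Two made-up vectors with different ranges
a <- c(0.20, 0.35, 0.50)
b <- c(0.10, 0.60, 0.90)

# Same number of breaks, but the intervals come from each vector's own range
levels_a <- levels(cut(a, breaks = 3))
levels_b <- levels(cut(b, breaks = 3))

print(levels_a)
print(levels_b)
print(identical(levels_a, levels_b))  # the two sets of levels do not match
```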

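My goal is to take different vectors and bin them into the same set of levels. I hope this helps clarify my question. Thanks for your help.

5 answers:

Answer 0: (score: 2)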

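Modify frac so that it actually represents your desired intervals, then use the table function:

x = runif(100) # For example.
# include.lowest = TRUE keeps a value of exactly 0 in the first bin instead of NA
frac = cut(x, breaks = seq(0, 1, 0.1), include.lowest = TRUE)
table(frac)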
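Building on this approach, here is a minimal sketch of the two-column data frame the question asks for, assuming fixed breaks from 0 to 1 in steps of 0.1. The helper name level_freq and the column names level and fraction are illustrative choices, not part of the original answer.

```r
# Hypothetical helper: bin any vector into the same fixed levels and
# return each level together with its share of the elements of v
level_freq <- function(v, breaks = seq(0, 1, by = 0.1)) {
  # include.lowest = TRUE puts a value equal to the lowest break in the first bin
  f <- cut(v, breaks = breaks, include.lowest = TRUE)
  counts <- table(f)  # empty levels are kept with a count of zero
  data.frame(
    level    = factor(names(counts), levels = levels(f)),
    fraction = as.vector(counts) / length(v)
  )
}

set.seed(42)  # arbitrary seed, only for reproducibility
res <- level_freq(runif(100))
print(res)
```

Because the breaks are passed explicitly, every vector is binned into the same levels regardless of its own minimum and maximum. Values outside the break range become NA and are not counted, so in that case the fractions would sum to less than 1.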

Answer 1: (score: 2)

Prepend the extreme values c(0, 1) to v, then use the same cut:

library(dplyr)

#dummy data
set.seed(1)
v <- round(runif(7), 2)

#result
data.frame(v,
           vFrac = cut(c(0, 1, v), breaks = 10)[-c(1, 2)]) %>% 
  group_by(vFrac) %>% 
  mutate(vFreq = n())

# Source: local data frame [7 x 3]
# Groups: vFrac [6]
# 
#        v        vFrac vFreq
#    <dbl>       <fctr> <int>
# 1   0.27    (0.2,0.3]     1
# 2   0.37    (0.3,0.4]     1
# 3   0.57    (0.5,0.6]     1
# 4   0.91      (0.9,1]     2
# 5   0.20    (0.1,0.2]     1
# 6   0.90    (0.8,0.9]     1
# 7   0.94      (0.9,1]     2
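The same trick also works without dplyr. The following base-R sketch, with made-up values for v, counts every level (including the empty ones) via table and converts the counts to fractions; the column names are illustrative.

```r
# Made-up example values in [0, 1]
v <- c(0.05, 0.27, 0.37, 0.57, 0.91, 0.94)

# Prepend 0 and 1 so the breaks span the full range, then drop those two entries
vFrac <- cut(c(0, 1, v), breaks = 10)[-c(1, 2)]

# table() keeps levels with a zero count because vFrac is a factor
counts <- table(vFrac)
res <- data.frame(level = names(counts), fraction = as.vector(counts) / length(v))
print(res)
```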

Answer 2: (score: 1)

Use findInterval instead of cut:

v<-data.frame(v=runif(100,0,1))

library(plyr)
v$x<-findInterval(v$v,seq(0,1,by=0.1))*0.1
ddply(v, .(x), summarize, n=length(x))
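One detail worth keeping in mind: findInterval uses left-closed intervals [a, b), so a value of exactly 1 lands in an extra 11th bin unless rightmost.closed = TRUE is set. A base-R sketch, with made-up values and illustrative column names, that turns the bin indices into per-interval fractions using tabulate:

```r
brks <- seq(0, 1, by = 0.1)
v <- c(0, 0.05, 0.27, 0.57, 0.94, 1)  # made-up example values

# Bin index 1..10 for each value; rightmost.closed = TRUE keeps 1 in the last bin
idx <- findInterval(v, brks, rightmost.closed = TRUE)

# tabulate() counts occurrences of each index and reports zeros for empty bins
counts <- tabulate(idx, nbins = length(brks) - 1)
res <- data.frame(lower = head(brks, -1), upper = brks[-1],
                  fraction = counts / length(v))
print(res)
```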

Answer 3: (score: 1)

frac = seq(0,1,by=0.1)

ranges = paste(head(frac,-1), frac[-1], sep=" - ")
freq   = hist(v, breaks=frac, include.lowest=TRUE, plot=FALSE)

data.frame(range = ranges, frequency = freq$counts)

Answer 4: (score: 1)
