I wish to subset a data frame in R. My syntax is evidently wrong (i.e. it produces erroneous results):
data[i,]$m_cnt <- nrow(w_data[
w_data$direction >= data[i,]$min_a &
w_data$direction < data[i,]$max_a &
w_data$windspeed >= 3 &
w_data$windspeed < 15,
])/records;
Similar question: Filtering a data.frame
The w_data data.frame (simplified for brevity) consists of wind speed and wind direction time series data.
time_stamp windspeed direction
2010-06-01 00:00 12.2 125
2010-06-03 02:50 17.4 312
2010-06-05 21:30 2.1 132
2010-06-12 15:10 7.8 71
2010-06-22 17:40 2.6 307
2010-06-30 03:20 5.1 310
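To experiment with the statements in this question, the sample can be rebuilt as below. This is a minimal sketch: `records` is assumed to be the total number of rows (the question uses it without defining it), and `in_bin` is a hypothetical helper name introduced only for this illustration.

```r
# Sample data copied from the table above.
w_data <- data.frame(
  time_stamp = c("2010-06-01 00:00", "2010-06-03 02:50", "2010-06-05 21:30",
                 "2010-06-12 15:10", "2010-06-22 17:40", "2010-06-30 03:20"),
  windspeed  = c(12.2, 17.4, 2.1, 7.8, 2.6, 5.1),
  direction  = c(125, 312, 132, 71, 307, 310),
  stringsAsFactors = FALSE
)

# Assumption: records is the total number of measurements.
records <- nrow(w_data)

# The m_cnt condition from the first statement, for the 120-135 degree bin.
in_bin <- w_data$direction >= 120 & w_data$direction < 135 &
          w_data$windspeed >= 3 & w_data$windspeed < 15
sum(in_bin) / records  # 1 record out of 6
```

Running `str(w_data)` afterwards shows the column types; the `>=` and `<` comparisons only compare numbers when `windspeed` and `direction` are numeric.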
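The R statement above should count the number of records within a specific wind direction range, e.g. >=120° and <135°, and within a certain wind speed range, in this example >=3m/s and <15m/s. The count is then converted to a percentage of the total number of measurements taken, so the example above should equal 1 record out of 6 = 16.66%. The percentage is then recorded in another data.frame (data) with the following structure:
min_a max_a l_cnt m_cnt h_cnt
0 15 0 0 0
15 30 0 0 0
30 45 0 0 0
45 60 0 0 0
60 75 0 0.1666 0
75 90 0 0 0
90 105 0 0 0
105 120 0 0 0
120 135 0.1666 0.1666 0
135 150 0 0 0
150 165 0 0 0
165 180 0 0 0
180 195 0 0 0
195 210 0 0 0
210 225 0 0 0
225 240 0 0 0
240 255 0 0 0
255 270 0 0 0
270 285 0 0 0
285 300 0 0 0
300 315 0.1666 0.1666 0.1666
315 330 0 0 0
330 345 0 0 0
345 360 0 0 0
The problem I am experiencing is that the sum of all percentages does not equal 100% (it does for this example, but not for my script with over 10,000 records).
I have also experienced strange results.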
For example, these statements:
data[i,]$l_cnt <- nrow(w_data[
w_data$direction >= data[i,]$min_a &
w_data$direction < data[i,]$max_a &
w_data$windspeed <= 3,
])/records;
data[i,]$m_cnt <- nrow(w_data[
w_data$direction >= data[i,]$min_a &
w_data$direction < data[i,]$max_a &
w_data$windspeed <= 15,
])/records;
data[i,]$h_cnt <- nrow(w_data[
w_data$direction >= data[i,]$min_a &
w_data$direction < data[i,]$max_a &
w_data$windspeed > 15,
])/records;
produce the totals:
l_cnt 0,360637343
m_cnt 0,187836625
h_cnt 0,811938959
total 1,360412926
However, if I use both a greater-than and a less-than qualification for the m_cnt calculation, i.e.:
data[i,]$m_cnt <- nrow(w_data[
w_data$direction >= data[i,]$min_a &
w_data$direction < data[i,]$max_a &
w_data$windspeed >= 3 &
w_data$windspeed < 15,
])/records;
I get a different result.
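A side note on the l_cnt, m_cnt and h_cnt statements above: as written, the wind speed conditions `<= 3` and `<= 15` overlap, so every low-speed record is counted twice and the three proportions cannot be expected to sum to 1. The sketch below illustrates this on the six-record sample only; the names `overlapping` and `partitioned` are hypothetical, and `records` is assumed to be the total number of measurements.

```r
# Wind speeds from the six-record sample in the question.
windspeed <- c(12.2, 17.4, 2.1, 7.8, 2.6, 5.1)
records <- length(windspeed)

# Conditions as written in the three statements: <= 3, <= 15, > 15.
# Every record counted by "<= 3" is counted again by "<= 15".
overlapping <- c(
  l_cnt = sum(windspeed <= 3),
  m_cnt = sum(windspeed <= 15),
  h_cnt = sum(windspeed > 15)
) / records

# Half-open bands [0, 3), [3, 15), [15, Inf): each record falls in exactly one.
partitioned <- c(
  l_cnt = sum(windspeed < 3),
  m_cnt = sum(windspeed >= 3 & windspeed < 15),
  h_cnt = sum(windspeed >= 15)
) / records

sum(overlapping)  # 8/6, i.e. more than 1
sum(partitioned)  # 1 for this sample
```

This may not account for every figure in the quoted totals: with numeric columns, a `<= 15` count can never be smaller than the `<= 3` count, so an m_cnt total below l_cnt suggests that checking the column types with `str(w_data)` is also worthwhile.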
Answer 0 (score: 3)
This is probably close to what you want:
> # data
> w_data
windspeed direction
1 12.2 125
2 17.4 312
3 2.1 132
4 7.8 71
5 2.6 307
6 5.1 310
> # grouping by cut
> w_data <- transform(w_data,
+ dg = cut(direction, breaks=0:24*15),
+ wg = cut(windspeed, breaks=c(0, 3, 15, Inf)))
> # now the data looks like:
> w_data
windspeed direction dg wg
1 12.2 125 (120,135] (3,15]
2 17.4 312 (300,315] (15,Inf]
3 2.1 132 (120,135] (0,3]
4 7.8 71 (60,75] (3,15]
5 2.6 307 (300,315] (0,3]
6 5.1 310 (300,315] (3,15]
> # tabulate and calculate the percentage
> table(w_data$dg, w_data$wg) / nrow(w_data)
(0,3] (3,15] (15,Inf]
(0,15] 0.0000000 0.0000000 0.0000000
(15,30] 0.0000000 0.0000000 0.0000000
(30,45] 0.0000000 0.0000000 0.0000000
(45,60] 0.0000000 0.0000000 0.0000000
(60,75] 0.0000000 0.1666667 0.0000000
(75,90] 0.0000000 0.0000000 0.0000000
(90,105] 0.0000000 0.0000000 0.0000000
(105,120] 0.0000000 0.0000000 0.0000000
(120,135] 0.1666667 0.1666667 0.0000000
(135,150] 0.0000000 0.0000000 0.0000000
(150,165] 0.0000000 0.0000000 0.0000000
(165,180] 0.0000000 0.0000000 0.0000000
(180,195] 0.0000000 0.0000000 0.0000000
(195,210] 0.0000000 0.0000000 0.0000000
(210,225] 0.0000000 0.0000000 0.0000000
(225,240] 0.0000000 0.0000000 0.0000000
(240,255] 0.0000000 0.0000000 0.0000000
(255,270] 0.0000000 0.0000000 0.0000000
(270,285] 0.0000000 0.0000000 0.0000000
(285,300] 0.0000000 0.0000000 0.0000000
(300,315] 0.1666667 0.1666667 0.1666667
(315,330] 0.0000000 0.0000000 0.0000000
(330,345] 0.0000000 0.0000000 0.0000000
(345,360] 0.0000000 0.0000000 0.0000000
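One detail to be aware of: by default `cut()` builds intervals that are open on the left and closed on the right, such as (120,135], whereas the question asks for >= on the lower bound and < on the upper bound. With the defaults, a direction of exactly 0 or a wind speed of exactly 0 would fall outside every interval and become NA. A possible variant, offered as a sketch rather than as the answerer's method, passes `right = FALSE`; the name `pct` is introduced here for illustration.

```r
w_data <- data.frame(
  windspeed = c(12.2, 17.4, 2.1, 7.8, 2.6, 5.1),
  direction = c(125, 312, 132, 71, 307, 310)
)

# right = FALSE closes each interval on the left: [0,15), [15,30), ...
# so a direction of exactly 120 lands in [120,135), as in the question.
w_data <- transform(w_data,
  dg = cut(direction, breaks = 0:24 * 15, right = FALSE),
  wg = cut(windspeed, breaks = c(0, 3, 15, Inf), right = FALSE))

# Proportion of all records in each direction/speed cell.
pct <- table(w_data$dg, w_data$wg) / nrow(w_data)

# Values outside the breaks (for example a direction of exactly 360)
# become NA and are dropped by table(), so compare this against 1.
sum(pct)
```

If `sum(pct)` is below 1 on the full data set, some records fell outside the breaks; `sum(is.na(w_data$dg))` and `sum(is.na(w_data$wg))` show how many.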