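R - Classify data based on multiple conditions, one of which is a datetime

Asked: 2018-12-13 03:50:49

Tags: r classification

I am trying to classify my data based on several conditions applied to it. One of the conditions is a datetime (m/d/y hh:mm). The classification should distinguish day from night, i.e. day is 07:00 to 19:00 and night is 19:01 to 06:59, and it should also take the season into account, based on the m/d/y date. Is it possible to create a new variable that classifies rows on multiple conditions, e.g. numeric data (temperature, humidity), factors ("strong wind", "moderate wind"), and a datetime?

Here is the structure of my data.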

$ T       : int  11 11 13 13 14 16 17 17 18 18 ...
$ P0      : num  700 699 700 699 699 ...
$ P       : num  764 763 763 763 762 ...
$ U       : int  54 58 47 47 41 36 34 37 34 34 ...
$ DD      : Factor w/ 18 levels "","Calm","Wind blowing from the east",..: 17 17 9 17 9 9 9 9 9 10 ...
$ Ff      : int  5 3 4 4 4 5 4 6 7 7 ...
$ ff10    : int  NA NA NA NA NA NA NA NA 11 10 ...
$ WW      : Factor w/ 3 levels "","In the vicinity thunderstorm",..: 1 1 1 1 1 1 1 1 1 1 ...
$ W.W.    : logi  NA NA NA NA NA NA ...
$ c       : Factor w/ 245 levels "Broken clouds (60-90%) 1020 m",..: 32 154 86 86 151 154 216 124 86 86 ...
$ VV      : num  16 16 NA NA 16 16 16 16 16 16 ...
$ Td      : int  2 3 NA NA 1 1 1 2 2 2 ...
$ datetime: chr  "9/30/14 23:00" "9/30/14 22:00" "9/30/14 21:00" "9/30/14 20:00" ...
$ T_g_5   : num  12.8 13.4 14.1 14.9 16 17.2 18 19.1 19.9 19.9 ...
$ T_g_20  : num  16.3 16.5 16.7 16.8 16.8 16.8 16.7 16.3 16 15.4 ...
$ T_g_35  : num  17.3 17.2 17.3 17.3 17.3 17 17.2 17 17 16.7 ...
$ T_g_50  : num  17.5 17.5 17.5 17.5 17.5 17.5 17.7 17.7 17.7 17.7 ...
$ T_g_75  : num  18.6 18.6 18.6 18.6 18.8 18.9 18.9 18.9 18.9 18.9 ...
$ s_m_5   : num  0.182 0.184 0.184 0.187 0.185 0.192 0.193 0.19 0.193 0.195 ...
$ s_m_20  : num  0.209 0.205 0.207 0.206 0.202 0.201 0.195 0.195 0.195 0.19 ...
$ s_m_35  : num  0.142 0.142 0.142 0.146 0.144 0.143 0.146 0.146 0.146 0.146 ...
$ s_m_50  : num  0.149 0.149 0.151 0.146 0.149 0.146 0.144 0.144 0.149 0.149 ...
$ s_m_75  : num  0.139 0.144 0.144 0.144 0.144 0.144 0.144 0.142 0.142 0.142 ...

I tried to classify the data with the following code.

DF$pest[DF$T <= 15 & DF$T > 10 & DF$U >= 50 & DF$U < 75 &
        DF$datetime >= "9/27/14 13:00" & DF$datetime < "9/27/14 16:00"] <- "Threat"

The code above gives "NA" in my pest column. Is there another way to classify the data above? Thanks.

dput output

 > dput(smalldata)
structure(list(X.x = 1:4, T = c(11L, 11L, 13L, 13L), P0 = c(699.6, 
699.4, 699.6, 699.4), P = c(763.5, 763.3, 763, 762.8), U = c(54L, 
58L, 47L, 47L), DD = structure(c(17L, 17L, 9L, 17L), .Label = c("", 
"Calm", "Wind blowing from the east", "Wind blowing from the east-northeast", 
"Wind blowing from the east-southeast", "Wind blowing from the north", 
"Wind blowing from the north-east", "Wind blowing from the north-northeast", 
"Wind blowing from the north-northwest", "Wind blowing from the north-west", 
"Wind blowing from the south", "Wind blowing from the south-east", 
"Wind blowing from the south-southeast", "Wind blowing from the south-southwest", 
"Wind blowing from the south-west", "Wind blowing from the west", 
"Wind blowing from the west-northwest", "Wind blowing from the west-southwest"
), class = "factor"), Ff = c(5L, 3L, 4L, 4L), ff10 = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_), WW = structure(c(1L, 
1L, 1L, 1L), .Label = c("", "In the vicinity thunderstorm", "Thunderstorm"
), class = "factor"), W.W. = c(NA, NA, NA, NA), VV = c(16, 16, 
NA, NA), Td = c(2L, 3L, NA, NA), datetime = c("9/30/14 23:00", 
"9/30/14 22:00", "9/30/14 21:00", "9/30/14 20:00"), T_g_5 = c(12.8, 
13.4, 14.1, 14.9), T_g_20 = c(16.3, 16.5, 16.7, 16.8), T_g_35 = c(17.3, 
17.2, 17.3, 17.3), T_g_50 = c(17.5, 17.5, 17.5, 17.5), T_g_75 = c(18.6, 
18.6, 18.6, 18.6), s_m_5 = c(0.182, 0.184, 0.184, 0.187), s_m_20 = c(0.209, 
0.205, 0.207, 0.206), s_m_35 = c(0.142, 0.142, 0.142, 0.146), 
    s_m_50 = c(0.149, 0.149, 0.151, 0.146), s_m_75 = c(0.139, 
    0.144, 0.144, 0.144), X.y = c(NA, NA, NA, NA), pest = c(NA_character_, 
    NA_character_, NA_character_, NA_character_)), .Names = c("X.x", 
"T", "P0", "P", "U", "DD", "Ff", "ff10", "WW", "W.W.", "VV", 
"Td", "datetime", "T_g_5", "T_g_20", "T_g_35", "T_g_50", "T_g_75", 
"s_m_5", "s_m_20", "s_m_35", "s_m_50", "s_m_75", "X.y", "pest"
), row.names = c(NA, 4L), class = "data.frame")

1 Answer:

Answer 0 (score: 0)

Try breaking it down into its individual conditions; then you can see more easily where the problem is.
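As a rough sketch of what "breaking it down" could look like, the block below evaluates each part of the filter separately and counts the rows that pass it. The small `DF` is a stand-in rebuilt from the first four rows of the dput output; the names `cond_T`, `cond_U`, `cond_time` and `hits` are made up for illustration.

```r
# Stand-in for the poster's data frame, using values from the dput output.
DF <- data.frame(
  T = c(11L, 11L, 13L, 13L),
  U = c(54L, 58L, 47L, 47L),
  datetime = c("9/30/14 23:00", "9/30/14 22:00",
               "9/30/14 21:00", "9/30/14 20:00"),
  stringsAsFactors = FALSE
)

# Evaluate each part of the original filter on its own.
cond_T    <- DF$T <= 15 & DF$T > 10
cond_U    <- DF$U >= 50 & DF$U < 75
# datetime is still a character column here, so this compares strings, not times.
cond_time <- DF$datetime >= "9/27/14 13:00" & DF$datetime < "9/27/14 16:00"

# Count how many rows satisfy each condition, and all of them together.
hits <- c(T = sum(cond_T), U = sum(cond_U), time = sum(cond_time),
          all = sum(cond_T & cond_U & cond_time))
print(hits)
```

A condition whose count is zero is the one keeping "Threat" from ever being assigned, which narrows down where to look.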

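That said, I can see one problem right away: your dates are stored as character, so the comparison is done on strings rather than on points in time. Does it work if you do this?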

as.POSIXct(DF$datetime, format='%m/%d/%y %H:%M') >= 
as.POSIXct("9/27/14 13:00", format='%m/%d/%y %H:%M')
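Building on the conversion above, here is a minimal sketch of how the whole classification might be put together once the column is parsed. Several things in it are assumptions rather than part of the question: the time zone (`tz = "UTC"`), the time window (moved to 9/30/14 so that the sample rows can match), the Northern Hemisphere month-based seasons, and the new column names `dt`, `day_night` and `season`.

```r
# Stand-in for the poster's data frame, using values from the dput output.
DF <- data.frame(
  T = c(11L, 11L, 13L, 13L),
  U = c(54L, 58L, 47L, 47L),
  datetime = c("9/30/14 23:00", "9/30/14 22:00",
               "9/30/14 21:00", "9/30/14 20:00"),
  stringsAsFactors = FALSE
)

fmt <- "%m/%d/%y %H:%M"
# Assumption: timestamps are treated as UTC; use the station's real time zone.
DF$dt <- as.POSIXct(DF$datetime, format = fmt, tz = "UTC")

# Hypothetical window, chosen so that some of the sample rows fall inside it.
start <- as.POSIXct("9/30/14 21:30", format = fmt, tz = "UTC")
end   <- as.POSIXct("9/30/14 23:30", format = fmt, tz = "UTC")

# Same temperature and humidity limits as in the question, now with real datetimes.
is_threat <- DF$T <= 15 & DF$T > 10 & DF$U >= 50 & DF$U < 75 &
  DF$dt >= start & DF$dt < end
DF$pest <- NA_character_
DF$pest[is_threat] <- "Threat"

# Day/night: day is 07:00 to 19:00 inclusive, everything else is night.
lt <- as.POSIXlt(DF$dt)
minutes <- lt$hour * 60 + lt$min
DF$day_night <- ifelse(minutes >= 7 * 60 & minutes <= 19 * 60, "Day", "Night")

# Season by calendar month (assumption: Northern Hemisphere, Dec-Feb = Winter).
season_by_month <- c("Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
                     "Summer", "Summer", "Autumn", "Autumn", "Autumn", "Winter")
DF$season <- season_by_month[lt$mon + 1]

print(DF[, c("datetime", "pest", "day_night", "season")])
```

The derived `day_night` and `season` columns can then be combined with the numeric and factor conditions using `&`, in the same way as the datetime window.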