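I have 4 sensors, T1, T2, T3 and T4, that acquire data periodically. Each sensor sits at a different radial and angular position. The data are stored as follows: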
test_data <- data.frame(T1=c(0,0,rnorm(3),NA,rnorm(6)), T2=c(1,0,NA,NA,rnorm(8)), T3=c(0,2*pi,rnorm(9),NA), T4=c(1,2*pi,rnorm(1),NA,rnorm(8)))
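That is, in the column holding the observations of sensor i, the first two rows contain that sensor's radius and angular position. The remaining rows store the measurements taken by the sensor. Some measurements may be missing (NA).
I want a data frame in tidy format, so each column should contain only one variable. This means the radius and the angle must each get their own column. The sensor measurements can then no longer be stored in parallel columns; they must be stored serially, one row per reading. That second step is simple: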
library(tidyr)
test_data <- test_data %>% gather(Sensor, Temperature)
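However, I am not sure what the most idiomatic way to perform the first step is. Of course I could use a for loop, but I wonder whether there is a more idiomatic way.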
Answer 0 (score: 4)
Maybe something like this:
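library(dplyr)
library(tidyr)

# Rows 1-2 hold each sensor's radius and angle; transpose them into one row per sensor
angle <- test_data %>%
  slice(1:2) %>%
  t %>%
  data.frame %>%
  setNames(., c("Radial", "Angular")) %>%
  tibble::rownames_to_column("Sensor")

# Reshape the remaining measurement rows, then attach the positions by sensor name
test_data %>%
  slice(3:nrow(.)) %>%
  gather(Sensor, Temperature) %>%
  left_join(angle, by = "Sensor")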
Or, as suggested in the comments, more idiomatically:
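test_data %>%
  gather(Sensor, Temperature) %>%
  group_by(Sensor) %>%
  # Within each sensor, the first value is the radius and the second is the angle
  mutate(r = first(Temperature), theta = nth(Temperature, 2)) %>%
  # Drop the two position rows, keeping only the measurements
  slice(3:n())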
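In tidyr 1.0.0 and later, gather is superseded by pivot_longer. The following is a minimal sketch of the same idea with that function, not part of the original answer. The data frame toy and the names positions and tidy are hypothetical, chosen only to keep the example small; toy follows the same layout as test_data (row 1 is the radius, row 2 the angle, the rest are measurements).

```r
library(dplyr)
library(tidyr)

# Hypothetical toy input in the same layout as test_data
toy <- data.frame(T1 = c(0, 0, 1.5, NA),
                  T2 = c(1, pi, 2.5, 3.5))

# One row per sensor with its radius (row 1) and angle (row 2)
positions <- data.frame(Sensor  = names(toy),
                        Radial  = unlist(toy[1, ], use.names = FALSE),
                        Angular = unlist(toy[2, ], use.names = FALSE))

# Reshape only the measurement rows, then attach the positions
tidy <- toy[-(1:2), ] %>%
  pivot_longer(everything(), names_to = "Sensor", values_to = "Temperature") %>%
  left_join(positions, by = "Sensor")

print(tidy)
```

Splitting the positions off before reshaping keeps the Temperature column free of values that are not temperatures at any point in the pipeline.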
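Answer 1 (score: 1)
This is a bit late, but I don't believe you need slice and join. Just transpose the data and add a Sensor column built from the column names of the original data: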
test_data <- data.frame(t(test_data), Sensor=colnames(test_data))
Then you can gather. Here, for convenience, I have added a column name for each measurement:
library(dplyr)
library(tidyr)
test_data %>% setNames(c("Radial", "Azimuth", paste0("M",1:(ncol(test_data)-3)), "Sensor")) %>%
gather("Measurement", "Temperature", M1:M10)
For the following input, generated with set.seed(123):
test_data <- structure(list(T1 = c(0, 0, -0.560475646552213, -0.23017748948328,
1.55870831414912, NA, 0.070508391424576, 0.129287735160946, 1.71506498688328,
0.460916205989202, -1.26506123460653, -0.686852851893526), T2 = c(1,
0, NA, NA, -0.445661970099958, 1.22408179743946, 0.359813827057364,
0.400771450594052, 0.11068271594512, -0.555841134754075, 1.78691313680308,
0.497850478229239), T3 = c(0, 6.28318530717959, -1.96661715662964,
0.701355901563686, -0.472791407727934, -1.06782370598685, -0.217974914658295,
-1.02600444830724, -0.72889122929114, -0.625039267849257, -1.68669331074241,
NA), T4 = c(1, 6.28318530717959, 0.837787044494525, NA, 0.153373117836515,
-1.13813693701195, 1.25381492106993, 0.426464221476814, -0.295071482992271,
0.895125661045022, 0.878133487533042, 0.821581081637487)), .Names = c("T1",
"T2", "T3", "T4"), row.names = c(NA, -12L), class = "data.frame")
## T1 T2 T3 T4
##1 0.00000000 1.0000000 0.0000000 1.0000000
##2 0.00000000 0.0000000 6.2831853 6.2831853
##3 -0.56047565 NA -1.9666172 0.8377870
##4 -0.23017749 NA 0.7013559 NA
##5 1.55870831 -0.4456620 -0.4727914 0.1533731
##6 NA 1.2240818 -1.0678237 -1.1381369
##7 0.07050839 0.3598138 -0.2179749 1.2538149
##8 0.12928774 0.4007715 -1.0260044 0.4264642
##9 1.71506499 0.1106827 -0.7288912 -0.2950715
##10 0.46091621 -0.5558411 -0.6250393 0.8951257
##11 -1.26506123 1.7869131 -1.6866933 0.8781335
##12 -0.68685285 0.4978505 NA 0.8215811
you get:
## Radial Azimuth Sensor Measurement Temperature
##1 0 0.000000 T1 M1 -0.56047565
##2 1 0.000000 T2 M1 NA
##3 0 6.283185 T3 M1 -1.96661716
##4 1 6.283185 T4 M1 0.83778704
##5 0 0.000000 T1 M2 -0.23017749
##6 1 0.000000 T2 M2 NA
##7 0 6.283185 T3 M2 0.70135590
##8 1 6.283185 T4 M2 NA
##9 0 0.000000 T1 M3 1.55870831
##10 1 0.000000 T2 M3 -0.44566197
##11 0 6.283185 T3 M3 -0.47279141
##12 1 6.283185 T4 M3 0.15337312
##13 0 0.000000 T1 M4 NA
##14 1 0.000000 T2 M4 1.22408180
##15 0 6.283185 T3 M4 -1.06782371
##16 1 6.283185 T4 M4 -1.13813694
##17 0 0.000000 T1 M5 0.07050839
##18 1 0.000000 T2 M5 0.35981383
##19 0 6.283185 T3 M5 -0.21797491
##20 1 6.283185 T4 M5 1.25381492
##21 0 0.000000 T1 M6 0.12928774
##22 1 0.000000 T2 M6 0.40077145
##23 0 6.283185 T3 M6 -1.02600445
##24 1 6.283185 T4 M6 0.42646422
##25 0 0.000000 T1 M7 1.71506499
##26 1 0.000000 T2 M7 0.11068272
##27 0 6.283185 T3 M7 -0.72889123
##28 1 6.283185 T4 M7 -0.29507148
##29 0 0.000000 T1 M8 0.46091621
##30 1 0.000000 T2 M8 -0.55584113
##31 0 6.283185 T3 M8 -0.62503927
##32 1 6.283185 T4 M8 0.89512566
##33 0 0.000000 T1 M9 -1.26506123
##34 1 0.000000 T2 M9 1.78691314
##35 0 6.283185 T3 M9 -1.68669331
##36 1 6.283185 T4 M9 0.87813349
##37 0 0.000000 T1 M10 -0.68685285
##38 1 0.000000 T2 M10 0.49785048
##39 0 6.283185 T3 M10 NA
##40 1 6.283185 T4 M10 0.82158108
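For comparison, here is a rough sketch of the same reshaping in base R alone, without a for loop and without tidyr or dplyr. It is not from either answer. It relies on stack() to serialise the measurement columns and on name-based indexing to look up each sensor's position. The data frame toy and the variables meas, long, radius and angle are hypothetical names used only for illustration.

```r
# Hypothetical toy input: row 1 = radius, row 2 = angle, rest = measurements
toy <- data.frame(T1 = c(0, 0, 1.5, NA),
                  T2 = c(1, pi, 2.5, 3.5))

# Named vectors of positions, keyed by sensor name
radius <- unlist(toy[1, ])
angle  <- unlist(toy[2, ])

# Stack the measurement rows column by column into long form
meas <- toy[-(1:2), , drop = FALSE]
long <- stack(meas)                       # columns: values, ind
names(long) <- c("Temperature", "Sensor")
long$Sensor <- as.character(long$Sensor)

# Look up each row's sensor position by name
long$Radial  <- unname(radius[long$Sensor])
long$Angular <- unname(angle[long$Sensor])

print(long)
```

Unlike the gather output above, stack() orders the rows by sensor rather than by measurement index, and it does not record which measurement each row came from.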