I have tried several of the solutions suggested on this forum to solve my problem, but none of them worked.
Basically, I have a data frame:
Concentration Salinity Light.Dark Distance Velocity In_Center Freezing Cruising Bursting Clockwise CounterClockwise
<ord> <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 V 0.5 Dark 0.0612 0.0826 0.0638 0.0207 0.0124 0.00511 -0.0866 -0.0439
2 L 0.5 Dark 0.0360 0.0282 -0.166 -0.00475 0.148 -0.0328 -0.0337 0.0615
3 M 0.5 Dark -0.144 -0.147 0.00761 0.0405 -0.191 -0.00586 0.0772 -0.0123
4 H 0.5 Dark 0.0464 0.0362 0.0949 -0.0565 0.0306 0.0335 0.0431 -0.00527
I want to normalize the columns from Distance to CounterClockwise by subtracting the first row from every row.
I tried:
df_norm= df %>%
mutate_at(4:11, list(~ .- first(.)))
But it only returns 0:
Concentration Salinity Light.Dark Distance Velocity In_Center Freezing Cruising Bursting Clockwise CounterClockwise
<ord> <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 V 0.5 Dark 0 0 0 0 0 0 0 0
2 L 0.5 Dark 0 0 0 0 0 0 0 0
3 M 0.5 Dark 0 0 0 0 0 0 0 0
4 H 0.5 Dark 0 0 0 0 0 0 0 0
I also tried converting the tibble to a data frame with:
as.data.frame(df_norm)
But I got:
Concentration Salinity Light.Dark Distance Velocity In_Center Freezing Cruising Bursting Clockwise CounterClockwise
1 V 0.5 Dark 0 0 0 0 0 0 0 0
2 L 0.5 Dark 0 0 0 0 0 0 0 0
3 M 0.5 Dark 0 0 0 0 0 0 0 0
4 H 0.5 Dark 0 0 0 0 0 0 0 0
Here is the dput() output of my df:
structure(list(Concentration = structure(1:4, .Label = c("V",
"L", "M", "H"), class = c("ordered", "factor")), Salinity = structure(c(1L,
1L, 1L, 1L), .Label = c("0.5", "2", "6"), class = "factor"),
Light.Dark = structure(c(1L, 1L, 1L, 1L), .Label = c("Dark",
"ERROR", "Light"), class = "factor"), Distance = c(0.0611762417792624,
0.0359847599237893, -0.143596409795565, 0.0464354080925131
), Velocity = c(0.0825514600369596, 0.0282499048624341, -0.146998610001507,
0.0361972451021132), In_Center = c(0.06383139350079, -0.166302972291672,
0.00760502103176895, 0.0948665577591132), Freezing = c(0.0206958889309448,
-0.00474520212061713, 0.0405259034871347, -0.0564765902974621
), Cruising = c(0.0123684826368456, 0.148343102625951, -0.191335919657439,
0.0306243343946422), Bursting = c(0.00511229994076513, -0.0327935337663713,
-0.00586044139551122, 0.0335416752211175), Clockwise = c(-0.0865980448950217,
-0.0337007169788508, 0.077213035103443, 0.0430857267704295
), CounterClockwise = c(-0.0439324933628217, 0.0615054907079504,
-0.0123010981415901, -0.00527189920353861)), row.names = c(NA,
-4L), groups = structure(list(Concentration = structure(1:4, .Label = c("V",
"L", "M", "H"), class = c("ordered", "factor")), Salinity = structure(c(1L,
1L, 1L, 1L), .Label = c("0.5", "2", "6"), class = "factor"),
.rows = structure(list(1L, 2L, 3L, 4L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, 4L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
Any idea what I am doing wrong?
Thank you very much for your help!
Answer 0 (score: 0)
Based on the dput, this is a grouped dataset (grouped by Concentration and Salinity) with only one row per group:
df %>%
summarise(n = n())
# A tibble: 4 x 3
# Groups: Concentration [4]
# Concentration Salinity n
# <ord> <fct> <int>
#1 V 0.5 1
#2 L 0.5 1
#3 M 0.5 1
#4 H 0.5 1
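So, basically, each value is being subtracted from itself: within a one-row group, first(.) is that row's own value, so every difference is 0.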
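A quick way to check for this situation is to inspect the grouping before mutating. The sketch below uses a hypothetical toy tibble (`toy`, with made-up columns `id` and `x`) rather than the original df, and assumes dplyr is installed:

```r
library(dplyr)

# Hypothetical toy data standing in for the question's df:
# grouping by `id` leaves exactly one row per group.
toy <- tibble(id = c("a", "b", "c"), x = c(1, 4, 9)) %>%
  group_by(id)

is_grouped_df(toy)  # TRUE when the tibble carries grouping metadata
group_vars(toy)     # names of the grouping columns
n_groups(toy)       # number of groups; equals nrow(toy) here

# Within each one-row group, first(x) is x itself, so the result is all zeros.
grouped_result <- toy %>% mutate(x = x - first(x))
```

If `n_groups()` equals the number of rows, any "minus the first value" calculation will collapse to zero until the grouping is removed.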
If you want to do this across the whole dataset, ungroup first and then apply the code:
df %>%
ungroup %>%
mutate(across(4:11, ~ . - first(.)))
# alternatively, to take the difference on all numeric columns:
# mutate(across(where(is.numeric), ~ . - first(.)))
# A tibble: 4 x 11
# Concentration Salinity Light.Dark Distance Velocity In_Center Freezing Cruising Bursting Clockwise CounterClockwise
# <ord> <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 V 0.5 Dark 0 0 0 0 0 0 0 0
#2 L 0.5 Dark -0.0252 -0.0543 -0.230 -0.0254 0.136 -0.0379 0.0529 0.105
#3 M 0.5 Dark -0.205 -0.230 -0.0562 0.0198 -0.204 -0.0110 0.164 0.0316
#4 H 0.5 Dark -0.0147 -0.0464 0.0310 -0.0772 0.0183 0.0284 0.130 0.0387
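The same fix can be tried on a small example where the expected differences are easy to verify by hand. This is only a sketch on hypothetical toy data (`toy`, `id`, `x`, `y` are made-up names), and it assumes dplyr 1.0.0 or later for `across()`:

```r
library(dplyr)

# Hypothetical grouped toy data with one row per group, like the question's df.
toy <- tibble(
  id = factor(c("V", "L", "M")),
  x  = c(1, 4, 9),
  y  = c(10, 20, 40)
) %>%
  group_by(id)

# Drop the grouping first, so first(.x) refers to row 1 of the whole table.
normalized <- toy %>%
  ungroup() %>%
  mutate(across(where(is.numeric), ~ .x - first(.x)))

normalized$x  # 0 3 8
normalized$y  # 0 10 30
```

Selecting with `where(is.numeric)` instead of positions such as 4:11 avoids depending on column order, at the cost of also touching any numeric column you did not mean to normalize.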