I am new to R. I have a data frame like this:
Col1 Col2 col3 col4 col4 col5
city:Dallas #N/A #N/A #N/A #N/A #N/A
region:richardson #N/A #N/A #N/A #N/A #N/A
1.A.school1 0.0 0.0 0.0 0.0 0.0
1.B.school2 0.0 0.0 0.0 0.0 0.1
1.C.school3 n.a n.a 0.0 n.a n.a
4.B.school5 0.0 n.a 0.0 0.0 0.0
6.A.uni7 n.a n.a 0.0 0.0 n.a
4.D.uni9 n.a 0.0 0.0 0.0 0.0
8.A.uni1 n.a n.a 0.0 0.0 0.0
8.b.8 0.0 0.0 0.0 0.0 n.a
8.c.univ6 0.6 0.1 0.0 0.0 0.0
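I need to find a matching pattern in col1 and multiply all the values in col2, col3, col4, and col5 of that row by 1000.
For example: I need to find the pattern 8.c.univ6 in col1 and multiply its values 0.6 0.1 0.0 0.0 0.0 by 1000.
Likewise, I need to find more patterns and convert all of their values.
Any help would be appreciated.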
Answer 0 (score: 1)
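Here is one way to achieve your goal. I do not know the class of each column, so I assume that all columns are character. Given that, I wrote the following code. Your data is called mydf here. I replace n.a and #N/A with NA and convert Col2:Col6 to numeric. Then I use rowwise() to process the data one row at a time. For each row, if Col1 is 8.c.univ6, I apply . * 1000. The . stands for a column; here it can be each of Col2:Col6. So whenever the condition is TRUE, each of those columns is multiplied by 1000.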
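library(dplyr)

# Note: funs() is deprecated as of dplyr 0.8.0
mydf %>%
    # Replace "n.a" and "#N/A" with NA, then convert Col2:Col6 to numeric
    mutate_at(vars(-Col1),
              funs(as.numeric(gsub(pattern = "n.a|#N/A", replacement = NA, x = .)))) %>%
    # Evaluate the condition one row at a time
    rowwise %>%
    # Multiply by 1000 only in the row where Col1 is "8.c.univ6"
    mutate_at(vars(-Col1),
              funs(if(Col1 == "8.c.univ6") {. * 1000} else{.}))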
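To make the cleaning step easier to follow, here is a minimal sketch of what it does to a single column. The vector x is a made-up example, not part of the original data, and the names cleaned, as_num, and strict are only for illustration. The second variant is an assumption about what you may want: it compares against the two literal markers instead of using a regular expression, because the unescaped dot in "n.a" matches any character.

```r
# One column as it might arrive from the file: everything is character
x <- c("#N/A", "0.0", "n.a", "0.6")

# As in the answer: elements matching the regex become NA, the rest are untouched
cleaned <- gsub(pattern = "n.a|#N/A", replacement = NA, x = x)
as_num <- as.numeric(cleaned)

# Exact-match alternative: only the two literal markers are turned into NA
strict <- as.numeric(replace(x, x %in% c("n.a", "#N/A"), NA))

as_num
strict
```

For this input both variants should give the same numeric vector, with NA in the first and third positions.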
EDIT
In the second mutate_at(), you can use if_else() instead of if/else.
mydf %>%
    mutate_at(vars(-Col1),
              funs(as.numeric(gsub(pattern = "n.a|#N/A", replacement = NA, x = .)))) %>%
    rowwise %>%
    mutate_at(vars(-Col1),
              funs(if_else(Col1 == "8.c.univ6", . * 1000, .)))
#                 Col1  Col2  Col3  Col4  Col5  Col6
#                <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#1         city:Dallas    NA    NA    NA    NA    NA
#2   region:richardson    NA    NA    NA    NA    NA
#3         1.A.school1     0     0     0     0   0.0
#4         1.B.school2     0     0     0     0   0.1
#5         1.C.school3    NA    NA     0    NA    NA
#6         4.B.school5     0    NA     0     0   0.0
#7            6.A.uni7    NA    NA     0     0    NA
#8            4.D.uni9    NA     0     0     0   0.0
#9            8.A.uni1    NA    NA     0     0   0.0
#10              8.b.8     0     0     0     0    NA
#11          8.c.univ6   600   100     0     0   0.0
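Since the question also asks about matching more than one pattern, here is a rough sketch of the same idea for several Col1 values at once. It assumes dplyr 1.0.0 or later, where across() is available. The small data frame toy and the vector patterns are hypothetical and only stand in for your real data and your real list of targets.

```r
library(dplyr)

# Toy data in the same shape as mydf (hypothetical subset, all character columns)
toy <- data.frame(
  Col1 = c("city:Dallas", "1.B.school2", "8.b.8", "8.c.univ6"),
  Col2 = c("#N/A", "0.0", "0.0", "0.6"),
  Col3 = c("#N/A", "0.1", "n.a", "0.1"),
  stringsAsFactors = FALSE
)

# Hypothetical list of Col1 values whose rows should be scaled
patterns <- c("8.c.univ6", "1.B.school2")

res <- toy %>%
  # Turn the two missing-value markers into NA, then convert to numeric
  mutate(across(-Col1, ~ as.numeric(replace(.x, .x %in% c("n.a", "#N/A"), NA)))) %>%
  # The test is vectorised over the whole column, so rowwise() is not needed
  mutate(across(-Col1, ~ if_else(Col1 %in% patterns, .x * 1000, .x)))

res
```

If your patterns are regular expressions rather than exact values, the condition could instead be something like grepl("^8\\.", Col1), which would select every row whose Col1 starts with "8.".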
DATA
mydf <- structure(list(Col1 = c("city:Dallas", "region:richardson", "1.A.school1",
"1.B.school2", "1.C.school3", "4.B.school5", "6.A.uni7", "4.D.uni9",
"8.A.uni1", "8.b.8", "8.c.univ6"), Col2 = c("#N/A", "#N/A", "0.0",
"0.0", "n.a", "0.0", "n.a", "n.a", "n.a", "0.0", "0.6"), Col3 = c("#N/A",
"#N/A", "0.0", "0.0", "n.a", "n.a", "n.a", "0.0", "n.a", "0.0",
"0.1"), Col4 = c("#N/A", "#N/A", "0.0", "0.0", "0.0", "0.0",
"0.0", "0.0", "0.0", "0.0", "0.0"), Col5 = c("#N/A", "#N/A",
"0.0", "0.0", "n.a", "0.0", "0.0", "0.0", "0.0", "0.0", "0.0"
), Col6 = c("#N/A", "#N/A", "0.0", "0.1", "n.a", "0.0", "n.a",
"0.0", "0.0", "n.a", "0.0")), .Names = c("Col1", "Col2", "Col3",
"Col4", "Col5", "Col6"), row.names = c(NA, -11L), class = "data.frame")