Name1 Col1 Col2 Col3 Name2 Col4 Col5 Col6 Col7
John A A A Alex B B B 1
Alex B B B John A A A 0
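Looking at the data frame above, I want to select data based on the value of Col7. Specifically, if Col7 = 1, I want to select columns 1, 2 and 3. If Col7 = 0, I want to select Cols 4, 5, 6. Cols 4, 5, 6 hold the same variables as Cols 1, 2, 3, just associated with Alex instead of John (in row 1). So John's data is selected twice, and it is the same for each pair.
I think some form of select in dplyr would work, but I'm having trouble with the conditional selection.
My final data frame should look like this: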
Name1 Col1 Col2 Col3
John A A A
John A A A
Answer 0 (score: 1)
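Hi, try something very basic (combining filter and select, renaming the columns as you pick them):
library(dplyr)
# rows where Col7 == 1: keep the first set of columns
df1 <- df %>%
filter(Col7 == 1) %>%
select(Name = Name1, Col1, Col2, Col3)
# rows where Col7 == 0: keep the second set, renamed to match the first
df2 <- df %>%
filter(Col7 == 0) %>%
select(Name = Name2, Col1 = Col4, Col2 = Col5, Col3 = Col6)
df <- bind_rows(df1, df2)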
This gives the desired data frame:
> df
Name Col1 Col2 Col3
1 John A A A
2 John A A A
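If the original row order matters, the same choice can also be made row by row without splitting and re-binding. The following is only a minimal sketch under assumptions: the data frame is rebuilt here under the name `df`, its columns are character rather than factor, and `transmute()` with `if_else()` is swapped in for the filter/select approach above.

```r
library(dplyr)

# Example data from the question, rebuilt so the sketch is self-contained
df <- data.frame(
  Name1 = c("John", "Alex"),
  Col1 = c("A", "B"), Col2 = c("A", "B"), Col3 = c("A", "B"),
  Name2 = c("Alex", "John"),
  Col4 = c("B", "A"), Col5 = c("B", "A"), Col6 = c("B", "A"),
  Col7 = c(1L, 0L),
  stringsAsFactors = FALSE
)

# For each row, take the first column set when Col7 == 1, else the second
res <- df %>%
  transmute(
    Name = if_else(Col7 == 1, Name1, Name2),
    Col1 = if_else(Col7 == 1, Col1, Col4),
    Col2 = if_else(Col7 == 1, Col2, Col5),
    Col3 = if_else(Col7 == 1, Col3, Col6)
  )
res
```

Each output row stays in the position of the input row it came from, which the filter-then-bind version does not guarantee.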
Answer 1 (score: 0)
You can melt the data, using data.table or reshape2, and then join on the condition:
library(data.table)
setDT(d)
d[, row := .I]
md = melt(d, id=c("row", "Col7"),
meas = Map(c, 1:4, 5:8),
variable.factor = FALSE,
variable.name = "colset",
value.name = names(d)[1:4])
# row Col7 colset Name1 Col1 Col2 Col3
# 1: 1 1 1 John A A A
# 2: 2 0 1 Alex B B B
# 3: 1 1 2 Alex B B B
# 4: 2 0 2 John A A A
cond = data.table(Col7 = 0:1, colset = c("2", "1"))
# Col7 colset
# 1: 0 2
# 2: 1 1
res = md[cond, on=names(cond), nomatch=0]
# row Col7 colset Name1 Col1 Col2 Col3
# 1: 2 0 2 John A A A
# 2: 1 1 1 John A A A
This approach extends to more than two sets of columns, e.g. meas=Map(c, 1:4, 5:8, 9:12).
Answer 2 (score: 0)
In base R:
create table TableC as select
a.ID,
case when b.Field1=1000 and a.Field1=50 then 20 else 0 end as FieldA,
case when b.Field2=15 and a.Field2=100 then 100 else 0 end as FieldB
from TableA a, TableB b
where a.ID=b.ID
order by 1
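The snippet above is SQL and refers to tables that do not appear in the question. As a hedged base R sketch of the row-wise choice the question asks for, assuming the data frame is called `df` and its columns are character (the helper names `first`, `second` and `use_second` are made up for illustration):

```r
# Example data from the question, rebuilt so the sketch is self-contained
df <- data.frame(
  Name1 = c("John", "Alex"),
  Col1 = c("A", "B"), Col2 = c("A", "B"), Col3 = c("A", "B"),
  Name2 = c("Alex", "John"),
  Col4 = c("B", "A"), Col5 = c("B", "A"), Col6 = c("B", "A"),
  Col7 = c(1L, 0L),
  stringsAsFactors = FALSE
)

# The two column sets, with the second renamed to line up with the first
first  <- df[, c("Name1", "Col1", "Col2", "Col3")]
second <- setNames(df[, c("Name2", "Col4", "Col5", "Col6")], names(first))

# Start from the first set, then overwrite the rows where Col7 is 0
res <- first
use_second <- df$Col7 == 0
res[use_second, ] <- second[use_second, ]
res
```

The result keeps the column names Name1, Col1, Col2, Col3 from the question's desired output.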
Answer 3 (score: 0)
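Here is a tidyverse (more tidyr than dplyr) approach. It is fairly verbose because your original data is not very tidy, so most of the code just reshapes to long form, cleans up, and spreads back to wide form.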
library(tidyverse)
df <- tibble(Name1 = c("John", "Alex"),
Col1 = c("A", "B"), Col2 = c("A", "B"), Col3 = c("A", "B"),
Name2 = c("Alex", "John"),
Col4 = c("B", "A"), Col5 = c("B", "A"), Col6 = c("B", "A"),
Col7 = c(1L, 0L))
df %>%
# reshape to long form
gather(col, col_val, num_range('Col', 1:6)) %>%
gather(name_var, name, contains('Name')) %>%
# clean, subset, clean for spreading
mutate(col = parse_number(col),
name_var = parse_number(name_var)) %>%
filter(ifelse(Col7 == 1,
col %in% 1:3 & name_var == 1,
col %in% 4:6 & name_var == 2)) %>%
mutate(col = paste0('Col', (col - 1) %% 3 + 1),
name_var = 'Name') %>%
# reshape back to wide form
spread(name_var, name) %>%
spread(col, col_val) %>%
# clean
select(-Col7)
#> # A tibble: 2 x 4
#> Name Col1 Col2 Col3
#> <chr> <chr> <chr> <chr>
#> 1 John A A A
#> 2 John A A A