Suppose you have the following simple data frame:
Input <- c("X0_1-2 + X1_1-2","X0_1-2 + X1_1-2","X0_1-3 + X1_1-3","X0_3-2 + X1_3-2","X0_3-1 + X1_3-1","X0_2-1 + X1_2-1","X0_2-3 + X1_2-3","X0_13-1 + X1_13-1")
State1 <- c("1-3","1-3","1-2","3-1","3-2","2-1","2-1","13-3")
State2 <- c("1-2","1-2","1-3","3-2","3-1","2-3","2-3","13-1")
DataFrame <- data.frame(Input, State1, State2, stringsAsFactors = FALSE)
which yields
              Input State1 State2
1   X0_1-2 + X1_1-2    1-3    1-2
2   X0_1-2 + X1_1-2    1-3    1-2
3   X0_1-3 + X1_1-3    1-2    1-3
4   X0_3-2 + X1_3-2    3-1    3-2
5   X0_3-1 + X1_3-1    3-2    3-1
6   X0_2-1 + X1_2-1    2-1    2-3
7   X0_2-3 + X1_2-3    2-1    2-3
8 X0_13-1 + X1_13-1   13-3   13-1
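I am trying to come up with a clever way to add an extra column that equals the Input column, except that the value after each "_" is replaced by whichever of State1 or State2 differs from the corresponding substring in Input. In this case the desired result is: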
              Input State1 State2           Outcome
1   X0_1-2 + X1_1-2    1-3    1-2   X0_1-3 + X1_1-3
2   X0_1-2 + X1_1-2    1-3    1-2   X0_1-3 + X1_1-3
3   X0_1-3 + X1_1-3    1-2    1-3   X0_1-2 + X1_1-2
4   X0_3-2 + X1_3-2    3-1    3-2   X0_3-1 + X1_3-1
5   X0_3-1 + X1_3-1    3-2    3-1   X0_3-2 + X1_3-2
6   X0_2-1 + X1_2-1    2-1    2-3   X0_2-3 + X1_2-3
7   X0_2-3 + X1_2-3    2-1    2-3   X0_2-1 + X1_2-1
8 X0_13-1 + X1_13-1   13-3   13-1 X0_13-3 + X1_13-3
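but so far I have had no success.
The idea is to replace the state value in Input with the value from State1 or State2, whichever of the two differs from it.
Any ideas or suggestions would be much appreciated. Thanks!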
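Answer 0 (score: 3)
If I understand correctly, the "X0" and "X1" parts of the Input and Outcome strings always express the same state, and State1 and State2 are never identical. In that case you can pull the state out of Input, compare it with one of the two states, and paste the output string together: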
output <- ifelse(substring(DataFrame$Input, 13) == State1, State2, State1)
DataFrame$Outcome <- paste("X0_", output, " + X1_", output, sep = "")
DataFrame
#               Input State1 State2           Outcome
# 1   X0_1-2 + X1_1-2    1-3    1-2   X0_1-3 + X1_1-3
# 2   X0_1-2 + X1_1-2    1-3    1-2   X0_1-3 + X1_1-3
# 3   X0_1-3 + X1_1-3    1-2    1-3   X0_1-2 + X1_1-2
# 4   X0_3-2 + X1_3-2    3-1    3-2   X0_3-1 + X1_3-1
# 5   X0_3-1 + X1_3-1    3-2    3-1   X0_3-2 + X1_3-2
# 6   X0_2-1 + X1_2-1    2-1    2-3   X0_2-3 + X1_2-3
# 7   X0_2-3 + X1_2-3    2-1    2-3   X0_2-1 + X1_2-1
# 8 X0_13-1 + X1_13-1   13-3   13-1 X0_13-3 + X1_13-3
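This solution works for a "state" substring of any length at the end of Input (e.g. both "1-1" and "201-14"), because substring(x, 13) reads through to the end of the string. Note that the start position 13 is fixed, so it assumes the first state in Input is three characters long. For a longer state such as "13-1" the extracted text begins at the "_" and the comparison with State1 fails, which gives the right result in row 8 only because State1 is the wanted value there. You could use a regex instead, but when the position is known, extracting by position works (and is more efficient).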
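For comparison, here is a minimal sketch of the regex route, which does not depend on a fixed character offset. It assumes every Input string ends in "_" followed by the state. The helper name current is made up for this example, and the data frame is rebuilt from the question's vectors so the snippet runs on its own.

```r
# Rebuild the question's data so the sketch is self-contained.
Input  <- c("X0_1-2 + X1_1-2", "X0_1-2 + X1_1-2", "X0_1-3 + X1_1-3",
            "X0_3-2 + X1_3-2", "X0_3-1 + X1_3-1", "X0_2-1 + X1_2-1",
            "X0_2-3 + X1_2-3", "X0_13-1 + X1_13-1")
State1 <- c("1-3", "1-3", "1-2", "3-1", "3-2", "2-1", "2-1", "13-3")
State2 <- c("1-2", "1-2", "1-3", "3-2", "3-1", "2-3", "2-3", "13-1")
DataFrame <- data.frame(Input, State1, State2, stringsAsFactors = FALSE)

# Assumption: the state is whatever follows the last "_" in Input.
# The greedy ".*_" removes everything up to and including that underscore.
current <- sub(".*_", "", DataFrame$Input)

# Choose whichever of State1/State2 is NOT the state currently in Input.
output <- ifelse(current == DataFrame$State1, DataFrame$State2, DataFrame$State1)

# Rebuild the string with the chosen state in both positions.
DataFrame$Outcome <- paste0("X0_", output, " + X1_", output)
DataFrame
```

Because the state is taken from after the last "_", the same code treats "1-2" and "13-1" alike.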
Answer 1 (score: 2)
I would do it like this, assuming df is your data frame:
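# Pick, per row, the column whose state is NOT the one currently in Input
replacement <- c("State2", "State1")[mapply(grepl, df$State2, df$Input) + 1]
# Replace every "digits-digits" state in Input with that column's value
df$output <- sapply(1:nrow(df), function(i) gsub("\\d+-\\d+", df[i, replacement[i]], df[i, "Input"]))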
Output:
> df
            Input State1 State2          output
1 X0_1-2 + X1_1-2    1-3    1-2 X0_1-3 + X1_1-3
2 X0_1-2 + X1_1-2    1-3    1-2 X0_1-3 + X1_1-3
3 X0_1-3 + X1_1-3    1-2    1-3 X0_1-2 + X1_1-2
4 X0_3-2 + X1_3-2    3-1    3-2 X0_3-1 + X1_3-1
5 X0_3-1 + X1_3-1    3-2    3-1 X0_3-2 + X1_3-2
6 X0_2-1 + X1_2-1    2-1    2-3 X0_2-3 + X1_2-3
7 X0_2-3 + X1_2-3    2-1    2-3 X0_2-1 + X1_2-1
8 X0_2-1 + X1_2-1    2-3    2-1 X0_2-3 + X1_2-3
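One caveat with grepl in this approach: the State2 value is treated as a pattern that may match anywhere in Input, so a state like "3-1" would also be found inside "13-1". A possible variant, sketched below on the question's data, checks the end of the string with endsWith() (base R 3.3.0 or later) instead. The names has_state2 and new_state are illustrative and not part of the answer above.

```r
# Rebuild the question's data so the sketch is self-contained.
df <- data.frame(
  Input  = c("X0_1-2 + X1_1-2", "X0_1-2 + X1_1-2", "X0_1-3 + X1_1-3",
             "X0_3-2 + X1_3-2", "X0_3-1 + X1_3-1", "X0_2-1 + X1_2-1",
             "X0_2-3 + X1_2-3", "X0_13-1 + X1_13-1"),
  State1 = c("1-3", "1-3", "1-2", "3-1", "3-2", "2-1", "2-1", "13-3"),
  State2 = c("1-2", "1-2", "1-3", "3-2", "3-1", "2-3", "2-3", "13-1"),
  stringsAsFactors = FALSE
)

# TRUE where Input ends in "_<State2>", i.e. State1 is the value to swap in.
# Including the "_" makes this an exact match on the whole trailing state.
has_state2 <- endsWith(df$Input, paste0("_", df$State2))
new_state  <- ifelse(has_state2, df$State1, df$State2)

# gsub() is not vectorised over `replacement`, so pair rows up with mapply().
df$output <- mapply(function(x, new) gsub("[0-9]+-[0-9]+", new, x),
                    df$Input, new_state, USE.NAMES = FALSE)
df
```

The replacement step is the same idea as in the answer; only the test for which state is currently in Input has changed.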