Combine strings in the current ID with strings in the subsequent ID

Time: 2018-12-05 02:32:37

Tags: r

I have a data frame like the one below. I want to paste0 each current value with every current value in the subsequent state.

df
state   current
  1       A1
  1       B1
  2       A2
  2       A3
  3       B2
  3       C1

I want all possible pairwise combinations of the current values in one state with those in the next state:

resultdf:

combinations
A1A2
A1A3
B1A2
B1A3
A2B2
A2C1
A3B2
A3C1

How can I do this in R?

3 answers:

Answer 0: (score: 2)

Using base R

#Create a comma separated string for each `state`
df1 <- aggregate(current~state, df, toString)

#Function to create combination of strings
get_rolling_paste <- function(x, y) {
    df2 <- expand.grid(trimws(strsplit(x, ",")[[1]]), trimws(strsplit(y, ",")[[1]]))
    paste0(df2$Var1, df2$Var2)
}

#apply the function to every row and its previous row
x <- c(sapply(2:nrow(df1), function(i) 
       get_rolling_paste(df1$current[i-1], df1$current[i])))
x
#[1] "A1A2" "B1A2" "A1A3" "B1A3" "A2B2" "A3B2" "A2C1" "A3C1"

If needed, this can be converted to a data frame

resultdf <- data.frame(combinations = x)

#  combinations
#1         A1A2
#2         B1A2
#3         A1A3
#4         B1A3
#5         A2B2
#6         A3B2
#7         A2C1
#8         A3C1

For reference, df1 is

#  state current
#1     1  A1, B1
#2     2  A2, A3
#3     3  B2, C1
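
The same idea can be sketched without the toString/strsplit round trip, which might misbehave if a value in current itself contained a comma. This is only a minimal sketch and not part of the original answer: pair_next_state is a hypothetical helper name, and it assumes that sorting the state values gives the intended order of states.

```r
# Example data from the question
df <- data.frame(state = c(1, 1, 2, 2, 3, 3),
                 current = c("A1", "B1", "A2", "A3", "B2", "C1"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: pair every value in one state with every value in the next
pair_next_state <- function(data) {
  # One character vector per state, ordered by the sorted state values
  groups <- split(data$current, data$state)
  if (length(groups) < 2) return(character(0))
  # outer() builds the grid of pastes; t() makes the first state vary slowest
  pairs <- Map(function(a, b) as.vector(t(outer(a, b, paste0))),
               groups[-length(groups)], groups[-1])
  unlist(pairs, use.names = FALSE)
}

pair_next_state(df)
#[1] "A1A2" "A1A3" "B1A2" "B1A3" "A2B2" "A2C1" "A3B2" "A3C1"
```

Transposing the outer() result is only there to reproduce the order shown in the question; dropping t() gives the same pairs in the order of the answer above.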

Answer 1: (score: 2)

Here is a simple way using a for loop -

result <- NULL
for(r in seq_len(nrow(df)-2)) {
  n <- ifelse(rep(r %% 2 > 0, 2), 2:3, 1:2)
  result <- c(result, paste0(df$current[r], df$current[r+n]))
}
result
#[1] "A1A2" "A1A3" "B1A2" "B1A3" "A2B2" "A2C1" "A3B2" "A3C1"

Here is another way using ?outer() -

result <- NULL
for(r in seq(1, nrow(df)-2, 2)) {
  result <- c(result, c(outer(df$current[r:(r+1)], df$current[(r+2):(r+3)], FUN = paste0)))
}
result
#[1] "A1A2" "B1A2" "A1A3" "B1A3" "A2B2" "A3B2" "A2C1" "A3C1"

Growing the result vector like this is bad practice, but it doesn't matter unless the number of rows is large.
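
Growing a vector inside a loop can be avoided by preallocating a list and calling unlist() once at the end. The following is a minimal sketch and not from the original answer: the names states, pieces, now and nxt are illustrative, and it pairs consecutive distinct state values in sorted order rather than relying on row positions. Both loops in this answer index rows by position, so they appear to rely on each state having exactly two rows; the sketch drops that assumption.

```r
# Example data from the question
df <- data.frame(state = c(1, 1, 2, 2, 3, 3),
                 current = c("A1", "B1", "A2", "A3", "B2", "C1"),
                 stringsAsFactors = FALSE)

states <- sort(unique(df$state))
# Preallocate one list slot per pair of consecutive states
pieces <- vector("list", max(length(states) - 1, 0))

for (i in seq_along(pieces)) {
  now <- df$current[df$state == states[i]]      # values in this state
  nxt <- df$current[df$state == states[i + 1]]  # values in the following state
  pieces[[i]] <- c(outer(now, nxt, paste0))     # all pairwise pastes
}

result <- unlist(pieces)  # flatten once, after the loop
result
#[1] "A1A2" "B1A2" "A1A3" "B1A3" "A2B2" "A3B2" "A2C1" "A3C1"
```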

Answer 2: (score: 2)

A data.table version, joining state to state + 1 (dat here is the data frame called df in the question):

library(data.table)
setDT(dat)
dat[, statep1 := state + 1]
dat[dat, on="state==statep1", nomatch=0L, paste0(i.current, current)]
#[1] "A1A2" "A1A3" "B1A2" "B1A3" "A2B2" "A2C1" "A3B2" "A3C1"

Similar logic in base R:

dat$statep1 <- dat$state + 1
with(merge(dat, dat, by.x="state", by.y="statep1"), paste0(current.y, current.x) )
#[1] "A1A2" "B1A2" "A1A3" "B1A3" "A2B2" "A3B2" "A2C1" "A3C1"