How can I get the second table (table2) from the first table (table1) without using any loops?
table1 <- data.frame(stringsAsFactors=FALSE,
x = c("1,2,3"),
y = c("a,b,c,d"),
z = c("e,f"))
table1
|x |y |z |
|:-----|:-------|:---|
|1,2,3 |a,b,c,d |e,f |
table2 <- data.frame(stringsAsFactors=FALSE,
x = c(1, 2, 3, NA),
y = c("a", "b", "c", "d"),
z = c("e", "f", NA, NA))
table2
| x|y |z |
|--:|:--|:--|
| 1|a |e |
| 2|b |f |
| 3|c |NA |
| NA|d |NA |
Answer 0 (score: 0)
Here is my attempt at a solution using base R:
x = data.frame(x="1,2,3", y="a,b,c,d", z="e,f", stringsAsFactors = FALSE)
# split each column on the comma
x2 = lapply(x, function(col) strsplit(col, ",")[[1]])
# find the length of the longest column
L = max(sapply(x2, length))
# pad every column to that length, i.e. fill the missing entries with NA
x3 = lapply(x2, function(col) { length(col) = L; col })
# cbind them all together and turn the result into a data frame of strings
x4 = data.frame(do.call(cbind, x3), stringsAsFactors = FALSE)
It is rather long, though. I would like to see a better solution.
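The base R steps above can be folded into a single helper. The following is only a sketch: `split_columns` is a hypothetical name, not part of the original answer, and it assumes a one-row data frame of character columns. It also uses `type.convert()` so that a column of digits such as `x` ends up numeric, which the question's `table2` expects.

```r
# Hypothetical helper (not from the original answer): split every column of a
# one-row data frame on `sep` and pad the shorter results with NA.
split_columns <- function(df, sep = ",") {
  pieces <- lapply(df, function(col) strsplit(col, sep, fixed = TRUE)[[1]])
  n <- max(lengths(pieces))
  # indexing past the end of a vector returns NA, which does the padding
  padded <- lapply(pieces, function(p) p[seq_len(n)])
  out <- data.frame(padded, stringsAsFactors = FALSE)
  # convert columns that look numeric (here x) and leave the others as strings
  out[] <- lapply(out, type.convert, as.is = TRUE)
  out
}

table1 <- data.frame(x = "1,2,3", y = "a,b,c,d", z = "e,f",
                     stringsAsFactors = FALSE)
result <- split_columns(table1)
```

Note that `type.convert()` guesses the type from the strings, so `x` comes back as an integer column rather than a double. If the exact type matters, convert that column explicitly instead.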
Answer 1 (score: 0)
You can use the stringr package to do this:
table1 <- data.frame(stringsAsFactors=FALSE,
x = c("1,2,3"),
y = c("a,b,c,d"),
z = c("e,f"))
t(stringr::str_split_fixed(table1, pattern = ",", max(stringr::str_count(table1, ","))+1))
#> [,1] [,2] [,3]
#> [1,] "1" "a" "e"
#> [2,] "2" "b" "f"
#> [3,] "3" "c" ""
#> [4,] "" "d" ""
Created on 2019-02-20 by the reprex package (v0.2.0).
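Breaking it down into separate steps:
# 1. number of pieces in the longest string: its comma count plus one
max(stringr::str_count(table1, ","))+1
# 2. split each string into that many pieces, padding with "" (one row per original column)
stringr::str_split_fixed(table1, pattern = ",", max(stringr::str_count(table1, ","))+1)
# 3. transpose so that each original column becomes a column again
t(stringr::str_split_fixed(table1, pattern = ",", max(stringr::str_count(table1, ","))+1))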
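The result of these steps is a character matrix padded with empty strings, whereas the question's `table2` is a data frame with `NA` padding and a numeric `x`. One possible way to close that gap is sketched below. It assumes stringr is installed, passes `unlist(table1)` so that the functions receive a plain character vector, and assumes that no field is legitimately empty, since every `""` is turned into `NA`.

```r
library(stringr)

table1 <- data.frame(x = "1,2,3", y = "a,b,c,d", z = "e,f",
                     stringsAsFactors = FALSE)

strings <- unlist(table1)                 # plain character vector, one element per column
n <- max(str_count(strings, ",")) + 1     # pieces in the longest string
m <- t(str_split_fixed(strings, ",", n))  # 4 x 3 character matrix, "" as padding
m[m == ""] <- NA                          # assumption: "" only ever means padding
colnames(m) <- names(table1)

table2 <- as.data.frame(m, stringsAsFactors = FALSE)
table2$x <- as.numeric(table2$x)          # x holds digits only, so make it numeric
```

The explicit `as.numeric()` call keeps the conversion limited to the one column known to hold numbers, and `y` and `z` stay character.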