Consider the following vector:
 [1] "1-1694429" "2-1546669" "3-928598" "4-834486" "5-802353" "6-659439" "7-552850" "8-516804" "9-364061"
[10] "10-354181" "11-335154" "12-257915" "13-251310" "14-232313" "15-217628" "16-216569"
I am trying to generate two vectors, each containing the values obtained by splitting every element of this vector on the delimiter "-".
I used:
f <- function(s) strsplit(s, "-")
cc<-sapply(names.reads, f)
> head(cc)
$`1-1694429`
[1] "1" "1694429"
$`2-1546669`
[1] "2" "1546669"
I know I can access the individual pieces like this:
> cc[[1]][1]
[1] "1"
> cc[[1]][2]
[1] "1694429"
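I would like to have two vectors, one containing the values stored in cc[[i]][1] and the other the values stored in cc[[i]][2]. Can I do this without a loop? (I have more than 1 million elements.)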
Answer 0: (score: 20)
Following mathematical.coffee's suggestion, the code below avoids both an explicit loop and sapply:
names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
"6-659439", "7-552850", "8-516804", "9-364061", "10-354181",
"11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
"16-216569")
cc <- strsplit(names.reads,'-')
part1 <- unlist(cc)[2*(1:length(names.reads))-1]
part2 <- unlist(cc)[2*(1:length(names.reads)) ]
which yields
> part1
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15"
[16] "16"
> part2
[1] "1694429" "1546669" "928598" "834486" "802353" "659439" "552850"
[8] "516804" "364061" "354181" "335154" "257915" "251310" "232313"
[15] "217628" "216569"
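Note, however, that this requires every original value to be in the expected format (exactly one "-" per element); otherwise the odd/even indexing into unlist(cc) no longer lines up.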
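The format requirement above can be checked before indexing. The following is a minimal sketch, not part of the original answer: the input vector `reads` and its malformed entry are made up for illustration, and `lengths()` is used to find elements that did not split into exactly two pieces.

```r
# Hypothetical input with one malformed entry (no "-" separator).
reads <- c("1-1694429", "2-1546669", "oops", "4-834486")

# fixed = TRUE treats "-" as a literal string rather than a regex.
cc <- strsplit(reads, "-", fixed = TRUE)

# lengths() gives the number of pieces per element; exactly 2 is expected.
ok  <- lengths(cc) == 2
bad <- reads[!ok]               # entries that do not match the expected format

# Build the two vectors from the well-formed entries only.
flat  <- unlist(cc[ok])
part1 <- flat[c(TRUE, FALSE)]   # odd positions: text before "-"
part2 <- flat[c(FALSE, TRUE)]   # even positions: text after "-"
```

Whether malformed entries should be dropped, reported, or treated as an error depends on the data; here they are simply collected in `bad`.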
Answer 1: (score: 6)
Another approach:
names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
"6-659439", "7-552850", "8-516804", "9-364061", "10-354181",
"11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
"16-216569")
library(reshape2)
colsplit(string=names.reads, pattern="-", names=c("Part1", "Part2"))
Part1 Part2
1 1 1694429
2 2 1546669
3 3 928598
4 4 834486
5 5 802353
6 6 659439
7 7 552850
8 8 516804
9 9 364061
10 10 354181
11 11 335154
12 12 257915
13 13 251310
14 14 232313
15 15 217628
16 16 216569
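If adding a package dependency is not desirable, a similar two-column data frame can be sketched in base R with read.table() and its text argument. This is an alternative technique, not what colsplit() does internally; the column names are simply reused from the answer above, and the input is shortened for illustration.

```r
names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486")

# Treat each string as one line of "-"-separated text.
# read.table() converts the columns to integers here; pass
# colClasses = "character" to keep them as strings instead.
df <- read.table(text = names.reads, sep = "-",
                 col.names = c("Part1", "Part2"))

part1 <- df$Part1
part2 <- df$Part2
```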
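Answer 2: (score: 6)
Using sapply() (for completeness):
y <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353", "6-659439", "7-552850", "8-516804", "9-364061",
"10-354181", "11-335154", "12-257915", "13-251310", "14-232313", "15-217628", "16-216569")
# each column of x holds the two pieces of one element of y
x <- sapply(y, function(x) strsplit(x, "-")[[1]], USE.NAMES=FALSE)
a <- x[1,]  # first parts
b <- x[2,]  # second parts
As @Bird pointed out in the comments, the USE.NAMES argument can be used to avoid names in the resulting vectors.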
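A stricter variant of the same idea uses vapply(), which checks the shape of every result. This is a sketch with a shortened input: FUN.VALUE = character(2) makes the call stop with an error if any element does not split into exactly two pieces, rather than silently returning a list.

```r
y <- c("1-1694429", "2-1546669", "3-928598")

# strsplit() is already vectorised, so split once and let vapply()
# assemble (and validate) the 2 x length(y) character matrix.
x <- vapply(strsplit(y, "-", fixed = TRUE), identity, character(2))

a <- x[1, ]   # first parts
b <- x[2, ]   # second parts
```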
Answer 3: (score: 3)
Or, using the purrr package:
Part 1:
> map(strsplit(names.reads, "-"), ~.x[1]) %>% unlist()
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13"
[14] "14" "15" "16"
Part 2:
> map(strsplit(names.reads, "-"), ~.x[2]) %>% unlist()
[1] "1694429" "1546669" "928598" "834486" "802353" "659439"
[7] "552850" "516804" "364061" "354181" "335154" "257915"
[13] "251310" "232313" "217628" "216569"
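The map() plus unlist() pattern can probably be shortened with purrr's typed variant map_chr(), which returns a character vector directly and accepts an integer position as the extractor. A minimal sketch, assuming purrr is installed and using a shortened input:

```r
library(purrr)  # assumes the purrr package is installed

names.reads <- c("1-1694429", "2-1546669", "3-928598")

# Split once and reuse the result for both parts.
parts <- strsplit(names.reads, "-", fixed = TRUE)

part1 <- map_chr(parts, 1)   # first piece of every element
part2 <- map_chr(parts, 2)   # second piece of every element
```

Splitting once and storing the result also avoids running strsplit() twice over a million-element vector.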
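Answer 4: (score: 2)
I wanted to solve a similar problem and came across this post. Adding my solution, even though I am arriving far in the future! (Code copied from Henry.)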
names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
"6-659439", "7-552850", "8-516804", "9-364061", "10-354181",
"11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
"16-216569")
require(plyr)
cc <- ldply(strsplit(names.reads, '-'))
cc$V1;cc$V2
This produces a data frame whose columns (V1, V2, ...) are the vectors holding the n-th element of each item in the list.
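A comparable data frame can be sketched in base R without plyr. This is an alternative, not what ldply() does internally, and it assumes every element splits into exactly two pieces; the input is shortened for illustration.

```r
names.reads <- c("1-1694429", "2-1546669", "3-928598")

parts <- strsplit(names.reads, "-", fixed = TRUE)

# Lay the flattened pieces out row by row: one row per original string.
m  <- matrix(unlist(parts), ncol = 2, byrow = TRUE)

# Without column names, as.data.frame() labels the columns V1 and V2.
cc <- as.data.frame(m, stringsAsFactors = FALSE)

cc$V1
cc$V2
```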