我有一个5级因子,如下所示:
tmp
[1] NA
[2] 1,2,3,6,11,12,13,18,20,21,22,26,29,33,40,43,46
[3] NA
[4] NA
[5] 5,9,16,24,35,36,42
[6] 4,7,10,14,15,17,19,23,25,27,28,30,31,32,34,37,38,41,44,45,47,48,49,50
[7] 8,39
5 Levels: 1,2,3,6,11,12,13,18,20,21,22,26,29,33,40,43,46 ...
我想访问除NA之外的每个级别中的项目。所以我使用了levels()
函数,它给了我:
> levels(tmp)
[1] "1,2,3,6,11,12,13,18,20,21,22,26,29,33,40,43,46"
[2] "4,7,10,14,15,17,19,23,25,27,28,30,31,32,34,37,38,41,44,45,47,48,49,50"
[3] "5,9,16,24,35,36,42"
[4] "8,39"
[5] "NA"
然后我想访问每个级别的元素,并将它们存储为数字。但是,例如,
>as.numeric(cat(levels(tmp)[3]))
5,9,16,24,35,36,42numeric(0)
你能帮我删除数字中的逗号和最后的数字(0)吗?我想有一个数字5,9,16,24,35,36,42的向量,以便我可以使用它们作为索引来访问数据帧。谢谢!
答案 0 :(得分:3)
您需要使用unlist
,strsplit
和unique
的组合。
首先,重新创建您的数据:
dat <- read.table(text="
NA
1,2,3,6,11,12,13,18,20,21,22,26,29,33,40,43,46
NA
NA
5,9,16,24,35,36,42
4,7,10,14,15,17,19,23,25,27,28,30,31,32,34,37,38,41,44,45,47,48,49,50
8,39")$V1
接下来,在使用unique
后找到所有strsplit
级别:
sort(unique(unlist(
sapply(levels(dat), function(x)unlist(strsplit(x, split=",")))
)))
[1] "1" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "2" "20" "21" "22" "23" "24" "25" "26"
[20] "27" "28" "29" "3" "30" "31" "32" "33" "34" "35" "36" "37" "38" "39" "4" "40" "41" "42" "43"
[39] "44" "45" "46" "47" "48" "49" "5" "50" "6" "7" "8" "9"
答案 1 :(得分:2)
这样做你想要的吗?
levels_split <- strsplit(levels(tmp), ",")
lapply(levels_split, as.numeric)
答案 2 :(得分:0)
使用Andrie's
dat
val <- scan(text=levels(dat),sep=",")
#Read 50 items
split(val,cumsum(c(T,diff(val) <0)))
#$`1`
#[1] 1 2 3 6 11 12 13 18 20 21 22 26 29 33 40 43 46
#$`2`
#[1] 4 7 10 14 15 17 19 23 25 27 28 30 31 32 34 37 38 41 44 45 47 48 49 50
#$`3`
#[1] 5 9 16 24 35 36 42
#$`4`
#[1] 8 39