I have a data frame DF1 with two columns, as shown below:
Id delq
1 114321134522
2 220033445576
3 554721100333
4 776234167521
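I want to create a third column that captures the highest digit found in each value of the delq field. So I need something like the following: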
Id delq flag
1 114321134522 5
2 220033445576 7
3 554421160333 6
4 776234169521 9
In addition, I want to create multiple columns, each capturing one digit of this number, as shown below:
Id Delq flag1 flag2 flag3 flag4 ...so on
1 114321134522 1 1 4 3 ....
2 220033445576 2 2 0 0...
3 554421160333 5 5 4 4...
4 776234169521 7 7 6 2
I can't find a way to do this.
Answer 0 (score: 4)
I'd suggest data.table::tstrsplit for both tasks, because it lets you do this easily:
library(data.table)
# First question
do.call(pmax.int, tstrsplit(DF1$delq, "", type.convert = TRUE, fixed = TRUE))
## [1] 5 7 6 9
## Or you could compare digits while they are characters
## because ASCII for 0:9 is in increasing order
as.integer(do.call(pmax.int, tstrsplit(DF1$delq, "", fixed = TRUE)))
## [1] 5 7 6 9
## Second question
setDT(DF1)[, tstrsplit(delq, "", type.convert = TRUE, fixed = TRUE)]
# V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12
# 1: 1 1 4 3 2 1 1 3 4 5 2 2
# 2: 2 2 0 0 3 3 4 4 5 5 7 6
# 3: 5 5 4 4 2 1 1 6 0 3 3 3
# 4: 7 7 6 2 3 4 1 6 9 5 2 1
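The two steps above can be combined so that the results end up as columns of the table. The following is only a sketch under some assumptions: it uses the delq values from the question's expected output, stores delq as character, and the column names flag and flag1 ... flag12 are chosen here for illustration.

```r
library(data.table)

# delq values taken from the question's expected output; stored as character
# (an assumption of this sketch) so that no digit is lost to number formatting
DF1 <- data.frame(
  Id   = 1:4,
  delq = c("114321134522", "220033445576", "554421160333", "776234169521"),
  stringsAsFactors = FALSE
)
setDT(DF1)

# One integer column per digit position
digits <- DF1[, tstrsplit(delq, "", fixed = TRUE, type.convert = TRUE)]
setnames(digits, paste0("flag", seq_along(digits)))

# Row-wise maximum across the digit columns, added by reference
DF1[, flag := do.call(pmax.int, digits)]

# Place the per-digit columns next to Id, delq and flag
result <- cbind(DF1, digits)
print(result)
```

Keeping the digit columns in a separate table until the end means the maximum is computed only over the digits and never over Id.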
Answer 1 (score: 2)
We can split 'delq' into individual characters, convert them to numeric, and take the max value:
sapply(strsplit(as.character(DF1$delq), ""), function(x) max(as.numeric(x)))
To capture the individual digits, use strsplit, convert each list element to numeric, and rbind the list elements:
res <- do.call(rbind, lapply(strsplit(as.character(DF1$delq), ""), as.numeric))
# res is a matrix, so set its column names rather than names()
colnames(res) <- paste0("Flag", seq_len(ncol(res)))
cbind(DF1, res)
Alternatively, we can use read.fwf:
cbind(DF1, read.fwf(textConnection(as.character(DF1$delq)),
widths= rep(1, max(nchar(DF1$delq)))))
# Id delq V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12
#1 1 114321134522 1 1 4 3 2 1 1 3 4 5 2 2
#2 2 220033445576 2 2 0 0 3 3 4 4 5 5 7 6
#3 3 554721100333 5 5 4 7 2 1 1 0 0 3 3 3
#4 4 776234167521 7 7 6 2 3 4 1 6 7 5 2 1
As @DavidArenburg mentioned, the max can also be taken while the digits are still characters, converting only the result:
as.integer(sapply(strsplit(as.character(DF1$delq), ""), max))
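For reference, the base R pieces above could be put together roughly as follows. This is a sketch, not a definitive implementation: it assumes delq is stored as character, that every value has the same number of digits, and the column names flag and flag1 ... flag12 are chosen here for illustration. It uses the input data from the top of the question.

```r
# Input data from the top of the question; delq kept as character (assumption)
DF1 <- data.frame(
  Id   = 1:4,
  delq = c("114321134522", "220033445576", "554721100333", "776234167521"),
  stringsAsFactors = FALSE
)

# Split every delq value into its single characters
chars <- strsplit(DF1$delq, "")

# Highest digit per row, compared as characters and converted afterwards
DF1$flag <- as.integer(sapply(chars, max))

# One column per digit position; rbind needs all values to have equal length
digits <- do.call(rbind, lapply(chars, as.integer))
colnames(digits) <- paste0("flag", seq_len(ncol(digits)))

result <- cbind(DF1, digits)
print(result)
```

If the delq values could differ in length, read.fwf with widths based on the longest value may be the safer route, since it pads short rows with NA.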