I am trying to add a column holding the number that belongs to each entry in the data column.
My data looks like this:
df <- structure(list(data = structure(c(1L, 1L, 1L, 1L, 1L, 3L, 3L,
4L, 4L, 5L, 5L, 6L, 5L, 7L, 7L, 8L, 8L, 2L, 2L, 2L), .Label = c("data1",
"data10", "data2", "data3", "data4", "data5", "data6", "data7"
), class = "factor"), values = structure(c(3L, 8L, 18L, 1L, 15L,
17L, 19L, 7L, 2L, 2L, 11L, 10L, 6L, 4L, 9L, 12L, 14L, 5L, 13L,
16L), .Label = c("112864.443", "11319531", "12874.443", "142983324",
"1612410048", "16349475.63", "184901841", "2223793.8", "30553282.01",
"312004.547", "3135868.44", "317403612.9", "3686081.063", "43701608",
"623793.8", "64959501.42", "67666215", "767666215", "775987137.8"
), class = "factor")), .Names = c("data", "values"), class = "data.frame", row.names = c(NA,
-20L))
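I want to get the exact number that follows "data" in the first column. Because the numbers are not consecutive, I don't know how to add them as a separate column. The desired output should look like this: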
data values
data1 12874.443 1
data1 2223793.8 1
data1 767666215 1
data1 112864.443 1
data1 623793.8 1
data2 67666215 2
data2 775987137.8 2
data3 184901841 3
data3 11319531 3
data4 11319531 4
data4 3135868.44 4
data5 312004.547 5
data4 16349475.63 4
data6 142983324 6
data6 30553282.01 6
data7 317403612.9 7
data7 43701608 7
data10 1612410048 10
data10 3686081.063 10
data10 64959501.42 10
Answer 0 (score: 0)
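One way is to use gsub to remove every non-digit character and add the result as another column (gsub returns character strings, so wrap the call in as.numeric if a numeric column is needed):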
df$label <- gsub("[^[:digit:]]", "", df$data)
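As a rough sketch of the numeric conversion, the snippet below applies the same pattern to a small made-up factor rather than the full df. The names labels, digits, and label_num are illustrative only and do not come from the original answer.

```r
# Small stand-in for the `data` column (a factor, as in the question)
labels <- factor(c("data1", "data2", "data10"))

# gsub() works on the factor's character values and strips every non-digit
digits <- gsub("[^[:digit:]]", "", labels)

# The result is a character vector; convert it to numbers explicitly
label_num <- as.numeric(digits)

print(digits)
print(label_num)
```

With df from the question, the same idea would presumably be written as df$label <- as.numeric(gsub("[^[:digit:]]", "", df$data)).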
Another way is to use str_extract from the stringr package, as in the question "R: split character data into numbers and letters":
library(stringr)
df$label <- as.numeric(str_extract(df$data, "[0-9]+"))
> df
# data values label
# 1 data1 12874.443 1
# 2 data1 2223793.8 1
# 3 data1 767666215 1
# 4 data1 112864.443 1
# 5 data1 623793.8 1
# 6 data2 67666215 2
# 7 data2 775987137.8 2
# 8 data3 184901841 3
# 9 data3 11319531 3
# 10 data4 11319531 4
# 11 data4 3135868.44 4
# 12 data5 312004.547 5
# 13 data4 16349475.63 4
# 14 data6 142983324 6
# 15 data6 30553282.01 6
# 16 data7 317403612.9 7
# 17 data7 43701608 7
# 18 data10 1612410048 10
# 19 data10 3686081.063 10
# 20 data10 64959501.42 10
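The two approaches give the same result for the labels in the question, but they can differ when a label contains more than one run of digits. The sketch below illustrates this with hypothetical labels that are not part of the question's data. To stay free of package dependencies it uses base R's regexpr and regmatches as a stand-in for str_extract, which is an assumption made for illustration. Unlike str_extract, regmatches drops elements with no match instead of returning NA.

```r
# Hypothetical labels, not taken from the question's data
x <- c("data1", "data10", "data1_v2")

# Approach 1: remove all non-digits; separate digit runs are glued together
all_digits <- gsub("[^[:digit:]]", "", x)

# Approach 2 (base R stand-in for str_extract): keep only the first digit run
first_run <- regmatches(x, regexpr("[0-9]+", x))

print(all_digits)  # "1" "10" "12"
print(first_run)   # "1" "10" "1"
```

If labels may carry extra digits, for example a version suffix, extracting the first run is likely the safer choice.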