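I have a data frame, df, in the following format. The elements' data type is character.
Well and Depth
Black Peak 1000
Black Peak 1001
Black Peak 1002
Black Peak 10150
Black Peak 10151
I would like to extract this data into two parts: the last number in the string, and all of the text before the space that precedes that number. Also, when extracting the number, how can I convert the characters into usable integers? I intend to keep the extracted data in a data frame. When done, it would look like this:
Well Depth
Black Peak 1000
Black Peak 1001
Black Peak 1002
Black Peak 10150
Black Peak 10151
Both listings above are data frames.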
Answer 0 (score: 0)
Try str_split() from stringr (https://www.rdocumentation.org/packages/stringr/versions/1.1.0/topics/str_split), then convert the second column to numeric, for example with as.numeric().
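A minimal sketch of this suggestion, assuming the stringr package is installed. Splitting on every space would also break "Black Peak" apart, so the sketch assumes a lookahead pattern that matches only the last space. It uses as.integer() rather than as.numeric() because the question asks for integers. The names v, parts and res are illustrative and not from the original answer.

```r
library(stringr)

# example input, assumed to match the question's data
v <- c("Black Peak 1000", "Black Peak 10150")

# split only at the last space; simplify = TRUE returns a character matrix
parts <- str_split(v, " (?=[^ ]+$)", simplify = TRUE)

# first column is the well name, second column is the depth as integer
res <- data.frame(Well = parts[, 1],
                  Depth = as.integer(parts[, 2]),
                  stringsAsFactors = FALSE)
```

The result can be inspected with str(res) to confirm that Depth is stored as an integer column.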
Answer 1 (score: 0)
Data
# example dataset
df = data.frame(v = c("Black Peak 1000", "Black Peak 1001", "Black Peak 1002",
                      "Black Peak 10150", "Black Peak 10151"),
                stringsAsFactors = FALSE)
Using base R
# split at the last space, bind the pieces as rows and save as a data frame
df2 = data.frame(do.call(rbind, strsplit(df$v, ' (?=[^ ]+$)', perl = TRUE)),
                 stringsAsFactors = FALSE)
# set names
names(df2) = c("Well", "Depth")
# convert Depth from character to numeric
df2$Depth = as.numeric(df2$Depth)
df2
#         Well Depth
# 1 Black Peak  1000
# 2 Black Peak  1001
# 3 Black Peak  1002
# 4 Black Peak 10150
# 5 Black Peak 10151
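As a variation on the base R approach, the two parts can also be pulled out with sub() and the depth stored as an integer, which is what the question asks for. This is only a sketch: it assumes every string ends in a space followed by digits, and the names v, well, depth and out are illustrative. as.integer() returns NA with a warning for values above .Machine$integer.max, so as.numeric() may be the safer choice for very large depths.

```r
# example input, assumed to match the question's data
v <- c("Black Peak 1000", "Black Peak 10150")

# drop the final space and the trailing digits to get the well name
well <- sub(" [0-9]+$", "", v)

# keep only the trailing digits and convert them to integer
depth <- as.integer(sub("^.* ([0-9]+)$", "\\1", v))

out <- data.frame(Well = well, Depth = depth, stringsAsFactors = FALSE)
```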
Or using a tidyverse approach
library(tidyverse)
df %>%
  separate(v, sep = ' (?=[^ ]+$)', into = c("Well", "Depth")) %>%
  mutate(Depth = as.numeric(Depth))
#         Well Depth
# 1 Black Peak  1000
# 2 Black Peak  1001
# 3 Black Peak  1002
# 4 Black Peak 10150
# 5 Black Peak 10151
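A possible shortcut, sketched here under the assumption that tidyr is installed: separate() accepts a convert argument that runs type.convert() on the new columns, so the explicit mutate() step can be dropped and Depth comes back as an integer. The object name out is illustrative and the example data is shortened.

```r
library(tidyr)

# example input, assumed to match the question's data
df <- data.frame(v = c("Black Peak 1000", "Black Peak 10150"),
                 stringsAsFactors = FALSE)

# split at the last space and let tidyr infer the column types
out <- separate(df, v, into = c("Well", "Depth"),
                sep = " (?=[^ ]+$)", convert = TRUE)
```

Well stays a character column because its values cannot be parsed as numbers, while Depth is converted automatically.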