How to split a string after the nth character in R

Time: 2020-02-05 21:00:22

Tags: r string split data-management

I am working with the following data:

District <- c("AR01", "AZ03", "AZ05", "AZ08", "CA01", "CA05", "CA11", "CA16", "CA18", "CA21")

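I would like to split each string after the second character and put the two parts into separate columns.

So the data should look like this: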

state  district
AR        01
AZ        03
AZ        05
AZ        08
CA        01
CA        05
CA        11
CA        16
CA        18
CA        21

Is there simple code to accomplish this? Thank you very much for your help.

5 Answers:

Answer 0 (score: 6)

If you always want to split after the second character, you can use substr:

District <- c("AR01", "AZ03", "AZ05", "AZ08", "CA01", "CA05", "CA11", "CA16", "CA18", "CA21")
# state: characters 1 through 2 of each string
state <- substr(District,1,2)
# district: characters 3 through 4 of each string
district <- substr(District,3,4)
# combine into a data frame if needed
st_dt <- data.frame(state = state, district = district, stringsAsFactors = FALSE)
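
The same idea can be generalized to an arbitrary split position n. The following is a minimal sketch, not part of the original answer; the helper name split_after_n and the column names left and right are hypothetical.

```r
# Hypothetical helper: split each string after its n-th character.
split_after_n <- function(x, n) {
  data.frame(
    left  = substr(x, 1, n),       # characters 1..n
    right = substring(x, n + 1),   # substring() runs to the end of the string by default
    stringsAsFactors = FALSE
  )
}

District <- c("AR01", "AZ03", "CA21")
st_dt <- split_after_n(District, 2)
print(st_dt)
```

Because substring() needs no explicit end position, the right-hand part may have any length.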

Answer 1 (score: 5)

You can use strcapture in base R, with a pattern in which each of the two capture groups matches two word characters.
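
A minimal sketch of how strcapture could be applied here. The exact pattern and the column names in the prototype data frame are assumptions, not necessarily those used in the answer.

```r
District <- c("AR01", "AZ03", "CA21")

# Assumed pattern: two capture groups of two word characters each.
# The prototype data frame supplies the column names and types of the result.
st_dt <- strcapture(
  "^(\\w{2})(\\w{2})$",
  District,
  proto = data.frame(state = character(), district = character(),
                     stringsAsFactors = FALSE)
)
print(st_dt)
```

Declaring district as character in the prototype keeps leading zeros such as "01" intact.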

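Answer 2 (score: 2)

The OP has written:

I am more familiar with strsplit(). But since there is nothing to split on, it is not applicable in this case.

Au contraire! There is something to split on, and it is called a lookbehind: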

strsplit(District, "(?<=[A-Z]{2})", perl = TRUE) 

The lookbehind works like inserting an invisible break after 2 capital letters, and the string is split there.

The result is a list of vectors:

[[1]]
[1] "AR" "01"

[[2]]
[1] "AZ" "03"

[[3]]
[1] "AZ" "05"

[[4]]
[1] "AZ" "08"

[[5]]
[1] "CA" "01"

[[6]]
[1] "CA" "05"

[[7]]
[1] "CA" "11"

[[8]]
[1] "CA" "16"

[[9]]
[1] "CA" "18"

[[10]]
[1] "CA" "21"

which can be turned into a matrix, e.g., by

do.call(rbind, strsplit(District, "(?<=[A-Z]{2})", perl = TRUE))
      [,1] [,2]
 [1,] "AR" "01"
 [2,] "AZ" "03"
 [3,] "AZ" "05"
 [4,] "AZ" "08"
 [5,] "CA" "01"
 [6,] "CA" "05"
 [7,] "CA" "11"
 [8,] "CA" "16"
 [9,] "CA" "18"
[10,] "CA" "21"
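
To get the two named columns asked for in the question, the matrix can be wrapped in a data frame. A small sketch, assuming every string splits into exactly two pieces; the names parts and st_dt are chosen here for illustration.

```r
District <- c("AR01", "AZ03", "CA21")

# Split after the two leading capital letters, then bind the pieces row-wise.
parts <- do.call(rbind, strsplit(District, "(?<=[A-Z]{2})", perl = TRUE))

# Assumes each row has exactly two pieces; rbind() would otherwise recycle values.
st_dt <- data.frame(state = parts[, 1], district = parts[, 2],
                    stringsAsFactors = FALSE)
print(st_dt)
```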

Answer 3 (score: 1)

We can capture the first two characters and the rest of the string in separate columns.
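
One way to sketch this capture idea in base R uses sub() with two capture groups. The choice of sub() and the pattern below are assumptions made for illustration and may differ from the function the answer had in mind.

```r
District <- c("AR01", "AZ03", "CA21")

# Assumed pattern: group 1 = first two characters, group 2 = everything after them.
pattern <- "^(.{2})(.*)$"

st_dt <- data.frame(
  state    = sub(pattern, "\\1", District),  # keep only group 1
  district = sub(pattern, "\\2", District),  # keep only group 2
  stringsAsFactors = FALSE
)
print(st_dt)
```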

Answer 4 (score: 0)

Treat it as a fixed-width file and import it:

# read fixed width file
read.fwf(textConnection(District), widths = c(2, 2), colClasses = "character")
#    V1 V2
# 1  AR 01
# 2  AZ 03
# 3  AZ 05
# 4  AZ 08
# 5  CA 01
# 6  CA 05
# 7  CA 11
# 8  CA 16
# 9  CA 18
# 10 CA 21
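
A variation on the call above that names the columns directly through the col.names argument of read.fwf. The column names are taken from the question, and the connection is closed explicitly; treat this as a sketch rather than the answer's own code.

```r
District <- c("AR01", "AZ03", "CA21")

# Read the vector as a fixed-width file: two fields of width 2 each.
con <- textConnection(District)
st_dt <- read.fwf(con, widths = c(2, 2), colClasses = "character",
                  col.names = c("state", "district"))
close(con)
print(st_dt)
```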