Extract numbers from a column and create new rows

Date: 2018-11-26 00:38:37

Tags: r

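I have a dataset in which a single column holds multiple latitude and longitude points, alongside columns containing other variables, for example: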

[Image: what the data currently looks like]

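What I would like to do is extract the numbers two at a time (i.e. 144.81803494458699788 and -37.80978699721590175, then 144.8183146450259926 and -37.80819285880839686) into their own rows. Each new row should also copy the rest of the original row, i.e.: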

[Image: what I would like the data to look like]

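I am very new to R, so this may be a basic question for most of you. Update: I have now used

new$latlongs <- str_extract_all(roadchar$X.wkt_geom, "(?>-)*[0-9]+\\.[0-9]+")

which extracts the numbers (the longitudes and latitudes), including the negative sign :)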
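One way to go from the extracted numbers to one row per coordinate pair is to fill a two-column matrix row by row. The following is a minimal base-R sketch, not the asker's code. The data frame `roads`, its column names, and the shortened coordinates are made up for illustration. It also assumes every geometry string holds an even count of numbers in longitude, latitude order.

```r
# Hypothetical sample data: one WKT-like string per row plus one other column.
roads <- data.frame(
  wkt_geom = c("MultiLineString ((144.8180 -37.8097, 144.8183 -37.8081))",
               "MultiLineString ((144.9001 -37.7002, 144.9005 -37.7008, 144.9011 -37.7015))"),
  road_type = c("local", "arterial"),
  stringsAsFactors = FALSE
)

# Pull every signed decimal number out of each string. This is a base-R
# counterpart to the str_extract_all() call, using the simpler pattern
# "-?[0-9]+\\.[0-9]+" (assumption: numbers always contain a decimal point).
nums <- regmatches(roads$wkt_geom,
                   gregexpr("-?[0-9]+\\.[0-9]+", roads$wkt_geom))

# For each original row, pair the numbers up and repeat the other column.
pieces <- lapply(seq_along(nums), function(i) {
  xy <- matrix(as.numeric(nums[[i]]), ncol = 2, byrow = TRUE)
  data.frame(lon = xy[, 1], lat = xy[, 2],
             road_type = roads$road_type[i],
             stringsAsFactors = FALSE)
})

# Bind the per-row pieces into one long data frame.
long_roads <- do.call(rbind, pieces)
```

With real data, the remaining columns could be carried over the same way as `road_type` here, or by indexing the original data frame with repeated row numbers.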

1 answer:

Answer 0 (score: 0)

You can use a loop that combines gsub and strsplit:

## The data.frame
df <- data.frame ("Polyline" = c("MultiLineString((1.1 - 1.1, 2.2 - 2.2))",
                                 "MultiLineString((3.3 - 3.3, 4.4 - 4.4, 5.5 - 5.5))"),
                  t(matrix(c(LETTERS[c(1:3,24:26)]), 3,
                    dimnames = list(c("Char1", "Char2", "Char3")))),
                  stringsAsFactors = FALSE)


#                                             Polyline Char1 Char2 Char3
# 1            MultiLineString((1.1 - 1.1, 2.2 - 2.2))     A     B     C
# 2 MultiLineString((3.3 - 3.3, 4.4 - 4.4, 5.5 - 5.5))     X     Y     Z

## Function for splitting the line
split.polyline <- function(line, df) {
    ## Removing the text and brackets
    cleaned_line <- gsub("\\)\\)", "", gsub("MultiLineString\\(\\(", "", as.character(df$Polyline[line])))
    ## Splitting the line
    split_line <- strsplit(cleaned_line, split = ", ")[[1]]
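    ## Making the line into a data.frame: one row per coordinate pair,
    ## repeating the remaining columns of the original row
    ## (indexing with rep(line, ...) keeps them as plain vectors, not list columns)
    df_out <- data.frame("Polyline" = split_line,
                         df[rep(line, length(split_line)), -1, drop = FALSE],
                         row.names = NULL,
                         stringsAsFactors = FALSE)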
    return(df_out)
}


## You can use the function like this for the first row for example
df_out <- split.polyline(1, df)
#    Polyline Char1 Char2 Char3
# 1 1.1 - 1.1     A     B     C
# 2 2.2 - 2.2     A     B     C

## Or loop through all the rows
for(line in 2:nrow(df)){
    df_out <- rbind(df_out, split.polyline(line, df))
}
#    Polyline Char1 Char2 Char3
# 1 1.1 - 1.1     A     B     C
# 2 2.2 - 2.2     A     B     C
# 3 3.3 - 3.3     X     Y     Z
# 4 4.4 - 4.4     X     Y     Z
# 5 5.5 - 5.5     X     Y     Z
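
The loop above starts at row 2, so it relies on the first row having been processed beforehand and on `df` having at least two rows. A hedged alternative is to build all the pieces with `lapply` and bind them once. In this sketch, `expand_row` and `df_long` are hypothetical names introduced for illustration and are not part of the original answer. The toy data is rebuilt so the block runs on its own.

```r
# Same toy data as in the answer, rebuilt here so the block is self-contained.
df <- data.frame(Polyline = c("MultiLineString((1.1 - 1.1, 2.2 - 2.2))",
                              "MultiLineString((3.3 - 3.3, 4.4 - 4.4, 5.5 - 5.5))"),
                 Char1 = c("A", "X"), Char2 = c("B", "Y"), Char3 = c("C", "Z"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: expand row i of `data` into one row per coordinate pair.
expand_row <- function(i, data) {
  # Strip the "MultiLineString((" prefix and the "))" suffix
  cleaned <- gsub("^MultiLineString\\(\\(|\\)\\)$", "", data$Polyline[i])
  # Split the remaining text into individual pairs
  pairs <- strsplit(cleaned, split = ", ", fixed = TRUE)[[1]]
  # Repeat the remaining columns of row i once per pair
  data.frame(Polyline = pairs,
             data[rep(i, length(pairs)), -1, drop = FALSE],
             row.names = NULL, stringsAsFactors = FALSE)
}

# seq_len(nrow(df)) covers every row, including the single-row case where
# 2:nrow(df) would count backwards (2, 1).
df_long <- do.call(rbind, lapply(seq_len(nrow(df)), expand_row, data = df))
```

Binding once at the end also avoids copying the growing data frame on every iteration, which can matter when there are many rows.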