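I have a dataset that holds multiple latitude and longitude points in a single column, alongside columns containing other variables, for example:
What the current data looks like
What I would like to do is extract each pair of numbers (i.e. 144.81803494458699788 and -37.80978699721590175, then 144.8183146450259926 and -37.80819285880839686)
into its own row. Each new row would also copy the rest of the original row, i.e.
What I would like the data to look like
I am very new to R, so this may be a basic question for most of you. Update: I have now used
new$latlongs <- str_extract_all(roadchar$X.wkt_geom, "(?>-)*[0-9]+\\.[0-9]+")
and managed to extract the numbers/longitudes with their negative signs included :)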
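For anyone following along without stringr, here is a rough base-R sketch of the same extraction step, followed by pairing the numbers up as longitude/latitude. The `wkt` string is a made-up stand-in for one value of `roadchar$X.wkt_geom` (the real format is only visible in the screenshots), and the simpler pattern `-?[0-9]+\\.[0-9]+` is my assumption about what the numbers look like.

```r
# Hypothetical example value; the real data lives in roadchar$X.wkt_geom
wkt <- "MultiLineString ((144.818 -37.809, 144.819 -37.808))"

# Extract every decimal number, keeping an optional leading minus sign
nums <- as.numeric(regmatches(wkt, gregexpr("-?[0-9]+\\.[0-9]+", wkt))[[1]])

# Assumes the numbers alternate longitude, latitude, longitude, latitude, ...
coords <- matrix(nums, ncol = 2, byrow = TRUE,
                 dimnames = list(NULL, c("lon", "lat")))
coords
```

Each row of `coords` is then one point, which is the shape needed before copying the other columns across.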
Answer 0: (score: 0)
You can use a loop that combines gsub and strsplit:
## The data.frame
df <- data.frame("Polyline" = c("MultiLineString((1.1 - 1.1, 2.2 - 2.2))",
                                "MultiLineString((3.3 - 3.3, 4.4 - 4.4, 5.5 - 5.5))"),
                 t(matrix(c(LETTERS[c(1:3, 24:26)]), 3,
                          dimnames = list(c("Char1", "Char2", "Char3")))),
                 stringsAsFactors = FALSE)
#                                             Polyline Char1 Char2 Char3
# 1            MultiLineString((1.1 - 1.1, 2.2 - 2.2))     A     B     C
# 2 MultiLineString((3.3 - 3.3, 4.4 - 4.4, 5.5 - 5.5))     X     Y     Z
## Function for splitting the line
split.polyline <- function(line, df) {
  ## Removing the text and brackets
  cleaned_line <- gsub("\\)\\)", "", gsub("MultiLineString\\(\\(", "", as.character(df$Polyline[line])))
  ## Splitting the line into one string per point
  split_line <- strsplit(cleaned_line, split = ", ")[[1]]
  ## Making the line into a data.frame: one row per point,
  ## with the remaining columns of the original row repeated
  df_out <- data.frame("Polyline" = split_line,
                       df[rep(line, length(split_line)), -1, drop = FALSE],
                       row.names = NULL,
                       stringsAsFactors = FALSE)
  return(df_out)
}
## You can use the function like this for the first row for example
df_out <- split.polyline(1, df)
#    Polyline Char1 Char2 Char3
# 1 1.1 - 1.1     A     B     C
# 2 2.2 - 2.2     A     B     C
## Or loop through all the remaining rows
## (seq_len(...)[-1] is empty when df has a single row, unlike 2:nrow(df))
for (line in seq_len(nrow(df))[-1]) {
  df_out <- rbind(df_out, split.polyline(line, df))
}
#    Polyline Char1 Char2 Char3
# 1 1.1 - 1.1     A     B     C
# 2 2.2 - 2.2     A     B     C
# 3 3.3 - 3.3     X     Y     Z
# 4 4.4 - 4.4     X     Y     Z
# 5 5.5 - 5.5     X     Y     Z
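If the data has many rows, growing `df_out` with `rbind` inside a loop can get slow. As an alternative, here is a sketch of a loop-free version built on the same gsub and strsplit idea, using the same toy data as above. It is just one possible way to do it, and it assumes every value is wrapped in `MultiLineString((...))` with points separated by ", ".

```r
# Same toy data as above, written out column by column
df <- data.frame(
  Polyline = c("MultiLineString((1.1 - 1.1, 2.2 - 2.2))",
               "MultiLineString((3.3 - 3.3, 4.4 - 4.4, 5.5 - 5.5))"),
  Char1 = c("A", "X"), Char2 = c("B", "Y"), Char3 = c("C", "Z"),
  stringsAsFactors = FALSE
)

# Strip the wrapper text, then split each row's string into its points
cleaned <- gsub("^MultiLineString\\(\\(|\\)\\)$", "", df$Polyline)
points  <- strsplit(cleaned, ", ", fixed = TRUE)

# Repeat each original row once per point it contains
expanded <- df[rep(seq_len(nrow(df)), lengths(points)), , drop = FALSE]
expanded$Polyline <- unlist(points)
rownames(expanded) <- NULL
expanded
```

`lengths(points)` gives the number of points per original row, so `rep()` duplicates the other columns exactly as many times as needed before the split points are written back into `Polyline`.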