Question

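I have many text files containing pseudo-coordinates in the format [x1 y1] [x2 y2] ... I am trying to import these files into R so that I can analyze them. However, when I import them with read.table, they become a list with two variables (x and y), where each value is "[x" or "y]" and each variable is a factor with multiple levels. Is there a way to import or manipulate the data so that it becomes a data frame of plain numeric x and y values?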

I tried using substr() to remove the "[" and "]" characters, but got the following error message:
"Error in nchar(test[1, 2]) : 'nchar()' requires a character vector"

Answer 1

Let's assume this is the input file, located in your working directory and named "fil.txt":

[5 6][7 8][9 10]
[5 6][7 8][9 10]
[5 6][7 8][9 10]

Then you can use readLines, remove the "][" pairs as well as the leading "[" and trailing "]" from each line, and then use scan to read the paired values:

x <- "[5 6][7 8][9 10]
[5 6][7 8][9 10]
[5 6][7 8][9 10]"

scan(text = gsub("(^\\[)|(\\]$)", "", gsub("\\]\\[", " ", readLines(textConnection(x)))), what = list(numeric(), numeric()))
Read 9 records
[[1]]
[1] 5 7 9 5 7 9 5 7 9

[[2]]
[1]  6  8 10  6  8 10  6  8 10

# I later realized the pattern could just be "\\[|\\]" in a single gsub(),
# as long as the replacement is " " rather than "" (otherwise "6][7" would collapse to "67")

as.data.frame( .Last.value, col.names=c("x","y") )
  x  y
1 5  6
2 7  8
3 9 10
4 5  6
5 7  8
6 9 10
7 5  6
8 7  8
9 9 10
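
Since the question mentions many files, the steps above can be wrapped in a small helper. This is only a sketch: the name read_pairs is hypothetical (it is not part of the original answer), it uses the single-gsub() variant with a space as the replacement, and it assumes every line holds complete "[x y]" pairs. The demo writes a temporary file with the same contents as "fil.txt" so that it runs on its own.

```r
# Hypothetical helper (not from the original answer): read one file of
# "[x y][x y]..." pairs into a numeric data frame with columns x and y.
read_pairs <- function(path) {
  lines <- readLines(path)
  # Replace every bracket with a space so neighbouring pairs stay separated.
  cleaned <- gsub("\\[|\\]", " ", lines)
  # Naming the list elements makes scan() return components called x and y.
  vals <- scan(text = cleaned,
               what = list(x = numeric(), y = numeric()),
               quiet = TRUE)
  as.data.frame(vals)
}

# Demo on a temporary file with the same contents as "fil.txt" above.
tmp <- tempfile(fileext = ".txt")
writeLines(rep("[5 6][7 8][9 10]", 3), tmp)
coords <- read_pairs(tmp)
str(coords)

# For many files, apply the helper to each path and stack the results.
# The file pattern below is an assumption about how the files are named.
# files <- list.files(pattern = "\\.txt$")
# all_coords <- do.call(rbind, lapply(files, read_pairs))
```

Because brackets are replaced by spaces rather than removed, the same helper should also cope with input where the pairs are already separated by spaces, as in the "[x1 y1] [x2 y2]" format from the question.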
