Question

I have text files that contain lines like the following (or similar):

178487\ASF=-873.1421319\NFGH=540.56201\PG=C01

How can I use R to extract the value that follows \ASF=?

I have started with a loop over the files that calls grep and then sub with the pattern [0-9]+$.

Answer 1

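You want to extract the number that follows \ASF=, here -873.1421319, from a string such as: 178487\ASF=-873.1421319\NFGH=540.56201\PG=C01

The pattern you used, [0-9]+$, is incorrect for several reasons:

$ matches the end of the string. This pattern matches 01 in the example string, because that is the run of digits at the end.
The pattern [0-9]+ matches a non-empty sequence of digits. It does not cover - or ., so it cannot match a signed decimal number.

You therefore need to drop the $ and extend the pattern to allow for - and ., for example: -?[0-9]+(\\.[0-9]+)?.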
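To make the two points above concrete, here is a minimal sketch in base R. The helper first_match is a hypothetical name introduced only for this illustration, and the example string is the one quoted in this answer.

```r
# Example line; a literal backslash is written as \\ inside an R string
s <- "178487\\ASF=-873.1421319\\NFGH=540.56201\\PG=C01"

# Hypothetical helper: return the first match of a pattern in x
first_match <- function(pattern, x) regmatches(x, regexpr(pattern, x))

# Anchored to the end of the string: picks up the trailing digits of "C01"
end_digits <- first_match("[0-9]+$", s)

# Signed decimal pattern without any context: picks up the first number in the line
first_number <- first_match("-?[0-9]+(\\.[0-9]+)?", s)

print(end_digits)
print(first_number)
```

Neither result is the ASF value, which is why the pattern also has to say where in the line the number sits.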

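However, this is still not enough, because you only want the number that comes after \ASF=, without including \ASF= itself in the match. To do that, you need a positive lookbehind: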

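library(stringr)
# Example line; a literal backslash is written as \\ inside an R string
s <- "178487\\ASF=-873.1421319\\NFGH=540.56201\\PG=C01"
str_extract(s, '(?<=\\\\ASF=)-?[0-9]+(\\.[0-9]+)?')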
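If stringr is not available, the same lookbehind can be used in base R by passing perl = TRUE. The sketch below is one possible way to do it, not the only one. It also converts the result to numeric and leaves NA for lines that have no ASF field. The second input line is made up for the illustration.

```r
lines <- c("178487\\ASF=-873.1421319\\NFGH=540.56201\\PG=C01",
           "178488\\NFGH=12.5\\PG=C02")  # made-up line without an ASF field

# Lookbehind requires the PCRE engine, hence perl = TRUE
pattern <- "(?<=\\\\ASF=)-?[0-9]+(\\.[0-9]+)?"
m <- regexpr(pattern, lines, perl = TRUE)

# regexpr returns -1 where nothing matched; keep NA in those positions
asf <- rep(NA_real_, length(lines))
asf[m > 0] <- as.numeric(regmatches(lines, m))

print(asf)
```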

Answer 2

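library(stringr)
# pattern is a regular expression, not a glob
file_list <- list.files(pattern = "\\.txt$")
for (i in seq_along(file_list)) {
    mydataFrame <- readLines(file_list[i])
    for (line in mydataFrame) {
        # Split on the literal backslash; the second field is "ASF=<value>"
        elems <- unlist(strsplit(line, split = "\\\\"))
        # Optional sign, digits, optional fractional part
        # Note: value is overwritten on each pass; store it if all results are needed
        value <- as.numeric(str_extract(elems[2], "[+-]?[0-9]+\\.?[0-9]*"))
    }
}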

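First, the string is split on \; the second field holds ASF and its associated value. You can then use str_extract to pull out the numeric part directly.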
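As a small worked example of the split step, the sketch below shows what the fields look like for the sample line. Looking the field up by its ASF= prefix rather than by position is a variation added here, on the assumption that the field order might differ between lines.

```r
line <- "178487\\ASF=-873.1421319\\NFGH=540.56201\\PG=C01"

# Split on the literal backslash (the regex \\ written as an R string)
elems <- strsplit(line, split = "\\\\")[[1]]
print(elems)

# Select the field by its prefix instead of assuming it is always second
asf_field <- elems[startsWith(elems, "ASF=")]
asf <- as.numeric(sub("^ASF=", "", asf_field))
print(asf)
```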

Answer 3

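Edited to show the full code

When you loop over multiple files, you need to do something to keep the values from being overwritten on each iteration. One option is to use a list. If you would rather have the results as a single vector, you can use c instead.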

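# pattern is a regular expression, not a glob
file_list <- list.files(pattern = "\\.txt$")
# Initialise empty list
value <- list()
for (i in seq_along(file_list)) {
  mydataFrame <- readLines(file_list[i])
  # Capture the number after ASF= (optional sign, optional fractional part)
  value[[i]] <- as.numeric(sub(".*ASF=(-?[0-9]+\\.?[0-9]*).*$", "\\1", mydataFrame))
}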

Note that you do not need to call grep and then sub. sub alone is enough.

I tested this on a folder containing two text files:

> value
[[1]]
[1] -873.1421 -823.1421 -813.1421

[[2]]
[1] -573.1421 -223.1421 -713.1421
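The list-versus-vector choice mentioned above can be sketched end to end as follows. The file names, the temporary folder, the sample lines and the helper extract_asf are all hypothetical and exist only to make the example self-contained. lapply is used here in place of the explicit loop.

```r
# Hypothetical test data: two small files in a temporary folder
demo_dir <- file.path(tempdir(), "asf_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("178487\\ASF=-873.1421319\\NFGH=540.56201\\PG=C01",
             "178488\\ASF=-823.1421319\\NFGH=541.10000\\PG=C01"),
           file.path(demo_dir, "a.txt"))
writeLines("178489\\ASF=-573.1421319\\NFGH=542.00000\\PG=C02",
           file.path(demo_dir, "b.txt"))

# Hypothetical helper: read one file and return its ASF values
extract_asf <- function(path) {
  lines <- readLines(path)
  as.numeric(sub(".*ASF=(-?[0-9]+\\.?[0-9]*).*$", "\\1", lines))
}

file_list <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# One numeric vector per file, kept in a list named after the files
value_list <- lapply(file_list, extract_asf)
names(value_list) <- basename(file_list)

# Everything flattened into a single vector
value_vec <- unlist(value_list, use.names = FALSE)

print(value_list)
print(value_vec)
```

Keeping the list preserves which file each value came from, while the flattened vector is more convenient for summary statistics.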

Extracting numeric values from text

3 answers: