After a long morning, I give up!
I have the following text file: StationLog.txt
It contains the following:
Version = 2.0
StationName = STN67_P70
BeginTime = 2017-10-06.03:25:00
EndTime = 2017-10-06.03:55:00
IgnoreNo = 5000
PumpedVolume = 0
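I need to extract BeginTime, EndTime and StationName. Keeping those names as headers is fine, since their values feed into another piece of code.
The idea is that I should not have to do this by hand, because more of these files will keep arriving over time.
So far, following various other guides, I have got this far: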
a <- read.fwf("StationLog.txt", c(37,100), stringsAsFactors=FALSE)
a <- a[grep("=", a$V1), ]
a <- cbind(
do.call( rbind, strsplit(a$V1, "=\\s+") )
But I have hit a wall. Any help would be greatly appreciated!
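Answer 0 (score: 2)
The read.table function has arguments that let you do what you want.
If each line contains only one = and BeginTime, EndTime and StationName are spelled the same way in every file, the following suggestion will work: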
read.table(
  file = "StationLog.txt",
  header = FALSE,            # the file has no column names
  sep = "=",                 # separator character
  strip.white = TRUE,        # strip leading and trailing whitespace from each field
  row.names = 1,             # the first column contains the row names
  stringsAsFactors = FALSE
)[c("BeginTime", "EndTime", "StationName"),  # extract the three entries by row name
  , drop = FALSE]            # keep the data.frame format
Result:
V2
BeginTime 2017-10-06.03:25:00
EndTime 2017-10-06.03:55:00
StationName STN67_P70
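Since the values feed into other code, it may help to pull them out as plain variables and parse the timestamps. The following is only a minimal sketch. It assumes the timestamps always follow the `YYYY-MM-DD.HH:MM:SS` layout shown above and that they can be treated as UTC (the file does not state a time zone, so that part is a guess). The names `info`, `station_name`, `begin_time` and `end_time` are illustrative, and the temporary file merely stands in for StationLog.txt.

```r
# Illustrative only: recreate the sample file in a temporary location
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Version = 2.0",
  "StationName = STN67_P70",
  "BeginTime = 2017-10-06.03:25:00",
  "EndTime = 2017-10-06.03:55:00",
  "IgnoreNo = 5000",
  "PumpedVolume = 0"
), path)

# Same read.table call as above; keys become row names, values land in column V2
info <- read.table(
  file = path, header = FALSE, sep = "=", strip.white = TRUE,
  row.names = 1, stringsAsFactors = FALSE
)

station_name <- info["StationName", "V2"]

# Assumption: timestamps are UTC and use "." between the date and the time
time_format <- "%Y-%m-%d.%H:%M:%S"
begin_time <- as.POSIXct(info["BeginTime", "V2"], format = time_format, tz = "UTC")
end_time   <- as.POSIXct(info["EndTime", "V2"], format = time_format, tz = "UTC")

# Length of the logging window in minutes
duration_mins <- as.numeric(difftime(end_time, begin_time, units = "mins"))
```

If the files use a different time zone, only the `tz` argument should need to change.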
Answer 1 (score: 0)
If you read the whole file in as a single multi-line string:
Data:
txt_string <- "Version = 2.0
StationName = STN67_P70
BeginTime = 2017-10-06.03:25:00
EndTime = 2017-10-06.03:55:00
IgnoreNo = 5000
PumpedVolume = 0"
# \\K discards everything matched so far, so only the value after "=" is returned
stationName <- regmatches(txt_string, gregexpr("StationName\\s+=\\s+\\K\\S+", txt_string, perl = TRUE))
beginTime   <- regmatches(txt_string, gregexpr("BeginTime\\s+=\\s+\\K\\S+", txt_string, perl = TRUE))
endTime     <- regmatches(txt_string, gregexpr("EndTime\\s+=\\s+\\K\\S+", txt_string, perl = TRUE))
do.call(cbind, c(stationName, beginTime, endTime))
# [,1] [,2] [,3]
#[1,] "STN67_P70" "2017-10-06.03:25:00" "2017-10-06.03:55:00"
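To get such a string from the file itself rather than from an embedded literal, one option is to collapse the output of readLines. This is a rough sketch, not the only way to do it. The helper name `extract_value` is hypothetical, and the temporary file stands in for StationLog.txt.

```r
# Illustrative only: write the sample content to a temporary file
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Version = 2.0",
  "StationName = STN67_P70",
  "BeginTime = 2017-10-06.03:25:00",
  "EndTime = 2017-10-06.03:55:00",
  "IgnoreNo = 5000",
  "PumpedVolume = 0"
), path)

# Collapse all lines into one string, with newlines between them
txt_string <- paste(readLines(path, warn = FALSE), collapse = "\n")

# Hypothetical helper: return the non-whitespace value that follows "<key> = "
extract_value <- function(key, txt) {
  pattern <- paste0(key, "\\s+=\\s+\\K\\S+")
  regmatches(txt, gregexpr(pattern, txt, perl = TRUE))[[1]]
}

station_name <- extract_value("StationName", txt_string)
begin_time   <- extract_value("BeginTime", txt_string)
end_time     <- extract_value("EndTime", txt_string)
```

The helper builds the same pattern as the three calls above, so a key that is missing from the file yields a zero-length character vector rather than an error.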
Answer 2 (score: 0)
An alternative approach:
# in practice, pass the file name to readLines() instead of this embedded example
station_info <- readLines(textConnection("Version = 2.0
StationName = STN67_P70
BeginTime = 2017-10-06.03:25:00
EndTime = 2017-10-06.03:55:00
IgnoreNo = 5000
PumpedVolume = 0"))
Base R:
as.list(sapply(
  strsplit(station_info, split = "[[:space:]]*=[[:space:]]*"),  # split each line on "=" and surrounding whitespace
  function(x) setNames(x[2], x[1])                              # the value, named by its key
)) -> station_info
str(station_info, 1)
## List of 6
## $ Version : chr "2.0"
## $ StationName : chr "STN67_P70"
## $ BeginTime : chr "2017-10-06.03:25:00"
## $ EndTime : chr "2017-10-06.03:55:00"
## $ IgnoreNo : chr "5000"
## $ PumpedVolume: chr "0"
tidyverse:
library(tidyverse)
# in practice, pass the file name to readLines() instead of this embedded example
station_info <- readLines(textConnection("Version = 2.0
StationName = STN67_P70
BeginTime = 2017-10-06.03:25:00
EndTime = 2017-10-06.03:55:00
IgnoreNo = 5000
PumpedVolume = 0"))
str_split(station_info, pattern = "[[:space:]]*=[[:space:]]*") %>%
  map(~set_names(.x[2], .x[1])) %>%
  flatten() %>%
  str(1)
Wrap the base version (to avoid dependencies) in a function so you can reuse it for other stations:
read_station_metadata <- function(path) {
  path <- path.expand(path)
  stopifnot(file.exists(path))
  station_info <- readLines(path, warn = FALSE)
  as.list(sapply(
    strsplit(station_info, split = "[[:space:]]*=[[:space:]]*"),
    function(x) setNames(x[2], x[1])
  ))
}
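Because more files will keep arriving, the function could then be applied to every log in a folder. The sketch below is one possible way to do that. The folder layout, the file names and the second station (`STN68_P71` and its times) are made up purely for illustration, and the function is repeated only so that the example runs on its own.

```r
# Repeated from above so the sketch is self-contained
read_station_metadata <- function(path) {
  path <- path.expand(path)
  stopifnot(file.exists(path))
  station_info <- readLines(path, warn = FALSE)
  as.list(sapply(
    strsplit(station_info, split = "[[:space:]]*=[[:space:]]*"),
    function(x) setNames(x[2], x[1])
  ))
}

# Illustrative only: a temporary folder with two sample logs (second one is invented)
log_dir <- tempfile("station_logs")
dir.create(log_dir)
writeLines(c("StationName = STN67_P70",
             "BeginTime = 2017-10-06.03:25:00",
             "EndTime = 2017-10-06.03:55:00"),
           file.path(log_dir, "StationLog_a.txt"))
writeLines(c("StationName = STN68_P71",
             "BeginTime = 2017-10-06.04:25:00",
             "EndTime = 2017-10-06.04:55:00"),
           file.path(log_dir, "StationLog_b.txt"))

# Read every .txt file in the folder and keep only the three fields of interest
log_files <- list.files(log_dir, pattern = "\\.txt$", full.names = TRUE)
wanted <- c("StationName", "BeginTime", "EndTime")
rows <- lapply(log_files, function(f) {
  meta <- read_station_metadata(f)
  as.data.frame(meta[wanted], stringsAsFactors = FALSE)
})

# One row per file, one column per field
all_stations <- do.call(rbind, rows)
```

Collecting the results in a data frame keeps one row per file, which should make it straightforward to hand the values on to whatever code consumes them.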