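I have several txt files that serve as input files for a model, and I need to change some of the model parameters to run experiments. There are many parameters, and changing them by hand is time-consuming. I tried to use readLines() and grep() in R to search for and replace the parameter values, but without much success. I hope someone can help. Thanks.
The file contains lines like the following: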
Bubbling Pressure 1 = 0.3389 .4423 .4118
Field Capacity 1 = 0.35 0.38 0.37
Wilting Point 1 = 0.13 0.14 0.13
Bulk Density 1 = 750. 1400. 1500.
Vertical Conductivity 1 = 2.904e-06 3.63e-05 3.63e-05
.....
Bubbling Pressure 3 = 0.2044 0.2876 0.2876
Field Capacity 3 = 0.31 0.33 0.33
Wilting Point 3 = 0.13 0.14 0.14
Bulk Density 3 = 750. 1400. 1500.
Vertical Conductivity 3 = 3.16e-06 3.95e-05 3.95e-05
...
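I would like to double all of the Vertical Conductivity parameters, but I am not sure how to isolate these numbers, which are written in scientific notation (e.g. "3.16e-06").
Is there a way to isolate each number in a line that contains the pattern "Vertical Conductivity", such as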
Vertical Conductivity 3 = 3.16e-06 3.95e-05 3.95e-05
and then double each number?
Vertical Conductivity 3 = 6.32e-06 7.90e-05 7.90e-05
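I have managed to use grep to isolate every line of text that contains the pattern "Vertical Conductivity", but I don't know how to get at the numeric values.
Thanks, Nick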
Answer 0: (score: 1)
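We can do this easily with gsubfn, without changing the original structure into something the OP may or may not need.
Here we read the dataset with readLines, use grepl to get the index ('i1') of the elements of 'lines' that contain the substring 'Vertical Conductivity', and then use gsubfn to replace each value on those lines with twice its value.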
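library(gsubfn)
# TRUE for the lines that hold Vertical Conductivity values
i1 <- grepl("Vertical Conductivity", lines)
# double every number written in scientific notation on those lines
lines[i1] <- gsubfn("[0-9.]+e[-+][0-9]+",
                    ~ format(as.numeric(x) * 2, scientific = TRUE),
                    lines[i1])
lines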
#[1] "Bubbling Pressure 1 = 0.3389 .4423 .4118"
#[2] "Field Capacity 1 = 0.35 0.38 0.37"
#[3] "Wilting Point 1 = 0.13 0.14 0.13"
#[4] "Bulk Density 1 = 750. 1400. 1500."
#[5] "Vertical Conductivity 1 = 5.808e-06 7.26e-05 7.26e-05"
#[6] "Bubbling Pressure 3 = 0.2044 0.2876 0.2876"
#[7] "Field Capacity 3 = 0.31 0.33 0.33"
#[8] "Wilting Point 3 = 0.13 0.14 0.14"
#[9] "Bulk Density 3 = 750. 1400. 1500."
#[10] "Vertical Conductivity 3 = 6.32e-06 7.9e-05 7.9e-05"
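The output above prints 7.9e-05 where the question asked for 7.90e-05. If the model is sensitive to the number of printed digits, one option is to format with sprintf() instead of format(). The following is only a sketch: double_sci is a hypothetical helper name, and the default of two mantissa digits is an assumption taken from the example in the question.

```r
# Hypothetical helper (not part of gsubfn): double a value and print it
# in scientific notation with a fixed number of mantissa digits.
double_sci <- function(x, digits = 2) {
  fmt <- paste0("%.", digits, "e")      # e.g. "%.2e"
  sprintf(fmt, as.numeric(x) * 2)
}

double_sci(c("3.16e-06", "3.95e-05"))
```

It could be passed to gsubfn in place of the format() call. Note that a fixed digit count rounds longer values, so 2.904e-06 would need digits = 3 to keep all of its digits after doubling.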
# example data: this is how 'lines' was created for the code above
lines <- trimws(readLines(textConnection(
'Bubbling Pressure 1 = 0.3389 .4423 .4118
Field Capacity 1 = 0.35 0.38 0.37
Wilting Point 1 = 0.13 0.14 0.13
Bulk Density 1 = 750. 1400. 1500.
Vertical Conductivity 1 = 2.904e-06 3.63e-05 3.63e-05
Bubbling Pressure 3 = 0.2044 0.2876 0.2876
Field Capacity 3 = 0.31 0.33 0.33
Wilting Point 3 = 0.13 0.14 0.14
Bulk Density 3 = 750. 1400. 1500.
Vertical Conductivity 3 = 3.16e-06 3.95e-05 3.95e-05')))
We can also read the lines directly from the file:
lines <- readLines("yourfile.txt")
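If adding a package is not an option, roughly the same replacement can be sketched in base R with gregexpr() and regmatches(). This is an alternative technique, not the gsubfn approach above. The function name double_matching and the file names in the commented loop are placeholders, and the number pattern assumes the values are always written with an exponent, as in the question.

```r
# Hypothetical helper: double every scientific-notation number on the
# lines that contain `pattern`; all other lines are returned unchanged.
double_matching <- function(lines, pattern = "Vertical Conductivity") {
  hit <- grepl(pattern, lines, fixed = TRUE)
  if (!any(hit)) return(lines)

  sel <- lines[hit]
  # positions of all numbers with an exponent, one list element per line
  m <- gregexpr("[0-9.]+[eE][-+]?[0-9]+", sel)
  # format each number on its own so digits are not padded to a common width
  doubled <- lapply(regmatches(sel, m), function(v) {
    vapply(as.numeric(v) * 2, format, character(1), scientific = TRUE)
  })
  regmatches(sel, m) <- doubled
  lines[hit] <- sel
  lines
}

example <- c("Bulk Density 3 = 750. 1400. 1500.",
             "Vertical Conductivity 3 = 3.16e-06 3.95e-05 3.95e-05")
double_matching(example)

# Applying it to several input files (placeholder file names):
# for (f in c("model_a.txt", "model_b.txt")) {
#   writeLines(double_matching(readLines(f)), f)
# }
```

The commented loop overwrites each file in place, so it is worth testing on copies first.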
Answer 1: (score: 0)
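Your data are not tidy, so the first step is to get them into a useful form. Hadley Wickham's tidyr package has the tools you need, and it combines nicely with his dplyr package, which lets you double the variable you are interested in.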
# read in data
df <- read.csv(text = 'Bubbling Pressure 1 = 0.3389 .4423 .4118
Field Capacity 1 = 0.35 0.38 0.37
Wilting Point 1 = 0.13 0.14 0.13
Bulk Density 1 = 750. 1400. 1500.
Vertical Conductivity 1 = 2.904e-06 3.63e-05 3.63e-05
Bubbling Pressure 3 = 0.2044 0.2876 0.2876
Field Capacity 3 = 0.31 0.33 0.33
Wilting Point 3 = 0.13 0.14 0.14
Bulk Density 3 = 750. 1400. 1500.
Vertical Conductivity 3 = 3.16e-06 3.95e-05 3.95e-05',
sep = '=', header = FALSE, strip.white = TRUE)
Now tidy it:
library(tidyr)
library(dplyr)
# separate variable from identifier
df %>% separate(V1, c('var', 'var_id'), sep = ' (?=.$)', convert = TRUE) %>%
# separate values for each variable
separate(V2, c('1', '2', '3'), sep = ' +', convert = TRUE) %>%
# melt values to long form so there's one observation per row
gather(val_id, val, -var:-var_id, convert = TRUE) %>%
# spread variables so each column is one variable
spread(var, val) %>%
# use data.frame to make names without spaces
data.frame() %>%
# use dplyr::mutate to double vertical conductivity as desired
mutate(Vertical.Conductivity = Vertical.Conductivity * 2)
# var_id val_id Bubbling.Pressure Bulk.Density Field.Capacity Vertical.Conductivity
# 1 1 1 0.3389 750 0.35 5.808e-06
# 2 1 2 0.4423 1400 0.38 7.260e-05
# 3 1 3 0.4118 1500 0.37 7.260e-05
# 4 3 1 0.2044 750 0.31 6.320e-06
# 5 3 2 0.2876 1400 0.33 7.900e-05
# 6 3 3 0.2876 1500 0.33 7.900e-05
# Wilting.Point
# 1 0.13
# 2 0.14
# 3 0.13
# 4 0.13
# 5 0.14
# 6 0.14
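The tidy data frame is convenient for calculations, but the model still expects the original "name id = values" layout. The following base R sketch shows one way to turn such a data frame back into lines. The helper name to_lines and the small stand-in data frame are assumptions for illustration, and the output will not reproduce details of the original file such as the trailing dot in "750." or the original order of the variables.

```r
# Stand-in for the tidied result (two variables only, values from above)
tidy <- data.frame(
  var_id = c(1, 1, 1, 3, 3, 3),
  val_id = c(1, 2, 3, 1, 2, 3),
  Bulk.Density = c(750, 1400, 1500, 750, 1400, 1500),
  Vertical.Conductivity = c(5.808e-06, 7.26e-05, 7.26e-05,
                            6.32e-06, 7.9e-05, 7.9e-05)
)

# Hypothetical helper: one output line per variable and var_id,
# with the values ordered by val_id.
to_lines <- function(df) {
  vars <- setdiff(names(df), c("var_id", "val_id"))
  df <- df[order(df$var_id, df$val_id), ]
  out <- character(0)
  for (id in unique(df$var_id)) {
    block <- df[df$var_id == id, ]
    for (v in vars) {
      # format each value separately to avoid padding to a common width
      vals <- paste(vapply(block[[v]], format, character(1)), collapse = " ")
      # column names use dots where the file uses spaces
      out <- c(out, paste0(gsub(".", " ", v, fixed = TRUE), " ", id, " = ", vals))
    }
  }
  out
}

to_lines(tidy)
```

The result can be written out with writeLines(), after checking that the model accepts the changed number formatting.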