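I would like help extracting information from a header
I have a table in a file (example below) with hundreds of rows and 1000 columns (of equal length), like this one. I want to create a loop that extracts the date from the header into a new column and rearranges the values into rows.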
R2n_19970919__105056604_2_BF.MER_A123_DAY_00.nc <- c(0.09,0.09,0.08,0.08,0.06,0.07,0.09,0.08,0.08,"NA")
R2n_19970920__105056604_2_BF.MER_A123_DAY_00.nc <- c(0.08,0.08,0.08,0.07,"NA",0.05,0.08,0.08,0.08,"NA")
R2n_19970921__105056604_2_BF.MER_A123_DAY_00.nc <- c(0.07,"NA",0.08,"NA","NA",0.07,0.06,"NA",0.08,"NA")
data <- data.frame(R2n_19970919__105056604_2_BF.MER_A123_DAY_00.nc,R2n_19970920__105056604_2_BF.MER_A123_DAY_00.nc,R2n_19970921__105056604_2_BF.MER_A123_DAY_00.nc)
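What is the best way to do this? Any help is much appreciated.
This is my expected result:
The date is the 19970919 part of the column name, i.e. R2n_19970919__105056604_2_BF.MER_A123_DAY_00.nc = 1997/09/19.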
Date R2n.nc
1997/09/19 0.09
1997/09/19 0.09
1997/09/19 0.08
1997/09/19 0.08
1997/09/19 0.06
1997/09/19 0.07
1997/09/19 0.09
1997/09/19 0.08
1997/09/19 0.08
1997/09/19 NA
1997/09/20 0.08
1997/09/20 0.08
1997/09/20 0.08
1997/09/20 0.07
1997/09/20 NA
1997/09/20 0.05
1997/09/20 0.08
1997/09/20 0.08
1997/09/20 0.08
1997/09/20 NA
1997/09/21 ...
.
.
.
Answer 0 (score: 1)
Here is a solution, using the hints suggested by @RomanLuštrik:
library(stringr) # str_sub() function
library(reshape2) # melt() function
# Modify column names (assumes the date is always at the same position)
names(data) = paste0(str_sub(names(data), 5,8), "-", str_sub(names(data), 9,10), "-",str_sub(names(data), 11, 12))
data$id = seq(1,nrow(data))
# Melt the data
data_melt = melt(data, id = "id")
> data_melt
id variable value
1 1 1997-09-19 0.09
2 2 1997-09-19 0.09
3 3 1997-09-19 0.08
4 4 1997-09-19 0.08
5 5 1997-09-19 0.06
...
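The melted result above still differs from the expected output in two ways: `variable` is a factor rather than a date, and `value` is character because the sample data uses the string "NA". A minimal base R sketch of one way to tidy this up follows. The small `data_melt` built here is a hypothetical stand-in with the same column layout, not the full result, and `result` is an illustrative name.

```r
# Hypothetical stand-in for data_melt (same columns as the output above)
data_melt <- data.frame(
  id = c(1, 2, 3, 1, 2, 3),
  variable = factor(c("1997-09-19", "1997-09-19", "1997-09-19",
                      "1997-09-20", "1997-09-20", "1997-09-20")),
  value = c("0.09", "0.09", "NA", "0.08", "NA", "0.05"),
  stringsAsFactors = FALSE
)

# melt() keeps the former column names as a factor; convert them to Date
data_melt$Date <- as.Date(as.character(data_melt$variable), format = "%Y-%m-%d")

# Turn the string "NA" into a real missing value, then convert to numeric
data_melt$value[data_melt$value == "NA"] <- NA
data_melt$value <- as.numeric(data_melt$value)

# Keep only the two columns from the expected result
result <- data_melt[, c("Date", "value")]
```

If the slash-separated form from the question is needed for display, `format(result$Date, "%Y/%m/%d")` returns it as text.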
Answer 1 (score: 0)
library(anytime)
df <- stack(data)
df$ind <- anydate(substr(df$ind, 5, 12))
head(df)
## values ind
## 1 0.09 1997-09-19
## 2 0.09 1997-09-19
## 3 0.08 1997-09-19
## 4 0.08 1997-09-19
## 5 0.06 1997-09-19
## 6 0.07 1997-09-19
Though I would probably do it like this:
library(anytime)
library(dplyr)
tbl_df(data) %>%
stack() %>%
mutate(ind=anydate(substr(ind, 5, 12)))
## # A tibble: 30 × 2
## values ind
## <chr> <date>
## 1 0.09 1997-09-19
## 2 0.09 1997-09-19
## 3 0.08 1997-09-19
## 4 0.08 1997-09-19
## 5 0.06 1997-09-19
## 6 0.07 1997-09-19
## 7 0.09 1997-09-19
## 8 0.08 1997-09-19
## 9 0.08 1997-09-19
## 10 NA 1997-09-19
## # ... with 20 more rows
instead.
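Both approaches above rely on the date sitting at characters 5 to 12 of every column name. If that position might vary, one possible alternative is to search for the first run of eight digits instead. The following is a base R sketch under that assumption. It uses a shortened two-column version of the sample data with real `NA` values rather than the string "NA", and the names `long`, `nm`, `date_str` and `result` are illustrative.

```r
# Shortened sample data (assumption: missing values stored as real NA)
data <- data.frame(
  R2n_19970919__105056604_2_BF.MER_A123_DAY_00.nc = c(0.09, 0.09, NA),
  R2n_19970920__105056604_2_BF.MER_A123_DAY_00.nc = c(0.08, NA, 0.05)
)

# Stack all columns into long format: 'values' and 'ind' (former column name)
long <- stack(data)

# Extract the first run of 8 consecutive digits from each column name.
# Assumes every name contains such a run; regmatches() drops non-matches.
nm <- as.character(long$ind)
date_str <- regmatches(nm, regexpr("[0-9]{8}", nm))

# Parse yyyymmdd into a Date and assemble the two-column result
result <- data.frame(
  Date = as.Date(date_str, format = "%Y%m%d"),
  R2n.nc = long$values
)
```

Because `regexpr()` returns only the first match, the later nine-digit block in the names is not picked up.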