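My dataset is about forest fires and NDVI values (values here lie between 0 and 1 and indicate how green the surface is). The first column gives the date of the forest fire for each row, and the following columns hold the NDVI values on different dates before and after the fire. NDVI values before the fire are much higher than those after it. It looks like this: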
data1989 <- data.frame("date_fire" = c("1987-01-01", "1987-07-03", "1988-01-01"),
"1986-01-01" = c(0.5, 0.589, 0.66),
"1986-06-03" = c(0.56, 0.447, 0.75),
"1986-10-19" = c(0.8, NA, 0.83),
"1987-01-19" = c(0.75, 0.65,0.75),
"1987-06-19" = c(0.1, 0.55,0.811),
"1987-10-19" = c(0.15, 0.12, 0.780),
"1988-01-19" = c(0.2, 0.22,0.32),
"1988-06-19" = c(0.18, 0.21,0.23),
"1988-10-19" = c(0.21, 0.24, 0.250),
stringsAsFactors = FALSE)
> data1989
date_fire X1986.01.01 X1986.06.03 X1986.10.19 X1987.01.19 X1987.06.19 X1987.10.19 X1988.01.19 X1988.06.19 X1988.10.19
1 1987-01-01 0.500 0.560 0.80 0.75 0.100 0.15 0.20 0.18 0.21
2 1987-07-03 0.589 0.447 NA 0.65 0.550 0.12 0.22 0.21 0.24
3 1988-01-01 0.660 0.750 0.83 0.75 0.811 0.78 0.32 0.23 0.25
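I want to compute the mean of the NDVI values recorded before the forest fire and store it in a new column. For the first row, that is the mean of columns 2, 3 and 4 (column 5, dated 1987-01-19, already falls after the fire on 1987-01-01).
What I need to get is:
date_fire X1986.01.01 X1986.06.03 X1986.10.19 X1987.01.19 X1987.06.19 X1987.10.19 X1988.01.19 X1988.06.19 X1988.10.19 meanPreFire
1 1987-01-01 0.500 0.560 0.80 0.75 0.100 0.15 0.20 0.18 0.21 0.620
2 1987-07-03 0.589 0.447 NA 0.65 0.550 0.12 0.22 0.21 0.24 0.559
3 1988-01-01 0.660 0.750 0.83 0.75 0.811 0.78 0.32 0.23 0.25 0.764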
Thanks!
Edit: solution
How to apply the code when more than one leading column has to be excluded:
data1989 <- data.frame("date_fire" = c("1987-02-01", "1987-07-03", "1988-01-01"),
"type" = c("oak", "pine", "oak"),
"meanRainfall" = c(600, 300, 450),
"1986.01.01" = c(0.5, 0.589, 0.66),
"1986.06.03" = c(0.56, 0.447, 0.75),
"1986.10.19" = c(0.8, NA, 0.83),
"1987.01.19" = c(0.75, 0.65,0.75),
"1987.06.19" = c(0.1, 0.55,0.811),
"1987.10.19" = c(0.15, 0.12, 0.780),
"1988.01.19" = c(0.2, 0.22,0.32),
"1988.06.19" = c(0.18, 0.21,0.23),
"1988.10.19" = c(0.21, 0.24, 0.250),
check.names = FALSE,
stringsAsFactors = FALSE)
Using:
j1 <- findInterval(as.Date(data1989$date_fire), as.Date(names(data1989)[-(1:3)],format="%Y.%m.%d"))
m1 <- cbind(rep(seq_len(nrow(data1989)), j1), sequence(j1))
data1989$meanPreFire <- tapply(data1989[-(1:3)][m1], m1[,1], FUN = mean, na.rm = TRUE)
> data1989
date_fire type meanRainfall 1986.01.01 1986.06.03 1986.10.19 1987.01.19 1987.06.19 1987.10.19 1988.01.19 1988.06.19 1988.10.19 meanPreFire
1 1987-02-01 oak 600 0.500 0.560 0.80 0.75 0.100 0.15 0.20 0.18 0.21 0.6525
2 1987-07-03 pine 300 0.589 0.447 NA 0.65 0.550 0.12 0.22 0.21 0.24 0.5590
3 1988-01-01 oak 450 0.660 0.750 0.83 0.75 0.811 0.78 0.32 0.23 0.25 0.7635
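As a cross-check on the values above, here is a minimal base R sketch that reaches the same means with a logical mask instead of an index matrix. The names `meta_cols`, `is_pre` and `mean_pre_fire` are illustrative and not part of the original code; the sketch assumes the first three columns are the only non-NDVI columns.

```r
data1989 <- data.frame(
  "date_fire" = c("1987-02-01", "1987-07-03", "1988-01-01"),
  "type" = c("oak", "pine", "oak"),
  "meanRainfall" = c(600, 300, 450),
  "1986.01.01" = c(0.5, 0.589, 0.66),
  "1986.06.03" = c(0.56, 0.447, 0.75),
  "1986.10.19" = c(0.8, NA, 0.83),
  "1987.01.19" = c(0.75, 0.65, 0.75),
  "1987.06.19" = c(0.1, 0.55, 0.811),
  "1987.10.19" = c(0.15, 0.12, 0.780),
  "1988.01.19" = c(0.2, 0.22, 0.32),
  "1988.06.19" = c(0.18, 0.21, 0.23),
  "1988.10.19" = c(0.21, 0.24, 0.250),
  check.names = FALSE,
  stringsAsFactors = FALSE)

meta_cols <- 1:3                                  # assumed non-NDVI columns
ndvi <- as.matrix(data1989[-meta_cols])           # numeric matrix of NDVI values
ndvi_dates <- as.Date(colnames(ndvi), format = "%Y.%m.%d")
fire_dates <- as.Date(data1989$date_fire)

# is_pre[i, j] is TRUE when NDVI date j falls strictly before fire i
is_pre <- outer(as.numeric(fire_dates), as.numeric(ndvi_dates), ">")

# Blank out everything on or after the fire, then average each row
ndvi[!is_pre] <- NA
mean_pre_fire <- rowMeans(ndvi, na.rm = TRUE)
mean_pre_fire
# [1] 0.6525 0.5590 0.7635
```

The mask makes the strict "before the fire" comparison explicit, which can be easier to audit than the row/column index, at the cost of building a full rows-by-dates matrix.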
Answer 0 (score: 3)
Reshape the data to long format and keep only the dates that fall before each forest fire.
library(tidyverse)
data1989 %>%
pivot_longer(-date_fire, names_to = "date") %>%
mutate(date_fire = as.Date(date_fire),
date = as.Date(date, "X%Y.%m.%d")) %>%
filter(date < date_fire) %>%
group_by(date_fire) %>%
summarise(meanPreFire = mean(value, na.rm = TRUE))
# # A tibble: 3 x 2
# date_fire meanPreFire
# <date> <dbl>
# 1 1987-01-01 0.62
# 2 1987-07-03 0.559
# 3 1988-01-01 0.764
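The pipeline above returns only a three-row summary, whereas the question asks for a new column on the original wide table. One possible way to get there, sketched under the assumption that `date_fire` is unique per row, is to keep `date_fire` as character in the summary and join it back with `left_join()`. The names `pre_fire` and `result` are illustrative.

```r
library(dplyr)
library(tidyr)

data1989 <- data.frame(
  date_fire = c("1987-01-01", "1987-07-03", "1988-01-01"),
  X1986.01.01 = c(0.5, 0.589, 0.66),
  X1986.06.03 = c(0.56, 0.447, 0.75),
  X1986.10.19 = c(0.8, NA, 0.83),
  X1987.01.19 = c(0.75, 0.65, 0.75),
  X1987.06.19 = c(0.1, 0.55, 0.811),
  X1987.10.19 = c(0.15, 0.12, 0.780),
  X1988.01.19 = c(0.2, 0.22, 0.32),
  X1988.06.19 = c(0.18, 0.21, 0.23),
  X1988.10.19 = c(0.21, 0.24, 0.250),
  stringsAsFactors = FALSE)

# Same long-format filter as above, but date_fire stays character
# so that it can serve as the join key against the wide table
pre_fire <- data1989 %>%
  pivot_longer(-date_fire, names_to = "date") %>%
  mutate(date = as.Date(date, "X%Y.%m.%d")) %>%
  filter(date < as.Date(date_fire)) %>%
  group_by(date_fire) %>%
  summarise(meanPreFire = mean(value, na.rm = TRUE))

# Attach the per-fire mean to the original wide data
result <- left_join(data1989, pre_fire, by = "date_fire")
result$meanPreFire
# [1] 0.6200 0.5590 0.7635
```

If two rows shared the same fire date, the grouping would pool their values, so a row identifier would be a safer key in that case.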
Answer 1 (score: 2)
We can do this by creating a row/column index.
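In base R, the column index can be obtained with findInterval from the column names and 'date_fire' (this assumes the data frame was built with check.names = FALSE, so that the column names are plain dates such as 1986-01-01):
j1 <- findInterval(as.Date(data1989$date_fire), as.Date(names(data1989)[-1]))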
l1 <- lapply(j1+1, `:`, ncol(data1989)-1)
m1 <- cbind(rep(seq_len(nrow(data1989)), j1), sequence(j1))
m2 <- cbind(rep(seq_len(nrow(data1989)), lengths(l1)), unlist(l1))
data1989$meanPreFire <- tapply(data1989[-1][m1], m1[,1], FUN = mean, na.rm = TRUE)
data1989$meanPostFire <- tapply(data1989[-1][m2], m2[,1], FUN = mean, na.rm = TRUE)
data1989
# date_fire 1986-01-01 1986-06-03 1986-10-19 1987-01-19 1987-06-19 1987-10-19 1988-01-19 1988-06-19 1988-10-19
#1 1987-01-01 0.500 0.560 0.80 0.75 0.100 0.15 0.20 0.18 0.21
#2 1987-07-03 0.589 0.447 NA 0.65 0.550 0.12 0.22 0.21 0.24
#3 1988-01-01 0.660 0.750 0.83 0.75 0.811 0.78 0.32 0.23 0.25
# meanPreFire meanPostFire
#1 0.6200 0.2650000
#2 0.5590 0.1975000
#3 0.7635 0.2666667
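To make the indexing step less opaque, here is a small standalone sketch of what `findInterval`, `rep` and `sequence` produce for the example dates. The variable names `ndvi_dates` and `fire_dates` are illustrative; only the dates are taken from the data above.

```r
# Column dates (must be sorted ascending) and fire dates from the example
ndvi_dates <- as.Date(c("1986-01-01", "1986-06-03", "1986-10-19",
                        "1987-01-19", "1987-06-19", "1987-10-19",
                        "1988-01-19", "1988-06-19", "1988-10-19"))
fire_dates <- as.Date(c("1987-01-01", "1987-07-03", "1988-01-01"))

# For each fire: how many NDVI dates fall on or before it
j1 <- findInterval(fire_dates, ndvi_dates)
j1
# [1] 3 5 6

# Row i is repeated j1[i] times and paired with columns 1..j1[i],
# giving one (row, column) pair per pre-fire cell
m1 <- cbind(row = rep(seq_along(j1), j1), col = sequence(j1))
nrow(m1)
# [1] 14
```

Note that `findInterval` counts dates less than or equal to the fire date, so an NDVI image taken on the day of the fire would be treated as pre-fire. If that matters, the `left.open = TRUE` argument makes the comparison strict.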
Or using melt/dcast from data.table:
library(data.table)
dcast(melt(setDT(data1989), id.var = 'date_fire')[,
.(value = mean(value, na.rm = TRUE)),
.(date_fire, grp = c('postFire', 'preFire')[1 + (as.IDate(variable) < as.IDate(date_fire))]) ], date_fire ~ grp)[data1989, on = .(date_fire)]
# date_fire postFire preFire 1986-01-01 1986-06-03 1986-10-19 1987-01-19 1987-06-19 1987-10-19 1988-01-19 1988-06-19
#1: 1987-01-01 0.2650000 0.6200 0.500 0.560 0.80 0.75 0.100 0.15 0.20 0.18
#2: 1987-07-03 0.1975000 0.5590 0.589 0.447 NA 0.65 0.550 0.12 0.22 0.21
#3: 1988-01-01 0.2666667 0.7635 0.660 0.750 0.83 0.75 0.811 0.78 0.32 0.23
# 1988-10-19
#1: 0.21
#2: 0.24
#3: 0.25
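The grouping expression inside the data.table call relies on a compact idiom: adding 1 to a logical vector yields 1 or 2, which then picks a label from a two-element character vector. A tiny illustration with made-up variable names:

```r
fire_date <- as.Date("1987-07-03")
ndvi_dates <- as.Date(c("1986-10-19", "1987-06-19", "1987-10-19"))

# TRUE where the NDVI date is strictly before the fire
is_pre <- ndvi_dates < fire_date

# FALSE -> index 1 ("postFire"), TRUE -> index 2 ("preFire")
grp <- c("postFire", "preFire")[1 + is_pre]
grp
# [1] "preFire"  "preFire"  "postFire"
```

An `ifelse(is_pre, "preFire", "postFire")` call would give the same labels and may read more clearly.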
Answer 2 (score: 2)
The solution would be more concise if we kept the data in long format... but this reproduces the desired output:
library(dplyr)
library(tidyr)
data1989 %>%
pivot_longer(-date_fire, names_to = "date_NDVI", values_to = "value", names_prefix = "^X") %>%
mutate(date_fire = as.Date(date_fire, "%Y-%m-%d"),
date_NDVI = as.Date(date_NDVI, "%Y.%m.%d")) %>%
group_by(date_fire) %>%
mutate(period = ifelse(date_NDVI < date_fire, "before_fire", "after_fire")) %>%
group_by(date_fire, period) %>%
mutate(average_NDVI = mean(value, na.rm = TRUE)) %>%
pivot_wider(names_from = date_NDVI, names_prefix = "X", values_from = value) %>%
pivot_wider(names_from = period, values_from = average_NDVI) %>%
group_by(date_fire) %>%
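# Collapse the two partial rows per fire; note that sum(na.rm = TRUE) also turns the missing 1986-10-19 value into 0
summarise(across(everything(), ~ sum(.x, na.rm = TRUE)))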
Returns:
# A tibble: 3 x 12
date_fire `X1986-01-01` `X1986-06-03` `X1986-10-19` `X1987-01-19` `X1987-06-19` `X1987-10-19` `X1988-01-19` `X1988-06-19` `X1988-10-19` before_fire after_fire
<date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1987-01-01 0.5 0.56 0.8 0.75 0.1 0.15 0.2 0.18 0.21 0.62 0.265
2 1987-07-03 0.589 0.447 0 0.65 0.55 0.12 0.22 0.21 0.24 0.559 0.198
3 1988-01-01 0.66 0.75 0.83 0.75 0.811 0.78 0.32 0.23 0.25 0.764 0.267
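If we stop the pipeline right after computing the averages, the data stay in a structure that makes it easy to compute the variance or to account for differing numbers of observations. I think it is fine to keep date_fire as its own column, but I would recommend keeping the other dates in a single column (since they correspond to observations), especially if we want to analyze the data further with ggplot2 and other tidyverse functions.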
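A minimal sketch of that long-format variant, stopping before any `pivot_wider()` step and adding the variance and the number of non-missing observations per period. The column names `mean_NDVI`, `var_NDVI` and `n_obs` are illustrative choices, not part of the original answer.

```r
library(dplyr)
library(tidyr)

data1989 <- data.frame(
  date_fire = c("1987-01-01", "1987-07-03", "1988-01-01"),
  X1986.01.01 = c(0.5, 0.589, 0.66),
  X1986.06.03 = c(0.56, 0.447, 0.75),
  X1986.10.19 = c(0.8, NA, 0.83),
  X1987.01.19 = c(0.75, 0.65, 0.75),
  X1987.06.19 = c(0.1, 0.55, 0.811),
  X1987.10.19 = c(0.15, 0.12, 0.780),
  X1988.01.19 = c(0.2, 0.22, 0.32),
  X1988.06.19 = c(0.18, 0.21, 0.23),
  X1988.10.19 = c(0.21, 0.24, 0.250),
  stringsAsFactors = FALSE)

# One row per (fire, NDVI date) observation, labelled by period
ndvi_long <- data1989 %>%
  pivot_longer(-date_fire, names_to = "date_NDVI", values_to = "value",
               names_prefix = "X") %>%
  mutate(date_fire = as.Date(date_fire),
         date_NDVI = as.Date(date_NDVI, "%Y.%m.%d"),
         period = ifelse(date_NDVI < date_fire, "before_fire", "after_fire"))

# Mean, variance and count of non-missing values per fire and period
ndvi_summary <- ndvi_long %>%
  group_by(date_fire, period) %>%
  summarise(mean_NDVI = mean(value, na.rm = TRUE),
            var_NDVI = var(value, na.rm = TRUE),
            n_obs = sum(!is.na(value)),
            .groups = "drop")
ndvi_summary
```

Keeping `n_obs` alongside the mean shows, for instance, that the pre-fire mean of the second fire rests on four values rather than five because of the missing 1986-10-19 observation.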