How to ... by year

Time: 2017-11-05 16:56:23

Tags: r loops matrix

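Before I start, here is a small part of the data I am working with. I apologize in advance for its size (note that this is only the first 20 rows of a very large dataset):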

mydata <- structure(list(
    ParkName = c("SEP", "CSSP", "SEP", "ONF", "SEP",
                 "ONF", "SEP", "CSSP", "ONF", "SEP",
                 "CSSP", "PPRSP", "PPRSP", "SEP", "ONF",
                 "PPRSP", "ONF", "SEP", "SEP", "ONF"),
    Year = c(2001, 2005, 1998, 2011, 1991, 1991, 1991, 1991, 1991, 1992,
             1992, 1992, 1992, 1992, 1992, 1992, 1992, 1993, 1994, 1994),
    LatinName = c("Mola mola", "Clarias batrachus", "Lithobates catesbeianus", "Rana catesbeiana", "Rana catesbeiana",
                  "Rana yellowis", "Rana catesbeiana", "Solenopsis sp1", "Rana catesbeiana", "Rana catesbeiana",
                  "Pratensis", "Rana catesbeiana", "Rana catesbeiana", "sp2", "Orchidaceae",
                  "Rana catesbeiana", "Formica", "Rana catesbeiana", "Rana catesbeiana", "sp2"),
    NumTotal = c(1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 100, 2, 1, 2)),
  .Names = c("ParkName", "Year", "LatinName", "NumTotal"),
  row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"))

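This dataset represents the abundance of different species in different parks over many years. What I basically want to do with it is get a species X park matrix for each year in which data was recorded, so that I can then use "vegan" to calculate diversity indices for each park in each year. Obviously this is not a balanced dataset, since not every park had species abundance recorded every year, and so on. I have realized that to do this I need to run loops. I need a list of parks for each year, plus a list of species and their abundances for each park in each year, in order to create these matrices. I am not the best at writing loops, and this task has me confused. For example, I created a separate vector of the unique years in the dataset. Then I created an empty list called "parkbyyear" to be filled with the list of parks for each year from the main data frame: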

year<-as.vector(unique(data[,3]))
parkbyyear<-NULL

for (i in 1:year) {
  parkbyyear[i]<- mydata[mydata$ParkName[year == "i"]
}

The loop does not run. Any help would be appreciated.

1 answer:

Answer 0: (score: 1)

Simply use by() to slice the data frame by the factor you need, then run an operation on each slice, such as returning a vector:

parkbyyear_list <- by(mydata, mydata$Year, FUN=function(df) df$ParkName)

parkbyyear_list
# mydata$Year: 1991
# [1] "SEP"  "ONF"  "SEP"  "CSSP" "ONF" 
# ---------------------------------------------------------------------------
# mydata$Year: 1992
# [1] "SEP"   "CSSP"  "PPRSP" "PPRSP" "SEP"   "ONF"   "PPRSP" "ONF"  
# --------------------------------------------------------------------------- 
# mydata$Year: 1993
# [1] "SEP"
# ---------------------------------------------------------------------------
# mydata$Year: 1994
# [1] "SEP" "ONF"
# ---------------------------------------------------------------------------
# mydata$Year: 1998
# [1] "SEP"
# ---------------------------------------------------------------------------
# mydata$Year: 2001
# [1] "SEP"
# ---------------------------------------------------------------------------
# mydata$Year: 2005
# [1] "CSSP"
# ---------------------------------------------------------------------------
# mydata$Year: 2011
# [1] "ONF"
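
For comparison, the loop from the question can be made to run with a few changes. This is only a sketch under some assumptions: the years are taken from mydata$Year (the question's data[,3] points at a different object and column), the loop walks over positions with seq_along() rather than 1:year, and the comparison uses the variable i rather than the string "i". The name years and the shortened stand-in for mydata are introduced here for illustration only.

```r
# Shortened stand-in for mydata (rows copied from the question's sample).
mydata <- data.frame(
  ParkName = c("SEP", "ONF", "SEP", "CSSP", "ONF", "SEP", "ONF"),
  Year     = c(1991, 1991, 1991, 1991, 1991, 1994, 1994),
  stringsAsFactors = FALSE
)

# Unique years, sorted so the list comes out in chronological order.
years <- sort(unique(mydata$Year))

# Pre-allocate a list with one slot per year and name the slots by year.
parkbyyear <- vector("list", length(years))
names(parkbyyear) <- as.character(years)

# Loop over positions; [[i]] assigns a whole vector into one list slot.
for (i in seq_along(years)) {
  parkbyyear[[i]] <- mydata$ParkName[mydata$Year == years[i]]
}

parkbyyear
```

The result holds the same vectors as the by() call above, stored as a plain named list.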

To get a list of subsetted data frames, simply use split() (or by() again):

dfList <- split(mydata, mydata$Year)
# dfList <- by(mydata, mydata$Year, FUN=function(df) df)   # SIMILAR CALL

dfList

# $`1991`
#   ParkName Year        LatinName NumTotal
# 5      SEP 1991 Rana catesbeiana        2
# 6      ONF 1991    Rana yellowis        1
# 7      SEP 1991 Rana catesbeiana        1
# 8     CSSP 1991   Solenopsis sp1        1
# 9      ONF 1991 Rana catesbeiana        1
# 
# $`1992`
#    ParkName Year        LatinName NumTotal
# 10      SEP 1992 Rana catesbeiana        1
# 11     CSSP 1992        Pratensis        1
# 12    PPRSP 1992 Rana catesbeiana        1
# 13    PPRSP 1992 Rana catesbeiana        1
# 14      SEP 1992              sp2        1
# 15      ONF 1992      Orchidaceae        1
# 16    PPRSP 1992 Rana catesbeiana        1
# 17      ONF 1992          Formica      100
# 
# $`1993`
#    ParkName Year        LatinName NumTotal
# 18      SEP 1993 Rana catesbeiana        2
# 
# $`1994`
#    ParkName Year        LatinName NumTotal
# 19      SEP 1994 Rana catesbeiana        1
# 20      ONF 1994              sp2        2
# 
# $`1998`
#   ParkName Year               LatinName NumTotal
# 3      SEP 1998 Lithobates catesbeianus        1
# 
# $`2001`
#   ParkName Year LatinName NumTotal
# 1      SEP 2001 Mola mola        1
# 
# $`2005`
#   ParkName Year         LatinName NumTotal
# 2     CSSP 2005 Clarias batrachus        1
# 
# $`2011`
#   ParkName Year        LatinName NumTotal
# 4      ONF 2011 Rana catesbeiana        1
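
From dfList it is a short step to the per-year matrices the question asks for. What follows is a rough sketch, not a recipe tested on the full dataset. It assumes parks should be rows and species columns (the sites-by-species layout that vegan works with), uses xtabs() to sum NumTotal for each park and species, and computes the Shannon index by hand with a helper called shannon that is made up for this example. The data frame is a shortened stand-in for mydata.

```r
# Shortened stand-in for mydata (rows copied from the question's sample).
mydata <- data.frame(
  ParkName  = c("SEP", "ONF", "SEP", "CSSP", "ONF", "SEP", "ONF"),
  Year      = c(1991, 1991, 1991, 1991, 1991, 1994, 1994),
  LatinName = c("Rana catesbeiana", "Rana yellowis", "Rana catesbeiana",
                "Solenopsis sp1", "Rana catesbeiana", "Rana catesbeiana", "sp2"),
  NumTotal  = c(2, 1, 1, 1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# One data frame per year, as in the answer.
dfList <- split(mydata, mydata$Year)

# One park-by-species abundance matrix per year.
# xtabs() sums NumTotal over repeated park/species rows; parks or species
# not recorded in a given year simply do not appear in that year's matrix.
matList <- lapply(dfList, function(df) {
  unclass(xtabs(NumTotal ~ ParkName + LatinName, data = df))
})

# Hypothetical helper: Shannon index H = -sum(p * log(p)) for one park,
# ignoring species with zero abundance.
shannon <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Shannon index for every park (row) in every year.
divList <- lapply(matList, function(m) apply(m, 1, shannon))

matList[["1991"]]
divList[["1991"]]
```

If vegan is installed, calling vegan::diversity(m, index = "shannon") on each matrix should give the same values, since it also treats rows as sites by default.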