Calculating a collocation index for a large data set in R

Time: 2013-06-21 21:48:36

Tags: r loops apply frequency-distribution

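I have a complex, multi-part question, so I apologize if I am unclear. I am also a fairly novice R user, so please forgive me if this seems basic. I want to compute a collocation index between whale dive data and prey distribution data. This requires:

  1. Computing the frequency distribution of whale dive depths, BY dive, using the depth bins from the prey (fish and zoop) data.
  2. For each dive, computing the center of gravity (CG) and inertia (I).
  3. For each dive, computing the global index of collocation (GIC) with each prey type.
  4. I would like to write a function (or series of functions) so that I do not have to split my data by dive and manually rerun the functions for each dive.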

    Example of the whale data, where Number is the dive number (sometimes 40 dives), Dive is the depth, and Class relates to the type of dive. Image: http://i41.tinypic.com/33vc5rs.jpg

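    The depth bins come from a separate data set containing the prey information (the original image is not preserved; sample prey data is pasted in the update further down).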

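    I have the following code, which works on the dive data as a whole, but I need to write a loop or use an apply function so that I can run it for each dive, with all dives contained in one file. So, for a whale with 40 dives, I need 40 whale frequencies, 40 whale CGs, 40 whale I values, and so on. The prey distribution is the same for every dive! In the end, I would like a table containing a list of the delta GIC values.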

    # bin whale dive depths
    dive.cut=cut(whale,c(0 ,depths), right=FALSE)
    dive.freq=table(dive.cut)
    
    # compute CG (dive.freq is the binned whale frequency from above)
    fish.CG=sum(depths*fish)/sum(fish)
    whale.CG=sum(depths*dive.freq)/sum(dive.freq)
    zoop.CG=sum(depths*zoop)/sum(zoop)
    
    # compute Inertia
    fish.I=sum((depths-fish.CG)^2*fish)/sum(fish)
    whale.I=sum((depths-whale.CG)^2*dive.freq)/sum(dive.freq)
    zoop.I=sum((depths-zoop.CG)^2*zoop)/sum(zoop)
    
    #compute GIC as per 
    # compute delta CG
    deltaCG.fish_whale=fish.CG-whale.CG
    GIC.fish_whale= 1-((deltaCG.fish_whale)^2/((deltaCG.fish_whale)^2+fish.I+whale.I))
    deltaCG.zoop_whale=zoop.CG-whale.CG
    GIC.zoop_whale= 1-((deltaCG.zoop_whale)^2/((deltaCG.zoop_whale)^2+zoop.I+whale.I))
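
For reference, the quantities in the code above can be written out as formulas. This is only a transcription of that code, not an independent definition. The symbols $z_k$ (the depth assigned to bin $k$) and $w_k$ (the weight in that bin: a fish or zoop value, or a whale dive count) are notation introduced here for illustration.

```latex
\begin{align*}
CG &= \frac{\sum_k z_k w_k}{\sum_k w_k}, &
I &= \frac{\sum_k (z_k - CG)^2\, w_k}{\sum_k w_k}, \\
\Delta CG &= CG_{\text{prey}} - CG_{\text{whale}}, &
GIC &= 1 - \frac{\Delta CG^2}{\Delta CG^2 + I_{\text{prey}} + I_{\text{whale}}}.
\end{align*}
```

Read this way, GIC equals 1 when the two centers of gravity coincide (and at least one inertia is nonzero), and it falls toward 0 as the centers of gravity move apart relative to the two inertias.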
    

    UPDATE: I have pasted sample data for the prey and the whale dives.

    Prey data:

     depths        fish       zoop
    1      5     0.00000    0.000000
    2     10     0.00000    0.000000
    3     15     0.00000    0.000000
    4     20    21.24194    0.000000
    5     25   149.51694   14.937945
    6     30   170.43214    0.000000
    7     35   296.93453    0.737109
    8     40    16.61643    4.295556
    9     45    92.68130   26.384844
    10    50    50.68548   55.902301
    11    55    37.47343  218.673781
    12    60    32.74443  204.452678
    13    65    20.62983  113.112452
    14    70    13.75121   83.014457
    15    75    16.15562   55.051358
    16    80    22.65562   96.746271
    17    85    42.99768  302.229135
    18    90 16315.65099  783.868978
    19    95 43006.20482 1713.133161
    20   100 23476.24740 3440.034642
    21   105 30513.66346 6667.914707
    22   110 17411.64500 9398.790964
    23   115 12127.70195 7580.233165
    24   120  4526.63393 7205.768739
    25   125  3328.89644 6567.175766
    26   130  1864.21486 4567.446886
    27   135  2202.07464 4295.772442
    28   140  2719.29417 4419.903403
    29   145  1710.75599 5102.689940
    30   150  2033.69552 4496.121974
    31   155  2796.81788 3269.193606
    32   160   967.09406 2310.203528
    33   165   437.30896  447.940140
    34   170   193.15526   63.731336
    35   175   143.88043   38.004799
    36   180   406.31373   22.565211
    37   185   786.30087   31.889927
    38   190  1643.52542   36.580063
    39   195  1665.69794   14.084152
    40   200  1281.15790    0.000000
    41   205   753.75309   35.343794
    42   210   252.48867    0.000000
    

    Whale data:

      Number Dive Class
    1       1 95.1     F
    2       1 95.9     F
    3       1 95.1     F
    4       1 95.9     F
    5       1 96.8     F
    6       1 97.2     F
    7       1 96.8     F
    8       2 95.5     N
    9       2 94.2     N
    10      3 94.7     F
    11      3 94.2     F
    12      3 94.2     F
    13      3 95.9     F
    14      3 95.9     F
    15      4 93.8     F
    16      4 97.7     F
    17      4 99.4     F
    18      4 94.7     F
    19      4 92.5     F
    20      4 98.1     F
    21      5 97.2     N
    22      5 98.5     N
    23      5 95.5     N
    24      5 97.2     N
    25      5 98.5     N
    26      5 96.4     N
    27      5 94.7     N
    28      5 95.5     N
    

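1 answer:

Answer 0: (score: 1)

Try this code. I tested it on the data you posted. I used the depths from the prey data frame as the bin boundaries; I am not sure whether that is what you intended. Also, this time I guessed that you used whale$Dive to compute dive.freq. If not, you will have to change it. (Note that this question was also cross-posted to the r-help list.)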

prey <- structure(list(depths = c(5L, 10L, 15L, 20L, 25L, 30L, 35L, 40L, 
    45L, 50L, 55L, 60L, 65L, 70L, 75L, 80L, 85L, 90L, 95L, 100L, 
    105L, 110L, 115L, 120L, 125L, 130L, 135L, 140L, 145L, 150L, 155L, 
    160L, 165L, 170L, 175L, 180L, 185L, 190L, 195L, 200L, 205L, 210L
    ), fish = c(0, 0, 0, 21.24194, 149.51694, 170.43214, 296.93453, 
    16.61643, 92.6813, 50.68548, 37.47343, 32.74443, 20.62983, 13.75121, 
    16.15562, 22.65562, 42.99768, 16315.65099, 43006.20482, 23476.2474, 
    30513.66346, 17411.645, 12127.70195, 4526.63393, 3328.89644, 
    1864.21486, 2202.07464, 2719.29417, 1710.75599, 2033.69552, 2796.81788, 
    967.09406, 437.30896, 193.15526, 143.88043, 406.31373, 786.30087, 
    1643.52542, 1665.69794, 1281.1579, 753.75309, 252.48867), zoop = c(0, 
    0, 0, 0, 14.937945, 0, 0.737109, 4.295556, 26.384844, 55.902301, 
    218.673781, 204.452678, 113.112452, 83.014457, 55.051358, 96.746271, 
    302.229135, 783.868978, 1713.133161, 3440.034642, 6667.914707, 
    9398.790964, 7580.233165, 7205.768739, 6567.175766, 4567.446886, 
    4295.772442, 4419.903403, 5102.68994, 4496.121974, 3269.193606, 
    2310.203528, 447.94014, 63.731336, 38.004799, 22.565211, 31.889927, 
    36.580063, 14.084152, 0, 35.343794, 0)), .Names = c("depths", 
    "fish", "zoop"), class = "data.frame", row.names = c("1", "2", 
    "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", 
    "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", 
    "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", 
    "37", "38", "39", "40", "41", "42"))

whale <- structure(list(Number = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
    3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 
    5L, 5L, 5L), Dive = c(95.1, 95.9, 95.1, 95.9, 96.8, 97.2, 96.8, 
    95.5, 94.2, 94.7, 94.2, 94.2, 95.9, 95.9, 93.8, 97.7, 99.4, 94.7, 
    92.5, 98.1, 97.2, 98.5, 95.5, 97.2, 98.5, 96.4, 94.7, 95.5), 
    Class = c("F", "F", "F", "F", "F", "F", "F", "N", "N", "F", 
    "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "N", "N", 
    "N", "N", "N", "N", "N", "N")), .Names = c("Number", "Dive", 
    "Class"), class = "data.frame", row.names = c("1", "2", "3", 
    "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", 
    "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", 
    "27", "28"))

# split the data frame into a list with a different element for each dive
# (Number is the dive number; Dive holds the depth readings)
dives <- split(whale, whale$Number)

# define a single function that does all of your computations
compute <- function(whale, depths, fish, zoop) {
    # you don't say what part of the whale data you are counting ... I'll assume it's the dive
    dive.freq <- table(cut(whale$Dive, c(0, depths)))
    #compute Center of Gravity
    fish.CG <- sum(depths*fish)/sum(fish) #calculate CG for fish distribution ONCE for each whale
    zoop.CG <- sum(depths*zoop)/sum(zoop) #calculate CG for zoop distribution ONCE for each whale
    whale.CG <- sum(depths*dive.freq)/sum(dive.freq) #calculate for EACH dive
    #compute Inertia
    fish.I <- sum((depths-fish.CG)^2*fish)/sum(fish) 
    zoop.I <- sum((depths-zoop.CG)^2*zoop)/sum(zoop)
    whale.I <- sum((depths-whale.CG)^2*dive.freq)/sum(dive.freq) #needs to be calculated for EACH dive
    # compute delta CG
    deltaCG.fish_whale <- fish.CG-whale.CG
    GIC.fish_whale <- 1-((deltaCG.fish_whale)^2/((deltaCG.fish_whale)^2+fish.I+whale.I))
    deltaCG.zoop_whale <- zoop.CG-whale.CG
    GIC.zoop_whale <- 1-((deltaCG.zoop_whale)^2/((deltaCG.zoop_whale)^2+zoop.I+whale.I))
    # then list off all the variables you want to keep as output from the function here
    c(fish.CG=fish.CG, whale.CG=whale.CG, zoop.CG=zoop.CG, fish.I=fish.I, whale.I=whale.I, zoop.I=zoop.I, 
        GIC.fish_whale=GIC.fish_whale, GIC.zoop_whale=GIC.zoop_whale)
    }

# apply the compute function to each element of the dives list
t(sapply(dives, function(dat) compute(whale=dat, depths=prey$depths, fish=prey$fish, zoop=prey$zoop)))
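
As a cross-check of the approach above, here is a minimal self-contained sketch on made-up toy data. The values and the helper names `cg`, `inertia` and `gic` are hypothetical and do not come from the question. It builds a dive-by-depth-bin frequency table in a single call and then applies the same CG, inertia and GIC formulas row by row. Treat it as an illustration under those assumptions, not a replacement for the code above.

```r
# Toy data (hypothetical values, for illustration only)
depths <- c(10, 20, 30, 40)            # depth assigned to each bin
prey   <- c(0, 5, 10, 5)               # prey density per bin
whale  <- data.frame(Number = c(1, 1, 1, 2, 2),      # dive number
                     Dive   = c(12, 18, 25, 35, 38)) # depth readings

# Weighted mean depth (center of gravity) and weighted variance (inertia)
cg      <- function(z, w) sum(z * w) / sum(w)
inertia <- function(z, w) sum((z - cg(z, w))^2 * w) / sum(w)

# GIC between two distributions defined on the same depth bins
gic <- function(z, w1, w2) {
  d <- cg(z, w1) - cg(z, w2)
  1 - d^2 / (d^2 + inertia(z, w1) + inertia(z, w2))
}

# Two-way table: one row per dive, one column per depth bin
freq <- table(whale$Number, cut(whale$Dive, c(0, depths)))

# GIC of each dive against the fixed prey distribution
result <- apply(freq, 1, function(f) gic(depths, prey, f))
print(result)
```

On this toy input, dive 1 gives 13/21 (about 0.62) and dive 2 gives 1/3. Note that `cut()` is called here with its default `right = TRUE`, as in `compute` above, whereas the question's code passed `right = FALSE`; use whichever matches how the prey depth bins were defined.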