Which apply function in R should I use for my calculation?

Time: 2018-02-10 01:37:45

Tags: r apply

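I have a data frame in which each row represents a match-up in a football game. Below is a summary with some columns removed, covering only 50 games from one season, given as dput() output: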

dput(mydata)
structure(list(home_id = c(75L, 323L, 607L, 3627L, 3645L, 641L, 
204L, 111L, 287L, 179L, 1062L, 292L, 413L, 275L, 182L, 3639L, 
179L, 2649L, 111L, 478L, 383L, 3645L, 275L, 577L, 3639L, 75L, 
413L, 287L, 607L, 3627L, 1062L, 75L, 583L, 323L, 3736L, 577L, 
179L, 287L, 275L, 3645L, 3639L, 583L, 179L, 413L, 641L, 204L, 
478L, 292L, 607L, 323L), away_id = c(3645L, 3736L, 583L, 2649L, 
577L, 75L, 3736L, 182L, 323L, 607L, 3639L, 583L, 478L, 383L, 
3645L, 607L, 413L, 204L, 641L, 583L, 3627L, 179L, 182L, 3736L, 
292L, 204L, 323L, 1062L, 2649L, 3639L, 204L, 292L, 111L, 607L, 
182L, 3645L, 478L, 413L, 641L, 287L, 577L, 182L, 2649L, 1062L, 
383L, 111L, 3736L, 3627L, 75L, 275L), home_rating = c(1546.64167937943, 
1534.94287021653, 1514.51852002403, 1558.91823781777, 1555.76784458784, 
1518.37707748967, 1464.5264202735, 1642.57388443639, 1447.37725553409, 
1420.69724095008, 1428.51535356064, 1512.81896541907, 1463.29314217469, 
1492.70306452585, 1404.65235407107, 1418.03767059747, 1420.69724095008, 
1532.76811278441, 1642.57388443639, 1515.31896572792, 1498.7997953168, 
1555.76784458784, 1492.70306452585, 1519.94395373088, 1418.03767059747, 
1546.64167937943, 1463.29314217469, 1447.37725553409, 1514.51852002403, 
1558.91823781777, 1428.51535356064, 1546.64167937943, 1524.71735294388, 
1534.94287021653, 1484.09023843799, 1519.94395373088, 1420.69724095008, 
1447.37725553409, 1492.70306452585, 1555.76784458784, 1418.03767059747, 
1524.71735294388, 1420.69724095008, 1463.29314217469, 1518.37707748967, 
1464.5264202735, 1515.31896572792, 1512.81896541907, 1514.51852002403, 
1534.94287021653), away_rating = c(1555.76784458784, 1484.09023843799, 
1524.71735294388, 1532.76811278441, 1519.94395373088, 1546.64167937943, 
1484.09023843799, 1404.65235407107, 1534.94287021653, 1514.51852002403, 
1418.03767059747, 1524.71735294388, 1515.31896572792, 1498.7997953168, 
1555.76784458784, 1514.51852002403, 1463.29314217469, 1464.5264202735, 
1518.37707748967, 1524.71735294388, 1558.91823781777, 1420.69724095008, 
1404.65235407107, 1484.09023843799, 1512.81896541907, 1464.5264202735, 
1534.94287021653, 1428.51535356064, 1532.76811278441, 1418.03767059747, 
1464.5264202735, 1512.81896541907, 1642.57388443639, 1514.51852002403, 
1404.65235407107, 1555.76784458784, 1515.31896572792, 1463.29314217469, 
1518.37707748967, 1447.37725553409, 1519.94395373088, 1404.65235407107, 
1532.76811278441, 1428.51535356064, 1498.7997953168, 1642.57388443639, 
1484.09023843799, 1558.91823781777, 1546.64167937943, 1492.70306452585
)), .Names = c("home_id", "away_id", "home_rating", "away_rating"
), row.names = c(NA, 50L), class = "data.frame")

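The home_rating and away_rating columns are scores reflecting how good each team is, and I want to use these columns in an apply function. In particular, I have another function named use_ratings() that takes these ratings as arguments. The first few rows of mydata look like this: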

> head(mydata)
  home_id away_id home_rating away_rating
1      75    3645    1546.642    1555.768
2     323    3736    1534.943    1484.090
3     607     583    1514.519    1524.717
4    3627    2649    1558.918    1532.768
5    3645     577    1555.768    1519.944
6     641      75    1518.377    1546.642

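I want to apply this function to every row of mydata, using the values in the home_rating and away_rating columns as the arguments passed to use_ratings() on each call. How can I do this? Thanks.

1 Answer:

Answer 0: (score: 2)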

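@SymbolixAU is absolutely right that the best approach, in both speed and readability, is to exploit vectorization directly. But if you do want to use an "apply function", the likely candidates are mapply() and apply().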

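Using mapply()

mapply(use_ratings, home_rating = mydata$home_rating, away_rating = mydata$away_rating, is_cup = <a vector of booleans>)

Using apply()

apply(mydata, 1, function(row) use_ratings(row[["home_rating"]], row[["away_rating"]], <row[["is_cup"]], which is missing from mydata>))

mapply() ("multivariate apply") applies a multivariate function simultaneously to several objects, one per argument. apply() applies a function over the margins of a matrix-like object, and setting MARGIN = 1 tells apply() to operate over rows. We therefore have to wrap use_ratings() in a function that takes a whole row and passes the relevant elements on to it. Because apply() coerces the data frame to a matrix and hands each row over as a named atomic vector, the elements must be extracted with row[["home_rating"]] rather than row$home_rating, which fails on atomic vectors.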
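
To make the three options concrete, here is a minimal sketch. The body of use_ratings() below is a hypothetical stand-in (an Elo-style expected score with an assumed 50-point home advantage that is dropped for cup games), not the asker's actual function, and the is_cup column is made up because mydata does not contain one.

```r
# Hypothetical stand-in for use_ratings(); the real function is not shown in the question.
use_ratings <- function(home_rating, away_rating, is_cup = FALSE) {
  # Assumed rule: 50 rating points of home advantage, ignored for cup games
  home_adv <- ifelse(is_cup, 0, 50)
  1 / (1 + 10^((away_rating - (home_rating + home_adv)) / 400))
}

# First three rows of mydata (rounded), plus a made-up is_cup column
games <- data.frame(
  home_rating = c(1546.642, 1534.943, 1514.519),
  away_rating = c(1555.768, 1484.090, 1524.717),
  is_cup      = c(FALSE, TRUE, FALSE)
)

# Vectorized call: works because the stand-in uses only vectorized operations
res_vec <- use_ratings(games$home_rating, games$away_rating, games$is_cup)

# mapply(): one call per row, arguments matched by name
res_mapply <- mapply(use_ratings,
                     home_rating = games$home_rating,
                     away_rating = games$away_rating,
                     is_cup      = games$is_cup)

# apply(): each row arrives as a named numeric vector (is_cup becomes 0/1),
# so elements are extracted with [[ ]] and is_cup is converted back to logical
res_apply <- apply(games, 1, function(row) {
  use_ratings(row[["home_rating"]],
              row[["away_rating"]],
              as.logical(row[["is_cup"]]))
})
```

All three calls should return the same values for this stand-in. The vectorized form is only available when the real use_ratings() is itself built from vectorized operations; if it contains scalar-only logic such as if/else on its arguments, mapply() is the more direct fit.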