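I'd like to apply operations to a data frame (tibble) column that contains list-like S3 objects, operating on one named item of each object in the column. As shown at the bottom of the question, I have this working by using sapply() inside mutate(), but that seems like it should be unnecessary.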
If the information is stored in columns of atomic data, dplyr functions such as mutate() work as expected. This works, for example:
library(dplyr)
people_cols <- tibble(name = c("Fiona Foo", "Barry Bar", "Basil Baz"),
                      height_mm = c(1750, 1700, 1800),
                      weight_kg = c(75, 73, 74)) %>%
  mutate(height_inch = height_mm / 25.4)
people_cols
# # A tibble: 3 × 4
# name height_mm weight_kg height_inch
# <chr> <dbl> <dbl> <dbl>
# 1 Fiona Foo 1750 75 68.89764
# 2 Barry Bar 1700 73 66.92913
# 3 Basil Baz 1800 74 70.86614
But I want to work with data held in S3 list objects. Here is a toy example:
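# Constructor for a list-like S3 object of class "person_stats".
person_stats <- function(name, height_mm, weight_kg) {
  # Return the object directly rather than assigning it to a local variable,
  # so the result is returned visibly.
  structure(list(name = name,
                 height_mm = height_mm,
                 weight_kg = weight_kg),
            class = "person_stats")
}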
fiona <- person_stats("Fiona Foo", 1750, 75)
barry <- person_stats("Barry Bar", 1700, 73)
basil <- person_stats("Basil Baz", 1800, 74)
fiona$height_mm
# [1] 1750
I can put these objects into a tibble column like this:
people <- tibble(personstat = list(fiona, barry, basil))
people
# # A tibble: 3 × 1
# personstat
# <list>
# 1 <S3: person_stats>
# 2 <S3: person_stats>
# 3 <S3: person_stats>
But if I try to use mutate() on the column containing these objects, I get an error:
people <- tibble(personstat = list(fiona, barry, basil)) %>%
  mutate(height_inch = personstat$height_mm / 25.4)
# Error in mutate_impl(.data, dots) : object 'personstat' not found
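Keeping it as simple as possible: if I could even just refer to the named items on their own, I could at least pull them out into a new column and then do whatever operations I need from there: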
people <- tibble(personstat = list(fiona, barry, basil)) %>%
  mutate(height_mm = personstat$height_mm)
# Error in mutate_impl(.data, dots) :
# Unsupported type NILSXP for column "height_mm"
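Note the different error, which is interesting: it no longer complains about finding the column, and is only struggling with the named item.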
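A minimal base R sketch of what is probably going on in the second error: personstat is a list of objects, so applying $ to the column as a whole looks for an element named height_mm in the outer list, finds none, and returns NULL (NILSXP is R's internal type for NULL). The snippet rebuilds the toy objects so that it runs on its own; outer_lookup and first_height are illustrative names only, and how dplyr turns the NULL into its error message is not shown here.

```r
# Rebuild the toy objects as plain S3 lists (same fields as in the question).
person_stats <- function(name, height_mm, weight_kg) {
  structure(list(name = name, height_mm = height_mm, weight_kg = weight_kg),
            class = "person_stats")
}
personstat <- list(person_stats("Fiona Foo", 1750, 75),
                   person_stats("Barry Bar", 1700, 73),
                   person_stats("Basil Baz", 1800, 74))

# `$` on the outer list looks for an element *named* height_mm.
# The outer list has no names, so the lookup yields NULL.
outer_lookup <- personstat$height_mm
is.null(outer_lookup)

# Selecting a single element first reaches the field inside the object.
first_height <- personstat[[1]]$height_mm
first_height
```

This is why any working approach has to visit the elements one at a time, whether through sapply(), a row-wise operation, or a mapping function.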
I can get it to work with the base functions cbind() and sapply(), using [[ as the function:
people <- tibble(personstat = list(fiona, barry, basil)) %>%
  cbind(height_mm = sapply(.$personstat, '[[', name="height_mm"))
people
# personstat height_mm
# 1 Fiona Foo, 1750, 75 1750
# 2 Barry Bar, 1700, 73 1700
# 3 Basil Baz, 1800, 74 1800
It loses its tibble-ness, though:
class(people)
# [1] "data.frame"
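Finally, this gets me somewhere closer, but using sapply() feels like it misses the point of dplyr's mutate(), which I think should work its way down the column without needing it: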
people <- tibble(personstat = list(fiona, barry, basil)) %>%
  mutate(height_mm = sapply(.$personstat, '[[', name="height_mm"))
people
# A tibble: 3 x 2
# personstat height_mm
# <list> <dbl>
# 1 <S3: person_stats> 1750
# 2 <S3: person_stats> 1700
# 3 <S3: person_stats> 1800
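Is there a way to get the output above using mutate() without having to rely on sapply() or similar? Or, indeed, any other sensible way to extract named values from list-like S3 objects stored in a tibble column?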
Answer 0 (score: 2)
rowwise() can handle this kind of case:
people <- tibble(personstat = list(fiona, barry, basil))
people %>%
  rowwise() %>%
  mutate(height_mm = personstat$height_mm)
# # A tibble: 3 × 2
# personstat height_mm
# <list> <dbl>
# 1 <S3: person_stats> 1750
# 2 <S3: person_stats> 1700
# 3 <S3: person_stats> 1800
people %>%
  rowwise() %>%
  mutate(height_inch = personstat$height_mm / 25.4)
# # A tibble: 3 × 2
# personstat height_inch
# <list> <dbl>
# 1 <S3: person_stats> 68.89764
# 2 <S3: person_stats> 66.92913
# 3 <S3: person_stats> 70.86614
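One point worth keeping in mind with this approach: the result of rowwise() stays row-wise, so later verbs in the pipeline will also run one row at a time. The sketch below, which assumes a reasonably recent dplyr and rebuilds the toy data so that it runs on its own, adds ungroup() at the end to get an ordinary tibble back. The name result is illustrative only.

```r
library(dplyr)

# Rebuild the toy objects and the list column from the question.
person_stats <- function(name, height_mm, weight_kg) {
  structure(list(name = name, height_mm = height_mm, weight_kg = weight_kg),
            class = "person_stats")
}
people <- tibble(personstat = list(person_stats("Fiona Foo", 1750, 75),
                                   person_stats("Barry Bar", 1700, 73),
                                   person_stats("Basil Baz", 1800, 74)))

result <- people %>%
  rowwise() %>%                                          # evaluate one row at a time
  mutate(height_inch = personstat$height_mm / 25.4) %>%  # personstat is now a single object
  ungroup()                                              # drop the row-wise status again

class(result)
```

Without the final ungroup(), a later summarise() or mutate() would still be evaluated per row, which can be surprising and is usually slower on larger tables.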
Answer 1 (score: 1)
If you want to keep it in the tidyverse, you can use purrr::map_dbl here:
library(tidyverse)
people %>% mutate(height = map_dbl(personstat, "height_mm"))
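For comparison, a rough base R counterpart to map_dbl() is vapply() with a numeric(1) template: unlike the sapply() call in the question, it raises an error if any element does not yield exactly one number. This is only a sketch on a plain list standing in for the personstat column; heights_mm and heights_inch are illustrative names.

```r
# Rebuild the toy objects as a plain list standing in for the tibble column.
person_stats <- function(name, height_mm, weight_kg) {
  structure(list(name = name, height_mm = height_mm, weight_kg = weight_kg),
            class = "person_stats")
}
personstat <- list(person_stats("Fiona Foo", 1750, 75),
                   person_stats("Barry Bar", 1700, 73),
                   person_stats("Basil Baz", 1800, 74))

# Extract one field per object; numeric(1) makes vapply() insist on a
# single numeric value from every element.
heights_mm <- vapply(personstat, function(p) p$height_mm, numeric(1))

# The result is an ordinary numeric vector, so arithmetic is vectorised.
heights_inch <- heights_mm / 25.4
heights_inch
```

The same vapply() call can be placed inside mutate() on the list column if adding purrr as a dependency is not wanted.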