Working with list columns in a tidy data frame

Date: 2020-10-19 17:17:04

Tags: r dataframe dplyr tidyverse tidyr

We were given data on LEGO sales records for 1,000 customers, which looks like this:

[
  {
    "gender": "Female",
    "first_name": "Kimberly",
    "last_name": "Beckstead",
    "age": 24,
    "phone_number": "216-555-2549",
    "hobbies": ["Ultimate Disc", "Shopping"],
    "purchases": [
      {
        "SetID": 24701,
        "Number": "76062",
        "Theme": "DC Comics Super Heroes",
        "Subtheme": "Mighty Micros",
        "Year": 2016,
        "Name": "Robin vs. Bane",
        "Pieces": 77,
        "USPrice": 9.99,
        "ImageURL": "http://images.brickset.com/sets/images/76062-1.jpg",
        "Quantity": 1
      }
    ]
  },

I convert the sales object into a tidy data frame like this:

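library(dplyr)  # provides %>% and select()

# One row per customer first, then one row per purchase;
# hobbies is left as a list column.
(purchases = sales %>%
    tibble::tibble(item = .) %>%
    tidyr::unnest_wider(item) %>%
    select(gender, first_name, last_name, hobbies, phone_number, purchases) %>%
    tidyr::unnest_longer(purchases) %>%
    tidyr::unnest_wider(purchases))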
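
To see what this pipeline does to hobbies, here is a minimal sketch on a made-up, one-customer list. The name `sales_demo` and its trimmed-down fields are hypothetical; it assumes the JSON was read in as nested R lists, which may not match how `sales` was actually loaded.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for `sales`: one customer with two purchases
sales_demo <- list(
  list(
    first_name = "Kimberly",
    hobbies = list("Ultimate Disc", "Shopping"),
    purchases = list(
      list(SetID = 24701, Quantity = 1),
      list(SetID = 1, Quantity = 2)
    )
  )
)

demo <- tibble(item = sales_demo) %>%
  unnest_wider(item) %>%          # one column per customer field
  unnest_longer(purchases) %>%    # one row per purchase
  unnest_wider(purchases)         # one column per purchase field

is.list(demo$hobbies)  # hobbies is never unnested, so it is still a list column
nrow(demo)             # the customer (and her hobbies) repeats once per purchase
```

Two things follow from this: hobbies cannot be passed to `sum()` directly, and each customer's hobbies are duplicated across their purchase rows.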

I then tried to work out the five most popular hobbies among LEGO buyers:

purchases %>%
  select(hobbies) %>%
  distinct() %>%
  summarise(
    n = n(),
    most_pop_hobbies = sum(hobbies),
    .groups = "drop_last") %>%
  select(hobbies, most_pop_hobbies) %>%
  slice_max(most_pop_hobbies, n=5) %>%
  distinct()

I get this error message:

Error in UseMethod("collapse") : no applicable method for 'collapse' applied to an object of class "character"

I think the error occurs because the hobbies variable is a list, but I'm not sure how to fix it. Any help would be greatly appreciated.
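
One possible approach, offered as a sketch rather than a confirmed fix: reduce the data to one row per customer, unnest hobbies into one row per hobby, and count. The toy tibble `purchases_demo` below is hypothetical, and treating first_name, last_name, and phone_number together as a customer key is an assumption about the data.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for `purchases`: Kimberly appears twice (two purchases)
purchases_demo <- tibble(
  first_name   = c("Kimberly", "Kimberly", "Alex", "Sam"),
  last_name    = c("Beckstead", "Beckstead", "Doe", "Roe"),
  phone_number = c("216-555-2549", "216-555-2549", "216-555-0001", "216-555-0002"),
  hobbies = list(
    c("Ultimate Disc", "Shopping"),
    c("Ultimate Disc", "Shopping"),
    c("Shopping", "Chess"),
    c("Shopping")
  )
)

top_hobbies <- purchases_demo %>%
  # Assumed customer key: keep one row per customer so hobbies are not
  # counted once per purchase
  distinct(first_name, last_name, phone_number, .keep_all = TRUE) %>%
  select(hobbies) %>%
  unnest_longer(hobbies) %>%                      # one row per (customer, hobby)
  count(hobbies, sort = TRUE, name = "n_customers") %>%
  slice_max(n_customers, n = 5, with_ties = FALSE)

top_hobbies
```

Note that `unnest_longer()` drops customers with no hobbies by default, and `with_ties = FALSE` caps the result at five rows even when counts are tied; both choices may need adjusting for the real data.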

0 answers:

No answers yet