Map over a nested tibble, mutate, and apply a function

Time: 2019-10-11 18:17:17

Tags: r

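This is a follow-up to a question I asked here, but the problem is different.

I would like to know what is wrong with my code for mapping over nested tibbles. The data can be generated as follows: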

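library(tidyquant)
library(lubridate)
library(dplyr)    # group_by(), mutate(), ntile()
library(tidyr)    # nest(), unnest()
library(stringr)  # str_pad()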
tickers <- c("GIS", "KR", "MKC", "SJM", "EL", "HRL", "HSY", "K", 
             "KMB", "MDLZ", "MNST", "PEP", "PG", "PM", "SYY", "TAP", "TSN", "WBA", "WMT",
             "MMM", "ABMD", "ACN", "AMD", "AES", "AON", "ANTM", "APA", "CSCO", "CMS", "KO", "GRMN", "GPS",
             "JEC", "SJM", "JPM", "JNPR", "KSU", "KEYS", "KIM", "NBL", "NEM", "NWL", "NFLX", "NEE", "NOC", "TMO", "TXN", "TWTR")


data <- tq_get(tickers,
               get = "stock.prices",              # Collect the stock price data from 2010 - 2015
               from = "2010-01-01",
               to = "2015-01-01") %>%
  group_by(symbol) %>%
  tq_transmute(select = adjusted,                 # Convert the data from daily prices to monthly prices
               mutate_fun = periodReturn,
               period = "monthly",
               type = "arithmetic")

df_monthly <- data %>%
  mutate(year = year(date)) %>%
  group_by(symbol, year) %>%                     # I group_by and nest the data in order to create the event data which remains fixed over the monthly periods
  nest() %>%
  mutate(                                        # Here I randomly create the dates
    release_date = paste(year,
                         str_pad(ceiling(runif(row_number(), min = 1, max = 12)), 2, pad = "0"),    # Create the months 1 - 12 months
                         str_pad(ceiling(runif(row_number(), min = 1, max = 27)), 2, pad = "0"),    # Create the days - I choose 27 days in a month since later I set the days to the end of month day
                         sep = "-"),
    score = runif(row_number(), min = 0, max = 1),                                                  # Randomly generate some scoring function
    release_date = as.Date(release_date),
    release_date = ceiling_date(release_date, "month") - days(1) # This gives the end of month date
    ) %>%
  unnest(cols = data) %>%                        # unnest to expand the yearly release_date and score to the monthly data
  ungroup() %>%
  mutate_if(is.integer, as.numeric) %>%
  arrange(release_date)

The part I am stuck on is this:

d <- df_monthly %>%
  group_by(release_date) %>%
  nest() %>%
  map(~data %>%
      mutate(ntile_score = ntile(score, 2))
    )

This does not work either:

df_monthly %>%
  group_by(release_date) %>%
  nest() %>%
  map(~data %>%
      mutate(ntile_score = ~ntile(.x$score, 2))
    )

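What I want to do is map over the nested data and compute the ntiles. I have tried several approaches, but none of them seems to work.

1 Answer:

Answer 0: (score: 2)

We need to call map inside mutate. Alternatively, if the function should be applied to the standalone 'data' column, extract that column with pull or $ and then apply map.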

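library(dplyr)
library(purrr)
library(tidyr)   # provides nest()
out <- df_monthly %>%
         group_by(release_date) %>%
         nest() %>%
         # map() over the list column so each nested tibble is mutated in turn
         mutate(data =  map(data, ~
                        .x %>%
                         mutate(ntile_score = ntile(score, 2))))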
out
# A tibble: 55 x 2
# Groups:   release_date [55]
#   release_date data             
#   <date>       <list>           
# 1 2010-02-28   <tibble [72 × 6]>
# 2 2010-03-31   <tibble [24 × 6]>
# 3 2010-04-30   <tibble [96 × 6]>
# 4 2010-05-31   <tibble [12 × 6]>
# 5 2010-06-30   <tibble [60 × 6]>
# 6 2010-07-31   <tibble [72 × 6]>
# 7 2010-08-31   <tibble [48 × 6]>
# 8 2010-09-30   <tibble [24 × 6]>
# 9 2010-10-31   <tibble [12 × 6]>
#10 2010-11-30   <tibble [72 × 6]>
# … with 45 more rows
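
To see why map has to go inside mutate: a tibble is itself a list of columns, so piping the nested tibble straight into map iterates over its columns (release_date and data) rather than over the nested tibbles. In the question's lambda, the name data most likely also resolves to the global data object created by tq_get, not to the list column. A minimal sketch on made-up data, where toy and toy_nested are hypothetical names and not part of the question's data:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical toy data standing in for df_monthly
toy <- tibble(
  release_date = as.Date(c("2010-02-28", "2010-02-28", "2010-03-31", "2010-03-31")),
  score        = c(0.1, 0.9, 0.4, 0.6)
)

toy_nested <- toy %>%
  group_by(release_date) %>%
  nest()

# map() on a data frame visits each column in turn,
# so the function runs once per column, not once per nested tibble
col_classes <- map(toy_nested, class)

# The result is named after the columns of toy_nested
names(col_classes)
```

Calling map(data, ...) inside mutate instead hands map the list column itself, which is what the answer's code does.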

- Check one of the list elements

out$data[[1]]
# A tibble: 72 x 6
#   symbol  year date       monthly.returns score ntile_score
#   <chr>  <dbl> <date>               <dbl> <dbl>       <int>
# 1 PM      2010 2010-01-29        -0.0778  0.450           1
# 2 PM      2010 2010-02-26         0.0762  0.450           1
# 3 PM      2010 2010-03-31         0.0767  0.450           1
# 4 PM      2010 2010-04-30        -0.0590  0.450           1
# 5 PM      2010 2010-05-28        -0.101   0.450           1
# 6 PM      2010 2010-06-30         0.0522  0.450           1
# 7 PM      2010 2010-07-30         0.113   0.450           1
# 8 PM      2010 2010-08-31         0.00627 0.450           1
# 9 PM      2010 2010-09-30         0.103   0.450           1
#10 PM      2010 2010-10-29         0.0444  0.450           1
# … with 62 more rows
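
The other option mentioned in the answer, extracting the list column with pull or $ and then applying map, could be sketched as follows. The toy data and the names toy, toy_nested, toy_list and toy_list2 are assumptions made for illustration only:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical toy data standing in for df_monthly
toy <- tibble(
  release_date = as.Date(c("2010-02-28", "2010-02-28", "2010-03-31", "2010-03-31")),
  score        = c(0.1, 0.9, 0.4, 0.6)
)

toy_nested <- toy %>%
  group_by(release_date) %>%
  nest()

# Extract the list column first, then map over its elements
toy_list <- toy_nested %>%
  pull(data) %>%
  map(~ .x %>% mutate(ntile_score = ntile(score, 2)))

# The same extraction written with `$`
toy_list2 <- map(toy_nested$data, ~ mutate(.x, ntile_score = ntile(score, 2)))
```

Unlike the mutate version, this returns a plain list of tibbles and leaves release_date behind, so it fits best when only the nested tibbles are needed.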