How to use map2 with vectors of unequal length

Time: 2019-06-04 02:48:02

Tags: r purrr

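Question

I am trying to calculate income tax for incomes between $1 and $200,000, increasing in steps of $100 (2000 values).

I have scraped information on tax rates, which gives me a list of 34 data frames.

I have a function that calculates tax payable from an income and the applicable tax rates.

Using that function, I want to produce a vector of tax payable for:

  1. each income level (2000 values)
  2. each set of rates (34 sets of rates)

It would be great if this output could be returned in a data frame/tibble.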

Data

library(tidyverse)
library(rvest)

# Scrape the tax administrator's website and return a list of tidy data frames
# showing tax rates for income years between 2016 and 1983
url <- "https://www.ato.gov.au/Rates/Individual-income-tax-for-prior-years/"
pit_sch <- url %>%
  read_html() %>%
  html_table() %>%
  # Name each table after its caption
  setNames(., url %>%
             read_html() %>%
             html_nodes("caption") %>%
             html_text()) %>%
  map(. %>%
    mutate(`Tax on this income` = gsub(",", "", `Tax on this income`),
           # Dollar amount at the start of the string
           cumm_tax_amt = str_extract(`Tax on this income`, "(?<=^\\$)\\d+") %>% as.numeric(),
           # Marginal rate in cents, e.g. "32.5c"
           tax_rate = str_extract(`Tax on this income`, "\\d+(\\.\\d+)?(?=\\s*c)") %>% as.numeric(),
           # Dollar amount at the end of the string
           threshold = str_extract(`Tax on this income`, "(?<=\\$)\\d+$") %>% as.numeric()
          )
    ) %>%
  # Drop rows with no threshold, then replace the remaining NAs with 0
  map(~drop_na(.x, threshold)) %>%
  map(function(x) mutate(x, across(everything(), ~ replace(.x, is.na(.x), 0))))

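# Define the income levels: $1 to $200,000 in steps of $100 (2000 values)
income <- seq(from = 1, to = 200000, by = 100)

# Calculate tax payable for a single income under one rate schedule.
# Columns are accessed by position: 3, 4 and 5 are the cumm_tax_amt, tax_rate
# and threshold columns added above, given two scraped columns before them.
tax_calc <- function(data, income) {
  # Index of the highest bracket whose threshold the income reaches
  i <- tail(which(income >= data[[5]]), 1)
  if (length(i) > 0) {
    return((income - data[[5]][i]) * (data[[4]][i] / 100) + data[[3]][i])
  } else {
    return(0)
  }
}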
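
To make the bracket arithmetic in tax_calc concrete, here is a minimal step-by-step sketch in base R. The schedule `toy_sch`, its values, and the variable `one_income` are made up for illustration and are not the scraped data; the columns are referred to by name rather than by position.

```r
# Hypothetical three-bracket schedule (illustrative values, not scraped data)
toy_sch <- data.frame(
  cumm_tax_amt = c(0, 0, 3572),      # tax accumulated below the bracket
  tax_rate     = c(0, 19, 32.5),     # cents per dollar above the threshold
  threshold    = c(0, 18200, 37000)  # lower bound of the bracket
)

one_income <- 50000

# Highest bracket whose threshold the income reaches
i <- tail(which(one_income >= toy_sch$threshold), 1)

# Marginal tax within the bracket plus the cumulative amount below it
tax <- (one_income - toy_sch$threshold[i]) * (toy_sch$tax_rate[i] / 100) +
  toy_sch$cumm_tax_amt[i]

print(i)
print(tax)
```

Here the income falls in the third bracket, so the tax is 13,000 dollars at 32.5 cents plus the 3,572 accumulated below the threshold.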

My attempt

> map2(pit_sch, income, tax_calc)
Error: Mapped vectors must have consistent lengths:
* `.x` has length 34
* `.y` has length 2000
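
The error follows from how map2() pairs its inputs: it walks `.x` and `.y` position by position, so the two must have the same length (or one of them length 1). The sketch below reproduces the situation on small made-up inputs, with `x` standing in for the 34 schedules and `y` for the 2000 incomes, and shows that nesting one map inside another evaluates every combination instead. All names here are illustrative.

```r
library(purrr)

x <- list(a = 1, b = 2, c = 3)  # stands in for the list of schedules
y <- c(10, 20)                  # stands in for the income vector

# map2() cannot pair 3 elements with 2, so this call raises an error
res <- tryCatch(
  map2(x, y, function(xi, yi) xi + yi),
  error = function(e) "length error"
)

# Nested maps: for each element of x, loop over every element of y
nested <- map(x, function(xi) map_dbl(y, function(yi) xi + yi))

print(res)
print(nested)
```

The nested result is a list with one numeric vector per element of `x`, each as long as `y`.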

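1 answer:

Answer 0 (score: 1)

To properly distinguish between the different incomes and the year each tax amount is calculated for, I would suggest having the tax_calc function return a tibble with the income and the calculated tax.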

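library(tidyverse)

# Same bracket lookup as in the question, but returns a one-row tibble
# holding the income and the tax calculated for it
tax_calc <- function(data, income) {
  i <- tail(which(income >= data[[5]]), 1)
  if (length(i) > 0) {
    return(tibble(income = income,
                  tax = (income - data[[5]][i]) * (data[[4]][i] / 100) + data[[3]][i]))
  } else {
    return(tibble(income = income, tax = 0))
  }
}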

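Since you want tax_calc applied to every income for each element of pit_sch, you can use

map(pit_sch, ~map_df(income, tax_calc, data = .)) %>% bind_rows(., .id = "id")

The id column holds the name of the schedule each row comes from. You can check the result on a short input such as tail(income).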
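
As a rough illustration of the shape this produces, the sketch below runs the same nested map() / map_df() / bind_rows() pattern on made-up inputs. The schedules in `toy_sch`, the vector `toy_income`, and the function `flat_tax` are hypothetical stand-ins: `flat_tax` only mimics the (data, income) signature of tax_calc and applies a single flat rate, not the bracket logic.

```r
library(purrr)
library(dplyr)
library(tibble)

# Hypothetical flat-rate schedules standing in for the scraped data frames
toy_sch <- list(year_a = tibble(rate = 10), year_b = tibble(rate = 20))
toy_income <- c(100, 200, 300)

# Hypothetical stand-in for tax_calc(): same arguments, one-row tibble output
flat_tax <- function(data, income) {
  tibble(income = income, tax = income * data$rate / 100)
}

# Inner map_df() loops over incomes; outer map() loops over schedules;
# bind_rows() stacks the results and records the list names in "id"
out <- map(toy_sch, ~ map_df(toy_income, flat_tax, data = .)) %>%
  bind_rows(.id = "id")

print(out)
```

With two schedules and three incomes the result has six rows; by the same logic, 34 schedules and 2000 incomes would give 68,000 rows.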