Subtract data from a string

Time: 2017-03-14 15:17:20

Tags: r string split

Sorry for my poor English. I have a promotion campaign table like this:

promotion_campaign <- data.table(product = letters[1:3],description = c('30-5,40-7,50-9','20-5,30-6,40-8','20-4,30-5,50-8'),tagged_price = c(30,21,52))

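The 'description' column holds threshold-discount pairs: if tagged_price is greater than or equal to the number before a '-', the price is reduced by the number after that '-'. For example, product a has tagged_price = 30, so its actual price is 30 - 5 = 25. When several thresholds are reached, the highest one applies (for product c, 52 - 8 = 44). The campaign differs for each product. The result table should look like this: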

promotion_campaign <- data.table(product = letters[1:3],description = c('30-5,40-7,50-9','20-5,30-6,40-8','20-4,30-5,50-8'),tagged_price = c(30,21,52),actual_price = c(25,16,44))

Getting 'actual_price' involves splitting the string, matching the tagged price to the right tier, and subtracting the discount. Can anyone enlighten me?

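3 answers:

Answer 0: (score: 2)

A tidyverse (specifically dplyr and tidyr) solution. Hopefully it is understandable. In any case, you can run it line by line to see what happens at each step.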

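library(tidyverse)
promotion_campaign %>% 
    # one row per "threshold-discount" tier
    mutate(description = strsplit(description, ",")) %>% 
    unnest(description) %>% 
    # split each tier into numeric threshold and discount columns
    separate(description, c("price_point", "discount"), "-", convert = TRUE) %>% 
    # keep only the tiers that the tagged price reaches
    filter(tagged_price >= price_point) %>% 
    # take the highest reached tier for each product
    arrange(product, -price_point) %>% 
    group_by(product) %>%
    slice(1) %>% 
    mutate(actual_price = tagged_price - discount)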
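
Since the question starts from a data.table, the same steps (split into tiers, keep the reached tiers, take the highest one per product) can also be sketched in data.table itself. This is only an illustration: the intermediate names `tiers`, `tier` and `best` are made up here, and it assumes every product reaches at least one threshold, because products that reach none are dropped by the filter.

```r
library(data.table)

promotion_campaign <- data.table(
  product = letters[1:3],
  description = c("30-5,40-7,50-9", "20-5,30-6,40-8", "20-4,30-5,50-8"),
  tagged_price = c(30, 21, 52)
)

# One row per tier: split each description on ","
tiers <- promotion_campaign[
  , .(tier = unlist(strsplit(description, ",", fixed = TRUE))),
  by = .(product, tagged_price)
]

# Split "threshold-discount" into two numeric columns
tiers[, c("price_point", "discount") :=
        tstrsplit(tier, "-", fixed = TRUE, type.convert = TRUE)]

# Keep the reached tiers, then take the highest one per product
best <- tiers[tagged_price >= price_point,
              .SD[which.max(price_point)],
              by = product]
best[, actual_price := tagged_price - discount]

print(best[, .(product, tagged_price, actual_price)])
```

With the sample data this gives actual prices of 25, 16 and 44 for products a, b and c.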

Answer 1: (score: 1)

We can try

i1 <- as.integer(sub("-.*", "", promotion_campaign$description)) < 
                  promotion_campaign$tagged_price
m1 <- do.call(rbind, lapply(strsplit(promotion_campaign$description, 
                     '[-,]'), function(x) as.numeric(x)[c(2, 6)]))
promotion_campaign$actual_price <-  ifelse(i1, promotion_campaign$tagged_price - m1[,2], 
               promotion_campaign$tagged_price - m1[,1])
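
The snippet above only chooses between the first and the third discount, so it relies on every description having exactly three tiers. A more general base R sketch, under the assumption that a price below every threshold gets no discount, could look like the following. The helper name `apply_tier_discount` is hypothetical and not part of the original answer.

```r
# Hypothetical helper: subtract the discount of the highest tier whose
# threshold does not exceed the tagged price.
apply_tier_discount <- function(description, tagged_price) {
  mapply(function(desc, price) {
    # "30-5,40-7" -> list(c("30", "5"), c("40", "7"))
    tiers <- strsplit(strsplit(desc, ",", fixed = TRUE)[[1]], "-", fixed = TRUE)
    thresholds <- as.numeric(vapply(tiers, `[`, character(1), 1))
    discounts  <- as.numeric(vapply(tiers, `[`, character(1), 2))
    eligible <- thresholds <= price
    # Assumption: if no threshold is reached, the price stays unchanged
    if (!any(eligible)) return(price)
    price - discounts[eligible][which.max(thresholds[eligible])]
  }, description, tagged_price, USE.NAMES = FALSE)
}

promotion_campaign <- data.frame(
  product = letters[1:3],
  description = c("30-5,40-7,50-9", "20-5,30-6,40-8", "20-4,30-5,50-8"),
  tagged_price = c(30, 21, 52),
  stringsAsFactors = FALSE
)

promotion_campaign$actual_price <- apply_tier_discount(
  promotion_campaign$description,
  promotion_campaign$tagged_price
)
print(promotion_campaign$actual_price)  # 25 16 44
```

Because the tiers are parsed per row, this does not depend on the number of tiers or on their order within the string.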

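Answer 2: (score: 1)

This answer uses dplyr and tidyr rather than data.table. It does not generalize easily if the number of discounts varies, but it does answer your question.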

library(dplyr)
library(tidyr)
promotion_campaign <- data.frame(product = letters[1:3],description = c('30-5,40-7,50-9','20-5,30-6,40-8','20-4,30-5,50-8'),tagged_price = c(30,21,52))

promotion_campaign2 <- promotion_campaign %>% 
  separate(description,
           c("cut1", "discount1", "cut2", "discount2", "cut3", "discount3"),
           convert = TRUE) %>% 
  # check the highest tier first so the largest reached threshold wins
  mutate(actual_price = ifelse(tagged_price >= cut3, tagged_price - discount3,
                               ifelse(tagged_price >= cut2, tagged_price - discount2,
                                      ifelse(tagged_price >= cut1, tagged_price - discount1,
                                             tagged_price))))

> promotion_campaign2
  product cut1 discount1 cut2 discount2 cut3 discount3 tagged_price actual_price
1       a   30         5   40         7   50         9           30           25
2       b   20         5   30         6   40         8           21           16
3       c   20         4   30         5   50         8           52           44