New column with an additional dimension

Time: 2018-09-13 11:37:46

Tags: r

Des  Price                              New column                New column 2
a    27.82 / 27.82 / 23.65 / 27.82      27.82 / 23.65 / 27.82     price decreased and increased
b    19.87 / 19.87 / 19.14 / 19.87      19.87 / 19.14 / 19.87     price decreased and increased
c    32.25 / 32.25 / 31 / 32.25 / 31    32.25 / 31 / 32.25 / 31   price decre, incre and decre
d    32.25                              32.25                     Constant

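Can we add a new dimension to the data? For example, New column contains 27.82 / 23.65 / 27.82, so can we add another column that says whether the price decreased or increased? I mean, the initial value is 27.82, then it first decreases and then increases. Similarly, for 32.25 / 31 / 32.25 / 31 it decreases, increases, then decreases.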

2 Answers:

Answer 0 (score: 0)

Original question

df = data.frame(Des = c("a","c"),
                    NewColumn = c("27.82 / 23.65 / 27.82", "32.25 / 31 / 32.25 / 31"))

library(tidyverse)

df %>% 
  mutate(v = NewColumn) %>%                                   # duplicate the column
  separate_rows(v, convert = TRUE) %>%                        # split that column into one row per value
  group_by(Des) %>%                                           # for each Des value
  mutate(flag = ifelse(v < lag(v), "Dec", "Inc")) %>%         # flag whether the value decreased or increased
  na.omit() %>%                                               # remove rows with NAs
  group_by(Des, NewColumn) %>%                                # for each Des and NewColumn combination
  summarise(NewColumn2 = paste0(flag, collapse = ", ")) %>%   # create a string sequence of flags
  ungroup()

# # A tibble: 2 x 3
#   Des   NewColumn               NewColumn2   
#   <fct> <fct>                   <chr>        
# 1 a     27.82 / 23.65 / 27.82   Dec, Inc     
# 2 c     32.25 / 31 / 32.25 / 31 Dec, Inc, Dec

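Note that, based on the example you provided, I assume that

a) Price is not needed for your calculation

b) two consecutive values in NewColumn are never equal (i.e. every step is either an increase or a decrease)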
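If assumption (b) does not hold, the comparison `v < lag(v)` would label an unchanged price as "Inc". Below is a minimal base R sketch of a tie-aware labelling. `label_changes` is a hypothetical helper name and "Const" is an assumed label for equal neighbours; neither comes from the question.

```r
# Hypothetical helper: label each step between consecutive values.
label_changes <- function(x) {
  steps <- sign(diff(x))                 # -1 = decrease, 0 = no change, 1 = increase
  c("Dec", "Const", "Inc")[steps + 2]    # map -1/0/1 to positions 1/2/3
}

label_changes(c(16, 13, 13, 15))
# [1] "Dec"   "Const" "Inc"
```

A single value has no steps, so the helper returns an empty character vector in that case and the caller decides how to label it.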

Edit

If some rows have only one value, you can use this:

df = data.frame(Des = c("a","b","c"),
                NewColumn = c("27.82 / 23.65 / 27.82", "12", "32.25 / 31 / 32.25 / 31"))

library(tidyverse)

df %>% 
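  mutate(v = NewColumn) %>%                                   # duplicate the column
  separate_rows(v, convert = TRUE) %>%                        # split that column into one row per value
  group_by(Des) %>%                                           # for each Des value
  mutate(flag = ifelse(v < lag(v), "Dec", "Inc"),             # NA for the first value of each group
         NumValues = n(),                                     # number of values for this Des
         flag = ifelse(NumValues == 1, "Const", flag)) %>%    # a single value is flagged as constant
  na.omit() %>%                                               # drop the first row of multi-value groups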
  group_by(Des, NewColumn) %>%                                
  summarise(NewColumn2 = paste0(flag, collapse = ", ")) %>%   
  ungroup()

# # A tibble: 3 x 3
#   Des   NewColumn               NewColumn2   
#   <fct> <fct>                   <chr>        
# 1 a     27.82 / 23.65 / 27.82   Dec, Inc     
# 2 b     12                      Const        
# 3 c     32.25 / 31 / 32.25 / 31 Dec, Inc, Dec

Answer 1 (score: 0)

Using base R:

df$NewColumn1 <- sapply(df$NewColumn, function(x){
  temp <- as.numeric(strsplit(as.character(x), split = "/")[[1]])
  if(length(temp) > 1){
    return(paste(ifelse(temp[-1] < head(temp, -1), "Dec", 
                 ifelse(temp[-1] > head(temp, -1), "Inc", "Const")), 
          collapse = ", "))
  }else{
    return("Const")
  }
})

Output:

  Des               NewColumn      NewColumn1
1   a   27.82 / 23.65 / 27.82        Dec, Inc
2   b                      12           Const
3   c 32.25 / 31 / 32.25 / 31   Dec, Inc, Dec
4   d       16 / 13 / 13 / 15 Dec, Const, Inc
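
The question's desired column uses phrases such as "price decreased and increased" and "Constant" rather than Dec/Inc codes. A rough base R sketch of that wording follows. `describe_prices` is a hypothetical helper, and the word "unchanged" for equal neighbours is an assumption, since the question has no such case.

```r
# Hypothetical helper: turn one price string into a phrase like the question's.
describe_prices <- function(x) {
  vals <- as.numeric(strsplit(as.character(x), "/", fixed = TRUE)[[1]])
  if (length(vals) < 2) return("Constant")          # a single value never changes
  words <- c("decreased", "unchanged", "increased")[sign(diff(vals)) + 2]
  if (all(words == "unchanged")) return("Constant") # e.g. "16 / 16"
  if (length(words) == 1) return(paste("price", words))
  # join all but the last word with commas, then append "and <last word>"
  paste0("price ", paste(head(words, -1), collapse = ", "), " and ", tail(words, 1))
}

vapply(c("27.82 / 23.65 / 27.82", "32.25 / 31 / 32.25 / 31", "32.25"),
       describe_prices, character(1), USE.NAMES = FALSE)
# [1] "price decreased and increased"
# [2] "price decreased, increased and decreased"
# [3] "Constant"
```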

Data (a modification of @AntoniosK's data):

df <- structure(list(Des = structure(1:4, .Label = c("a", "b", "c", 
"d"), class = "factor"), NewColumn = structure(c(3L, 1L, 4L, 
2L), .Label = c("12", "16 / 13 / 13 / 15", "27.82 / 23.65 / 27.82", 
"32.25 / 31 / 32.25 / 31"), class = "factor")), .Names = c("Des", 
"NewColumn"), row.names = c(NA, -4L), class = "data.frame")