Type Network Show Placement Cost Dates (chr)
VVVV AJSS XGAF BHGHF 103.00 3/21,3/23
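My data looks like the data frame excerpt above. I need to look at the Dates column and, based on the number of dates given in each observation (always comma-separated, e.g. 1/20,1/23,1/30), replicate the row once per date, with each copy holding a single one of the comma-separated dates. In the end, the snippet above should be converted to the following.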
Type Network Show Placement Cost Dates (chr)
VVVV AJSS XGAF BHGHF 103.00 3/21
VVVV AJSS XGAF BHGHF 103.00 3/23
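I have found plenty of solutions that count the "," characters, which I can then use to replicate the rows, but I cannot get the individual dates into the Dates column. If possible I would like to stick to base R, but I am open to any package if that is what it takes to accomplish the goal.
For clarity, regarding the duplicate flag: I know how to replicate rows, but how do I assign each of the comma-separated dates to its own newly created row?
Thanks in advance; please let me know if there is any other information I should provide.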
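For anyone who wants to reproduce the excerpt, here is a minimal sketch of the input and the desired output as data frames. It assumes the column is simply named Dates and that "(chr)" is only the printed type; the object names df and expected are placeholders, not something given in the question.

```r
# One-row data frame mirroring the excerpt in the question.
# Assumption: the column is named "Dates"; "(chr)" is just its printed type.
df <- data.frame(
  Type = "VVVV",
  Network = "AJSS",
  Show = "XGAF",
  Placement = "BHGHF",
  Cost = 103.00,
  Dates = "3/21,3/23",
  stringsAsFactors = FALSE
)

# Desired result: one row per date, every other column repeated.
expected <- data.frame(
  Type = c("VVVV", "VVVV"),
  Network = c("AJSS", "AJSS"),
  Show = c("XGAF", "XGAF"),
  Placement = c("BHGHF", "BHGHF"),
  Cost = c(103.00, 103.00),
  Dates = c("3/21", "3/23"),
  stringsAsFactors = FALSE
)
```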
Answer 0 (score: 1)
unnest() from tidyr may be what you are looking for: split Dates into a list column, then unnest it into one row per date.
library(dplyr); library(tidyr)
df %>% transform(Dates = strsplit(as.character(Dates), ",")) %>% unnest(Dates)
Source: local data frame [2 x 6]
Type Network Show Placement Cost Dates
(fctr) (fctr) (fctr) (fctr) (dbl) (chr)
1 VVVV AJSS XGAF BHGHF 103 3/21
2 VVVV AJSS XGAF BHGHF 103 3/23
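If tidyr is acceptable anyway, its separate_rows() function might do the split and the row replication in one call. This is only a sketch on a hand-built copy of the question's row (the data frame construction and the name long are my own assumptions), and newer tidyr releases mark separate_rows() as superseded by separate_longer_delim().

```r
library(tidyr)

# Hand-built copy of the row from the question (column named "Dates" by assumption).
df <- data.frame(
  Type = "VVVV", Network = "AJSS", Show = "XGAF", Placement = "BHGHF",
  Cost = 103, Dates = "3/21,3/23",
  stringsAsFactors = FALSE
)

# Split Dates on commas and emit one row per piece; other columns are repeated.
long <- separate_rows(df, Dates, sep = ",")
long
```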
Answer 1 (score: 1)
Here is a base R solution:
## generate data
set.seed(1L)
N <- 4L
df <- data.frame(
  Type = replicate(N, paste(collapse = '', rep(sample(LETTERS, 1L), 4L))),
  Network = replicate(N, paste(collapse = '', sample(LETTERS, 4L))),
  Placement = replicate(N, paste(collapse = '', sample(LETTERS, 5L))),
  Cost = round(runif(N, 50, 150), 2L),
  `Dates (chr)` = replicate(N, paste(collapse = ',', gsub('\\b0', '', format(
    format = '%m/%d',
    sample(seq(as.Date('2016-01-01'), as.Date('2016-12-31'), 1L), sample(1:4, 1L))
  )))),
  stringsAsFactors = FALSE, check.names = FALSE
)
df
## Type Network Placement Cost Dates (chr)
## 1 GGGG FWYP YFPCZ 132.09 10/15,1/9,6/22
## 2 JJJJ QBEX KAJUH 114.71 9/10,6/23,11/9
## 3 OOOO RJSL MOLES 128.29 3/30,1/26
## 4 XXXX SYJR RTCQJ 105.30 4/25
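## solution #1
## split each cell into its dates, repeat every other column once per date,
## then append the flattened dates as the new date column
ds <- strsplit(df$`Dates (chr)`, ',')
data.frame(
  c(lapply(df[!names(df) %in% 'Dates (chr)'], function(col) rep(col, sapply(ds, length))),
    list(`Dates (chr)` = unlist(ds))),
  check.names = FALSE, stringsAsFactors = FALSE
)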
## Type Network Placement Cost Dates (chr)
## 1 GGGG FWYP YFPCZ 132.09 10/15
## 2 GGGG FWYP YFPCZ 132.09 1/9
## 3 GGGG FWYP YFPCZ 132.09 6/22
## 4 JJJJ QBEX KAJUH 114.71 9/10
## 5 JJJJ QBEX KAJUH 114.71 6/23
## 6 JJJJ QBEX KAJUH 114.71 11/9
## 7 OOOO RJSL MOLES 128.29 3/30
## 8 OOOO RJSL MOLES 128.29 1/26
## 9 XXXX SYJR RTCQJ 105.30 4/25
Another possibility:
## solution #2
## repeat whole rows by index, then overwrite the date column with the flattened dates
ds <- strsplit(df$`Dates (chr)`, ',')
df2 <- df[rep(seq_len(nrow(df)), sapply(ds, length)), ]
df2$`Dates (chr)` <- unlist(ds)
df2
## Type Network Placement Cost Dates (chr)
## 1 GGGG FWYP YFPCZ 132.09 10/15
## 1.1 GGGG FWYP YFPCZ 132.09 1/9
## 1.2 GGGG FWYP YFPCZ 132.09 6/22
## 2 JJJJ QBEX KAJUH 114.71 9/10
## 2.1 JJJJ QBEX KAJUH 114.71 6/23
## 2.2 JJJJ QBEX KAJUH 114.71 11/9
## 3 OOOO RJSL MOLES 128.29 3/30
## 3.1 OOOO RJSL MOLES 128.29 1/26
## 4 XXXX SYJR RTCQJ 105.30 4/25
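Two details of solution #2 may matter in practice: the replicated rows keep row names such as 1.1 and 1.2, and a date that appears twice in one cell yields two identical rows. A possible variation, sketched on a small made-up data frame (the repeated 10/15 is invented purely for illustration), applies unique() per cell and resets the row names:

```r
# Hypothetical input: the first row deliberately repeats the date 10/15.
df <- data.frame(
  Type = c("GGGG", "XXXX"),
  Cost = c(132.09, 105.30),
  `Dates (chr)` = c("10/15,1/9,10/15", "4/25"),
  stringsAsFactors = FALSE, check.names = FALSE
)

# Split on literal commas and keep each date once per cell.
ds <- lapply(strsplit(df$`Dates (chr)`, ",", fixed = TRUE), unique)

# Repeat each row once per remaining date, then fill in the dates.
df2 <- df[rep(seq_len(nrow(df)), lengths(ds)), ]
df2$`Dates (chr)` <- unlist(ds)

# Replace row names like "1.1" with plain consecutive numbers.
rownames(df2) <- NULL
df2
```

lengths(ds) is the built-in shorthand for sapply(ds, length), so the replication logic is otherwise the same as in solution #2.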
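Answer 2 (score: 0)
Here is a data.table solution. It uses the data.frame from bgoldst's answer. Note that merging on Type alone assumes that Type uniquely identifies each original row.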
library(data.table)
set.seed(1L)
N <- 4L
df <- data.frame(
  Type = replicate(N, paste(collapse = '', rep(sample(LETTERS, 1L), 4L))),
  Network = replicate(N, paste(collapse = '', sample(LETTERS, 4L))),
  Placement = replicate(N, paste(collapse = '', sample(LETTERS, 5L))),
  Cost = round(runif(N, 50, 150), 2L),
  `Dates (chr)` = replicate(N, paste(collapse = ',', gsub('\\b0', '', format(
    format = '%m/%d',
    sample(seq(as.Date('2016-01-01'), as.Date('2016-12-31'), 1L), sample(1:4, 1L))
  )))),
  stringsAsFactors = FALSE, check.names = FALSE
)
dd <- as.data.table(df)
## split the dates per Type, then merge them back onto the remaining columns
merge(dd[, .(Type, Network, Placement, Cost)],
      dd[, (strsplit(`Dates (chr)`, split = "[,]")), Type],
      by = "Type")
Type Network Placement Cost V1
1: GGGG FWYP YFPCZ 132.09 10/15
2: GGGG FWYP YFPCZ 132.09 1/9
3: GGGG FWYP YFPCZ 132.09 6/22
4: JJJJ QBEX KAJUH 114.71 9/10
5: JJJJ QBEX KAJUH 114.71 6/23
6: JJJJ QBEX KAJUH 114.71 11/9
7: OOOO RJSL MOLES 128.29 3/30
8: OOOO RJSL MOLES 128.29 1/26
9: XXXX SYJR RTCQJ 105.30 4/25
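A variation that skips the merge and gives the result column a name instead of V1 could group by all non-date columns directly. This is a sketch rather than a tested replacement: the two input rows are copied from the output shown in this thread, and the names dd, long and Dates are my own choices.

```r
library(data.table)

# Two rows taken from the sample data shown in this thread.
dd <- data.table(
  Type = c("GGGG", "XXXX"),
  Network = c("FWYP", "SYJR"),
  Placement = c("YFPCZ", "RTCQJ"),
  Cost = c(132.09, 105.30),
  `Dates (chr)` = c("10/15,1/9,6/22", "4/25")
)

# For each group of identifying columns, split the date string on literal
# commas and return the pieces as a column named Dates.
long <- dd[, .(Dates = unlist(strsplit(`Dates (chr)`, ",", fixed = TRUE))),
           by = .(Type, Network, Placement, Cost)]
long
```

Grouping on every identifying column avoids relying on Type alone being unique, at the cost of having to list those columns explicitly.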
Answer 3 (score: 0)
Another solution using data.table, which also takes the question's update into account by applying unique() to the split dates:
## split one group's date string and keep each date once
test <- function(x) {
  unique(unlist(strsplit(x[['Dates (chr)']], ',')))
}
library(data.table)
## creating sample dataset
set.seed(1L)
N <- 4L
df <- data.frame(
  Type = replicate(N, paste(collapse = '', rep(sample(LETTERS, 1L), 4L))),
  Network = replicate(N, paste(collapse = '', sample(LETTERS, 4L))),
  Placement = replicate(N, paste(collapse = '', sample(LETTERS, 5L))),
  Cost = round(runif(N, 50, 150), 2L),
  `Dates (chr)` = replicate(N, paste(collapse = ',', gsub('\\b0', '', format(
    format = '%m/%d',
    sample(seq(as.Date('2016-01-01'), as.Date('2016-12-31'), 1L), sample(1:4, 1L))
  )))),
  stringsAsFactors = FALSE, check.names = FALSE
)
setDT(df)
## .SD holds the remaining column (the dates) for each group
df[, test(.SD), by = c('Type', 'Network', 'Placement', 'Cost')]
## Type Network Placement Cost V1
##1: GGGG FWYP YFPCZ 132.09 10/15
##2: GGGG FWYP YFPCZ 132.09 1/9
##3: GGGG FWYP YFPCZ 132.09 6/22
##4: JJJJ QBEX KAJUH 114.71 9/10
##5: JJJJ QBEX KAJUH 114.71 6/23
##6: JJJJ QBEX KAJUH 114.71 11/9
##7: OOOO RJSL MOLES 128.29 3/30
##8: OOOO RJSL MOLES 128.29 1/26
##9: XXXX SYJR RTCQJ 105.30 4/25