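I have a data frame with two columns. One of the columns holds a string of aggregated IDs, and I need to split out each ID and give it its own row. I can do this for a single row, like so: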
> UPI.download[1:20,]
Design.ID Glados.SKU
5 KCLRI7-00VU-INSGN1
6 KPBK4K-00VU-INSGN1
7 KGLGI7-00VU-FETTI1
8 KUWB08-00VU-INSGN1
9 KUWB08-00VU-INSGN1
10 KGLGI7-00VU-FETTI1
11 KPBK4K-00VU-INSGN1
12 KCLRI7-00VU-INSGN1
13 KBAMI7-00VU-INSGN1
14 KCLRI7-0FLA-WAVE01
15 K510WL-0WEB-PRIME1 K510WL-0WEB-PRIME1
16 K110MS-0WEB-PRIME1 K110MS-0WEB-PRIME1
17 KCLRI6-0GON-INSGN1 KCLRI6-0GON-INSGN1;KCLRI7-0GON-INSGN1;KCLR7X-0GON-INSGN1;KCLR6X-0GON-INSGN1
18 KCLRI6-0GON-INSGN1
19 KBAMI7-0GSU-INSGN1
20 KCLRI7-0GSU-INSGN1 KCLR7X-0GSU-INSGN1;KCLR6X-0GSU-INSGN1;KCLRI7-0GSU-INSGN1;KCLRI6-0GSU-INSGN1
21 KUSB08-0TAM-BRICK1 KUSB08-0TAM-BRICK1
22 K510WD-0LSU-PASLY1 K510WD-0LSU-PASLY1
23 KCLR8P-0MST-INSGN1 KCLR8P-0MST-INSGN1
24 KCLRI6-0TCU-INSGN1 KCLR6X-0TCU-INSGN1;KCLRI6-0TCU-INSGN1;KCLR7X-0TCU-INSGN1;KCLRI7-0TCU-INSGN1
> y = strsplit(UPI.download[13,2],";")
> z = data.frame(UPI.download[13,1],y)
> colnames(z) = c("Design.ID","Glados.SKU")
> z
Design.ID Glados.SKU
1 KCLRI6-0GON-INSGN1 KCLRI6-0GON-INSGN1
2 KCLRI6-0GON-INSGN1 KCLRI7-0GON-INSGN1
3 KCLRI6-0GON-INSGN1 KCLR7X-0GON-INSGN1
4 KCLRI6-0GON-INSGN1 KCLR6X-0GON-INSGN1
But when I try to build a loop, I get the following:
> for(i in nrow(UPI.download)){
+ UPIs = strsplit(UPI.download[i,2],";")
+ new.frame = data.frame(UPI.download[i,1],UPIs)
+ }
Error in data.frame(UPI.download[i, 1], UPIs) :
arguments imply differing number of rows: 1, 0
Either I'm way off, or there is just a small tweak I'm not seeing.
Answer 0 (score: 0)
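Assume your data is in a data.frame called dat. There are a few issues with your original code. First (as @Yannis mentioned), you are not iterating correctly: you need to write 1:nrow(dat) instead of nrow(dat), because nrow(dat) is a single number and the loop body runs only once, for the last row. Next, the informative error message says the row lengths do not match. This is caused by blank values, for which strsplit returns character(0). We need to check the length of the unlist-ed strsplit result, which is handled by the if statement inside the for loop. Finally, we rbind the results together. Note that this for loop is inefficient for larger datasets. In general I would not recommend iterating this way, but it gets the job done.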
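To make the two failure modes concrete, here is a minimal sketch on a made-up three-row data frame (the values in dat are hypothetical; only the column names come from the question). It shows that looping over nrow(dat) visits a single index, and that splitting a blank string gives a zero-length vector that data.frame() cannot combine with a one-element column.

```r
# Hypothetical toy data mirroring the question's layout; the last SKU is blank
dat <- data.frame(
  Design.ID  = c("A1", "B2", "C3"),
  Glados.SKU = c("A1;A2", "B2", ""),
  stringsAsFactors = FALSE
)

# nrow(dat) is the single number 3, so this loop body runs once, with i = 3
visited <- c()
for (i in nrow(dat)) visited <- c(visited, i)

# seq_len(nrow(dat)) (or 1:nrow(dat) for a non-empty frame) covers every row
all_rows <- seq_len(nrow(dat))

# Splitting an empty string yields a zero-length character vector
blank_split <- unlist(strsplit(dat[3, 2], ";"))

# data.frame() cannot line up a 1-row column with a 0-row column,
# so capture the error message instead of stopping
res <- tryCatch(
  data.frame(dat[3, 1], strsplit(dat[3, 2], ";")),
  error = function(e) conditionMessage(e)
)
```

Here visited ends up holding only the last row index and blank_split has length 0, which is the situation the if branch in the loop that follows guards against.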
# initialize big data frame
big.frame <- data.frame(
  'Design.ID' = character(0),
  'Glados.SKU' = character(0),
  stringsAsFactors = FALSE
)
# for each row
for(i in 1:nrow(dat)){
  # split the string and unlist the result
  UPIs = unlist(strsplit(dat[i,2],";"))
  if(length(UPIs) == 0){ # if it's blank, just make column blank
    new.frame = data.frame('Design.ID' = dat[i,1],
                           'Glados.SKU' = '',
                           stringsAsFactors = FALSE)
  }else{ # otherwise insert the vector
    new.frame = data.frame('Design.ID' = dat[i,1],
                           'Glados.SKU' = UPIs,
                           stringsAsFactors = FALSE)
  }
  # rbind to bigger frame
  big.frame <- rbind.data.frame(big.frame,
                                new.frame,
                                stringsAsFactors = FALSE)
}
Design.ID Glados.SKU
1 KCLRI7-00VU-INSGN1
2 KPBK4K-00VU-INSGN1
3 KGLGI7-00VU-FETTI1
4 KUWB08-00VU-INSGN1
5 KUWB08-00VU-INSGN1
6 KGLGI7-00VU-FETTI1
7 KPBK4K-00VU-INSGN1
8 KCLRI7-00VU-INSGN1
9 KBAMI7-00VU-INSGN1
10 KCLRI7-0FLA-WAVE01
11 K510WL-0WEB-PRIME1 K510WL-0WEB-PRIME1
12 K110MS-0WEB-PRIME1 K110MS-0WEB-PRIME1
13 KCLRI6-0GON-INSGN1 KCLRI6-0GON-INSGN1
14 KCLRI6-0GON-INSGN1 KCLRI7-0GON-INSGN1
15 KCLRI6-0GON-INSGN1 KCLR7X-0GON-INSGN1
16 KCLRI6-0GON-INSGN1 KCLR6X-0GON-INSGN1
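Since the loop grows a data frame with rbind on every iteration, a loop-free version may be worth considering for larger inputs. The following is only a sketch in base R, not the approach used above: it assumes Glados.SKU is a character column, uses a made-up dat for illustration, and introduces the names parts and long purely for this example.

```r
# Hypothetical toy data; column names follow the answer, values are made up
dat <- data.frame(
  Design.ID  = c("A1", "B2", "C3"),
  Glados.SKU = c("A1;A2;A3", "", "C3"),
  stringsAsFactors = FALSE
)

# Split every SKU string at once; blank strings give character(0)
parts <- strsplit(dat$Glados.SKU, ";")

# Replace zero-length results with "" so blank rows survive as one row each
parts[lengths(parts) == 0] <- ""

# Repeat each Design.ID once per SKU and flatten the list of SKUs
long <- data.frame(
  Design.ID  = rep(dat$Design.ID, times = lengths(parts)),
  Glados.SKU = unlist(parts),
  stringsAsFactors = FALSE
)
```

The idea is the same as in the loop: each Design.ID is paired with every SKU in its row, and blank SKUs are kept as a single empty entry. The difference is that the result is built in one step instead of being appended to row by row.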