structure(list(Other = c(NA_character_, NA_character_, NA_character_,
NA_character_, NA_character_),
Years = c("2005, 2005, 2006, 2006, 2007", "2011, 2014",
"2007", "2011, 2011, 2011, 2012, 2012, 2012",
"2006, 2006, 2012, 2012, 2015")),
.Names = c("Other", "Years"), row.names = 1:4, class = "data.frame")
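Given the data frame above, the second column holds a series of years separated by commas. I would like to create a new column that gives the total number of years in each element of that column, so that the final data frame looks like this: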
structure(list(Other = c(NA_character_, NA_character_, NA_character_,
NA_character_, NA_character_),
Years = c("2005, 2005, 2006, 2006, 2007","2011, 2014", "2007",
"2011, 2011, 2011, 2012, 2012, 2012",
"2006, 2006, 2012, 2012, 2015"),
yearlength = c(5, 2, 1, 6, 5)),
.Names = c("Other", "Years", "yearlength"), row.names = 1:4, class = "data.frame")
I tried solutions such as stack$yearlength <- count.fields(textConnection(stack), sep = ","),
but could not quite get them to work.
Answer 0: (score: 2)
One approach is to count the commas and add 1:
df$yearlength <- stringr::str_count(df$Years, ",") + 1
df
#output
Other Years yearlength
1 <NA> 2005, 2005, 2006, 2006, 2007 5
2 <NA> 2011, 2014 2
3 <NA> 2007 1
4 <NA> 2011, 2011, 2011, 2012, 2012, 2012 6
5 <NA> 2006, 2006, 2012, 2012, 2015 5
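If stringr is not available, the same comma count can be sketched in base R by comparing string lengths before and after stripping the commas. The helper name `count_by_commas` is hypothetical, and the sketch assumes every non-missing entry holds at least one year (an empty string would also be counted as 1).

```r
# Hypothetical helper: number of comma-separated items per string, base R only.
count_by_commas <- function(x) {
  x <- as.character(x)
  # Characters removed by deleting the commas = number of commas; add 1 for the items.
  # nchar() returns NA for missing strings, so NA input stays NA.
  nchar(x) - nchar(gsub(",", "", x, fixed = TRUE)) + 1L
}

years <- c("2005, 2005, 2006, 2006, 2007", "2011, 2014", "2007",
           "2011, 2011, 2011, 2012, 2012, 2012", "2006, 2006, 2012, 2012, 2015")
count_by_commas(years)
```

As with the stringr version, the result can be assigned straight to a new column, for example `df$yearlength <- count_by_commas(df$Years)`.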
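Another approach is to count the runs of digits:
df$yearlength <- stringr::str_count(df$Years, "\\d+")
A third option (with thanks to Sotos's comment) is to count words:
stringi::stri_count_words(df$Years)
or
stringr::str_count(df$Years, "\\w+")
A fourth option is to count runs of non-whitespace characters:
stringr::str_count(df$Years, "\\S+")
EDIT: when the data set contains NA:
df[3, 2] <- NA
# all.equal() compares only two objects at a time, so check each result against the first
counts <- list(stringr::str_count(df$Years, ",") + 1,
stringr::str_count(df$Years, "\\d+"),
stringi::stri_count_words(df$Years),
stringr::str_count(df$Years, "\\w+"),
stringr::str_count(df$Years, "\\S+"))
all(sapply(counts[-1], function(x) isTRUE(all.equal(counts[[1]], x))))
All of the above solutions yield
#output
5 2 NA 6 5
To change NA to 0:
df$yearlength <- stringr::str_count(df$Years, ",") + 1  # recompute now that row 3 is NA
df$yearlength[is.na(df$yearlength)] <- 0
df
#output
Other Years yearlength
1 <NA> 2005, 2005, 2006, 2006, 2007 5
2 <NA> 2011, 2014 2
3 <NA> <NA> 0
4 <NA> 2011, 2011, 2011, 2012, 2012, 2012 6
5 <NA> 2006, 2006, 2012, 2012, 2015 5
Data (the data in the question is corrupted: it declares row.names = 1:4 for five rows):
# same values as in the question, with row.names corrected to 1:5
df <- structure(list(Other = c(NA_character_, NA_character_, NA_character_,
NA_character_, NA_character_),
Years = c("2005, 2005, 2006, 2006, 2007", "2011, 2014",
"2007", "2011, 2011, 2011, 2012, 2012, 2012",
"2006, 2006, 2012, 2012, 2015")),
.Names = c("Other", "Years"), row.names = 1:5, class = "data.frame")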
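The comma-based and digit-based counts agree on clean input but can differ on untidy strings, which may matter when choosing between them. The sketch below is a base-R take on the digit-run idea that also returns 0 for NA, mirroring the NA handling above; the helper name `count_digit_runs` and the trailing-comma example are illustrative assumptions, not part of the original data.

```r
# Hypothetical helper: count runs of digits per string, treating NA as 0.
count_digit_runs <- function(x) {
  n <- integer(length(x))   # starts as all zeros, so NA entries stay 0
  ok <- !is.na(x)
  # gregexpr() locates every run of digits; regmatches() extracts them; lengths() counts them.
  n[ok] <- lengths(regmatches(x[ok], gregexpr("[0-9]+", x[ok])))
  n
}

# Illustrative input: the second string has a stray trailing comma.
x <- c("2011, 2014", "2011, 2014,", "2007", NA)
count_digit_runs(x)
```

On the second string a commas-plus-one count would report 3, while counting digit runs reports 2, so the digit-based pattern is the more forgiving choice when the delimiters are not guaranteed to be clean.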
Answer 1: (score: 1)
You can split on the commas and then simply take the length of each resulting vector.
# row.names corrected to 1:5 to match the five rows
xy <- structure(list(Other = c(NA_character_, NA_character_, NA_character_,
NA_character_, NA_character_), Years = c("2005, 2005, 2006, 2006, 2007",
"2011, 2014", "2007", "2011, 2011, 2011, 2012, 2012, 2012", "2006, 2006, 2012, 2012, 2015"
)), .Names = c("Other", "Years"), row.names = 1:5, class = "data.frame")
sapply(strsplit(xy$Years, ","), length)
[1] 5 2 1 6 5
Added to account for NA values (as in @missuse's answer):
xy[3, 2] <- NA
sapply(strsplit(xy$Years, ","), FUN = function(x) {
length(na.omit(x))
})
[1] 5 2 0 6 5
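The split-and-count idea can also be written without an anonymous function, using `lengths()` (available in base R since version 3.2.0) and zeroing the missing entries afterwards. This is only a sketch of an equivalent formulation; the helper name `count_split` is hypothetical.

```r
# Hypothetical helper: split on commas, count the pieces, and report 0 for NA.
count_split <- function(x) {
  n <- lengths(strsplit(x, ",", fixed = TRUE))
  # strsplit() turns an NA string into a length-1 vector holding NA,
  # so it would otherwise be counted as 1.
  n[is.na(x)] <- 0L
  n
}

years <- c("2005, 2005, 2006, 2006, 2007", "2011, 2014", NA,
           "2011, 2011, 2011, 2012, 2012, 2012", "2006, 2006, 2012, 2012, 2015")
count_split(years)
```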