I need to fill in the missing sequence values of $Year within each level of the factor $Country. The $Count column can simply be padded with 0.
Country Year Count
A 1 1
A 2 1
A 4 2
B 1 1
B 3 1
so that I end up with
Country Year Count
A 1 1
A 2 1
A 3 0
A 4 2
B 1 1
B 2 0
B 3 1
Hope that's clear, guys. Thanks in advance!
Answer 0 (score: 5)
This is a dplyr and tidyr solution using complete / full_seq.
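The code for this answer did not survive in the thread, so the following is only a minimal sketch of how complete and full_seq could be combined, not the answerer's original code. The data frame name df and the result name res are assumptions; the values are taken from the question.

```r
library(dplyr)
library(tidyr)

# Example data from the question (the name df is an assumption)
df <- data.frame(
  Country = c("A", "A", "A", "B", "B"),
  Year    = c(1, 2, 4, 1, 3),
  Count   = c(1, 1, 2, 1, 1)
)

res <- df %>%
  group_by(Country) %>%
  # full_seq() builds min(Year):max(Year) per group in steps of 1;
  # complete() adds the missing rows and fills Count with 0
  complete(Year = full_seq(Year, period = 1), fill = list(Count = 0)) %>%
  ungroup()

print(res)
```

Grouping first is what makes the sequence run from each country's own minimum to its own maximum rather than over the range of the whole table.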
Answer 1 (score: 4)
library(data.table)
# d is your original data.frame
setDT(d)
foo <- d[, .(Year = min(Year):max(Year)), Country]
res <- merge(d, foo, all.y = TRUE)[is.na(Count), Count := 0]
Answer 2 (score: 4)
Similar to @PoGibas' answer:
library(data.table)
# set default values
def = list(Count = 0L)
# create table with all levels
fullDT = setkey(DT[, .(Year = seq(min(Year), max(Year))), by=Country])
# initialize to defaults
fullDT[, names(def) := def ]
# overwrite from data
fullDT[DT, names(def) := mget(sprintf("i.%s", names(def))) ]
which gives
Country Year Count
1: A 1 1
2: A 2 1
3: A 3 0
4: A 4 2
5: B 1 1
6: B 2 0
7: B 3 1
This generalizes to more columns (besides Count). I guess similar functionality exists in the "tidyverse", with a name like "expand" or "complete".
Answer 3 (score: 4)
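Another base R idea is to split on Country, use setdiff to find the values missing from seq(max(Year)), and rbind them to the original data frame. Then use do.call to rbind the list back into a data frame. Note that seq(max(Year)) starts at 1, so this assumes each country's years begin at 1, i.e.
d1 <- do.call(rbind, c(lapply(split(df, df$Country), function(i) {
  # years absent from this country's rows
  missing <- setdiff(seq(max(i$Year)), i$Year)
  # rep() keeps the columns the same length even when no year is missing
  x <- rbind(i, data.frame(Country = rep(i$Country[1], length(missing)),
                           Year = missing,
                           Count = rep(0, length(missing))))
  x[order(x$Year), ]
}), make.row.names = FALSE))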
Answer 4 (score: 2)
> setkey(DT,Country,Year)
> DT[setkey(DT[, .(min(Year):max(Year)), by = Country], Country, V1)]
Country Year Count
1: A 1 1
2: A 2 1
3: A 3 NA
4: A 4 2
5: B 1 1
6: B 2 NA
7: B 3 1
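The join above leaves NA in Count for the added rows, whereas the question asks for 0. As a hedged sketch (not part of the original answer), the same join can be followed by an update by reference. The names grid and res are assumptions, and the join columns are spelled out with on = instead of relying on keys.

```r
library(data.table)

# Example data from the question
DT <- data.table(
  Country = c("A", "A", "A", "B", "B"),
  Year    = c(1L, 2L, 4L, 1L, 3L),
  Count   = c(1L, 1L, 2L, 1L, 1L)
)

# One row per Country and Year between that country's min and max year
grid <- DT[, .(Year = min(Year):max(Year)), by = Country]

# Join DT onto the full grid; years absent from DT get NA in Count
res <- DT[grid, on = .(Country, Year)]

# Replace the NA counts with 0 by reference
res[is.na(Count), Count := 0L]

print(res)
```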
Answer 5 (score: 2)
Another dplyr and tidyr solution.
library(dplyr)
library(tidyr)
dt2 <- dt %>%
  group_by(Country) %>%
  # tibble() replaces the deprecated data_frame()
  do(tibble(Country = unique(.$Country),
            Year = full_seq(.$Year, 1))) %>%
  full_join(dt, by = c("Country", "Year")) %>%
  replace_na(list(Count = 0))
Answer 6 (score: 2)
Here is an approach in base R that uses tapply, do.call, range, and seq to calculate the year sequences. It then constructs a data.frame from the returned named list, merges it onto the original data (which adds the desired rows), and finally fills in the missing values.
# get named list with year sequences
temp <- tapply(dat$Year, dat$Country, function(x) do.call(seq, as.list(range(x))))
# construct data.frame
mydf <- data.frame(Year=unlist(temp), Country=rep(names(temp), lengths(temp)))
# merge onto original
mydf <- merge(dat, mydf, all=TRUE)
# fill in missing values
mydf[is.na(mydf)] <- 0
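As a usage sketch, the steps above can be wrapped in a small function and run on the question's data. The function name fill_years and the names grid and res are hypothetical, and this variant fills only the Count column rather than every NA in the table.

```r
# Hypothetical helper wrapping the steps above (the name fill_years is an assumption)
fill_years <- function(dat) {
  # named list of year sequences, one per country
  temp <- tapply(dat$Year, dat$Country, function(x) seq(min(x), max(x)))
  # one row per Country and Year in those sequences
  grid <- data.frame(Country = rep(names(temp), lengths(temp)),
                     Year = unlist(temp, use.names = FALSE))
  # merge on the shared columns (Country, Year); all = TRUE keeps the added rows
  res <- merge(dat, grid, all = TRUE)
  # fill only Count, leaving any other column untouched
  res$Count[is.na(res$Count)] <- 0
  res
}

# Example data from the question
dat <- data.frame(
  Country = c("A", "A", "A", "B", "B"),
  Year    = c(1L, 2L, 4L, 1L, 3L),
  Count   = c(1, 1, 2, 1, 1)
)

res <- fill_years(dat)
print(res)
```

Restricting the fill to Count matters once the table has other columns where NA is meaningful, since mydf[is.na(mydf)] <- 0 overwrites every NA it finds.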