I have a table that looks like this:
ID Start_year Status_2005 Status_2006 Status_2007
1 2005 GBR GBR FRA
2 2006 NA FRA FRA
3 2007 NA NA GBR
4 2006 NA UKR RUS
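I want to reshape the data so that it gives the status in each year counted from the start year. So the table above would look like this: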
ID Year_0 Year_1 Year_2
1 GBR GBR FRA
2 FRA FRA NA
3 GBR NA NA
4 UKR RUS NA
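I've been trying to do this in R with the tidyverse, using gather together with starts_with, and mutate to create new columns. However, I always end up with a single "years_since_start_year" column and can't figure out how to spread that column to form the final table.
Any help is much appreciated.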
Answer 0 (score: 2)
Here's a data.table approach:
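A minimal data.table sketch of this reshape, offered as an illustration and not necessarily the answerer's original code. It melts the Status_* columns to long format, computes each row's offset from Start_year, and casts back to wide. The object and column names (dt, long, wide, year, status, offset, col) are choices made here, and the data is copied from the question.

```r
library(data.table)

# Example data from the question
dt <- data.table(
  ID = 1:4,
  Start_year = c(2005, 2006, 2007, 2006),
  Status_2005 = c("GBR", NA, NA, NA),
  Status_2006 = c("GBR", "FRA", NA, "UKR"),
  Status_2007 = c("FRA", "FRA", "GBR", "RUS")
)

# Long format: one row per ID and Status_* column
long <- melt(dt,
             id.vars = c("ID", "Start_year"),
             variable.name = "year",
             value.name = "status",
             variable.factor = FALSE)

# Years elapsed since each ID's start year
long[, offset := as.integer(sub("Status_", "", year, fixed = TRUE)) - Start_year]
long[, col := paste0("Year_", offset)]

# Keep only years at or after the start year, then cast back to wide
wide <- dcast(long[offset >= 0], ID ~ col, value.var = "status")
print(wide)
```

Because the offset is computed from Start_year, a missing status in the middle of a series would stay in its own Year_* column instead of shifting later values to the left.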
Answer 1 (score: 1)
This is how I would do it with the tidyverse:
library(tidyverse)
# create data
df_raw <- data.frame(ID = 1:4,
                     Start_year = c(2005, 2006, 2007, 2006),
                     Status_2005 = c("GBR", NA, NA, NA),
                     Status_2006 = c("GBR", "FRA", NA, "UKR"),
                     Status_2007 = c("FRA", "FRA", "GBR", "RUS"),
                     stringsAsFactors = FALSE)
df <- df_raw %>%
  # long format: one row per ID and Status_* column
  gather(starts_with("Status"), key = Key, value = Value) %>%
  arrange(ID) %>%
  # drop missing statuses (in this data, NAs only occur before the start year)
  drop_na(Value) %>%
  # number the remaining rows within each ID, starting at 0
  mutate(cnt = unlist(map(rle(ID)$lengths - 1, seq, from = 0, by = 1))) %>%
  mutate(Key = paste0("Year_", cnt)) %>%
  select(-Start_year, -cnt) %>%
  # back to wide format: one Year_* column per offset
  spread(key = Key, value = Value)
df
#> ID Year_0 Year_1 Year_2
#> 1 1 GBR GBR FRA
#> 2 2 FRA FRA <NA>
#> 3 3 GBR <NA> <NA>
#> 4 4 UKR RUS <NA>
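In current tidyr, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The following is a minimal sketch of the same reshape with those functions, assuming the df_raw data above. Unlike counting non-missing values, it computes the offset from Start_year explicitly. The names year, status, offset and res are illustrative choices, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df_raw <- data.frame(ID = 1:4,
                     Start_year = c(2005, 2006, 2007, 2006),
                     Status_2005 = c("GBR", NA, NA, NA),
                     Status_2006 = c("GBR", "FRA", NA, "UKR"),
                     Status_2007 = c("FRA", "FRA", "GBR", "RUS"),
                     stringsAsFactors = FALSE)

res <- df_raw %>%
  # long format; strip the "Status_" prefix so `year` holds "2005", "2006", ...
  pivot_longer(cols = starts_with("Status_"),
               names_to = "year",
               names_prefix = "Status_",
               values_to = "status") %>%
  # years elapsed since each ID's start year
  mutate(offset = as.integer(year) - Start_year) %>%
  # keep only years at or after the start year
  filter(offset >= 0) %>%
  select(ID, offset, status) %>%
  # wide format: one Year_<offset> column per offset
  pivot_wider(names_from = offset,
              values_from = status,
              names_prefix = "Year_")

print(res)
```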
Answer 2 (score: 1)
Here's a rough base R + dplyr approach:
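# df is the original wide table from the question
df %>%
  select(starts_with("Status")) %>%
  # per row: drop NAs, then pad back to length 3 with trailing NAs
  apply(1, function(x) {x <- x[!is.na(x)]; length(x) <- 3; x}) %>%
  t() %>%
  as.data.frame() %>%
  cbind(df[["ID"]], .) %>%
  setNames(c("ID", paste0("Year_", 1:3)))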
ID Year_1 Year_2 Year_3
1 1 GBR GBR FRA
2 2 FRA FRA <NA>
3 3 GBR <NA> <NA>
4 4 UKR RUS <NA>
Tidyverse style:
library(tidyr)
library(dplyr)
df %>%
  select(-Start_year) %>%
  gather(key = "year", value = "country", -ID) %>%
  filter(!is.na(country)) %>%
  group_by(ID) %>%
  mutate(year = paste0("year_", seq_along(year))) %>%
  spread(key = "year", value = "country")
# A tibble: 4 x 4
# Groups: ID [4]
ID year_1 year_2 year_3
<int> <chr> <chr> <chr>
1 1 GBR GBR FRA
2 2 FRA FRA NA
3 3 GBR NA NA
4 4 UKR RUS NA