I have a data frame with the following structure:

var1           var2            var3
año: 2005      km: 128000      marca: chevrolet
año: 2019      marca: hyundai  km: 50000
marca: toyota  año: 2012       km: 340000

I need to create new variables, each holding the corresponding information:

año   marca      km
2005  chevrolet  128000
2019  hyundai    50000
2012  toyota     340000

I would appreciate any help with this.
Answer 0 (score: 0)
library(tidyverse)

df <- tibble::tribble(
  ~var1,           ~var2,            ~var3,
  "ano: 2005",     "km: 128000",     "marca: chevrolet",
  "ano: 2019",     "marca: hyundai", "km: 50000",
  "marca: toyota", "ano: 2012",      "km: 340000"
)
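# Keep a row id so that values from the same original row stay together;
# stacking the columns without it pairs values by column order and mixes rows.
df %>%
  mutate(row = row_number()) %>%
  pivot_longer(-row, names_to = "var", values_to = "pair") %>%
  separate(pair, into = c("column", "value"), sep = ": ") %>%
  pivot_wider(id_cols = row, names_from = column, values_from = value) %>%
  select(ano, marca, km)
#> # A tibble: 3 x 3
#>   ano   marca     km
#>   <chr> <chr>     <chr>
#> 1 2005  chevrolet 128000
#> 2 2019  hyundai   50000
#> 3 2012  toyota    340000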
Answer 1 (score: 0)
Here is a base R solution:
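pat <- c("ano", "marca", "km")
# For each row, look up each name in pat among the row's keys and take its value
dfout <- setNames(data.frame(t(apply(df, 1, function(v) {
  keys <- trimws(gsub(":.*", "", v))
  trimws(gsub(".*:", "", v))[match(pat, keys)]
}))), pat)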
This gives:
> dfout
   ano     marca     km
1 2005 chevrolet 128000
2 2019   hyundai  50000
3 2012    toyota 340000
Data
df <- structure(list(
  var1 = c("ano: 2005", "ano: 2019", "marca: toyota"),
  var2 = c("km: 128000", "marca: hyundai", "ano: 2012"),
  var3 = c("marca: chevrolet", "km: 50000", "km: 340000")
), class = "data.frame", row.names = c(NA, -3L))
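To see what happens to a single row, the row-wise step can be unpacked as below. This is only a minimal sketch: the row `v` is a hypothetical example (its keys are deliberately in a different order from `pat`), and the names `keys`, `values`, `idx` and `row_out` are introduced here for illustration.

```r
# A hypothetical row whose keys are not in the target order
v   <- c("km: 128000", "ano: 2005", "marca: chevrolet")
pat <- c("ano", "marca", "km")

# Split every "key: value" string into its two parts
keys   <- trimws(sub(":.*", "", v))
values <- trimws(sub(".*:", "", v))

# For each target column in pat, find where its key sits in this row
idx <- match(pat, keys)

# Reorder the values to follow pat and label them
row_out <- setNames(values[idx], pat)
row_out
```

Looking up `pat` among the row's keys (rather than the other way round) keeps the result correct for any ordering of the three entries.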
Answer 2 (score: 0)
One way to solve this with purrr, dplyr and tidyr could be:
map_dfr(.x = split.default(df, seq_along(df)),
        ~ .x %>%
          mutate(rowid = row_number()) %>%
          separate(1, sep = ": ", into = c("column", "variable"))) %>%
  pivot_wider(names_from = "column", values_from = "variable")

  rowid ano   marca     km
  <int> <chr> <chr>     <chr>
1     1 2005  chevrolet 128000
2     2 2019  hyundai   50000
3     3 2012  toyota    340000
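The same idea (attach a row id, go long, then spread the keys into columns) can also be sketched without any packages. This is an illustrative variant, not taken from the answers: it swaps in base R's `reshape()` for `pivot_wider()`, and the names `pairs`, `long` and `wide` are assumptions made for the example.

```r
df <- data.frame(
  var1 = c("ano: 2005", "ano: 2019", "marca: toyota"),
  var2 = c("km: 128000", "marca: hyundai", "ano: 2012"),
  var3 = c("marca: chevrolet", "km: 50000", "km: 340000"),
  stringsAsFactors = FALSE
)

# Long format: unlist() walks the columns one after another,
# so the row ids repeat once per column.
pairs <- unlist(df, use.names = FALSE)
long <- data.frame(
  rowid = rep(seq_len(nrow(df)), times = ncol(df)),
  key   = trimws(sub(":.*", "", pairs)),
  value = trimws(sub(".*:", "", pairs)),
  stringsAsFactors = FALSE
)

# Wide format: one row per rowid, one column per key
wide <- reshape(long, idvar = "rowid", timevar = "key",
                v.names = "value", direction = "wide")

# reshape() prefixes the new columns with "value."; strip that prefix
names(wide) <- sub("^value\\.", "", names(wide))
wide
```

The row id is what keeps values from the same original row together once the columns have been stacked.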