View(fastcars)
day car1 car2 car3
1 day1 red silver blue
2 day2 blue red green
3 day3 blue white green
4 day4 green black red
5 day5 black red silver
Combine the car colors from all columns into one list of unique names.
cars <- stack(fastcars[, c(2:4)])
cars <- t(unique(cars[,1]))
Add the colors as new column names at the end of the data frame:
fastcars[c(cars)] <- NA
day car1 car2 car3 red blue green black silver white
1 day1 red silver blue NA NA NA NA NA NA
2 day2 blue red green NA NA NA NA NA NA
3 day3 blue white green NA NA NA NA NA NA
4 day4 green black red NA NA NA NA NA NA
5 day5 black red silver NA NA NA NA NA NA
I want to replace each NA with 1 if the column name matches a value in the car1, car2, and/or car3 columns of that row, and with 0 otherwise.
day car1 car2 car3 red blue green black silver white
day1 red silver blue 1 1 0 0 1 0
day2 blue red green 1 1 1 0 0 0
day3 blue white green 0 1 1 0 0 1
day4 green black red 1 0 1 1 0 0
day5 black red silver 1 0 0 1 1 0
I believe the linked answer is close to what I want to do, but I can't figure out how to apply it to my existing data frame. https://stackoverflow.com/a/30274596/3837899
#Generate example dataframe with character column
example <- as.data.frame(c("A", "A", "B", "F", "C", "G", "C", "D", "E", "F"))
names(example) <- "strcol"
#For every unique value in the string column, create a new 1/0 column
#This is what Factors do "under-the-hood" automatically when passed to function requiring numeric data
for(level in unique(example$strcol)){
example[paste("dummy", level, sep = "_")] <- ifelse(example$strcol == level, 1, 0)
}
Answer 0 (score: 4)
Here are a couple of options.
Option 1: We can use a data.table merge after recasting the melted data.
library(data.table) # v1.9.6
## make 'df' a data.table
setDT(df)
## melt, cast, and merge on 'day'
df[dcast(melt(df, "day"), day ~ value, fun.aggregate = length), on = "day"]
# day car1 car2 car3 black blue green red silver white
# 1: day1 red silver blue 0 1 0 1 1 0
# 2: day2 blue red green 0 1 1 1 0 0
# 3: day3 blue white green 0 1 1 0 0 1
# 4: day4 green black red 1 0 1 1 0 0
# 5: day5 black red silver 1 0 0 1 1 0
Option 2: Here is a less elegant but plain base R approach.
## make sure car columns are character (may not be necessary)
df[-1] <- lapply(df[-1], as.character)
## get unique values of car columns
u <- unique(unlist(df[-1]))
## match 'u' with each row in 'df'
l <- lapply(seq_len(nrow(df)), function(i) as.numeric(u %in% df[i, -1]))
## bring the data together
cbind(df, setNames(do.call(rbind.data.frame, l), u))
# day car1 car2 car3 red blue green black silver white
# 1 day1 red silver blue 1 1 0 0 1 0
# 2 day2 blue red green 1 1 1 0 0 0
# 3 day3 blue white green 0 1 1 0 0 1
# 4 day4 green black red 1 0 1 1 0 0
# 5 day5 black red silver 1 0 0 1 1 0
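As a possible variation on Option 2, the row-wise lapply can be swapped for a loop over the unique colours. This is only a sketch, not part of the original answer: the data frame below is rebuilt by hand with character columns, and the names car_cols, ind, and result are made up for illustration.

```r
# Hand-built copy of the example data (assumption: character columns, not factors)
df <- data.frame(
  day  = paste0("day", 1:5),
  car1 = c("red", "blue", "blue", "green", "black"),
  car2 = c("silver", "red", "white", "black", "red"),
  car3 = c("blue", "green", "green", "red", "silver"),
  stringsAsFactors = FALSE
)

car_cols <- c("car1", "car2", "car3")   # hypothetical helper name

# unique colours in order of first appearance
u <- unique(unlist(df[car_cols]))

# one column per colour: 1 if the colour occurs in any car column of the row
ind <- sapply(u, function(col) as.integer(rowSums(df[car_cols] == col) > 0))

result <- cbind(df, ind)
result
```

Because sapply names the resulting columns after the elements of u, no separate setNames step is needed here.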
Data:
df <-structure(list(day = structure(1:5, .Label = c("day1", "day2",
"day3", "day4", "day5"), class = "factor"), car1 = structure(c(4L,
2L, 2L, 3L, 1L), .Label = c("black", "blue", "green", "red"), class = "factor"),
car2 = structure(c(3L, 2L, 4L, 1L, 2L), .Label = c("black",
"red", "silver", "white"), class = "factor"), car3 = structure(c(1L,
2L, 2L, 3L, 4L), .Label = c("blue", "green", "red", "silver"
), class = "factor")), .Names = c("day", "car1", "car2",
"car3"), class = "data.frame", row.names = c("1", "2", "3", "4",
"5"))
Answer 1 (score: 2)
Starting from this data.frame:
> fastcars
day car1 car2 car3 red blue green black silver white
1 day1 red silver blue NA NA NA NA NA NA
2 day2 blue red green NA NA NA NA NA NA
3 day3 blue white green NA NA NA NA NA NA
4 day4 green black red NA NA NA NA NA NA
5 day5 black red silver NA NA NA NA NA NA
Here is one way to do it in base R:
#for every colour fill in each column
for (i in c('red','blue','green','black','silver','white')){
#apply over the rows: return 1 if the row contains the corresponding colour,
#or 0 otherwise
fastcars[, i] <- apply(fastcars[2:4], 1, function(x) ifelse(any(x==i),1,0) )
}
Output:
> fastcars
day car1 car2 car3 red blue green black silver white
1 day1 red silver blue 1 1 0 0 1 0
2 day2 blue red green 1 1 1 0 0 0
3 day3 blue white green 0 1 1 0 0 1
4 day4 green black red 1 0 1 1 0 0
5 day5 black red silver 1 0 0 1 1 0
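One caveat with the loop above: any(x == i) evaluates to NA when a row contains a missing value and no match, so the new column would get NA instead of 0. A sketch of an NA-tolerant variant follows, assuming that behaviour is unwanted. The two-row data frame, its missing entry, and the name colours are invented for illustration; the colour list is also derived from the data rather than typed by hand.

```r
# Tiny made-up example with one missing entry (not the question's data)
fastcars <- data.frame(
  day  = c("day1", "day2"),
  car1 = c("red", "blue"),
  car2 = c("silver", NA),      # hypothetical missing value
  car3 = c("blue", "green"),
  stringsAsFactors = FALSE
)

# derive the colours from the data, dropping NA
vals <- unlist(fastcars[2:4])
colours <- unique(vals[!is.na(vals)])

for (i in colours) {
  # %in% never returns NA, so each cell is strictly 0 or 1
  fastcars[[i]] <- apply(fastcars[2:4], 1, function(x) as.integer(i %in% x))
}
fastcars
```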
Answer 2 (score: 2)
Using dplyr and tidyr:
library(dplyr)
library(tidyr)
fastcars %>% gather(car, col, -day) %>%
spread(col, car) %>%
mutate_each(funs(+!is.na(.)), -day) %>%
left_join(fastcars, ., by = "day")
day car1 car2 car3 black blue green red silver white
1 day1 red silver blue 0 1 0 1 1 0
2 day2 blue red green 0 1 1 1 0 0
3 day3 blue white green 0 1 1 0 0 1
4 day4 green black red 1 0 1 1 0 0
5 day5 black red silver 1 0 0 1 1 0
Answer 3 (score: 1)
Please give us dput output.
Using reshape2 (thanks to Richard Scriven; I am working with his data structure):
library(reshape2)
dd2<-melt(df,id.vars="day")
dd3<-dcast(data=dd2, day ~ value, value.var="value", length)
merge(df, dd3, by="day")
day car1 car2 car3 black blue green red silver white
1 day1 red silver blue 0 1 0 1 1 0
2 day2 blue red green 0 1 1 1 0 0
3 day3 blue white green 0 1 1 0 0 1
4 day4 green black red 1 0 1 1 0 0
5 day5 black red silver 1 0 0 1 1 0
Answer 4 (score: 1)
There are already lots of great answers. This one is a bit cryptic, but it is in base R.
Answer 5 (score: 1)
We can also do this using table. We unlist the 'car' columns, cbind them with the first column, get the table, convert it to a data.frame, and cbind it with the original dataset.
cbind(fastcars,as.data.frame.matrix(table(cbind(fastcars[1],
cars=unlist(fastcars[2:4])))))
# day car1 car2 car3 black blue green red silver white
#1 day1 red silver blue 0 1 0 1 1 0
#2 day2 blue red green 0 1 1 1 0 0
#3 day3 blue white green 0 1 1 0 0 1
#4 day4 green black red 1 0 1 1 0 0
#5 day5 black red silver 1 0 0 1 1 0
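Note that table counts occurrences, so a colour that appears twice on the same day would give 2 rather than 1. If strict 0/1 flags are wanted, the counts can be clamped. The sketch below assumes that requirement; the small data frame (with red repeated on day1) and the names long, counts, and flags are made up for illustration. It also relies on the days already being in sorted order, since table sorts its rows.

```r
# Made-up data: red appears twice on day1 (not the question's data)
fastcars <- data.frame(
  day  = c("day1", "day2"),
  car1 = c("red", "blue"),
  car2 = c("red", "white"),
  car3 = c("blue", "green"),
  stringsAsFactors = FALSE
)

# long form: one row per (day, colour) pair
long <- data.frame(day  = rep(fastcars$day, times = 3),
                   cars = unlist(fastcars[2:4], use.names = FALSE))

# day x colour contingency table as a data.frame of counts
counts <- as.data.frame.matrix(table(long))

# clamp counts to 0/1 flags
flags <- +(counts > 0)

result <- cbind(fastcars, flags)
result
```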
Answer 6 (score: 0)
I golfed this one:
fastcars <- cbind(fastcars,
do.call(pmax, lapply(fastcars[-1], outer, setNames(nm=unique(unlist(fastcars[-1]))), `==`))
)
For each of car1, car2, car3, make a dummy matrix using outer, then pmax combines them all together.
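For readers who find the one-liner hard to follow, here is an unrolled sketch of the same outer and pmax idea. It is an illustration under assumptions, not the answerer's code: the three-row data frame is a shortened copy of the example, and the names colours, per_column, and dummies are hypothetical.

```r
# Shortened copy of the example data (assumption: character columns)
fastcars <- data.frame(
  day  = paste0("day", 1:3),
  car1 = c("red", "blue", "blue"),
  car2 = c("silver", "red", "white"),
  car3 = c("blue", "green", "green"),
  stringsAsFactors = FALSE
)

# unique colours, named by themselves so outer() labels the columns
colours <- unique(unlist(fastcars[-1]))

# one logical matrix per car column: rows are days, columns are colours
per_column <- lapply(fastcars[-1],
                     function(x) outer(x, setNames(nm = colours), `==`))

# element-wise maximum across the three matrices:
# a cell is 1 if the colour matched in any car column
dummies <- do.call(pmax, per_column)

result <- cbind(fastcars, dummies)
result
```

The intermediate per_column list makes it easy to inspect each car column's matches before they are combined.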