I have a small data frame:
gene_symbol<-c("DADA","SDAASD","SADDSD","SDADD","ASDAD","XCVXCVX","EQWESDA","DASDADS")
panel<-c("growth","growth","growth","growth","big","big","big","small")
Gene_states22<-data.frame(gene_symbol,panel)
and a vector of colors:
colors<-c("red","green","yellow")
I would like to create a data frame like this:
gene_symbol panel color
1 DADA growth red
2 SDAASD growth red
3 SADDSD growth red
4 SDADD growth red
5 ASDAD big green
6 XCVXCVX big green
7 EQWESDA big green
8 DASDADS small yellow
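In short, I want to add a new column in which "growth" is matched with "red", "big" with "green", and "small" with "yellow". The problem is that the panel names will not be the same every time. They could be, for example, "bob", "sam", "bill", and there may be up to 8 different names (and colors). The rows of the data frame will differ as well.
Answer 0 (score: 3)
Give the vector names; extracting the colors by name is then straightforward.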
names(colors) = c("growth", "big", "small")
Gene_states22$colors = colors[as.character(Gene_states22$panel)]
Gene_states22
# gene_symbol panel colors
#1 DADA growth red
#2 SDAASD growth red
#3 SADDSD growth red
#4 SDADD growth red
#5 ASDAD big green
#6 XCVXCVX big green
#7 EQWESDA big green
#8 DASDADS small yellow
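Since the question says the panel names change from run to run, the names do not have to be typed by hand. The following is a minimal sketch, assuming colors should be assigned in the order the panels first appear; `palette_colors`, `panels`, and `color_map` are illustrative names that do not come from the original post.

```r
gene_symbol <- c("DADA","SDAASD","SADDSD","SDADD","ASDAD","XCVXCVX","EQWESDA","DASDADS")
panel <- c("growth","growth","growth","growth","big","big","big","small")
Gene_states22 <- data.frame(gene_symbol, panel)

# Assumption: the palette is given in the order panels should receive colors
palette_colors <- c("red","green","yellow")

# Panel names in order of first appearance (as.character() guards against factors)
panels <- unique(as.character(Gene_states22$panel))

# Name the first length(panels) colors after the panels
color_map <- setNames(palette_colors[seq_along(panels)], panels)

# Look up each row's color by panel name; unname() drops the lookup names
Gene_states22$color <- unname(color_map[as.character(Gene_states22$panel)])
```

If there are more panels than colors, the surplus panels get `NA` in this sketch, so a length check beforehand is advisable.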
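Answer 1 (score: 1)
One approach: set up a second data frame with two columns (panel and color), then merge it into the first. This can be done without manually typing the panel values for the second data frame. For example: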
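df1 <- data.frame(
  gene = c("DADA","SDAASD","SADDSD","SDADD","ASDAD","XCVXCVX","EQWESDA","DASDADS"),
  panel = c("growth","growth","growth","growth","big","big","big","small")
)
colors <- c("red","green","yellow")
# Build the lookup table as a data frame so it has a "panel" column to merge on
df2 <- data.frame(panel = unique(df1$panel), colors = colors)
# Note: merge() sorts the result by the "panel" key by default
result <- merge(df1, df2, by = "panel")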
Make sure (or write more code to check) that the number of colors matches the number of unique panel values.
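Such a check could look like the sketch below. `check_colors` is a hypothetical helper, not part of the original answer, and it assumes exactly one color per unique panel is required.

```r
# Hypothetical helper: stop with an error unless there is one color per unique panel
check_colors <- function(panel, colors) {
  n_panels <- length(unique(as.character(panel)))
  if (length(colors) != n_panels) {
    stop(sprintf("Expected %d colors (one per unique panel), got %d.",
                 n_panels, length(colors)))
  }
  invisible(TRUE)
}

panel <- c("growth","growth","growth","growth","big","big","big","small")
check_colors(panel, c("red","green","yellow"))  # passes silently
```

Calling it before building the lookup table turns a mismatch into a clear error rather than a confusing failure later on.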
Answer 2 (score: 1)
Just create a named vector that maps panels to colors.
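A minimal sketch of such a mapping, assuming the panel names are known up front; `panel_colors` and the extra `"other"` panel are illustrative and not taken from the original post.

```r
# Named vector: the names are panels, the values are colors
panel_colors <- c(growth = "red", big = "green", small = "yellow")

# "other" is deliberately not in the mapping
panel <- c("growth", "big", "small", "other")

# Index by name; panels without a mapping come back as NA
mapped <- unname(panel_colors[panel])
```

Because unmapped panels yield `NA` rather than an error, checking the result with `anyNA(mapped)` is a cheap way to spot panels that were left out of the mapping.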