Add a new column to a data frame by matching another column's values against a vector

Date: 2018-07-25 01:00:28

Tags: r

I have a small data frame:

gene_symbol<-c("DADA","SDAASD","SADDSD","SDADD","ASDAD","XCVXCVX","EQWESDA","DASDADS")
panel<-c("growth","growth","growth","growth","big","big","big","small")
Gene_states22<-data.frame(gene_symbol,panel)

and a vector of colors:

colors<-c("red","green","yellow")

I would like to create a data frame like this:

gene_symbol  panel  color
1        DADA growth    red
2      SDAASD growth    red
3      SADDSD growth    red
4       SDADD growth    red
5       ASDAD    big  green
6     XCVXCVX    big  green
7     EQWESDA    big  green
8     DASDADS  small yellow

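In short, I want to add a new column in which "growth" is matched with "red", "big" with "green", and "small" with "yellow". The problem is that the panel names will not be the same every time; for example, they could be "bob", "sam", "bill", and there can be up to 8 different names (and colors). The number of rows in the data frame will also vary.

3 Answers:

Answer 0: (score: 3)

Name the vector; extracting the colors by name is then straightforward.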

names(colors) = c("growth", "big", "small")
Gene_states22$colors = colors[as.character(Gene_states22$panel)]
Gene_states22
#  gene_symbol  panel colors
#1        DADA growth    red
#2      SDAASD growth    red
#3      SADDSD growth    red
#4       SDADD growth    red
#5       ASDAD    big  green
#6     XCVXCVX    big  green
#7     EQWESDA    big  green
#8     DASDADS  small yellow
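
Since the question says the panel names change from run to run, the names can also be derived from the data instead of being typed by hand. The following is a minimal sketch, assuming the i-th distinct panel (in order of first appearance) should receive the i-th color; `panels` and `color_map` are names introduced here for illustration only.

```r
# Sample data from the question
gene_symbol <- c("DADA","SDAASD","SADDSD","SDADD","ASDAD","XCVXCVX","EQWESDA","DASDADS")
panel <- c("growth","growth","growth","growth","big","big","big","small")
Gene_states22 <- data.frame(gene_symbol, panel, stringsAsFactors = FALSE)
colors <- c("red","green","yellow")

# Distinct panels in order of first appearance: "growth", "big", "small"
panels <- unique(as.character(Gene_states22$panel))

# Assumption: the i-th distinct panel gets the i-th color,
# so there must be at least as many colors as panels
stopifnot(length(panels) <= length(colors))
color_map <- setNames(colors[seq_along(panels)], panels)

# Look up each row's color by panel name; unname() drops the
# panel names that the lookup would otherwise carry along
Gene_states22$color <- unname(color_map[as.character(Gene_states22$panel)])
```

The `as.character()` call matters when `panel` is a factor (the default for `data.frame()` before R 4.0), because indexing by a factor uses its integer codes rather than its labels.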

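Answer 1: (score: 1)

One approach: set up a second data frame with two columns (panel and color), then merge it into the first. This can be done without manually typing the panel values for the second data frame. For example: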

df1 <- data.frame(
    gene  = c("DADA","SDAASD","SADDSD","SDADD","ASDAD","XCVXCVX","EQWESDA","DASDADS"),
    panel = c("growth","growth","growth","growth","big","big","big","small")
)

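colors <- c("red","green","yellow")

# Build the lookup from the distinct panel values (df1, not df); it has to be
# a data frame with a column named "panel" so that merge() can join on it
df2 <- data.frame(panel = unique(df1$panel), colors = colors)

# Note: merge() sorts the result by the "panel" key by default
result <- merge(df1, df2, by="panel")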

Make sure (or write some extra code to check) that you have the right number of colors for the number of unique panel values.
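
One way such a check might look is sketched below. The helper `make_panel_lookup` is hypothetical (it is not part of base R or of the answer above), and it assumes that surplus colors are simply ignored while too few colors is an error.

```r
# Hypothetical helper: builds the panel/color lookup table and
# stops with an error if there are fewer colors than distinct panels
make_panel_lookup <- function(panel, colors) {
  panels <- unique(as.character(panel))
  if (length(colors) < length(panels)) {
    stop("need at least ", length(panels), " colors, got ", length(colors))
  }
  # Assumption: extra colors beyond the number of panels are ignored
  data.frame(panel = panels,
             color = colors[seq_along(panels)],
             stringsAsFactors = FALSE)
}

df1 <- data.frame(
    gene  = c("DADA","SDAASD","SADDSD","SDADD","ASDAD","XCVXCVX","EQWESDA","DASDADS"),
    panel = c("growth","growth","growth","growth","big","big","big","small"),
    stringsAsFactors = FALSE
)

df2 <- make_panel_lookup(df1$panel, c("red","green","yellow"))

# merge() joins on "panel" and, by default, sorts the rows by that key
result <- merge(df1, df2, by = "panel")
```

Calling `make_panel_lookup(df1$panel, c("red","green"))` would stop with an error instead of silently recycling or dropping colors.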

Answer 2: (score: 1)

Just create a named vector that maps panels to colors.
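
A minimal sketch of what such a named vector could look like, using the panel/color pairs from the question. The name `panel_colors` and the final check for unmapped panels are illustrative assumptions, not part of the original answer.

```r
Gene_states22 <- data.frame(
  gene_symbol = c("DADA","SDAASD","SADDSD","SDADD","ASDAD","XCVXCVX","EQWESDA","DASDADS"),
  panel = c("growth","growth","growth","growth","big","big","big","small"),
  stringsAsFactors = FALSE
)

# Explicit panel -> color mapping (pairs taken from the question)
panel_colors <- c(growth = "red", big = "green", small = "yellow")

# Index the named vector by panel name to get one color per row
Gene_states22$color <- unname(panel_colors[as.character(Gene_states22$panel)])

# A panel that is missing from the mapping would produce NA; fail early if so
stopifnot(!anyNA(Gene_states22$color))
```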
