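I hope you can help me.
I want to plot the number of publications per year, broken down by discipline.
How can I draw this bar chart in ggplot2 without counting duplicated rows?
How can I plot a single value for each ID (x)?
I can't delete the rows, because my data frame has other columns and other plots need the data as it is.
Thank you very much.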
structure(list(x = c(1240L, 1251L, 1214L, 1222L, 1234L, 1235L,
1183L, 1197L, 1198L, 1162L, 1167L, 1169L, 1170L, 1171L, 1176L,
1104L, 1104L, 1113L, 1117L, 1119L, 1119L, 1063L, 1064L, 1065L,
1066L, 1072L, 1081L), year = c(1997L, 1997L, 1998L, 1998L, 1998L,
1998L, 1999L, 1999L, 1999L, 2000L, 2000L, 2000L, 2000L, 2000L,
2000L, 2002L, 2002L, 2002L, 2002L, 2002L, 2002L, 2003L, 2003L,
2003L, 2003L, 2003L, 2003L), discipline = structure(c(11L, 2L,
7L, 2L, 2L, 2L, 7L, 7L, 7L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 2L,
4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("", "Biogeochemistry",
"Conservation", "Ecology", "Environmental sciences (interdisciplines)",
"Geochemical", "Geochemistry", "Geography", "Limnology", "Management",
"Oceanography", "Socioecology"), class = "factor"), es.type = c("no",
"no", "no", "Supporting", "no", "no", "no", "no", "no", "no",
"Regulating", "no", "no", "Supporting", "Supporting", "Supporting",
"Regulating", "Supporting", "Supporting", "Supporting", "Regulating",
"no", "no", "no", "Supporting", "Supporting", "Supporting")), row.names = c(NA,
-27L), class = "data.frame")
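For example, in the plot the Ecology entries for 2002 are counted more than once.
Question 2:
What if I want to remove duplicated data based on two columns? For example: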
ID = c(1,1,1,1,2,2,3,4,5,5,5,5,6)
Year = c(1990, 1990, 1990, 1990, 1994, 1994,1994, 1995,1995, 1995,1995,1995,1996)
Discipline <- c("Ecology","Ecology","Oceanography", "Oceanography","Oceanography","Oceanography","Oceanography","Oceanography","Oceanography",
"Oceanography","Oceanography","Microbiology","Ecology")
df <-data.frame(ID, Year, Discipline)
#Build plot
p<-ggplot(data=df, aes(x=factor(Year), fill = Discipline)) + geom_bar()
p
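In this case, I want ID 1 to contribute two entries to the plot: Ecology and Oceanography. That is, I want to remove disciplines that are repeated within the same ID (df$x in my real data). For ID 1, I want to drop one Ecology row and one Oceanography row. What should I do in this case?
Answer 0 (score: 0)
You are probably looking for something like this: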
#Define data:
df = structure(list(x = c(1240L, 1251L, 1214L, 1222L, 1234L, 1235L,
1183L, 1197L, 1198L, 1162L, 1167L, 1169L,
1170L, 1171L, 1176L,
1104L, 1104L, 1113L, 1117L, 1119L, 1119L, 1063L, 1064L,
1065L,
1066L, 1072L, 1081L),
year = c(1997L, 1997L, 1998L, 1998L, 1998L,
1998L, 1999L, 1999L, 1999L, 2000L, 2000L, 2000L, 2000L, 2000L,
2000L, 2002L, 2002L, 2002L, 2002L, 2002L, 2002L, 2003L, 2003L,
2003L, 2003L, 2003L, 2003L),
discipline = structure(c(11L, 2L, 7L, 2L, 2L, 2L, 7L, 7L, 7L, 2L,
2L, 2L, 2L, 2L, 4L, 4L, 4L, 2L,
4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L),
.Label = c("", "Biogeochemistry",
"Conservation", "Ecology", "Environmental sciences (interdisciplines)",
"Geochemical", "Geochemistry", "Geography", "Limnology", "Management",
"Oceanography", "Socioecology"), class = "factor"),
es.type = c("no", "no", "no", "Supporting", "no", "no", "no", "no", "no", "no", "Regulating", "no", "no", "Supporting", "Supporting", "Supporting",
"Regulating", "Supporting", "Supporting", "Supporting", "Regulating",
"no", "no", "no", "Supporting", "Supporting", "Supporting")),row.names = c(NA,
-27L), class = "data.frame")
#Build plot (only the first row for each x is passed to ggplot):
library(ggplot2)
p <- ggplot(data = df[!duplicated(df$x), ], aes(x = factor(year), fill = discipline)) +
  geom_bar(position = position_dodge())
p
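The key part is df[!duplicated(df$x),], which returns only the rows of df that hold the first occurrence of each value in column x. df itself is not modified, so your other plots can keep using the full data.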
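To see what the subsetting does, here is a minimal sketch on a small stand-in data frame. The name toy and its five rows are made up for illustration (they mirror the 2002 rows of the question's data); only base R is used.

```r
# Hypothetical stand-in for a few rows of the question's data
toy <- data.frame(
  x = c(1104L, 1104L, 1113L, 1119L, 1119L),
  discipline = c("Ecology", "Ecology", "Biogeochemistry", "Ecology", "Ecology")
)

# TRUE marks a row whose x value has already appeared further up
is_dup <- duplicated(toy$x)
is_dup

# Negating the mask keeps the first row for each distinct x
toy_unique <- toy[!is_dup, ]
toy_unique

# The original data frame is untouched, so other plots can still use it
nrow(toy)
nrow(toy_unique)
```

Because the filtered copy only exists inside the ggplot() call in the answer, nothing has to be deleted from the original data frame.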
For your second question, you can do the following:
p<-ggplot(data=df[!duplicated(df[,c("ID", "Discipline")]),], aes(x=factor(Year),
fill = Discipline)) +
geom_bar(position = position_dodge())
p
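In effect, this calls duplicated on just the columns of interest, so a row is dropped only when its combination of ID and Discipline has already appeared.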
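As a rough check of the two-column version, the sketch below rebuilds the example data from question 2 and inspects which rows survive. The names df2, keep and df2_unique are introduced here for illustration only; the call itself is the same duplicated() idiom used in the answer.

```r
# Example data from question 2
ID <- c(1, 1, 1, 1, 2, 2, 3, 4, 5, 5, 5, 5, 6)
Year <- c(1990, 1990, 1990, 1990, 1994, 1994, 1994, 1995, 1995, 1995, 1995, 1995, 1996)
Discipline <- c("Ecology", "Ecology", "Oceanography", "Oceanography",
                "Oceanography", "Oceanography", "Oceanography", "Oceanography",
                "Oceanography", "Oceanography", "Oceanography", "Microbiology",
                "Ecology")
df2 <- data.frame(ID, Year, Discipline)

# duplicated() on a two-column data frame compares whole rows,
# i.e. each (ID, Discipline) pair
keep <- !duplicated(df2[, c("ID", "Discipline")])
df2_unique <- df2[keep, ]
df2_unique

# Counts that geom_bar() would draw for each year and discipline
table(df2_unique$Year, df2_unique$Discipline)
```

For ID 1 this leaves one Ecology row and one Oceanography row, which is the behaviour asked for in the question.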