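Extract and plot conditional information from a CSV in R

Time: 2018-10-17 12:32:45

Tags: r ggplot2 histogram

Input: I have a CSV file with a lot of information about the population of Barcelona. I want to show total population by district in a histogram or plot. The district field is repeated x times...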


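My second column does not need to appear in the histogram or plot; I only want the district number on the x-axis and the total population of that district on the y-axis.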

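```{r}
# ggplot2: declarative graphics library used for the plot.
library(ggplot2)

# Path to the CSV file.
v_file <- "../../dataset.csv"
data <- read.csv(file = v_file, sep = ",", header = TRUE)

# Keep rows 2 to 73 and the first three columns (Dte., Barris, TOTAL).
population <- data[2:73, 1:3]
population

# Dump the data frame in a reproducible form.
dput(population)
```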

And dput shows me:

structure(list(Dte. = structure(c(1L, 1L, 1L, 1L, 3L, 3L, 3L, 
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 
6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L), .Label = c("1", "10", "2", "3", "4", "5", "6", "7", 
"8", "9", "BARCELONA"), class = "factor"), Barris = structure(c(2L, 
13L, 24L, 35L, 46L, 57L, 68L, 73L, 74L, 3L, 4L, 5L, 6L, 7L, 8L, 
9L, 10L, 11L, 12L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 
23L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 36L, 37L, 
38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 47L, 48L, 49L, 50L, 51L, 
52L, 53L, 54L, 55L, 56L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 
66L, 67L, 69L, 70L, 71L), .Label = c("", "1. el Raval", "10. Sant Antoni", 
"11. el Poble Sec - AEI Parc Montjuïc", "12. la Marina del Prat Vermell - AEI Zona Franca", 
"13. la Marina de Port", "14. la Font de la Guatlla", "15. Hostafrancs", 
"16. la Bordeta", "17. Sants - Badal", "18. Sants", "19. les Corts", 
"2. el Barri Gòtic", "20. la Maternitat i Sant Ramon", "21. Pedralbes", 
"22. Vallvidrera, el Tibidabo i les Planes", "23. Sarrià", "24. les Tres Torres", 
"25. Sant Gervasi - la Bonanova", "26. Sant Gervasi - Galvany", 
"27. el Putxet i el Farró", "28. Vallcarca i els Penitents", 
"29. el Coll", "3. la Barceloneta", "30. la Salut", "31. la Vila de Gràcia", 
"32. el Camp d'en Grassot i Gràcia Nova", "33. el Baix Guinardó", 
"34. Can Baró", "35. el Guinardó", "36. la Font d'en Fargues", 
"37. el Carmel", "38. la Teixonera", "39. Sant Genís dels Agudells", 
"4. Sant Pere, Santa Caterina i la Ribera", "40. Montbau", "41. la Vall d'Hebron", 
"42. la Clota", "43. Horta", "44. Vilapicina i la Torre Llobeta", 
"45. Porta", "46. el Turó de la Peira", "47. Can Peguera", "48. la Guineueta", 
"49. Canyelles", "5. el Fort Pienc", "50. les Roquetes", "51. Verdun", 
"52. la Prosperitat", "53. la Trinitat Nova", "54. Torre Baró", 
"55. Ciutat Meridiana", "56. Vallbona", "57. la Trinitat Vella", 
"58. Baró de Viver", "59. el Bon Pastor", "6. la Sagrada Família", 
"60. Sant Andreu", "61. la Sagrera", "62. el Congrés i els Indians", 
"63. Navas", "64. el Camp de l'Arpa del Clot", "65. el Clot", 
"66. el Parc i la Llacuna del Poblenou", "67. la Vila Olímpica del Poblenou", 
"68. el Poblenou", "69. Diagonal Mar i el Front Marítim del Poblenou", 
"7. la Dreta de l'Eixample", "70. el Besòs i el Maresme", "71. Provençals del Poblenou", 
"72. Sant Martí de Provençals", "73. la Verneda i la Pau", 
"8. l'Antiga Esquerra de l'Eixample", "9. la Nova Esquerra de l'Eixample"
), class = "factor"), TOTAL = c(47986L, 16240L, 15101L, 22923L, 
32048L, 51651L, 44246L, 42512L, 58315L, 38412L, 40358L, 1151L, 
30622L, 10422L, 15949L, 18561L, 24047L, 41244L, 46104L, 23980L, 
12117L, 4689L, 25106L, 16660L, 25909L, 47753L, 29617L, 15615L, 
7428L, 13207L, 50885L, 34431L, 25734L, 9020L, 36538L, 9390L, 
31583L, 11634L, 6971L, 5171L, 5792L, 611L, 26743L, 25618L, 25046L, 
15506L, 2233L, 15247L, 6863L, 15648L, 12368L, 26398L, 7271L, 
2859L, 10369L, 1379L, 10006L, 2539L, 12582L, 57223L, 29031L, 
14141L, 22171L, 38371L, 27089L, 15204L, 9404L, 33931L, 13710L, 
22893L, 20649L, 26187L)), .Names = c("Dte.", "Barris", "TOTAL"
), row.names = 2:73, class = "data.frame")

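Output: a histogram made with the ggplot2 library that shows the district number on the x-axis and, on the y-axis, the sum of the TOTAL field for that district.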


2 answers:

Answer 0: (score: 2)

Like this?

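```{r}
library(tidyverse)

# df is assumed to be the data frame from the dput() output in the question
# (called `population` there).
df %>%
  group_by(Dte.) %>%
  summarise(total = sum(TOTAL)) %>%
  # Dte. is a factor; convert via character so districts sort numerically.
  mutate(Dte. = as.numeric(as.character(Dte.))) %>%
  arrange(Dte.) %>%
  ggplot(aes(x = as.factor(Dte.), y = total)) +
  geom_col() +
  labs(x = "Dte.", y = "TOTAL")
```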
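
A note on the `as.numeric(as.character(Dte.))` step above: in the question's `dput()` output, `Dte.` is a factor whose levels are ordered as text ("1", "10", "2", ...), so calling `as.numeric()` on it directly would return the internal level codes, not the district numbers. A minimal sketch with a made-up toy factor (`dte` is a hypothetical name, not part of the original data):

```r
# Hypothetical toy factor mimicking the Dte. column; not the real data.
dte <- factor(c("1", "10", "2", "3"))

levels(dte)                     # levels sort as text: "1" "10" "2" "3"
as.numeric(dte)                 # internal level codes: 1 2 3 4
as.numeric(as.character(dte))   # the actual district numbers: 1 10 2 3
```

Going through `as.character()` first is what keeps district 10 from being treated as the second district.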


Answer 1: (score: 1)

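```{r}
library(tidyverse)

# data is assumed to hold the data frame from the dput() output in the question.
data %>%
  group_by(Dte.) %>%
  summarise(Population = sum(TOTAL)) %>%
  ggplot(aes(x = Dte., y = Population)) +
  # stat = "identity" uses the summed values as bar heights.
  geom_bar(stat = "identity")
```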
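
As written, this plots the bars in factor-level order, which for the `dput()` data in the question is text order (1, 10, 2, ...). One possible way to get numeric order, sketched here with base R's `aggregate()` instead of dplyr and a small made-up data frame (`toy`, `per_district` and the numbers in them are hypothetical, not taken from the CSV):

```r
library(ggplot2)

# Hypothetical stand-in for the question's data (made-up totals).
toy <- data.frame(
  Dte.  = factor(c("1", "1", "2", "10", "10")),
  TOTAL = c(100L, 50L, 30L, 20L, 5L)
)

# Sum TOTAL within each district (base R counterpart of group_by + summarise).
per_district <- aggregate(TOTAL ~ Dte., data = toy, FUN = sum)

# Re-level the factor so districts sort numerically rather than as text.
num <- as.numeric(as.character(per_district$Dte.))
per_district$Dte. <- factor(num, levels = sort(num))

# One bar per district, bar height = summed population.
p <- ggplot(per_district, aes(x = Dte., y = TOTAL)) +
  geom_col() +
  labs(x = "District", y = "Total population")
print(p)
```

`geom_col()` here does the same job as `geom_bar(stat = "identity")` in the answer; the only real difference in this sketch is the explicit level order.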