Reading nested JSON and converting it to tidy data

Time: 2018-02-01 15:33:55

Tags: json r dplyr jsonlite

I am trying to convert JSON into a tibble that follows Tidy Data principles.

The site http://pv.servelelecciones.cl/ has some nice regional data. I can't give a direct link to the Aysén region because the URL doesn't change, so this is the data they display:

|listapacto                          |partido  |votos|porcentaje|electo|
|------------------------------------|---------|-----|----------|------|
|H. SUMEMOS                          |         |365  |4,52%     |      |
|TODOS                               |         |113  |1,40%     |      |
|50. EDUARDO ROMO LAFOY              |IND-TODOS|69   |0,86%     |      |
|51. SARA MARTINEZ MONDELO           |IND-TODOS|44   |0,55%     |      |
|CIUDADANOS                          |         |252  |3,12%     |      |
|52. VICTOR MANUEL BORQUEZ FINCKE    |CIUD.    |53   |0,66%     |      |
|53. MARISOL LUSDEMIA PINILLA VEJAR  |CIUD.    |199  |2,47%     |      |
|K. COALICIÓN REGIONALISTA VERDE     |         |200  |2,48%     |      |
|DEMOCRACIA REGIONAL PATAGONICA      |         |200  |2,48%     |      |
|54. ELSON BORQUEZ YAÑEZ             |DRP      |59   |0,73%     |      |
|55. PEDRO ANTONIO VERGARA ROJAS     |DRP      |42   |0,52%     |      |
|56. JESSICA ANDREA TORRES BORQUEZ   |IND-DRP  |59   |0,73%     |      |
|57. TAMARA ANDREA ESPINOZA GUTIERREZ|DRP      |40   |0,50%     |      |
|N. LA FUERZA DE LA MAYORIA          |         |2.958|36,66%    |      |
|PARTIDO RADICAL SOCIALDEMOCRATA     |         |346  |4,29%     |      |
|58. JORGE CALDERON NUÑEZ            |PRSD     |346  |4,29%     |      |
|PARTIDO SOCIALISTA DE CHILE         |         |1.651|20,46%    |      |
|59. MARISOL MARTINEZ SANCHEZ        |PSCH     |1.651|20,46%    |      |
|PARTIDO COMUNISTA DE CHILE          |         |322  |3,99%     |      |
|60. ROXANA PEY TUMANOFF             |IND-PCCH |322  |3,99%     |      |
|PARTIDO POR LA DEMOCRACIA           |         |639  |7,92%     |      |
|61. RENE OSVALDO ALINCO BUSTOS      |IND-PPD  |639  |7,92%     |*     |
|O. CONVERGENCIA DEMOCRATICA         |         |2.297|28,47%    |      |
|PARTIDO DEMOCRATA CRISTIANO         |         |2.297|28,47%    |      |
|62. MIGUEL ANGEL CALISTO AGUILA     |PDC      |1.882|23,32%    |*     |
|63. CARMEN GLORIA MARTINEZ CARDENAS |PDC      |224  |2,78%     |      |
|64. RENE ANSELMO LEGUE CARDENAS     |PDC      |191  |2,37%     |      |
|P. CHILE VAMOS                      |         |1.963|24,33%    |      |
|UNION DEMOCRATA INDEPENDIENTE       |         |605  |7,50%     |      |
|65. NESTOR MERA MUÑOZ               |UDI      |605  |7,50%     |      |
|PARTIDO REGIONALISTA INDEPENDIENTE  |         |132  |1,64%     |      |
|66. PATRICIO HENRIQUEZ BARRIENTOS   |PRI      |132  |1,64%     |      |
|EVOLUCION POLITICA                  |         |365  |4,52%     |      |
|67. GEOCONDA NAVARRETE ARRATIA      |EVOP.    |365  |4,52%     |      |
|RENOVACION NACIONAL                 |         |861  |10,67%    |      |
|68. ARACELY LEUQUEN URIBE           |RN       |861  |10,67%    |*     |
|CANDIDATURA INDEPENDIENTE           |         |286  |3,54%     |      |
|69. CECILIO AGUILAR GALINDO         |IND      |286  |3,54%     |      |

I would like to read the JSON data that feeds their website, get something like the table above, and from there move to a tidy structure.

If I just try to read the data directly:

library(jsonlite)
library(dplyr)

# Read the JSON file for this commune
x <- fromJSON("http://www.servelelecciones.cl/data/elecciones_diputados/computo/comunas/114501.json")

# Convert the `data` element to a tibble
y <- as_tibble(x$data)

then I get some nested information in the last column.
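One way to confirm which columns came back nested is to check which of them are lists. The following is only a minimal sketch on a small inline JSON string, so it runs without network access. The records and values are made up for illustration, and the field names `a`, `c` and `sd` are assumptions about the layout rather than a description of the real file.

```r
library(jsonlite)

# Hypothetical miniature of the nested layout (invented values)
txt <- '{"data": [
  {"a": "LISTA X", "c": "10",
   "sd": [{"a": "PARTIDO X1", "c": "10"}]},
  {"a": "LISTA Y", "c": "7",
   "sd": [{"a": "PARTIDO Y1", "c": "4"}, {"a": "PARTIDO Y2", "c": "3"}]}
]}'

x <- fromJSON(txt)

# A data frame is a list of columns, so vapply() visits each column;
# nested columns are stored as lists
nested <- vapply(x$data, is.list, logical(1))
nested_cols <- names(nested)[nested]
print(nested_cols)

# Each element of a nested column is itself a small data frame
str(x$data$sd[[2]])
```

Any column reported here needs a further step, such as `tidyr::unnest()`, before the table is flat.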

Of course I tried

Any help is very welcome.

1 answer:

Answer 0: (score: 2)

Besides writing a function, one possibility is to use tidyr's unnest() and apply it as many times as needed.

In my case:

library(jsonlite)
library(dplyr)
library(tidyr)

x <- fromJSON("http://www.servelelecciones.cl/data/elecciones_diputados/computo/comunas/114501.json")

y <- as_tibble(x$data)

# Set the independent candidacy row aside
y1 <- y %>% filter(a == "CANDIDATURA INDEPENDIENTE")
# Expand the nested `sd` column for the remaining rows (tidyr >= 1.0.0 expects `cols`)
y2 <- y %>% filter(a != "CANDIDATURA INDEPENDIENTE") %>% unnest(cols = sd)
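
To illustrate what applying unnest() "as many times as needed" can look like, here is a hedged sketch on a small inline JSON string with two levels of nesting. The data is invented. Only the names `a` and `sd` come from the code above; the `votos` field and the output column names `lista`, `partido` and `candidato` are assumptions chosen to mirror the table in the question. Renaming `a` before each unnest() keeps the outer label from colliding with the inner column of the same name.

```r
library(jsonlite)
library(dplyr)
library(tidyr)

# Hypothetical two-level structure: list -> party -> candidate
txt <- '[
  {"a": "LISTA X", "sd": [
    {"a": "PARTIDO X1", "sd": [{"a": "1. CANDIDATO UNO", "votos": "10"}]}
  ]},
  {"a": "LISTA Y", "sd": [
    {"a": "PARTIDO Y1", "sd": [{"a": "2. CANDIDATO DOS", "votos": "4"},
                               {"a": "3. CANDIDATO TRES", "votos": "3"}]}
  ]}
]'

nested <- as_tibble(fromJSON(txt))

tidy <- nested %>%
  rename(lista = a) %>%      # label the outer level before expanding
  unnest(cols = sd) %>%      # first level: parties
  rename(partido = a) %>%
  unnest(cols = sd) %>%      # second level: candidates
  rename(candidato = a)

print(tidy)
```

The result has one row per candidate, with the list and party repeated on each row. The real file may be nested differently for some rows, which could be why the independent candidacy is handled separately above.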