我有以下提到的数据框:
import $ from 'jquery'
alert()
Json数据示例:
ID Rank Name Json_Data
IR-122 RE AFG {as below sample}
IR-122 UI SSw {as below sample}
IR-123 RF HEr {as below sample}
IR-123 RO djf {as below sample}
IR-124 RE der {as below sample}
IR-125 RF fet {as below sample}
我的模式json通过使用我想提取的上述日期框架携带Month并研究该月的每个日期:
最长月份(以MMM-YY格式)显示在一列中
第1、7、14、21和28日的值
{"Jan-2018":{"10":50000.0,"11":50000.0,"12":15202.0,"13":10089.0,"14":10089.0,"15":9589.0,"16":9589.0,"17":18941.0,"18":15246.75,"19":5053.75,"20":44092.75,"21":36630.75,"22":9334.75,"23":5254.75,"24":4357.25,"25":3357.25,"26":44626.25,"27":49292.25,"28":48292.25,"29":43371.8,"30":38675.8,"31":37988.12},"Mar-2018":{"1":30799.02,"2":20775.42,"3":20657.42,"4":20657.42,"5":12657.42,"6":11110.22,"7":11110.22,"8":11110.22,"9":11111.22,"10":30272.22,"11":30272.22,"12":25316.22,"13":25316.22,"14":25316.22,"15":25316.22,"16":25316.22,"17":25316.22,"18":25316.22,"19":25316.22,"20":25316.22,"21":15316.22,"22":15316.22,"23":15316.22,"24":15316.22,"25":15204.12,"26":14791.12,"27":14791.12,"28":14791.12,"29":14791.12,"30":14791.12,"31":14791.12},"Feb-2018":{"1":36749.12,"2":36483.37,"3":35254.87,"4":27254.87,"5":15880.87,"6":14173.87,"7":7934.87,"8":7091.87,"9":5797.87,"10":5797.87,"11":5797.87,"12":283841.87,"13":283418.87,"14":283418.87,"15":253426.37,"16":242226.37,"17":227226.37,"18":197226.37,"19":147226.37,"20":111799.02,"21":111799.02,"22":66799.02,"23":64799.02,"24":64799.02,"25":63799.02,"26":53799.02,"27":36799.02,"28":36799.02},"Apr-2018":{"1":14791.12,"2":14791.12,"3":14791.12,"4":14791.12,"5":10791.12,"6":10791.12,"7":10791.12,"8":10791.12,"9":10755.72,"10":5799.72,"11":5799.72,"12":5799.72,"13":5799.72,"14":5799.72,"15":5799.72,"16":5799.72,"17":5799.72,"18":5799.72,"19":5728.92,"26":728.92,"27":728.92,"28":728.92,"29":728.92,"30":728.92}}
值将是A_1
在1日的值:Apr-2018
14791.12
的值应为2018年4月7日:A_2
依此类推。
我需要从最长月份(不包括最长月份)起的4个月内提供此服务。
在10791.12
以下是大多数月份的实际前一个月,而A_1
是A_2
的精确前一个月,依此类推,我仅从A_1
到{{ 1}}相同的列将重复A_1
,A_28
和B_2
月。
在A_1中,分析将是A_1月份的第一个日期,在A_1月份的第7天的A_7读数中,其他三个月也是如此。并且这些值必须为group_by C_3
和C_4
。
在我的样本Json数据中,只有4个月,最大月份为2018年4月,因此在这种情况下,A_1为2018年3月,B_2为2018年2月,C_3为2018年1月,D_4为为2017年12月(其中D_1,D_7_D_14,D_21和D_28为ID
)。
在以下模式中,我预计最大月份为5月18日。
样本输出:
Rank
样本输出数据帧:
NA
答案 0 :(得分:3)
json_to_df <- function(data){
json_as_list <- jsonlite::fromJSON(data)
months <- names(json_as_list)
last4months <- tail(months[order(lubridate::myd(paste0(months,"-01")))],4)
max_month <- tail(last4months,1)
other_months <- head(last4months,-1)
other_months_suffixes <- paste0(LETTERS[seq_along(other_months)],"_")
last_month <- tail(other_months,1)
days <- c('1','7','14','21','28')
get_month_list <- function(x) json_as_list[[x]][days]
list_subset <- Map(function(x,y) setNames(get_month_list(x),paste0(y,days)),
rev(other_months), other_months_suffixes)
list_subset <- unlist(list_subset, recursive = FALSE)
names(list_subset) <- gsub("^.*?\\.","",names(list_subset))
list_subset <- map_if(list_subset, is.null,~NA)
only_nas <- setNames(replicate(20,NA,F),paste(sep="_",rep(LETTERS[1:4],each=5),rep(days,4)))
missing <- names(only_nas)[! names(only_nas) %in% names(list_subset)]
list_subset <- c(list_subset, only_nas[missing])
list_months <- setNames(as.list(other_months),paste0(other_months_suffixes,0))
only_nas2 <- setNames(replicate(4,NA,F),paste(sep="_",LETTERS[1:4],0))
missing2 <- names(only_nas2)[! names(only_nas2) %in% names(list_months)]
list_months <- c(list_months, only_nas2[missing2])
output_list <- c(
Max_Month = max_month,
list_months,
list_subset)
data.frame(output_list)
}
library(jsonlite)
library(lubridate)
library(tidyverse)
df %>%
mutate(Json_Data = map(Json_Data,json_to_df)) %>%
unnest
# ID Rank Name Max_Month A_0 B_0 C_0 D_0 A_1 A_7 A_14 A_21 A_28 B_1
# 1 IR-122 RE AFG Apr-2018 Jan-2018 Feb-2018 Mar-2018 NA 30799.02 11110.22 25316.22 15316.22 14791.12 36749.12
# B_7 B_14 B_21 B_28 C_1 C_7 C_14 C_21 C_28 D_1 D_7 D_14 D_21 D_28
# 1 7934.87 283418.9 111799 36799.02 NA NA 10089 36630.75 48292.25 NA NA NA NA NA