I'm stuck. I think separate and gather should be enough for this task, but I may be missing something. I have tried:
done <- gather(diseases, Patientdays, Seperations, c(1, 3))
done <- separate(fixdiseases, "Separations_Y2016-17", into = c("Y2016-17", "Separations"), sep = "_")
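This is just to give an idea of what I've been trying. I stopped here because if I do the same for the remaining columns, it looks like I won't end up with the correct data. I hope this is within the etiquette here, but I have uploaded the csv to this link: http://www.filedropper.com/diseases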
Answer 0 (score: 1)
I believe this does the job:
library(dplyr)
library(reshape2)
# read .csv
diseases <- read.csv('diseases.csv')
# melt the dataframe
diseases_melted <- diseases %>% melt(id.vars = "Diseases")
diseases_melted$variable %>%
  as.character() %>%
  strsplit('_') %>% # split the year from the variable name
  do.call(rbind, .) %>% # bind the pieces into a two-column matrix
  `colnames<-`(c('Variable_name', 'Year')) %>% # set the names here for easier access
  cbind(diseases_melted) %>% # add the new columns to the melted dataframe
  dcast(Diseases + Year ~ Variable_name, # spread the variables again
        value.var = 'value')
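Since the question was attempting this with tidyr, the same reshape can probably be done there in one step. The following is only a minimal sketch, assuming tidyr 1.0.0 or later (for `pivot_longer()`); `toy` is a hypothetical two-row subset of the data and `toy_long` is a name made up for illustration.

```r
library(tidyr)

# Hypothetical two-row subset using the same column naming scheme as the data
toy <- data.frame(
  Diseases = c("2 Neoplasms (C00-D48)", "Not reported"),
  Patientdays_Y2015.16 = c("2,223,563", "50,407"),
  Separations_Y2015.16 = c("666,594", "1,821"),
  Patientdays_Y2016.17 = c("2,235,045", "15,540"),
  Separations_Y2016.17 = c("684,075", "404"),
  stringsAsFactors = FALSE
)

# ".value" means the part of each name before "_" becomes an output column
# (Patientdays, Separations); the part after "_" goes into a "Year" column.
toy_long <- pivot_longer(
  toy,
  cols = -Diseases,
  names_to = c(".value", "Year"),
  names_sep = "_"
)

print(toy_long)
```

This gives one row per disease and year, with Patientdays and Separations as separate columns, which should match the shape produced by the melt/dcast pipeline above.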
Data
For those interested, here is the data:
diseases <- structure(list(Diseases = c("1 Certain infectious and parasitic diseases (A00-B99)",
"2 Neoplasms (C00-D48)", "3 Diseases of the blood and blood−forming organs and certain disorders involving the immune mechanism (D50-D89)",
"4 Endocrine, nutritional and metabolic diseases (E00-E89)",
"5 Mental and behavioural disorders (F00-F99)", "6 Diseases of the nervous system (G00-G99)",
"7 Diseases of the eye and adnexa (H00-H59)", "8 Diseases of the ear and mastoid process (H60-H95)",
"9 Diseases of the circulatory system (I00-I99)", "10 Diseases of the respiratory system (J00-J99)",
"11 Diseases of the digestive system (K00-K93)", "12 Diseases of the skin and subcutaneous tissue (L00-L99)",
"13 Diseases of the musculoskeletal system and connective tissue (M00-M99)",
"14 Diseases of the genitourinary system (N00-N99)", "15 Pregnancy, childbirth and the puerperium (O00-O99)",
"16 Certain conditions originating in the perinatal period (P00-P96)",
"17 Congenital malformations, deformations and chromosomal abnormalities (Q00-Q99)",
"18 Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified (R00-R99)",
"19 Injury, poisoning and certain other consequences of external causes (S00-T98)",
"21 Factors influencing health status and contact with health services (Z00-Z99)",
"Not reported"), Patientdays_Y2015.16 = c("694,007", "2,223,563",
"317,085", "582,936", "3,778,574", "884,703", "423,577", "99,880",
"2,611,423", "1,700,645", "2,136,743", "597,145", "2,369,828",
"1,062,051", "1,304,805", "581,789", "125,345", "1,603,775",
"3,175,895", "3,522,214", "50,407"), Separations_Y2015.16 = c("170,095",
"666,594", "175,590", "169,247", "429,244", "322,843", "397,342",
"67,185", "556,638", "467,780", "1,042,625", "173,374", "763,336",
"490,394", "498,823", "69,601", "39,771", "841,423", "747,792",
"2,508,250", "1,821"), Patientdays_Y2016.17 = c("771,770", "2,235,045",
"335,699", "612,602", "4,465,669", "868,598", "437,673", "106,969",
"2,663,249", "1,788,798", "2,162,150", "618,352", "2,402,038",
"1,052,440", "1,286,556", "573,388", "126,279", "1,694,416",
"3,249,710", "3,524,083", "15,540"), Separations_Y2016.17 = c("186,034",
"684,075", "190,568", "184,092", "456,027", "330,698", "410,184",
"71,962", "576,516", "498,853", "1,059,981", "182,114", "773,279",
"498,635", "499,408", "70,254", "40,014", "903,760", "782,964",
"2,613,993", "404")), class = "data.frame", row.names = c(NA,
-21L))
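Note that the value columns in this dump are character strings with thousands separators, so they will likely need converting before any arithmetic or plotting. A minimal sketch in base R follows; `toy` is a hypothetical two-row subset and `to_number` is a helper name made up for illustration.

```r
# Hypothetical two-row subset with the same comma-formatted strings
toy <- data.frame(
  Diseases = c("2 Neoplasms (C00-D48)", "Not reported"),
  Patientdays_Y2015.16 = c("2,223,563", "50,407"),
  Separations_Y2015.16 = c("666,594", "1,821"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip the thousands separators, then convert to numeric
to_number <- function(x) as.numeric(gsub(",", "", x, fixed = TRUE))

# Convert every column except the disease label
value_cols <- setdiff(names(toy), "Diseases")
toy[value_cols] <- lapply(toy[value_cols], to_number)

str(toy)
```

The same conversion could be applied either before reshaping (on the wide columns, as here) or afterwards on the Patientdays and Separations columns.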