Sorting a data frame based on multiple columns - sorting problem

Time: 2017-10-06 17:51:58

Tags: r sorting dataframe multiple-columns columnsorting

I have a data frame as follows:

Provider.Number Hospital.Name   State   Mortality
210001  MERITUS MEDICAL CENTER  MD  12.5
210002  UNIVERSITY OF MARYLAND MEDICAL CENTER   MD  12.7
210003  PRINCE GEORGES HOSPITAL CENTER  MD  13
210004  HOLY CROSS HOSPITAL MD  9.6
210005  FREDERICK MEMORIAL HOSPITAL MD  9.8
210006  HARFORD MEMORIAL HOSPITAL   MD  11.5
210007  SAINT JOSEPH MEDICAL CENTER MD  9.5
210008  MERCY MEDICAL CENTER INC    MD  11.2
210009  JOHNS HOPKINS HOSPITAL, THE MD  10.2
210011  SAINT AGNES HOSPITAL    MD  11.1
210012  SINAI HOSPITAL OF BALTIMORE MD  9.7
210013  BON SECOURS HOSPITAL    MD  9.6
210015  MEDSTAR FRANKLIN SQUARE MEDICAL CENTER  MD  9.3
210016  WASHINGTON ADVENTIST HOSPITAL   MD  11
210017  GARRETT COUNTY MEMORIAL HOSPITAL    MD  13.5
210018  MEDSTAR MONTGOMERY MEDICAL CENTER   MD  9.3
210019  PENINSULA REGIONAL MEDICAL CENTER   MD  10.6
210022  SUBURBAN HOSPITAL   MD  9.9
210023  ANNE ARUNDEL MEDICAL CENTER MD  12
210024  MEDSTAR UNION MEMORIAL HOSPITAL MD  11.3
210027  WESTERN MARYLAND REGIONAL MEDICAL CENTER    MD  12.6
210028  MEDSTAR SAINT MARY'S HOSPITAL   MD  13.1
210029  JOHNS HOPKINS BAYVIEW MEDICAL CENTER    MD  10.7
210030  CHESTER RIVER HOSPITAL CENTER   MD  11.2
210032  UNION HOSPITAL OF CECIL COUNTY  MD  9.9
210033  CARROLL HOSPITAL CENTER MD  9.7
210034  MEDSTAR HARBOR HOSPITAL MD  9.2
210035  CIVISTA MEDICAL CENTER  MD  14.2
210037  MEMORIAL HOSPITAL AT EASTON MD  10.6
210038  MARYLAND GENERAL  HOSPITAL  MD  10.8
210039  CALVERT MEMORIAL HOSPITAL   MD  10.1
210040  NORTHWEST HOSPITAL CENTER   MD  12.6
210043  BALTIMORE WASHINGTON  MEDICAL CENTER    MD  12.7
210044  GREATER BALTIMORE MEDICAL CENTER    MD  7.4
210045  EDWARD MCCREADY MEMORIAL HOSPITAL   MD  12.9
210048  HOWARD COUNTY GENERAL HOSPITAL  MD  10.1
210049  UPPER CHESAPEAKE MEDICAL CENTER MD  12.9
210051  DOCTORS'  COMMUNITY HOSPITAL    MD  11
210054  SOUTHERN MARYLAND HOSPITAL CENTER   MD  11.7
210055  LAUREL REGIONAL MEDICAL CENTER  MD  10.6
210056  MEDSTAR GOOD SAMARITAN HOSPITAL MD  8.4
210057  SHADY GROVE ADVENTIST HOSPITAL  MD  11.7
210060  FORT WASHINGTON HOSPITAL    MD  11
210061  ATLANTIC GENERAL HOSPITAL   MD  10.8
21020F  VA MARYLAND HEALTHCARE SYSTEM - BALTIMORE   MD  12.6

I want to sort the data frame by the Mortality column, and then alphabetically by the Hospital column to break ties.

I tried different sorting approaches, such as

sorted <- work[order(work$Mortality,work$Hosptial),]

with(work, order(Mortality, Hosptial))

But the sorted result is wrong. The output I get is:

Hosptial    State   Mortality
CALVERT MEMORIAL HOSPITAL   MD  10.1
HOWARD COUNTY GENERAL HOSPITAL  MD  10.1
JOHNS HOPKINS HOSPITAL, THE MD  10.2
LAUREL REGIONAL MEDICAL CENTER  MD  10.6
MEMORIAL HOSPITAL AT EASTON MD  10.6
PENINSULA REGIONAL MEDICAL CENTER   MD  10.6
JOHNS HOPKINS BAYVIEW MEDICAL CENTER    MD  10.7
ATLANTIC GENERAL HOSPITAL   MD  10.8
MARYLAND GENERAL  HOSPITAL  MD  10.8
DOCTORS'  COMMUNITY HOSPITAL    MD  11
FORT WASHINGTON HOSPITAL    MD  11
WASHINGTON ADVENTIST HOSPITAL   MD  11
SAINT AGNES HOSPITAL    MD  11.1
CHESTER RIVER HOSPITAL CENTER   MD  11.2
MERCY MEDICAL CENTER INC    MD  11.2
MEDSTAR UNION MEMORIAL HOSPITAL MD  11.3
HARFORD MEMORIAL HOSPITAL   MD  11.5
SHADY GROVE ADVENTIST HOSPITAL  MD  11.7
SOUTHERN MARYLAND HOSPITAL CENTER   MD  11.7
ANNE ARUNDEL MEDICAL CENTER MD  12
MERITUS MEDICAL CENTER  MD  12.5
NORTHWEST HOSPITAL CENTER   MD  12.6
VA MARYLAND HEALTHCARE SYSTEM - BALTIMORE   MD  12.6
WESTERN MARYLAND REGIONAL MEDICAL CENTER    MD  12.6
BALTIMORE WASHINGTON  MEDICAL CENTER    MD  12.7
UNIVERSITY OF MARYLAND MEDICAL CENTER   MD  12.7
EDWARD MCCREADY MEMORIAL HOSPITAL   MD  12.9
UPPER CHESAPEAKE MEDICAL CENTER MD  12.9
PRINCE GEORGES HOSPITAL CENTER  MD  13
MEDSTAR SAINT MARY'S HOSPITAL   MD  13.1
GARRETT COUNTY MEMORIAL HOSPITAL    MD  13.5
CIVISTA MEDICAL CENTER  MD  14.2
GREATER BALTIMORE MEDICAL CENTER    MD  7.4
MEDSTAR GOOD SAMARITAN HOSPITAL MD  8.4
MEDSTAR HARBOR HOSPITAL MD  9.2
MEDSTAR FRANKLIN SQUARE MEDICAL CENTER  MD  9.3
MEDSTAR MONTGOMERY MEDICAL CENTER   MD  9.3
SAINT JOSEPH MEDICAL CENTER MD  9.5
BON SECOURS HOSPITAL    MD  9.6
HOLY CROSS HOSPITAL MD  9.6
CARROLL HOSPITAL CENTER MD  9.7
SINAI HOSPITAL OF BALTIMORE MD  9.7
FREDERICK MEMORIAL HOSPITAL MD  9.8
SUBURBAN HOSPITAL   MD  9.9
UNION HOSPITAL OF CECIL COUNTY  MD  9.9

The function takes 2 arguments:

  1. State (check the original CSV data)
  2. Heart attack / heart failure / pneumonia

My full code is:

    best <- function(states,outcomes)  
    {  
    
        #patterns is obtained to use them in the regex function 
        patterns<-paste("^Hospital.*",outcomes, sep="")
         Readcsv<-read.csv("outcome-of-care-measures.csv", check.names = F)
        columnname<-colnames(Readcsv)
        #regex operation going on
        regex1<-grep(patterns,columnname,ignore.case=TRUE, value = T)
        #another regex operation
        Extracted<-grep("Mortality",regex1,ignore.case=TRUE, value = T)
        #extract dataframe based on the state and final extracted column name using the regex function
        dfe<-subset(Readcsv, Readcsv$State == states & Readcsv[[Extracted]]!="Not Available")
         #create a vector
        b<-c("Hospital Name","State", Extracted)
        #extract only those columns seen in the vector
        work<-dfe[,b]
        #change column name
        colnames(work)<-c("Hosptial","State","Mortality")
        # stuck after this point
        Ascorder<-work[with(work, order(Mortality, Hosptial)),]
    
    }
    

I am relatively new to Stack Overflow, so please bear with my formatting issues. I would like to know where I went wrong.

1 Answer:

Answer 0: (score: 1)

You can use dplyr:

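library(dplyr)
# Sort by Mortality, then by hospital name; the column is spelled
# "Hosptial" in the question's code.
work <- work %>%
    arrange(Mortality, Hosptial)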

I can't test it because you didn't provide a reproducible data example, but it should do the trick.
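One thing worth checking, judging from the output in the question: values such as 10.1 are placed before 7.4, which is what a text comparison would produce. That suggests (an assumption, since the CSV is not available here) that Mortality is stored as character or factor, probably because the column also contained "Not Available". A minimal sketch using a made-up three-row sample taken from the question's table, converting the column to numeric before sorting:

```r
# Hypothetical sample: three rows from the question's table, with
# Mortality stored as text, as it might be after read.csv() when the
# column also contains "Not Available".
work <- data.frame(
  Hosptial  = c("HOWARD COUNTY GENERAL HOSPITAL",
                "GREATER BALTIMORE MEDICAL CENTER",
                "CALVERT MEMORIAL HOSPITAL"),
  State     = "MD",
  Mortality = c("10.1", "7.4", "10.1"),
  stringsAsFactors = FALSE
)

# Text comparison: "10.1" sorts before "7.4" because "1" < "7".
text_sorted <- work[order(work$Mortality, work$Hosptial), ]

# Convert to numeric first; as.character() keeps this safe if the
# column happens to be a factor rather than a character vector.
work$Mortality <- as.numeric(as.character(work$Mortality))

# Sort by Mortality, breaking ties alphabetically by hospital name.
num_sorted <- work[order(work$Mortality, work$Hosptial), ]
print(num_sorted)
```

If this is the cause, placing the same conversion before the order() call in best() should give the expected ordering. arrange() would be affected by a text column in the same way, so the conversion is needed there too.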