Simplifying extraction from a list of data frames

Time: 2015-05-01 17:05:02

Tags: r dataframe lapply

My question is a follow-up to the question below. (I cannot comment on that thread because of reputation limits.)

Print the Nth Row in a List of Data Frames

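I want the result printed as a data frame rather than a list (assuming I have multiple columns instead of the single column in that example). Can someone tell me what I need to do to get this output?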

Sample input list

    $AK
                    HospitalName State HeartAttack HeartFailure Pneumonia
    99  PROVIDENCE ALASKA MEDICAL CENTER    AK        13.4         12.4      10.5
    103         ALASKA REGIONAL HOSPITAL    AK        14.5         13.4      12.5
    102      FAIRBANKS MEMORIAL HOSPITAL    AK        15.5         15.6      13.4
    106     ALASKA NATIVE MEDICAL CENTER    AK        15.7         11.6      15.5
    100   MAT-SU REGIONAL MEDICAL CENTER    AK        17.7         11.4      12.1

    $AL
                                  HospitalName State HeartAttack HeartFailure Pneumonia
    78                        CRESTWOOD MEDICAL CENTER    AL        13.3         13.8      10.4
    85                     BAPTIST MEDICAL CENTER EAST    AL        14.2          9.6      10.2
    1                 SOUTHEAST ALABAMA MEDICAL CENTER    AL        14.3         11.4      10.9
    31                              GEORGIANA HOSPITAL    AL        14.5         10.8      11.3
    65                     PRATTVILLE BAPTIST HOSPITAL    AL        14.6         14.8      14.2
    60                                 THOMAS HOSPITAL    AL        14.7         12.8      13.1
    71           VAUGHAN REG MED CENTER PARKWAY CAMPUS    AL        14.7         12.0      14.0

Sample expected output (assuming num = 4, i.e. extract the 4th row of each data frame)

                        HospitalName State HeartAttack HeartFailure Pneumonia
    106     ALASKA NATIVE MEDICAL CENTER    AK        15.7         11.6      15.5
    65                     PRATTVILLE BAPTIST HOSPITAL    AL        14.6         14.8      14.2

The lapply code I am using is printtab <- lapply(finaltab, '[', num, , drop = FALSE)

finaltab is a list of data frames, each with 5 columns; num selects one specific row from each data frame; printtab is the output list.
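To make the setup concrete, here is a minimal sketch with made-up data (the two small data frames and their values are invented stand-ins for finaltab, not the real hospital data). It shows that the call returns a list of one-row data frames, which is why the result prints as a list.

```r
# Toy stand-in for finaltab; names and values are invented for illustration.
finaltab <- list(
  AK = data.frame(HospitalName = c("A1", "A2", "A3"),
                  HeartAttack  = c(13.4, 14.5, 15.5),
                  stringsAsFactors = FALSE),
  AL = data.frame(HospitalName = c("B1", "B2", "B3"),
                  HeartAttack  = c(13.3, 14.2, 14.3),
                  stringsAsFactors = FALSE)
)
num <- 2

# '[' is applied to each element as x[num, , drop = FALSE];
# the empty argument between the commas selects all columns.
printtab <- lapply(finaltab, '[', num, , drop = FALSE)

class(printtab)     # a plain list, one element per state
class(printtab$AK)  # each element is still a data frame
```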

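Things I have tried:

  1. Adding simplify = TRUE to lapply: it throws an error
  2. Using sapply as printtab <- sapply(finaltab, '[', num, , drop = FALSE): it says the argument is missing, with no default. I also tried it without drop.
  3. Using as.data.frame(): it did something I do not understand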

    AK.HospitalName AK.State AK.HeartAttack AK.HeartFailure AK.Pneumonia                AL.HospitalName AL.State
    NA            <NA>     <NA>           <NA>            <NA>         <NA> D W MCMILLAN MEMORIAL HOSPITAL       AL
       AL.HeartAttack AL.HeartFailure AL.Pneumonia                   AR.HospitalName AR.State AR.HeartAttack AR.HeartFailure
    NA           15.7            14.8         12.6 ARKANSAS METHODIST MEDICAL CENTER       AR           17.1            14.4
       AR.Pneumonia                     AZ.HospitalName AZ.State AZ.HeartAttack AZ.HeartFailure AZ.Pneumonia
    NA         11.7 JOHN C LINCOLN DEER VALLEY HOSPITAL       AZ           14.9            11.9         10.0
             CA.HospitalName CA.State CA.HeartAttack CA.HeartFailure CA.Pneumonia          CO.HospitalName CO.State
    NA SHERMAN OAKS HOSPITAL       CA           13.3             9.7          9.3 SKY RIDGE MEDICAL CENTER       CO
       CO.HeartAttack CO.HeartFailure CO.Pneumonia         CT.HospitalName CT.State CT.HeartAttack CT.HeartFailure CT.Pneumonia
    NA           15.0             9.9         10.5 MIDSTATE MEDICAL CENTER       CT           15.6            12.1         11.4
       DC.HospitalName DC.State DC.HeartAttack DC.HeartFailure DC.Pneumonia DE.HospitalName DE.State DE.HeartAttack
    NA            <NA>     <NA>           <NA>            <NA>         <NA>            <NA>     <NA>           <NA>
       DE.HeartFailure DE.Pneumonia                FL.HospitalName FL.State FL.HeartAttack FL.HeartFailure FL.Pneumonia
    
  4. Edit:

    Sample output of dput(head(finaltab))

        structure(list(AK = structure(list(HospitalName = c("PROVIDENCE ALASKA  MEDICAL CENTER", 
    "ALASKA REGIONAL HOSPITAL", "FAIRBANKS MEMORIAL HOSPITAL", "ALASKA NATIVE MEDICAL CENTER", 
    "MAT-SU REGIONAL MEDICAL CENTER"), State = c("AK", "AK", "AK", 
    "AK", "AK"), HeartAttack = c("13.4", "14.5", "15.5", "15.7", 
    "17.7"), HeartFailure = c("12.4", "13.4", "15.6", "11.6", "11.4"
    ), Pneumonia = c("10.5", "12.5", "13.4", "15.5", "12.1")), .Names = c("HospitalName", 
    "State", "HeartAttack", "HeartFailure", "Pneumonia"), row.names = c(99L, 
    103L, 102L, 106L, 100L), class = "data.frame"), AL = structure(list(
        HospitalName = c("CRESTWOOD MEDICAL CENTER", "BAPTIST MEDICAL CENTER EAST", 
        "SOUTHEAST ALABAMA MEDICAL CENTER", "GEORGIANA HOSPITAL", 
    

    Output of rbind on the lapply result

    AK     AL     AR     AZ     CA     CO     CT     DC     DE     FL     GA     GU     HI     IA     ID     IL     IN    
    [1,] List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5
         KS     KY     LA     MA     MD     ME     MI     MN     MO     MS     MT     NC     ND     NE     NH     NJ     NM    
    [1,] List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5
         NV     NY     OH     OK     OR     PA     PR     RI     SC     SD     TN     TX     UT     VA     VI     VT     WA    
    [1,] List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5 List,5
         WI     WV     WY    
    [1,] List,5 List,5 List,5
    

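1 answer:

Answer 0: (score: 0)

df.meeshu holds the data from your dput output. Your code has no information about how the CSV file you have was turned into that dput output, so I simply used the dput. With your own data, use finaltab in place of df.meeshu.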

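    # Take the 4th row of every data frame in the list
    df.list.select <- lapply(df.meeshu, function(x) x[4, ])
    # Stack the one-row data frames into a single data frame
    df.select <- do.call("rbind", df.list.select)
    head(df.select)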
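A minimal sketch of why do.call() matters here, using made-up data (the `rows` list is a hypothetical stand-in for the one-row data frames produced by lapply). Calling rbind() on the list itself passes one list argument and yields a one-row matrix whose cells are lists, which appears consistent with the List,5 output shown in the question. do.call() instead passes each data frame as a separate argument, so the data frame method of rbind stacks them.

```r
# Hypothetical one-row data frames, as lapply(..., function(x) x[4, ]) would return.
rows <- list(
  AK = data.frame(HospitalName = "ALASKA NATIVE MEDICAL CENTER",
                  State = "AK", stringsAsFactors = FALSE),
  AL = data.frame(HospitalName = "PRATTVILLE BAPTIST HOSPITAL",
                  State = "AL", stringsAsFactors = FALSE)
)

# One list argument: the result is a 1 x 2 matrix of list cells, not a data frame.
as_matrix <- rbind(rows)

# Equivalent to rbind(AK = rows$AK, AL = rows$AL): the result is a data frame.
as_df <- do.call(rbind, rows)

dim(as_matrix)
str(as_df)
```

Because the list is named, the row names of the combined data frame come from the list names (the state codes) rather than from the original row numbers.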

You can also use plyr, which may be faster

    library(plyr)
    # rbind.fill() takes the list of data frames directly
    rbind.fill(df.list.select)
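One edge case to keep in mind: if a data frame has fewer than num rows, x[num, ] returns a row of NA values, and that row is carried into the combined result. The sketch below uses invented data and base R only; the `has_row` filter is one possible way to drop such states and is an assumption about what you might want, not part of the answer above.

```r
# Made-up stand-in for finaltab: DC has fewer rows than num.
finaltab <- list(
  AK = data.frame(HospitalName = c("A1", "A2", "A3", "A4"),
                  State = "AK", stringsAsFactors = FALSE),
  DC = data.frame(HospitalName = c("D1", "D2"),
                  State = "DC", stringsAsFactors = FALSE)
)
num <- 4

# Indexing past the last row gives an all-NA row for DC.
all_rows <- do.call(rbind, lapply(finaltab, function(x) x[num, ]))

# Optional: keep only the data frames that actually have a num-th row.
has_row <- vapply(finaltab, function(x) nrow(x) >= num, logical(1))
kept <- do.call(rbind, lapply(finaltab[has_row], function(x) x[num, ]))

all_rows
kept
```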