Capitalize the letter after a period in data frame column names

Asked: 2018-07-31 04:44:32

Tags: r regex

I have an R data frame containing some sports data, with the following column names:

 colnames(my_dataframe)
 [1] "id"                               "firstName"                        "lastName"                        
 [4] "position"                         "jerseyNumber"                     "currentTeam.id"                  
 [7] "currentTeam.abbreviation"         "currentRosterStatus"              "height"                          
[10] "weight"                           "birthDate"                        "age"                             
[13] "birthCity"                        "birthCountry"                     "rookie"                          
[16] "handedness.shoots"                "college"                          "twitter"                         
[19] "currentInjury.description"        "currentInjury.playingProbability" "id"                              
[22] "abbreviation"                     "fg2PtAtt"                         "fg3PtAtt"                        

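Some of the column names contain periods. For those names, I want to remove each period and capitalize the letter that follows it. For example, column 6 here is currentTeam.id, and I want to rename it to currentTeamId. So far I have tried: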

my_dataframe %>% dplyr::rename_all(. %>% gsub('\\.', '', .))

...but this only removes all the periods from the column names; it does not capitalize the letter that follows each period.

2 answers:

Answer 0: (score: 5)

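We can use sub to match a literal . followed by a single character (captured as a group) and then, in the replacement, change the case of the backreference (\\1) with \\U. The \\U case conversion is only available with perl = TRUE.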

sub("[.](.)", "\\U\\1", names(my_dataframe), perl = TRUE)
# [1] "id"                              "firstName"                      
# [3] "lastName"                        "position"                       
# [5] "jerseyNumber"                    "currentTeamId"                  
# [7] "currentTeamAbbreviation"         "currentRosterStatus"            
# [9] "height"                          "weight"                         
#[11] "birthDate"                       "age"                            
#[13] "birthCity"                       "birthCountry"                   
#[15] "rookie"                          "handednessShoots"               
#[17] "college"                         "twitter"                        
#[19] "currentInjuryDescription"        "currentInjuryPlayingProbability"
#[21] "id"                              "abbreviation"                   
#[23] "fg2PtAtt"                        "fg3PtAtt"     
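
Note that sub replaces only the first match in each name, which is enough for the names above because none of them contains more than one period. As a minimal sketch for names that might contain several periods, gsub can be used with the same pattern. The helper name dots_to_camel and the toy data frame below are assumptions for illustration, not part of the original answer.

```r
# Hypothetical helper (not from the original answer): drop every period
# and upper-case the single character that follows it.
dots_to_camel <- function(x) {
  # "[.](.)" matches a literal period plus the next character (group 1).
  # "\\U" upper-cases the rest of the replacement; it requires perl = TRUE.
  gsub("[.](.)", "\\U\\1", x, perl = TRUE)
}

# Made-up data frame; "a.b.c" has two periods to show the difference from sub().
df <- data.frame(currentTeam.id = 1:2, a.b.c = 3:4, height = 5:6)

names(df) <- dots_to_camel(names(df))
names(df)
# [1] "currentTeamId" "aBC"           "height"

# For comparison, sub() stops after the first match:
sub("[.](.)", "\\U\\1", "a.b.c", perl = TRUE)
# [1] "aB.c"
```

A period at the very end of a name is left untouched, since the pattern needs a character after it. In a dplyr pipeline the same helper could be passed to dplyr::rename_with(), assuming dplyr 1.0.0 or later is installed.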

Answer 1: (score: 3)

You may want to look at the janitor package, in particular its clean_names function.

library(janitor)
data.frame(currentTeam.id = 1:5, 
           currentInjury.playingProbability = 6:10) %>% 
  clean_names(case = "lower_camel")

  currentTeamId currentInjuryPlayingProbability
              1                               6
              2                               7
              3                               8
              4                               9
              5                              10

For your data, try:

my_dataframe <- my_dataframe %>%
  clean_names(case = "lower_camel")
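
Whichever renaming approach is used, it may be worth checking which names actually changed and whether any duplicates remain, since the question's data frame already has two columns called "id". The following is a base R sketch; changed_names is a hypothetical helper, and the short name vector is a made-up subset of the question's columns. With janitor, the new vector would instead come from names() of the cleaned data frame.

```r
# Hypothetical helper: list only the names that a renaming step changed.
changed_names <- function(old, new) {
  stopifnot(length(old) == length(new))
  changed <- old != new
  data.frame(old = old[changed], new = new[changed], stringsAsFactors = FALSE)
}

# A subset of the question's column names, including the repeated "id".
old <- c("id", "currentTeam.id", "handedness.shoots", "id")

# Renamed here with the regex approach; any other renaming result works too.
new <- gsub("[.](.)", "\\U\\1", old, perl = TRUE)

changed_names(old, new)
#                 old              new
# 1    currentTeam.id    currentTeamId
# 2 handedness.shoots handednessShoots

# TRUE when at least one name is repeated after renaming.
anyDuplicated(new) > 0
# [1] TRUE
```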