How to sort data alphabetically by removing the numbers at the start of each name

Time: 2019-04-20 23:36:01

Tags: r gsub

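I am trying to sort these state names alphabetically while keeping the number to the left of each state name. So far I have not been able to figure out how to do it.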

I have tried various forms of gsub to strip the numbers before sorting, without success.

Here is the data set containing the states:

print(StateRankings)

# [1] "1. Arizona"         "10. Missouri"       "11. Tennessee"      "12. Florida"       
# [5] "13. West Virginia"  "14. Kentucky"       "15. New Hampshire"  "16. Mississippi"   
# [9] "17. Wyoming"        "18. Alabama"        "19. Idaho"           "2. Alaska"         
#[13] "20. Vermont"        "21. Indiana"        "22. Arkansas"       "23. Wisconsin"     
#[17] "24. South Carolina" "25. Nevada"         "26. North Carolina" "27. Michigan"      
#[21] "28. Louisiana"      "29. Ohio"           "3. Kansas"          "30. Maine"         
#[25] "31. Virginia"       "32. South Dakota"   "33. Pennsylvania"   "34. Oregon"        
#[29] "35. Nebraska"       "36. Iowa"           "37. New Mexico"     "38. Washington"    
#[33] "39. Colorado"       "4. Oklahoma"        "40. Illinois"       "41. Minnesota"     
#[37] "42. Delaware"       "43. Rhode Island"   "44. Maryland"       "45. Connecticut"   
#[41] "46. California"     "47. Hawaii"         "48. New Jersey"     "49. Massachusetts" 
#[45] "5. Montana"         "50. New York"       "6. Utah"             "7. North Dakota"   
#[49] "8. Texas"           "9. Georgia"

1 Answer:

Answer 0: (score: 2)

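We can strip the leading number and dot from each element of the character vector, use order on the remaining names only, and then subset the original vector with the result. Because the subsetting is done on the original vector, the numbers are preserved in the output.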

StateRankings[order(sub("^\\d+\\.\\s+", "", StateRankings))]

#[1] "18. Alabama"  "2. Alaska"  "1. Arizona"  "12. Florida"  "19. Idaho"
#[6] "14. Kentucky"  "16. Mississippi"  "10. Missouri"  "15. New Hampshire"
#[10] "11. Tennessee"  "13. West Virginia"  "17. Wyoming"
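
To see why this works, the one-liner can be split into its individual steps. This is a minimal sketch on a shortened five-element vector; the names `rankings`, `names_only`, `idx`, and `sorted` are introduced here for illustration and are not part of the original answer.

```r
# Shortened sample of the data (illustrative subset)
rankings <- c("1. Arizona", "10. Missouri", "18. Alabama", "2. Alaska", "12. Florida")

# Step 1: strip the leading digits, the dot, and the whitespace that follows
names_only <- sub("^\\d+\\.\\s+", "", rankings)
# names_only is now: "Arizona" "Missouri" "Alabama" "Alaska" "Florida"

# Step 2: order() returns the positions that would sort the bare names
idx <- order(names_only)
# idx is now: 3 4 1 5 2

# Step 3: index the ORIGINAL vector with those positions, so the numbers are kept
sorted <- rankings[idx]
# sorted is now: "18. Alabama" "2. Alaska" "1. Arizona" "12. Florida" "10. Missouri"
print(sorted)
```

Since there is only one number prefix per element, sub (first match only) is enough here; gsub would give the same result with this anchored pattern.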

FYI, R ships with the state names built in; they are stored in ascending alphabetical order in state.name

state.name
#[1] "Alabama"   "Alaska"  "Arizona"  "Arkansas"  "California" "Colorado"
#[7] "Connecticut"  "Delaware"  "Florida"   "Georgia"  "Hawaii" "Idaho"........
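
Because state.name is already alphabetical, it can double as a sanity check on the stripped names and as a sort key. The following is only a sketch, assuming every element follows the same "number. name" format and that the names are spelled exactly as in state.name; the variable names are illustrative.

```r
# Shortened sample of the data (illustrative subset)
rankings <- c("1. Arizona", "10. Missouri", "18. Alabama", "2. Alaska")

# Strip the numeric prefix, as in the answer above
names_only <- sub("^\\d+\\.\\s+", "", rankings)

# Check that every stripped name is a known state name
all_known <- all(names_only %in% state.name)

# match() gives each name's position in the alphabetical state.name vector;
# ordering by that position sorts the original strings the same way
sorted <- rankings[order(match(names_only, state.name))]
print(all_known)
print(sorted)
```

If all_known turns out to be FALSE, the pattern probably left part of the prefix behind or a name is misspelled, which is worth checking before relying on the sorted result.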

Data

StateRankings <- c("1. Arizona", "10. Missouri", "11. Tennessee", "12. Florida",
  "13. West Virginia", "14. Kentucky", "15. New Hampshire", "16. Mississippi",
  "17. Wyoming", "18. Alabama", "19. Idaho", "2. Alaska")