R: Selecting strings that share a common general pattern

Time: 2016-06-28 09:36:22

Tags: r pattern-matching stringr

I have a list of strings that looks like this:

> with(providers, head(Provider.Name, 30))
 [1] 1st Care (UK) Limited                 
 [2] 1st Care Limited                      
 [3] 229 Mitcham Lane Limited              
 [4] 24-7 Care Ltd                         
 [5] 3 Dimensions Care Limited             
 [6] 3 Trees Community Support Limited     
 [7] 365 Care Homes Limited                
 [8] 3A Care (Solihull) Limited            
 [9] 3L Care Limited                       
[10] 5 Star TLC Limited                    
[11] 92 Higher Drive Limited               
[12] A & I Care Home Ltd                   
[13] A & L Care Homes Limited              
[14] A & N Kachra                          
[15] A & R Care Limited                    
[16] A Better Carehome Ltd                 
[17] A.G.E. Nursing Homes Limited          
[18] A.R.M. Healthcare Limited             
[19] AAA Elderly Care Limited              
[20] AAA Medics Ltd                        
[21] Aadams Residential Care Home Limited  
[22] Abacus Quality Care Ltd               
[23] Abberdale Limited                     
[24] Abbeville RCH Limited                 
[25] Abbey Care Centre Limited             
[26] Abbey Care Direct Ltd                 
[27] Abbey Care Home Limited               
[28] Abbey Healthcare (Aaron Court) Limited
[29] Abbey Healthcare (Kendal) Limited     
[30] Abbey Healthcare (Knebworth) Ltd  

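My goal is to identify observations that follow a similar pattern and then rename them accordingly using that pattern. Ideally, the output would look something like the following (note in particular observations 1, 2, and 25 to 30):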

> with(providers, head(Provider.Name, 30))
     [1] 1st Care Limited                 
     [2] 1st Care Limited                      
     [3] 229 Mitcham Lane Limited              
     [4] 24-7 Care Ltd                         
     [5] 3 Dimensions Care Limited             
     [6] 3 Trees Community Support Limited     
     [7] 365 Care Homes Limited                
     [8] 3A Care (Solihull) Limited            
     [9] 3L Care Limited                       
    [10] 5 Star TLC Limited                    
    [11] 92 Higher Drive Limited               
    [12] A & I Care Home Ltd                   
    [13] A & L Care Homes Limited              
    [14] A & N Kachra                          
    [15] A & R Care Limited                    
    [16] A Better Carehome Ltd                 
    [17] A.G.E. Nursing Homes Limited          
    [18] A.R.M. Healthcare Limited             
    [19] AAA Elderly Care Limited              
    [20] AAA Medics Ltd                        
    [21] Aadams Residential Care Home Limited  
    [22] Abacus Quality Care Ltd               
    [23] Abberdale Limited                     
    [24] Abbeville RCH Limited                 
    [25] Abbey Care             
    [26] Abbey Care                 
    [27] Abbey Care              
    [28] Abbey Healthcare 
    [29] Abbey Healthcare    
    [30] Abbey Healthcare  

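My question is how I can write something like a "general pattern" that makes it possible to extract the observations that effectively share the same pattern. I tried str_extract, but I think I am missing something in how I write the general pattern.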

library(stringr)
# Attempted "general pattern": meant to capture names whose first two words
# are similar, but as written it matches only the first run of 2+ letters
home = "[a-zA-Z]{2,}"
test = with(providers, str_extract(Provider.Name, home))
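To make the gap in the attempt concrete, here is a minimal sketch comparing what the pattern above returns with what an anchored two-word pattern returns. The `sample_names` vector is a hypothetical subset of the listing, and `two_words` is only an assumed definition of "first two words" (an alphanumeric word, one space, then a word of at least two letters), not an established convention.

```r
library(stringr)

# Hypothetical sample copied from the listing in the question
sample_names <- c(
  "1st Care (UK) Limited",
  "3 Dimensions Care Limited",
  "A & I Care Home Ltd",
  "Abbey Care Centre Limited"
)

# Pattern from the attempt: str_extract() returns the first run of two or
# more ASCII letters, wherever it occurs in the string
first_run <- str_extract(sample_names, "[a-zA-Z]{2,}")
first_run
# "st" "Dimensions" "Care" "Abbey"

# Assumed alternative: anchor at the start and require two words
two_words <- "^[A-Za-z0-9]+ [A-Za-z]{2,}"
first_two <- str_extract(sample_names, two_words)
first_two
# "1st Care" "3 Dimensions" NA "Abbey Care"
```

Names that do not start with two plain words, such as "A & I Care Home Ltd", yield NA under this assumed pattern and so would have to be handled separately.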

Does anyone know whether R has a function that can identify patterns like this? Thanks in advance.
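For illustration only, one possible reading of the goal is: derive a key from the first two words of each name, then rename only the names whose key occurs more than once. The base R sketch below assumes that reading. `provider_names` is a hypothetical subset of the listing, and `two_word_key` is an assumed pattern. Note that it would turn both "1st Care" entries into "1st Care" rather than "1st Care Limited", so it does not reproduce the desired output exactly.

```r
# Hypothetical sample copied from the listing in the question
provider_names <- c(
  "1st Care (UK) Limited",
  "1st Care Limited",
  "A & I Care Home Ltd",
  "AAA Elderly Care Limited",
  "AAA Medics Ltd",
  "Abbey Care Centre Limited",
  "Abbey Care Direct Ltd",
  "Abbey Healthcare (Kendal) Limited"
)

# Assumed grouping key: an alphanumeric first word, one space, then a
# second word of at least two letters, anchored at the start of the string
two_word_key <- "^[A-Za-z0-9]+ [A-Za-z]{2,}"

# regexpr() gives -1 where nothing matches; those names keep an NA key
hit <- regexpr(two_word_key, provider_names)
key <- rep(NA_character_, length(provider_names))
key[hit != -1] <- regmatches(provider_names, hit)

# A key counts as shared when more than one name produces it
key_counts <- table(key)
shared <- key %in% names(key_counts)[key_counts > 1]

# Rename only the names whose key is shared; leave the rest untouched
renamed <- ifelse(shared, key, provider_names)
renamed
```

If `Provider.Name` is stored as a factor, which the unquoted printout suggests but does not confirm, converting it with `as.character()` before applying these steps is probably the safer route.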

0 answers:

No answers