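Based on specific values

Time: 2015-07-07 04:13:07

Tags: regex r dplyr stringr

I have read up on regular expressions and on Hadley Wickham's stringr and dplyr packages, but I can't figure out how to get this to work.

I have library circulation data in a data frame, with the call number stored as a character variable. I want to take the leading capital letters and make them a new variable, and turn the digits between those letters and the period into a second new variable.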

Call_Num
HV5822.H4 C47 Circulating Collection, 3rd Floor
QE511.4 .G53 1982 Circulating Collection, 3rd Floor
TL515 .M63 Circulating Collection, 3rd Floor
D753 .F4 Circulating Collection, 3rd Floor
DB89.F7 D4 Circulating Collection, 3rd Floor 

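4 answers:

Answer 0: (score: 4)

One option is the stringi package. Because your targets sit at the beginning of each string, stri_extract_first() works well. [:alpha:]{1,} matches a run of one or more letters, so stri_extract_first() with that pattern returns the first letter sequence in each string. In the same way, stri_extract_first(x, regex = "\\d{1,}") finds the first run of digits.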

x <- c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
       "QE511.4 .G53 1982 Circulating Collection, 3rd Floor",
       "TL515 .M63 Circulating Collection, 3rd Floor",
       "D753 .F4 Circulating Collection, 3rd Floor",
       "DB89.F7 D4 Circulating Collection, 3rd Floor")

library(stringi)

data.frame(alpha = stri_extract_first(x, regex = "[:alpha:]{1,}"), 
           number = stri_extract_first(x, regex = "\\d{1,}"))

#  alpha number
#1    HV   5822
#2    QE    511
#3    TL    515
#4     D    753
#5    DB     89

Answer 1: (score: 2)

How about

Answer 2: (score: 2)

If you want to use stringr, the solution could look like this:

df <- data.frame(Call_Num = c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
                              "QE511.4 .G53 1982 Circulating Collection, 3rd Floor",
                              "TL515 .M63 Circulating Collection, 3rd Floor",
                              "D753 .F4 Circulating Collection, 3rd Floor",
                              "DB89.F7 D4 Circulating Collection, 3rd Floor"))

library(stringr)

# Group 1: leading capital letters; group 2: the digits before the first period
matches <- str_match(df$Call_Num, "([A-Z]+)(\\d+)\\s*\\.")
# Column 1 of the result is the full match; columns 2 and 3 hold the groups
df2 <- data.frame(df, letter = matches[, 2], number = matches[, 3])
df2
##                                                  Call_Num letter number
## 1     HV5822.H4 C47 Circulating Collection, 3rd Floor     HV   5822
## 2 QE511.4 .G53 1982 Circulating Collection, 3rd Floor     QE    511
## 3        TL515 .M63 Circulating Collection, 3rd Floor     TL    515
## 4          D753 .F4 Circulating Collection, 3rd Floor      D    753
## 5        DB89.F7 D4 Circulating Collection, 3rd Floor     DB     89

I don't think squeezing the str_match() call into a dplyr mutate() is worth the effort, so I would leave it at that. Or use rawr's solution.
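For reference, a dplyr version might look roughly like the sketch below. It assumes dplyr and stringr are installed and rebuilds the sample data so that it runs on its own; the names pattern and df3, and the conversion of the digits to integer, are illustrative choices rather than part of the original answer. Note that str_match() ends up being called once per new column.

```r
library(dplyr)
library(stringr)

# Sample data repeated here so the sketch is self-contained
df <- data.frame(
  Call_Num = c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
               "QE511.4 .G53 1982 Circulating Collection, 3rd Floor",
               "TL515 .M63 Circulating Collection, 3rd Floor",
               "D753 .F4 Circulating Collection, 3rd Floor",
               "DB89.F7 D4 Circulating Collection, 3rd Floor"),
  stringsAsFactors = FALSE
)

# Same pattern as above: capital letters, digits, optional spaces, a period
pattern <- "([A-Z]+)(\\d+)\\s*\\."

df3 <- df %>%
  mutate(letter = str_match(Call_Num, pattern)[, 2],              # capture group 1
         number = as.integer(str_match(Call_Num, pattern)[, 3]))  # capture group 2
```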

Answer 3: (score: 2)

You can use strapply from the gsubfn package:

library(gsubfn)

m <- strapply(as.character(df$Call_Num), '^([A-Z]+)(\\d+)', 
     ~ c(id = x, num = y), simplify = rbind)

X <- as.data.frame(m, stringsAsFactors = FALSE)

#   id  num
# 1 HV 5822
# 2 QE  511
# 3 TL  515
# 4  D  753
# 5 DB   89
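
If gsubfn is not available, an equivalent anchored pattern can probably be applied with base R's regexec() and regmatches(). This is only a minimal sketch, not part of the original answer: it rebuilds the sample vector x, writes the digit class as [0-9], converts the digits to integer, and assumes every call number matches (a non-matching row would yield an empty element and be dropped by rbind).

```r
x <- c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
       "QE511.4 .G53 1982 Circulating Collection, 3rd Floor",
       "TL515 .M63 Circulating Collection, 3rd Floor",
       "D753 .F4 Circulating Collection, 3rd Floor",
       "DB89.F7 D4 Circulating Collection, 3rd Floor")

# regexec() records the position of the full match and of each capture group;
# regmatches() then returns c(full match, group 1, group 2) per string
parts <- regmatches(x, regexec("^([A-Z]+)([0-9]+)", x))

# Stack the per-string vectors into a character matrix, one row per call number
m <- do.call(rbind, parts)

X <- data.frame(id = m[, 2], num = as.integer(m[, 3]),
                stringsAsFactors = FALSE)
```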