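I have read up on regular expressions and on Hadley Wickham's stringr and dplyr packages, but I can't figure out how to get this to work.
I have library circulation data in a data frame, with the call number stored as a character variable. I want to take the leading capital letters and make them a new variable, and turn the digits between those letters and the period into a second new variable.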
Call_Num
HV5822.H4 C47 Circulating Collection, 3rd Floor
QE511.4 .G53 1982 Circulating Collection, 3rd Floor
TL515 .M63 Circulating Collection, 3rd Floor
D753 .F4 Circulating Collection, 3rd Floor
DB89.F7 D4 Circulating Collection, 3rd Floor
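Answer 0 (score: 4)
With the stringi package, this would be one option. Since your targets sit at the beginning of each string, stri_extract_first() works well. The pattern [:alpha:]{1,} matches a run of one or more letters, so stri_extract_first() with that pattern picks out the first run of letters. Likewise, stri_extract_first(x, regex = "\\d{1,}") finds the first run of digits.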
x <- c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
"QE511.4 .G53 1982 Circulating Collection, 3rd Floor",
"TL515 .M63 Circulating Collection, 3rd Floor",
"D753 .F4 Circulating Collection, 3rd Floor",
"DB89.F7 D4 Circulating Collection, 3rd Floor")
library(stringi)
data.frame(alpha = stri_extract_first(x, regex = "[:alpha:]{1,}"),
number = stri_extract_first(x, regex = "\\d{1,}"))
#   alpha number
# 1    HV   5822
# 2    QE    511
# 3    TL    515
# 4     D    753
# 5    DB     89
Answer 1 (score: 2)
How about:
Answer 2 (score: 2)
If you want to use stringr, the solution might look like this:
df <- data.frame(Call_Num = c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
                              "QE511.4 .G53 1982 Circulating Collection, 3rd Floor",
                              "TL515 .M63 Circulating Collection, 3rd Floor",
                              "D753 .F4 Circulating Collection, 3rd Floor",
                              "DB89.F7 D4 Circulating Collection, 3rd Floor"))
library(stringr)
# Group 1: capital letters; group 2: the digits that follow;
# then optional whitespace and a literal period.
matches <- str_match(df$Call_Num, "([A-Z]+)(\\d+)\\s*\\.")
df2 <- data.frame(df, letter = matches[, 2], number = matches[, 3])
df2
##                                              Call_Num letter number
## 1     HV5822.H4 C47 Circulating Collection, 3rd Floor     HV   5822
## 2 QE511.4 .G53 1982 Circulating Collection, 3rd Floor     QE    511
## 3        TL515 .M63 Circulating Collection, 3rd Floor     TL    515
## 4          D753 .F4 Circulating Collection, 3rd Floor      D    753
## 5        DB89.F7 D4 Circulating Collection, 3rd Floor     DB     89
I don't think it is worth the effort to fit the str_match() call into a mutate() call from dplyr, so I would just leave it at that. Alternatively, use rawr's solution.
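For readers who still want a single dplyr pipeline, here is a minimal sketch of what that might look like. It assumes both stringr and dplyr are installed. The names `pattern` and `df2` are only illustrative, and converting the number to integer is an addition that is not part of the answer above.

```r
library(dplyr)
library(stringr)

df <- data.frame(
  Call_Num = c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
               "QE511.4 .G53 1982 Circulating Collection, 3rd Floor",
               "TL515 .M63 Circulating Collection, 3rd Floor",
               "D753 .F4 Circulating Collection, 3rd Floor",
               "DB89.F7 D4 Circulating Collection, 3rd Floor"),
  stringsAsFactors = FALSE
)

# Same regex as in the answer: letters, digits, optional space, period.
pattern <- "([A-Z]+)(\\d+)\\s*\\."

# str_match() returns a character matrix: column 1 is the full match,
# columns 2 and 3 are the two capture groups.
df2 <- df %>%
  mutate(letter = str_match(Call_Num, pattern)[, 2],
         # as.integer() is an assumption about the desired type
         number = as.integer(str_match(Call_Num, pattern)[, 3]))
```

The pattern is matched twice here, once per new column, which is part of why the plain data.frame() version above may be the simpler choice.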
Answer 3 (score: 2)
You can use strapply from the gsubfn package:
library(gsubfn)
m <- strapply(as.character(df$Call_Num), '^([A-Z]+)(\\d+)',
~ c(id = x, num = y), simplify = rbind)
X <- as.data.frame(m, stringsAsFactors = FALSE)
X
#   id  num
# 1 HV 5822
# 2 QE  511
# 3 TL  515
# 4  D  753
# 5 DB   89
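For comparison, the same anchored capture-group idea can be sketched in base R, with no extra packages. This is only an illustration under stated assumptions: the names `call_num`, `parts`, and `result` are made up for the example, and the integer conversion is an addition.

```r
call_num <- c("HV5822.H4 C47 Circulating Collection, 3rd Floor",
              "QE511.4 .G53 1982 Circulating Collection, 3rd Floor",
              "TL515 .M63 Circulating Collection, 3rd Floor",
              "D753 .F4 Circulating Collection, 3rd Floor",
              "DB89.F7 D4 Circulating Collection, 3rd Floor")

# regexec() records the position of the full match and of each capture group;
# regmatches() turns those positions into character vectors, one per input.
m <- regmatches(call_num, regexec("^([A-Z]+)([0-9]+)", call_num))

# Each element is c(full match, letters, digits). Stacking them gives a
# character matrix. This assumes every string matches; a non-matching
# string would yield character(0) and break the row alignment.
parts <- do.call(rbind, m)

result <- data.frame(letter = parts[, 2],
                     number = as.integer(parts[, 3]),
                     stringsAsFactors = FALSE)
```

The `^` anchor restricts the match to the start of each call number, so later letter and digit runs such as `.H4` or `1982` are not picked up.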