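Parsing and fixing times in a messy database of strings

Date: 2017-08-05 08:02:23

Tags: ios objective-c cocoa nsstring

I have to parse 10K entries in a database.

The database has a field called working hours that shows the office hours of car dealerships.

The problem is that this field contains descriptive content like this: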

Office working hours from 10 am - 4 pm
Open from 9AM to 5PM
Main Showroom from 10:00AM - 5:00PM
Open 10 AM to 13 PM
Office 10AM to 3PM -- Showroom 9AM to 4PM

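As you can see, the styles vary: am/pm in lowercase and uppercase, with and without spaces, hours with zeros and a colon or without them, and even mixed styles like 13 PM that produce an invalid time. In other words, a mess. On top of that, a single line can contain more than one time range.

I want to convert the whole thing to 24-hour format, e.g.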

Office working hours from 10:00 - 16:00 hours
Open from 9:00 to 17:00 hours
Main Showroom from 10:00 - 17:00 hours
Open 10:00 to 13:00 hours
Office 10:00 to 15:00 -- Showroom 9:00 to 16:00 hours

I could use an endless series of ifs like this one, one for every hour

  if ([text containsString:@"7 PM"]) {
    text = [text stringByReplacingOccurrencesOfString:@"7 PM" withString:@"19:00"];
  }

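But that would take a huge number of lines and is not efficient. I would have to test for uppercase, lowercase, with and without spaces, and the invalid entries.

There must be a simpler way...

Any ideas?

1 answer:

Answer 0: (score: 1)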

The combination of NSRegularExpression and NSDateFormatter will come in handy for this job. The results are like this:

[Image: Results]

13PM needs manual editing, though there may be a way to fix that automatically.

The regex is not perfect: it also captures things like 36 AM, but the date formatter returns nil for those.
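One possible way to handle entries like 13 PM automatically is to skip the date formatter for the conversion and do the arithmetic by hand. The following is only a sketch: `TwentyFourHourString` is a hypothetical helper, not part of Foundation, and it assumes that an hour from 13 to 23 followed by PM was already meant as a 24-hour value. Everything else out of range (36 AM, 0 AM, minutes above 59) returns nil so it can still be reviewed manually.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: converts a matched time such as "4 pm", "10:30AM" or "13 PM"
// to "HH:mm" with plain arithmetic. Returns nil when the hour or minute is out of range.
static NSString *TwentyFourHourString(NSString *foundTime)
{
    // Normalize: drop spaces and uppercase the am/pm suffix.
    NSString *compact = [[foundTime stringByReplacingOccurrencesOfString:@" " withString:@""] uppercaseString];
    if (compact.length < 3) return nil;

    BOOL isPM = [compact hasSuffix:@"PM"];
    if (!isPM && ![compact hasSuffix:@"AM"]) return nil;

    // Everything before the suffix is "h", "hh", "h:mm" or "hh:mm".
    NSString *digits = [compact substringToIndex:compact.length - 2];
    NSArray<NSString *> *parts = [digits componentsSeparatedByString:@":"];
    if (parts.count > 2 || parts[0].length == 0) return nil;

    NSInteger hour = parts[0].integerValue;
    NSInteger minute = (parts.count == 2) ? parts[1].integerValue : 0;
    if (minute < 0 || minute > 59) return nil;

    if (hour >= 1 && hour <= 12) {
        // Regular 12-hour clock: 12 AM becomes 00:00, 12 PM stays 12:00.
        hour = (hour % 12) + (isPM ? 12 : 0);
    } else if (!(isPM && hour >= 13 && hour <= 23)) {
        // Assumption: "13 PM" .. "23 PM" is already a 24-hour value; anything else is rejected.
        return nil;
    }

    return [NSString stringWithFormat:@"%02ld:%02ld", (long)hour, (long)minute];
}
```

Called with the matched substring, for example `TwentyFourHourString(@"13 PM")`, it could be used as a fallback when the date formatter returns nil. Note that it zero-pads the hour, so 9AM comes out as 09:00.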
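If you would rather have the regex itself reject impossible times, a stricter pattern is one option. This is a sketch under my own assumptions, not the pattern used in the code below: `StrictTimeMatches` is a hypothetical helper, and the pattern only accepts hours 1 to 12 and minutes 00 to 59.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the substrings of `text` that look like valid 12-hour times.
static NSArray<NSString *> *StrictTimeMatches(NSString *text)
{
    // (?<![0-9])        -> not preceded by another digit (so "13 PM" is not read as "3 PM")
    // (1[0-2]|0?[1-9])  -> hour 1 to 12, with an optional leading zero
    // (?::[0-5][0-9])?  -> optionally ":" followed by minutes 00 to 59
    // \\s*(am|pm)\\b    -> optional whitespace, then am or pm as a whole word
    NSString *pattern = @"(?<![0-9])(1[0-2]|0?[1-9])(?::[0-5][0-9])?\\s*(am|pm)\\b";
    NSError *error = nil;
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern
                                                                           options:NSRegularExpressionCaseInsensitive
                                                                             error:&error];
    if (!regex) {
        NSLog(@"%@", error.localizedDescription);
        return @[];
    }

    NSMutableArray<NSString *> *found = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches = [regex matchesInString:text
                                                              options:kNilOptions
                                                                range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *result in matches) {
        [found addObject:[text substringWithRange:result.range]];
    }
    return found;
}
```

The trade-off is that entries such as 13 PM or 36 AM are then skipped silently instead of being caught by the nil check, so the looser pattern may be preferable if you want to log the bad entries.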

Here's the code, ready to run:

NSString *test1 = @"Office working hours from 10 am - 4 pm";
NSString *test2 = @"Open from 9AM to 5PM";
NSString *test3 = @"Main Showroom from 10:00AM - 5:00PM";
NSString *test4 = @"Open 10 AM to 13 PM";
NSString *test5 = @"Office 10AM to 3PM -- Showroom 9AM to 4PM";

NSMutableArray *newStrings = [NSMutableArray array];

// [0-9]+          -> one or more digits (the hour)
// (?:\\:[0-9]+)?  -> optionally, ":" followed by one or more digits (the minutes)
// ( )*            -> zero or more spaces
// (am|pm)         -> am or pm; with the case-insensitive option this also matches aM, Pm, AM, ...

NSString *hourPattern = @"([0-9]+(?:\\:[0-9]+)?( )*(am|pm))";
NSError *error = nil;
NSRegularExpression *miniFormatter = [NSRegularExpression
                                      regularExpressionWithPattern:hourPattern
                                      options:NSRegularExpressionCaseInsensitive
                                      error:&error];

// The initializer returns nil on failure; check that rather than the error pointer.
if(!miniFormatter)
{
    NSLog(@"%@", error.localizedDescription);
    return;
}

for(NSString *text in @[test1, test2, test3, test4, test5])
{
    NSArray<NSTextCheckingResult *> *matches = [miniFormatter matchesInString:text
                                                                      options:kNilOptions
                                                                        range:NSMakeRange(0, text.length)];

    NSMutableString *textToChange = [text mutableCopy];

    // Walk the matches from last to first, so that a replacement never shifts
    // the range of a match that is still waiting to be processed.
    for(NSTextCheckingResult *result in [matches reverseObjectEnumerator])
    {
        NSString *foundTime = [text substringWithRange:result.range];

        // Step 1: Remove whitespace for parsing.

        foundTime = [foundTime stringByReplacingOccurrencesOfString:@" " withString:@""];

        // Step 2: Make am/pm uppercase.

        foundTime = [foundTime uppercaseString];

        NSDateFormatter *dateFormatter = [NSDateFormatter new];
        // A fixed locale keeps parsing independent of the device's region and 12/24-hour setting.
        [dateFormatter setLocale:[NSLocale localeWithLocaleIdentifier:@"en_US_POSIX"]];
        [dateFormatter setTimeZone:[NSTimeZone timeZoneWithName:@"GMT"]];   // You may change it accordingly.

        NSDate *foundDate;

        // Step 3: Detect if it's in hh:mm format or hh format.

        if([foundTime containsString:@":"])
        {
            // hh:mm format

            [dateFormatter setDateFormat:@"hh:mma"];
        }
        else
        {
            // hh format

            [dateFormatter setDateFormat:@"hha"];
        }

        foundDate = [dateFormatter dateFromString:foundTime];

        if(!foundDate)
        {
            // There's a problem with parsing (such as 13PM).
            // Leave the original text untouched so it can be fixed manually.

            continue;
        }

        //NSLog(@"%@ : %@", foundTime, foundDate);

        // Step 4: Convert to 24-Hour

        [dateFormatter setDateFormat:@"HH:mm"];

        NSString *convertedTime = [dateFormatter stringFromDate:foundDate];

        // Replace by the match's own range instead of searching for the matched text again:
        // a search for "3 PM" could otherwise hit the tail of an unconverted "13 PM".
        [textToChange replaceCharactersInRange:result.range withString:convertedTime];
    }

    [newStrings addObject:textToChange];
}

for(NSString *text in newStrings)
{
    NSLog(@"%@", text);
}

Hope this helps.