Removing accents in Python

Date: 2015-04-28 00:16:43

Tags: python unicode diacritics

I have this function to remove the accents from a word:

import string
import unicodedata

def remove_accents(word):
    return ''.join(x for x in unicodedata.normalize('NFKD', word) if x in string.ascii_letters)

But when I run it, it raises this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xf3 in position 3: ordinal not in range(128)

The character at position 3 is: ó

1 answer:

Answer 0: (score: 1)

If your input is a unicode string, it works:

>>> remove_accents(u"foóbar")
u'foobar'

If it is not, it doesn't. I don't get the error you describe, though. I get a TypeError instead, and I only get a UnicodeDecodeError when I try to convert the input to unicode by doing

>>> remove_accents(unicode("foóbar"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)

If that is your problem, i.e. you have Python 2 str objects as input, you can fix it by decoding the input as UTF-8 first, for example with word.decode('utf-8'), before passing it to remove_accents.
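As a rough illustration of the decode-first idea, here is a minimal sketch written for Python 3, where text is str and raw bytes are bytes. The encoding parameter and the isinstance check are assumptions added for this example and are not part of the original function; UTF-8 is only a guess about how the input bytes were encoded.

```python
import string
import unicodedata


def remove_accents(word, encoding="utf-8"):
    # Assumption: byte input is encoded as UTF-8 unless the caller says otherwise.
    if isinstance(word, bytes):
        word = word.decode(encoding)
    # NFKD splits "ó" into "o" plus a combining accent mark.
    decomposed = unicodedata.normalize("NFKD", word)
    # Keep only ASCII letters, as the original function does.
    return "".join(ch for ch in decomposed if ch in string.ascii_letters)


print(remove_accents("foóbar"))                  # foobar
print(remove_accents("foóbar".encode("utf-8")))  # foobar
```

Like the original, this filter also drops digits, spaces and punctuation, because only characters in string.ascii_letters are kept.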