Parsing a base64 string (data URI, RFC 2397) coming from a JavaScript frontend

Time: 2013-12-11 11:04:43

Tags: javascript python base64

My JavaScript frontend is sending a base64-encoded string:

data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAM8AAADkCAIAAACwiOf9AAAAA3NCSVQICAjb4U/gAAAgAElEQVR4nO...

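I need to get at the base64 data itself, that is, iVBORw0KGgoAAAANSUhEUgAAAM8AAADkCAIAAACwiOf9AAAAA3NCSVQICAjb4U/gAAAgAElEQVR4nO.... Basically, I need to strip the data:image/png;base64, prefix. Is there a standard Python library I can use for this, or do I have to write my own regular expression?

The base64 library only supports encoding and decoding, which is not what I need: I want to keep the data base64-encoded, just without the prefix.

1 answer:

Answer 0 (score: 0)

For the reference of others, I put together a small library for this: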

import logging
import re

log = logging.getLogger(__name__)

# RFC 2397 format: data:[<mediatype>][;base64],<data>
# The media type may carry parameters (e.g. ";charset=utf-8") but never a comma.
# Patterns are compiled once at import time.
_BASE64_URI = re.compile(r'^data:(?P<mediatype>[^,]*?);base64,(?P<data>.*)')
_PLAIN_URI  = re.compile(r'^data:(?P<mediatype>[^,]*),(?P<data>.*)')

def clean_data_uri(data_in):
    # Clean base64 data coming from the frontend
    #     data:image/png;base64,iVBORw0KGgoAAAA... -> iVBORw0KGgoAAAA...
    # As specified in RFC 2397
    # http://stackoverflow.com/q/20517429/647991
    # http://en.wikipedia.org/wiki/Data_URI_scheme
    # http://tools.ietf.org/html/rfc2397
    # Returns (success, mediatype, base64, data)
    try:
        # Try the ";base64" form first, then fall back to the plain form
        m = _BASE64_URI.match(data_in)
        is_base64 = m is not None
        if m is None:
            m = _PLAIN_URI.match(data_in)
    except TypeError as e:
        # data_in was not a string (e.g. None)
        log.warning('clean_data_uri > Not possible to parse data_in > %s', e)
        m = None
        is_base64 = False
    if m is None:
        log.error('clean_data_uri > Problems splitting data')
        return False, None, False, None
    return True, m.group('mediatype'), is_base64, m.group('data')
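
For what it's worth, the same split can also be done without a regular expression, since RFC 2397 puts everything before the first comma in the header and everything after it in the payload. Below is a minimal sketch, assuming Python 3 and a string input; `split_data_uri` is a hypothetical helper name used only for illustration, not part of any standard library:

```python
def split_data_uri(uri):
    """Split an RFC 2397 data URI into (mediatype, is_base64, data).

    Hypothetical helper for illustration; raises ValueError on malformed input.
    """
    prefix = "data:"
    if not uri.startswith(prefix):
        raise ValueError("not a data URI")
    # Everything before the first comma is the header, the rest is the payload.
    header, sep, data = uri[len(prefix):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separator")
    # The optional ";base64" marker is always the last part of the header.
    is_base64 = header.endswith(";base64")
    mediatype = header[:-len(";base64")] if is_base64 else header
    return mediatype, is_base64, data


mediatype, is_base64, data = split_data_uri("data:image/png;base64,iVBORw0KGgo=")
print(mediatype, is_base64, data)
```

The payload is returned untouched, so it stays base64-encoded as the question asks; it can be passed to `base64.b64decode` later if the raw bytes are ever needed.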