How do I whitelist certain characters in a filename in Python?

Asked: 2015-09-04 18:09:57

Tags: python regex string

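To secure the names of uploaded images, I'd like to strip everything from the image filenames except string.ascii_letters, string.digits, dots, and (one) space.

So I'm wondering: what is the best way to check a string for characters outside that set?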

3 answers:

Answer 0: (score: 3)

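import re
import os

s = 'asodgnasAIDID12313%*(@&(!$ 1231'
# Remove everything except ASCII letters, digits, dots and spaces,
# and drop any run of two or more spaces.
result = re.sub(r'[^a-zA-Z0-9. ]|( ){2,}', '', s)
# Reject names that are empty or whose stem is only whitespace.
if result == '' or os.path.splitext(result)[0].isspace():
    print("not a valid name")
else:
    print("valid name")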

Edit:

Changed it so that it also whitelists a single space, and added the missing import re.
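
Note that the pattern above deletes runs of two or more spaces outright, and removing a symbol that sits between two spaces can still leave a double space behind. If collapsing such runs to a single space is preferred, a minimal sketch could look like the following. The helper name sanitize_name is hypothetical and not part of the answer.

```python
import re

def sanitize_name(name):
    """Keep ASCII letters, digits, dots and spaces; collapse space runs."""
    # First pass: drop every character outside the whitelist.
    cleaned = re.sub(r'[^a-zA-Z0-9. ]', '', name)
    # Second pass: collapse runs of spaces to one and trim both ends.
    return re.sub(r' {2,}', ' ', cleaned).strip()

print(sanitize_name('my  holiday%% photo(1).jpg'))
```

Running the two substitutions separately means spaces that only become adjacent after the first pass are still collapsed.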

Answer 1: (score: 1)

Not sure if this is what you need, but give it a try:

import os

fileName, fileExtension = os.path.splitext('image  11%%22.jpg')
# Drop non-ASCII characters; decode again so the values stay str in Python 3.
fileExtension = fileExtension.encode('ascii', 'ignore').decode('ascii')
fileName = fileName.encode('ascii', 'ignore').decode('ascii')
if fileExtension[1:] in ['jpg', 'jpeg', 'png', 'gif', 'bmp', 'tiff', 'tga']:
    # Keep only alphanumeric characters in the name.
    fileName = ''.join(e for e in fileName if e.isalnum())
    print(fileName + fileExtension)
    # image1122.jpg
else:
    print("Extension not supported")
Documentation for isalnum():

https://docs.python.org/2/library/stdtypes.html#str.isalnum
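
In Python 3, str.isalnum() also returns True for non-ASCII letters and digits, so the approach above depends on the ASCII encode step running first. A sketch with an explicit character whitelist avoids that dependency and also accepts upper-case extensions such as .JPG. The names ALLOWED_EXTENSIONS, ALLOWED_CHARS and clean_image_name are hypothetical, introduced only for illustration.

```python
import os
import string

# Assumed list of accepted extensions, copied from the answer above.
ALLOWED_EXTENSIONS = {'jpg', 'jpeg', 'png', 'gif', 'bmp', 'tiff', 'tga'}
ALLOWED_CHARS = set(string.ascii_letters + string.digits)

def clean_image_name(path):
    """Return the sanitized name, or None if it cannot be accepted."""
    stem, ext = os.path.splitext(path)
    # Compare the extension case-insensitively.
    if ext[1:].lower() not in ALLOWED_EXTENSIONS:
        return None
    # Keep only ASCII letters and digits in the stem.
    stem = ''.join(ch for ch in stem if ch in ALLOWED_CHARS)
    if not stem:
        return None
    return stem + ext.lower()

print(clean_image_name('image  11%%22.JPG'))
print(clean_image_name('notes.txt'))
```

Returning None for both an unsupported extension and an empty stem is a design choice for this sketch; raising an exception would work equally well.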

Answer 2: (score: 0)

I wouldn't use a regex for this. The only tricky requirement is the single space, but that can be handled too.

import string

whitelist = set(string.ascii_letters + string.digits)
good_filename = "herearesomelettersand123numbers andonespace"
bad_filename = "symbols&#! and more than one space"

def strip_filename(fname, whitelist):
    """Strips a filename

    Removes any character from string `fname` that is not in `whitelist`
    and removes all but one whitespace.
    """

    # Allow spaces without mutating the caller's set.
    allowed = set(whitelist) | {" "}

    stripped = ''.join(ch for ch in fname if ch in allowed)
    split = stripped.split()
    if not split:
        # Nothing usable is left after filtering.
        return ''
    # Keep only the first space; strip() avoids a trailing space for one word.
    result = " ".join([split[0], ''.join(split[1:])]).strip()
    return result

Then call it with:

good_sanitized = strip_filename(good_filename, whitelist)
bad_sanitized = strip_filename(bad_filename, whitelist)
print(good_sanitized)
# 'herearesomelettersand123numbers andonespace'
print(bad_sanitized)
# 'symbols andmorethanonespace'
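
The whitelist above does not contain the dot, so a name such as photo.jpg would lose its extension separator. One way to keep it, sketched here under the assumption that only the final extension matters, is to sanitize the stem and the extension separately. WHITELIST and strip_keep_extension are hypothetical names, not part of the answer.

```python
import os
import string

WHITELIST = set(string.ascii_letters + string.digits + ' ')

def strip_keep_extension(fname):
    """Sanitize the stem of `fname` and re-attach a filtered extension."""
    stem, ext = os.path.splitext(fname)
    words = ''.join(ch for ch in stem if ch in WHITELIST).split()
    # Keep a single space after the first word, as in the answer above.
    clean_stem = ' '.join([words[0], ''.join(words[1:])]).strip() if words else ''
    # The extension may contain whitelisted characters except spaces.
    clean_ext = ''.join(ch for ch in ext[1:] if ch in WHITELIST and ch != ' ')
    return clean_stem + '.' + clean_ext if clean_ext else clean_stem

print(strip_keep_extension('symbols&#! and more.jpg'))
```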