Python - Search for a folder name in a file path

Time: 2018-04-27 08:43:46

Tags: python python-2.7

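I need to identify the "episodeNumber" from a folder name in a file path. The folder name consists of the two letters "ep" followed by three digits, from "001" to "999".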

Example path:

I'm looking for a way to get a variable that holds the episode number (ep001 in this example):

print(episodeNumber)
'ep001'

Then I could put "episodeNumber" into another path, like this:

newPath = "Y://work/" + episodeNumber + "/comp/"
print(newPath)
'Y://work/ep001/comp/'

How can I identify the episode number in a path so that I can put it into another path?

I'm using Python 2.7.

Thanks

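2 answers:

Answer 0: (score: 2)

I find regular expressions powerful but not very readable, especially if you haven't used them for a while, so here is a solution that avoids them and uses other built-in modules instead.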

import os
import fnmatch


def get_episode(path):
    # Split the path (should be OS independent).
    path_split = os.path.normpath(path).split(os.sep)

    # Find the episode using fnmatch.
    # This enforces your convention: 'ep' must be followed by exactly three digits.
    episodes = fnmatch.filter(path_split, "ep[0-9][0-9][0-9]")
    if not episodes:
        raise RuntimeError("Unable to detect episode in the supplied path.")

    # In theory the path may yield multiple episodes from it, so just return the first one.
    return episodes[0]


episode = get_episode("N://out/ep001/FX/maya/file4984.ma")

# Use os.path.join to build your new path. 
new_path = os.path.join(os.path.normpath("Y://"), "work", episode, "comp")

This example produces the following result:

'ep001'  # episode

'Y:\\work\\ep001\\comp'  # new_path (I'm on Windows, so I get double backslashes)

It is better to use the os.path functions to build paths, rather than +, so that the code works across platforms.

This was tested on Python 2.7.11.
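To illustrate the point about os.path versus +, here is a minimal sketch. It uses ntpath and posixpath (the standard-library modules that back os.path on Windows and POSIX respectively) only so that the output is the same wherever it runs; the "/mnt/y" root is a hypothetical mount point, not something from the question.

```python
import ntpath      # Windows flavour of os.path
import posixpath   # POSIX flavour of os.path

episode = "ep001"

# String concatenation hard-codes the separator into the result.
concatenated = "Y://work/" + episode + "/comp/"

# join() inserts the separator that belongs to each path flavour.
windows_path = ntpath.join("Y:\\", "work", episode, "comp")
posix_path = posixpath.join("/mnt/y", "work", episode, "comp")  # hypothetical root

print(concatenated)
print(windows_path)
print(posix_path)
```

In ordinary code you would just call os.path.join and let Python pick the flavour for the platform it is running on.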

Answer 1: (score: 1)

from __future__ import print_function
# the above import is for python2 compatibility (question author request)
# it has to be at the top of the module
# __future__ is part of the standard library, so nothing needs to be installed

import re

file_address = 'N://sessionY/ep001/out-montageFX/maya/'

# ok let's write a regex to find the part we want
# the (ep) part means that we want to capture `ep` in a `capture group`
# the (\d{3}) part means match exactly 3 digits and capture them
# [\\/] matches backward or forward slashes (for windows support)
# more on regexes here: https://docs.python.org/3/library/re.html#module-re
# on match objects here: 
# https://docs.python.org/3/library/re.html#match-objects

regex = r'[\\/](ep)(\d{3})[\\/]'

# next we apply the expression with re.search
# (which means find the first matching occurrence)
# details here: 
# https://docs.python.org/3/library/re.html#regular-expression-objects

m = re.search(regex, file_address, flags=re.IGNORECASE)

# the flags=re.IGNORECASE part makes the match case-insensitive

if m:  # if a match has been found

    # we can get the folder_name `ep` part from the capture group 1
    # and the episode number from the capture group 2

    folder_name = m.group(1) + m.group(2)

    # as explained above, the episode number is in group 2
    # and we also need to convert it to an integer (if desired)
    # the group is guaranteed to have the match in it, because
    # we are inside the above if statement.
    episode_number = int(m.group(2))

    # lets print the results:
    print('folder_name:', folder_name)
    print('episode_number:', episode_number)
else:
    # obviously no match
    print('no match')
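To connect this back to the question (reusing the episode folder in a second path), the regex approach could be wrapped up roughly as follows. This is only a sketch: find_episode and build_comp_path are hypothetical names, and the pattern is a variation on the one above (a single capture group, and it also accepts the folder at the very start or end of the path). The "work" and "comp" folder names come from the question.

```python
from __future__ import print_function

import os
import re

# Variation on the pattern above: one capture group for the whole folder name.
# The folder must be delimited by a slash/backslash or by the string boundary.
EPISODE_RE = re.compile(r'(?:^|[\\/])(ep\d{3})(?=[\\/]|$)', re.IGNORECASE)


def find_episode(path):
    """Return the first 'epNNN' folder name found in path, or None."""
    m = EPISODE_RE.search(path)
    return m.group(1) if m else None


def build_comp_path(source_path, root):
    """Build <root>/work/<episode>/comp using the episode in source_path."""
    episode = find_episode(source_path)
    if episode is None:
        raise ValueError("no episode folder in %r" % (source_path,))
    return os.path.join(root, "work", episode, "comp")


print(find_episode('N://sessionY/ep001/out-montageFX/maya/'))
```

Returning None from find_episode leaves the decision about a missing episode to the caller; build_comp_path chooses to raise, which mirrors the behaviour of the fnmatch-based answer.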