Match subfolder name to file name and output folder in Python

Date: 2020-10-22 15:36:50

Tags: python directory

I have the following folder and file structure:

  • mainfolder_segment_polygon
    • folder_poly5numSeg
      • subfolder_compactness40
        • subfolder_aoi1
          • file_aoi1_seg0.shp
          • file_aoi1_seg1.shp
        • subfolder_aoi2
          • file_aoi2_seg0.shp
          • file_aoi2_seg1.shp
    • folder_poly6numSeg
      • subfolder_compactness40
        • subfolder_aoi1
          • file_aoi1_seg0.shp
          • file_aoi1_seg1.shp
        • subfolder_aoi2
          • file_aoi2_seg0.shp
          • file_aoi2_seg1.shp

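I would like to load all the files from the same folder (under segment_polygon), apply a function to them, and export the result to another set of folders with the same structure (under segment_multipoly).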

  • Files in r".\segmentation_aoi\segment_polygon\poly5numSeg\compactness40\aoi1" should be processed together and exported to r".\segmentation_aoi\segment_multipoly\multi5numSeg\compactness40\aoi1"

  • Files in r".\segmentation_aoi\segment_polygon\poly6numSeg\compactness40\aoi2" should be processed together and exported to r".\segmentation_aoi\segment_multipoly\multi6numSeg\compactness40\aoi2"

and so forth...

The names "mainfolder", "folder", "subfolder" and "file" are only there to indicate the level each name belongs to; they are not part of the actual folder or file names.

input_path = os.path.join(src, "segment_polygon\\")
output_path = os.path.join(src, "segment_multipoly\\")
root = Path(input_path)
for maindir, subdirs, shpfiles in os.walk(input_path):
    for shp in shpfiles:
        aoi_root, shp_ext = shp.split("_")
        for file in root.glob("*/*/*/*.shp"):
            part_path = Path(file).parts
            folder_numSeg_name = part_path[9]  # here I get the subfolder "poly5numSeg", "poly6numSeg", etc
            folder_aoi_name = part_path[11]  # here I get the subfolder "aoi1", "aoi2", etc...
            aoiprep_seg = part_path[12]  # here I get the name of the file "aoi1_seg0.shp", "aoi1_seg1.shp", etc
            if aoi_root == folder_aoi_name:
                '''apply a function to shp'''
                shp.to_file(os.path.join(output_path, folder_numSeg_name, "compactness40\\", folder_aoi_name, shp))

I am a bit lost. I am working on Windows 10 with Python 3. Thanks for all your help.

Update on the script

segment_polygon = os.path.join(output, "segment_polygon\\") # input path
segment_multipoly = os.path.join(output, "segment_multipoly\\") # output path

# 1. get aoi directories
aoi_dir = [path for path in glob.glob(os.path.join(segment_polygon, "*/*/*"))
           if os.path.isdir(path)]

# list to store the shapefiles to be intersected
input_list = []

for path in aoi_dir:
    # 2. get the files
    shp_paths = glob.glob(path + os.sep + '*.shp')
    for shp_path in shp_paths:
        # 3. do things with shp_path
        full_path, seg_shp = os.path.split(shp_path)
        aoi_folder = full_path[-5:] # aoi01, aoi02, aoi03....aoi25
        if seg_shp.startswith(aoi_folder):
            input_list.append(shp_path) # creates the new list with shapefiles that start with the same aoiX value
        auto_inter = gpd.GeoDataFrame.from_file(input_list[0]) #process shp
        for i in range(len(input_list)-1):
            mp = gpd.GeoDataFrame.from_file(input_list[i+1]) # process shp
            auto_inter = gpd.overlay(auto_inter, mp, how='intersection') #process shp
        print(f"shp included in the list:\n {input_list}")
        # 4. create your output file path
        print(full_path)
        output_path = full_path.replace("poly", "multi")
        N_output_path = output_path.replace("gon", "polygon")
        print(f"output_path:\n {N_output_path}")
        # make sure the directories exist
        if not os.path.exists(os.path.dirname(N_output_path)):
            os.makedirs(os.path.dirname(N_output_path), exist_ok=True)
            # create output file name
            multipoly_name = aoi_folder + ".shp"
            # export
            auto_inter.to_file(os.path.join(N_output_path, multipoly_name)) #export shp

I incorporated the changes from ygorg. However, the script uses all the shapefiles for the intersection. I need only the aoi1 files to be intersected and saved in the aoi1 folder, then the aoi2 shapefiles to be intersected and saved in the aoi2 folder, and so on. This does not work yet.

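2 Answers:

Answer 0: (score: 1)

Mixing os.walk and glob seems confusing. If you want to process each aoiX folder, first list all of these directories, then list the .shp files in each one, apply the function, and finally build your output_path and write to it.

When working with files, it is always a good idea to break the task down into the steps you need.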

import glob
import os

# 1. get aoi directories
aoi_dir = [path for path in glob.glob('segment_polygon/*/*/*')
           if os.path.isdir(path)]
for path in aoi_dir:
    # 2. get the files
    shp_paths = glob.glob(path + os.sep + '*.shp')
    for shp_path in shp_paths:
        # 3. do things with shp_path
        # 4. create your output file path
        output_path = shp_path.replace('segment_polygon', 'segment_multipoly')
        # make sure the directories exist
        os.makedirs(os.path.dirname(output_path), exist_ok=True)
        # write in output file

And always do a dry run first, without processing or writing anything, that only prints the paths, so you are sure of what will happen!
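The dry run suggested above can be sketched as follows. This is a minimal sketch, not part of the original answer: the helper name `plan_outputs` is hypothetical, and it assumes the three-level layout from the question (numSeg folder, compactness folder, aoi folder). It only builds and prints the mapping from each aoi folder to its output folder, and writes nothing to disk.

```python
from pathlib import Path


def plan_outputs(src_root, dst_root, pattern="*/*/*"):
    """Map each aoi folder under src_root to (shapefiles, output folder).

    Hypothetical helper: nothing is read or written, only paths are computed.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    plan = {}
    for aoi_dir in sorted(p for p in src_root.glob(pattern) if p.is_dir()):
        shp_files = sorted(aoi_dir.glob("*.shp"))
        if not shp_files:
            continue  # skip folders without shapefiles
        rel = aoi_dir.relative_to(src_root)
        # Rename only the first relative level ("poly5numSeg" -> "multi5numSeg"),
        # so "segment_polygon" itself is never touched by the replace.
        out_rel = Path(rel.parts[0].replace("poly", "multi"), *rel.parts[1:])
        plan[aoi_dir] = (shp_files, dst_root / out_rel)
    return plan


if __name__ == "__main__":
    # Dry run: print what would be processed and where it would go.
    planned = plan_outputs("segment_polygon", "segment_multipoly")
    for aoi_dir, (shp_files, out_dir) in planned.items():
        print(f"{aoi_dir} -> {out_dir}")
        for shp in shp_files:
            print(f"    {shp.name}")
```

Working on the path relative to the input root avoids the side effect seen in the question's update, where replacing "poly" in the full path also changed "segment_polygon" and needed a second replace to undo it.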

Answer 1: (score: 0)

I managed to solve the problem. Thank you, ygorg, for your input. It pointed me in the right direction.

# Create a list of the subfolders of segment_polygon
poly_dir = [path for path in glob.glob(os.path.join(segment_polygon, "*/*/*"))
       if os.path.isdir(path)]

for aoi_poly in poly_dir:

    # define input folder
    input_subfolder = aoi_poly.split("segment_polygon\\")[1] # splits the path at "...\\" and keeps the tail (position:1)
    #print(f"input folder: {input_subfolder}")

    #define export folder
    export_subfolder = input_subfolder.replace("poly", "multi")
    export_folder = os.path.join(segment_multipoly, export_subfolder)
    #print(f"output folder: {export_folder}")

    # define name output shapefile
    numseg, compactness, aoi = [int(s) for s in re.findall(r'\d+', input_subfolder)] # extract the integers from the relative "poly" path only, so digits elsewhere in the full path cannot break the unpacking
    name_output = "aoi" + str(aoi)+ "_" + "numSeg"+ str(numseg) + "_c" + str(compactness) + ".shp" # str() is used to concatenate integers as part of the string
    #print(f"shapefile label: {name_output}")

    full_outputpath = os.path.join(export_folder, name_output)
    #print(f"full output path: {full_outputpath}")

    # intersect and merge all single polygons
    input_list = list(filter(lambda mpoly: mpoly.endswith('.shp'), os.listdir(aoi_poly)))

    ###### apply my function here ######

    # export
    filetoexport.to_file(full_outputpath)
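
The "apply my function here" step is not shown in the answer. One way it could look is sketched below, assuming the goal from the question: combine only the shapefiles of one aoi folder at a time. The helper `combine_folder` and its `load` and `combine` parameters are hypothetical names; the point of the sketch is that the list of layers is rebuilt for every folder, so files from aoi1 never leak into aoi2, which was the problem with the shared `input_list` in the question's update.

```python
from functools import reduce
from pathlib import Path


def combine_folder(aoi_dir, load, combine):
    """Load every .shp in one aoi folder and fold them into a single result.

    Hypothetical helper: `load` turns a path into a layer and `combine`
    merges two layers into one. Returns None if the folder has no .shp.
    """
    shp_files = sorted(Path(aoi_dir).glob("*.shp"))
    if not shp_files:
        return None
    # A fresh list per folder: nothing is carried over between aoi folders.
    layers = [load(path) for path in shp_files]
    return reduce(combine, layers)


# Assumed usage with geopandas inside the loop of the answer above:
#
#   import geopandas as gpd
#   filetoexport = combine_folder(
#       aoi_poly,
#       gpd.read_file,
#       lambda a, b: gpd.overlay(a, b, how="intersection"),
#   )
```

The export folder may also need to be created before writing, for example with `os.makedirs(export_folder, exist_ok=True)` ahead of the `to_file` call.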