Extracting the text between two braces in a nested structure with Python

Time: 2019-01-20 13:29:49

Tags: python

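As a first step toward a "mission generator" for a flight simulator, I want to extract pieces of a mission template so that I can change or delete some of them, put them back together, and generate a new "mission" file. My Python skills are minimal. I don't need a working solution, but I would appreciate a direction for further research. Here is the challenge:

This is a (simplified) sample of the input file: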

test_str = ("Group\n"
    "{\n"
    "   Name = \"Group 1\";\n"
    "   Index = 2;\n"
    "   Desc = \"Description\";\n"
    "   Block\n"
    "   {\n"
    "       Name = \"Block 1\";\n"
    "       Index = 497;\n"
    "       XPos = 171568.472;\n"
    "       YPos = 0.000;\n"
    "       ZPos = 204878.718;\n"
    "   }\n"
    "\n"
    "   Block\n"
    "   {\n"
    "       Name = \"Block 2\";\n"
    "       Index = 321;\n"
    "       XPos = 162268.472;\n"
    "       YPos = 0.000;\n"
    "       ZPos = 203478.718;\n"
    "   }\n"
    "\n"
    "}\n"
    "\n"
    "Group\n"
    "{\n"
    "   Name = \"Group 2\";\n"
    "   Index = 5;\n"
    "   Desc = \"Description\";\n"
    "   Block\n"
    "   {\n"
    "       Name = \"Block 3\";\n"
    "       Index = 112;\n"
    "       XPos = 122268.472;\n"
    "       YPos = 0.000;\n"
    "       ZPos = 208878.718;\n"
    "   }\n"
    "\n"
    "   Block\n"
    "   {\n"
    "       Name = \"Block 4\";\n"
    "       Index = 214;\n"
    "       XPos = 159868.472;\n"
    "       YPos = 0.000;\n"
    "       ZPos = 202678.718;\n"
    "   }\n"
    "\n"
    "}\n")

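As you can see, the file contains a number of objects ("Blocks") that can be grouped. The structure is nested, because groups can themselves be grouped (not shown here). How can I isolate a group based on its name?

So, suppose I want only "Group 2" in the output file. The result I would like to get is: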

Group
{
   Name = "Group 2";
   Index = 5;
   Desc = "Description";
   Block
   {
       Name = "Block 3";
       Index = 112;
       XPos = 122268.472;
       YPos = 0.000;
       ZPos = 208878.718;
   }

   Block
   {
       Name = "Block 4";
       Index = 214;
       XPos = 159868.472;
       YPos = 0.000;
       ZPos = 202678.718;
   }

}

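The same question applies to a given block within a group.

2 answers:

Answer 0: (score: 0)

I'm also new to Python, but I'll try to suggest a solution. I copied your test_str = ... into an input.txt file, loaded it in Python, and used the read() method to recreate the string. I believe what you are looking for is the find() method, which returns the index at which the substring you search for starts (here, the group and block names), or -1 if it is not found. Once you have found the group or block you want, you can use string slicing, as I do in the line blockData = allData[iWantThisBlock:nextBlock], to store just part of the string in a new variable. The code below prints the part of your string that runs from the text Block 1 up to the text Block 2, which covers the fields of Block 1. You can use the same approach to get groups, other blocks, or parameters out of the string. I really hope this helps at least a little :)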

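fileDir = 'D:\\input.txt'

# Read the whole file into one string; "with" closes the file afterwards.
with open(fileDir, 'r') as inputFile:
    allData = inputFile.read()

# find() returns the index where the substring starts, or -1 if it is absent.
iWantThisBlock = allData.find('Block 1')
nextBlock = allData.find('Block 2')

# Keep everything from the first match up to (not including) the second.
blockData = allData[iWantThisBlock:nextBlock]

print(blockData)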
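
One limitation of slicing between two names is that the slice starts in the middle of a Name line and needs the name of the following block. The sketch below is one possible refinement under the assumption that the wanted block is flat (contains no nested braces): it walks back from the name to the keyword that opens the block with rfind() and forward to the first closing brace with find(). The function name extract_block, its keyword parameter, and the shortened sample string are illustrative and not part of the answer above.

```python
def extract_block(text, name, keyword="Block"):
    """Return the flat (non-nested) block whose Name equals `name`, or None."""
    name_pos = text.find('Name = "' + name + '"')
    if name_pos == -1:
        return None  # no such name in the text
    # Walk back to the last keyword before the Name line ...
    start = text.rfind(keyword, 0, name_pos)
    # ... and forward to the first closing brace after it.
    end = text.find('}', name_pos)
    if start == -1 or end == -1:
        return None
    return text[start:end + 1]


# Shortened, hypothetical sample in the same layout as the question's file.
sample = (
    'Group\n{\n'
    '   Name = "Group 1";\n'
    '   Block\n   {\n       Name = "Block 1";\n       Index = 497;\n   }\n'
    '\n'
    '   Block\n   {\n       Name = "Block 2";\n       Index = 321;\n   }\n'
    '}\n'
)

block_1 = extract_block(sample, "Block 1")
print(block_1)
```

Because it stops at the first closing brace, this only suits leaf blocks; a group that contains blocks needs brace counting instead.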

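Answer 1: (score: 0)

I found the following approach to work: it returns the correct group based on its name. It is not very general, but it does the job for now :). Basically, it counts opening and closing braces until it reaches the closing brace that matches the first opening brace (from: https://stackoverflow.com/a/2780461/10940433):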

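def findGroup(mission, name):
    # Locate the Name line of the wanted group; the closing quote keeps
    # "Group 2" from also matching "Group 20".
    start_group = mission.find('   Name = "' + name + '"')
    if start_group == -1:
        return None  # no group with that name
    # Re-attach the header that precedes the Name line.
    mission = "Group\n{\n" + mission[start_group:]
    match = mission.split('{', 1)[1]
    depth = 1  # the group's own opening brace is already counted
    for index, char in enumerate(match):
        if char == '{':
            depth += 1
        elif char == '}':
            depth -= 1
        if depth == 0:
            # index points at the brace that closes the group
            return "Group\n{" + match[:index] + "}\n"
    return None  # braces are unbalanced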

test_str = ("Group\n"
    "{\n"
    "   Name = \"Group 1\";\n"
    "   Index = 2;\n"
    "   Desc = \"Description\";\n"
    "   Block\n"
    "   {\n"
    "       Name = \"Block 1\";\n"
    "       Index = 497;\n"
    "       XPos = 171568.472;\n"
    "       YPos = 0.000;\n"
    "       ZPos = 204878.718;\n"
    "   }\n"
    "\n"
    "   Block\n"
    "   {\n"
    "       Name = \"Block 2\";\n"
    "       Index = 321;\n"
    "       XPos = 162268.472;\n"
    "       YPos = 0.000;\n"
    "       ZPos = 203478.718;\n"
    "   }\n"
    "\n"
    "}\n"
    "\n"
    "Group\n"
    "{\n"
    "   Name = \"Group 2\";\n"
    "   Index = 5;\n"
    "   Desc = \"Description\";\n"
    "   Block\n"
    "   {\n"
    "       Name = \"Block 3\";\n"
    "       Index = 112;\n"
    "       XPos = 122268.472;\n"
    "       YPos = 0.000;\n"
    "       ZPos = 208878.718;\n"
    "   }\n"
    "\n"
    "   Block\n"
    "   {\n"
    "       Name = \"Block 4\";\n"
    "       Index = 214;\n"
    "       XPos = 159868.472;\n"
    "       YPos = 0.000;\n"
    "       ZPos = 202678.718;\n"
    "   }\n"
    "\n"
    "}\n")


print(findGroup(test_str, "Group 2"))
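
The same brace-counting idea can be made a little more general. The sketch below is only an illustration: it assumes that braces never appear inside quoted values and that each opening brace sits on its own line directly below its keyword, as in the sample. From the Name line it scans backwards to the enclosing opening brace and forwards to the matching closing brace, so it should handle groups, blocks, and groups nested in groups alike. The name extract_section and the shortened sample are hypothetical.

```python
def extract_section(text, name):
    """Return the brace-delimited section whose Name is `name`, or None."""
    name_pos = text.find('Name = "' + name + '"')
    if name_pos == -1:
        return None

    # Scan backwards for the '{' that encloses the Name line,
    # skipping over any complete {...} pairs on the way.
    depth = 0
    open_pos = None
    for i in range(name_pos, -1, -1):
        if text[i] == '}':
            depth += 1
        elif text[i] == '{':
            if depth == 0:
                open_pos = i
                break
            depth -= 1
    if open_pos is None:
        return None

    # Scan forwards for the '}' that matches that opening brace.
    depth = 0
    close_pos = None
    for i in range(open_pos, len(text)):
        if text[i] == '{':
            depth += 1
        elif text[i] == '}':
            depth -= 1
            if depth == 0:
                close_pos = i
                break
    if close_pos is None:
        return None

    # Include the keyword line ("Group" or "Block") just above the '{'.
    line_end = text.rfind('\n', 0, open_pos)
    header_start = text.rfind('\n', 0, line_end) + 1 if line_end != -1 else 0
    return text[header_start:close_pos + 1]


# Shortened, hypothetical sample in the same layout as test_str.
sample = (
    'Group\n{\n'
    '   Name = "Group 1";\n'
    '   Block\n   {\n       Name = "Block 1";\n   }\n'
    '}\n'
    '\n'
    'Group\n{\n'
    '   Name = "Group 2";\n'
    '   Block\n   {\n       Name = "Block 3";\n   }\n'
    '}\n'
)

group_2 = extract_section(sample, "Group 2")
block_3 = extract_section(sample, "Block 3")
print(group_2)
print(block_3)
```

For anything beyond simple extraction, such as editing values and writing the file back, parsing the text into nested dictionaries or using a parsing library may be a sturdier direction than string searching.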