Question

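I need to assemble a long text string from XML fields.

XML_FIELD_ONE = "Iamacatthatisoddlyimmunetocatnip"

XML_FIELD_TWO = [7,8,24]

XML_FIELD_TWO contains the indices at which to insert \n or \r\n. If two indices are 1 apart (such as 7, 8), I need to insert \r\n. If an index stands alone (such as 24), I need to insert \n.

Processing a 25K-line file with this code takes about 2 minutes. What am I doing wrong?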

XML_FIELD_ONE = list("Iamacatthatisoddlyimmunetocatnip")
XML_FIELD_TWO = [7,8,24]

idx = 0
while idx <= len(XML_FIELD_ONE):
    for position in XML_FIELD_ONE:
        for space in XML_FIELD_TWO:

            if idx == int(space) and idx+1 == int(space)+1:
                XML_FIELD_ONE[idx] = "\r"

                try:
                    XML_FIELD_ONE[idx+1] = "\n"
                except:
                    pass

            elif idx == int(space):
                XML_FIELD_ONE[idx] = "\n"

    idx += 1


new_text = "".join(XML_FIELD_ONE)
return new_text

The simple way to do this would be:

for offset in XML_FIELD_TWO:
    XML_FIELD_ONE[offset] = "\n"

But this violates the rule "if two offsets are adjacent, the first is \r and the next is \n".

Answer 1

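You wrote a triple loop when you only need one; that is very inefficient. You know exactly where the new items go: go straight there rather than incrementing two counters to find the position.

I'm not sure of the exact positions where you need to insert, but this should be close. To keep the original indices correct, you need to insert from the right end and work toward the left; that's why I reversed XML_FIELD_TWO.

I left my debugging print statements in.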

XML_FIELD_ONE = list("Iamacatthatisoddlyimmunetocatnip")
XML_FIELD_TWO = [7,8,24]

print(XML_FIELD_ONE)
# Work from the right so earlier insertions do not shift later indices.
XML_FIELD_TWO = XML_FIELD_TWO[::-1]
print(XML_FIELD_TWO)
i = 0
while i < len(XML_FIELD_TWO):
    print(i, XML_FIELD_TWO[i])
    # Adjacent indices form a pair; the bounds check avoids an IndexError on the last element.
    if i + 1 < len(XML_FIELD_TWO) and XML_FIELD_TWO[i] - XML_FIELD_TWO[i+1] == 1:
        XML_FIELD_ONE.insert(XML_FIELD_TWO[i], '\r\n')
        i += 2
    else:
        XML_FIELD_ONE.insert(XML_FIELD_TWO[i], '\n')
        i += 1

    print("\n" + ''.join(XML_FIELD_ONE))

Output:

['I', 'a', 'm', 'a', 'c', 'a', 't', 't', 'h', 'a', 't', 'i', 's', 'o', 'd', 'd', 'l', 'y', 'i', 'm', 'm', 'u', 'n', 'e', 't', 'o', 'c', 'a', 't', 'n', 'i', 'p']
[24, 8, 7]
0 24

Iamacatthatisoddlyimmune
tocatnip
1 8

Iamacatt
hatisoddlyimmune
tocatnip
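
For large inputs, repeated list.insert calls each shift the tail of the list, so a single pass that slices the string and joins the pieces may scale better. The following is only a sketch under assumptions: the function name insert_line_breaks is hypothetical, the offsets are assumed to be valid indices into the text, and a pair of adjacent offsets is assumed to place "\r\n" at the higher offset, which matches the output shown above rather than any confirmed requirement.

```python
def insert_line_breaks(text, offsets):
    """Return text with "\\r\\n" inserted at adjacent offset pairs and "\\n" at lone offsets."""
    ordered = sorted(offsets)
    pieces = []
    prev = 0  # start of the slice that has not been copied yet
    i = 0
    while i < len(ordered):
        # Two offsets that differ by 1 are treated as one "\r\n" break (assumption:
        # the break goes at the higher offset of the pair).
        if i + 1 < len(ordered) and ordered[i + 1] - ordered[i] == 1:
            pos, sep, step = ordered[i + 1], "\r\n", 2
        else:
            pos, sep, step = ordered[i], "\n", 1
        pieces.append(text[prev:pos])
        pieces.append(sep)
        prev = pos
        i += step
    pieces.append(text[prev:])  # remainder after the last break
    return "".join(pieces)


result = insert_line_breaks("Iamacatthatisoddlyimmunetocatnip", [7, 8, 24])
print(repr(result))  # 'Iamacatt\r\nhatisoddlyimmune\ntocatnip'
```

Because the pieces are sliced from the untouched original string, the offsets never need adjusting and no reversal is required.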
