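I am trying to merge several lists into a single sorted list that contains the numbers and names of sections, subsections, and subsubsections. The program looks like this: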
import heapq
sections = ['1. Section', '2. Section', '3. Section', '4. Section', '5. Section', '6. Section', '7. Section', '8. Section', '9. Section', '10. Section', '11. Section', '12. Section']
subsections = ['1.1 Subsection', '1.2 Subsection', '1.3 Subsection', '1.4 Subsection', '2.1 Subsection', '4.1 My subsection', '7.1 Subsection', '8.1 Subsection', '12.1 Subsection']
subsubsections = ['1.2.1 Subsubsection', '1.2.2 Subsubsection', '1.4.1 Subsubsection', '2.1.1 Subsubsection', '7.1.1 Subsubsection', '8.1.1 Subsubsection', '12.1.1 Subsubsection']
sorted_list = list(heapq.merge(sections, subsections, subsubsections))
print(sorted_list)
What I get is:
['1. Section', '1.1 Subsection', '1.2 Subsection', '1.2.1 Subsubsection', '1.2.2 Subsubsection', '1.3 Subsection', '1.4 Subsection', '1.4.1 Subsubsection', '2. Section', '2.1 Subsection', '2.1.1 Subsubsection', '3. Section', '4. Section', '4.1 My subsection', '5. Section', '6. Section', '7. Section', '7.1 Subsection', '7.1.1 Subsubsection', '8. Section', '8.1 Subsection', '12.1 Subsection', '8.1.1 Subsubsection', '12.1.1 Subsubsection', '9. Section', '10. Section', '11. Section', '12. Section']
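The subsection and subsubsection of section 12 end up inside section 8 rather than under section 12.
Why does this happen? The original lists are sorted, and everything goes well, apparently up to number 10.
I am not sure why this is happening. Is there a better way to sort this into a "tree" based on the numbers in the list? I am building a table of contents of sorts, and this returns (once I filter the list out):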
1. Section
1.1 Subsection
1.2 Subsection
1.2.1 Subsubsection
1.2.2 Subsubsection
1.3 Subsection
1.4 Subsection
1.4.1 Subsubsection
2. Section
2.1 Subsection
2.1.1 Subsubsection
3. Section
4. Section
4.1 My subsection
5. Section
6. Section
7. Section
7.1 Subsection
7.1.1 Subsubsection
8. Section
8.1 Subsection
12.1 Subsection
8.1.1 Subsubsection
12.1.1 Subsubsection
9. Section
10. Section
11. Section
12. Section
Notice 12.1 Subsection right behind 8.1 Subsection, and 12.1.1 Subsubsection right behind 8.1.1 Subsubsection.
Answer 0 (score: 4)
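Your lists may appear sorted to the human eye. To Python, however, your inputs are not fully sorted, because strings are compared lexicographically. That means '12.1' comes before '8' in sorted order: the comparison is decided by the first differing character, and '1' sorts before '8'.
As such, the merge is completely correct. The string starting with '12.1' was encountered after the '8.1' string had been seen, and it compares lower than the string starting with '8.1.1', so it is emitted first.
You have to use a key function that extracts a sequence of integers from each string to sort correctly:
section = lambda s: [int(d) for d in s.partition(' ')[0].split('.') if d]
sorted_list = list(heapq.merge(sections, subsections, subsubsections, key=section))
Demo (using Python 3.6):
>>> section = lambda s: [int(d) for d in s.partition(' ')[0].split('.') if d]
>>> sorted_list = list(heapq.merge(sections, subsections, subsubsections, key=section))
>>> from pprint import pprint
>>> pprint(sorted_list)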
['1. Section',
'1.1 Subsection',
'1.2 Subsection',
'1.2.1 Subsubsection',
'1.2.2 Subsubsection',
'1.3 Subsection',
'1.4 Subsection',
'1.4.1 Subsubsection',
'2. Section',
'2.1 Subsection',
'2.1.1 Subsubsection',
'3. Section',
'4. Section',
'4.1 My subsection',
'5. Section',
'6. Section',
'7. Section',
'7.1 Subsection',
'7.1.1 Subsubsection',
'8. Section',
'8.1 Subsection',
'8.1.1 Subsubsection',
'9. Section',
'10. Section',
'11. Section',
'12. Section',
'12.1 Subsection',
'12.1.1 Subsubsection']
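To see the difference between the two orderings in isolation, here is a minimal sketch that compares one pair of entries both ways. It reuses the `section` key from above, written as a named function; the variable names `a` and `b` are only for illustration.

```python
def section(s):
    """Turn a heading such as '12.1 Subsection' into [12, 1]."""
    return [int(d) for d in s.partition(' ')[0].split('.') if d]

a = '12.1 Subsection'
b = '8.1.1 Subsubsection'

# Plain string comparison looks at characters: '1' sorts before '8'.
print(a < b)                    # True

# The key compares whole integers: 12 is greater than 8.
print(section(a), section(b))   # [12, 1] [8, 1, 1]
print(section(a) < section(b))  # False
```

Lists of integers compare element by element, which is why a parent such as [8, 1] sorts before its child [8, 1, 1].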
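Note that the key argument is only available in Python 3.5 and later; in earlier versions you have to do a manual decorate-merge-undecorate dance.
The keyed merge is easily backported to Python 3.3 and 3.4: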
import heapq
# Backport of the keyed merge. It relies on private heapq helpers
# (_siftup_max, _heapify_max), so it is tied to CPython's implementation.
def _heappop_max(heap):
    lastelt = heap.pop()
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        heapq._siftup_max(heap, 0)
        return returnitem
    return lastelt
def _heapreplace_max(heap, item):
    returnitem = heap[0]
    heap[0] = item
    heapq._siftup_max(heap, 0)
    return returnitem
def merge(*iterables, key=None, reverse=False):
    h = []
    h_append = h.append
    if reverse:
        _heapify = heapq._heapify_max
        _heappop = _heappop_max
        _heapreplace = _heapreplace_max
        direction = -1
    else:
        _heapify = heapq.heapify
        _heappop = heapq.heappop
        _heapreplace = heapq.heapreplace
        direction = 1
    if key is None:
        for order, it in enumerate(map(iter, iterables)):
            try:
                next = it.__next__
                h_append([next(), order * direction, next])
            except StopIteration:
                pass
        _heapify(h)
        while len(h) > 1:
            try:
                while True:
                    value, order, next = s = h[0]
                    yield value
                    s[0] = next()       # raises StopIteration when exhausted
                    _heapreplace(h, s)  # restore heap condition
            except StopIteration:
                _heappop(h)             # remove empty iterator
        if h:
            # fast case when only a single iterator remains
            value, order, next = h[0]
            yield value
            yield from next.__self__
        return
    # keyed variant: each heap entry is [key(value), order, value, next]
    for order, it in enumerate(map(iter, iterables)):
        try:
            next = it.__next__
            value = next()
            h_append([key(value), order * direction, value, next])
        except StopIteration:
            pass
    _heapify(h)
    while len(h) > 1:
        try:
            while True:
                key_value, order, value, next = s = h[0]
                yield value
                value = next()
                s[0] = key(value)
                s[2] = value
                _heapreplace(h, s)
        except StopIteration:
            _heappop(h)
    if h:
        key_value, order, value, next = h[0]
        yield value
        yield from next.__self__
On Python versions before 3.5, a decorate-merge-undecorate merge is as simple as:
def decorate(iterable, key):
    for elem in iterable:
        yield key(elem), elem
sorted_list = [v for k, v in heapq.merge(
    decorate(sections, section), decorate(subsections, section),
    decorate(subsubsections, section))]
Because your inputs are already sorted, using a merge is more efficient. As a last resort, you could just use sorted(), however:
from itertools import chain
result = sorted(chain(sections, subsections, subsubsections), key=section)
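A merge only gives a correct result when every input is already in order under the same key that the merge uses. The following is a rough sketch of how one might check that before choosing between a merge and a full sort; `is_sorted_by` is a hypothetical helper, not part of `heapq` or the standard library, and the short `sections` list is sample data.

```python
def section(s):
    """Turn a heading such as '10. Section' into [10]."""
    return [int(d) for d in s.partition(' ')[0].split('.') if d]

def is_sorted_by(items, key):
    """Return True if items are in non-decreasing order under key."""
    keys = [key(item) for item in items]
    return all(a <= b for a, b in zip(keys, keys[1:]))

sections = ['8. Section', '9. Section', '10. Section']

print(is_sorted_by(sections, key=section))  # True: 8 <= 9 <= 10 as integers
print(is_sorted_by(sections, key=str))      # False: '9. ...' > '10. ...' as strings
```

If the check fails for any input under the chosen key, sorting the chained inputs is the safer option.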
Answer 1 (score: 4)
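As explained in the other answer, you have to specify a sort key, otherwise Python will sort the strings lexicographically. If you are using Python 3.5+ you can use the key argument of the merge function. On versions before 3.5 you can use itertools.chain and sorted. As a general approach, you can use a regular expression to find the numbers and convert them to int: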
In [18]: from itertools import chain
In [19]: import re
In [23]: sorted(chain.from_iterable((sections, subsections, subsubsections)),
key = lambda x: [int(i) for i in re.findall(r'\d+', x)])
Out[23]:
['1. Section',
'1.1 Subsection',
'1.2 Subsection',
'1.2.1 Subsubsection',
'1.2.2 Subsubsection',
'1.3 Subsection',
'1.4 Subsection',
'1.4.1 Subsubsection',
'2. Section',
'2.1 Subsection',
'2.1.1 Subsubsection',
'3. Section',
'4. Section',
'4.1 My subsection',
'5. Section',
'6. Section',
'7. Section',
'7.1 Subsection',
'7.1.1 Subsubsection',
'8. Section',
'8.1 Subsection',
'8.1.1 Subsubsection',
'9. Section',
'10. Section',
'11. Section',
'12. Section',
'12.1 Subsection',
'12.1.1 Subsubsection']
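Since the question also asks about turning the result into a "tree" for a table of contents, here is one possible sketch that sorts the entries numerically and indents each one by its nesting depth. The names `number_key` and `render_toc` are hypothetical, and the sketch assumes every entry starts with its dotted number followed by a space.

```python
import re

def number_key(entry):
    """Turn '1.2.1 Subsubsection' into [1, 2, 1], using only the leading number."""
    return [int(n) for n in re.findall(r'\d+', entry.partition(' ')[0])]

def render_toc(entries, indent='    '):
    """Sort entries numerically and indent each one by its depth."""
    lines = []
    for entry in sorted(entries, key=number_key):
        depth = len(number_key(entry)) - 1  # '8.' -> 0, '8.1' -> 1, '8.1.1' -> 2
        lines.append(indent * depth + entry)
    return '\n'.join(lines)

entries = ['12.1 Subsection', '8. Section', '12. Section',
           '8.1.1 Subsubsection', '8.1 Subsection']
print(render_toc(entries))
```

Restricting the regular expression to the part before the first space is a deliberate choice here: it keeps digits that happen to appear in a title from leaking into the sort key.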