Question

I asked a similar question before, but I am not good at this, so I am asking again.

Here is an example textfile.txt:

    dummy01234567890
    0987654321dummy 
    -------start-------(It is possible to modify)
    text line1
    text line2
    -------end---------(It is possible to modify)
    12345678910
    qwertyuiop        
    -------start-------(It is possible to modify)
    text line3
    text line4
    -------end---------(It is possible to modify)
    ;p12309809128309123
    dummyline1235567

I want to parse it into:

"text line1\ntext line2" → array[0]

"text line3\ntext line4" → array[1]

How should I code this in Python?

Should I use the split function twice?

Answer 1

A finite-state machine is flexible enough to cover most needs.

state = 'init'
arrays = []
with open('textfile.txt') as f:
    lines = []
    for line in f:  # iterate lazily instead of loading every line with readlines()
        # Drop surrounding whitespace and the dashes around a marker.
        # This assumes a marker line holds only dashes and the word itself.
        word = line.strip().strip('-')
        if state == 'init':  # look for a start marker
            if word != 'start':
                continue
            state = 'start'
            lines = []
        elif state == 'start':  # inside a block: collect lines
            if word != 'end':
                lines.append(line.strip())
                continue
            # end marker reached: store the finished block
            arrays.append('\n'.join(lines))
            state = 'init'
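
The same state machine can be wrapped in a function so it is easy to try on the sample data. This is only a sketch: the name `parse_blocks` is made up for illustration, and it assumes a marker is any line that begins with dashes followed by `start` or `end`, so text after the closing dashes is ignored.

```python
def parse_blocks(lines):
    """Collect the text between start/end marker lines, one string per block."""
    blocks = []
    current = None  # None means we are outside a block
    for raw in lines:
        line = raw.strip()
        # Assumption: a marker is a dashed line such as "-------start-------".
        marker = line.lstrip('-') if line.startswith('-') else ''
        if current is None:
            if marker.startswith('start'):
                current = []  # enter a new block
        elif marker.startswith('end'):
            blocks.append('\n'.join(current))
            current = None  # leave the block
        else:
            current.append(line)
    return blocks


sample = """dummy01234567890
-------start-------
text line1
text line2
-------end---------
12345678910
-------start-------
text line3
text line4
-------end---------
dummyline1235567"""

result = parse_blocks(sample.splitlines())
print(result)  # ['text line1\ntext line2', 'text line3\ntext line4']
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, for example `parse_blocks(f)` inside a `with open('textfile.txt') as f:` block.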

Answer 2

You can do something like this to get the desired result:

text = """dummy01234567890
    0987654321dummy 
    -------start-------(It is possible to modify)
    text line1
    text line2
    -------end---------(It is possible to modify)
    12345678910
    qwertyuiop        
    -------start-------(It is possible to modify)
    text line3
    text line4
    -------end---------(It is possible to modify)
    ;p12309809128309123
    dummyline1235567"""

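text_list = text.splitlines()
# Relies on the fixed layout of the sample: each block repeats every 6 lines,
# with its two text lines at offsets 3 and 4.
print(['\n'.join([text_list[3 + i * 6].strip(), text_list[4 + i * 6].strip()])
       for i in range(len(text_list) // 6)])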
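
The fixed offsets above only work when every block sits at the same position and has the same length. As an alternative technique (not the one used in this answer), a regular expression can locate the markers wherever they appear. The sketch below is a minimal example under stated assumptions: `BLOCK_RE` and `extract_blocks` are hypothetical names, and markers are assumed to be dashed lines containing `start` or `end`, optionally followed by other text.

```python
import re

# Assumed marker shape: optional indentation, dashes, the keyword, dashes.
BLOCK_RE = re.compile(
    r'^[ \t]*-+start-+[^\n]*\n'  # start marker line, trailing text allowed
    r'(.*?)'                     # block body, captured non-greedily
    r'^[ \t]*-+end-+',           # end marker line
    re.MULTILINE | re.DOTALL,
)


def extract_blocks(text):
    """Return the stripped body of every start/end block found in text."""
    blocks = []
    for body in BLOCK_RE.findall(text):
        lines = [line.strip() for line in body.splitlines()]
        blocks.append('\n'.join(lines))
    return blocks


sample_text = """dummy01234567890
    -------start-------(It is possible to modify)
    text line1
    text line2
    -------end---------(It is possible to modify)
    12345678910
    -------start-------(It is possible to modify)
    text line3
    text line4
    text line5
    -------end---------(It is possible to modify)
    dummyline1235567"""

blocks = extract_blocks(sample_text)
print(blocks)
```

Note that the second block here has three lines, which the offset-based version could not handle.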

In a text file, how can I parse multiple lines matching a specific pattern using Python?

2 answers: