Python - Excluding tags from a list of different tags

Asked: 2017-09-18 03:04:12

Tags: python workflow

I have a Python script for Editorial on iOS that I have modified, and I would like help tweaking it further.

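My .taskpaper files live in the Dropbox folder that Editorial points to. When I run this workflow, the script searches every file and returns a list of the lines that contain "@hardware". That works well, but the final list also includes @hardware items that I have already completed and tagged with @done. How can I exclude the @hardware lines that also carry @done?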

The workflow runs seven files. These two seem to be the ones that need changing:

Generate hashtag list

import editor
import console
import os
import re
import sys
import codecs
import workflow

pattern = re.compile(r'\s@{1}(\w+)', re.I|re.U)
p = editor.get_path()
from urllib import quote
dir = os.path.split(p)[0]
valid_extensions = set(['.taskpaper'])

tags = ['@hardware']

for w in os.walk(dir):
    dir_path = w[0]
    filenames = w[2]
    for name in filenames:
        full_path = os.path.join(dir_path, name)
        ext = os.path.splitext(full_path)[1]
        if ext.lower() in valid_extensions:
            try:
                with codecs.open(full_path, 'r', 'utf-8') as f:
                    for line in f:
                        for match in re.finditer(pattern, line):
                            tags.append(match.group(1))


            except UnicodeDecodeError, e:
                pass

workflow.set_output('\n'.join(sorted(set(tags))))

Search documents using hashtags

import editor
import console
import os
import re
import sys
import codecs
import workflow
from StringIO import StringIO

theme = editor.get_theme()
workflow.set_variable('CSS', workflow.get_variable('CSS Dark' if theme == 'Dark' else 'CSS Light'))

p = editor.get_path()
searchterm = workflow.get_variable('Search Term')
term = '@' + searchterm
pattern = re.compile(re.escape(term), flags=re.IGNORECASE)
from urllib import quote
dir = os.path.split(p)[0]
valid_extensions = set(['.taskpaper'])
html = StringIO()
match_count = 0
for w in os.walk(dir):
    dir_path = w[0]
    filenames = w[2]
    for name in filenames:
        full_path = os.path.join(dir_path, name)
        ext = os.path.splitext(full_path)[1]
        if ext.lower() not in valid_extensions:
            continue
        found_snippets = []
        i = 0
        try:
            with codecs.open(full_path, 'r', 'utf-8') as f:
                for line in f:
                    for match in re.finditer(pattern, line):
                        start = max(0, match.start(0) - 100)
                        end = min(len(line)-1, match.end(0) + 100)
                        snippet = (line[start:match.start(0)],
                                   match.group(0),
                                   line[match.end(0):end],
                                   match.start(0) + i,
                                   match.end(0) + i)
                        found_snippets.append(snippet)
                    i += len(line)
        except UnicodeDecodeError, e:
            pass
        if len(found_snippets) > 0:
            match_count += 1
            root, rel_path = editor.to_relative_path(full_path)
            ed_url = 'editorial://open/' + quote(rel_path.encode('utf-8')) + '?root=' + root
            html.write('<h2><a href="' + ed_url + '">' + name + '</a></h2>')
            for snippet in found_snippets:
                start = snippet[3]
                end = snippet[4]
                select_url = 'editorial://open/' + quote(rel_path.encode('utf-8')) + '?root=' + root
                select_url += '&selection=' + str(start) + '-' + str(end)
                html.write('<a class="result-box" href="' + select_url + '">' + snippet[0] + '<span class="highlight">' + snippet[1] + '</span>' + snippet[2] + '</a>')
if match_count == 0:
    html.write('<p>No matches found.</p>')

workflow.set_output(html.getvalue())

Thanks.

2 answers:

Answer 0 (score: 0)

Since the matching lines are stored in a list, you can use a list comprehension to exclude the ones you don't want. Something like this:

l = ['@hardware ttuff', 'stuff @hardware', 'things @hardware sett @done', '@hardware', '@hardware@ @done']
print(l)
# ['@hardware ttuff', 'stuff @hardware', 'things @hardware sett @done', '@hardware', '@hardware@ @done']

# Keep only the lines that do not contain '@done'
m = [s for s in l if '@done' not in s]
print(m)
# ['@hardware ttuff', 'stuff @hardware', '@hardware']
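
One caveat with a plain substring test is that '@done' also matches inside a longer tag. The following is only a sketch of a stricter variant, assuming tags are separated by whitespace and that a completed item may carry a value such as @done(2017-09-18). The names DONE_TAG and without_done, and the sample lines, are made up for illustration.

```python
import re

# Hypothetical pattern: "@done" as a whole tag, not as a prefix of a longer
# tag such as "@donezo". A trailing "(...)" value is still allowed.
DONE_TAG = re.compile(r'(?<!\S)@done(?![\w-])', re.IGNORECASE)

def without_done(lines):
    """Return only the lines that do not carry a whole @done tag."""
    return [s for s in lines if not DONE_TAG.search(s)]

lines = [
    '@hardware ttuff',
    'stuff @hardware',
    'things @hardware sett @done',
    '@hardware',
    'fix @hardware @done(2017-09-18)',
    'buy @hardware @donezo',
]
open_items = without_done(lines)
print(open_items)
```

With these sample lines, the two entries tagged @done are dropped, while the line tagged @donezo is kept because it is a different tag.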

Answer 1 (score: 0)

A friend solved this for me.

We added:

if not "@done" in line:

after this line in the "Search documents using hashtags" file:

for line in f:

Works great.
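
For readers who want to see where such a check might sit, here is a rough, self-contained sketch of the search loop with the filter in place. It is not the workflow's actual code: find_open_matches, its parameters, and the sample text are hypothetical, and an in-memory file stands in for the .taskpaper files so the snippet runs outside Editorial.

```python
import re
from io import StringIO

def find_open_matches(lines, term, done_tag='@done'):
    """Collect (text, start, end) for each match of `term`,
    skipping every line that contains `done_tag`."""
    pattern = re.compile(re.escape(term), flags=re.IGNORECASE)
    found = []
    offset = 0  # character offset of the current line within the file
    for line in lines:
        if done_tag not in line:
            for match in pattern.finditer(line):
                found.append((match.group(0),
                              match.start(0) + offset,
                              match.end(0) + offset))
        # Advance the offset for every line, including skipped ones,
        # so the selection ranges still point at the right characters.
        offset += len(line)
    return found

# Stand-in for an open .taskpaper file (sample content is invented).
sample = StringIO(u"buy cable @hardware\nfix fan @hardware @done\nnew disk @Hardware\n")
results = find_open_matches(sample, '@hardware')
print(results)
```

The point to watch is that the running offset is updated outside the if block. If the check were written with continue placed before the offset update, the selection ranges built from it would drift after the first skipped line.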