Question

What I want to do is use pprint(dict(str_types))

but print only 10 lines instead of the whole list.

Here is my code:

import pprint
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

str_type_re = re.compile(r'\b\S+\.?$', re.IGNORECASE)

expected = ["Street", "Avenue", "Boulevard", "Drive", "Court", "Place", "Square", "Lane", "Road", 
            "Trail", "Parkway", "Commons"]

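def audit_str_type(str_types, str_name, rex):
    # Search for the trailing token of the name, which is treated as the street type.
    stn = rex.search(str_name)
    if stn:
        str_type = stn.group()
        # Group street names under any type that is not in the expected list.
        if str_type not in expected:
            str_types[str_type].add(str_name)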

I defined a function that audits tag elements where k="addr:street", that is, any tag element matched by the is_str_name function.

def audit(osmfile, rex):
    str_types = defaultdict(set)
    # Open the OSM file in a context manager so it is closed after parsing.
    with open(osmfile, "r", encoding="utf8") as osm_file:
        for event, elem in ET.iterparse(osm_file, events=("start",)):
            if elem.tag == "node" or elem.tag == "way":
                for tag in elem.iter("tag"):
                    if is_str_name(tag):
                        audit_str_type(str_types, tag.attrib['v'], rex)

    return str_types

In the code above, I use the is_str_name function to filter tags when calling the audit function to audit street names.

def is_str_name(elem):
    return (elem.attrib['k'] == "addr:street")

str_types = audit(mydata, rex = str_type_re)
pprint.pprint(dict(str_types[:10]))

Answer 1

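Use pprint.pformat to get the object's string representation back instead of printing it directly; you can then split it into lines and print only the first few: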

whole_repr = pprint.pformat(dict(str_types))

for line in whole_repr.splitlines()[:10]:
    print(line)
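As a minimal sketch of the same idea on data shaped like the question's str_types: the helper name pformat_head and the sample street names below are invented for illustration and are not part of pprint or of the original code.

```python
import pprint
from collections import defaultdict


def pformat_head(obj, max_lines=10):
    """Return only the first max_lines lines of pprint's output for obj.

    Hypothetical helper, not part of the pprint module.
    """
    return "\n".join(pprint.pformat(obj).splitlines()[:max_lines])


# Invented sample data in the same shape as str_types (a defaultdict of sets).
str_types = defaultdict(set)
for i in range(50):
    str_types["Type{:02d}".format(i)].add("{} Main Type{:02d}".format(i, i))

print(pformat_head(dict(str_types)))
```

This limits output lines, not dictionary entries: a value too long for one line is wrapped by pprint, so 10 lines may cover fewer than 10 keys.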

Note that since you didn't provide an MCVE, I couldn't test this against your data, but I did verify the approach with a simpler example:

>>> import pprint
>>> thing = pprint.pformat({i:str(i) for i in range(10000)})
>>> type(thing), len(thing)
(<class 'str'>, 147779)
>>> for line in thing.splitlines()[:10]:
...     print(line)
...
{0: '0',
 1: '1',
 2: '2',
 3: '3',
 4: '4',
 5: '5',
 6: '6',
 7: '7',
 8: '8',
 9: '9',
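If the goal is the first 10 entries rather than the first 10 lines of output, one alternative is to cut the dictionary down with itertools.islice before printing, which also avoids formatting the whole object. This is a different technique from the pformat approach, offered only as a sketch; the variable names are illustrative.

```python
import itertools
import pprint

thing = {i: str(i) for i in range(10000)}

# A dict cannot be sliced directly, so take the first 10 (key, value)
# pairs from its items view instead. Order is insertion order (Python 3.7+).
first_ten = dict(itertools.islice(thing.items(), 10))

pprint.pprint(first_ten)
```

Because only 10 items are ever formatted, this stays cheap even for a very large dictionary, whereas pformat builds the full string first.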
