Question

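I'm trying to understand Python's tokenize module. I'd like to call the tokenize.tokenize method on a given Python source file (like the one below) and get its tokenized output as the 5-tuples mentioned in the documentation.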

# Python source file
import os

class Test():
    """
    This class holds latitude, longitude, depth and magnitude data.
    """

    def __init__(self, latitude, longitude, depth, magnitude):
        self.latitude = latitude
        self.longitude = longitude
        self.depth = depth
        self.magnitude = magnitude

    def __str__(self):
        # -1 is for detection of missing data
        depth = self.depth
        if depth == -1:
            depth = 'unknown'

        magnitude = self.magnitude
        if magnitude == -1:
            magnitude = 'unknown'

        return "M{0}, {1} km, lat {2}\N{DEGREE SIGN} lon {3}\N{DEGREE SIGN}".format(magnitude, depth, self.latitude, self.longitude)

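Unfortunately, I haven't used Python enough to get this working, and the example in the documentation isn't clear enough for me. I also couldn't find any useful example code online.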

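Any simple, easy-to-follow code example would be much appreciated. If you know of useful online material with examples or explanations of the tokenize module and its methods, that would be great too.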

Answer 1

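tokenize.tokenize is a generator that yields one 5-tuple for each token in the source. Each tuple is a named tuple, so its fields (type, string, start, end, line) can be read by name.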

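import tokenize

# Open in binary mode: tokenize.tokenize expects a readline callable that returns bytes.
with open('/path/to/src.py', 'rb') as f:
    for five_tuple in tokenize.tokenize(f.readline):
        print(five_tuple.type)    # numeric token type
        print(five_tuple.string)  # token text
        print(five_tuple.start)   # (row, column) where the token starts
        print(five_tuple.end)     # (row, column) where the token ends
        print(five_tuple.line)    # the source line containing the token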
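
If you want to experiment without a file on disk, here is a minimal sketch that tokenizes a string held in memory and prints readable token names instead of numeric types. The helper name list_tokens and the sample string are made up for illustration; tokenize.tokenize, tokenize.tok_name and io.BytesIO are from the standard library.

```python
import io
import tokenize


def list_tokens(source):
    """Return (type name, string, start, end) for each token in `source`.

    `list_tokens` is a hypothetical helper, not part of the tokenize module.
    """
    # tokenize.tokenize needs a readline callable that returns bytes,
    # so wrap the encoded source in a BytesIO object.
    readline = io.BytesIO(source.encode("utf-8")).readline
    return [
        # tok_name maps the numeric token type to a name such as NAME or NEWLINE.
        (tokenize.tok_name[tok.type], tok.string, tok.start, tok.end)
        for tok in tokenize.tokenize(readline)
    ]


sample = "import os\n"  # assumed sample input
for name, string, start, end in list_tokens(sample):
    print(f"{name:<10} {string!r:<10} {start} -> {end}")
```

The first token is always ENCODING, which reports the detected source encoding rather than anything written in the file, and the last one is ENDMARKER. If your source is already a str rather than bytes, tokenize.generate_tokens takes a readline that returns strings and skips the ENCODING token.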
