Question

In JavaScript there are several ways to inherit methods. Below is an example that mixes a few of them:

const A = {
    name: 'first',
    wiggle: function() { return this.name + " is wiggling" },
    shake: function() { return this.name + " is shaking" }
}

// B delegates to A through the prototype chain
const B = Object.create(A)
B.name = 'second'
B.bop = function() { return this.name + ' is bopping' }


// C returns an object whose prototype is B
const C = function(name) {
    const obj = Object.create(B)
    obj.name = name
    obj.crunk = function() { return this.name + ' is crunking'}

    return obj
}

const final = new C('third')

This gives me the following inheritance hierarchy.

[Image: diagram of the inheritance hierarchy]

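One important thing to note is the name property of each object. When a method runs, even one defined far up the prototype chain, the local context set by the this keyword ensures that the localmost property/variable is used.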

I recently moved to Python, but I can't work out how subclasses access superclass methods, or how variable scoping and object attributes work.

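I created a Spider in Scrapy that (quite successfully) crawled 2000+ pages on one domain and parsed them into the format I needed. Many of the helpers were just functions inside the main parse_response method, which I could use directly on the data. The original spider looked something like this: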

from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from spider_scrape.items import SpiderItems

class ScrapeSpider(CrawlSpider):

    name = "myspider"
    allowed_domains = ["domain.com.au"]
    start_urls = ['https://www.domain.com.au/']
    rules = (Rule(SgmlLinkExtractor(allow=()),
                  callback="parse_items",
                  follow=True),)

    def parse_items(self, response):
        ...

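The callback parse_items contains the logic that processes the response. When I generalized everything, I ended up with the following (the intention being to use it on multiple domains):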

# Base

from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule

class BaseSpider(CrawlSpider):
    """Base set of configuration rules and helper methods"""

    rules = (Rule(LinkExtractor(allow=()),
                  callback="parse_response",
                  follow=True),)

    def parse_response(self, response):
        ...

        def clean_urls(string):
            """Remove absolute URLs from hrefs; if the URL is from an external domain, do nothing."""
            for domain in allowed_domains:
                string = string.replace('http://' + domain, '')
                string = string.replace('https://' + domain, '')
            if 'http' not in string:
                string = "/custom/files" + string
            return string


#Specific for each domain I want to crawl
class DomainSpider(BaseSpider):

    name = 'Domain'
    allowed_domains = ['Domain.org.au']
    start_urls      = ['http://www.Domain.org.au/'
                      ,'http://www.Domain.org.au/1']

When I run this from the Scrapy command line, I get the following error in the console:

[Image: screenshot of the console error]

After some testing, changing the for loop to the following made it work: for domain in self.allowed_domains:
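
The reason the bare name fails can be reproduced without Scrapy. The following is a minimal sketch, assuming Python 3 and two hypothetical classes, Base and Child, that are not part of the original spider. Names assigned in a class body become attributes of the class, not variables visible inside its methods, so a method has to reach them through self.

```python
class Base:
    allowed_domains = []  # class attribute, not a variable in method scope

    def broken(self, url):
        # A bare name is looked up in local, enclosing-function, global and
        # built-in scopes only; the class body is not one of them.
        for domain in allowed_domains:  # raises NameError when called
            url = url.replace('http://' + domain, '')
        return url

    def working(self, url):
        # self.allowed_domains searches the instance, then its class,
        # then the base classes.
        for domain in self.allowed_domains:
            url = url.replace('http://' + domain, '')
        return url


class Child(Base):
    allowed_domains = ['example.org']  # hypothetical domain


child = Child()
print(child.working('http://example.org/page'))  # /page

try:
    child.broken('http://example.org/page')
except NameError as exc:
    print('NameError:', exc)
```

Note that working is defined on Base but sees the list defined on Child, because the lookup starts from the instance the method was called on.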

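All good, and this seems similar to the this keyword in JavaScript: I'm using an attribute of the object to get the value. There are many more variables/attributes that hold the XPath expressions needed for the scrape: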

class DomainSpider(BaseSpider):

    name = 'Domain'
    page_title      =      '//title'
    page_content    =      '//div[@class="main-content"]'

After changing other parts of the Spider to mimic the allowed_domains variable, I get this error:

[Image: screenshot of the error]

I tried setting the attributes in different ways, including using self.page_content and/or an __init__(self) constructor, with no success, only different errors.
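
The errors themselves are not shown, so the following is only a general sketch of one thing that often goes wrong when a subclass adds its own __init__: the parent's __init__ no longer runs unless it is called explicitly. It assumes Python 3, and the classes Base, Forgetful and Cooperative are hypothetical stand-ins, not Scrapy classes.

```python
class Base:
    def __init__(self, name='base'):
        self.name = name
        self.visited = []          # state that Base's methods rely on

    def visit(self, url):
        self.visited.append(url)
        return len(self.visited)


class Forgetful(Base):
    def __init__(self):
        # Base.__init__ is never called, so self.visited is never created.
        self.page_content = '//div'


class Cooperative(Base):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)   # let Base set up its state first
        self.page_content = '//div'


print(Cooperative(name='ok').visit('/a'))   # 1

try:
    Forgetful().visit('/a')
except AttributeError as exc:
    print('AttributeError:', exc)
```

For values that are the same for every instance, such as XPath strings, plain class attributes avoid the need for __init__ altogether.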

I'm completely lost as to what is happening here. The behaviour I expect is:

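1. When I run scrapy crawl <spider name> from the terminal, it instantiates the DomainSpider class.
2. Any class constants in that class are available to all the methods it has inherited, similar to JavaScript and its this keyword.
3. Any class constants in its superclasses are ignored because of the context.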
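
Points 2 and 3 can be checked in plain Python. This is a sketch assuming Python 3, with hypothetical BaseConfig and DomainConfig classes standing in for the spiders, and a made-up default XPath. It suggests that superclass constants are not so much ignored as shadowed: they are used as a fallback whenever the subclass does not define the same name.

```python
class BaseConfig:
    page_title = '//title'            # default, used as a fallback
    page_content = '//body'           # hypothetical default XPath

    def describe(self):
        # Inherited method: self is the instance it was called on, so the
        # lookup starts at that instance and its class.
        return self.page_title + ' | ' + self.page_content


class DomainConfig(BaseConfig):
    page_content = '//div[@class="main-content"]'  # shadows the base value


config = DomainConfig()
print(config.describe())
# //title | //div[@class="main-content"]

# The method resolution order is the search path for attributes.
print([cls.__name__ for cls in DomainConfig.__mro__])
# ['DomainConfig', 'BaseConfig', 'object']
```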

If someone could

1. explain the above to me, and
2. point me to something meatier than LPTHW that isn't TDD with Python,

that would be amazing.

Thanks in advance.

Answer 1

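I'm not familiar with JavaScript, but questions like yours always attract an answer suggesting that you learn how to do things the Python way rather than trying to force Python to behave like your other language. Trying to recreate your JavaScript style in Python, I came up with this: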

class A(object):
    def __init__(self):
        self.name = 'first'
    def wiggle(self):
        return self.name + ' is wiggling'
    def shake(self):
        return self.name + ' is shaking'

Create an instance of A, change its name, and add a method attribute to the instance:

b = A()
b.name = 'second'
b.bop = lambda : b.name + ' is bopping'

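A function that returns an instance of A with an additional attribute, crunk. I don't think this is faithful to your example: thing will not have a bop method, although another statement in the function could add one.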

def c(name):
    thing = A()
    thing.name = name
    thing.crunk = lambda : thing.name + ' is crunking'
    return thing

final = c('third')
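
The lambdas above close over the variables b and thing rather than receiving self. If a per-instance method that behaves like an ordinary method is wanted, types.MethodType from the standard library can bind a plain function to a single instance. A sketch, with make_thing and the module-level bop function as illustrative names:

```python
import types


class A(object):
    def __init__(self):
        self.name = 'first'

    def shake(self):
        return self.name + ' is shaking'


def bop(self):
    return self.name + ' is bopping'


def make_thing(name):
    thing = A()
    thing.name = name
    # Bind bop to this one instance; other A instances are unaffected.
    thing.bop = types.MethodType(bop, thing)
    return thing


final = make_thing('third')
print(final.bop())            # third is bopping
print(hasattr(A(), 'bop'))    # False
```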

There is no inheritance here, just instances of A with additional attributes. You get the following results:

>>> 
>>> b.name
'second'
>>> b.bop()
'second is bopping'
>>> b.shake()
'second is shaking'
>>> b.wiggle()
'second is wiggling'
>>> 
>>> final.name
'third'
>>> final.crunk()
'third is crunking'
>>> final.shake()
'third is shaking'
>>> final.wiggle()
'third is wiggling'
>>> final.bop()

Traceback (most recent call last):
  File "<pyshell#32>", line 1, in <module>
    final.bop()
AttributeError: 'A' object has no attribute 'bop'
>>>

In Python you would do it like this:

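Class A, with a default argument for the name attribute and two methods that will be bound to instances of A. name is an instance attribute because it is defined in __init__. Only instances of A will have a name attribute; A.name will raise an AttributeError.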

class A(object):
    def __init__(self, name = 'first'):
        self.name = name
    def wiggle(self):
        return self.name + ' is wiggling'
    def shake(self):
        return self.name + ' is shaking'
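
The difference between an instance attribute and a class attribute can be seen directly. A short sketch; the species attribute is invented purely for illustration:

```python
class A(object):
    species = 'wiggler'            # class attribute, shared by all instances

    def __init__(self, name='first'):
        self.name = name           # instance attribute, one per instance


a = A()
print(a.name)        # first
print(a.species)     # wiggler (found on the class via the instance)
print(A.species)     # wiggler

try:
    A.name
except AttributeError as exc:
    print('AttributeError:', exc)

# Assigning through the instance creates an instance attribute that shadows
# the class attribute; the class itself is unchanged.
a.species = 'shaker'
print(a.species, A.species)   # shaker wiggler
```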

Foo inherits everything from A and defines an additional attribute, bop.

class Foo(A):
    def bop(self):
        return self.name + ' is bopping'

Bar inherits everything from Foo and defines an additional attribute, crunk.

class Bar(Foo):
    def crunk(self):
        return self.name + ' is crunking'

Baz inherits everything from Bar and overrides wiggle.

class Baz(Bar):
    def wiggle(self):
        return 'This Baz instance, ' + self.name + ', is wiggling'

foo = Foo('second')
bar = Bar('third')
baz = Baz('fourth')

Usage:

>>> 
>>> foo.name
'second'
>>> foo.bop()
'second is bopping'
>>> foo.shake()
'second is shaking'
>>> foo.wiggle()
'second is wiggling'
>>> 
>>> bar.name
'third'
>>> bar.bop()
'third is bopping'
>>> bar.shake()
'third is shaking'
>>> bar.wiggle()
'third is wiggling'
>>> bar.crunk()
'third is crunking'
>>> 
>>> baz.wiggle()
'This Baz instance, fourth, is wiggling'
>>>
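
Baz replaces wiggle completely. When a subclass should build on the inherited behaviour instead, super() reaches the next implementation in the chain. A sketch assuming Python 3 (Python 2 would need super(Qux, self)); Qux is a hypothetical class, and the hierarchy is collapsed to a single parent to keep the example self-contained:

```python
class A(object):
    def __init__(self, name='first'):
        self.name = name

    def wiggle(self):
        return self.name + ' is wiggling'


class Qux(A):
    def wiggle(self):
        # Call A.wiggle on this same instance, then extend the result.
        return super().wiggle() + ' enthusiastically'


qux = Qux('fifth')
print(qux.wiggle())   # fifth is wiggling enthusiastically
```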

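The classes in these examples have method attributes that are only valid for instances of the class: the methods need to be bound to an instance. I haven't included any examples of class methods or static methods, which don't need to be bound to an instance. The question "What is the difference between @staticmethod and @classmethod in Python?" has some good answers.

Calling the methods on the class itself, without an instance, fails: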

>>> A.wiggle
<unbound method A.wiggle>
>>> A.wiggle()

Traceback (most recent call last):
  File "<pyshell#41>", line 1, in <module>
    A.wiggle()
TypeError: unbound method wiggle() must be called with A instance as first argument (got nothing instead)
>>> Bar.crunk
<unbound method Bar.crunk>
>>> Bar.crunk()

Traceback (most recent call last):
  File "<pyshell#43>", line 1, in <module>
    Bar.crunk()
TypeError: unbound method crunk() must be called with Bar instance as first argument (got nothing instead)
>>>
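
For completeness, here is a brief sketch of the two kinds of method that do not need an instance. The names Counter, create and is_valid_name are made up for illustration. Note also that the "unbound method" output above is Python 2 behaviour; in Python 3, A.wiggle is a plain function and A.wiggle() fails with a TypeError about the missing self argument.

```python
class Counter(object):
    created = 0                      # class attribute shared by all instances

    def __init__(self, name):
        self.name = name

    @classmethod
    def create(cls, name):
        # Receives the class (cls), not an instance.
        cls.created += 1
        return cls(name)

    @staticmethod
    def is_valid_name(name):
        # Receives neither the class nor an instance.
        return bool(name.strip())


first = Counter.create('first')      # called on the class, no instance needed
print(first.name, Counter.created)   # first 1
print(Counter.is_valid_name('  '))   # False
```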

JavaScript to Python: understanding how classes, methods and properties work
