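Scrapy / Python: How do I add items to a list created inside a class?

Time: 2018-04-18 03:07:49

Tags: python list class collections scrapy

I'm trying to add objects to a list that is created inside another object. Here are my classes: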

# Helper classes
class job(scrapy.Item): # containing class, it has a List of 'batches'
    job_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    operator = scrapy.Field()
    recipe = scrapy.Field()
    planned = scrapy.Field()
    executed = scrapy.Field()   
    def __init__(self):
        self.batches = [] 

class batch(scrapy.Item): # this class goes inside a job class, and
                          # also stores a list of 'units'
    batch_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    def __init__(self):
        self.units = [] 

class unit(scrapy.Item): # Finally, this class stores a list of data
    unit_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    operator = scrapy.Field()
    recipe = scrapy.Field()
    def __init__(self):
        self.datos = [] 

This is the code I'm trying to run (unfortunately, it fails with an error):

def inicializa_batches(self, lista_batches, jobs):

# 1- the param lista_batches is an extract() of a portion of the 
# response.css with the required data

# 2 - The param jobs is a list of job() objects previously created
    for batchname in lista_batches:
        bn = str(batchname.strip())  # better to work with a plain string
        if len(bn) > 0:
            newbatch = batch() #declare a new batch object
            newbatch['batch_name'] = bn
            for job in jobs:
                nom_job = job['job_name']
                if nom_job[0:4] == bn[0:4]: #4 letter match

                    job['batches'].append(newbatch) # <-- Error!!
        self.log(bn)

The error I get is:

AttributeError: Use item['batches'] = [] to set field value

Any help would be greatly appreciated.

Thanks.

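1 Answer:

Answer 0 (score: 0)

According to the official documentation (link),

The Field class is just an alias to the built-in dict class and doesn't provide any extra functionality or attributes. In other words, Field objects are plain-old Python dicts. A separate class is used to support the item declaration syntax based on class attributes.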
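To illustrate why the original `__init__` fails, here is a simplified pure-Python model of an item class that collects `Field` declarations and rejects plain attribute assignment. The names `MiniItem` and `Job`, and this stand-in `Field`, are hypothetical and exist only for this sketch; it mimics the behavior described in the question and is not Scrapy's actual implementation.

```python
class Field(dict):
    """Stand-in for scrapy.Field: just a dict holding field metadata."""


class MiniItem:
    """Hypothetical, simplified model of an item class (not Scrapy's real code)."""

    def __init__(self):
        # Underscore-prefixed attributes are allowed, so internal storage works.
        self._values = {}

    @classmethod
    def declared_fields(cls):
        # Collect the class attributes declared as Field instances.
        return {name for name, value in vars(cls).items() if isinstance(value, Field)}

    def __setattr__(self, name, value):
        # Public attribute assignment is rejected in favor of item['name'] = value.
        if not name.startswith("_"):
            raise AttributeError(f"Use item[{name!r}] = {value!r} to set field value")
        super().__setattr__(name, value)

    def __setitem__(self, key, value):
        # Only keys declared as fields on the class can be stored.
        if key not in self.declared_fields():
            raise KeyError(f"{type(self).__name__} does not support field: {key}")
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]


class Job(MiniItem):
    job_name = Field()
    batches = Field()  # declared as a field, so job['batches'] is allowed


job = Job()
job["batches"] = []
job["batches"].append("batch-1")

try:
    job.batches = []  # what the original __init__ attempted
except AttributeError as exc:
    error_message = str(exc)
```

In this model, the assignment `self.batches = []` is what triggers the error, not the later `append` call, which suggests the `__init__` methods in the question should be dropped in favor of declared fields.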

So you can declare batches, units and datos as Field objects, like this:

# Helper classes
class job(scrapy.Item): # containing class, it has a List of 'batches'
    job_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    operator = scrapy.Field()
    recipe = scrapy.Field()
    planned = scrapy.Field()
    executed = scrapy.Field()
    batches = scrapy.Field()

class batch(scrapy.Item): # this class goes inside a job class, and
                          # also stores a list of 'units'
    batch_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    units = scrapy.Field()

class unit(scrapy.Item): # Finally, this class stores a list of data
    unit_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    operator = scrapy.Field()
    recipe = scrapy.Field()
    datos = scrapy.Field()

Then, in the function, set job['batches'] to a list before appending to it:

def inicializa_batches(self, lista_batches, jobs):

# 1- the param lista_batches is an extract() of a portion of the 
# response.css with the required data

# 2 - The param jobs is a list of job() objects previously created
    for batchname in lista_batches:
        bn = str(batchname.strip())  # better to work with a plain string
        if len(bn) > 0:
            newbatch = batch() #declare a new batch object
            newbatch['batch_name'] = bn
            for job in jobs:
                # create the list only once, so batches added earlier are kept
                if 'batches' not in job:
                    job['batches'] = []
                nom_job = job['job_name']
                if nom_job[0:4] == bn[0:4]: #4 letter match

                    job['batches'].append(newbatch)
        self.log(bn)
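The matching logic itself can be tried without Scrapy installed. The sketch below uses plain dicts as stand-ins for the job and batch items, and the function name `assign_batches` and the sample names are made up for illustration. Since Scrapy items expose a dict-like interface, `setdefault` would be expected to behave the same way on a real item that declares a `batches` field, but that is an assumption worth checking against your Scrapy version.

```python
def assign_batches(batch_names, jobs):
    """Attach each non-empty batch name to every job sharing its 4-character prefix."""
    for raw_name in batch_names:
        name = raw_name.strip()
        if not name:
            continue  # skip empty or whitespace-only entries
        new_batch = {"batch_name": name}
        for job in jobs:
            if job["job_name"][:4] == name[:4]:
                # setdefault creates the list on first use and reuses it afterwards,
                # so batches appended earlier are never overwritten.
                job.setdefault("batches", []).append(new_batch)
    return jobs


# Hypothetical sample data
jobs = [{"job_name": "AB12-job"}, {"job_name": "CD34-job"}]
assign_batches(["AB12-b1 ", "", "AB12-b2", "CD34-b1"], jobs)
```

Creating the list lazily, instead of assigning `[]` on every pass through the loop, is what keeps the first batch in place when a second one matches the same job.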