My code:
import luigi
import pickle
from datetime import datetime


class QueryTwitterTrend(luigi.ExternalTask):
    date = luigi.DateMinuteParameter(default=datetime.now())
    country_code = luigi.Parameter(default='usa')

    def requires(self):
        return []

    def output(self, **kwargs):
        kwargs.setdefault('loc', 'dummy')
        return luigi.LocalTarget(
            "data/trends/trends_{}_{}.csv".format(self.date.strftime('%m%d_%Y_%H%M'), kwargs['loc']))

    def run(self):
        from retrieve_trends import run as retrieve_trends
        import pandas as pd
        args_dict = {
            'location': [self.country_code]
        }
        df_container = retrieve_trends(args_dict)
        f = self.output(loc=self.country_code).open('w')
        df_container[self.country_code].to_csv(f, sep=',', encoding='utf-8')
        f.close()


class TrendsTaskWrapper(luigi.WrapperTask):
    def requires(self):
        locations = [
            'usa-nyc',
            'usa-lax',
            'usa-chi',
            'usa-dal',
            'usa-hou',
            'usa-wdc',
            'usa-mia',
            'usa-phi',
            'usa-atl',
            'usa-bos',
            'usa-phx',
            'usa-sfo',
            'usa-det',
            'usa-sea',
        ]
        for loc in locations:
            yield QueryTwitterTrend(country_code=loc)
Luigi execution summary:
===== Luigi Execution Summary =====
Scheduled 15 tasks of which:
* 14 ran successfully:
- 14 QueryTwitterTrend(date=2019-04-27T1955, country_code=usa-atl) ...
* 1 failed:
- 1 TrendsTaskWrapper()
This progress looks :( because there were failed tasks
===== Luigi Execution Summary =====
Traceback:
RuntimeError: Unfulfilled dependencies at run time:QueryTwitterTrend_usa_nyc_2019_04_27T1955_c4a2592db0, QueryTwitterTrend_usa_lax_2019_04_27T1955_e2676d1bef..
In the luigi daemon, I get an UPSTREAM_ERROR.
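I don't know what is causing this error. All of my QueryTwitterTrend tasks work fine as far as creating the local files goes, and Luigi reports them as having run successfully. The problem is that TrendsTaskWrapper is reported as failed.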
What can I do to avoid this while making sure all dependencies are still tracked? I will soon have more tasks that depend on QueryTwitterTrend.
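
One possible cause, sketched below without Luigi so it runs on its own: Luigi's default complete() calls output() with no arguments, so the path it checks falls back to loc='dummy', while run() writes to the path built with loc=self.country_code. The wrapper then sees every dependency as incomplete. The helper names output_path_original and output_path_fixed are hypothetical and exist only for this illustration; the fixed date stands in for the DateMinuteParameter value.

```python
from datetime import datetime

PATTERN = "data/trends/trends_{}_{}.csv"


def output_path_original(date, **kwargs):
    # Mirrors output() from the question: 'loc' falls back to 'dummy'
    # whenever the caller does not pass it.
    kwargs.setdefault('loc', 'dummy')
    return PATTERN.format(date.strftime('%m%d_%Y_%H%M'), kwargs['loc'])


def output_path_fixed(date, country_code):
    # Possible fix (hypothetical helper): build the path only from the
    # task's own parameters, so every caller gets the same path.
    return PATTERN.format(date.strftime('%m%d_%Y_%H%M'), country_code)


date = datetime(2019, 4, 27, 19, 55)

# Path that run() writes to in the question's code.
written_by_run = output_path_original(date, loc='usa-nyc')
# Path that a no-argument output() call (as in the completeness check) yields.
checked_by_complete = output_path_original(date)

print(written_by_run)
print(checked_by_complete)
print(written_by_run == checked_by_complete)
print(output_path_fixed(date, 'usa-nyc') == written_by_run)
```

If this is indeed the cause, the corresponding change in the task would be to drop **kwargs from output(), format the path with self.country_code, and have run() call self.output().open('w'). I have not verified this against the real retrieve_trends module.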