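I am new to the Bonobo library and have built a simple flow.
I am using Bonobo's built-in CsvReader and CsvWriter to keep things simple. First, I got stuck because CsvReader does not pass the header along with the cells; the suggested workaround is to add the
@use_raw_input
decorator to the transformation that immediately follows CsvReader. However, when the row is passed on to the next transformation, it loses its header again and is treated as a plain tuple. It works only if I name the fields explicitly:
def process_rows(Header1, Header2, Header3, Header4)
My code is below (set a breakpoint in process_rows to see the tuple without the header):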
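The behavior described above can be reproduced without Bonobo. This is a minimal sketch that assumes the raw row behaves like a `collections.namedtuple` (the `_fields` attribute suggests this, but it is an assumption); `Row`, `takes_raw`, `takes_unpacked`, and `takes_named` are hypothetical stand-ins, not Bonobo APIs.

```python
from collections import namedtuple

# Hypothetical stand-in for one row read from the CSV file.
Row = namedtuple("Row", ["Header1", "Header2", "Header3", "Header4"])


def takes_raw(row):
    # Receives the row object itself, so the field names are still attached.
    return getattr(row, "_fields", None)


def takes_unpacked(*row):
    # Star-args always collects the values into a plain tuple: no field names.
    return getattr(row, "_fields", None)


def takes_named(Header1, Header2, Header3, Header4):
    # Explicit parameter names work because the values arrive positionally.
    return Header1 + Header4


row = Row("a", "b", "c", "d")
raw_fields = takes_raw(row)
unpacked_fields = takes_unpacked(*row)
named_result = takes_named(*row)
print(raw_fields, unpacked_fields, named_result)
```

If Bonobo unpacks each row into positional arguments when the raw input is not requested, that would explain why `*input` sees a header-less tuple while explicitly named parameters still line up.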
import bonobo
from bonobo.config import use_raw_input

# region constants
INPUT_PATH = 'input.csv'
OUTPUT_PATH = 'output.csv'
EXPECTED_HEADER = ('Header1', 'Header2', 'Header3', 'Header4')
# endregion constants


# This is stupid because all rows are checked instead of only the first
@use_raw_input  # mandatory to get the header
def validate_header(input):
    if input._fields != EXPECTED_HEADER:
        # raise needs an exception instance, not a bare string
        raise ValueError("This file has an unexpected header, won't be processed")
    yield input


def process_rows(*input):
    # input arrives here as a plain tuple, without the header
    concat = ""
    for elem in input:
        concat += elem
    result = input + (concat,)
    yield result


# region bonobo + main
def get_graph(**options):
    graph = bonobo.Graph()
    graph.add_chain(
        bonobo.CsvReader(INPUT_PATH, delimiter=','),
        validate_header,
        process_rows,
        bonobo.CsvWriter(OUTPUT_PATH),
    )
    return graph


def get_services(**options):
    return {}


if __name__ == '__main__':
    parser = bonobo.get_argument_parser()
    with bonobo.parse_args(parser) as options:
        bonobo.run(
            get_graph(**options),
            services=get_services(**options)
        )
# endregion bonobo + main
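On the comment about checking every row: one way to validate only the first row is to keep a flag in a closure. This is a plain-Python sketch, not tested inside a Bonobo graph; `make_header_validator` and the two-column `Row` are hypothetical, and the callable would presumably still need the raw input (as with `@use_raw_input` above) to see `_fields`.

```python
from collections import namedtuple


def make_header_validator(expected):
    """Return a callable that checks the header on the first row only."""
    checked = False

    def validate(row):
        nonlocal checked
        if not checked:
            # Only the first row pays for the comparison.
            if getattr(row, "_fields", None) != expected:
                raise ValueError(
                    "This file has an unexpected header, won't be processed"
                )
            checked = True
        return row

    return validate


# Hypothetical rows standing in for what the reader would produce.
Row = namedtuple("Row", ["Header1", "Header2"])
validate = make_header_validator(("Header1", "Header2"))
first = validate(Row("a", "b"))
second = validate(Row("c", "d"))
print(first, second)
```

Note that if the first row fails, the flag stays unset, so every later row raises as well; whether that is desirable depends on how the pipeline should react to a bad file.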
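Thank you for your time and help!
Answer 0 (score: 1)
I did some digging and found what I believe is the "future" documentation for what you are after:
http://docs.bonobo-project.org/en/master/guide/future/transformations.html
However, it has not been implemented.
I also found a similar question: Why does Bonobo's CsvReader() method yield tuples and not dicts?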
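One possible direction, sketched here in plain Python and not verified against Bonobo, is to have process_rows receive the raw row as well (for example through the same `@use_raw_input` decorator used for validate_header) and build a new named row itself. The sketch assumes the raw row behaves like a `collections.namedtuple` with string values; `append_concat`, `Extended`, and the `Concat` field name are hypothetical.

```python
from collections import namedtuple


def append_concat(row, new_field="Concat"):
    """Return a new named row with one extra field: all values concatenated."""
    fields = tuple(row._fields) + (new_field,)
    # Build a row type that carries the original header plus the new column.
    Extended = namedtuple("Extended", fields)
    return Extended(*row, "".join(row))


# Hypothetical row standing in for what validate_header would pass along.
Row = namedtuple("Row", ["Header1", "Header2", "Header3", "Header4"])
out = append_concat(Row("a", "b", "c", "d"))
print(out._fields)
print(tuple(out))
```

Whether CsvWriter would then pick up the extended field names as the output header is not something this sketch establishes; it only shows how the names can be kept alongside the values.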