I have recently been learning Apache Beam and came across some Python code like this:
lines = p | 'read' >> ReadFromText(known_args.input)

# Count the occurrences of each word.
def count_ones(word_ones):
    (word, ones) = word_ones
    return (word, sum(ones))

counts = (lines
          | 'split' >> (beam.ParDo(WordExtractingDoFn())
                        .with_output_types(unicode))
          | 'pair_with_one' >> beam.Map(lambda x: (x, 1))
          | 'group' >> beam.GroupByKey()
          | 'count' >> beam.Map(count_ones))
From: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount.py#L92
What are the syntax and usage of | and >> in Python?
Answer 0: (score: 1)
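By default, | is the bitwise OR operator (on bool operands it also acts as a non-short-circuiting logical OR) and >> is the right-shift operator, but fortunately Python lets you overload operators. To give | and >> custom behavior, you only need to define the following two dunder (magic) methods, __or__ and __rshift__, in your class: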
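class A():
    def __or__(self, other):
        # Called for `self | other`; `other` is the right-hand operand.
        pass

    def __rshift__(self, other):
        # Called for `self >> other`.
        pass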
I suggest reading more about the Python Data Model.
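To make the overloading concrete, here is a minimal sketch using a hypothetical toy class `Step` (not part of Beam). It also defines the reflected method `__rrshift__`, which Python tries when the left operand, such as a plain string, does not itself support `>>`. That is what lets an expression of the form `'label' >> obj` work at all.

```python
class Step:
    """Hypothetical toy class for illustration; not a Beam type."""

    def __init__(self, name):
        self.name = name

    def __or__(self, other):
        # Called for `self | other`.
        if isinstance(other, Step):
            return Step(f"{self.name}|{other.name}")
        return NotImplemented

    def __rshift__(self, other):
        # Called for `self >> other`.
        if isinstance(other, Step):
            return Step(f"{self.name}>>{other.name}")
        return NotImplemented

    def __rrshift__(self, label):
        # Called for `label >> self` when the left operand (here a str)
        # does not implement `>>` for this pair of types.
        if isinstance(label, str):
            return Step(f"{label}:{self.name}")
        return NotImplemented


a, b = Step("a"), Step("b")
combined = a | b        # uses Step.__or__
shifted = a >> b        # uses Step.__rshift__
labelled = "read" >> a  # str has no __rshift__, so Step.__rrshift__ runs
```

Returning `NotImplemented` for unsupported operands, rather than raising, gives Python the chance to try the other operand's reflected method before it reports a `TypeError`.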
Now, in the Beam Python SDK, __or__ is overloaded in the PTransform class:
def __or__(self, right):
    """Used to compose PTransforms, e.g., ptransform1 | ptransform2."""
    if isinstance(right, PTransform):
        return _ChainedPTransform(self, right)
    return NotImplemented
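The way these pieces combine in an expression like `lines | 'split' >> transform` can be sketched with a toy model. The class `Transform` below is a hypothetical stand-in, not Beam's implementation: it mimics chaining with `__or__`, labelling with a reflected `__rrshift__`, and applying a transform to plain data with a reflected `__ror__`. Because `>>` binds more tightly than `|` in Python, each `'label' >> transform` pair is evaluated first, and the results are then joined left to right by `|`.

```python
class Transform:
    """Hypothetical stand-in for a Beam-style transform; wraps a function."""

    def __init__(self, fn, label=None):
        self.fn = fn
        self.label = label

    def __or__(self, right):
        # transform | transform -> a new transform that runs both in order.
        if isinstance(right, Transform):
            return Transform(lambda data: right.fn(self.fn(data)),
                             label=f"{self.label}|{right.label}")
        return NotImplemented

    def __rrshift__(self, label):
        # 'label' >> transform -> the same transform with a label attached.
        if isinstance(label, str):
            return Transform(self.fn, label=label)
        return NotImplemented

    def __ror__(self, data):
        # data | transform -> apply the transform to the data.
        return self.fn(data)


split = Transform(lambda lines: [w for line in lines for w in line.split()])
count = Transform(lambda words: len(words))

# Parsed as (data | ('split' >> split)) | ('count' >> count).
result = ["a b", "c"] | "split" >> split | "count" >> count

# Two labelled transforms can also be composed first and applied later.
chained = "split" >> split | "count" >> count
chained_result = ["a b", "c"] | chained
```

In this sketch the list on the left has no `__or__` of its own, so Python falls back to `Transform.__ror__`. In Beam the left operand is a pipeline or PCollection object that defines its own operator methods, so the details of dispatch differ.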