Below is my configuration:
<int-kafka:inbound-channel-adapter id="kafkaInboundChannelAdapter"
kafka-consumer-context-ref="consumerContext"
auto-startup="true"
channel="inputFromKafka">
<int:poller fixed-delay="1" time-unit="MILLISECONDS" />
</int-kafka:inbound-channel-adapter>
The transformer on the inputFromKafka channel:
public Message<?> transform(final Message<?> message) {
System.out.println( "KAFKA Message Headers " + message.getHeaders());
final Map<String, Map<Integer, List<Object>>> origData = (Map<String, Map<Integer, List<Object>>>) message.getPayload();
// some code to figure-out the nonPartitionedData
return MessageBuilder.withPayload(nonPartitionedData).build();
}
The print statement above consistently prints only two headers, no matter what:
KAFKA Message Headers {id=9c8f09e6-4b28-5aa1-c74c-ebfa53c01ae4, timestamp=1437066957272}
When sending the Kafka message I pass some headers, including KafkaHeaders.MESSAGE_KEY, but I don't get those back either. Is there a way to achieve this?
Answer 0 (score: 5)
Unfortunately, it doesn't work that way...
The Producer part (KafkaProducerMessageHandler) looks like this:
this.kafkaProducerContext.send(topic, partitionId, messageKey, message.getPayload());
As you can see, we don't send any messageHeaders to Kafka. Only the payload is sent, and exactly under the topic and messageKey that the Kafka protocol specifies.
On the other side, the Consumer side (KafkaHighLevelConsumerMessageSource) performs this logic:
if (!payloadMap.containsKey(messageAndMetadata.partition())) {
    final List<Object> payload = new ArrayList<Object>();
    payload.add(messageAndMetadata.message());
    payloadMap.put(messageAndMetadata.partition(), payload);
}
As you can see, we don't care about the messageKey here.
The KafkaMessageDrivenChannelAdapter is the one for you! It copies the Kafka metadata, including the messageKey, into the message headers before sending the message to the channel.
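To make it easier to see why the key never reaches the transformer with the high-level consumer source, here is a minimal plain-Java sketch of the per-partition grouping quoted above. It does not use the Kafka or Spring classes: Record is a hypothetical stand-in for MessageAndMetadata, and the sketch assumes that messages for an already-seen partition are appended to the existing list (the quoted snippet only shows the first-seen branch).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionGroupingSketch {

    /** Hypothetical stand-in for Kafka's MessageAndMetadata; not the real class. */
    static final class Record {
        final int partition;
        final String key;
        final Object message;

        Record(int partition, String key, Object message) {
            this.partition = partition;
            this.key = key;
            this.message = message;
        }
    }

    /**
     * Groups message bodies by partition id. The record key is never read,
     * so it cannot show up in the resulting payload or in any header.
     */
    static Map<Integer, List<Object>> groupByPartition(List<Record> records) {
        Map<Integer, List<Object>> payloadMap = new HashMap<>();
        for (Record record : records) {
            // Assumption: later messages for a known partition are appended.
            payloadMap.computeIfAbsent(record.partition, p -> new ArrayList<>())
                      .add(record.message);
        }
        return payloadMap;
    }

    public static void main(String[] args) {
        List<Record> records = new ArrayList<>();
        records.add(new Record(0, "key-a", "first"));
        records.add(new Record(1, "key-b", "second"));
        records.add(new Record(0, "key-c", "third"));
        // The printed map contains only partition ids and message bodies.
        System.out.println(groupByPartition(records));
    }
}
```

The shape of the result matches the inner Map<Integer, List<Object>> that the transformer in the question casts the payload to, which is why only id and timestamp appear in the headers there.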
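Answer 1 (score: 0)
As mentioned above, there is no concept of message headers in Kafka. Because I struggled with the same problem in the past, I put together a small library to help solve it. It may come in handy.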
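For illustration only, one generic way to carry headers when the transport moves nothing but a payload is to pack them into the payload itself and unpack them on the consuming side. The sketch below is plain Java and is not the API of the library mentioned above; HeaderEnvelopeSketch, Envelope, wrap and unwrap are hypothetical names, and the wire layout (header count, key/value pairs, payload length, payload bytes) is an assumption chosen for simplicity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class HeaderEnvelopeSketch {

    /** Hypothetical holder for the decoded headers and the original payload. */
    static final class Envelope {
        final Map<String, String> headers;
        final byte[] payload;

        Envelope(Map<String, String> headers, byte[] payload) {
            this.headers = headers;
            this.payload = payload;
        }
    }

    /** Producer side: serialize headers followed by the payload into one byte array. */
    static byte[] wrap(Map<String, String> headers, byte[] payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(headers.size());
            for (Map.Entry<String, String> entry : headers.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
            }
            out.writeInt(payload.length);
            out.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Consumer side: read the headers back, then the original payload. */
    static Envelope unwrap(byte[] envelope) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(envelope))) {
            int headerCount = in.readInt();
            Map<String, String> headers = new LinkedHashMap<>();
            for (int i = 0; i < headerCount; i++) {
                String name = in.readUTF();
                headers.put(name, in.readUTF());
            }
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);
            return new Envelope(headers, payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this approach the byte array returned by wrap would be sent as the Kafka payload, and the consumer would call unwrap before building the message for the channel. The trade-off is that every consumer of the topic has to know the envelope format.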