I have log data, and I want to extract each piece of information into its own variable.
Here is one sample log line:
{:id=>306, :name=>"bblite", :cpu=>{:quota=>4, :allocated=>4, :actual=>0}, :memory=>{:quota=>8192, :assigned=>8192, :actual=>8578}, :cluster_stats=>{"wc1104"=>{:cpu=>0, :mem=>8578}}}
I need one variable holding all the ids, one holding all the names, one holding the CPU values, and one holding all the cluster stats.
Below is part of my Pig script. I can extract the id, but I don't know how to extract the rest with regular expressions.
. .
matching_messages = FILTER raw_lines BY (LOWER(message) MATCHES '.*cc_altus-plaform.*');
ids = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'id=>\\d*',0);
names = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'name=>\\"\\",',0);
line_with_date = FOREACH matching_messages GENERATE
DateFormatter(timestamp) AS formatted_time: chararray, message;
DUMP names;
Answer 0 (score: 0)
Here are the regular expressions I wrote:
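-- (?<=id=>) is a lookbehind, so only the digits after "id=>" are returned.
id = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'(?<=id=>)\\d*',0);
-- Index 0 is the whole match, so this returns name=>"bblite" including the key.
name = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'name=>\\"[\\w]*\\"',0);
-- The lazy .*? stops at the first closing brace; REPLACE then strips the commas.
cpu = FOREACH matching_messages GENERATE REPLACE( REGEX_EXTRACT(message, 'cpu=>\\{.*?\\}',0), ',','');
memory = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'memory=>\\{.*?\\}',0);
-- cluster_stats contains nested braces, so match up to the first "}}" rather than the first "}".
cluster = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'cluster_stats=>\\{.*?\\}\\}',0);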
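If only the values are wanted, without the `name=>` or `cpu=>` prefix, one option is to wrap the interesting part in a capture group and pass index 1 instead of 0. The following is a minimal sketch, not a tested script. It assumes `matching_messages` has a chararray field called `message`, that the log has no spaces around `=>` (as the patterns above also assume), and that `cluster_stats` holds at least one nested entry. The field names `app_id`, `app_name`, `cpu_stats`, `memory_stats` and `cluster_stats` are hypothetical.

```pig
-- Sketch: pull all five values in one pass using capture groups (index 1).
fields = FOREACH matching_messages GENERATE
    REGEX_EXTRACT(message, 'id=>(\\d+)', 1)                     AS app_id,
    REGEX_EXTRACT(message, 'name=>"(\\w+)"', 1)                 AS app_name,
    -- Everything between the braces of the flat cpu and memory hashes.
    REGEX_EXTRACT(message, 'cpu=>\\{(.*?)\\}', 1)               AS cpu_stats,
    REGEX_EXTRACT(message, 'memory=>\\{(.*?)\\}', 1)            AS memory_stats,
    -- Nested hash: capture up to and including the inner "}", then require the outer "}".
    REGEX_EXTRACT(message, 'cluster_stats=>\\{(.*?\\})\\}', 1)  AS cluster_stats;
DUMP fields;
```

For the sample line, `app_name` would be `bblite` rather than `name=>"bblite"`. Keeping everything in one relation also keeps the values of each log line together in the same tuple.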