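I am able to parse M-Lab NDT trace log files with PcapFileInputFormat, using the GitHub code at https://github.com/raghuveerm/MRPcapParser. I can parse the fields that are predefined in Packet.java (such as TIMESTAMP), but I cannot add new fields such as total_traffic or Total_Size and parse them. I added the new field to Packet.java and tried to debug the code, but the debugger never enters the places I edited for the new field.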
My Edited Code:
===============
In PcapReader.java: (in hadoop-pcap-lib)
====================
(src-->main-->java-->net-->ripe-->hadoop-->pcap-->PcapReader.java)
At Line 62: public static final int TOTAL_SIZE_OFFSET = 12;
At Line 162:
long totalSize = PcapReaderUtil.convertInt(pcapPacketHeader, TOTAL_SIZE_OFFSET, reverseHeaderByteOrder);
packet.put(Packet.TOTAL_SIZE, totalSize);
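For reference, this is the read I am attempting at offset 12, as a minimal standalone sketch. It assumes the standard 16-byte pcap per-packet record header (ts_sec, ts_usec, incl_len, orig_len, four bytes each), so offset 12 is the original packet length. The class PcapRecordHeaderSketch and its method are hypothetical names used only for illustration; they are not part of hadoop-pcap-lib.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper for illustration only; not part of hadoop-pcap-lib. */
final class PcapRecordHeaderSketch {
    // Assumed layout of the 16-byte pcap record header:
    // ts_sec (0), ts_usec (4), incl_len (8), orig_len (12).
    static final int HEADER_SIZE = 16;
    static final int INCL_LEN_OFFSET = 8;
    static final int ORIG_LEN_OFFSET = 12;

    /** Reads an unsigned 32-bit value at the given offset in the given byte order. */
    static long readUInt32(byte[] header, int offset, boolean littleEndian) {
        ByteOrder order = littleEndian ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
        int raw = ByteBuffer.wrap(header, offset, 4).order(order).getInt();
        // Mask to avoid a negative value when the high bit is set.
        return raw & 0xFFFFFFFFL;
    }
}
```

With a little-endian header, PcapRecordHeaderSketch.readUInt32(header, PcapRecordHeaderSketch.ORIG_LEN_OFFSET, true) would give the original length of the packet on the wire, which may be larger than incl_len if the capture was truncated.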
In Packet.java:(In hadoop-pcap-lib)
==============
(src-->main-->java-->net-->ripe-->hadoop-->pcap-->packet-->Packet.java)
At Line 36: public static final String TOTAL_SIZE = "tot_size";
My Mapper:
=========
package com.calsoftlabs.parse;

import java.io.IOException;

import net.ripe.hadoop.pcap.packet.Packet;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.ObjectWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class PcapMapper extends Mapper<LongWritable, ObjectWritable, Text, Text> {
    @Override
    public void map(LongWritable arg0, ObjectWritable arg1, Context context)
            throws IOException, InterruptedException {
        Packet packet = (Packet) arg1.get();
        String clientIp = null;
        String serverIp = null;
        String timeStamp = null;
        String totalTraffic = null;
        String srcPort = null;
        String destPort = null;
        String fields = null;
        if (packet != null) {
            clientIp = packet.get(Packet.SRC).toString();
            srcPort = packet.get(Packet.SRC_PORT).toString();
            serverIp = packet.get(Packet.DST).toString();
            destPort = packet.get(Packet.DST_PORT).toString();
            timeStamp = packet.get(Packet.TIMESTAMP).toString();
            // The NullPointerException is thrown here: the new field is missing from the packet.
            totalTraffic = packet.get(Packet.TOTAL_SIZE).toString();
            if (serverIp != null && srcPort != null && destPort != null) {
                fields = serverIp
                        .concat(", ").concat(srcPort)
                        .concat(", ").concat(destPort)
                        .concat(", ").concat(totalTraffic);
                // Write only when a packet was parsed, so the key and value are never null.
                context.write(new Text(clientIp.concat("_").concat(timeStamp).concat("::")), new Text(fields));
            }
        }
    }
}
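Finally, when I run the MR job, I get a NullPointerException near total_traffic, which fits with the debugger never entering the newly edited code at all. How can I overcome this? Can anyone suggest a solution?
Thanks in advance.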
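To narrow this down, here is a minimal sketch of two checks I could add. It assumes that Packet behaves like a Map<String, Object>, which is my understanding of hadoop-pcap-lib but not something I have verified. The class PacketFieldSketch and both methods are hypothetical names for illustration. The first method avoids calling toString() on a missing field, which is what fails in packet.get(Packet.TOTAL_SIZE).toString(). The second reports where a class was loaded from, which might show whether the job is picking up an older hadoop-pcap-lib jar instead of my edited build.

```java
import java.security.CodeSource;
import java.util.Map;

/** Hypothetical diagnostics for illustration only; not part of hadoop-pcap-lib. */
final class PacketFieldSketch {
    /** Returns the field as a string, or the fallback when the key is absent or null. */
    static String fieldOrDefault(Map<String, Object> packet, String key, String fallback) {
        Object value = packet.get(key);
        return value != null ? value.toString() : fallback;
    }

    /** Describes the jar or directory a class was loaded from, if that is known. */
    static String loadedFrom(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes from the JDK itself typically have no code source.
        return source != null ? String.valueOf(source.getLocation()) : "(bootstrap or unknown)";
    }
}
```

In the mapper, logging PacketFieldSketch.loadedFrom(Packet.class) once and using fieldOrDefault(packet, Packet.TOTAL_SIZE, "n/a") would show both which build is on the classpath and whether the field was ever set.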