Removing duplicates from a CSV and counting them in Java

Time: 2016-01-06 16:22:25

Tags: java excel csv ip ip-address

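I am trying to take the lines of text from a log file created by my router and break them up into a CSV file so they are easier to review.

Example line from the log file: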

[VPN-IPSEC rule not match] from 192.168.1.254:63991 to 8.8.8.8:53 Wednesday, January 06,2016 08:52:18

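I want the final file to have a column each for the rule, the inbound IP address, the inbound port, the outbound IP address, the outbound port, the host name, the date, and the number of connections (duplicates).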

public static void main(String[] args) throws UnknownHostException, IOException {

    PrintStream diskwriter = new PrintStream("C:\\Users\\admin\\Desktop\\RawIPs.csv");

    diskwriter.print("Rule" + ",");
    diskwriter.print("Host Name" + ",");
    diskwriter.print("IP Address" + ",");
    diskwriter.println("Port");

    int count = 0;

    try (BufferedReader br = new BufferedReader(new FileReader("C:\\Users\\admin\\Desktop\\IPs.txt"))) {
        String line;
        while ((line = br.readLine()) != null) {

    String IPaddress = line;

    String IPadd = IPaddress.substring((IPaddress.lastIndexOf("to") +3));
    String IP = IPadd.substring(0, IPadd.indexOf(":"));


    String Rule = IPaddress.substring((IPaddress.indexOf("[") +1), (IPaddress.indexOf("]")));

    String Port = IPadd.substring((IPadd.indexOf(":") +1), IPadd.indexOf(" "));

    String host;

    count++;

    if(IP.startsWith("212.56.7"))
    {
        host = "Plus Net";
    }
    else if(IP.equals("157.56.144.215") || IP.equals("40.113.152.30") || IP.equals("23.102.160.172") || IP.equals("157.56.106.184") || IP.equals("94.245.121.251") || IP.equals("157.56.75.164") || IP.equals("134.170.185.125") || IP.equals("191.237.208.126") || IP.equals("191.232.139.253") || IP.equals("157.55.231.252"))
    {
        host = "Microsoft";
    }
    else if(IP.startsWith("104.16.") || IP.equals("172.69.2.2"))
    {
        host = "CloudFlare";
    }
    else if(IP.startsWith("68.232."))
    {
        host = "EdgeCast Networks";
    }
    else if(IP.startsWith("192.225.15"))
    {
        host = "ThreatMetrix";
    }
    else if(IP.startsWith("70.32."))
    {
        host = "Gigenet";
    }
    else if(IP.startsWith("185.31.19"))
    {
        host = "Fastly London 1 Operations (Hosting Company)";
    }
    else if(IP.startsWith("96.31."))
    {
        host = "Host Collective";
    }
    else if(IP.startsWith("182.70."))
    {
        host = "Bharti Telenet (India - Vodafone)";
    }
    else if(IP.startsWith("17."))
    {
        host = "Apple Inc.";
    }
    else if(IP.startsWith("199.16.15"))
    {
        host = "Twitter Inc.";
    }
    else if(IP.startsWith("128.0."))
    {
        host = "RIPE Network Coordination Centre";
    }
    else if(IP.startsWith("129.1."))
    {
        host = "Bowling Green State University";
    }
    else if(IP.startsWith("185.42.205.144") || IP.startsWith("192.16.64.181"))
    {
        host = "Twitch.tv";
    }
    else if(IP.startsWith("122.248.142.74"))
    {
        host = "Netgear";
    }
    else if(IP.startsWith("173.241.2"))
    {
        host = "OpenX Technologies";
    }
    else if(IP.startsWith("69.172."))
    {
        host = "Peer 1 Network (USA)";
    }
    else if(IP.startsWith("204.154.110") || IP.startsWith("204.154.111"))
    {
        host = "DoubleVerify";
    }
    else if(IP.startsWith("208.146."))
    {
        host = "Internap Network Services";
    }
    else
    {
        InetAddress addr = InetAddress.getByName(IP);
        host = addr.getCanonicalHostName();
    }

    diskwriter.print(Rule + ",");
    diskwriter.print(host + ",");
    diskwriter.print(IP + ",");
    diskwriter.println(Port);

        }

        System.out.println("There were " + count + " connections");
    }
}

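While working on this I ran into a number of problems I could not solve. One of them is that I cannot say "if IP is greater than 192.168.1.0 and less than 192.168.1.254, then host = home network", because the IP is stored as a String.

What I am hoping to get help with today, though, is specifically the duplicates. I want not only to remove the duplicates from my CSV file but also to count them. Whether a record counts as a duplicate depends on all of its fields being the same, not just a single field.

I would also like to keep some of the unique values in variables so that they can be printed to the console after the loop ends.

This is easy to do in Excel with the Remove Duplicates feature, and the counting could be done in Excel as well, but that would mean writing a formula every time or, at best, dragging the formula down.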

1 answer:

Answer 0: (score: 1)

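You should not try to do all of this logic in a single pass inside an IO-based while loop. Forget about the CSV file and what you know about Excel. They are confusing you, not helping you.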

Break the problem down.

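Read the log once.

Define a POJO class that corresponds to one line, and create a List of those objects, one per line.

A POJO is a "Plain Old Java Object": just a bunch of data fields with setters and getters. E.g.: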

public class LogEntry {
   private String host;
   private String port; 
   private String rule;

   public String getHost() {
      return this.host;
   }

   public void setHost(String host) {
      this.host = host;
   }

   public String getPort() {
      return this.port;
   }

   public void setPort(String port) {
      this.port = port;
   }
   public String getRule() {
      return this.rule;
   }

   public void setRule(String rule) {
      this.rule = rule;
   }
}
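
As a rough sketch of the "read the log once" step, assuming every line has the same shape as the sample line in the question, a regular expression can split a line into its fields. The names LogLineParser and ParsedLine are made up for illustration (ParsedLine is not the LogEntry class above), and the pattern has only been checked against that one sample line, so treat it as a starting point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogLineParser {

    // Assumed shape: [rule] from srcIp:srcPort to dstIp:dstPort date-text
    private static final Pattern LINE = Pattern.compile(
            "^\\[(.+?)\\] from (\\d+(?:\\.\\d+){3}):(\\d+)"
            + " to (\\d+(?:\\.\\d+){3}):(\\d+)\\s+(.*)$");

    /** Hypothetical holder for the fields of one log line. */
    static final class ParsedLine {
        final String rule;
        final String sourceIp;
        final int sourcePort;
        final String destIp;
        final int destPort;
        final String date;

        ParsedLine(String rule, String sourceIp, int sourcePort,
                   String destIp, int destPort, String date) {
            this.rule = rule;
            this.sourceIp = sourceIp;
            this.sourcePort = sourcePort;
            this.destIp = destIp;
            this.destPort = destPort;
            this.date = date;
        }
    }

    /** Returns the parsed fields, or null if the line does not match. */
    static ParsedLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new ParsedLine(
                m.group(1),
                m.group(2), Integer.parseInt(m.group(3)),
                m.group(4), Integer.parseInt(m.group(5)),
                m.group(6));
    }

    /** Parses every line, skipping the ones that do not match. */
    static List<ParsedLine> parseAll(List<String> lines) {
        List<ParsedLine> result = new ArrayList<>();
        for (String line : lines) {
            ParsedLine parsed = parse(line);
            if (parsed != null) {
                result.add(parsed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "[VPN-IPSEC rule not match] from 192.168.1.254:63991"
                + " to 8.8.8.8:53 Wednesday, January 06,2016 08:52:18";
        ParsedLine p = parse(sample);
        System.out.println(p.rule + " | " + p.destIp + " | " + p.destPort + " | " + p.date);
    }
}
```

The list of lines passed to parseAll could come from Files.readAllLines, for example. Keeping the parsing separate from the file IO means it can be tried out on a single string like this.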

Convert the IP addresses to numbers. That is easy: 256^3 * first quad + 256^2 * second quad + 256 * third quad + fourth quad.
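
A minimal sketch of that conversion, assuming plain dotted-quad IPv4 strings. The class and method names (Ipv4, toLong, inRange) are invented for this example. The result is a long because addresses from 128.0.0.0 upward do not fit in a signed int. Once the addresses are numbers, the "home network" check from the question becomes an ordinary comparison; the range check here is inclusive, so adjust the bounds if you need it exclusive.

```java
final class Ipv4 {

    /** Converts a dotted-quad IPv4 string to its numeric value. */
    static long toLong(String ip) {
        String[] quads = ip.trim().split("\\.");
        if (quads.length != 4) {
            throw new IllegalArgumentException("Not a dotted-quad address: " + ip);
        }
        long value = 0;
        for (String quad : quads) {
            int octet = Integer.parseInt(quad);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Quad out of range in: " + ip);
            }
            // Same as 256^3 * a + 256^2 * b + 256 * c + d, built up left to right.
            value = value * 256 + octet;
        }
        return value;
    }

    /** True if ip lies between low and high, both ends included. */
    static boolean inRange(String ip, String low, String high) {
        long n = toLong(ip);
        return n >= toLong(low) && n <= toLong(high);
    }

    public static void main(String[] args) {
        System.out.println(toLong("192.168.1.254")); // prints 3232236030
        System.out.println(inRange("192.168.1.77", "192.168.1.0", "192.168.1.254")); // prints true
        System.out.println(inRange("8.8.8.8", "192.168.1.0", "192.168.1.254")); // prints false
    }
}
```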

Run the summary logic to combine the dupes, get the counts, etc.

Do one step at a time. You'll get there.
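
One possible shape for that summary step, assuming two records are duplicates only when every field matches (as the question asks). The Connection class and its four fields are placeholders that mirror the columns the question's code writes. The important part is that equals and hashCode cover all fields, so a map can use the whole record as its key. A LinkedHashMap keeps the rows in first-seen order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class DuplicateCounter {

    /** Hypothetical record type; equality is based on all four fields. */
    static final class Connection {
        final String rule;
        final String host;
        final String ip;
        final String port;

        Connection(String rule, String host, String ip, String port) {
            this.rule = rule;
            this.host = host;
            this.ip = ip;
            this.port = port;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof Connection)) {
                return false;
            }
            Connection c = (Connection) other;
            return rule.equals(c.rule) && host.equals(c.host)
                    && ip.equals(c.ip) && port.equals(c.port);
        }

        @Override
        public int hashCode() {
            return Objects.hash(rule, host, ip, port);
        }
    }

    /** Maps each distinct record to the number of times it occurred. */
    static Map<Connection, Integer> countDuplicates(List<Connection> entries) {
        Map<Connection, Integer> counts = new LinkedHashMap<>();
        for (Connection entry : entries) {
            counts.merge(entry, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Connection> entries = Arrays.asList(
                new Connection("VPN-IPSEC rule not match", "Apple Inc.", "17.0.0.1", "443"),
                new Connection("VPN-IPSEC rule not match", "Twitter Inc.", "199.16.15.1", "443"),
                new Connection("VPN-IPSEC rule not match", "Apple Inc.", "17.0.0.1", "443"));

        // One CSV row per distinct record, with its count as the last column.
        // Fields containing commas would need quoting, which is not handled here.
        System.out.println("Rule,Host Name,IP Address,Port,Count");
        for (Map.Entry<Connection, Integer> e : countDuplicates(entries).entrySet()) {
            Connection c = e.getKey();
            System.out.println(c.rule + "," + c.host + "," + c.ip + "," + c.port + "," + e.getValue());
        }
    }
}
```

The key set of the returned map is also the list of unique records, so anything you want to print to the console after the loop can be read from there.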