Question

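I'm trying to add multithreading to a program I'm writing so that some tasks run in parallel. The program uses the Spring framework and runs on Pivotal Cloud Foundry. It was crashing occasionally, so I went in and looked at the logs and performance metrics, which is when I discovered it has a memory leak. After some testing I narrowed the culprit down to my threading implementation. My understanding of GC in the JVM is that it will not collect a thread that is not dead, nor any object that is still referenced by another object or by a later line of executable code. But I'm not holding any reference to the thread at all, and even if I were, it claims to put itself in the dead state once it finishes running, so I don't know what is causing the leak.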
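The claim that a thread becomes dead once run() returns can be checked directly. The following is only a minimal sketch; the class and method names (ThreadStateSketch, stateAfterJoin) are made up for illustration and are not part of the PoC.

```java
class ThreadStateSketch {

    // Starts a thread for the given work, waits for it to finish,
    // and reports the state the JVM assigns to it afterwards.
    static Thread.State stateAfterJoin(Runnable work) {
        Thread t = new Thread(work);
        t.start();
        try {
            t.join(); // returns only after run() has completed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return t.getState();
    }

    public static void main(String[] args) {
        // A finished thread reports TERMINATED; once nothing references
        // the Thread object any more, it is eligible for collection.
        System.out.println(stateAfterJoin(() -> System.out.println("Done")));
    }
}
```

A terminated thread is only eligible for collection; when the memory is actually reclaimed is up to the garbage collector.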

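I've written a clean PoC to demonstrate the leak. It uses a REST controller so I can control the number of threads, a Runnable class because my real program requires parameters, and a string that takes up arbitrary space in memory, standing in for fields the real program would hold (to make the leak more apparent).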

package com.example;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class LeakController {

    @RequestMapping("/Run")
    public String DoWork(@RequestParam("Amount") int amount, @RequestParam("Args") String args)
    {
        for(int i = 0; i < amount; i++)
            new Thread(new MyRunnable(args)).start();
        return "Workin' on it";
    }

    public class MyRunnable implements Runnable{
        String args;
        public MyRunnable(String args){ this.args = args; }
        public void run()
        {
            int timeToSleep = Integer.valueOf(args);
            String spaceWaster = "";
            for (int i = 0; i < 10000; i ++)
                spaceWaster += "W";
            System.out.println(spaceWaster);
            try {Thread.sleep(timeToSleep);} catch (InterruptedException e) {e.printStackTrace();}
            System.out.println("Done");
        }
    }
}

Can anyone explain why this program leaks memory?

Edit: I've received a few responses about string assignment versus string building and the string pool, so I changed my code to the following:

        int[] spaceWaster = new int[10000];
        for (int i = 0; i < 10000; i ++)
            spaceWaster[i] = 512;
        System.out.println(spaceWaster[1]);

It still leaks.

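Edit: While gathering some real numbers to respond to Voo, I found something interesting. Starting new threads does eat memory, but only up to a point. After growing permanently by about 60 MB, the new integer-based program stops growing no matter how hard it is pushed. Does this have something to do with the way the Spring framework allocates memory?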

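I also think there is merit in going back to the String example, since it relates more closely to my real use case, which is running regex operations on incoming JSON, several hundred documents per second. With that in mind I have changed the code to: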

// Same package and Spring imports as the first listing, plus:
import java.util.Random;

@RestController
public class LeakController {

    public static String characters[] = {
            "1","2","3","4","5","6","7","8","9","0",
            "A","B","C","D","E","F","G","H","I","J","K","L","M",
            "N","O","P","Q","R","S","T","U","V","W","X","Y","Z"};
    public Random rng = new Random();

    @RequestMapping("/Run")
    public String GenerateAndSend(@RequestParam("Amount") int amount)
    {
        for(int i = 0; i < amount; i++)
        {
            StringBuilder sb = new StringBuilder(100);
            for(int j = 0; j< 100; j++)
                sb.append(characters[rng.nextInt(36)]);
            new Thread(new MyRunnable(sb.toString())).start();
            System.out.println("Thread " + i + " created");
        }
        System.out.println("Done making threads");
        return "Workin' on it";
    }

    public class MyRunnable implements Runnable{
        String args;
        public MyRunnable(String args){ this.args = args; }
        public void run()
        {
            System.out.println(args);
            args = args.replaceAll("\\d+", "\\[Number was here\\]");
            System.out.println(args);
        }
    }
}

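This new application behaves much like the integer example: it grows permanently by about 50 MB (after 2000 threads) and tapers off from there, until I can no longer notice any memory growth with each new batch of 1000 threads (at about 85 MB above the original deploy memory).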

If I change it to remove the StringBuilder:

String temp = "";
for(int j = 0; j< 100; j++)
    temp += characters[rng.nextInt(36)];
new Thread(new MyRunnable(temp)).start();

it leaks indefinitely; I assume it would stop once all 36^100 strings have been generated.

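Combining these findings, I guess my real issue could be both a string pool issue and an issue with how Spring allocates memory. What I still fail to understand is that in my real application, if I create a Runnable and call run() on the main thread, memory doesn't seem to spike, but if I create a new thread and give it the Runnable, memory jumps. Here is what my Runnable currently looks like in the application I'm building: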

public class MyRunnable implements Runnable{
    String json;
    public MyRunnable(String json){
        this.json = new String(json);
    }
    public void run()
    {
        DocumentClient documentClient = new DocumentClient (END_POINT,
                MASTER_KEY, ConnectionPolicy.GetDefault(),
                ConsistencyLevel.Session);
        System.out.println("JSON : " + json);
        Document myDocument = new Document(json);
        System.out.println(new DateTime().toString(DateTimeFormat.forPattern("MM-dd-yyyy>HH:mm:ss.SSS"))+">"+"Created JSON Document Locally");
        // Create a new document
        try {
            //collectioncache is a variable in the parent restcontroller class that this class is declared inside of
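            // Parentheses are required: without them the concatenation runs first and only "true" is printed.
            System.out.println("CollectionExists:" + (collectionCache != null));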
            System.out.println("CollectionLink:" + collectionCache.getSelfLink());
            System.out.println(new DateTime().toString(DateTimeFormat.forPattern("MM-dd-yyyy>HH:mm:ss.SSS"))+">"+"Creating Document on DocDB");
            documentClient.createDocument(collectionCache.getSelfLink(), myDocument, null, false);
            System.out.println(new DateTime().toString(DateTimeFormat.forPattern("MM-dd-yyyy>HH:mm:ss.SSS"))+">"+"Document Creation Successful");
            System.out.flush();
            currentThreads.decrementAndGet();
        } catch (DocumentClientException e) {
            System.out.println("Failed to Upload Document");
            e.printStackTrace();
        }
    }
}

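Any ideas about my real leak? Is there somewhere I need a StringBuilder? Or do strings just handle memory in a funny way, so that I need to give the app a higher ceiling to stretch into and then it will be fine?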

Edit: I did some benchmarking so I can actually graph the behavior and better understand what the GC is doing:

00000 Threads - 457 MB
01000 Threads - 535 MB
02000 Threads - 545 MB
03000 Threads - 549 MB
04000 Threads - 551 MB
05000 Threads - 555 MB
2 hours later - 595 MB
06000 Threads - 598 MB
07000 Threads - 600 MB
08000 Threads - 602 MB

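It seems to be tapering off gradually, but what interests me most is that while I was out at meetings and lunch it decided to grow by 40 MB on its own. I checked with my team, and nobody used the app during that time. Not sure what to make of that.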
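One way to tell whether figures like these reflect live objects or merely heap the JVM has reserved is to log the heap from inside the process. The following is a rough sketch, assuming the MB values above come from a platform-level memory metric (the post does not say); HeapSnapshotSketch and its method names are hypothetical. Runtime.totalMemory() is the heap currently committed by the JVM, freeMemory() is the unused part of that, and maxMemory() is the ceiling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeapSnapshotSketch {

    // Heap occupied by objects (live or not yet collected), in bytes.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // One log line with used, committed and maximum heap in MB.
    static String snapshot(String label) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        return String.format("%s: used=%d MB, committed=%d MB, max=%d MB",
                label, usedHeapBytes() / mb, rt.totalMemory() / mb, rt.maxMemory() / mb);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(snapshot("before"));
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            // Same kind of work as the integer-based PoC.
            Thread t = new Thread(() -> {
                int[] spaceWaster = new int[10000];
                Arrays.fill(spaceWaster, 512);
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        threads.clear();
        System.gc(); // only a hint; the JVM may ignore it
        System.out.println(snapshot("after"));
    }
}
```

If "used" falls back after a collection while "committed" stays high, the growth is more likely heap the JVM is holding on to than objects that can never be freed.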

Answer 1

Because you keep appending to a String. Java does not automatically GC the string pool.

Java String Pool

String spaceWaster = "";
for (int i = 0; i < 10000; i ++)
    spaceWaster += "W";

Use a StringBuilder instead.
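A minimal sketch of that change, with made-up names (SpaceWasterSketch, build). Each `+=` on a String inside a loop allocates a new String, so the 10,000-iteration loop leaves thousands of short-lived intermediates behind; a StringBuilder appends into one growing buffer. Note that strings built at run time are ordinary heap objects and only enter the string pool if intern() is called on them, so the benefit here is fewer temporary allocations.

```java
class SpaceWasterSketch {

    // Builds a string of `length` 'W' characters with a single builder
    // instead of allocating a new String on every iteration.
    static String build(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append('W');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String spaceWaster = build(10000);
        System.out.println(spaceWaster.length());
    }
}
```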

Answer 2

Using a StringBuilder is right.

I don't think you need 2000 threads.

A better design could be a queue of tasks (strings/documents) and a thread pool that processes them.
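A sketch of what such a design might look like with java.util.concurrent, where the executor's internal queue holds the pending tasks. The class name, method names and pool size are illustrative assumptions, and the regex step stands in for the real document upload.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PooledWorkerSketch {

    // Same regex replacement as MyRunnable.run() in the question.
    static String process(String input) {
        return input.replaceAll("\\d+", "[Number was here]");
    }

    // Submits every input to a fixed pool of poolSize threads
    // (poolSize is a hypothetical tuning knob) and collects the results in order.
    static List<String> processAll(List<String> inputs, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String input : inputs) {
                Callable<String> task = () -> process(input);
                futures.add(pool.submit(task)); // queued until a worker is free
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown(); // lets the worker threads exit
        }
    }

    public static void main(String[] args) {
        // prints [AB[Number was here]C, XYZ, [Number was here]]
        System.out.println(processAll(Arrays.asList("AB12C", "XYZ", "7"), 2));
    }
}
```

In a long-running controller the pool would normally be created once and reused across requests rather than built and shut down per call; the per-call pool here only keeps the sketch self-contained.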

Java thread memory leak
