How to find out which Java/Scala thread has locked a file?

Date: 2015-12-11 17:14:20

Tags: java scala apache-spark hive

In brief:

  1. How can I find out which Java/Scala thread has locked a file? I know that a class/thread in the JVM has locked a concrete file (it overlaps a file region), but I don't know how. Is it possible to find out which class/thread is doing this when I stop the application at a breakpoint?

     The following code throws an OverlappingFileLockException:

         FileChannel.open(Paths.get("thisfile"), StandardOpenOption.APPEND).tryLock().isValid();
         FileChannel.open(Paths.get("thisfile"), StandardOpenOption.APPEND).tryLock().isShared();

  2. How does Java/Scala lock this file (in Spark)? I know how to lock a file with java.nio.channels, but I did not find a corresponding call in the github repository of Spark.

More about my problem:

  1. When I run Spark with Hive on Windows it works correctly, but every time Spark shuts down it fails to delete one temporary directory (the other temporary directories before it are deleted correctly) and prints the exception shown in the log below.

     I tried searching the internet, but for Spark I found only in-progress issues (one user tried to make a patch, but, if I read the comments on that pull request correctly, it does not work) and some unanswered questions on SO.

     The problem occurs in the deleteRecursively() method of the Utils.scala class. I set a breakpoint in this method and rewrote it in Java. The shutdown log is:


      2015-12-11 15:04:36 [Thread-13] INFO  org.apache.spark.SparkContext - Successfully stopped SparkContext
      2015-12-11 15:04:36 [Thread-13] INFO  o.a.spark.util.ShutdownHookManager - Shutdown hook called
      2015-12-11 15:04:36 [Thread-13] INFO  o.a.spark.util.ShutdownHookManager - Deleting directory C:\Users\MyUser\AppData\Local\Temp\spark-9d564520-5370-4834-9946-ac5af3954032
      2015-12-11 15:04:36 [Thread-13] INFO  o.a.spark.util.ShutdownHookManager - Deleting directory C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
      2015-12-11 15:04:36 [Thread-13] ERROR o.a.spark.util.ShutdownHookManager - Exception while deleting Spark temp dir: C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
      java.io.IOException: Failed to delete: C:\Users\MyUser\AppData\Local\Temp\spark-42b70530-30d2-41dc-aff5-8d01aba38041
          at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:884) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:63) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:60) [spark-core_2.11-1.5.0.jar:1.5.0]
          at scala.collection.mutable.HashSet.foreach(HashSet.scala:78) [scala-library-2.11.6.jar:na]
          at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:60) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:264) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
          at scala.util.Try$.apply(Try.scala:191) [scala-library-2.11.6.jar:na]
          at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:234) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:216) [spark-core_2.11-1.5.0.jar:1.5.0]
          at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) [hadoop-common-2.4.1.jar:na]
      
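      The Java rewrite itself is not included above; the following is only a minimal sketch of a recursive delete of this kind (an illustration under my own assumptions, not the exact rewritten code), shown to make clear where the failure surfaces:

          // Minimal sketch of a recursive delete similar to Utils.deleteRecursively
          // (illustration only, not the exact rewritten code).
          private static void deleteRecursively(File file) throws IOException {
              if (file.isDirectory()) {
                  File[] children = file.listFiles();
                  if (children != null) {
                      for (File child : children) {
                          deleteRecursively(child);
                      }
                  }
              }
              // On Windows, delete() returns false while the JVM still holds a lock
              // on the file (e.g. metastore\db.lck), which produces the IOException above.
              if (!file.delete() && file.exists()) {
                  throw new IOException("Failed to delete: " + file.getAbsolutePath());
              }
          }
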

      When Spark stops at the breakpoint in this method, I found that a JVM thread of Spark has locked the file "C:\Users\MyUser\AppData\Local\Temp\spark-9ba0bb0c-1e20-455d-bc1f-86c696661ba3\metastore\db.lck", and Windows Process Explorer also shows that Java has locked this file. FileChannel likewise shows that the file is locked inside the JVM.

      Now, I have to:

      1. Find out which thread/class has locked this file.

      2. Find out which file-locking mechanism Spark uses to lock "metastore\db.lck", which class does it, and how to unlock the file before shutdown.

      3. Make a pull request to Spark or Hive to unlock this file ("metastore\db.lck") before deleteRecursively() is called, or at least leave a comment about the problem.

      4. If you need any other information, please ask in the comments.

3 Answers:

Answer 0 (score: 5)

See How to find out which thread is locking a file in java?

Files are locked by the Windows process. Threads may open files for reading and writing, but it is the class that holds a reference to the file handle that is responsible for closing it. Therefore you should look for an object, not a thread.

See How can I figure out what is holding on to unfreed objects? for concrete ways to do that.
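
If you follow the object route, one option (my own suggestion, not part of the linked answers) is to trigger a heap dump while the file is still locked and then look for the object holding the file handle in a heap analyzer such as Eclipse MAT. A minimal sketch using the HotSpot diagnostic MXBean:

    import com.sun.management.HotSpotDiagnosticMXBean;
    import java.lang.management.ManagementFactory;

    public class HeapDumper {
        // Writes an .hprof heap dump; "live" restricts the dump to reachable objects.
        public static void dumpHeap(String outputFile, boolean live) throws Exception {
            HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(outputFile, live); // e.g. dumpHeap("locked-file.hprof", true)
        }
    }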

Answer 1 (score: 3)

  1. How can I find out which Java/Scala thread has locked a file?

I ran into the same problem and found the following solution: at least in the Thread.threadLocals field you can see all the locked objects.

If a file is locked after code like this:

    // Requires: java.io.File, java.nio.channels.FileChannel, java.nio.channels.FileLock,
    // java.nio.file.Paths, java.nio.file.StandardOpenOption
    File newFile = new File("newFile.lock");
    newFile.createNewFile();
    FileLock fileLock = FileChannel.open(Paths.get(newFile.getAbsolutePath()), StandardOpenOption.APPEND).tryLock();

then in Thread.threadLocals you can see a sun.nio.fs.NativeBuffer object whose owner field contains ".../newFile.lock".

So you can try the code below: it returns every thread together with all the values in its threadLocals, and you then have to look for which threads hold a NativeBuffer object, Spark/Hive objects, and so on (and afterwards inspect that thread's threadLocals in Eclipse or IDEA debug mode). Note that it relies on reflective access to JDK internals, so it works on Java 8 but may be blocked on newer JDKs:

// Requires: java.lang.reflect.Array, java.lang.reflect.Field, java.util.Set

// Dumps, for every live thread, all values stored in its ThreadLocal map.
private static String getThreadsLockFile() {
    Set<Thread> threads = Thread.getAllStackTraces().keySet();
    StringBuilder builder = new StringBuilder();
    for (Thread thread : threads) {
        builder.append(getThreadsLockFile(thread));
    }
    return builder.toString();
}

// Reads one thread's ThreadLocal values via reflection.
private static String getThreadsLockFile(Thread thread) {
    StringBuffer stringBuffer = new StringBuffer();
    try {
        // Use Thread.class here: Thread subclasses do not declare "threadLocals".
        Field field = Thread.class.getDeclaredField("threadLocals");
        field.setAccessible(true);
        Object map = field.get(thread);
        if (map == null) {
            return ""; // this thread has no thread-local values
        }
        Field table = Class.forName("java.lang.ThreadLocal$ThreadLocalMap").getDeclaredField("table");
        table.setAccessible(true);
        Object tbl = table.get(map);
        int length = Array.getLength(tbl);
        for (int i = 0; i < length; i++) {
            try {
                Object entry = Array.get(tbl, i);
                if (entry != null) {
                    Field valueField = Class.forName("java.lang.ThreadLocal$ThreadLocalMap$Entry").getDeclaredField("value");
                    valueField.setAccessible(true);
                    Object value = valueField.get(entry);
                    if (value != null) {
                        stringBuffer.append(thread.getName()).append(" : ").append(value.getClass()).
                                append(" ").append(value).append("\n");
                    }
                }
            } catch (Exception exp) {
                // skip, do nothing
            }
        }
    } catch (Exception exp) {
        // skip, do nothing
    }
    return stringBuffer.toString();
}
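
For example (a hypothetical usage, assuming the methods above are placed in the same class), you can lock a file yourself and then dump every thread's thread-locals to look for the corresponding NativeBuffer entry:

    // Hypothetical usage; also requires java.io.File, java.nio.channels.FileChannel,
    // java.nio.channels.FileLock, java.nio.file.Paths, java.nio.file.StandardOpenOption.
    public static void main(String[] args) throws Exception {
        File newFile = new File("newFile.lock");
        newFile.createNewFile();
        FileLock fileLock = FileChannel.open(Paths.get(newFile.getAbsolutePath()),
                StandardOpenOption.APPEND).tryLock();
        // Look for sun.nio.fs.NativeBuffer lines in the output (see above).
        System.out.println(getThreadsLockFile());
        fileLock.release();
    }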

Or you can try the following code, which looks only at the owner field of NativeBuffer objects for a given file name (so it does not work in all cases):

// Returns the names of all threads whose thread-locals reference the given file name.
private static String getThreadsLockFile(String fileName) {
    Set<Thread> threads = Thread.getAllStackTraces().keySet();
    StringBuilder builder = new StringBuilder();
    for (Thread thread : threads) {
        builder.append(getThreadsLockFile(thread, fileName));
    }
    return builder.toString();
}

// Scans one thread's ThreadLocal values for sun.nio.fs.NativeBuffer entries whose
// owner field contains the given file name.
private static String getThreadsLockFile(Thread thread, String fileName) {
    StringBuffer stringBuffer = new StringBuffer();
    try {
        // Use Thread.class here: Thread subclasses do not declare "threadLocals".
        Field field = Thread.class.getDeclaredField("threadLocals");
        field.setAccessible(true);
        Object map = field.get(thread);
        if (map == null) {
            return ""; // this thread has no thread-local values
        }
        Field table = Class.forName("java.lang.ThreadLocal$ThreadLocalMap").getDeclaredField("table");
        table.setAccessible(true);
        Object tbl = table.get(map);
        int length = Array.getLength(tbl);
        for (int i = 0; i < length; i++) {
            try {
                Object entry = Array.get(tbl, i);
                if (entry != null) {
                    Field valueField = Class.forName("java.lang.ThreadLocal$ThreadLocalMap$Entry").getDeclaredField("value");
                    valueField.setAccessible(true);
                    Object value = valueField.get(entry);
                    if (value != null) {
                        // The NativeBuffer cache is an array, so iterate over its elements.
                        int length1 = Array.getLength(value);
                        for (int j = 0; j < length1; j++) {
                            try {
                                Object entry1 = Array.get(value, j);
                                Field ownerField = Class.forName("sun.nio.fs.NativeBuffer").getDeclaredField("owner");
                                ownerField.setAccessible(true);
                                String owner = ownerField.get(entry1).toString();
                                if (owner.contains(fileName)) {
                                    stringBuffer.append(thread.getName());
                                }
                            } catch (Exception exp) {
                                // skip, do nothing
                            }
                        }
                    }
                }
            } catch (Exception exp) {
                // skip, do nothing
            }
        }
    } catch (Exception exp) {
        // skip, do nothing
    }
    return stringBuffer.toString();
}
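
A hypothetical usage of this filtering variant, printing the names of the threads whose cached NativeBuffer references the lock file from the question:

    // Hypothetical usage; prints thread names if a matching NativeBuffer owner is found.
    System.out.println(getThreadsLockFile("db.lck"));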

Answer 2 (score: 0)

I am posting the information I found out from my own investigation, since there was no other answer (thank you very much, Basilevs and tploter); it may help someone in the same situation:

  1. Every time a JVM thread locks a file exclusively, the JVM also locks some Java object; for example, in my case I found:

     • sun.nio.fs.NativeBuffer
     • sun.nio.ch.Util$BufferCache

     So you need to find this locked Java object and analyze it, and then you will find which thread has locked your file.

  2. I am not sure whether this works if a file is merely opened (without an exclusive lock), but I am sure it works if the file is locked exclusively by a thread (using java.nio.channels.FileLock, java.nio.channels.FileChannel and so on).

  3. Unfortunately, with Spark I found many other locked Hive objects (org.apache.hadoop.hive.ql.metadata.Hive, org.apache.hadoop.hive.metastore.ObjectStore, org.apache.hadoop.hive.ql.session.SessionState and so on) at the moment Spark tries to delete db.lck, which means Spark does not close Hive properly before it tries to delete Hive's files. Fortunately, this problem does not occur on Linux (probably because Linux allows deleting files that are still locked).