Batch metadata requests for files

Time: 2018-03-25 19:52:55

Tags: android kotlin filesystems

It basically boils down to this: if I have 4000 files in a directory and each File.isDirectory() call takes about 1 ms, processing the directory takes about 4 s (too slow [ 1 ]).

I don't have complete knowledge of the filesystem, but I think isDirectory() could be batched for all the elements in the directory (reading one chunk of data and then separating out each file's metadata). C/C++ code is acceptable (it can be run through the JNI) but should be a last resort.

I have found FileVisitor, but it doesn't seem to be the best solution to my problem, as I don't need to visit the entire file tree. I also found BasicFileAttributeView, but it seems to have the same problem. This is a related question, but none of its answers provides a significant solution.

[ 1 ]: Because this is not the only work I do, the total ends up being around 17 s.

Edit: Code:

internal fun workFrom(unit: ProcessUnit<D>) {
    launch {
        var somethingAddedToPreload = false
        val file = File(unit.first)

        ....

        //Load children folders
        file.listFiles(FileFilter {
            it.isDirectory
        })?.forEach {
            getPreloadMapMutex().withLock {
                if (getPreloadMap()[it.path] == null) {
                    val subfiles = it.list() ?: arrayOf()
                    for (filename in subfiles) {
                        addToProcess(it.path, ProcessUnit(it.path + DIVIDER + filename, unit.second))
                    }

                    getPreloadMap()[it.path] = PreloadedFolder(subfiles.size)
                    if (getPreloadMap().size > PRELOADED_MAP_MAXIMUM) cleanOldEntries()
                    getDeleteQueue().add(it.path)

                    somethingAddedToPreload = somethingAddedToPreload || subfiles.isNotEmpty()
                }
            }
        }

        ...

        if(somethingAddedToPreload) {
            work()
        }
    }
}

private fun addToProcess(path: String, unit: ProcessUnit<D>) {
    val f: () -> Pair<String, FetcherFunction<D>> = { load(path, unit) }
    preloadList.add(f)
}

private suspend fun work() {
    preloadListMutex.withLock {
        preloadList.forEach {
            launch {
                val (path, data) = it.invoke()

                if (FilePreloader.DEBUG) {
                    Log.d("FilePreloader.Processor", "Loading from $path: $data")
                }

                val list = getPreloadMap()[path]
                        ?: throw IllegalStateException("A list has been deleted before elements were added. We are VERY out of memory!")
                list.add(data)
            }
        }
        preloadList.clear()
    }
}

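PS: I will remove the coroutines in work before implementing an optimization; complete code is here.

2 answers:

Answer 0: (score: 5)

You can run ls -F and check whether each entry in the output is a directory by looking at its last character; directories end with /. E.g.: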

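// Pass the arguments as an array so that paths containing spaces are not split.
val cmd = arrayOf("ls", "-F", myFile.absolutePath)
val process = Runtime.getRuntime().exec(cmd)
val files = process.inputStream
        .bufferedReader()
        .use { it.readText() }
        .lines()

for (fileName in files) {
    // ls -F appends "/" to the names of directories.
    val isDir = fileName.endsWith("/")
}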

I ran a quick test on an emulator with 4000 files and 4000 directories, and the whole process took 150 ms.

Answer 1: (score: 3)

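Years ago, I had to write a JNI interface to opendir() / readdir() / closedir() / rewinddir() to address a similar performance problem. It is a bit of a hack, as it uses a jlong to hold the DIR * pointer from opendir() and pass it to subsequent readdir() and closedir() calls, but it was probably several orders of magnitude faster than Java's listFiles() on large directories.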

It does require a JNI library, but you may find it useful. I have stripped customer-identifying information from the code, so it is not exactly as delivered and may contain typos:

#include <jni.h>
#include <dirent.h>
#include <string.h>

/*
 * Class:     path_to_jni_ReadDir
 * Method:    opendir
 * Signature: (Ljava/lang/String;)J
 */
JNIEXPORT jlong JNICALL Java_path_to_jni_ReadDir_opendir
  (JNIEnv *env, jclass cl, jstring jdirname )
{
    const char *cdirname;
    jboolean copy;

    jlong dirp;

    if ( NULL == jdirname )
    {
        return( ( jlong ) NULL );
    }

    cdirname= ( env )->GetStringUTFChars( jdirname , &copy );
    if ( NULL == cdirname )
    {
        return( ( jlong ) NULL );
    }

    if ( 0 == ::strlen( cdirname ) )
    {
        ( env )->ReleaseStringUTFChars( jdirname , cdirname );
        return( ( jlong ) NULL );
    }

    dirp = ( jlong ) ::opendir( cdirname );

    ( env )->ReleaseStringUTFChars( jdirname , cdirname );

    return( dirp );
}

/*
 * Class:     path_to_jni_ReadDir
 * Method:    readdir
 * Signature: (J)Ljava/lang/String;
 */
JNIEXPORT jstring JNICALL Java_path_to_jni_ReadDir_readdir
  (JNIEnv *env, jclass cl, jlong dirp )
{
    struct dirent *dentp;
    struct dirent *dentbuffer;
    char buffer[ 8192 ];

    jstring jfilename;

    int rc;

    dentbuffer = (  struct dirent * ) buffer;
    dentp = NULL;

    rc = ::readdir_r( ( DIR * ) dirp, dentbuffer, &dentp );
    // readdir_r() returns 0 on success; dentp is NULL at the end of the directory
    if ( ( 0 != rc ) || ( NULL == dentp ) )
    {
        return( NULL );
    }

    jfilename = env->NewStringUTF( dentp->d_name );

    return( jfilename );
}

/*
 * Class:     path_to_jni_ReadDir
 * Method:    closedir
 * Signature: (J)I
 */
JNIEXPORT jint JNICALL Java_path_to_jni_ReadDir_closedir
  (JNIEnv *env, jclass cl, jlong dirp )
{
    jint rc;

    rc = ::closedir( ( DIR * ) dirp );

    return( rc );
}

/*
 * Class:     path_to_jni_ReadDir
 * Method:    rewinddir
 * Signature: (J)V
 */
JNIEXPORT void JNICALL Java_path_to_jni_ReadDir_rewinddir
  (JNIEnv *env, jclass cl, jlong dirp )
{
    ::rewinddir( ( DIR * ) dirp );

    return;
}

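Given the Android dirent structure:

struct dirent {
    uint64_t d_ino;
    int64_t d_off;
    unsigned short d_reclen;
    unsigned char d_type;
    char d_name[256];
};

you can modify the JNI readdir method to add a filter based on the d_type field, which contains one of the following values:

#define DT_UNKNOWN 0
#define DT_FIFO 1
#define DT_CHR 2
#define DT_DIR 4
#define DT_BLK 6
#define DT_REG 8
#define DT_LNK 10
#define DT_SOCK 12
#define DT_WHT 14

For example, if you are looking for directories, you can add a loop that keeps calling ::readdir_r() until it returns NULL or the d_type field is DT_DIR.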
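A loop that filters on d_type in this way might look like the following sketch. It is not the original code from this answer: next_directory and list_directories are hypothetical helper names, plain C++ is used instead of JNI so it can be tried on its own, and readdir() is swapped in for readdir_r(), which newer C libraries mark as deprecated. It also assumes you want to skip "." and ".." and to fall back to fstatat() when the filesystem reports DT_UNKNOWN.

```cpp
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>

#include <string>
#include <vector>

// Hypothetical helper: returns the name of the next subdirectory in an open
// directory stream, or an empty string once the stream is exhausted.
std::string next_directory(DIR *dirp)
{
    for (;;)
    {
        // Assumption: readdir() replaces readdir_r() from the code above.
        struct dirent *dentp = ::readdir(dirp);
        if (dentp == nullptr)
        {
            return std::string();   // end of directory (or read error)
        }

        bool is_dir = (dentp->d_type == DT_DIR);
        if (dentp->d_type == DT_UNKNOWN)
        {
            // Some filesystems do not fill in d_type; ask the kernel instead.
            struct stat st;
            is_dir = ::fstatat(::dirfd(dirp), dentp->d_name, &st,
                               AT_SYMLINK_NOFOLLOW) == 0
                     && S_ISDIR(st.st_mode);
        }

        std::string name(dentp->d_name);
        if (is_dir && name != "." && name != "..")
        {
            return name;
        }
        // Not a directory: keep reading.
    }
}

// Hypothetical helper: collects the names of all subdirectories of path in a
// single pass over the directory. Returns an empty list if path cannot be opened.
std::vector<std::string> list_directories(const std::string &path)
{
    std::vector<std::string> result;
    DIR *dirp = ::opendir(path.c_str());
    if (dirp == nullptr)
    {
        return result;
    }
    for (std::string name = next_directory(dirp); !name.empty();
         name = next_directory(dirp))
    {
        result.push_back(name);
    }
    ::closedir(dirp);
    return result;
}
```

In the JNI version, the body of next_directory would go inside the native readdir method, with the resulting name passed to NewStringUTF. The Java side would then receive only directory names and would not need to call isDirectory() on each entry.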