Question

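I have a large number of .JPG files (> 100 000), and I want to extract the longitude and latitude from each one. My current setup gets the job done, but I'd like to speed up the process. Here is what I have: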

public static void digFolder( File[] files ) {
    totalCount += files.length;       // Total files to be processed
    JPGFile jpgFile = new JPGFile( ); // Holds the extracted longitude and latitude
    int progress;                     // How many files have been processed

    for (File file : files) {
        if (file.isDirectory( )) {
            // Updates a JLabel with the current directory
            label.setText( "Currently working on " + file.getName( ) );
            digFolder( file.listFiles( ) );    // Recursive call
        } else {
            // Sets the path to the .jpg file
            jpgFile.setPath( file.getAbsolutePath( ) );

            if (!jpgFile.initialize( )) continue; // Code for .initialize( ) is below

            // Grabs the longitude and latitude
            String record = jpgFile.getLongitude( ) + ", " + jpgFile.getLatitude( );

            // A BufferedWriter writes to the .CSV with a buffer size of 8192
            output.writeRecord( record );

            // Updates the progress of a JProgressBar, and sets the text
            progress = ( int )Math.round( ( ++processedCount / ( double )totalCount ) * 100 );
            progressBar.setValue( progress );
            progressBar.setString( progress + "% (" + processedCount + "/" + totalCount + ")" );
        }
    }
}

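Below is the code for .initialize( ) in the JPGFile class. It is what pulls the coordinates out of the EXIF data in the .JPG. You can then use _location.getLongitude( ); and _location.getLatitude( ); to get the longitude and latitude. The library I'm using is metadata-extractor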

public boolean initialize( ) {
    try {
        Metadata metadata = ImageMetadataReader.readMetadata( _file );
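        GpsDirectory gpsDirectory = metadata.getDirectory( GpsDirectory.class );
        if (gpsDirectory == null) return false; // File has no GPS directory
        _location = gpsDirectory.getGeoLocation( );

        return _location != null; // Only succeed if coordinates were found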
    } catch (Exception e) {
        e.printStackTrace( );
    }

    return false;
}
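
For context on what the extraction step has to do: EXIF stores each GPS coordinate as three rationals (degrees, minutes, seconds) plus a hemisphere reference (N/S/E/W), so a decimal longitude or latitude has to be computed from them. The following is a minimal sketch of that conversion, not the library's own code; the class name GpsDms and the method toDecimalDegrees are hypothetical.

```java
final class GpsDms {
    private GpsDms() { }

    /**
     * Converts a degrees/minutes/seconds triple to signed decimal degrees.
     * ref is the EXIF hemisphere reference: "N", "S", "E" or "W".
     * South and west are returned as negative values.
     */
    static double toDecimalDegrees(double degrees, double minutes, double seconds, String ref) {
        double value = degrees + minutes / 60.0 + seconds / 3600.0;
        boolean negative = "S".equalsIgnoreCase(ref) || "W".equalsIgnoreCase(ref);
        return negative ? -value : value;
    }
}
```

For example, GpsDms.toDecimalDegrees( 40, 26, 46, "N" ) gives roughly 40.4461, and the same call with "S" gives the negated value.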

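When I checked the run time, it took 437 seconds to write the data from 33,000 .JPG files to the .CSV file (if I run it again with the exact same files it drops to 8 seconds, but I assume that's because the files are already cached in memory. It would be great if the first run took only 8 seconds!). metadata-extractor takes a long time to get the data, and it also seems like overkill (20 packages containing more than 100 classes).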
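
One way to check whether the slow first run is dominated by disk reads rather than by EXIF parsing is to time a pass that only reads the start of each file and parses nothing. The sketch below is an illustration under stated assumptions, not part of the setup above: HeadReader, HEAD_BYTES, readHead and timeHeadReadsMillis are hypothetical names, and the 64 KiB limit assumes the EXIF data sits in an APP1 segment near the start of the JPEG (a segment's length field is 16 bits, so one segment cannot exceed 64 KiB).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class HeadReader {
    // Assumption: the EXIF segment lies within the first 64 KiB of the file.
    static final int HEAD_BYTES = 64 * 1024;

    private HeadReader() { }

    /** Reads up to HEAD_BYTES from the stream and returns how many bytes were read. */
    static int readHead(InputStream in) {
        byte[] buffer = new byte[HEAD_BYTES];
        int total = 0;
        try {
            while (total < buffer.length) {
                int n = in.read(buffer, total, buffer.length - total);
                if (n < 0) break; // End of stream before the buffer filled up
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Times a pass that reads only the head of every file; returns milliseconds. */
    static long timeHeadReadsMillis(List<Path> files) {
        long start = System.nanoTime();
        for (Path file : files) {
            try (InputStream in = Files.newInputStream(file)) {
                readHead(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

If this pass over files that have not been read yet takes a time comparable to the full run, the bottleneck is likely the disk rather than the metadata library.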

Is there a simpler way to get the data? Does anyone have any tips for reducing the processing time? Thanks!

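Here is what I have now. I'm now using this library. The only change I made to the library was to make all the required objects and methods static, so that readMetaData can be used without having to create a new object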

public static void walkFileSystem( File[] files ) {
    totalCount += files.length;

    for (int i = 0; i < files.length; i++) {
        if (files[i].getName( ).toLowerCase( ).endsWith( ".jpg" )) { // Also matches .JPG
            try {
                GeoTag current = JpegGeoTagReader.readMetadata( files[i] );

                // Uses a BufferedWriter to write to the file
                writer.writeRecord( current.getLongitude( ) + ", " +
                                    current.getLatitude( ) + ", " +
                                    files[i].getAbsolutePath( ) + "," +
                                    files[i].getName( ) );
            } catch (Exception e) {
                e.printStackTrace( );
            }

            if (++processedCount % 100 == 0) {
                int progress = ( int )Math.round( ( processedCount / ( double )totalCount ) * 100 );

                if (progressBar.getValue( ) != progress) progressBar.setValue( progress );
                progressBar.setString( progress + "%" + " (" + processedCount + "/" + totalCount + ")" );
            }
        } else if (files[i].isDirectory( )) {
            label.setText( "Currently working on " + files[i].getName( ) );
            walkFileSystem( files[i].listFiles( ) );
        }
    }
}

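I've noticed that when it first enters a new folder, the script is relatively fast. But once it has processed more files (around 50% of the folder), it slows to a crawl. Is something being created on every iteration? I don't think indexing should affect the speed.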
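
One way to take directory traversal and the changing totalCount out of the picture while investigating the slowdown is to collect every matching path first and only then loop over the list. This is a minimal sketch using the standard java.nio.file API, not a fix confirmed for this problem; the class JpegCollector and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class JpegCollector {
    private JpegCollector() { }

    /** Case-insensitive extension check, so both photo.jpg and PHOTO.JPG match. */
    static boolean isJpeg(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".jpg") || lower.endsWith(".jpeg");
    }

    /** Walks the tree under root and returns every regular file that looks like a JPEG. */
    static List<Path> collect(Path root) {
        // Files.walk returns a lazy stream that holds open directories, so close it.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> isJpeg(p.getFileName().toString()))
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the full list in hand, the total is known before processing starts, and timing the loop per batch of files can show whether the slowdown comes from reading the files or from something else in the loop.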

Fastest way to extract longitude and latitude from .JPG

0 answers: