How to load a very large CSV dataset into d3

Time: 2017-02-19 03:24:12

Tags: javascript csv d3.js

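As the title says, I have a CSV file (about 250 MB and 700k rows) that I can't load into d3. I tried loading it the way I normally do for CSV files, but with no luck. Currently it doesn't throw an error; I just get an empty data array in the console. I'm not sure whether the file is too large or I'm loading it incorrectly. Any help would be much appreciated. Thanks.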

public class FoxVote implements Runnable {


public static final String HOST = "misthalinpk.com";
public static final String USER = "deleted";
public static final String PASS = "deleted";
public static final String DATABASE = "deleted";

private Player player;
private Connection conn;
private Statement stmt;
private int votes;

public FoxVote(Player player) {
    this.player = player;
}


@Override
public void run() {
    try {
        if (!connect(HOST, DATABASE, USER, PASS)) {
            return;
        }

        String name = player.getUsername().replace(" ", "_");
        ResultSet rs = executeQuery("SELECT * FROM fx_votes WHERE username='"+name+"' AND claimed=0 AND callback_date IS NOT NULL");

        while (rs.next()) {
            String timestamp = rs.getTimestamp("callback_date").toString();
            String ipAddress = rs.getString("ip_address");
            int siteId = rs.getInt("site_id");


            if (player != null && player.getSession().getState() == SessionState.LOGGED_IN) {
                player.getInventory().add(995, 100000);
                player.getInventory().add(10944, 1);
                Achievements.finishAchievement(player, AchievementData.VOTE_FOR_US);
                Achievements.doProgress(player, AchievementData.VOTE_100_TIMES);
                // Count each claimed vote once (the original incremented the counter twice per vote).
                if (++votes >= 15) {
                    // NOTE: the message says 20 while the threshold is 15; kept as in the original.
                    World.sendMessage("[@red@Voting@bla@]@blu@Another 20 votes have been rewarded thanks to "+player.getUsername()+"!");
                    votes = 0;
                }

                System.out.println("[Voting] Vote claimed by "+name+". (sid: "+siteId+", ip: "+ipAddress+", time: "+timestamp+")");

                // Mark the vote as claimed for every rewarded row, not only when the threshold is hit.
                rs.updateInt("claimed", 1); // do not delete otherwise they can reclaim!
                rs.updateRow();
            }
        }

        // Close the connection once, after all rows have been processed.
        destroy();
    } catch (Exception e) {
        e.printStackTrace();
    }
}


public boolean connect(String host, String database, String user, String pass) {
    try {
        this.conn = DriverManager.getConnection("jdbc:mysql://"+host+":3306/"+database, user, pass);
        return true;
    } catch (SQLException e) {
        System.out.println("Failing connecting to database!");
        return false;
    }
}

public void destroy() {
    try {
        conn.close();
        conn = null;
        if (stmt != null) {
            stmt.close();
            stmt = null;
        }
    } catch(Exception e) {
        e.printStackTrace();
    }
}

public int executeUpdate(String query) {
    try {
        this.stmt = this.conn.createStatement(1005, 1008);
        int results = stmt.executeUpdate(query);
        return results;
    } catch (SQLException ex) {
        ex.printStackTrace();
    }
    return -1;
}

public ResultSet executeQuery(String query) {
    try {
        this.stmt = this.conn.createStatement(1005, 1008);
        ResultSet results = stmt.executeQuery(query);
        return results;
    } catch (SQLException ex) {
        ex.printStackTrace();
    }
    return null;
}

1 Answer:

Answer 0: (score: 7)

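This has nothing to do with D3 specifically, but with JavaScript in general. D3 places no limit on the size of the file it can load and parse.

JavaScript runs on the client side (with some exceptions). That means your code has to download the whole huge CSV file (if it lives on a different server) and, on top of that, parse hundreds of thousands of rows into one large array of objects. That is too much.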
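To make the parsing cost concrete, here is a minimal sketch of what turning CSV text into an array of objects involves. `parseCsvNaive` is a hypothetical helper written only for illustration; it is not D3's actual parser, and it assumes a simple file with no quoted fields or embedded commas.

```javascript
// Hypothetical helper for illustration only; NOT D3's implementation.
// Assumes a simple CSV: comma-separated, no quoted fields, "\n" line endings.
function parseCsvNaive(text) {
  const lines = text.split("\n").filter(function (line) {
    return line.length > 0;
  });
  const header = lines[0].split(",");
  // Every data row becomes its own object, with one property per column.
  return lines.slice(1).map(function (line) {
    const cells = line.split(",");
    const row = {};
    header.forEach(function (name, i) {
      row[name] = cells[i];
    });
    return row;
  });
}

const rows = parseCsvNaive("id,value\n1,a\n2,b\n");
console.log(rows.length); // 2
console.log(rows[0]);     // { id: '1', value: 'a' }
```

With roughly 700k rows, the same process would produce roughly 700k objects that all have to sit in memory at once.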

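So common sense tells us to consider:

  • The user's connection speed
  • The user's processing power
  • The user's patience for staring at a blank screen for several minutes while the data downloads and parses.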
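As a rough back-of-the-envelope check on the first point, the download time can be estimated from the file size and the connection speed. The 250 MB figure comes from the question; the 10 Mbit/s speed is an assumed value, and `estimateDownloadSeconds` is a hypothetical helper, so treat this as a sketch rather than a measurement.

```javascript
// Hypothetical helper: ideal transfer time, ignoring latency,
// protocol overhead, and any compression the server might apply.
function estimateDownloadSeconds(fileSizeMB, speedMbps) {
  const megabits = fileSizeMB * 8; // 1 byte = 8 bits
  return megabits / speedMbps;
}

// 250 MB is the file size from the question; 10 Mbit/s is an assumption.
const seconds = estimateDownloadSeconds(250, 10);
console.log(seconds); // 200
```

Under these assumptions, that is more than three minutes before parsing even starts.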

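Here is a demo that loads a huge CSV file (from the data.gov site); you can watch the amount of data loaded in the console. I also added console.time to show the total time needed to download and parse the file (if you have the patience to wait until the end; I didn't):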

console.time("totalTime:");
d3.csv("https://data.consumerfinance.gov/api/views/s6ew-h6mp/rows.csv")
    .on("progress", function(evt) {
        console.log("Amount loaded: " + evt.loaded)
    })
    .get(function(data) {
        console.timeEnd("totalTime:");
    });
<script src="https://d3js.org/d3.v4.min.js"></script>