I am using Java to read a large JSON file and POST each line to my localhost, where a PHP script decodes the JSON into an object and stores parts of that object in a MySQL database.
This is a very slow process.
How can I optimize it?
<?php
error_reporting(0);
@ini_set('display_errors', 0);

$json = $_POST['data'];
if (!empty($json)) {
    $obj = json_decode($json);

    $con = mysqli_connect("localhost", "root", "", "cohort");
    // Check connection
    if (mysqli_connect_errno()) {
        echo "Failed to connect to MySQL: " . mysqli_connect_error();
    }

    // Free-text fields are escaped so quotes in names, descriptions or tweet
    // text cannot break the INSERT statements (or inject SQL).
    $user_id = $obj->interaction->author->id;
    $user_link = mysqli_real_escape_string($con, $obj->interaction->author->link);
    $name = mysqli_real_escape_string($con, $obj->interaction->author->name);
    $user_name = mysqli_real_escape_string($con, $obj->interaction->author->username);
    $user_gender = $obj->demographic->gender;
    $user_language = $obj->twitter->lang;
    $user_image = mysqli_real_escape_string($con, $obj->interaction->author->avatar);
    $user_klout = $obj->klout->score;
    $user_confidence = $obj->language->confidence;
    $user_desc = mysqli_real_escape_string($con, $obj->twitter->user->description);
    $user_timezone = mysqli_real_escape_string($con, $obj->twitter->user->time_zone);
    $user_tweet_count = $obj->twitter->user->statuses_count;
    $user_followers_count = $obj->twitter->user->followers_count;
    $user_friends_count = $obj->twitter->user->friends_count;
    $user_location = mysqli_real_escape_string($con, $obj->twitter->user->location);
    $user_created_at = $obj->twitter->user->created_at;
    $tweet_id = $obj->twitter->id;
    $tweet_text = mysqli_real_escape_string($con, $obj->interaction->content);
    $tweet_link = $obj->interaction->link;
    $tweet_created_at = $obj->interaction->created_at;
    $tweet_location = $user_location; // same field as above, already escaped
    //$tweet_geo_lat = $obj->interaction->geo->latitude;
    //$tweet_geo_long = $obj->interaction->geo->longitude;

    $sql = "INSERT INTO tweeters (user_id, screen_name, name, profile_image_url, location, url,
                                  description, created_at, followers_count,
                                  friends_count, statuses_count, time_zone,
                                  last_update, klout, confidence, gender)
            VALUES ('$user_id', '$user_name', '$name',
                    '$user_image', '$user_location', '$user_link',
                    '$user_desc', '$user_created_at', '$user_followers_count',
                    '$user_friends_count', '$user_tweet_count', '$user_timezone',
                    '', '$user_klout', '$user_confidence', '$user_gender')";
    if (!mysqli_query($con, $sql)) {
        //die('Error: ' . mysqli_error($con));
    }

    $sql = "INSERT INTO search_tweets (tweet_id, tweet_text, created_at_date,
                                       created_at_time, location, geo_lat,
                                       geo_long, user_id, is_rt)
            VALUES ('$tweet_id', '$tweet_text', '',
                    '$tweet_created_at', '$tweet_location', '',
                    '', '$user_id', '')";
    if (!mysqli_query($con, $sql)) {
        //die('Error: ' . mysqli_error($con));
    }

    mysqli_close($con);
    echo json_encode(array("id" => $user_id));
}
?>
Java:
String inputfile = "D:\\Datasift\\Tweets.json"; // source file name
File file = new File(inputfile);
int count = 0;
System.out.println("Storing file in stack");

// ReverseLineInputStream is a custom stream that yields the file's lines last-to-first.
BufferedReader in = new BufferedReader(new InputStreamReader(new ReverseLineInputStream(file)));
while (true) {
    String line = in.readLine();
    if (line == null) {
        break;
    }
    count++;

    // POST this line to the PHP script.
    URL url1 = new URL("http://localhost/json/");
    URLConnection urlConn = url1.openConnection();
    urlConn.setDoInput(true);    // we want to read the response
    urlConn.setDoOutput(true);   // we want to send a request body
    urlConn.setUseCaches(false); // no caching, we want the real thing
    urlConn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

    DataOutputStream printout = new DataOutputStream(urlConn.getOutputStream());
    String content = "data=" + URLEncoder.encode(line, "UTF-8");
    printout.writeBytes(content);
    printout.flush();
    printout.close();

    // Read (and discard) the response so the request completes.
    BufferedReader input = new BufferedReader(new InputStreamReader(urlConn.getInputStream()));
    String str;
    while ((str = input.readLine()) != null) {
        //System.out.println(str);
    }
    input.close();
}
in.close();
System.out.println("Lines in the file: " + count);
Answer 0 (score: 0)
I don't mean to sidestep the question in any way, but why not read the file with PHP in the first place? That removes the HTTP request for every single line.
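A minimal sketch of that idea, assuming the Tweets.json dump has one JSON document per line and reusing the cohort database from the question (only two fields shown; the rest would be extracted the same way):

<?php
// Sketch: read the dump directly with PHP and insert into MySQL,
// skipping the per-line HTTP request entirely.
$con = mysqli_connect("localhost", "root", "", "cohort");
$fh = fopen("Tweets.json", "r"); // hypothetical local path to the dump
while (($line = fgets($fh)) !== false) {
    $obj = json_decode($line);
    if ($obj === null) {
        continue; // skip blank or malformed lines
    }
    $user_id = mysqli_real_escape_string($con, $obj->interaction->author->id);
    $user_name = mysqli_real_escape_string($con, $obj->interaction->author->username);
    // ... extract and escape the remaining fields as in the original script ...
    mysqli_query($con, "INSERT INTO tweeters (user_id, screen_name)
                        VALUES ('$user_id', '$user_name')");
}
fclose($fh);
mysqli_close($con);
?>

Run once from the command line, this opens a single MySQL connection instead of one HTTP request and one database connection per line.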
Answer 1 (score: 0)
If you are repeating this process many times, one approach is to batch the rows into multi-row inserts.
So, instead of executing INSERT INTO table (field, field) VALUES (value, value) over and over in a loop, you would set $ins = "INSERT INTO table (field, field) VALUES "; then, inside your foreach loop (or each time the code is called), build up an array with $ins_values[] = "(escaped_value, escaped_value)"; and finally run the single query $ins . implode(',', $ins_values), as in the sketch below.
MySQL runs much faster this way, but be aware that MySQL caps the size of a single statement via max_allowed_packet, which may need to be raised accordingly.
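A rough sketch of that batching approach, assuming $con is an open mysqli connection, $lines holds the raw JSON lines, and using a shortened column list from the question's tweeters table; the chunk size of 1000 is an arbitrary example:

<?php
// Sketch: accumulate escaped rows and send them in one multi-row INSERT.
$ins = "INSERT INTO tweeters (user_id, screen_name, name) VALUES ";
$ins_values = array();
foreach ($lines as $line) { // $lines: the raw JSON lines, however you read them
    $obj = json_decode($line);
    $user_id = mysqli_real_escape_string($con, $obj->interaction->author->id);
    $user_name = mysqli_real_escape_string($con, $obj->interaction->author->username);
    $name = mysqli_real_escape_string($con, $obj->interaction->author->name);
    $ins_values[] = "('$user_id', '$user_name', '$name')";
    // Flush in chunks so the query stays under max_allowed_packet.
    if (count($ins_values) >= 1000) {
        mysqli_query($con, $ins . implode(',', $ins_values));
        $ins_values = array();
    }
}
if (!empty($ins_values)) {
    mysqli_query($con, $ins . implode(',', $ins_values)); // insert the remainder
}
?>

One INSERT of 1000 rows costs roughly one round-trip and one statement parse instead of a thousand, which is where most of the speed-up comes from.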
Hope that helps, and that I have understood your question correctly.