Converting a JS HTTP POST to a Java version

Time: 2019-03-29 18:41:20

Tags: java http-post instagram instagram-api

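I want to scrape my Instagram followers and found the following site: https://webrobots.io/scrape-instagram-followers/

It describes a way to fetch followers by making AJAX calls from the Instagram site in the browser.

It requires the user to log in first, so it presumably relies on cookies. I have a program that logs in to Instagram with Selenium and stores the user's cookies, and I have confirmed that those cookies can be reused to log in directly the next time (no username or password needed).

I want to do the same thing in Java and have the following code (note that I log the cookies):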

import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.ProtocolException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import lombok.extern.slf4j.Slf4j;
import org.openqa.selenium.Cookie;

@Slf4j
public class HttpPost {

    private static HttpURLConnection con;

    public static void main(String[] args) throws MalformedURLException,
        ProtocolException, IOException, DataAccessException {

        CustomerDaoImpl customerDao = new CustomerDaoImpl();
        Customer customer1 = customerDao.get(Customer.class, "loggedinuser");
        Set<Cookie> cookies = customer1.obtainCookies();

        StringBuilder stringBuffer = new StringBuilder();
        cookies.stream().forEach(c -> {
            final String cookieLine;
            if (stringBuffer.toString().equals("")) {
                cookieLine = String.format("%s=%s;", c.getName(), c.getValue());
            } else {
                cookieLine = String.format(" %s=%s;", c.getName(), c.getValue());
            }
            stringBuffer.append(cookieLine);
        });

        String cookieInOneLine = stringBuffer.toString().substring(0, stringBuffer.toString().length() - 1);

        log.info("Cookie is {}", cookieInOneLine);

        String user_id = "insusername";

        String request = "q=ig_user(" + user_id + ")+%7B%0A++followed_by.first(20)" +
                         "+%7B%0A++++count%2C%0A++++page_info+%7B%0A++++++end_cursor%2C%0A++++++has_next_page%0A" +
                         "++++%7D%2C%0A++++nodes+%7B%0A++++++id%2C%0A++++++is_verified%2C%0A++++++followed_by_viewer" +
                         "%2C%0A++++++requested_by_viewer%2C%0A++++++full_name%2C%0A++++++profile_pic_url%2C%0A" +
                         "++++++username%0A++++%7D%0A++%7D%0A%7D%0A&ref=relationships%3A%3Afollow_list";


        String url = "https://www.instagram.com/query/";
        //String urlParameters = "name=Jack&occupation=programmer";
        byte[] postData = request.getBytes(StandardCharsets.UTF_8);

        try {

            URL myurl = new URL(url);
            con = (HttpURLConnection) myurl.openConnection();

            con.setDoOutput(true);
            con.setRequestMethod("POST");
//            con.setRequestProperty("User-Agent", "Java client");
            con.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            con.setRequestProperty("x-csrftoken", "faHW1D3nRMmnL72Ilu7bMPuHSrG1dyUS");
            con.setRequestProperty("x-instagram-ajax", "1");
            con.setRequestProperty("Cookie", cookieInOneLine);

            try (DataOutputStream wr = new DataOutputStream(con.getOutputStream())) {
                wr.write(postData);
            }

            StringBuilder content;

            try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream()))) {

                String line;
                content = new StringBuilder();

                while ((line = in.readLine()) != null) {
                    content.append(line);
                    content.append(System.lineSeparator());
                }
            }

            System.out.println(content.toString());

        } finally {

            con.disconnect();
        }
    }
}
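The request body in the code above is percent-encoded by hand, which makes separator and escaping mistakes easy to miss. As a minimal sketch (not the method from the linked article), the body could instead be built from decoded field values with `java.net.URLEncoder`. The class name `FormBody`, the helper `encode`, and the shortened query text are assumptions made for illustration; `URLEncoder.encode(String, Charset)` requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: builds an application/x-www-form-urlencoded body
// from plain (decoded) field names and values.
final class FormBody {

    static String encode(Map<String, String> fields) {
        return fields.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "="
                + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        // Shortened stand-in for the query in the question, written unencoded.
        String query = "ig_user(insusername) {\n  followed_by.first(20) {\n    count\n  }\n}\n";

        // LinkedHashMap keeps the fields in insertion order: q first, then ref.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("q", query);
        fields.put("ref", "relationships::follow_list");

        System.out.println(encode(fields));
    }
}
```

`URLEncoder` also escapes the parentheses (`%28`, `%29`), which the hand-written string leaves literal; both forms decode to the same value on the server side.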

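However, I get an HTTP 405, so it seems I am not allowed to make this POST request? Is this because I did not set the Cookie header correctly?

I get the following output from Java: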
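On the header question: the code hardcodes the `x-csrftoken` value separately from the cookies, so the two can drift apart once the session is refreshed. The sketch below builds the `Cookie` header and reads the token from the same cookie set. It uses a plain `Map<String, String>` as a stand-in for the Selenium `Set<Cookie>`, and `CookieHeader`, `join`, and `csrfToken` are hypothetical names. Whether this has any bearing on the 405 is not established here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: derives the Cookie header and the CSRF token
// from one cookie collection so that the two values always agree.
final class CookieHeader {

    // Joins name=value pairs with "; ", the separator used in a Cookie request header.
    static String join(Map<String, String> cookies) {
        return cookies.entrySet().stream()
            .map(e -> e.getKey() + "=" + e.getValue())
            .collect(Collectors.joining("; "));
    }

    // Returns the value of the csrftoken cookie, failing loudly if it is absent.
    static String csrfToken(Map<String, String> cookies) {
        String token = cookies.get("csrftoken");
        if (token == null) {
            throw new IllegalStateException("no csrftoken cookie present");
        }
        return token;
    }

    public static void main(String[] args) {
        // Placeholder values, not real session data.
        Map<String, String> cookies = new LinkedHashMap<>();
        cookies.put("sessionid", "SESSION_PLACEHOLDER");
        cookies.put("csrftoken", "TOKEN_PLACEHOLDER");

        System.out.println("Cookie: " + join(cookies));
        System.out.println("x-csrftoken: " + csrfToken(cookies));
    }
}
```

With Selenium cookies, the map could be filled from `c.getName()` and `c.getValue()` as the question's loop already does.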

2019-03-29 13:58:43.452 [main] INFO  test.HttpPost - Cookie is urlgen="{\"72.21.196.67\": 16509}:1h5jBQ:8TQTlVwg7SszekH_d0e2U5-pfso"; ds_user_id=10083971860; mid=XI8T1gAEAAFdXjl7c-veIodiYANe; shbts=1552880603.6637821; sessionid=10083971860%3A38gLQoQB6TLaGB%3A5; csrftoken=faHW1D3nRMmnL72Ilu7bMPuHSrG1dyUS; shbid=717; rur=PRN

Exception in thread "main" java.io.IOException: Server returned HTTP response code: 405 for URL: https://www.instagram.com/query/
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1894)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:263)
    at insbot.HttpPost.main(HttpPost.java:81)
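The exception above comes from `getInputStream()`, which throws on a 4xx/5xx status and hides the response body. As a diagnostic sketch, the status code, the `Allow` header (which a server may send with a 405 to list the permitted methods), and the error body can be read through `getResponseCode()`, `getHeaderField()`, and `getErrorStream()`. To stay self-contained, the example talks to a local stub server from the JDK's `com.sun.net.httpserver` package rather than to Instagram; `ResponseInspector`, `describe`, and `demo` are hypothetical names, and the stub's response is invented for the demonstration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class ResponseInspector {

    // Summarises a response without throwing on 4xx/5xx statuses.
    static String describe(HttpURLConnection con) throws IOException {
        int status = con.getResponseCode();
        String text = "";
        // For error statuses the body is only available via getErrorStream().
        try (InputStream body = status >= 400 ? con.getErrorStream() : con.getInputStream()) {
            if (body != null) {
                text = new String(body.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
        return status + " Allow=" + con.getHeaderField("Allow") + " body=" + text;
    }

    // Starts a local stub that answers every request with 405, then POSTs to it.
    static String demo() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/query/", exchange -> {
                byte[] msg = "method not allowed".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Allow", "GET");
                exchange.sendResponseHeaders(405, msg.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(msg);
                }
            });
            server.start();
            try {
                URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/query/");
                HttpURLConnection con = (HttpURLConnection) url.openConnection();
                con.setRequestMethod("POST");
                con.setDoOutput(true);
                con.getOutputStream().close(); // send an empty POST body
                return describe(con);
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Calling `describe(con)` in place of the `getInputStream()` read in the question's code would show what the real server returns alongside the 405.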

0 answers:

No answers