Last Updated: February 25, 2016 · iheanyi

Remember, HTTPClients aren't just for APIs!

This past semester I took a Mobile Application Development course focused on Android. We worked with APIs a little, but the instructor only showed us how to use HttpClient (on Android, at least) to send GET requests to a server, not POST requests. For example:

DefaultHttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet(URLToFetch);

HttpResponse rp = client.execute(request);

if (rp.getStatusLine().getStatusCode() == HttpStatus.SC_OK) {
    HttpEntity result = rp.getEntity();
    pageHTML = EntityUtils.toString(result);
}

When you are working with an API, this returns the JSON or XML you need. But what if the website does not have an API? I was developing a mobile class search application that listed every class my school offers, and rather than recreating the school's database, I decided to submit the site's existing search form programmatically.

You have to understand the underlying structure of a form before you can submit a POST request to it. This is easy to do with Developer Tools in Google Chrome: inspect the form to find each field's name and the values its options accept. Here is how I solved the problem (on a mobile device, at least):

private HttpClient createHttpClient()
{
    //Toast.makeText(context, "HTTP Client Created", Toast.LENGTH_SHORT);
    HttpParams params = new BasicHttpParams();
    HttpProtocolParams.setVersion(params, HttpVersion.HTTP_1_1);
    HttpProtocolParams.setContentCharset(params, HTTP.DEFAULT_CONTENT_CHARSET);
    HttpProtocolParams.setUseExpectContinue(params, true);

    SchemeRegistry schReg = new SchemeRegistry();
    schReg.register(new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));
    schReg.register(new Scheme("https", SSLSocketFactory.getSocketFactory(), 443));
    ClientConnectionManager conMgr = new ThreadSafeClientConnManager(params, schReg);

    return new DefaultHttpClient(conMgr, params);
}

private String getPage( String deptKey ) {
    String pageHTML = "NO HTML FOUND";


    try {
        HttpClient client = createHttpClient();
        HttpPost post = new HttpPost(CLASS_PAGE);

        List<NameValuePair> pparams = new ArrayList<NameValuePair>();
        pparams.add(new BasicNameValuePair("TERM", "201220"));
        pparams.add(new BasicNameValuePair("DIVS", "A"));
        pparams.add(new BasicNameValuePair("CAMPUS", "M"));
        pparams.add(new BasicNameValuePair("SUBJ", deptKey));
        pparams.add(new BasicNameValuePair("ATTR","0ANY"));
        pparams.add(new BasicNameValuePair("CREDIT", "A"));
        //pparams.add(new BasicNameValuePair("SUBJ", "CSE"));
        UrlEncodedFormEntity ent;
        ent = new UrlEncodedFormEntity(pparams);

        post.setEntity(ent);
        HttpResponse rp = client.execute(post);
        if (rp.getStatusLine().getStatusCode() == HttpStatus.SC_OK) {
            HttpEntity result = rp.getEntity();
            pageHTML = EntityUtils.toString(result);
        }

    }
    catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } 

    return pageHTML;

}

Now for the explanation. createHttpClient builds an HttpClient with both the http and https schemes registered, which allows posting to a site served over SSL (https://), such as my school's. getPage then sends a POST request to CLASS_PAGE (in this case, was.nd.edu/reg/srch/ClassSearchServlet). The name/value pairs in pparams are the values the form needs, which I found by examining the form structure. For an HttpPost to work on a form, every form field must have a valid value. You can double-check this in Chrome Developer Tools: open the Network tab, submit the form in the browser, and look at the form data sent with the request. After that, you just create a new UrlEncodedFormEntity from the pairs, set it as the entity of the post, and execute the post through the client. Voila! You have the response HTML and are free to parse the results with whichever library you choose (JSoup, in my case, because it is lightweight).
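To make the form-encoding step more concrete, here is a minimal sketch, using only the standard library, of roughly what a URL-encoded form body looks like on the wire. The class FormBodySketch and its encode helper are hypothetical names and not part of the original app. The field names come from getPage above, "CSE" is the value from its commented-out line, and UTF-8 is an assumption (UrlEncodedFormEntity may use a different default charset, though the result is identical for plain ASCII values like these).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not from the original app: builds an
// application/x-www-form-urlencoded body by hand.
class FormBodySketch {

    // Joins name=value pairs with '&', percent-encoding both sides as UTF-8.
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(field.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("TERM", "201220");
        fields.put("SUBJ", "CSE");
        fields.put("ATTR", "0ANY");
        System.out.println(encode(fields));
        // prints TERM=201220&SUBJ=CSE&ATTR=0ANY
    }
}
```

Comparing a string like this with the form data shown in the browser's Network tab is a quick way to spot a missing or misspelled field.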

This same logic should work in other languages, such as Python and Ruby, for those of you who may be stuck in a similar situation. A classmate of mine did the same thing in PHP with great success and without any external libraries. Happy POSTing!
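In the same spirit of avoiding external libraries, the POST can also be sketched in Java with nothing but java.net. This is a rough illustration and not the code the app used: the class PlainFormPost, its method names, and the assumption that the response is UTF-8 are all mine.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch, not the original app's code: POSTs an already
// form-encoded body using only java.net classes.
class PlainFormPost {

    // Sends encodedBody to url and returns the response body,
    // or null when the server does not answer 200 OK.
    static String post(String url, String encodedBody) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            byte[] payload = encodedBody.getBytes("UTF-8");
            conn.setRequestMethod("POST");
            conn.setDoOutput(true); // we are going to write a request body
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            conn.setFixedLengthStreamingMode(payload.length);

            OutputStream out = conn.getOutputStream();
            try {
                out.write(payload);
            } finally {
                out.close();
            }

            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                return null;
            }
            InputStream in = conn.getInputStream();
            try {
                return readAll(in);
            } finally {
                in.close();
            }
        } finally {
            conn.disconnect();
        }
    }

    // Reads the whole stream into a String, assuming a UTF-8 response.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return buffer.toString("UTF-8");
    }
}
```

A call would look something like PlainFormPost.post(CLASS_PAGE, "TERM=201220&SUBJ=CSE"), run off the main thread on Android. Returning null for a non-200 status is just one possible design; the getPage method above falls back to a placeholder string instead.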

2 Responses

Are Mechanize for Ruby/Perl and HtmlUnit for Java unnecessarily big for you in those cases?

over 1 year ago

@djurczak In the case of Mechanize for Ruby/Python/Perl, size doesn't necessarily matter, because I do not believe those libraries are that large. However, if you were using them in a web application, repeated calls into an external library could add up, and a plain HTTP POST would be a way to use fewer resources and speed up the script or application.

HtmlUnit for Java, on the other hand, was the reason I started using plain HTTP POSTs instead of external libraries. HtmlUnit is somewhere around 4-10 MB (the exact size escapes me), and either way, a library that large is impractical for an Android application. Nobody wants to add 4 MB to their APK when that space could go to other resources, which is why an HttpPost is the better choice. Also, HtmlUnit and Selenium don't work on Android anyway, and even when people have gotten them to work, they take up valuable space that the user needs.

over 1 year ago