Last Updated: February 25, 2016 · yko

Using PhantomJS from Perl

Basically, what you need is to pass a chunk of structured data from Perl to PhantomJS and let it do its job.
To achieve this, you serialize the data you want to use in the PhantomJS job with a JSON library and hand it to a predefined PhantomJS script.
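To make the serialization step concrete, here is a minimal sketch of the line that ends up at the top of the generated script. It assumes the core JSON::PP module instead of the JSON module used in the full program further down, so that it runs without installing anything; input_data_prefix is a hypothetical helper name, and canonical() is only there to keep the key order stable between runs.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::PP;

# canonical() sorts hash keys so the generated line is stable between runs
my $json = JSON::PP->new->canonical;

# Hypothetical helper: build the JavaScript line that defines inputData
sub input_data_prefix {
    my ($data) = @_;
    return "var inputData = " . $json->encode($data) . ";\n";
}

print input_data_prefix(["http://google.com", "http://twitter.com"]);
# prints: var inputData = ["http://google.com","http://twitter.com"];

print input_data_prefix({ foo => "bar", depth => 2 });
# prints: var inputData = {"depth":2,"foo":"bar"};
```

Because JSON produced this way is also a valid JavaScript literal for ordinary data like this, the template can use inputData directly without parsing anything.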

First of all, you prepare a small JS template to do the work on the PhantomJS side:

/* 
  A (JavaScript) template for the PhantomJS job.
  It assumes the inputData variable is defined above 
*/

console.log("[PhantomJS] Here I shall do something useful with the following data:");
console.log(inputData);
console.log("[PhantomJS]");

/*
    Capture Screenshots of the page(s)
    https://github.com/ariya/phantomjs/wiki/Screen-Capture

    Capture network traffic of the page(s)
    https://github.com/ariya/phantomjs/wiki/Network-Monitoring

    Whatever useful you can find at the official wiki
    https://github.com/ariya/phantomjs/wiki/_pages
*/

phantom.exit();

Then you run it with PhantomJS from your Perl program, something like this:

#!/usr/bin/env perl
use strict;
use warnings;

use File::Temp ();    # only the object interface (File::Temp->new) is used
use JSON;

my $script_content = join '', <DATA>;

run_phantom(["http://google.com", "http://twitter.com"]);
# or ...
run_phantom({foo => "bar"});

sub run_phantom {
    my $data = shift;

    my $phantomjs = $ENV{PHANTOMJS} || 'phantomjs';

    # Generate temporary script file (created in the current directory)
    my $tempfile = File::Temp->new(
        TEMPLATE => "phantomjs-script-XXXXXX",
        SUFFIX   => '.js',
        # File::Temp->new takes UNLINK (CLEANUP is the tempdir option).
        # You might want to set this to 0 for debugging purposes
        UNLINK   => 1
    );
    print $tempfile "var inputData = " . encode_json($data) . ";\n";
    print $tempfile $script_content;
    close $tempfile;

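    # Run it (the list form of system() bypasses the shell)
    system($phantomjs, $tempfile->filename);
    if ($? == -1) {
        # The binary could not be started at all
        die "-- Failed to run $phantomjs: $!\n";
    }
    elsif ($?) {
        my $exitcode = $? >> 8;
        die "-- PhantomJS exited with code $exitcode\n";
    }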
    print "-- PhantomJS succeeded\n";
}

print "-- Success\n";

__DATA__

/* 
  A (JavaScript) template for the PhantomJS job.
  It assumes the inputData variable is defined above 
*/

console.log("[PhantomJS] Here I shall do something useful with the following data:");
console.log(inputData);
console.log("[PhantomJS]");

/*
    Capture Screenshots of the page(s)
    https://github.com/ariya/phantomjs/wiki/Screen-Capture

    Capture network traffic of the page(s)
    https://github.com/ariya/phantomjs/wiki/Network-Monitoring

    Whatever useful you can find at the official wiki
    https://github.com/ariya/phantomjs/wiki/_pages
*/

phantom.exit();

Just make sure that the phantomjs binary is in your PATH or is defined by the 'PHANTOMJS' environment variable.
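If you would rather fail early with a clear message than find out when system() fails, the lookup can be checked up front. This is only a sketch: find_phantomjs is a hypothetical helper, it assumes a Unix-like system where the binary is named plain "phantomjs" (on Windows you would look for phantomjs.exe), and it trusts the PHANTOMJS variable without checking it.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the binary to use, or undef if none is found
sub find_phantomjs {
    # An explicit PHANTOMJS setting wins and is returned unchecked
    return $ENV{PHANTOMJS} if $ENV{PHANTOMJS};

    # Otherwise walk the directories listed in PATH
    for my $dir (File::Spec->path) {
        my $candidate = File::Spec->catfile($dir, 'phantomjs');
        return $candidate if -f $candidate && -x _;
    }
    return;
}

my $bin = find_phantomjs();
print defined $bin
    ? "Using $bin\n"
    : "phantomjs not found: add it to PATH or set PHANTOMJS\n";
```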

And it just works, without compiling a whole binding against the PhantomJS source code and so on.

In most cases what you need is to process a list of <something> and get a corresponding list of results, whatever it is: screenshots, traffic monitoring, looking for external, dumping data structures, or running your automated tests with a specific set of data. If you want to use PhantomJS from Perl, you already have a list of items to process.

So the idea is to pass all that data to JS land and process it there with all the features PhantomJS provides.
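The program above only prints what PhantomJS writes. To get a list of results back into Perl, one option is to have the JS side print a single marked JSON line and read it from a pipe. The sketch below is one way to do that, not the only one: the "RESULT:" marker, run_phantom_capture and extract_results are all names invented here, JSON::PP stands in for the JSON module, and the sample output at the bottom is hand-written for illustration so the parsing step can be tried without PhantomJS installed.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp ();
use JSON::PP qw(encode_json decode_json);

# JS side: open each URL in turn, then print one marked JSON line
my $template = <<'JS';
var results = [];
function next(i) {
    if (i >= inputData.length) {
        console.log("RESULT:" + JSON.stringify(results));
        phantom.exit();
        return;
    }
    var page = require('webpage').create();
    page.open(inputData[i], function (status) {
        results.push({ url: inputData[i], status: status });
        page.close();
        next(i + 1);
    });
}
next(0);
JS

# Pull the marked line out of whatever PhantomJS printed and decode it
sub extract_results {
    my ($output) = @_;
    my ($json) = $output =~ /^RESULT:(.*)$/m
        or die "No RESULT line in PhantomJS output\n";
    return decode_json($json);
}

# Write the script, run PhantomJS through a pipe, return the decoded list
sub run_phantom_capture {
    my ($data) = @_;
    my $phantomjs = $ENV{PHANTOMJS} || 'phantomjs';

    my $tmp = File::Temp->new(SUFFIX => '.js', UNLINK => 1);
    print {$tmp} "var inputData = " . encode_json($data) . ";\n", $template;
    close $tmp;

    open(my $out, '-|', $phantomjs, $tmp->filename)
        or die "Cannot start $phantomjs: $!\n";
    my $output = do { local $/; <$out> } // '';
    close $out or die "PhantomJS failed (wait status $?)\n";

    return extract_results($output);
}

# With PhantomJS installed:
#   my $results = run_phantom_capture(["http://google.com"]);

# Hand-written sample output, used here only to exercise the parsing step
my $sample = qq{some log line\nRESULT:[{"url":"http://google.com","status":"success"}]\n};
my $results = extract_results($sample);
printf "%s -> %s\n", $_->{url}, $_->{status} for @$results;
```

The marker keeps the result separate from anything else the script or the pages log to the console, so ordinary console.log debugging on the JS side does not break the Perl side.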

Sure, in some rare cases you really do need to access the PhantomJS API directly from your program because of performance concerns. But that is a different story.

Links to the documentation for everything mentioned above:

PhantomJS:

https://github.com/ariya/phantomjs/wiki
https://github.com/ariya/phantomjs/wiki/Screen-Capture
https://github.com/ariya/phantomjs/wiki/Network-Monitoring

Perl JSON library:
https://metacpan.org/module/JSON

Perl 'system()' function:
http://perldoc.perl.org/functions/system.html

2 Responses

thanks for sharing, I recently found Rob Hammond's solution using MoJo, PhantomJS, and Selenium::Remote::Driver.

http://blogs.perl.org/users/robhammond/2013/02/web-scraping-with-perl-phantomjs.html

TMTOWTDI

--dave

http://dave.thehorners.com/tech-talk/programming/94-perl-programming

over 1 year ago ·

Thank you dave, I'll take a look at it.

over 1 year ago ·