How can I programmatically scrape a web page and "click" a JavaScript button?

I'm trying to scrape a web page for work that has hundreds of table rows, each with a checkbox. To submit the form I need to click a button that calls a JavaScript function. The button's HTML looks like this:

<a onclick="JavaScript: return verifyChecked('Resend the selected request for various approvals?');"
id="_ctl0_cphMain_lbtnReapprove"
title="Click a single request to send to relevant managers for reapproval."
class="lnkDBD" href="javascript:__doPostBack('_ctl0$cphMain$lbtnReapprove','')"
style="border-color:#0077D4;border-width:1px;border-style:Solid;text-decoration: overline;">&nbsp;Resend&nbsp;</a>

I know that with libraries like Beautiful Soup you can submit forms by adding POST data to the URL, but how could I check a checkbox and "click" this JavaScript button? The website is a help desk of sorts, and for this particular button we can only check one request at a time, which takes far too long when hundreds of requests need to be resubmitted.

When I check the checkbox, a message also pops up asking me to confirm that I want to do this. I don't know whether that will affect submitting it programmatically.

EDIT: I forgot to include the doPostBack method.

<script type="text/javascript"> 
<!--
var theForm = document.forms['aspnetForm'];
if (!theForm) {
    theForm = document.aspnetForm;
}
function __doPostBack(eventTarget, eventArgument) {
    if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
        theForm.__EVENTTARGET.value = eventTarget;
        theForm.__EVENTARGUMENT.value = eventArgument;
        theForm.submit();
    }
}
// -->
</script>

Answers


Get Firefox and Firebug, open Firebug, load the page, and look in the console tab to see what it's actually sending to the server.

Then just repeat what it's sending using whatever tool you like.
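As a rough illustration of "repeating what it's sending", here is a minimal Python sketch that uses only the standard library. It rests on several assumptions: the form is a typical ASP.NET form whose hidden inputs (such as `__VIEWSTATE`) must be posted back unchanged, a ticked checkbox is sent with the browser default value `on`, and the checkbox name `chkRequest42`, the sample HTML, and the URL are hypothetical placeholders. The real field names should be taken from the request you observe in Firebug. Because the confirmation message is produced by JavaScript in the browser, it would not be part of the replayed request.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request


class HiddenFieldParser(HTMLParser):
    """Collect the name/value pair of every <input type="hidden"> on the page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback_fields(page_html, event_target, checkbox_name, event_argument=""):
    """Mimic __doPostBack: copy hidden fields, then set the event target and argument."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    fields = dict(parser.fields)
    fields["__EVENTTARGET"] = event_target
    fields["__EVENTARGUMENT"] = event_argument
    fields[checkbox_name] = "on"  # assumption: browser default value for a ticked checkbox
    return fields


def build_postback_request(url, fields):
    """Build (but do not send) the form-encoded POST request."""
    body = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers)


# Hypothetical page fragment standing in for the real help desk page.
SAMPLE_PAGE = """
<form name="aspnetForm" method="post" action="Requests.aspx">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="checkbox" name="chkRequest42" />
</form>
"""

fields = build_postback_fields(SAMPLE_PAGE, "_ctl0$cphMain$lbtnReapprove", "chkRequest42")
request = build_postback_request("https://helpdesk.example/Requests.aspx", fields)
# To actually send it: urllib.request.urlopen(request)
```

In practice the site will probably also expect the login session cookie, so the same cookies would need to accompany both the page fetch and the POST.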


You're probably better off using a browser automation library like Selenium for something like this.
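A minimal sketch of what that could look like with Selenium's Python bindings follows. It assumes `verifyChecked` shows a standard JavaScript confirm dialog, and the checkbox id and URL in the usage comments are hypothetical. Only the button id `_ctl0_cphMain_lbtnReapprove` comes from the question. If a dialog also appears right after ticking the checkbox, it would need to be accepted in the same way.

```python
def resend_one_request(driver, checkbox_id, button_id="_ctl0_cphMain_lbtnReapprove"):
    """Tick one checkbox, click the Resend link, and accept the confirm dialog.

    `driver` is expected to be a Selenium WebDriver with the page already loaded.
    """
    # "id" is the string value of selenium.webdriver.common.by.By.ID
    checkbox = driver.find_element("id", checkbox_id)
    if not checkbox.is_selected():
        checkbox.click()
    driver.find_element("id", button_id).click()
    # Assumption: verifyChecked opens a confirm() dialog that must be accepted.
    driver.switch_to.alert.accept()


# Usage (requires Selenium and a browser driver; names below are hypothetical):
# from selenium import webdriver
# driver = webdriver.Firefox()
# driver.get("https://helpdesk.example/Requests.aspx")
# resend_one_request(driver, "hypothetical_checkbox_id")
```

Since the site only allows one request per submission, a loop would call this once per checkbox and reload the page between calls.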


Try iMacros. For simple browser automation it's excellent. You can record your sessions, and it generates code based on them. If you need more logic, the documentation is simple and covers standard programming constructs, so you can get going fast. You can call outside languages and scripts as well. A few projects I've used this for, for example:

1) Collecting business leads: a site had a list of all its retail stores but would not show them all at once, only the ones close to a ZIP code entered by the user. I put a large number of ZIP codes in a spreadsheet, and when run, the macro went through each one from the CSV and scraped the store info into a CSV file. Every 5 minutes it opened a VPN program on the PC and changed the IP. It took 20 minutes to make.

If you're set on programming it, then OK, but I find this the best way: it's easier to debug if the site changes, the iMacros "code" is very easy, and you can call other scripts and files with ease. The Firefox add-on is the most stable and is free.

