22nd October, 2006

Python IE Automation - Thorough Tutorial   

Posted in hacking | by evilbitz |

I haven’t seen a lot of info on this topic, so I thought I should post something about this:

Python IE automation is extremely easy using the InternetExplorer.Application COM object. With this COM object you can automate IE to do all kinds of stuff, like scripting any login process, downloading files or creating some underground bots ;)

Here is how to acquire an interface to InternetExplorer.Application:

>>> from win32com.client import Dispatch
>>> ie = Dispatch("InternetExplorer.Application")
>>> ie.visible = 1
>>>
>>> # navigate to your favourite website
>>> ie.navigate(website_address)
>>>

Now your browser should navigate to the website address that you specified. Once the browser has finished loading the page, you can start processing the results…

This is how you wait for the page to finish loading:

>>> from time import sleep
>>> while ie.ReadyState != 4:  # 4 means the page has completely loaded
...     sleep(1)
...
>>>
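
One catch with the loop above: if the page never finishes loading, it spins forever. Here is a minimal sketch of the same polling idea with a timeout. The names wait_until_ready, timeout, poll and FakeBrowser are made up for illustration; the only thing assumed about the browser object is the ReadyState property used above.

```python
import time

READYSTATE_COMPLETE = 4  # the value the loop above waits for


def wait_until_ready(browser, timeout=30.0, poll=0.5):
    """Poll browser.ReadyState until it is 4; return False on timeout."""
    deadline = time.time() + timeout
    while browser.ReadyState != READYSTATE_COMPLETE:
        if time.time() >= deadline:
            return False
        time.sleep(poll)
    return True


class FakeBrowser(object):
    """Hypothetical stand-in for the IE object, so the helper can be tried anywhere."""

    def __init__(self, polls_until_ready):
        self._remaining = polls_until_ready

    @property
    def ReadyState(self):
        # Report "loading" (1) for a few polls, then "complete" (4).
        if self._remaining > 0:
            self._remaining -= 1
            return 1
        return READYSTATE_COMPLETE
```

With the real thing you would call wait_until_ready(ie) right after ie.navigate(...) and bail out if it returns False.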

When the page is done loading, you can get an interface to the document object. This is the same document object that JavaScript and VBScript code running in the page works with.

This gives you complete DOM control (domination!) over the page that you last navigated to.

So let’s see how we can do some nice things with it:

>>> ie.navigate("http://search.msn.com/")
>>> while ie.ReadyState != 4:  # let the page load before touching the DOM
...     sleep(1)
...
>>> ie.document.getElementById("q").value = "SinglePageMarketing"
>>> ie.document.getElementById("srch_btn").click()
>>>

OK, now what about parsing the results?
We can do this with a DOM-like approach, or we can parse the text ourselves… I chose the latter method because it’s easier.

>>> result = ie.document.body.innerHTML
>>> len(result)
5619
>>>

Note that the result text is Unicode; to convert it to Latin-1, use the encode method:

>>> result = result.encode('latin-1', 'ignore')
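
To see what 'ignore' actually does, here is a tiny sketch with a made-up string: characters that have no Latin-1 equivalent are silently dropped, while 'replace' leaves a question mark in their place. Keep in mind that on Python 3 the encoded result is a bytes object, so any regexp you run on it must be a bytes pattern too (or you can skip the encoding and search the text directly).

```python
# Made-up sample text: e-acute exists in Latin-1, the euro sign does not.
text = u"caf\u00e9 costs 5\u20ac"

latin = text.encode("latin-1", "ignore")    # silently drops the euro sign
marked = text.encode("latin-1", "replace")  # keeps a "?" where it was
```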

OK, now let’s get a list of all the links that were found by the search engine:

>>> import re
>>> re.findall("your favourite regexp", result)
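
If you don't have a favourite regexp yet, here is one possible sketch. The pattern, the find_links name and the sample HTML are my own assumptions, not anything the search page guarantees: it grabs the value of every quoted href attribute and will miss unquoted ones. It expects text; if you encoded the result to bytes on Python 3, the pattern has to be bytes as well.

```python
import re

# Hypothetical pattern: the value of every single- or double-quoted href attribute.
HREF_RE = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def find_links(html):
    """Return all quoted href values found in the given HTML text."""
    return HREF_RE.findall(html)


# Made-up sample markup, just to exercise the pattern.
sample = '<a href="http://example.com/a">A</a> <A HREF=\'/b\'>B</A> <a name="x">no link</a>'
links = find_links(sample)
```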

Well, that’s it! Now you know how to do the basics… it’s up to you to build your tools upon it!



There is currently one response to “Python IE Automation - Thorough Tutorial”


  1. On December 27th, 2007, Antonio Xavier said:

    Hi

    Is it possible to view IE’s security certificate via a Python script? If so, can you please post some sample code?

    Thanks & Rgds

    Antonio Xavier
