Python mechanize, following link by url and what is the nr parameter?

I'm sorry to have to ask something like this but python's mechanize documentation seems to really be lacking and I can't figure this out.. they only give one example that I can find for following a link:

response1 = br.follow_link(text_regex=r"cheese\s*shop", nr=1)

But I don't want to use a regex, I just want to follow a link based on its url, how would I do this.. also what is "nr" that is used sometimes for following links?

Thanks for any info

Answers


br.follow_link takes either a Link object or a keyword arg (such as nr=0).

br.links() lists all the links.

br.links(url_regex='...') lists all the links whose urls matches the regex.

br.links(text_regex='...') lists all the links whose link text matches the regex.

br.follow_link(nr=num) follows the numth link on the page, with counting starting at 0. It returns a response object (the same kind what br.open(...) returns)

br.find_link(url='...') returns the Link object whose url exactly equals the given url.

br.find_link, br.links, br.follow_link, br.click_link all accept the same keywords. Run help(br.find_link) to see documentation on those keywords.

Edit: If you have a target url that you wish to follow, you could do something like this:

import mechanize
br = mechanize.Browser()
response=br.open("http://www.example.com/")
target_url='http://www.rfc-editor.org/rfc/rfc2606.txt'
for link in br.links():
    print(link)
    # Link(base_url='http://www.example.com/', url='http://www.rfc-editor.org/rfc/rfc2606.txt', text='RFC 2606', tag='a', attrs=[('href', 'http://www.rfc-editor.org/rfc/rfc2606.txt')])
    print(link.url)
    # http://www.rfc-editor.org/rfc/rfc2606.txt
    if link.url == target_url:
        print('match found')
        # match found            
        break

br.follow_link(link)   # link still holds the last value it had in the loop
print(br.geturl())
# http://www.rfc-editor.org/rfc/rfc2606.txt

I found this way to do it, for reference for anyone who doesn't want to use regex:

r = br.open("http://www.somewebsite.com")
br.find_link(url='http://www.somewebsite.com/link1.html')
req = br.click_link(url='http://www.somewebsite.com/link1.html')
br.open(req)
print br.response().read()

Or, it will work by the link's text also:

r = br.open("http://www.somewebsite.com")
br.find_link(text='Click this link')
req = br.click_link(text='Click this link')
br.open(req)
print br.response().read()

From looking at the code, I suspect you want

response1 = br.follow_link(link=LinkObjectToFollow)

nr is the same as documented under the find_link call.

EDIT: In my first cursory glance, I didn't realize "link" wasn't a simple link.


nr is used for where exactly link you follow. if the text or url you has been regex more than one. default is 0 so if you use default you will follow link first regex at all . for example the source :

<a href="link.html>Click this link</a>
<a href="link2.html>Click this link</a>

in this example we need to follow "Click this link" text but we choose link2.html to follow exactly

br.click_link(text='Click this link', nr=1)

by it you will get link2.html response


Need Your Help

Run ShellScript Without Authorization Popup for MAC Application

objective-c macos shell root privileges

I am working on MAC Application in which I want to Remove Helper tool previously installed by my application.

Update only part of a Model, validation problem?

c# asp.net-mvc linq-to-sql model

I have a sign-up page where the user can input his FirstName, LastName, Email and Password, along with other fields.