Following certain links with Beautiful Soup

I've been having a lot of trouble with this problem. I think I understand the approach, but my head now has a dent in it from banging it on the desk.

What I need to do is write a program that scrapes a webpage with Beautiful Soup, picks out the link at a certain position (anywhere from the 3rd to the 20th link down the page), follows that link, and then finds the link at the same position on the new page, over and over, for an unspecified number of times (I'm keeping it under 20 for explanation purposes). I need to find the last link reached after however many repetitions.

I've got my program, but I can't get past the 2nd iteration! I did find a way a couple hours into it and got my answer, but it was an infinite loop, and that's not going to help me learn.

Let's say this is what I have to do:

Find the link at position 7 (the 7th link on the first page). Follow that link. Repeat this process 5 times. The answer is the last name from the link that you retrieve.

I've got a way to retrieve the name, just having trouble figuring out a loop!

I also was a little overzealous trying to find another post about this for an hour. There are many similar, but not with this exact problem that I have found. Thanks for your time. Here is what I have so far.

from urllib.request import urlopen
from bs4 import BeautifulSoup

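# first page URL (placeholder; urlopen needs the scheme, e.g. http://)
url = 'http://insertwebsitehere.com'
html = urlopen(url).read()
# name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(html, 'html.parser')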

# Retrieve all of the anchor tags
tags = soup('a')

taglist= []
count = 0

for tag in tags:
    name = tag.contents[0]
    newtag = tag.get('href',None)
    #print (newtag)
    # add count? count += 1 , then do something when it reaches a certain count?
    #taglist.append(newtag), this method didnt really work.

I am a new coder, so I'm trying to do this without advanced techniques, and I don't necessarily need the answer, just help.

Answers


I'm working on this same assignment in Python for Informatics on Coursera.

For the loop that repeats a certain amount of times I use:

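for _ in range(c):

Here c comes from count = int(input()), so the user can choose how many times the loop repeats; in our case that is 4 times. Note that input() returns a string, so it has to be converted with int() before range() will accept it.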
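To show how that loop fits around the scraping code from the question, here is a minimal sketch rather than a definitive solution. The names follow_links and fetch_html, and the fetch parameter, are made up for this illustration. It assumes every page visited has at least `position` links. The key idea is that url is reassigned at the end of each pass, so the next pass downloads the page that was just found instead of the first page again.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its raw HTML (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read()


def follow_links(start_url, position, repeats, fetch=fetch_html):
    """Follow the link at `position` (1-based) on each page, `repeats` times.

    Returns the text and the URL of the last link that was picked.
    Raises IndexError if a page has fewer than `position` links.
    """
    url = start_url
    name = None
    for _ in range(repeats):
        soup = BeautifulSoup(fetch(url), 'html.parser')
        tags = soup('a')              # all anchor tags, in page order
        tag = tags[position - 1]      # position 7 is index 6
        name = tag.get_text(strip=True)
        # Reassign url so the next pass starts from the page just found.
        # urljoin also copes with relative hrefs such as "page2.html".
        url = urljoin(url, tag.get('href'))
    return name, url
```

For the task in the question this would be called as follow_links(url, 7, 5), with url being the first page. The count could equally come from int(input()) as described above. Passing a different fetch function makes it possible to try the loop on canned HTML without touching the network.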

