Following certain links with Beautiful Soup
I've been having a lot of trouble with this problem. I think I understand the work, but my head now has a dent in it from banging it on the desk.
What I need to do is write a program that scrapes a webpage with Beautiful Soup, picks a certain link (anywhere from the 3rd to the 20th link down the page), follows that 3rd (or 20th, or whatever number) link, finds the 3rd link on the new page, and repeats this an unspecified number of times (I'm keeping it under 20 for explanation purposes). I need to find the last (3rd) link after however many searches.
I've got my program, but I can't get past the 2nd iteration! I did find a way a couple hours into it and got my answer, but it was an infinite loop, and that's not going to help me learn.
Let's say this is what I have to do:
Find the link at position 7 (the 7th link on the first page). Follow that link. Repeat this process 5 times. The answer is the last name from the link that you retrieve.
I've got a way to retrieve the name, just having trouble figuring out a loop!
I also was a little overzealous trying to find another post about this for an hour. There are many similar, but not with this exact problem that I have found. Thanks for your time. Here is what I have so far.
    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    # first page url
    url = 'insertwebsitehere.com'
    html = urlopen(url).read()
    soup = BeautifulSoup(html)

    # Retrieve all of the anchor tags
    tags = soup('a')
    taglist = []
    count = 0
    for tag in tags:
        name = tag.contents
        newtag = tag.get('href', None)
        #print (newtag)
        # add count? count += 1 , then do something when it reaches a certain count?
        #taglist.append(newtag), this method didnt really work.
I am a new coder, so I'm trying to do this without advanced techniques, and I don't necessarily need the answer, just help.
I'm doing this same assignment in Python for Informatics via Coursera.
For the loop that repeats a certain number of times I use:
    for _ in range(c):
Here c comes from count = input(), converted with int() because input() returns a string, so the user can choose how many times the loop repeats; in our case that is 4 times.
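To show how that loop can tie in with the link lookup, here is a minimal sketch, not necessarily how the assignment expects it to be written. The names follow_links, fetch_links and get_links are made up for illustration, and the sketch assumes every link you count has an href. The key point is that url is reassigned inside the loop, so each pass fetches the page found on the previous pass instead of re-reading the first page.

```python
from urllib.request import urlopen
from urllib.parse import urljoin


def fetch_links(url):
    """Return a list of (href, text) pairs for the anchor tags at url."""
    # Imported here so the loop below can be tried without bs4 installed.
    from bs4 import BeautifulSoup

    html = urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: only anchors that actually carry an href are counted.
    return [(urljoin(url, a["href"]), a.get_text())
            for a in soup.find_all("a", href=True)]


def follow_links(start_url, position, repeats, get_links=fetch_links):
    """Follow the link at `position` (1-based) `repeats` times in a row."""
    url = start_url
    name = None
    for _ in range(repeats):
        links = get_links(url)
        # "7th link" is 1-based, but list indexes start at 0.
        url, name = links[position - 1]
    return url, name
```

With the numbers from the question this would be called as follow_links(url, 7, 5), where url is the full address of the first page including the http:// part. The returned name is the text of the last link followed. The get_links parameter is only there so the loop can be tried on fake data without touching the network.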