How can I use BeautifulSoup to find all the links in a page pointing to a specific domain?

Answers


Use SoupStrainer with a regular expression on the href attribute:

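from BeautifulSoup import BeautifulSoup, SoupStrainer
import re

# Find all links (doc is the HTML source of the page, as a string)
links = SoupStrainer('a')
all_links = [tag for tag in BeautifulSoup(doc, parseOnlyThese=links)]

# Find only the links whose href contains example.com/ (note the escaped dot)
linkstodomain = SoupStrainer('a', href=re.compile(r'example\.com/'))
domain_links = [tag for tag in BeautifulSoup(doc, parseOnlyThese=linkstodomain)]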

Edit: Modified the example from the official documentation.
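
The snippet above uses the BeautifulSoup 3 import style. As a rough sketch of the same idea with the newer bs4 package (an assumption; the answer itself does not mention bs4), the following compares the hostname of each href instead of using a regular expression, so that a URL such as notexample.com is not matched by accident. The helper names points_to_domain and links_to_domain and the sample HTML are made up for illustration.

```python
from urllib.parse import urlparse


def points_to_domain(href, domain):
    """Return True if href's hostname is domain or one of its subdomains."""
    host = urlparse(href).hostname or ''
    return host == domain or host.endswith('.' + domain)


def links_to_domain(html, domain):
    """Return the href of every <a> tag in html that points to domain."""
    # Imported here so points_to_domain stays usable without bs4 installed.
    from bs4 import BeautifulSoup, SoupStrainer

    # parse_only plays the role of parseOnlyThese in BeautifulSoup 3:
    # only <a> tags are kept while parsing.
    soup = BeautifulSoup(html, 'html.parser', parse_only=SoupStrainer('a'))
    return [a['href'] for a in soup.find_all('a', href=True)
            if points_to_domain(a['href'], domain)]


if __name__ == '__main__':
    # Hypothetical sample page, for illustration only.
    sample = ('<p><a href="http://example.com/a">A</a> '
              '<a href="http://notexample.com/b">B</a> '
              '<a href="/local">C</a> '
              '<a href="https://www.example.com/c">D</a></p>')
    print(links_to_domain(sample, 'example.com'))
```

Relative links such as /local have no hostname and are skipped here; if they should count as pointing to the page's own domain, they would need to be resolved against the page URL first (for example with urllib.parse.urljoin).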

