Python: how to get the text in <li> using BeautifulSoup

Here is the HTML I need to parse:

<ul class="canTouch" data-com="hrefTo,href:'/movie/246286?_v_=yes'">
    <li class='c1'>
        <b>Important text</b>
        <br><em>useless text </em><em style="margin-left: .1rem">useless text</em>
    </li>
    <li class="c2 ">
        <b>938.6</b><br/>
    </li>
    <li class="c3 ">19.7%</li>
    <li class="c4 ">19.6%</li>
    <li class="c5 ">
        <span style="margin-right:-.1rem">8.6%</span>
        <span style="padding-right:.24rem" class="_more"></span>
    </li>
</ul>

There are many ul tags in the file, and here is my code:

for ul in soup.find_all('ul')[3:]:
    lis = ul.find_all('li')
    for elem in lis:
        records.append(elem.text.strip())

I don't want the useless text in the <em> tags of the <li>; I only need the important text in the <b> tag:

<li class='c1'>
    <b>Important text</b>
    <br><em>useless text </em><em style="margin-left: .1rem">useless text</em>
</li>

What should I do?

Answers


The change is trivial. Replace:

records.append(elem.text.strip())

with:

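records.append((elem.b or elem).text.strip())

The "or elem" fallback matters because elem.b is None for an <li> that has no <b> child (c3, c4 and c5 in the sample), and calling .text on None raises AttributeError.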
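To try this on the sample markup from the question, here is a minimal self-contained sketch. It assumes the built-in "html.parser" backend and that an <li> without a <b> child should keep its full text; both are assumptions, not something the question states. The [3:] slice from the question is left out because the sample contains a single <ul>.

```python
from bs4 import BeautifulSoup

# Sample markup from the question (one <ul> only, data-com attribute omitted).
html = """
<ul class="canTouch">
    <li class='c1'>
        <b>Important text</b>
        <br><em>useless text </em><em style="margin-left: .1rem">useless text</em>
    </li>
    <li class="c2 ">
        <b>938.6</b><br/>
    </li>
    <li class="c3 ">19.7%</li>
    <li class="c4 ">19.6%</li>
    <li class="c5 ">
        <span style="margin-right:-.1rem">8.6%</span>
        <span style="padding-right:.24rem" class="_more"></span>
    </li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

records = []
for ul in soup.find_all("ul"):
    for li in ul.find_all("li"):
        bold = li.find("b")
        # Assumption: when there is no <b>, fall back to the whole <li> text.
        source = bold if bold is not None else li
        records.append(source.get_text().strip())

print(records)
# ['Important text', '938.6', '19.7%', '19.6%', '8.6%']
```

The text inside the <em> tags never reaches records, because for the first item only the <b> element is read.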
