5 – M5 SC 12 Searching The Parse Tree Part 1 V1

Hello and welcome back. In this notebook, we will begin to explore how to search the parse tree created by BeautifulSoup. Now, BeautifulSoup provides a number of methods for searching the tree, but we will only cover the find_all method in these lessons. If you’re interested, you can learn about other search methods in the BeautifulSoup documentation. The find_all method searches an entire document for the given filter. This filter can be a string containing the HTML or XML tag name, a tag attribute, or even a regular expression. In this notebook, we will see examples of these cases. So, let’s get started.

Let’s begin by using the find_all method to find tags. To do this, all we have to do is pass the name of the tag as a string to the find_all method. For example, if we wanted to find all the h2 tags in our sample HTML file, all we have to do is pass the string h2 to the find_all method. If we run this code, we can see that the find_all method returns a list with all the h2 tags that it found, in this case only two, because our sample HTML file only has two h2 tags. Since lists are iterables, we can loop through the h2 list and print each tag as we have done here.

We can also search for more than one tag at a time by passing a list to the find_all method. Let’s see how this works. Let’s suppose we wanted to search for all the h2 and p tags in our sample HTML file. Instead of using two statements, one for the h2 tag and one for the p tag, we can just pass a list with the strings h2 and p to the find_all method. If we run this code, we can see that we get all the h2 and p tags.

Now, as we saw before, HTML and XML tags can have attributes. The find_all method also allows us to pass arguments, such as the attribute of a tag, so that we can search the entire document for the exact tag that we’re looking for. Let’s see an example. Let’s recall that in our sample HTML file we have two h2 tags, this one and this one.
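To make the tag searches concrete, here is a minimal sketch. The SAMPLE_HTML string is a hypothetical stand-in for the sample HTML file used in the lesson: only the tag names and id values (intro, hub, know) come from the narration, and the text inside the tags is made up. It also assumes the built-in html.parser as the parser.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the lesson's sample HTML file.
# Only the tag names and ids mirror the narration; the text is invented.
SAMPLE_HTML = """
<html>
  <body>
    <h1 id="intro">Main heading</h1>
    <h2 id="hub">First section</h2>
    <p>First paragraph.</p>
    <h2 id="know">Second section</h2>
    <p>Second paragraph.</p>
  </body>
</html>
"""

# Build the parse tree with Python's built-in HTML parser.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Pass a tag name as a string to find every matching tag.
h2_tags = soup.find_all("h2")
for tag in h2_tags:
    print(tag)

# Pass a list of tag names to match any of them in a single call.
h2_and_p_tags = soup.find_all(["h2", "p"])
for tag in h2_and_p_tags:
    print(tag)
```

In this sketch the two h2 tags have the ids hub and know, and the list search returns the h2 and p tags in the order they appear in the document.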
We can see that the first h2 tag has the attribute id equal to hub and the second h2 tag has the attribute id equal to know. Let’s suppose we wanted to search our sample HTML file only for the h2 tags that have the attribute id equal to know. In our case, the only h2 tag with that attribute is the second tag right here. To do this, we can include id equals know in the find_all method as shown here. If we run this code, we can see that we only get the h2 tag that has the attribute id equal to know, just as we wanted.

Another property of the find_all method is that it allows us to search for tag attributes directly. For example, let’s suppose we wanted to search our sample HTML file for all the tags that have the attribute id equal to intro. We can do this simply by passing id equals intro to the find_all method as shown here. If we run this code, we can see that we only get one match, since the h1 tag is the only tag in our sample HTML file that has the attribute id equal to intro.
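The two attribute searches above could look roughly like this. As before, SAMPLE_HTML is a hypothetical stand-in for the lesson's sample file (the ids intro, hub, and know come from the narration, the text does not), and html.parser is an assumed parser choice.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the lesson's sample HTML file.
SAMPLE_HTML = """
<html>
  <body>
    <h1 id="intro">Main heading</h1>
    <h2 id="hub">First section</h2>
    <p>First paragraph.</p>
    <h2 id="know">Second section</h2>
    <p>Second paragraph.</p>
  </body>
</html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Combine a tag name with an attribute filter:
# only h2 tags whose id is "know" are returned.
know_h2_tags = soup.find_all("h2", id="know")
print(know_h2_tags)

# Search by attribute alone: any tag whose id is "intro" matches,
# regardless of its tag name.
intro_tags = soup.find_all(id="intro")
print(intro_tags)
```

Each search returns a list with a single match here: the second h2 tag for id equal to know, and the h1 tag for id equal to intro.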
