7 – M5 SC 14 Searching The Parse Tree Part 3 V1

Hello, and welcome back. In this notebook, we will take a look at the recursive argument in the find_all method. But in order to understand how the recursive argument works, we must first take a look at some basic properties of child tags. So, let’s get started. For simplicity, in the following examples, we will use a simpler HTML file named sample2. Here is what the sample2 HTML file looks like. As we can see, the html tag has child tags. For example, the head tag is a direct child of the html tag. Similarly, the title tag is a direct child of the head tag, and finally, the AI for trading string is a direct child of the title tag.
BeautifulSoup provides a lot of different attributes for navigating over a tag’s children. We already saw that we can access child tags as if they were attributes of the parent tag. For example, we can access the string, AI for trading, from our BeautifulSoup object by using .head.title.get_text(). Another way to navigate through a tag’s children is by using the .contents attribute of the tag object. The .contents attribute returns a list with all of the tag’s direct children. Let’s see an example. Let’s suppose we wanted to get a list of all the children of the head tag. We can do this by accessing the head tag first and then using the .contents attribute like we’ve done here. If we run this code, we can see that the .contents attribute has returned a list with all the children of the head tag. Also, by counting how many elements this list has, we can see how many children a parent tag has. In this case, we can see that the head tag contains four children.
Another way to navigate through a tag’s children is through the .children attribute. The .children attribute works the same way as the .contents attribute, except that it doesn’t return a list, but rather, it returns an iterator. For example, here, we have created a for loop that iterates over the head tag’s children by using the .children attribute. Now, let’s take a look at the recursive argument.
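Before moving on, here is a minimal sketch of the three ways of reaching a tag's children that we just described. The transcript doesn't show the contents of sample2, so the html_doc string below is a hypothetical stand-in with the same html, head, title nesting. Its head tag has only one child rather than the four we counted in the notebook.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for sample2.html (the real file is not reproduced here).
html_doc = (
    "<html><head><title>AI for Trading</title></head>"
    "<body><p>Hello</p></body></html>"
)

soup = BeautifulSoup(html_doc, "html.parser")

# Child tags can be accessed as if they were attributes of the parent tag.
title_text = soup.head.title.get_text()
print(title_text)

# .contents returns a list of the head tag's direct children.
head_children = soup.head.contents
print(head_children)
print(len(head_children))

# .children yields the same direct children, but as an iterator.
for child in soup.head.children:
    print(child)
```

Because .contents is a regular list, we can index it or take its length directly, whereas .children is meant to be looped over.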
If we use the find_all method on a tag object like this, then the find_all method will search all of the tag’s children, its children’s children, and so on. However, there will be times where you only want BeautifulSoup to search a tag’s direct children. To do this, we can pass the recursive=False argument to the find_all method. Let’s see how this works. Let’s start by printing out our sample2.html file as we did before to see its structure. We can see that the head tag is directly beneath the html tag. We also see that the title tag is directly beneath the head tag. Even though the title tag is beneath the html tag, it is not directly below it because the head tag is in the way. Now, keeping that structure in mind, if we search the html tag for the title tag using the find_all method, we will definitely find a match because the find_all method is searching in all the descendants of the html tag. Now, let’s restrict ourselves to only look at the direct children of the html tag by using the recursive=False argument. If we run this code, we can see that now, we get no matches because the title tag is not a direct child of the html tag.
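The comparison we just walked through can be sketched as follows. Again, html_doc is a hypothetical stand-in for sample2.html, assumed only to share its html, head, title nesting. The extra search for the head tag is added here for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for sample2.html (the real file is not reproduced here).
html_doc = (
    "<html><head><title>AI for Trading</title></head>"
    "<body><p>Hello</p></body></html>"
)

soup = BeautifulSoup(html_doc, "html.parser")

# Default behavior: search every descendant of the html tag.
all_matches = soup.html.find_all("title")
print(all_matches)

# recursive=False: search only the direct children of the html tag.
direct_matches = soup.html.find_all("title", recursive=False)
print(direct_matches)

# The head tag is a direct child of html, so it is still found.
direct_head = soup.html.find_all("head", recursive=False)
print(direct_head)
```

With this stand-in, the first search finds the title tag, the second returns an empty list, and the third still finds the head tag because it sits directly beneath the html tag.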
