5 – M5 SC 5 Word Boundaries V1

We will now learn about another special sequence that you can create using the backslash, namely backslash b (written \b). This special sequence doesn't match a particular set of characters; instead, it matches a position, the word boundary. A word in this context is defined as a sequence of alphanumeric characters, while a boundary is the position between a word character and a whitespace, a non-alphanumeric character, or the beginning or end of the string.

Let's see how this works with an example. Here, we have a sentence that contains the word class with different word boundaries. In the first case, we have class as a standalone word, with a boundary right before it and right after it. Here, we have class with a boundary right before it, and here we have class with a boundary right after it. If we use class as a regular expression, we will find that it matches all three cases.

Now, let's say I only want to match the word class if it is at the beginning of a word, such as in class or classroom. To do this, we can add backslash b before the word class in our regular expression. This regular expression will only find the word class when it has a boundary directly before it. So, if we run this code, we can see that we only get two matches, corresponding to the standalone word class and to the class in classroom, since these are the only two cases where there's a word boundary right before the word class. Also notice that we didn't catch the class in subclass, because there's no word boundary directly before it.

Now, let's say I only want to match the word class if it is at the end of a word, such as in class or subclass. To do this, we can add backslash b after the word class in our regular expression. This regular expression will only find the word class if it has a word boundary directly after it.
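As a minimal sketch of the three patterns discussed so far, the snippet below uses Python's re module. The sentence and the helper find_spans are assumptions made for illustration; the exact sentence used in the video is not given in this transcript.

```python
import re

# Hypothetical sentence (not the one from the video): "class" appears as a
# standalone word, at the start of "classroom", and at the end of "subclass".
sentence = "The class met in the classroom to study a subclass of mammals."


def find_spans(pattern, text):
    """Return the (start, end) span of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]


# Raw strings keep Python from treating "\b" as a backspace character.
plain = find_spans(r"class", sentence)       # no boundary requirement
leading = find_spans(r"\bclass", sentence)   # boundary directly before
trailing = find_spans(r"class\b", sentence)  # boundary directly after

print(len(plain), len(leading), len(trailing))  # prints: 3 2 2
```

Note the r prefix on each pattern: without it, Python would interpret \b inside an ordinary string literal as a backspace rather than passing it through to the regex engine.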
So, if we run this code, we can see that we only get two matches, corresponding to the standalone word class once again and to the class in subclass, since these are the only two cases where there is a word boundary right after the word class. Also notice that we didn't catch the class in classroom, since there's no word boundary directly after it.

Now, let's say I only want to match the word class if it is a standalone word. To do this, we can add backslash b before and after the word class in our regular expression. This regular expression will only find the word class if it has boundaries directly before and after it. So, if we run this code, we can see that we only get one match, corresponding to the standalone word class, since this is the only case where the word class has a boundary right before it and right after it.

As with the other special sequences that we saw before, there is also an uppercase version of backslash b (written \B). This sequence does the opposite of backslash b, only matching when the current position is not a word boundary. Let's see how this works.

Let's use backslash uppercase B to only find the word class when it doesn't have a boundary directly before it. If we run this code, we can see that we only have one match. This is because the class in subclass is the only one that doesn't have a boundary directly before it. Similarly, let's use backslash uppercase B to only find the word class when it doesn't have a boundary directly after it. If we run this code, we can see, again, that we only get one match. This is because the class in classroom is the only one that doesn't have a boundary directly after it.

Finally, let us use backslash uppercase B to only find the word class when it doesn't have a boundary directly before or after it. If we run this code, we can see that we get no matches. This is because every instance of the word class in our sentence has a boundary on at least one side.
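The standalone-word pattern and the three uppercase variants can be sketched the same way. As before, the sentence and the helper count_matches are hypothetical stand-ins, since the transcript does not include the original code.

```python
import re

# Hypothetical sentence (an assumption, not taken from the video).
sentence = "The class met in the classroom to study a subclass of mammals."


def count_matches(pattern, text):
    """Count the non-overlapping matches of pattern in text."""
    return len(re.findall(pattern, text))


standalone = count_matches(r"\bclass\b", sentence)   # boundary on both sides
no_leading = count_matches(r"\Bclass", sentence)     # no boundary before
no_trailing = count_matches(r"class\B", sentence)    # no boundary after
no_boundary = count_matches(r"\Bclass\B", sentence)  # no boundary on either side

print(standalone, no_leading, no_trailing, no_boundary)  # prints: 1 1 1 0
```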
In order to have a match in this case, the word class would have to be in the middle of a word, as in the word declassified. If we run this code on a string that contains declassified, we can see that we actually get one match.
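A short sketch of this last case, assuming a made-up sentence that contains declassified:

```python
import re

# Hypothetical sentence (an assumption): "class" sits in the middle of a word.
sentence = "The report was declassified last year."

# \B on both sides requires word characters directly before and after "class".
matches = list(re.finditer(r"\Bclass\B", sentence))

for m in matches:
    print(m.group(), m.span())  # prints: class (17, 22)
```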
