1. Introduction:
The re module in Python provides a powerful set of functions for working with regular expressions. Pattern matching with re involves defining a pattern and searching or matching that pattern within a given string.
2. Basic Pattern Matching:
The simplest form of pattern matching involves searching for a specific sequence of characters.
import re
pattern = re.compile(r'hello')
result = pattern.search('hello world')
print(result.group()) # Output: hello
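One detail worth illustrating here is that search returns None when nothing matches, so calling .group() on the result directly would raise an AttributeError. The sketch below shows one way to guard against that; find_first is a hypothetical helper name used only for illustration, not part of the re module.

```python
import re

def find_first(pattern, text):
    """Return the first match of pattern in text, or None if there is none.

    find_first is a hypothetical helper, not a function from the re module.
    """
    match = re.search(pattern, text)
    # re.search returns None on failure, so check before calling .group()
    return match.group() if match else None

print(find_first(r'hello', 'hello world'))    # Output: hello
print(find_first(r'hello', 'goodbye world'))  # Output: None
```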
3. Character Classes and Quantifiers:
Character classes such as \d (any digit) combined with quantifiers such as {2,4} (between two and four repetitions) match more complex patterns.
pattern = re.compile(r'\d{2,4}')
result = pattern.search('The year is 2022')
print(result.group()) # Output: 2022
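To see how the {2,4} quantifier behaves on runs of different lengths, the following sketch applies it to a made-up sample string (the text and variable names are assumptions chosen for illustration). It also shows one possible way to restrict matches to whole numbers using \b word boundaries.

```python
import re

# Sample text invented for this illustration
text = 'ids: 7, 42, 2022, 123456'

# {2,4} is greedy: it takes up to four digits, then scanning resumes after the match
loose = re.findall(r'\d{2,4}', text)
print(loose)  # Output: ['42', '2022', '1234', '56']

# \b word boundaries limit matches to standalone numbers of 2 to 4 digits
whole = re.findall(r'\b\d{2,4}\b', text)
print(whole)  # Output: ['42', '2022']
```

The single digit 7 is skipped in both cases because at least two digits are required.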
4. Anchors for Start and End:
Using the anchors ^ and $ to match the start and end of a string.
pattern = re.compile(r'^\d{3}-\d{2}-\d{4}$')
result = pattern.match('123-45-6789')
print(result.group()) # Output: 123-45-6789
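As a rough comparison, the sketch below contrasts an unanchored search with a whole-string check. It assumes fullmatch is an acceptable alternative to writing ^ and $ by hand; id_pattern and anchored are illustrative names. It also shows one subtlety: $ matches just before a trailing newline, whereas fullmatch does not accept one.

```python
import re

# Illustrative name; same digit layout as the pattern above, without anchors
id_pattern = re.compile(r'\d{3}-\d{2}-\d{4}')

# Without anchors, search finds the pattern anywhere in the string
print(bool(id_pattern.search('id: 123-45-6789 (on file)')))  # Output: True

# fullmatch succeeds only if the entire string matches the pattern
print(bool(id_pattern.fullmatch('123-45-6789')))   # Output: True
print(bool(id_pattern.fullmatch('123-45-67890')))  # Output: False

# $ also matches just before a trailing newline; fullmatch rejects that input
anchored = re.compile(r'^\d{3}-\d{2}-\d{4}$')
print(bool(anchored.match('123-45-6789\n')))        # Output: True
print(bool(id_pattern.fullmatch('123-45-6789\n')))  # Output: False
```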
5. Character Classes and Negation:
Matching any single character from a set or range, or negating the class with a leading ^ to match any character not in it.
pattern = re.compile(r'[aeiou]')
result = pattern.findall('Hello World')
print(result) # Output: ['e', 'o', 'o']
pattern = re.compile(r'[^aeiou]')
result = pattern.findall('Hello World')
print(result) # Output: ['H', 'l', 'l', ' ', 'W', 'r', 'l', 'd']
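The examples above list characters explicitly; ranges work the same way. A small sketch, using an invented sample string and illustrative variable names:

```python
import re

# Sample text invented for this illustration
text = 'Room B12, floor 3'

# [A-Z] is a range: any single uppercase ASCII letter
uppercase = re.findall(r'[A-Z]', text)
print(uppercase)  # Output: ['R', 'B']

# [0-9] matches ASCII digits only (\d also accepts other Unicode digits in str patterns)
digits = re.findall(r'[0-9]+', text)
print(digits)  # Output: ['12', '3']

# [^a-zA-Z0-9] negates the class: anything that is not an ASCII letter or digit
others = re.findall(r'[^a-zA-Z0-9]', text)
print(others)  # Output: [' ', ',', ' ', ' ']
```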
6. Groups and Capturing:
Using groups to capture specific parts of a pattern.
pattern = re.compile(r'(\d+)-(\d+)')
result = pattern.match('123-456')
print(result.group(1)) # Output: 123
print(result.group(2)) # Output: 456
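Beyond group(1) and group(2), a match object exposes the captures in a few other ways. The sketch below reuses the pattern above and also shows a non-capturing group, (?:...), which groups without being numbered; the variable names are illustrative.

```python
import re

both = re.match(r'(\d+)-(\d+)', '123-456')

print(both.group(0))  # Output: 123-456  (group 0 is the whole match)
print(both.groups())  # Output: ('123', '456')
print(both.span(2))   # Output: (4, 7)  (start and end index of group 2)

# (?:...) groups without capturing, so the second part becomes group 1
second_only = re.match(r'(?:\d+)-(\d+)', '123-456')
print(second_only.groups())  # Output: ('456',)
```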
7. Quantifiers and Greedy Matching:
Quantifiers are greedy by default and match as much text as possible; appending ? makes them non-greedy, so they match as little as possible.
pattern = re.compile(r'\d+')
result = pattern.match('123456789')
print(result.group()) # Output: 123456789
pattern = re.compile(r'\d+?')
result = pattern.match('123456789')
print(result.group()) # Output: 1
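The difference tends to matter most when a pattern has a delimiter after the quantifier. The following sketch uses an invented snippet of tag-like text to contrast the two behaviours; it is an illustration only, not a recommendation to parse markup with regular expressions.

```python
import re

# Sample text invented for this illustration
text = '<b>bold</b> and <i>italic</i>'

# Greedy: .+ extends to the last '>' in the string
greedy = re.findall(r'<.+>', text)
print(greedy)  # Output: ['<b>bold</b> and <i>italic</i>']

# Non-greedy: .+? stops at the first '>' that lets the pattern succeed
lazy = re.findall(r'<.+?>', text)
print(lazy)  # Output: ['<b>', '</b>', '<i>', '</i>']
```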
8. Named Groups:
Assigning names to groups for easier access.
pattern = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')
result = pattern.match('2022-01-15')
print(result.group('year')) # Output: 2022
print(result.group('month')) # Output: 01
print(result.group('day')) # Output: 15
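Named groups can also be retrieved all at once. A minimal sketch using groupdict() on the same pattern, with a conversion to integers added as an illustrative next step (date_pattern and parts are names chosen here):

```python
import re

date_pattern = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')
match = date_pattern.match('2022-01-15')

# groupdict() maps each group name to the text it captured
parts = match.groupdict()
print(parts)  # Output: {'year': '2022', 'month': '01', 'day': '15'}

# Named groups remain accessible by position as well
print(match.group(1) == match.group('year'))  # Output: True

# Captures are strings; convert them when numbers are needed
year, month, day = (int(parts[name]) for name in ('year', 'month', 'day'))
print(year, month, day)  # Output: 2022 1 15
```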
9. Search and Findall:
Using search to find the first match and findall to find all matches.
pattern = re.compile(r'\d+')
result = pattern.search('There are 42 apples and 36 oranges.')
print(result.group()) # Output: 42
pattern = re.compile(r'\d+')
result = pattern.findall('There are 42 apples and 36 oranges.')
print(result) # Output: ['42', '36']
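Two related behaviours may be worth knowing alongside search and findall. The sketch below uses finditer, which yields match objects (and therefore positions) rather than plain strings, and shows that findall returns tuples of groups when the pattern contains capturing groups. Variable names are illustrative.

```python
import re

text = 'There are 42 apples and 36 oranges.'

# finditer yields match objects, so positions are available as well as text
positions = [(m.group(), m.start(), m.end()) for m in re.finditer(r'\d+', text)]
print(positions)  # Output: [('42', 10, 12), ('36', 24, 26)]

# With capturing groups, findall returns the groups instead of the whole match
pairs = re.findall(r'(\d+) (\w+)', text)
print(pairs)  # Output: [('42', 'apples'), ('36', 'oranges')]
```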
10. Conclusion:
Pattern matching with the re module provides a flexible and expressive way to define and search for patterns within strings. Whether you’re validating user input, extracting information, or manipulating text, regular expressions are a valuable part of your Python programming toolkit.
In the next sections, we’ll explore more advanced topics and practical applications of regular expressions in Python.