Regular Expressions (RegEx) are a powerful tool in Python for searching, matching, and manipulating strings. They allow you to define complex search patterns and work with text data efficiently. Whether you're filtering logs, validating input, or parsing data, RegEx is an essential skill for any Python programmer.
📌 What is RegEx?
A Regular Expression (RegEx) is a sequence of characters that defines a search pattern. It is commonly used for:
- Validating input (e.g., emails, phone numbers, passwords)
- Searching for specific patterns in text
- Replacing or splitting text based on rules
- Data cleaning and preprocessing
📌 RegEx Module in Python
Python provides the built-in re module to work with regular expressions.
import re
Once imported, you can use functions such as findall(), search(), split(), and sub(), and inspect the results through match objects.
📌 RegEx Functions in Python
Function | Description | Example |
---|---|---|
findall() | Returns all matches in a list | re.findall("ai", "The rain in Spain") → ['ai', 'ai'] |
search() | Returns a match object for the first match, or None if there is none | re.search("ai", "The rain in Spain") → match at span (5, 7) |
split() | Splits the string at each match of the pattern | re.split(r"\s", "Hello World") → ['Hello', 'World'] |
sub() | Replaces each match with the given text | re.sub(r"\s", "-", "Hello World") → "Hello-World" |
📌 Metacharacters in RegEx
Metacharacters are special symbols in RegEx with specific meanings.
Metacharacter | Description | Example |
---|---|---|
. | Any character except newline | "he..o" → matches "hello" |
^ | Starts with | "^Hello" → matches "Hello World" |
$ | Ends with | "World$" → matches "Hello World" |
* | 0 or more occurrences | "aix*" → matches "ai", "aix" |
+ | 1 or more occurrences | "aix+" → matches "aix" |
{} | Specified number of occurrences | "al{2}" → matches "all" |
[] | Set of characters | "[a-m]" → matches any one letter from "a" to "m" |
() | Grouping | "(abc)?" → matches "abc" as an optional unit |
| | OR | "cat|dog" → matches "cat" or "dog" |
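The table above can be checked directly in Python. The following is a minimal sketch; the sample strings are made up for illustration and are not part of the original examples.

```python
import re

sample = "ai aix aixx"  # illustrative input, chosen to show * versus +

any_char = re.search(r"he..o", "hello") is not None      # . stands for any single character
starts = re.search(r"^Hello", "Hello World") is not None  # ^ anchors at the start
ends = re.search(r"World$", "Hello World") is not None    # $ anchors at the end
zero_or_more = re.findall(r"aix*", sample)   # "x" may appear zero or more times
one_or_more = re.findall(r"aix+", sample)    # "x" must appear at least once
exactly_two = re.findall(r"al{2}", "al all") # "l" must appear exactly twice
either = re.findall(r"cat|dog", "cat, dog, bird")  # either alternative

print(zero_or_more)  # ['ai', 'aix', 'aixx']
print(one_or_more)   # ['aix', 'aixx']
print(exactly_two)   # ['all']
print(either)        # ['cat', 'dog']
```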
📌 Flags in RegEx
Flags modify the behavior of RegEx patterns.
Flag | Description |
---|---|
re.I | Case-insensitive matching |
re.M | Multi-line matching: ^ and $ also match at the start and end of each line |
re.S | Dot matches newline |
re.X | Allow whitespace and comments in pattern |
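As a rough illustration of how each flag changes matching, the sketch below passes one flag at a time. The sample text and the date string are assumptions chosen only for the demonstration.

```python
import re

text = "first line\nSecond LINE"  # illustrative two-line input

# re.I: letters are compared without regard to case.
case_insensitive = re.findall(r"line", text, re.I)

# re.M: ^ and $ also match at the start and end of each line.
line_starts = re.findall(r"^\w+", text, re.M)

# re.S: "." also matches the newline character.
without_dotall = re.search(r"first.*LINE", text)
with_dotall = re.search(r"first.*LINE", text, re.S)

# re.X: whitespace and comments inside the pattern are ignored.
date_pattern = re.compile(
    r"""
    (\d{4})  # year
    -
    (\d{2})  # month
    -
    (\d{2})  # day
    """,
    re.X,
)
date_parts = date_pattern.search("Released on 2024-01-15").groups()

print(case_insensitive)  # ['line', 'LINE']
print(line_starts)       # ['first', 'Second']
print(date_parts)        # ('2024', '01', '15')
```

Flags can be combined with the | operator, for example re.I | re.M.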
📌 Special Sequences
Sequence | Description |
---|---|
\d | Matches any digit (0-9, plus other Unicode digits unless re.ASCII is used) |
\D | Matches non-digits |
\s | Matches whitespace |
\S | Matches non-whitespace |
\w | Matches word characters (letters, digits, and underscore) |
\W | Matches non-word characters |
\b | Matches word boundary |
\A | Matches start of string |
\Z | Matches end of string |
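A short sketch of several of these sequences in use. The sample sentence is invented for illustration; all patterns are written as raw strings so the backslashes reach the re module unchanged.

```python
import re

text = "Order 123 shipped to cat-lovers, not concatenation"  # illustrative input

digits = re.findall(r"\d+", text)              # runs of digits
words = re.findall(r"\w+", "a_b c-d")          # letters, digits, and underscore
pieces = re.split(r"\s+", "one  two\tthree")   # split on runs of whitespace
whole_word = re.findall(r"\bcat\b", text)      # "cat" only as a whole word
at_start = re.search(r"\AOrder", text) is not None   # anchored at the start of the string
at_end = re.search(r"nation\Z", text) is not None    # anchored at the end of the string

print(digits)      # ['123']
print(words)       # ['a_b', 'c', 'd']
print(pieces)      # ['one', 'two', 'three']
print(whole_word)  # ['cat']
```

Note that \bcat\b matches the "cat" in "cat-lovers" but not the one inside "concatenation", because a word boundary requires a non-word character (or the string edge) on each side.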
📌 Sets
Sets let you define a range of characters inside [].
- [abc] → Matches 'a', 'b', or 'c'
- [a-z] → Matches lowercase letters
- [0-9] → Matches digits
- [^0-9] → Matches non-digits
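The four sets listed above can be tried out as follows. This is a small sketch with made-up input strings; the + quantifier is added so that neighbouring characters are returned as one run.

```python
import re

text = "Room 42B, floor 7"  # illustrative input

letters_abc = re.findall(r"[abc]", "a cab")  # any single 'a', 'b', or 'c'
lowercase = re.findall(r"[a-z]+", text)      # runs of lowercase letters
digits = re.findall(r"[0-9]+", text)         # runs of digits
non_digits = re.findall(r"[^0-9]+", text)    # runs of anything except digits

print(letters_abc)  # ['a', 'c', 'a', 'b']
print(lowercase)    # ['oom', 'floor']
print(digits)       # ['42', '7']
print(non_digits)   # ['Room ', 'B, floor ']
```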
📌 Match Object
The Match object provides details about the search result.
import re
txt = "The rain in Spain"
x = re.search("ai", txt)
print(x.span()) # (5, 7)
print(x.start()) # 5
print(x.end()) # 7
print(x.string) # The rain in Spain
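One detail worth keeping in mind: re.search() returns None when nothing matches, so calling span() or start() on the result would raise an AttributeError. The sketch below guards against that; describe_match is a hypothetical helper name introduced only for this example.

```python
import re

def describe_match(pattern, text):
    """Return a short description of the first match, or a fallback message."""
    match = re.search(pattern, text)
    if match is None:  # re.search returns None when nothing matches
        return "no match"
    # group() is the matched text, span() is its (start, end) position
    return f"{match.group()!r} at {match.span()}"

print(describe_match(r"\bS\w+", "The rain in Spain"))    # 'Spain' at (12, 17)
print(describe_match(r"Portugal", "The rain in Spain"))  # no match
```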
📌 Examples of RegEx Functions
✔ findall()
import re
txt = "The rain in Spain"
print(re.findall("ai", txt)) # ['ai', 'ai']
✔ search()
import re
txt = "The rain in Spain"
x = re.search("ai", txt)
print("First match at:", x.start())  # First match at: 5
✔ split()
import re
txt = "The rain in Spain"
print(re.split(r"\s", txt)) # ['The', 'rain', 'in', 'Spain']
✔ sub()
import re
txt = "The rain in Spain"
print(re.sub(r"\s", "-", txt)) # The-rain-in-Spain
💡 Tips for Using RegEx
- Always test patterns with Regex101.
- Keep patterns simple and readable.
- Use raw strings in Python (r"pattern") to avoid escaping issues.
- Use grouping () and backreferences for complex replacements.
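To make the last tip concrete, here is a minimal sketch of grouping and backreferences with re.sub(). The input strings are invented for illustration.

```python
import re

# Capture groups () remember what they matched; \1 and \2 in the
# replacement refer back to the first and second group.
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Doe, Jane")

# A backreference inside the pattern itself finds an immediately repeated word.
deduplicated = re.sub(r"\b(\w+) \1\b", r"\1", "this is is a test")

print(swapped)       # Jane Doe
print(deduplicated)  # this is a test
```

Both the pattern and the replacement are raw strings here, which keeps \1 from being read as an escape sequence by Python itself.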
📝 Exercises
- Write a RegEx to validate an email address.
- Extract all numbers from the string "Order 123, Bill 456, Item 789".
- Split a text into words using whitespace as a delimiter.
- Replace all vowels in a string with *.
❓ FAQs
Q1. What is the difference between search() and match()?
match() checks only at the beginning of the string, while search() looks anywhere in the string.
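The difference is easy to see side by side. A small sketch, reusing the sample sentence from earlier in the article:

```python
import re

text = "The rain in Spain"

at_start = re.match(r"rain", text)   # None: "rain" is not at the beginning
anywhere = re.search(r"rain", text)  # match object: "rain" is found at index 4
prefix = re.match(r"The", text)      # match object: the string starts with "The"

print(at_start)          # None
print(anywhere.start())  # 4
print(prefix.group())    # The
```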
Q2. How do I ignore case sensitivity in RegEx?
Use the re.I flag.
Q3. Are RegEx patterns the same in all languages?
Mostly yes, but some implementations vary. Python uses Perl-style RegEx.
Q4. Can RegEx handle multiline text?
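Yes. Use the re.M flag so that ^ and $ match at the start and end of each line, and the re.S flag if . should also match newlines.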
✅ Conclusion
Python RegEx is a powerful tool for string manipulation and pattern matching. By mastering the re module, you can validate, search, replace, and split text with ease. Understanding metacharacters, flags, and match objects will make you more efficient in handling text-based data.