CodeQL documentation

Unmatchable caret in regular expression

ID: py/regex/unmatchable-caret
Kind: problem
Severity: error
Precision: high
Tags:
   - reliability
   - correctness
Query suites:
   - python-security-and-quality.qls


The caret character ^ anchors a regular expression to the beginning of the input, or (for multi-line regular expressions) to the beginning of a line. If it is preceded by a pattern that must match a non-empty sequence of (non-newline) input characters, then the entire regular expression cannot match anything.
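As a rough illustration of this rule, the following sketch uses two made-up patterns (they are assumptions for demonstration, not taken from the query itself). In the first, the caret follows a literal character, so it can never sit at the start of the input. In the second, the caret follows a newline and the pattern is compiled with re.MULTILINE, so a match is still possible.

```python
import re

# Hypothetical patterns, chosen only to illustrate the rule.
# 'a' must be consumed before '^', so '^' is never at the start of the input.
unmatchable = re.compile(r"a^b")
# With re.MULTILINE, '^' also matches just after a newline.
multiline_ok = re.compile(r"a\n^b", re.MULTILINE)

samples = ["ab", "a^b", "a\nb", "b"]
unmatchable_hits = [s for s in samples if unmatchable.search(s)]
multiline_hit = multiline_ok.search("a\nb") is not None

print(unmatchable_hits)  # []
print(multiline_hit)     # True
```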

Recommendation

Examine the regular expression to find and correct any typos.

Example

In the following example, the regular expression r"\[^.]*\.css" cannot match any string, since it contains a caret assertion preceded by an escape sequence that matches an opening bracket.

In the second regular expression, r"[^.]*\.css", the caret is the first character of a character class, where it negates the class rather than anchoring the match to the start of the string.

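import re

# Broken: the caret follows \[, which must first consume a "[" character,
# so the caret can never be at the start of the input and nothing matches.
matcher = re.compile(r"\[^.]*\.css")

def find_css(filename):
    # Never prints, because matcher cannot match any string.
    if matcher.match(filename):
        print("Found it!")

# Fixed: the caret is inside a character class, where it means "any character except".
fixed_matcher_css = re.compile(r"[^.]*\.css")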
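
To check the behavior of the two patterns from the example, a minimal sketch such as the following can be run. The names broken, fixed, and names are illustrative only, and the sample file names are assumptions rather than data from the original example.

```python
import re

# The two patterns from the example above.
broken = re.compile(r"\[^.]*\.css")  # caret after an escaped bracket
fixed = re.compile(r"[^.]*\.css")    # caret inside a character class

# Illustrative file names (assumed for this sketch).
names = ["style.css", "[style.css", "notes.txt"]

broken_hits = [n for n in names if broken.match(n)]
fixed_hits = [n for n in names if fixed.match(n)]

print(broken_hits)  # []
print(fixed_hits)   # ['style.css', '[style.css']
```

The broken pattern rejects even "[style.css": once the escaped bracket has consumed the first character, the caret can no longer be at the start of the input.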
