Duplication in regular expression character class¶
ID: py/regex/duplicate-in-character-class Kind: problem Severity: warning Precision: very-high Tags: - reliability - readability Query suites: - python-security-and-quality.qls
Character classes in regular expressions represent sets of characters, so there is no need to specify the same character twice in one character class. Duplicate characters in character classes are at best useless, and may even indicate a latent bug.
Determine whether a character is simply duplicated or whether the character class was in fact meant as a group. If it is just a duplicate, then remove the duplicate character. If was supposed to be a group, then replace the square brackets with parentheses.
In the following example, the character class
[password|pwd] contains two instances each of the characters
w. The programmer most likely meant to write
(password|pwd) (a pattern that matches either the string
"password" or the string
"pwd"), and accidentally mistyped the enclosing brackets.
import re matcher = re.compile(r"[password|pwd]") def find_password(data): if matcher.match(data): print("Found password!")
To fix this problem, the regular expression should be rewritten to