Python Regex for MAC Addresses
June 26, 2008, Gary
Post #1,000 on this blog. Fitting that it’s Python nerd shit, huh?
I needed a way to search for MAC addresses, which are unique identifiers for networking hardware. For example, if your computer has a built-in Ethernet port as well as wireless capability, then it has 2 MAC addresses. A MAC address is always written as 6 groups of 2 hexadecimal characters (0 through 9, and A through F). E.g., a valid MAC address would be: 01:98:DF:9E:10:37. In theory, every MAC address on every device in the world is unique, since 12 hexadecimal digits allow over 281 trillion possible combinations (16^12 = 281,474,976,710,656).
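The combination count is easy to sanity-check: 6 groups of 2 hex digits is 12 digits with 16 possible values each, which is the same thing as 48 bits. A quick sketch (the variable names are just for illustration):

```python
# 6 groups x 2 hex digits = 12 hex digits, each with 16 possible values.
hex_digits = 6 * 2
combinations = 16 ** hex_digits

# Each hex digit carries 4 bits, so the same count is 2 to the 48th power.
assert combinations == 2 ** 48

print("{:,}".format(combinations))  # 281,474,976,710,656
```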
Conventionally these groups of 2 hex digits are separated by colons, but many people record them with hyphens instead. So I needed to search for this particular pattern of characters amid a potentially vast amount of text. Enter regular expressions (which I totally suck at using).
The regex I came up with is:
([a-fA-F0-9]{2}[:|\-]?){6}
Going through it, piece by piece:
[a-fA-F0-9] = find any character A-F, upper or lower case, or any digit 0-9
[a-fA-F0-9]{2} = find that twice in a row
[a-fA-F0-9]{2}[:|\-] = followed by a separator character, “:” or “-” (the backslash escapes the hyphen, since inside square brackets a hyphen normally marks a range such as a-f; escaping it tells the regex to look for a literal hyphen. Note that “|” is not an “or” operator inside square brackets, it is just another literal character, so this class also accepts a pipe as a separator; [:\-] alone would do)
[a-fA-F0-9]{2}[:|\-]? = make that final separator character optional, since the last pair of characters won’t be followed by anything, and we want them to be included, too; that’s a chunk of 2 or 3 characters, so far
([a-fA-F0-9]{2}[:|\-]?){6} = find this type of chunk 6 times in a row
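Because the separator is optional on every chunk and the character class includes “|”, the pattern is a bit more forgiving than the breakdown suggests. Here is a minimal sketch that pokes at those edges and compares against a stricter variant. STRICT and matched_text are hypothetical names made up for this example, not part of the original regex, and the stricter pattern is only one of several ways to tighten things up.

```python
import re

# The regex from the post, as a raw string.
LOOSE = re.compile(r'([a-fA-F0-9]{2}[:|\-]?){6}')

# Hypothetical stricter variant (not from the post): five "pair + separator"
# chunks, then a final bare pair, with only ':' or '-' allowed as separators.
STRICT = re.compile(r'(?:[a-fA-F0-9]{2}[:-]){5}[a-fA-F0-9]{2}')


def matched_text(pattern, s):
    """Return the text the pattern matched in s, or None if there is no match."""
    m = pattern.search(s)
    return m.group(0) if m else None


samples = [
    '01:98:DF:9E:10:37',   # colon-separated
    '01-98-DF-9E-10-37',   # hyphen-separated
    '01|98|DF|9E|10|37',   # pipe-separated: '|' is a literal inside [...]
    '0198DF9E1037',        # no separators at all
    '01:98:DF:9E:10:37:',  # trailing separator
]

for s in samples:
    print(s, '->', matched_text(LOOSE, s), '/', matched_text(STRICT, s))
```

The loose pattern matches all five samples (including the trailing colon on the last one), while the stricter sketch rejects the pipe-separated and separator-free forms and stops before the trailing colon. Neither pattern looks at what surrounds the match, so a longer run of hex pairs still yields a match on its first six, and the stricter sketch still accepts colons and hyphens mixed within one address.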
Let’s give it a shot.
First, a list of strings… e.g., a row from a comma-delimited file (returned via the csv module):
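import re
L = ['aseredf', '55:A8:99:66:77:11', 'wefgcre', '98-75-64-52-48-21']
X = r'([a-fA-F0-9]{2}[:|\-]?){6}'  # this is the regex (raw string, so the backslash reaches the regex engine as-is)
pattern = re.compile(X)  # compile once, outside the loop
for s in L:
    a = pattern.search(s)
    if a:
        print(s[a.start():a.end()])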
Run it:
[gary@roscoe ~]$ python test.py
55:A8:99:66:77:11
98-75-64-52-48-21
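For the csv case mentioned above, the same search can be run over every field of every row. This is only a sketch: the in-memory "file", its column layout, and the find_macs helper are all made up for illustration, and a real script would pass an open file to csv.reader instead.

```python
import csv
import io
import re

MAC_RE = re.compile(r'([a-fA-F0-9]{2}[:|\-]?){6}')  # the regex from the post

# Stand-in for a real file; the columns and values are invented for this example.
fake_file = io.StringIO(
    'host,mac,notes\n'
    'roscoe,55:A8:99:66:77:11,wired\n'
    'laptop,98-75-64-52-48-21,wireless\n'
)


def find_macs(rows):
    """Yield every MAC-looking substring found in any field of any row."""
    for row in rows:
        for field in row:
            m = MAC_RE.search(field)
            if m:
                yield m.group(0)


macs = list(find_macs(csv.reader(fake_file)))
print(macs)  # ['55:A8:99:66:77:11', '98-75-64-52-48-21']
```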
Next, a string:
import re
L = 'aseredf 55:A8:99:66:77:11 wefgcre 98-75-64-52-48-21'
X = r'([a-fA-F0-9]{2}[:|\-]?){6}'  # same regex as above
# finditer() always returns an iterator, so there is nothing to test with "if"
for y in re.compile(X).finditer(L):
    print(L[y.start():y.end()])
Run it:
[gary@roscoe ~]$ python test.py
55:A8:99:66:77:11
98-75-64-52-48-21
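A side note on finditer versus findall with this particular pattern: because the regex contains a capturing group that repeats, re.findall hands back only what the group captured on its last repetition, which is probably not what you want. A small sketch of the difference (variable names are illustrative):

```python
import re

text = 'aseredf 55:A8:99:66:77:11 wefgcre 98-75-64-52-48-21'
pattern = re.compile(r'([a-fA-F0-9]{2}[:|\-]?){6}')

# findall() returns the contents of the capturing group, and a repeated group
# keeps only its last repetition, so you get the final pair of each address.
last_chunks = pattern.findall(text)
print(last_chunks)  # ['11', '21']

# finditer() yields match objects; group(0) is the whole matched text,
# equivalent to slicing the string from m.start() to m.end().
full_matches = [m.group(0) for m in pattern.finditer(text)]
print(full_matches)  # ['55:A8:99:66:77:11', '98-75-64-52-48-21']
```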
Fuckin’ bickety-bam, the whole stage comes crashing down.



