Python Regex for MAC Addresses

Post #1,000 on this blog. Fitting that it’s Python nerd shit, huh?

I needed a way to search for MAC addresses, which are unique identifiers for networking hardware. For example, if your computer has a built-in Ethernet port, as well as wireless capability, then it has 2 MAC addresses. These are always 6 groups of 2 hexadecimal characters (0 through 9, and A through F). E.g., a valid MAC address would be: 01:98:DF:9E:10:37. Theoretically, every MAC address on every computer in the world will be unique, as the naming scheme provides over 281 trillion possible combinations (281,474,976,710,656).
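
The 281 trillion figure is easy to check by hand. A quick sketch of the arithmetic (the variable names are just mine, for illustration):

```python
# 6 groups x 2 hex digits = 12 hex digits, each with 16 possible values.
HEX_DIGITS = 12
combinations = 16 ** HEX_DIGITS

# Same number counted in bits: 12 hex digits x 4 bits each = 48 bits.
assert combinations == 2 ** 48

print(f"{combinations:,}")  # 281,474,976,710,656
```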

Canonically these groups of 2 hex digits are separated by a colon, but many people record them with hyphens instead. So I needed to search for this particular pattern of characters amid a potentially-vast amount of text. Enter regular expressions (which I totally suck at using).

The regex I came up with is:

([a-fA-F0-9]{2}[:|\-]?){6}

Going through it, piece by piece:
[a-fA-F0-9] = find any single hex character: A-F in upper or lower case, or any digit 0-9
[a-fA-F0-9]{2} = find that twice in a row
[a-fA-F0-9]{2}[:|\-] = followed by either a “:” or a “-” character (the backslash escapes the hyphen, since inside square brackets a hyphen normally marks a range, as in a-f; escaping it tells the regex to look for a literal hyphen instead. Note that inside square brackets the “|” is not an “or” operator but a literal pipe, so this class also accepts “|” as a separator; [:\-] alone would be enough)
[a-fA-F0-9]{2}[:|\-]? = make that final “:” or “-” character optional, since the last pair of characters won’t be followed by anything, and we want them to be included, too; that’s a chunk of 2 or 3 characters, so far
([a-fA-F0-9]{2}[:|\-]?){6} = find this type of chunk 6 times in a row
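
To see the pieces in action, here is a small sketch that builds the regex up in the same order as the walkthrough and tries each stage with re.fullmatch. The names pair, chunk, and full are only labels I made up for the stages.

```python
import re

# Each stage of the walkthrough, written as a raw string.
pair = r'[a-fA-F0-9]{2}'        # two hex digits
chunk = pair + r'[:|\-]?'       # plus an optional separator
full = '(' + chunk + '){6}'     # six chunks in a row (the regex from the post)

assert re.fullmatch(pair, '9E')
assert re.fullmatch(pair, '9G') is None   # G is not a hex digit
assert re.fullmatch(chunk, '9E:')
assert re.fullmatch(chunk, '9E-')
assert re.fullmatch(chunk, '9E')          # the separator is optional
assert re.fullmatch(full, '01:98:DF:9E:10:37')

# Inside [...] the | is a literal pipe, so this is accepted too:
assert re.fullmatch(chunk, '9E|')
```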

Let’s give it a shot.

First, a list of strings… e.g., a row from a comma-delimited file (returned via the csv module):

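import re
L = ['aseredf', '55:A8:99:66:77:11', 'wefgcre', '98-75-64-52-48-21']
X = r'([a-fA-F0-9]{2}[:|\-]?){6}' # this is the regex (raw string, so the backslash reaches the regex engine as-is)
pattern = re.compile(X) # compile once, outside the loop
for s in L:
    a = pattern.search(s)
    if a:
        print(a.group(0)) # the matched text, same as s[a.start(): a.end()]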

Run it:

[gary@roscoe ~]$ python test.py
55:A8:99:66:77:11
98-75-64-52-48-21
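
For the comma-delimited case mentioned above, a minimal sketch of feeding rows from the csv module through the same regex might look like this. The sample data is made up, and io.StringIO just stands in for a real open file.

```python
import csv
import io
import re

MAC_RE = re.compile(r'([a-fA-F0-9]{2}[:|\-]?){6}')  # the regex from the post

# Hypothetical CSV content; io.StringIO stands in for an open file.
sample = io.StringIO(
    'aseredf,55:A8:99:66:77:11,wefgcre\n'
    'foo,bar,98-75-64-52-48-21\n'
)

found = []
for row in csv.reader(sample):   # each row comes back as a list of strings
    for field in row:
        m = MAC_RE.search(field)
        if m:
            found.append(m.group(0))

print(found)
```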

Next, a string:

import re
L = 'aseredf 55:A8:99:66:77:11 wefgcre 98-75-64-52-48-21'
X = r'([a-fA-F0-9]{2}[:|\-]?){6}' # same regex as above
# finditer yields one match object per hit; no "if" needed, the loop just runs zero times when nothing matches
for y in re.compile(X).finditer(L):
    print(y.group(0))

Run it:

[gary@roscoe ~]$ python test.py
55:A8:99:66:77:11
98-75-64-52-48-21
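
One gotcha if you reach for re.findall instead of finditer: because the regex contains a capturing group, findall returns the group's last repetition rather than the whole match. A small sketch of the difference, using a non-capturing group (?:...) as the workaround:

```python
import re

text = 'aseredf 55:A8:99:66:77:11 wefgcre 98-75-64-52-48-21'

# With a capturing group, findall returns only what the group captured last.
capturing = re.findall(r'([a-fA-F0-9]{2}[:|\-]?){6}', text)

# With a non-capturing group, findall returns the whole match.
non_capturing = re.findall(r'(?:[a-fA-F0-9]{2}[:|\-]?){6}', text)

print(capturing)       # ['11', '21']
print(non_capturing)   # ['55:A8:99:66:77:11', '98-75-64-52-48-21']
```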

Fuckin’ bickety-bam, the whole stage comes crashing down.

9 thoughts on “Python Regex for MAC Addresses”

  1. Hey, thanks for the RegEx, but it’s slightly flawed:

    Using L = 'aseredf 55:A8:9966:77:11 wefgcre 98-75-64-52-48-21'

    Results in 55:A8:9966:77:11 being found.

    So I made the RegEx slightly longer and got rid of the “?” sign:
    X = '([a-fA-F0-9]{2}[:|\-]){5}[a-fA-F0-9]{2}'

    This no doubt has a bug in it somewhere too, so use at your own risk.

    Thanks again for getting me started!

  2. Your original version misses strings such as 001122334455 with no separator at all. Windows and many devices send MACs in that fashion.

    Scott’s modified version detects this though.

  3. Actually, Scott’s version needs the ? to make the : or - separator optional so as to match the non-delimited format you describe.

  4. Actually, it needs start-of-line and end-of-line anchors to work, at least for me.

    '^([a-fA-F0-9]{2}[:|\-]?){5}[a-fA-F0-9]{2}$'

    Otherwise, this is valid:

    05650:23:AE:67:02:945455341

  5. How about this address: 00-11:22-33:44-55?
    I think this should be a negative case, since it mixes separators.
    P.S. Regular expressions really need to be tested collaboratively, on an online regex testing site ;)
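
Pulling the comments above together, here is one possible sketch of a stricter pattern. It is not the regex from the post. It assumes you want a single consistent separator (enforced with a backreference), also want to accept the bare 12-digit form, and want to reject hits buried inside longer runs of hex digits. STRICT_MAC and find_macs are names invented for this example.

```python
import re

STRICT_MAC = re.compile(
    r'(?<![0-9A-Fa-f:-])'         # not preceded by a hex digit or separator
    r'[0-9A-Fa-f]{2}([:-]?)'      # first pair, then ":" or "-" or nothing (group 1)
    r'(?:[0-9A-Fa-f]{2}\1){4}'    # four more pairs, each followed by the same separator
    r'[0-9A-Fa-f]{2}'             # last pair, no trailing separator
    r'(?![0-9A-Fa-f:-])'          # not followed by a hex digit or separator
)

def find_macs(text):
    """Return every substring of text that STRICT_MAC accepts."""
    return [m.group(0) for m in STRICT_MAC.finditer(text)]

print(find_macs('aseredf 55:A8:99:66:77:11 wefgcre 98-75-64-52-48-21'))
print(find_macs('55:A8:9966:77:11'))             # missing separator
print(find_macs('05650:23:AE:67:02:945455341'))  # buried in a longer string
print(find_macs('00-11:22-33:44-55'))            # mixed separators
print(find_macs('001122334455'))                 # no separators at all
```

Accepting the bare form means any stand-alone run of exactly 12 hex digits counts as a MAC address, so expect some false positives in free text; change ([:-]?) to ([:-]) if you only want delimited addresses.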
