Python Regex for MAC Addresses

Post #1,000 on this blog. Fitting that it’s Python nerd shit, huh?

I needed a way to search for MAC addresses, which are unique identifiers for networking hardware. For example, if your computer has a built-in Ethernet port, as well as wireless capability, then it has 2 MAC addresses. A MAC address is always written as 6 groups of 2 hexadecimal characters (0 through 9, and A through F). E.g., a valid MAC address would be: 01:98:DF:9E:10:37. Theoretically, every MAC address on every computer in the world will be unique, as 12 hex digits provide over 281 trillion possible combinations (16^12 = 281,474,976,710,656).
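If you want to double-check that number, here's a quick sketch of the arithmetic (the variable names are mine, purely for illustration):

```python
HEX_VALUES = 16          # each hex character has 16 possible values
GROUPS = 6               # a MAC address has 6 groups...
CHARS_PER_GROUP = 2      # ...of 2 hex characters each

total = HEX_VALUES ** (GROUPS * CHARS_PER_GROUP)

# 12 hex characters are 48 bits, so this is the same as 2 to the 48th
same_as_48_bits = (total == 2 ** 48)

print("{:,}".format(total))
```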

Commonly these groups of 2 hex digits are separated by colons, but many people (and plenty of tools) record them with hyphens instead. So I needed to search for this particular pattern of characters amid a potentially vast amount of text. Enter regular expressions (which I totally suck at using).

The regex I came up with is:

([a-fA-F0-9]{2}[:|\-]?){6}

Going through it, piece by piece:
[a-fA-F0-9] = find any one character A-F, upper or lower case, or any digit 0-9
[a-fA-F0-9]{2} = find that twice in a row
[a-fA-F0-9]{2}[:|\-] = followed by a “:” or a “-” character (the backslash escapes the hyphen, since inside square brackets a hyphen normally denotes a range; escaping it tells the regex to look for a literal hyphen). Note that inside square brackets the “|” is not an “or” operator but a literal pipe, so this piece also accepts “|” as a separator; [:\-] on its own would be stricter
[a-fA-F0-9]{2}[:|\-]? = make that final separator optional, since the last pair of characters won’t be followed by one, and we want that pair to be included, too; that’s a chunk of 2 or 3 characters, so far. Side effect: because every separator is optional, 12 hex characters in a row with no separators at all will also match
([a-fA-F0-9]{2}[:|\-]?){6} = find this type of chunk 6 times in a row
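If the looseness bothers you, here's a minimal sketch of a stricter pattern. This is not the regex used in the rest of this post; STRICT_MAC and find_macs are names I made up. It assumes you want a separator that is required, is only “:” or “-”, and is the same one throughout the address.

```python
import re

# Hypothetical stricter pattern, for illustration only.
STRICT_MAC = re.compile(
    r'(?<![0-9A-Fa-f:-])'        # not glued to a preceding hex digit or separator
    r'[0-9A-Fa-f]{2}'            # first pair of hex digits
    r'([:-])'                    # the separator, remembered as group 1
    r'(?:[0-9A-Fa-f]{2}\1){4}'   # 4 more pairs, each followed by that same separator
    r'[0-9A-Fa-f]{2}'            # last pair, with no trailing separator
    r'(?![0-9A-Fa-f:-])'         # not glued to a following hex digit or separator
)

def find_macs(text):
    """Return every strictly formatted MAC address found in text."""
    return [m.group(0) for m in STRICT_MAC.finditer(text)]

print(find_macs('aseredf 55:A8:99:66:77:11 wefgcre 98-75-64-52-48-21'))
```

The trade-off: bare runs of 12 hex characters and mixed separators no longer match, which may or may not be what you want for your data.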

Let’s give it a shot.

First, a list of strings… e.g., a row from a comma-delimited file (returned via the csv module):

import re
L = ['aseredf', '55:A8:99:66:77:11', 'wefgcre', '98-75-64-52-48-21']
X = r'([a-fA-F0-9]{2}[:|\-]?){6}' # this is the regex (raw string, so the backslash survives)
pattern = re.compile(X) # compile once, outside the loop
for s in L:
    a = pattern.search(s)
    if a:
        print(s[a.start(): a.end()])

Run it:

[gary@roscoe ~]$ python test.py
55:A8:99:66:77:11
98-75-64-52-48-21
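Since the list above stands in for a CSV row, here's a rough sketch of the same search run over rows from the csv module. The sample data and the function name macs_in_csv are made up for illustration; a real script would pass an open file handle instead of the in-memory text.

```python
import csv
import io
import re

MAC_RE = re.compile(r'([a-fA-F0-9]{2}[:|\-]?){6}')  # the regex from this post

# Made-up sample data standing in for a real comma-delimited file.
SAMPLE_TEXT = (
    "host,mac,note\n"
    "roscoe,55:A8:99:66:77:11,wired\n"
    "laptop,98-75-64-52-48-21,wireless\n"
)

def macs_in_csv(handle):
    """Search every cell of every row and collect whatever the regex matches."""
    found = []
    for row in csv.reader(handle):
        for cell in row:
            match = MAC_RE.search(cell)
            if match:
                found.append(match.group(0))
    return found

print(macs_in_csv(io.StringIO(SAMPLE_TEXT)))
```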

Next, a string:

import re
L = 'aseredf 55:A8:99:66:77:11 wefgcre 98-75-64-52-48-21'
X = r'([a-fA-F0-9]{2}[:|\-]?){6}' # same regex as above
# finditer always returns an iterator (even with zero matches), so no "if" check is needed
for y in re.compile(X).finditer(L):
    print(L[y.start(): y.end()])

Run it:

[gary@roscoe ~]$ python test.py
55:A8:99:66:77:11
98-75-64-52-48-21

Fuckin’ bickety-bam, the whole stage comes crashing down.

Overheard in iChat: Part 6

Me: Erin found some baby clothes that said something on the front in PHP.

Me: Like: <?php “baby();”> or something like that.

Me: So that made me think: “Python onesie”.

class Baby(Human):
   def __init__(self):
       return self

Gregg: Isn’t that a singleton?

Me: I don’t know what a “singleton” is… at least how it relates to programming.

Gregg: A class that only exists to return a single object.

Me: Then, yes, it would be.

Gregg: Technically, I guess it should test if other instances exist and either kill them or refuse to create.

Gregg: But it’s a fucking baby thing, there’s not enough space on the child’s front for real design patterns.

Gregg: Looking over those last few lines, I now know why I never get invited to parties.

Me: You wouldn’t want to imply that the baby kills other instances of Baby which subclass Human.

Me: Not yet, anyway.

Me: You could always subclass differently to imply it.

class Baby(Cthulu):
    RECURSION

LayoutError in ReportLab

Attempting To: Generate a PDF file listing a few attributes of various systems. Each set of system information needs to stay together as a block of info on the page; having some on one page, and the remainder on the following page is unacceptable.

Established Method: Employ the “KeepTogether” flowable. Pass a list of things (here, Paragraph objects) to it, and it’s supposed to handle things properly. If the length of the content passed exceeds that of the page, it will auto-magically insert a page break, and place the content on the next page.

Problem: ReportLab throws “LayoutError: Splitting error” on one of the KeepTogether flowable objects. Further, this particular exception seems impossible to handle; wrapping the calls in “try” and “except” clauses appears to have no effect whatsoever, and the error is raised without fail.

Investigation, conclusion & code examples follow…

My Programming Philosophy: Not the Last X Percent

When you’re a mediocre programmer (read: me), it’s easy to conjure great ideas because you know their solutions to be possible.

It’s slightly more difficult to implement the main constituent concepts of your idea, but only (and hopefully) because there’s some research and learning involved. Recognizing your skills as mediocre, there’s likely a better way to do things than your current level of knowledge permits. A bit of reading and examination of examples set forth by those less-mediocre than you result in slightly-more-elegant code for your idea.

It may be difficult to stitch together all your elegant concepts into the idea, because you want the implementation of your idea to be elegant, as well. Don’t you?

This is the universally-loathed “Last X Percent”.

If you are unfortunate enough to have developed each of the elegant concepts in isolation, your stitching will involve dealing with myopically-complex data structures, and attempts at creating workable interfaces to them. It will be hellish; you’ll wonder why you opted to do things the way you did. But you’ll be too far into the project to go back and change the basics of the concepts to retain their elegance, yet offer graceful interaction.

Or will you?

The majority of the work you should do after dreaming up a great idea is: Work it through completely, or as completely as you can.

Start by thinking of how you would implement the whole shebang with your current (mediocre, remember?) knowledge and skills. Then, make a list of things you wish existed that would make it easier (tools, definitions, whatever). Next, see if those things already exist; do a moderate amount of homework, but don’t kill yourself. If you can’t find them readily, you already have a concept of what you must do to create them. Perhaps you’ll discover new tools, definitions, whatever that will facilitate making those things on your own.

Either way, you’re making progress.

But there has to be a sweet spot.

As you learn more, your functional knowledge will grow, and how you envisage the path to realizing your elegant idea will change. As every programmer’s painful memories can attest: Scope creep is bad. But it’s worse when it’s self-imposed.

You must consciously determine when you’ve learned enough. You have to be able to cut bait on theory and start typing code that works. Otherwise, you’ll never get it done.

Fulfill your personal need to collect fragments of new knowledge. Feel satisfied that you’ve done enough homework to become a better programmer.

Then do your shit. Make it work.

You can always come back later and repeat the process.

A Reportlab Link in Table Cell Workaround

Front matter: You’re generating a table inside a PDF document with Reportlab, and one of the cells needs to feature a link to an external website. Links aren’t supported (as far as I know) outside a Paragraph object, but shoving a Paragraph into the table cell results in some pretty fucked up word wrapping and table-cell sizing.

It’s ugly, and unacceptable.

According to the Reportlab mailing list:

Paragraphs behave well inside a table cell with a fixed width.

I won’t even try to re-find the post wherein this valuable nugget lies, but it’s somewhere in the mailing-list archives.

The biggest problem here is: You cannot specify the widths of individual table cells to fulfill whatever random “okay, now it works” behavior. When you create a Table object, there’s an optional width argument (colWidths), but it sets the width of whole columns, so it applies across the board to every cell in a column. So, for example, if a column has one cell with a number and another with a long string of text, you can’t say, “Make the number cell X wide, and make the text cell Y wide.”

Of course, you can just leave the cell-width calculation up to the Table class when you instantiate it… but that merely results in the aforementioned fucked up shit.

You see the problem, eh?

My solution is:

1. Create the link within a Paragraph. Again, AFAIK, this is the only way to link.
2. Determine the length of the linked text. Use pdfmetrics.stringWidth.
3. Create a Table with only one cell whose width is that determined in step 2. This makes things “behave”.
4. Drop that Table into the cell of the real Table you actually want displayed in the final PDF.

Here:

from reportlab.pdfbase import pdfmetrics
from reportlab.platypus import SimpleDocTemplate, Paragraph
from reportlab.platypus.tables import Table

doc = SimpleDocTemplate('filename_of_pdf')
Story = []

# your_font_name, your_font_size and your_paragraph_style are placeholders;
# substitute the font and ParagraphStyle you actually use
text = 'linked text' # called "text" so the built-in str isn't shadowed
url = '<link href="http://the/actual/url.html">%s</link>' % text

L = pdfmetrics.stringWidth(text, your_font_name, your_font_size)
inside_Table = Table([[Paragraph(url, your_paragraph_style)]], colWidths=[L])

# Table data is a list of rows, so even a single row needs the outer list
real_Table_data = [['a', 'b', inside_Table, 'c', 'd']]
t = Table(real_Table_data)

Story.append(t)
doc.build(Story)

For the most part: Blickity-Blam!

Sometimes the pdfmetrics.stringWidth() call will still cause your shit to wrap, but throw a +1 after the call in the L declaration (to add a single point, which is 1/72 of an inch) and things will work beautifully.

Faking Javascript with Python

I needed a web form featuring 2 pulldown select lists. The options available in the 2nd pulldown would change based on the user’s selected option in the 1st pulldown. The lazyweb helped, and my final JS functions are a combination of techniques gleaned from “onChange in Select Form Elements” and “Changing Select Element Content on the Fly”.

I suppose it’s pretty effortless to conclude that I’m not much of a Javascript coder. I know just enough to get myself in trouble, but not enough to be productive.

I needed to arrange the data from the database into global arrays so the JS functions would actually work, but the Javascript needed to remain data-agnostic because the data itself will change. That, of course, means no hard-coding.

So, when it came time to try to learn how to create and modify multiple (global) arrays for use in the pulldowns, I ended up running into a number of dead-ends… the results of my shot-in-the-dark attempts and overall ignorance of “doing things the Javascript way”.

So I faked it using Python.
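For a rough idea of what “faking it” can look like, here's a minimal sketch. It is an illustration of the general approach only, not necessarily the code I ended up with: the function name js_global, the variable name childOptions, and the sample data are all hypothetical. It assumes the database rows have already been grouped into a dict that maps each option in the 1st pulldown to the list of options for the 2nd.

```python
import json

def js_global(options_by_parent, var_name='childOptions'):
    """Render {parent option: [child options]} as a Javascript global.

    json.dumps output is also valid Javascript literal syntax, so the JS
    functions stay data-agnostic and nothing is hard-coded.
    """
    return 'var %s = %s;' % (var_name, json.dumps(options_by_parent, sort_keys=True))

# Made-up data standing in for whatever the database query returns.
sample = {'fruit': ['apple', 'pear'], 'veg': ['carrot']}
print(js_global(sample))
```

The string that comes back gets written into the page inside a script tag, and the pulldown functions just look up childOptions[selectedValue].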

Humiliating Firefox Feature

This is what the Firefox browser displays when you accidentally introduce recursion into the web application you’re developing, resulting in an infinite loop.

For example: When the pages you plan to use for user authentication… um… require the user to have already been authenticated to view them… and so redirects back to itself infinitely.

fferror.png

(emphasis mine) For the user, it means “someone did something wrong”.

For the developer, it means “you fucked up, buddy!”
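Here's a toy sketch of that screw-up and the obvious fix, in plain Python with no web framework involved. Everything in it is hypothetical (the /login path, the function names, the 20-hop limit); it only simulates who redirects where.

```python
LOGIN_PATH = '/login'  # hypothetical location of the authentication page

def buggy_route(path, authenticated):
    """Every page demands authentication, including the login page itself."""
    return None if authenticated else LOGIN_PATH

def fixed_route(path, authenticated):
    """Same check, but the login page is exempt, so the loop can end."""
    if authenticated or path == LOGIN_PATH:
        return None          # None means: serve the page
    return LOGIN_PATH        # anything else means: redirect there

def follow(path, authenticated, route, max_hops=20):
    """Follow redirects like a browser would; give up after max_hops."""
    hops = 0
    while True:
        target = route(path, authenticated)
        if target is None:
            return hops
        hops += 1
        if hops >= max_hops:
            raise RuntimeError('redirect loop')
        path = target

print(follow('/reports', False, fixed_route))
```

Swap in buggy_route and follow() raises RuntimeError instead of returning, which is roughly the point where the browser gives up and shows its error page.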

Overheard in iChat: Part 4

Gregg: (to me) While we disagree on elements of the One True Language (i.e., Perl, Python, or PHP), I think we are totally united in saying: “HAHAHA, Twitter was down for 3 hours during the [Apple] keynote. Eat a dick Ruby on Rails fags!”

(here’s what he was talking about)

How to Serve Binary Files in Webware

  1. Override the writeHTML method of the Page superclass.
  2. Set the necessary HTTP headers:
    • Content type with proper MIME type
    • Disposition with proper file name
  3. Open and deliver the binary data.

from WebKit.Page import Page

class Page_Name(Page):
    def writeHTML(self):
        r = self.response()
        # tell the browser what it's getting, and to download it as a file
        r.setHeader('Content-type', 'application/pdf')
        r.setHeader('Content-disposition', 'attachment; filename=whatever.pdf')
        # read in binary mode; "f" avoids shadowing the built-in "file"
        f = open('/path/to/file', 'rb')
        try:
            data = f.read()
        finally:
            f.close()
        self.write(data)
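If you serve more than one kind of file, hard-coding the MIME type gets old. Here's a minimal sketch of working out the two header values from the file's path using the standard library's mimetypes module. The helper name download_headers is made up, and it is not part of Webware; you'd pass each pair to setHeader yourself.

```python
import mimetypes
import os

def download_headers(path):
    """Return (header, value) pairs for serving path as an attachment."""
    mime_type, _encoding = mimetypes.guess_type(path)
    if mime_type is None:
        # unknown extension: fall back to the generic binary type
        mime_type = 'application/octet-stream'
    filename = os.path.basename(path)
    return [
        ('Content-type', mime_type),
        ('Content-disposition', 'attachment; filename=%s' % filename),
    ]

for name, value in download_headers('/path/to/whatever.pdf'):
    print('%s: %s' % (name, value))
```

This assumes file names without spaces or quotes; anything fancier needs the filename value quoted.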

Installing Webware 0.9.4 on CentOS 5 w/ Apache 2

Yup. Just what it says:

Installing Webware 0.9.4 on CentOS 5 w/ Apache 2

Complete instructions, yo.