Create a valid ISIN

For a suite of tests I recently wrote, I had to create valid ISINs. The official page doesn’t give away much on how the final checksum digit should be computed and the examples on Wikipedia are, in my opinion, not particularly clear.

While I was trying to understand more, I stumbled upon a related unanswered question on Stack Overflow, which asked for essentially what I was trying to do, so I took the time to answer it.

Here I report a slightly modified version of the code snippet linked above, to create the checksum digit for the first 11 characters of an ISIN.

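import string

def digit_sum(n):
    # Sum of the decimal digits of n (n is at most 18 here)
    return (n // 10) + (n % 10)

# Map '0'-'9' to 0-9 and 'A'-'Z' to 10-35
alphabet = {letter: value for (value, letter) in
            enumerate(string.digits + string.ascii_uppercase)}

def isinChecksumDigit(isin):
    # Replace every character with its numeric value and concatenate the digits
    isin_to_digits = ''.join(str(alphabet[v]) for v in isin)
    isin_sum = 0
    # Luhn: starting from the rightmost digit, double every other digit
    for (i, c) in enumerate(reversed(isin_to_digits), 1):
        if i % 2 == 1:
            isin_sum += digit_sum(2*int(c))
        else:
            isin_sum += int(c)

    # Smallest digit that brings the total up to a multiple of 10
    checksum_digit = -isin_sum % 10
    return checksum_digit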

Assuming random has been imported and countries is a list of valid country codes (ISO 6166 uses the two-letter ISO 3166-1 alpha-2 codes as ISIN prefixes), you can call isinChecksumDigit as follows:

In [1]: isin = '{:2s}{:09d}'.format(random.choice(countries), random.randint(1, 10**9 - 1))

In [2]: isin
Out[2]: 'KR681111517'

In [3]: validIsin = isin + str(isinChecksumDigit(isin))

In [4]: validIsin
Out[4]: 'KR6811115171'
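
As a quick sanity check, here is a minimal self-contained sketch of the same computation together with a validator. The names isin_checksum_digit and is_valid_isin are mine, not part of any library, and the sketch assumes the input only contains digits and uppercase letters.

```python
import string

# '0'-'9' map to 0-9, 'A'-'Z' map to 10-35 (position in this string)
ALPHABET = string.digits + string.ascii_uppercase


def isin_checksum_digit(body):
    """Luhn check digit for the first 11 characters of an ISIN."""
    digits = ''.join(str(ALPHABET.index(ch)) for ch in body)
    total = 0
    # Walk from the rightmost digit, doubling every other one
    for position, char in enumerate(reversed(digits), 1):
        value = int(char)
        if position % 2 == 1:
            value *= 2
            value = value // 10 + value % 10  # sum of the digits of the doubled value
        total += value
    return -total % 10


def is_valid_isin(isin):
    """True if isin has 12 characters and its last digit matches the checksum."""
    if len(isin) != 12 or not isin[-1].isdigit():
        return False
    return isin_checksum_digit(isin[:11]) == int(isin[-1])


print(isin_checksum_digit('KR681111517'))
print(is_valid_isin('KR6811115171'))
print(is_valid_isin('KR6811115172'))
```

The first call reproduces the check digit from the session above; the last call shows that changing the final digit makes validation fail.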

Calling external commands in awk

Today I had a file in which some lines contained a base 10 number. I wanted to convert this number to base 36, keeping the rest of the line formatted as it was in the original file.
The entry was something along the lines of:


I decided to resort to awk and call an external command to do the conversion.

Turns out, there are a couple of ways to make awk fire off an external command. One is the system function, which runs the command and returns its exit status. The other is to write the command as a double-quoted string and pipe it into getline, which captures its output. There’s a nice forum discussion around this topic, and for once it’s not on StackOverflow.

Here’s the first draft of the solution.

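awk '{
  if (/thisField=.([0-9]+)./) {
    # three-argument match() is a gawk extension; m[0] holds the matched digits
    match($1, "([0-9]+)", m);
    cmd = "python -c \"import numpy as np; print(np.base_repr(" m[0] ", 36))\"";
    cmd | getline converted;
    close(cmd);  # release the pipe before the next matching line
    sub("([0-9]+)", converted);
  }
  print;
}' orig.dat > converted.dat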

The trick is to pipe the output of the command into getline so that it can be saved in a variable, which is used later.
As a side note, this rough solution is extremely slow: a new Python process is started, and numpy imported, for every line that matches. Other solutions for this particular problem exist, such as using bc after setting a proper obase, or using awk alone.
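
For comparison, here is a sketch of the same conversion done in a single Python process, without numpy. The sample line format (thisField="12345") is an assumption on my part, inferred from the regular expression used in the awk script, and to_base36 and convert_line are hypothetical helper names.

```python
import re

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Same shape as the awk pattern: the field name, one arbitrary character, then digits
FIELD = re.compile(r"(thisField=.)([0-9]+)")


def to_base36(number):
    """Convert a non-negative integer to its base 36 representation."""
    if number == 0:
        return "0"
    out = []
    while number > 0:
        number, remainder = divmod(number, 36)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


def convert_line(line):
    """Rewrite the number following thisField= in base 36; other lines are unchanged."""
    return FIELD.sub(lambda m: m.group(1) + to_base36(int(m.group(2))), line, count=1)


print(convert_line('thisField="12345" other=7'))
print(convert_line('no field on this line'))
```

Unlike the awk draft, which replaces the first run of digits on the line, this version only touches the digits that directly follow thisField=. Applying convert_line to every line of orig.dat and writing the results out would give the equivalent of converted.dat.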

Git: apply a lost stash

Today’s scenario: stash save something in order to get the latest source code, stash pop to re-apply the last changes, do something silly and lose all in-flight changes.
Fear not, there’s a way to recover the “lost” stash, assuming you still have the output of the stash pop.
In fact, after the pop operation there should be a line that goes like this:

Dropped refs/stash@{0} (f2acd7f56e93236bbb813a9dab3bba18e124da04)

Armed with the SHA code, just cherry-pick it:

git cherry-pick f2acd7f56e93236bbb813a9dab3bba18e124da04 -m 1

Watch out: this will commit your change as it is. You may want to git reset --soft HEAD^1 and git commit --amend accordingly!

Monitoring WiFi packets

It’s been a long time since I last enjoyed sniffing WiFi packets… the kernel version was 2.4.x to 2.6.x, the first Centrinos made their appearance, OpenWrt was the next big thing, and we all tried to put our hands on Prism II PCMCIA PC Cards. WEP was around, along with the first tools to crack it. It was good fun.

Now, fast forward quite a few years, I can’t believe I still have to debug my own local network by means of dumpcap to monitor AP associations, etc.

So that next time I don’t have to sift through the official Wireshark help page (which, by the way, is actually very thorough), here are some quick instructions to sniff packets on a WLAN.

  1. Leave your wifi device on and connected to the access point (no need to do the ifconfig up/down and iwconfig dance anymore!)
  2. If you don’t have it already, install aircrack-ng
  3. Run airmon-ng: airmon-ng start wlan1
  4. If it says monitor mode is enabled on some device (might be a brand new one, such as mon0) you’re good to go. Sniff some packets from Wireshark (tick the “monitor mode” preference) or using dumpcap: dumpcap -i mon0 -I

Better yet, you can fake an AP and have the device you’re debugging connect to it.

Unpickling GitPython datetimes


I’ve been playing around with GitPython recently, in an effort to analyse the relation between commits and software quality.

One by-product of this analysis was a Pandas Series of the number of commits on a given day. Since this turned out to be a time-consuming operation (as I needed to repoint HEAD back in time for each day I was interested in), I opted to pickle the Series. Imagine the horror when, the day after I had run the script, I discovered that unpickling the data raised an exception.

In [4]: commits = pd.read_pickle('commits.pkl')
TypeError: __init__() takes at least 2 arguments (1 given)

That error is raised from code that is part of the Pandas library.
However, the message does not say which class actually raised it.

Entering %debug and going up and down the stack didn’t reveal much either, so I decided to go closer to the actual unpickling operation, using cPickle.

In [10]: commits = cPickle.load(open('commits.pkl'))
TypeError: ('__init__() takes at least 2 arguments (1 given)', <class 'git.objects.util.tzoffset'>, ())

Still an error, but a more meaningful one. Let’s see what a brief inspection of tzoffset shows.

In [11]: import git.objects.util

In [12]: git.objects.util.tzoffset?
Init signature: git.objects.util.tzoffset(self, secs_west_of_utc, name=None)
File: /opt/bats/lib/python2.7/site-packages/git/objects/
Type: type

So __init__ expects a secs_west_of_utc positional argument (no default).

To still be able to unpickle your data without the need for running the script again, you just need to mock that class with a slightly modified one. Thank partial applications for that.

In [20]: git.objects.util.tzoffset = partial(git.objects.util.tzoffset, secs_west_of_utc=0)
In [21]: commits = pickle.load(open('commits.pkl'))
In [22]:

Job done – thank you functools!
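
The whole situation can be reproduced without GitPython. What follows is a minimal sketch written for Python 3 (the session above is Python 2): TzOffset and the throwaway module fake_git_util are hypothetical stand-ins for git.objects.util.tzoffset and its module, built only to trigger the same failure and show the partial-based workaround.

```python
import datetime
import pickle
import sys
import types
from functools import partial


# Hypothetical stand-in for git.objects.util.tzoffset: a tzinfo subclass whose
# __init__ needs an argument that pickle does not know how to supply.
class TzOffset(datetime.tzinfo):
    __qualname__ = "TzOffset"

    def __init__(self, secs_west_of_utc, name=None):
        self._offset = datetime.timedelta(seconds=-secs_west_of_utc)
        self._name = name or "fixed"

    def utcoffset(self, dt):
        return self._offset

    def tzname(self, dt):
        return self._name

    def dst(self, dt):
        return datetime.timedelta(0)


# Register the class in a throwaway module so pickle can find it by name
fake_module = types.ModuleType("fake_git_util")
fake_module.TzOffset = TzOffset
TzOffset.__module__ = "fake_git_util"
sys.modules["fake_git_util"] = fake_module

payload = pickle.dumps(TzOffset(3600))

# Unpickling calls TzOffset() with no arguments, which fails
try:
    pickle.loads(payload)
    failed = False
except TypeError:
    failed = True

# Workaround: make the name resolve to a partial that supplies the argument.
# The pickled state is applied afterwards, so the original offset is restored.
fake_module.TzOffset = partial(TzOffset, secs_west_of_utc=0)
restored = pickle.loads(payload)

print(failed, restored.utcoffset(None))
```

The value passed to partial is only a placeholder: pickle restores the saved attributes right after constructing the object, so the unpickled instance carries the offset it was saved with.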

Tricks for dealing with panes in tmux

Straight to the point: here’s a list of config lines/commands that I hope will make your life as easy as they have made mine!

Write on all open panes at the same time

CTRL+B : to enter the tmux command line, then type setw synchronize-panes on.
Of course, set it to “off” to disable synchronised writing.

Change panes layout

CTRL+B Space to cycle through all layouts.
CTRL+B META+1 to META+5 to select one specific preset layout (even-horizontal, even-vertical, main-horizontal, main-vertical, tiled).
On my machine CTRL+B ESC 1 to 5 seems to do the trick instead.

Move panes around

CTRL+B CTRL+o to rotate them.
CTRL+B { and CTRL+B } to move the active pane left/up or right/down.

The connection string of a SQLAlchemy connection

In the middle of a Pdb session, while debugging a test, I found myself with a SQLAlchemy connection object, which was connected to… some database. To figure out which database it was connected to, I could scan the code to see where the connection had been initialised.

However, there’s a quicker way: looking at the _dsn variable of the underlying connection object – DSN standing for Data Source Name.

(Pdb) p conn.connection._dsn
'host=thehostname dbname=master_db_02 user=the_usr password=the_password connect_timeout=5 application_name=/usr/bin/nosetests'

Using Magic Cookies to run programs remotely as root


Unbelievable how many times I fell for this – and am still falling.

The situation is as follows: you are on a remote box, connected via SSH with X forwarding enabled. You can run any graphical program (say, wireshark) as that user, but as soon as you try prepending sudo you get: (wireshark:8881): Gtk-WARNING **: cannot open display: .

If you’ve been following me for long enough, you know I’ve been bitten already by a similar problem in the past. The only (minor) difference is that this time I don’t even have a DISPLAY variable set (as root).

So here’s another fix, this time using magic cookies.

Step 1, as normal user type echo $(xauth list ${DISPLAY#localhost}). You’ll get something like this back: machine/unix:25 MIT-MAGIC-COOKIE-1 41f6c7f04a706ca5e490b3edf8a26491

Step 2, as root, run xauth add followed by the line you got as output on the shell, that is: xauth add machine/unix:25 MIT-MAGIC-COOKIE-1 41f6c7f04a706ca5e490b3edf8a26491.

Exit the root shell, confidently type sudo DISPLAY="localhost:25.0" followed by the program you want to run, and enjoy!

Removing LaTeX commands using Python “re” module

Recently I had to sanitize lines in a .tex file where a \textcolor command had been used.
The command was being used the following way: {\textcolor{some_color}{text to color}}.

The main problem was that the command could appear any number of times in a line, so I couldn’t apply a replacement a fixed number of times.
Also, given that any color could have been used, a simple “blind replace” was clearly not a good weapon in this case.

I therefore resorted to applying a regex repeatedly until the line was cleaned of any \textcolor command.

In a nutshell:

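import re

def discolor(line):
    # Raw string, so that \t in \textcolor is not read as a tab
    regex = re.compile(r'(.*?)\{\\textcolor\{.*?\}(\{.*?\})\}(.*)')
    while True:
        try:
            # Keep what comes before, the braced text, and what comes after
            line = ''.join(regex.match(line).groups())
        except AttributeError:
            # No match left: regex.match returned None
            return line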

The key part here is that we match not only the text inside the \textcolor command, but also what comes before and after it (the leading (.*?) and trailing (.*) groups). We keep rejoining the groups until no \textcolor command is left: when that happens, the match returns None, so accessing .groups() raises an AttributeError, which we catch and use as a sentinel to know when to return.
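
As an alternative, the same clean-up can be sketched with a single re.sub call, which replaces every occurrence in one pass. This is not the approach described above but a variant, under the assumption that neither the color name nor the colored text contains nested braces; discolor_sub is a hypothetical name.

```python
import re

# {\textcolor{<color>}{<text>}} where neither part contains braces.
# The braced text is captured so that it can be kept in the output.
TEXTCOLOR = re.compile(r'\{\\textcolor\{[^{}]*\}(\{[^{}]*\})\}')


def discolor_sub(line):
    """Drop every \\textcolor wrapper, keeping the braced text it colored."""
    return TEXTCOLOR.sub(r'\1', line)


sample = r'A {\textcolor{red}{warning}} and a {\textcolor{blue}{note}}.'
print(discolor_sub(sample))
```

Like the loop-based version, this keeps the colored text inside its own pair of braces, so the sample line becomes A {warning} and a {note}.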