Python setuptools entry points

This is a quick tutorial on how to use Python setuptools entry points to elegantly distribute command-line scripts as part of your program, and to let your code discover and use plugins in a principled fashion.

Installing scripts

Say your program supplies a Python script, dothis.py. If you have been using distutils, your setup.py script probably contains something like:

setup(...,
      scripts=['dothis.py'],
)

When I first used this it was like magic. This line causes setup to install an executable copy of dothis.py into a directory on your PATH, such as /usr/local/bin.

setuptools has a slightly different, but ultimately better, way to distribute command-line scripts. You may have to refactor your code a little to get this to work, but the refactoring improves the layout of your code. Suppose your original script (like some of mine) had the pattern:

# Your module code

if __name__ == "__main__":
  # Parse command line args
  # Do a bunch of stuff
  pass

You need to move the code under the __main__ guard into a function. Let’s call it cli():

# Your module code

def cli():  # Entry point for scripts
  # Parse command line args
  # Do a bunch of stuff
  pass

if __name__ == "__main__":
  cli()

(In general this is good practice, but folks have been known to ignore it.)
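
As a concrete sketch of the refactored layout, here is what dothis.py might look like. The greet() helper and the --name option are made up for illustration; only the cli() name comes from the text above. Letting cli() accept an optional argv makes it easy to call from tests, while the installed command simply calls it with no arguments.

```python
# dothis.py (illustrative sketch; greet() and --name are hypothetical)
import argparse


def greet(name):
    """The 'real work' of the module, kept separate from argument parsing."""
    return "Hello, {}!".format(name)


def cli(argv=None):
    """Entry point for scripts. With argv=None, argparse reads sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Greet someone.")
    parser.add_argument("--name", default="world", help="who to greet")
    args = parser.parse_args(argv)
    print(greet(args.name))
    return 0


if __name__ == "__main__":
    cli()
```

With this layout the module can be imported without side effects, and python dothis.py still works as before.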

In your setup script you should now discard the scripts line and instead use:

    ...
    entry_points={
           ....
      # Command line scripts
      'console_scripts': ['runme = dothis:cli']
    },
    ...

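Here runme is the name of the command that gets installed, dothis is the module, and cli is the function the command calls. This has several advantages, mainly that the generated launcher works on both POSIX and Windows systems. There is also a gui_scripts group for distributing Python GUI programs!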
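
To make the 'runme = dothis:cli' syntax less mysterious: roughly speaking, the installed command imports the module named before the colon, looks up the attribute named after it, and calls it. The sketch below mimics that lookup by hand. Note that resolve_entry_point is a hypothetical helper written for this post, not part of setuptools, and json:dumps stands in for dothis:cli so that the example runs anywhere.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name = module:attr' and return (name, the object it refers to).

    Hypothetical helper for illustration; setuptools does this for you.
    """
    name, _, target = spec.partition("=")
    module_name, _, attr_path = target.strip().partition(":")
    obj = importlib.import_module(module_name.strip())
    # An entry point without ':attr' refers to the module itself.
    for attr in filter(None, attr_path.strip().split(".")):
        obj = getattr(obj, attr)
    return name.strip(), obj


name, func = resolve_entry_point("to_json = json:dumps")
print(name, func([1, 2]))  # to_json [1, 2]
```

The same 'name = module' or 'name = module:attr' form is what the plugin entries further down use.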

Plugins

Setuptools gives us a really nice way to implement a plugin system without having to write a plugin manager that keeps track of where plugins are stored and how to import them (Relative import? Absolute path? Dotted path?). It also means that neither we nor the plugin writers have to worry about setting up some common location where plugin code must be installed so that the main app can discover it.

What you do is simply settle on a group name for your plugin system, say my.cool.plugins. It is important to choose a unique name, since the whole Python install will know about this name and we don’t want collisions. Including the name of your package in it is a good bet.

In your application code, you can use something like this to find and load a plugin module:

import pkg_resources


def _load_plugin(name, plugin_entry_point):
  # Return the first entry point in the group that matches the requested name
  for v in pkg_resources.iter_entry_points(plugin_entry_point, name):
    return v.load()
  raise ImportError('No plugin called "{:s}" has been registered.'.format(name))

Here ‘my.cool.plugins’ is the value you should pass for plugin_entry_point. The function returns the result of load(), which in this case is the plugin module, just as if we had imported it by hand.

You can also discover all the available plugin modules:

def discover_all_plugins():
  # Sort by plugin name; the cmp= argument to sorted() only exists in Python 2
  return sorted([(v.name, v.module_name) for v in pkg_resources.iter_entry_points('my.cool.plugins')],
                key=lambda x: x[0])
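
A note for newer Python versions: the setuptools project has since deprecated pkg_resources in favour of the standard library's importlib.metadata (Python 3.8 and later). A rough equivalent of the two functions above might look like the sketch below. The function names and the PLUGIN_GROUP constant are my own, and the hasattr check is there because the object returned by entry_points() changed in Python 3.10.

```python
from importlib.metadata import entry_points

PLUGIN_GROUP = "my.cool.plugins"  # the group name chosen above


def _entry_points_in(group):
    """Return the entry points registered under `group` as a list."""
    eps = entry_points()
    if hasattr(eps, "select"):       # Python 3.10 and later
        return list(eps.select(group=group))
    return list(eps.get(group, []))  # Python 3.8 / 3.9: plain dict of lists


def load_plugin(name, group=PLUGIN_GROUP):
    """Load the plugin registered as `name`, or raise ImportError."""
    for ep in _entry_points_in(group):
        if ep.name == name:
            return ep.load()
    raise ImportError('No plugin called "{}" has been registered.'.format(name))


def discover_all_plugins(group=PLUGIN_GROUP):
    """Return sorted (name, target) pairs for every plugin in the group."""
    return sorted((ep.name, ep.value) for ep in _entry_points_in(group))
```

Until some installed package registers entries under the group, discover_all_plugins() returns an empty list and load_plugin() raises ImportError.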

But how can we make the magic happen? How can we make a plugin discoverable this way? The magic lies in the setup.py of the package supplying the plugin. Say the plugin code lies in a file called plugin_file.py under the directory my/plugin/code:

    ...
    entry_points={
           ....
      # Register the built in plugins
      'my.cool.plugins': ['fancy_name = my.plugin.code.plugin_file']
    },
    ...

Once you run python setup.py install (or pip install .) on this package, your application will be able to see the plugin.

Many thanks to Björn Pollex, who encouraged me to look into setuptools entry points as a way to clean up how I distribute scripts and implement plugins.

The Felix problem

Say you have a multiplayer game where there is no central server. You need to indicate to a pair of players whether they are within a certain distance of each other, but you can’t let any player know the actual position of any other player. Could you do it?

I was driving us back from the 2014 Biological Data Science meeting at Cold Spring Harbor when Felix posed this problem to me. The drive was alternately boring and annoying as we hit pockets of overly polite New York drivers on the way out. (There was, though, a stretch of road along the Hutchinson River Parkway that was SPECTACULAR in terms of fall foliage. We didn’t take any pictures, but that alone made driving to and from the conference worth it.)

As a complete aside, Felix had the GPS voice on his phone turned on and we were entertained by numerous howlers, one of them being the voice calling it the “Hutchinson River Peek-wee” (pkwy) and another being “Take I-95 New Hampshire minus Maine”, which Felix declared would simply be New Hampshire, since there is no intersection between the two sets.

Anyhow, back on point: to relieve the boredom, Felix was telling me crypto-related things and came up with this question, which he claimed he did not have a definitive answer to.

Here are my attempts during the drive:

Players A and B want their distance checked. Players C and D are “third parties”.

Scheme 1 (Fails)

Player A sends their position, along with N-1 random positions, to C. Each position has a tag, and only A knows which tag marks the correct position. Player B does the same.

Player C computes the result for all N^2 pairs of positions and returns the results to both A and B.

My idea was that when they get the results back, A and B would be able to pick out the correct result because they know their own tag. At this point I realized the scheme fails: each player gets back N results containing their own tag, one for each of the other player’s tagged positions, and cannot tell which of those corresponds to the other player’s true position.

But this led me to the second scheme, which works:

Scheme 2

Players A and B each send their position, along with N-1 random positions, all tagged, to C. Separately, they each send the tag of their true position to D.

C sends the N^2 computation results to D, who picks out the correct answer based on the tag pair received from A and B and then sends that result back to A and B.
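
Here is a small simulation of Scheme 2, as a sketch under a few assumptions of my own: positions are integer (x, y) grid coordinates, tags are shuffled indices, and C reports an in-range flag per tag pair rather than the raw distance. All function names are made up for illustration.

```python
import random


def tag_positions(true_pos, n, rng):
    """Player side: hide true_pos among n-1 random decoys.

    Returns (tagged, true_tag): tagged goes to C, true_tag goes to D.
    """
    decoys = [(rng.randrange(-1000, 1000), rng.randrange(-1000, 1000))
              for _ in range(n - 1)]
    tags = rng.sample(range(n), n)  # a random permutation of 0..n-1
    # Sort by tag so the ordering does not give away the true position.
    tagged = dict(sorted(zip(tags, [true_pos] + decoys)))
    return tagged, tags[0]


def c_compute(tagged_a, tagged_b, max_dist):
    """Player C: an in-range flag for every one of the N^2 tag pairs."""
    limit = max_dist * max_dist
    return {(ta, tb): (pa[0] - pb[0]) ** 2 + (pa[1] - pb[1]) ** 2 <= limit
            for ta, pa in tagged_a.items()
            for tb, pb in tagged_b.items()}


def d_select(table, tag_a, tag_b):
    """Player D: pick the one result that belongs to the true positions."""
    return table[(tag_a, tag_b)]


rng = random.Random(0)
to_c_from_a, to_d_from_a = tag_positions((0, 0), 5, rng)
to_c_from_b, to_d_from_b = tag_positions((3, 4), 5, rng)
table = c_compute(to_c_from_a, to_c_from_b, max_dist=5)
print(len(table), d_select(table, to_d_from_a, to_d_from_b))  # 25 True
```

C sees positions but not which ones are real, and D sees tags and flags but no positions.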

Scheme 3 – nice refinement of 2

I then realized that you can avoid the large number of computations (N^2, even if they are simple) by doing the following: A adds a random offset to their position (p_a + q_a) and sends this to C. B sends a similarly offset position (p_b + q_b) to C. Both A and B send their random offsets, q_a and q_b, to D.

C computes r_1 = (p_a + q_a) - (p_b + q_b) and sends it to D.

D then subtracts the difference of the offsets from the result it got from C: r_2 = r_1 - (q_a - q_b), which is r_2 = (p_a + q_a) - (p_b + q_b) - (q_a - q_b), i.e. r_2 = p_a - p_b.

This last quantity is the one we want: D compares its magnitude to the allowed distance and broadcasts the flag “In range” or “Out of range”. Note that neither C nor D knows the actual positions of A and B, and A and B don’t know each other’s positions either.
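
The arithmetic of Scheme 3 can be checked with a few lines of Python. This is only a sketch: I am assuming integer (x, y) grid coordinates and integer offsets so that the subtraction is exact, and the function names are invented for this post.

```python
import random


def blind(pos, rng):
    """Player side: return (blinded position for C, offset for D)."""
    offset = (rng.randrange(-10**6, 10**6), rng.randrange(-10**6, 10**6))
    blinded = (pos[0] + offset[0], pos[1] + offset[1])
    return blinded, offset


def c_difference(blinded_a, blinded_b):
    """Player C: r_1 = (p_a + q_a) - (p_b + q_b), per coordinate."""
    return (blinded_a[0] - blinded_b[0], blinded_a[1] - blinded_b[1])


def d_in_range(r1, offset_a, offset_b, max_dist):
    """Player D: r_2 = r_1 - (q_a - q_b) = p_a - p_b, then test its length."""
    r2 = (r1[0] - (offset_a[0] - offset_b[0]),
          r1[1] - (offset_a[1] - offset_b[1]))
    return r2[0] ** 2 + r2[1] ** 2 <= max_dist ** 2


rng = random.Random()
blinded_a, q_a = blind((10, 20), rng)   # A's true position is (10, 20)
blinded_b, q_b = blind((13, 24), rng)   # B's true position is (13, 24)
r1 = c_difference(blinded_a, blinded_b)
print(d_in_range(r1, q_a, q_b, max_dist=5))  # True: the players are exactly 5 apart
print(d_in_range(r1, q_a, q_b, max_dist=4))  # False
```

Whatever offsets are drawn, they cancel in r_2, so the flag depends only on the true positions.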

I was pretty happy to arrive at this solution, and by this time we had meandered our way out of New Rochelle.