Jason in a Nutshell

All about programming and whatever else comes to mind

Posts Tagged ‘Python’

You need to worry about deployment

Posted by Jason Baker on April 29, 2009

Oftentimes, people used to using PHP or Classic ASP will give up on Python because deploying Python scripts isn’t just a simple matter of copying and pasting files.  Usually, this isn’t because Python is making their lives difficult.  It’s more a matter of not thinking about how to deploy your scripts ahead of time.

Now, I’m far from a highly experienced programmer, but I can tell you one thing.  Deployment is a detail that will come back to bite you if you don’t spend a little bit of time on it up front.  So here are a few pointers I’ve come up with after deploying Python scripts for web apps.

  1. Batch/Shell scripts are your friends.  A common objection is that a person just doesn’t have time to learn a new tool for deploying.  I have two responses to this: 1) drop that attitude, otherwise you’ll never get anywhere as a programmer, and 2) you don’t have to learn a new tool if you really don’t want to!  While batch and shell scripts aren’t the prettiest options, they’re a lot better than having nothing to automate deployment at all.  In fact, for the basic one or two page webapp, you can’t really do much better.
  2. If you invest some time in Continuous Integration, you won’t regret it.  I know what you’re saying.  Continuous Integration is a Java thing.  It’s too complicated.  And you’d be right to a point.  However, I would argue that making sense of the complication is worth your time.  It’s way too easy to deploy something that doesn’t work because somebody forgot to run their unit tests.
  3. Site-wide packages are evil.  If you aren’t already, you should really be taking advantage of virtualenv.  That is, unless of course you enjoy troubleshooting weird ImportErrors because of that egg you installed using the setuptools develop command a month ago and forgot to remove.
  4. Don’t underestimate the value of good docs.  Having good documentation is just one of those things that doesn’t become obviously necessary until it’s too late.  Don’t leave yourself trying to figure out how that one function you wrote a year ago works.  Write documentation as you go and use a tool like Sphinx to turn it into a webpage.  This ties in with point 2.  Using Continuous Integration will make doc generation that much easier.
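
To make point 3 a bit more concrete, here is a minimal sketch of a check a deployment script could run before doing anything else.  The function name is my own invention, and it assumes that an isolated environment shows up either as sys.real_prefix (older virtualenv releases) or as a sys.prefix that differs from sys.base_prefix (venv and newer virtualenv releases).

```python
import sys


def in_isolated_environment():
    """Return True when the interpreter appears to run inside a virtualenv/venv."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.prefix differ from
    # sys.base_prefix; fall back to "not isolated" if base_prefix is missing.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if not in_isolated_environment():
    print("warning: deploying against site-wide packages")
```

A deploy script could turn that warning into a hard failure if you want to be strict about it.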

Admittedly, doing this stuff can be a pain.  And you might get scorn from co-workers for not having something ready the next day.  But it will be worth it.  You’ll be surprised at how much time you’ll save in the long run.

Posted in Programming | Leave a Comment »

The magic of python decorators

Posted by Jason Baker on April 25, 2009

Decorators in Python are one of the language’s more “magical” features.  I’ve tended to gloss over decorators in code because they always seemed to be fairly self-explanatory.  But how do you make your own?  Personally, I think understanding the uses of decorators and being able to write your own is one of the points where a Python newbie transitions to a knowledgeable Pythonista.

But what is a decorator?

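There’s no special language construct for defining decorators.  Any callable that takes a function as a parameter can be used as a decorator.  Most decorators return a new function, but whatever they return (property, for instance, returns a property object) is simply bound to the decorated name.  Functions that take or return other functions are known as “higher-order functions” in functional programming circles.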

Chances are, you’ve already seen decorators in use and maybe even used them.  I’ll give you a common example of decorator usage:

class SomeClass(object):
    @property
    def x(self):
        return 5

>>> var = SomeClass()
>>> var.x
5

For those of you familiar with the concept of properties, it should be pretty straight-forward what’s going on here.  But where the heck did property come from?  I’ll give you a hint.  The above code works identically to this code:

class SomeClass(object):
    def x(self):
        return 5

    x = property(x)
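
The same equivalence holds for a decorator you write yourself.  Here is a minimal sketch using a made-up decorator called shout (not part of any library) to show that the @ line and the manual rebinding do the same thing:

```python
def shout(func):
    """Hypothetical decorator: upper-case whatever func returns."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


@shout
def greet(name):
    return "hello, %s" % name


def greet_plain(name):
    return "hello, %s" % name

# Equivalent to the @shout line above: rebind the name by hand.
greet_plain = shout(greet_plain)

print(greet("bob"))        # HELLO, BOB
print(greet_plain("bob"))  # HELLO, BOB
```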

 

I can write my own?!

Yes you can.  A lot of well-written libraries make very good use of decorators.  And given the right situation, the little bit of syntactic sugar they provide can do a lot of good.  But decorators aren’t just there for others to define.  Once you wrap your head around decorators, they can save you a lot of copy-and-pasting (which you’re not doing anyway, right?) when used in your own code.  To illustrate this, I want to show you a couple of very much real-world cases where I’ve found decorators to be useful.

Those pesky connection objects

I have a library that needs to call a few particular stored procedures in a SQL Server database.  Because my ORM doesn’t support stored procedures, I have to use straight adodbapi.  The calls look something like this:

import adodbapi

# CONNECTION_STRING is assumed to be defined elsewhere
def LookupPerson():
    conn = adodbapi.connect(CONNECTION_STRING)
    try:
        pass  # do stuff here
    finally:
        conn.close()

This is all well and good for just one function.  But what happens when you need 4 or 5 of these?   And what about the visual cruft that the try finally block is adding to the function (adodbapi doesn’t support with blocks before you ask)?  I’m sure you’ve already guessed the solution by now.  Here’s how you can solve this problem:

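import functools

import adodbapi

def with_connection(func):
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def _exec(*args, **argd):
        conn = adodbapi.connect(CONNECTION_STRING)
        try:
            # pass the connection first and hand back whatever func returns
            return func(conn, *args, **argd)
        finally:
            conn.close()
    return _exec

@with_connection
def LookupPerson(conn):
    pass  # do stuff here

LookupPerson()  #conn argument is passed by the decorator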

I think that the simplification that happens with LookupPerson here should be obvious.  But what is the purpose of the *args and **argd shenanigans?  The documentation covers arbitrary argument lists in depth, so I won’t go into too much detail.  But what happens if I want to look up a person by name?  The LookupPerson function would be transformed to this:

@with_connection
def LookupPerson(conn, name):
    pass  # do stuff here

LookupPerson('Bob')  #conn argument is passed by the decorator
LookupPerson(name='Jill')

 In fact, I can decorate any function that takes any number of arguments either by keyword or by position.  Pretty neat, eh?
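
If you want to try this without a SQL Server handy, here is a self-contained sketch of the same pattern.  FakeConnection, connect and lookup_person are stand-ins I made up for illustration; they are not part of adodbapi.  The point is that the wrapped function’s return value comes back to the caller and the connection is closed no matter what.

```python
import functools


class FakeConnection(object):
    """Stand-in for a real DB connection (hypothetical, for illustration)."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


opened = []  # records every connection handed out, so we can inspect it


def connect():
    conn = FakeConnection()
    opened.append(conn)
    return conn


def with_connection(func):
    @functools.wraps(func)
    def _exec(*args, **argd):
        conn = connect()
        try:
            # Positional and keyword arguments are forwarded untouched.
            return func(conn, *args, **argd)
        finally:
            conn.close()
    return _exec


@with_connection
def lookup_person(conn, name, greeting="Hello"):
    return "%s, %s (connection open: %s)" % (greeting, name, not conn.closed)


print(lookup_person("Bob"))
print(lookup_person(name="Jill", greeting="Hi"))
print(all(conn.closed for conn in opened))
```

Inside the function the connection is still open; by the time the call returns, the finally block has closed it.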

Error handling

When writing web applications, it’s pretty important to have decent error handling.  But setting this up can be a pain.  For instance, what if I wanted to make my Django application print a wonderfully informative traceback to a log file?  I could do that like this:

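import logging
from traceback import format_exc

from django.http import HttpResponseServerError

def index(request):
    try:
        pass  # do stuff
    except Exception:
        # logging.log() needs a level as its first argument, so use logging.error()
        logging.error(format_exc())
        return HttpResponseServerError('Error!')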

But this definitely can become problematic.  What happens if you duplicate this code in all of your view functions and you want to make a change to your error handling?  The solution is simple:

def handle_errors(func):
    def _handler(*args, **argd):
        try:
            return func(*args, **argd)
        except Exception:
            logging.error(format_exc())
            return HttpResponseServerError('Error!')
    return _handler  # without this, the decorated view would be None

@handle_errors
def index(request):
    pass  # do stuff here

 As you can tell, this allows us to make our views worry about actually doing stuff instead of constantly handling errors.  Yes, there are also middlewares for doing this kind of thing.  But then I wouldn’t have a reason to make a blog post about python decorators, would I?
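
Here is a framework-free sketch of the same idea that you can run as-is.  It assumes a plain string stands in for the HTTP response, and good_view and bad_view are made-up names.  It also uses logger.exception, which logs at ERROR level and appends the current traceback for you.

```python
import functools
import logging

logger = logging.getLogger(__name__)


def handle_errors(func):
    @functools.wraps(func)
    def _handler(*args, **argd):
        try:
            return func(*args, **argd)
        except Exception:
            # logger.exception logs at ERROR level and appends the traceback.
            logger.exception("unhandled error in %s", func.__name__)
            # A Django view would return HttpResponseServerError here.
            return "Error!"
    return _handler


@handle_errors
def good_view(request):
    return "ok: %s" % request


@handle_errors
def bad_view(request):
    raise ValueError("boom")


print(good_view("request"))
print(bad_view("request"))
```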

Conclusions

Ok, so I’ll admit something.  Python’s decorator syntax is ugly.  Its Java-like syntax alone may even be enough to scare some off.  But as I’ve shown, there are at least a few real-world cases where they are useful.

What are some other decorators that you’ve found to be useful?

Posted in Programming | 2 Comments »

Java vs C# vs Python vs Ruby: an “objective” analysis

Posted by Jason Baker on April 21, 2009

At my place of employment, we’re looking to migrate some of our old Classic ASP applications to something newer (yes, we still have actively maintained Classic ASP code). So my boss asked me to write up an analysis of the different options we have available.

Now before I give you the link, I have a few disclaimers and other random thoughts:

  • I tried to avoid editorializing so that this can be objective as possible, but it’s impossible to discuss these kinds of issues without being subjective.  Thus, don’t take this as the gospel truth.
  • There may be errors here.  In fact, I can almost guarantee that there are errors with Java and Ruby because I’m not terribly familiar with them.  If there are errors, feel free to leave comments.
  • My boss is a Java guy, so I left some blanks that I’m pretty sure he can fill in.
  • Some of these are blatant oversimplifications.  There’s only so much data that you can squeeze into a spreadsheet.
  • I’ll try and keep up with this for a while, but chances are that I won’t for long.  These languages are all being changed.
  • I’m biased towards Python.

Ok, without further ado, here’s the link:

http://spreadsheets.google.com/pub?key=p7efJLoHuYE-iw6JxBmpSQg&hl=en

Posted in Programming | 12 Comments »

Finding a user’s group membership in Active Directory using Python

Posted by Jason Baker on March 17, 2009

I spent some time trying to figure out what the best way is to determine group membership in Active Directory using Python.  The solution is almost embarrassingly easy, but can be difficult to find for someone who’s not familiar with win32 programming.  To make matters worse, Google is almost totally unhelpful for this.  So here’s the solution:

import win32net
win32net.NetUserGetGroups('domain_name.com', 'username')

 

To do this, you’ll need pywin32 and a Windows computer.  Calling win32net’s NetUserGetGroups function will return a list of tuples.  Each tuple will contain the group name as element 0 and an attributes flag as element 1.  Don’t ask me to elaborate any more on the attributes flag because I can’t.  🙂
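
Since most of the time you only care about the names, here is a small sketch that pulls them out of a result shaped like the one described above.  The helper name and the sample data are made up; real values come from win32net.NetUserGetGroups on a Windows machine.

```python
def group_names(groups):
    """Extract just the names from (group_name, attributes) tuples."""
    return [name for name, _attributes in groups]


# Made-up sample in the shape described above, so this runs anywhere.
sample = [("Domain Users", 7), ("Developers", 7)]
print(group_names(sample))
print("Developers" in group_names(sample))
```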

Posted in Programming | 1 Comment »

The history of python

Posted by Jason Baker on January 13, 2009

I really detest making “link and run” type posts, but I feel that this is a fairly important one to make.  Guido just announced that he’s starting another blog about the history of python.  You can find that blog here.  It’s all very interesting so far.

So why are you still reading my blog when you could be learning about Python from the horse’s mouth?

Posted in Blogging, Programming | 1 Comment »

Enemies of Test Driven Development part I: encapsulation

Posted by Jason Baker on January 8, 2009

Before you leave a nasty comment below, hear me out.  I’m not saying that we need to abandon the idea of encapsulation.  That would be stupid.  Rather, I’m saying that to be able to do test driven development properly, you need to re-think how you handle encapsulation.

(I had thought of naming this post “Enemies of Test Driven Development part I:  The Ideas You Currently Have About Encapsulation”, but that was too long)

What’s that smell?

The most difficult part of dealing with Test Driven Development is learning how to test private methods.  The answer to that is simpler than you may think:  you don’t.

Here’s a quote from Michael Feathers (one of the gurus of testing):

It seems that reverse is true also.  Classes which are hard to instantiate and use in a test harness are more coupled than they could be, and classes with private methods that you feel the urge to test, invariably have some sort of cohesion problem: they have more than one responsibility.

All I can say is that in the community of people doing test-driven development there are a number of people who have found that this question of testing private methods doesn’t come up much in their practice.  They target both testability and good design and find that both goals nurture each other.

It seems to me that Feathers is stopping just short of calling private methods code smells.  I’m going to take it there:  private methods are a code smell.  Does this mean that every private method that’s ever been written is bad?  Of course not.  There are times when private methods are a good and wholesome thing.  But if you’re using a private method, you should really consider if your design is a good one.

Solutions

I won’t dwell much on why private methods can be indicative of bad code.  Plenty has been written on that already.  Rather, I want to focus on overcoming these challenges.  So I’ve come up with a list of solutions to this problem.  Keep in mind that these solutions are tools for the toolbox.  They may not be applicable in every situation, nor are they a complete list.

With that said, here are some possible solutions:

Make it public!

What is it? This is probably the simplest way to overcome the problem of untestability.  And in my opinion, it’s the best solution for the TDD newbie.  Why?  Nine times out of ten, it’s a result of doing what you’re told.  If your university is/was like mine, you were told to make everything private unless you had good reason to make it public.  While that is actually true, it’s not really very useful.  There’s a reason for that:  testability is a perfectly good reason to make something public.  And you should test most of your code.

How do I do it? It’s simple.  Suppose I want to test SomeMethod:


class SomeClass
{
    private void SomeMethod() {...}
}

I could do this:

class SomeClass
{
    public void SomeMethod() {...}
}

Simple, eh?

When should I use it? To decide if this is the avenue you should take, evaluate why you want to make the method private.  If this is just a case of not wanting to make it public because you want to simplify the class’s API, there’s a good chance you’re over-hiding and you should evaluate this solution.  If there’s a deeper reason why you don’t want to make it public, there are a few other solutions.

Use Conventions

What is it? Python’s gotten by on this method for a long time.  And it works pretty well.  The idea is that “we’re all adults here” (if anybody can tell me who to attribute that quote to, let me know!).  If you don’t want somebody to call a certain method, name it as such.  In Python, the convention is to prefix private methods with a single underscore.

C++, C#, and Java users will probably disagree with me here (and that’s fine), but I think that this is an important tactic to note.

How do I do it? Simple.  Suppose I want to test SomeMethod:

class SomeClass(object):
    def SomeMethod(self):
        ...

Then I would just do this:

class SomeClass(object):
    def _someMethod(self):
        ...

When should I use it? The most obvious case is if you’re in an environment where this is acceptable.  If you’re in a Python shop, chances are you’re ok with this.  If you’re in a “curly brace” shop, you may have problems doing this.  We can dispute whether or not the reasons for that are good, but that’s really not relevant.  If you’re in such an organization, you should probably try something else if only for no other reason than to not hear co-workers complain.
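
To show what the convention buys you, here is a minimal sketch of a test that reaches straight into an underscore-prefixed method.  PriceCalculator and its methods are hypothetical names I made up for the example.

```python
import unittest


class PriceCalculator(object):
    """Hypothetical class; _apply_discount is "private" by convention only."""
    def total(self, price):
        return self._apply_discount(price)

    def _apply_discount(self, price):
        return price * 0.9


class PriceCalculatorTest(unittest.TestCase):
    def test_apply_discount(self):
        # Nothing stops a test from calling the underscore-prefixed method.
        self.assertAlmostEqual(PriceCalculator()._apply_discount(100), 90.0)

    def test_total_uses_discount(self):
        self.assertAlmostEqual(PriceCalculator().total(200), 180.0)
```

The underscore tells client code to stay away, while the tests are free to ignore the hint.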

Access Denied!

What is it? Just because you aren’t making private methods doesn’t mean you can’t disallow their use.  This is where interfaces and abstract base classes come into play (for the sake of succinctness, I’ll use the word “interface” to refer to both of these unless otherwise noted for the rest of the post).  Don’t want to allow client code to access a certain method?  Don’t put it in the interface!  Granted, this isn’t a perfect way to prevent client code from calling a method.  But then again, neither is making the method private (even in C++, although it isn’t easy to break there).

Be careful here, though.  If you find yourself creating too many interfaces to allow for giving different classes access to different areas, you’re probably creating a God Object.

How do I do it? Suppose I have a class SomeClass and I want to expose everything but SomeMethod:

class SomeClass
{
    public void SomeMethod() {...}
    public void SomeOtherMethod() {...}
    public void SomeOtherOtherMethod() {...}
}

I could just create an interface ISomeClass for SomeClass to implement, and hand client code only that interface:

interface ISomeClass
{
    void SomeOtherMethod();
    void SomeOtherOtherMethod();
}

When should I use it? If you’re working in a C++/C#/Java shop.  This is a good alternative to the “conventions” method noted above.
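
For the Python folks, roughly the same idea can be sketched with an abstract base class.  This is only an illustration with made-up names (Reportable, Report, client_code), and keep in mind that Python won’t actually stop anyone from calling compute_total; the base class just documents what client code is supposed to rely on.

```python
from abc import ABC, abstractmethod


class Reportable(ABC):
    """Hypothetical interface: the only method client code should rely on."""
    @abstractmethod
    def summary(self):
        ...


class Report(Reportable):
    def summary(self):
        return "total=%d" % self.compute_total()

    # Public so tests can reach it, but absent from Reportable.
    def compute_total(self):
        return sum([1, 2, 3])


def client_code(item):
    # Client code is written against Reportable and only calls summary().
    return item.summary()


print(client_code(Report()))
print(Report().compute_total())
```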

Using Inheritance

What is it? If you’re willing to make a method protected, you can test it by inheriting from the class and exposing a public method that calls the method.

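How do I do it? Suppose you want to test SomeMethod, which is currently private:

class SomeClass
{
    private void SomeMethod() {...}
}

You can make it protected and then test it through a derived class (C#-style syntax; in Java, write "extends SomeClass"):

class SomeClass
{
    protected void SomeMethod() {...}
}

class SomeDerivedClass : SomeClass
{
    public void SomeMethod2() {SomeMethod();}
}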

When should I use it? If you’re working with legacy code, this may be your only option.  Typically this works best when you want to test a class that you don’t have the luxury of being able to change.  Keep in mind that you’re adding another layer between your code and your tests, though.  This is a bigger deal in some languages than it is in others, but in general it’s to be avoided if at all possible.

Make Another Class

What is it? Ok, so you’ve reviewed the above methods, and you’re still just not comfortable with making them public.  This is usually an indicator that your class is doing too much.  Not only does this make testing difficult, it results in more tightly coupled code that will turn into a maintenance nightmare.  The idea then, is to separate the extra functionality into a separate class.

How do I do it? Suppose I have a class to access a database that looks something like this:

class UserParser(object):
    def _fillInfo(self):
        self.Username = getUnameFromDB()
        self.EmailAddress = getEmailFromDB()
        self.Name = getNameFromDB()
    def ParseData(self):
        self._fillInfo()
        do_stuff_with_filled_data()

I can do this:

class UserAccessor(object):
    def FillInfo(self):
        self.Username = getUnameFromDB()
        self.EmailAddress = getEmailFromDB()
        self.Name = getNameFromDB()

class UserParser(object):
    def __init__(self):
        self.user = UserAccessor()
    def ParseData(self):
        self.user.FillInfo()
        do_stuff_with_filled_data(self.user)

When should I use it? There are a couple of situations when you would use this:

  1. When you can’t or won’t use any of the other methods.
  2. When you have a significant number of methods you need to test but don’t want to make public.
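
Here is a runnable sketch of why the split helps with testing.  I’m assuming the accessor takes its fetch function as a constructor argument, which is my addition and not part of the example above; all of the names and the dict standing in for the database are made up.

```python
class UserAccessor(object):
    """Fetches user fields; the fetch function is injected so tests can fake it."""
    def __init__(self, fetch):
        self._fetch = fetch

    def fill_info(self):
        self.username = self._fetch("username")
        self.email_address = self._fetch("email")


class UserParser(object):
    def __init__(self, accessor):
        self.user = accessor

    def parse_data(self):
        self.user.fill_info()
        return "%s <%s>" % (self.user.username, self.user.email_address)


# A dict stands in for the database in this sketch.
fake_db = {"username": "bob", "email": "bob@example.com"}
accessor = UserAccessor(fake_db.get)
print(UserParser(accessor).parse_data())
```

Now the data access can be tested on its own, and the parser can be tested without touching a real database.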

Conclusion

Wow, this post ended up being longer than I thought!  I’m sure that there are a lot of techniques for doing this, and I’m sure that I’m missing some stuff.  So let me know them in the comments.

Update

Thanks to Przemek Owczarek for pointing out another method!

Posted in Programming, TDD | 5 Comments »

The best tool for the job: Duck typing vs Interfaces

Posted by Jason Baker on December 11, 2008

A question on StackOverflow got me thinking today:  what are the advantages of duck typing in comparison to interface-based languages (like C# or Java) and vice versa?  Notice that I’m not talking about the differences between dynamic and static typing.  These two comparisons go hand-in-hand to be sure, but they are two different things in my opinion.

The basic idea behind duck typing is that “if it walks and talks like a duck, then it’s a duck.”  Thus, if you want to perform an operation on an object, you just perform it, without first checking what type the object is.  If the object can’t handle the operation, the interpreter/runtime will let you know (and hopefully you’ve got some error handling in place).

On the other hand, interface-based languages solve this problem by having a pre-established contract between objects and code.  Thus, an object will guarantee that it will implement certain methods if the client code will guarantee not to call anything else.  If that contract is broken, then the compiler will likely let you know at compile time (though that isn’t always the case).
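
Here is a rough side-by-side sketch of the two styles, both written in Python so they can sit next to each other.  All of the names are made up, and note the big caveat: Python only checks the “interface” version at run time, whereas C# or Java would reject the bad call at compile time.

```python
from abc import ABC, abstractmethod


class Duck(object):
    def speak(self):
        return "Quack"


class Robot(object):
    def speak(self):
        return "Beep"


def duck_typed(thing):
    # No type check: anything with a speak() method works.
    return thing.speak()


class Speaker(ABC):
    @abstractmethod
    def speak(self):
        ...


class Dog(Speaker):
    def speak(self):
        return "Woof"


def interface_based(thing):
    # Explicit contract: reject objects that do not declare the interface.
    if not isinstance(thing, Speaker):
        raise TypeError("expected a Speaker")
    return thing.speak()


print(duck_typed(Duck()))
print(duck_typed(Robot()))
print(interface_based(Dog()))
```

Passing a Duck to interface_based raises a TypeError even though it has a perfectly good speak method, which is exactly the trade-off between the two approaches.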

So what are the pros/cons of these two methods?

At first glance, it would seem that the benefits of type safety would make interface-based designs more worthwhile in almost all cases.  Indeed, interfaces have helped me prevent a lot of needless errors that I frequently encounter in using dynamic languages.  They do this by forcing you to think through your class hierarchy a long way in advance.

This is where C# and Java get their enterprise-y reputations.  If you need to develop a complex hierarchy of classes, interface-based typing is the only way to go.

This is simultaneously interface-based typing’s greatest strength and greatest weakness.  Indeed, it’s also duck typing’s greatest strength and weakness.  As wrong as it feels, not every class hierarchy needs to be so well-planned.  Heck, sometimes you don’t need class hierarchies at all (there seem to be a lot of python programmers who are totally against inheritance altogether).  Of course this does require a lot more unit testing, but you’ve already signed onto the TDD bandwagon anyway, right?

This is where Python gets its famed extreme levels of productivity.  By having a good set of unit tests, a Python programmer can churn out good code at breakneck pace.

At any rate, both ways of programming have their strengths and weaknesses, and there’s not any good answer as to what way of thinking about it is the best way to go.  So go with your gut.  If your first reaction at tackling a task is to say “gee, it really would be nice to be able to write a lot of code fast without having to go through a lot of tedium” then go with Python or Ruby or most other dynamic languages.  If you find yourself saying “I need a very complex class hierarchy to do this” then chances are you should be using C# or Java.

Posted in Programming | 2 Comments »

Python distutils and you: Part I

Posted by Jason Baker on June 4, 2008

In the past few days, I’ve grown a lot more familiar with python’s packaging options than I ever hoped to become. So, I thought I would share some of the things I learned in case anyone else runs across a need to use this in their GSoC projects (or any other project for that matter).

The setup script

This is the meat of python distutils. Surprisingly, there’s really not all that much to writing the setup.py script itself. Suppose I have a file called “foo.py” that I’d like to install on someone’s computer. Here’s what a setup.py script will look like: (examples shamelessly copied from the distutils documentation)

from distutils.core import setup
setup(name='foo',
      version='1.0',
      py_modules=['foo'],
      )

That wasn’t so difficult was it? All you need now is to put foo.py in the same directory as setup.py and you’re good to go!

Packages

Now suppose that instead of a single module, we want to put a whole package into our distribution (a package being a directory containing an __init__.py). That’s where the packages keyword comes into play. Just pass a list of package names to the setup function.

So, let’s assume we have a directory ‘foo’ in the same directory as our setup.py script. This directory contains the files __init__.py, bar.py, and bar3.py. Here’s how we would do that:

from distutils.core import setup
setup(name='foo',
      version='1.0',
      packages=['foo'],
      )

Now, suppose we want to get something we can distribute. That’s pretty easy. All you have to do is type

python setup.py sdist

That will create a directory called ‘dist’. If you look inside dist, you will see foo-1.0.tar.gz (or foo-1.0.zip for you Windows users out there). Now all your user has to do is unpack the archive and, from the unpacked directory, type in

sudo python setup.py install

and foo will go into the end user’s site-packages directory.

Scripts

Now this is all well and good if all we want to distribute is a library. But I’m packaging an application. Suppose I have a python file called ‘dofoo.py’ that will import modules from foo when the end user runs it. Here’s how we would put that into the file:

from distutils.core import setup
setup(name='foo-program',
      version='1.0',
      packages=['foo'],
      scripts=['dofoo.py'],
      )

What does this do? Well first of all, the installer will automagically change the shebang line in dofoo.py to the end user’s python location if the first line of it begins with #! and contains the word python. Secondly, dofoo.py will be installed to /usr/bin unless the end user puts it elsewhere. Ideally, the key is to focus on what functionality something provides here rather than where it will go in the user’s file system since the user has ultimate control over what goes where.
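
To make the shebang rule a little more tangible, here is a simplified sketch of the check as I described it above.  This is my own illustration and not distutils’ actual code; the function names and the interpreter path are made up.

```python
def needs_shebang_rewrite(first_line):
    """Simplified version of the rule described above (not distutils' own code)."""
    return first_line.startswith("#!") and "python" in first_line


def rewrite_shebang(first_line, interpreter):
    # Only Python shebangs are replaced; anything else is left alone.
    if needs_shebang_rewrite(first_line):
        return "#!" + interpreter
    return first_line


print(rewrite_shebang("#!/usr/bin/env python", "/opt/python/bin/python"))
print(rewrite_shebang("#!/bin/sh", "/opt/python/bin/python"))
```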

Conclusion

Well, the part about writing the setup.py script ended up being longer than I had originally intended. There is still a lot more metadata that can go into setup.py. But hopefully that should be sufficient to give you a general idea of what distutils is all about. Next time I plan on talking a little bit more about the manifest and possibly some more distribution options.

Posted in Programming | Leave a Comment »