Jason in a Nutshell

All about programming and whatever else comes to mind

Archive for January, 2009

What I like about FOSS (or I don’t like your software and I intend to fix it)

Posted by Jason Baker on January 26, 2009

So I was reading one of Jeff Atwood’s latest posts (I generally try to avoid the obligatory Coding Horror link, but I feel I should give credit where credit is due) and noticed something pretty amazing while reading one of his links.  Someone’s forked Open Office (OOo).  This is a big deal.  Go-oo promises to be a “meritocracy”-driven piece of software.

I’m not going to address the issue of whether or not this is a needed step.  Indeed, there have been plenty of people who criticize Sun’s handling of OOo.  But I personally don’t use OOo very much nor am I involved in its development process.  Thus, I’d rather focus on what this means to the Open Source community and even the programming community as a whole.

Dinnerware and open source

For those of you who aren’t familiar with open source terminology, a fork is basically when a developer on an open source project says “I don’t like the way development on this software is proceeding.  I’m just going to make my own version.”  Usually, this causes some hurt feelings on all sides.  But that’s not necessarily always the case.  I’d classify forks into three categories:

The “I hate you” fork - This seems to be the most common kind of fork, and coincidentally is probably the most likely to fail.  Essentially, this happens when one of the developers is fed up with the way a piece of software is developed and feels that it must be taken in a totally different direction.  The classic example of this would probably be Emacs vs XEmacs.

The “I like you, but I want to take this a different direction” fork - This fork happens when someone wants to use existing open source software to build an entirely different piece of software.  The best example of this would be Debian vs Ubuntu.

The “I like you, but can’t you just do x better?” fork - This is a fork that happens when someone feels that a piece of software they like has one critical flaw that they feel they can fix.  These tend to be the forks that are most likely to succeed, but not in the sense that they’re likely to stick around and get lots of people to download them.  These forks are likely to be integrated back into their source at some point.  The best example of these that I can think of is Emacs vs EmacsW32.

Sometimes you have to kill software to love it

In the history of open source software, Open Office has been one of the biggest players.  I would put it right up there with Linux and Firefox.  But the more time passes, the more OOo loses that distinction.

Let’s put this into perspective.  Imagine if tomorrow, Microsoft Office were to become a stagnant piece of software with a totally fubared development process (I know, some of you will argue that this is already the case.  Just bear with me and pretend that it’s just now happening).  What would happen?  Probably not much.  It would take any competitor years to catch up with where Microsoft is today, and the only option would be to rewrite everything from the ground up.

I mean, could you imagine if one of the devs from the Office team decided to take the source code and make their own version?  That dev would be sued blind.  On the other hand, open source software presents a built-in solution:  the fork.  When a piece of software is broken and needs fixes that its own development team can’t or won’t make, something drastic has to happen or it will die.  And in the case of Open Office, it sounds like something drastic is happening.

So what happens now?

The best thing that Sun can do is pick up on the changes that go-oo makes and do their best to integrate the project.  Unless somebody at Sun completely drops the ball on this (which isn’t impossible mind you), there’s simply no way that they can fail.  OOo has brand recognition that nobody short of Microsoft Office has.  I see one of two things happening here:

  1. Sun completely ignores go-oo and in so doing, makes Open Office obsolete.
  2. Sun is proactive about picking up changes made by go-oo and in so doing makes it obsolete.

Hopefully Sun does what’s best for its software and chooses option 2.  But I’m getting the distinct impression that there’s a high chance of #1 happening.

Posted in Programming

The history of python

Posted by Jason Baker on January 13, 2009

I really detest making “link and run” type posts, but I feel that this is a fairly important one to make.  Guido just announced that he’s starting another blog about the history of Python.  You can find that blog here.  It’s all very interesting so far.

So why are you still reading my blog when you could be learning about Python from the horse’s mouth?

Posted in Blogging, Programming

Enemies of Test Driven Development part II: YAGNI

Posted by Jason Baker on January 11, 2009

Just like in this post’s predecessor, don’t take the title of this post to mean that YAGNI should be abandoned.  YAGNI is still a very good principle that has saved me from writing a lot of crap code.  But many of us are trained to apply it in such a manner that runs counter to Test Driven Development.

Let’s take the following scenario.  I have a wrapper around one of the components of my data model.  The idea is that you insert data into it and then flush it, like so:

wrapper.Insert(some_collection);
wrapper.Flush();

The idea is that as soon as this code is run, the contents of some_collection will be inserted into a database.  Let’s look at how the Flush method works at the backend:

public void Flush()
{
    using (var dc = new SomeDataContext())
    {
        Flush((ISomeDataContext)dc);
    }
}

// Public so that a test can hand in its own ISomeDataContext.
public void Flush(ISomeDataContext dc)
{
    //do stuff with dc
}

Just to clear up any doubts you may have, the latter overload of Flush is not used in the production code anywhere.  I added this in just to make the code more generic.

You are going to need it

For those of you properly trained in the arts of YAGNI, these last two sentences probably set off alarm bells.  But take a look at this code:

class MockDC : ISomeDataContext
{
    public bool MethodCalled {get; set;}
    public void MethodThatMustBeCalledByFlush () {MethodCalled = true;}
}

[TestMethod]
public void TestFlush()
{
    var wrapper = new SomeWrapper();
    wrapper.Insert(stuff);

    // Hand the mock to the overload that takes the interface type.
    var dc = new MockDC();
    wrapper.Flush((ISomeDataContext)dc);
    // The mock, not the wrapper, records whether the call happened.
    Assert.IsTrue(dc.MethodCalled);
}

Newbies to Test Driven Development (myself included) tend to find it difficult to write code that serves no purpose other than to enhance testability.  The thing is: code that makes other code more testable is needed.  And the above tests either wouldn’t have been possible or would have been hideously complicated without overloading Flush to be more generic.
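
For the Python folks, here is a rough sketch of the same seam.  Every name in it (Wrapper, FakeDataContext, context_factory, submit) is made up for illustration and is not taken from the C# above; the only point is that flush accepts an optional context so a test can pass in a fake.

```python
class FakeDataContext:
    """Hypothetical stand-in for a real data context; it only records rows."""

    def __init__(self):
        self.submitted = []

    def submit(self, rows):
        self.submitted.extend(rows)


class Wrapper:
    def __init__(self, context_factory):
        # context_factory builds the real context in production code.
        self._context_factory = context_factory
        self._pending = []

    def insert(self, rows):
        self._pending.extend(rows)

    def flush(self, context=None):
        # Production callers pass nothing; tests pass in a fake.
        if context is None:
            context = self._context_factory()
        context.submit(self._pending)
        self._pending = []


fake = FakeDataContext()
wrapper = Wrapper(context_factory=FakeDataContext)
wrapper.insert([1, 2, 3])
wrapper.flush(fake)
```

The optional argument plays the role of the second Flush overload:  nothing in production needs it, but the test can’t see what happened without it.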

So what is YAGNI good for?

Some people feel that TDD has largely subsumed YAGNI.  I can definitely see where they’re coming from.  But I think that YAGNI is still a good basic principle.

YAGNI is much like optimization in that it’s very difficult to apply at a micro-level.  Thus, if you find that you’re not adding necessary methods because of YAGNI, you’re probably using it wrong.  YAGNI is best applied at a high level.

One of my first assignments at my current job was to write a program to transfer some stuff from one database to another.  I thought I would try to go a bit above and beyond the call of duty.  See, my job has a lot of programs like these.  Just think of how much duplicated code must be out there!  Why not write some kind of framework to make this kind of stuff easier and allow code reuse?

The sky was the limit from this point on.  Before too much longer, I had a model that included the pipeline and observer design patterns, along with a component model that allowed linking components in all sorts of neat ways.  I got about halfway through it.  At a certain point, it just became too difficult to make the .Net type system work with what I was wanting to do.

I eventually just decided to code it the “regular” way without any of that cool stuff.  I finished it within a few days.  Now, a judicious use of YAGNI would have saved me a lot of time here.  So what’s the lesson?  With apologies to Albert Einstein, the lesson is to apply YAGNI as much as possible, but no more.  Once you get a sane approach going, I think you’ll find that YAGNI is a tool that will help your programming rather than hurt it.

Posted in Programming

Enemies of Test Driven Development part I: encapsulation

Posted by Jason Baker on January 8, 2009

Before you leave a nasty comment below, hear me out.  I’m not saying that we need to abandon the idea of encapsulation.  That would be stupid.  Rather, I’m saying that to be able to do test driven development properly, you need to re-think how you handle encapsulation.

(I had thought of naming this post “Enemies of Test Driven Development part I:  The Ideas You Currently Have About Encapsulation”, but that was too long)

What’s that smell?

The most difficult part of dealing with Test Driven Development is learning how to test private methods.  The answer to that is simpler than you may think:  you don’t.

Here’s a quote from Michael Feathers (one of the gurus of testing):

It seems that reverse is true also.  Classes which are hard to instantiate and use in a test harness are more coupled than they could be, and classes with private methods that you feel the urge to test, invariably have some sort of cohesion problem: they have more than one responsibility.

All I can say is that in the community of people doing test-driven development there are a number of people who have found that this question of testing private methods doesn’t come up much in their practice.  They target both testability and good design and find that both goals nurture each other.

It seems to me that Feathers is stopping just short of calling private methods code smells.  I’m going to take it there:  private methods are a code smell.  Does this mean that every private method that’s ever been written is bad?  Of course not.  There are times when private methods are a good and wholesome thing.  But if you’re using a private method, you should really consider if your design is a good one.

Solutions

I won’t dwell much on why private methods can be indicative of bad code.  Plenty has been written on that already.  Rather, I want to focus on overcoming these challenges.  So I’ve come up with a list of solutions to this problem.  Keep in mind that these solutions are tools for the toolbox.  They may not be applicable in every situation, nor are they a complete list.

With that said, here are some possible solutions:

Make it public!

What is it? This is probably the simplest way to overcome the problem of untestability.  And in my opinion, it’s the best solution for the TDD newbie.  Why?  Nine times out of ten, an untestable private method is a result of doing what you’re told.  If your university is/was like mine, you were told to make everything private unless you had good reason to make it public.  While that advice is actually true, it’s not really very useful on its own.  There’s a reason for that:  testability is a perfectly good reason to make something public.  And you should test most of your code.

How do I do it? It’s simple.  Suppose I want to test SomeMethod:


class SomeClass
{
    private void SomeMethod() {...}
}

I could do this:

class SomeClass
{
    public void SomeMethod() {...}
}

Simple, eh?

When should I use it? To decide if this is the avenue you should take, evaluate why you want to make the method private.  If this is just a case of not wanting to make it public because you want to simplify the class’s API, there’s a good chance you’re over-hiding and you should evaluate this solution.  If there’s a deeper reason why you don’t want to make it public, there are a few other solutions.

Use Conventions

What is it? Python’s gotten by on this method for a long time.  And it works pretty well.  The idea is that “we’re all adults here” (if anybody can tell me who to attribute that quote to, let me know!).  If you don’t want somebody to call a certain method, name it as such.  In Python, the convention is to prefix private methods with a single underscore.

C++, C#, and Java users will probably disagree with me here (and that’s fine), but I think that this is an important tactic to note.

How do I do it? Simple.  Suppose I want to test SomeMethod:

class SomeClass(object):
    def SomeMethod(self):
        ...

Then I would just do this:

class SomeClass(object):
    def _someMethod(self):
        ...

When should I use it? The most obvious case is if you’re in an environment where this is acceptable.  If you’re in a Python shop, chances are you’re ok with this.  If you’re in a “curly brace” shop, you may have problems doing this.  We can dispute whether or not the reasons for that are good, but that’s really not relevant.  If you’re in such an organization, you should probably try something else, if for no other reason than to avoid hearing co-workers complain.
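
To make that concrete, here’s a small sketch with a made-up Greeter class (nothing here comes from a real library).  The underscore tells callers to keep out, but the test reaches straight in anyway:

```python
class Greeter:
    def greet(self, name):
        return self._normalize(name) + ", hello!"

    def _normalize(self, name):
        # "Private" by convention only; Python does not enforce it.
        return name.strip().title()


def test_normalize():
    # The test calls the underscore-prefixed method directly.
    assert Greeter()._normalize("  jason baker ") == "Jason Baker"


test_normalize()
```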

Access Denied!

What is it? Just because you aren’t making private methods doesn’t mean you can’t disallow their use.  This is where interfaces and abstract base classes come into play (for the sake of succinctness, I’ll use the word “interface” to refer to both of these unless otherwise noted for the rest of the post).  Don’t want to allow client code to access a certain method?  Don’t put it in the interface!  Granted, this isn’t a perfect way to prevent client code from calling a method.  But then again, neither is making the method private (even in C++, although it isn’t easy to break there).

Be careful here, though:  if you find yourself creating too many interfaces to allow for giving different classes access to different areas, you’re probably creating a God Object.

How do I do it? Suppose I have a class SomeClass and I want to expose everything but SomeMethod:

class SomeClass
{
    public void SomeMethod() {...}
    public void SomeOtherMethod() {...}
    public void SomeOtherOtherMethod() {...}
}

I could just create an interface ISomeClass, have SomeClass implement it, and hand client code only the interface:

// Interface members are implicitly public and have no bodies.
interface ISomeClass
{
    void SomeOtherMethod();
    void SomeOtherOtherMethod();
}

When should I use it? If you’re working in a C++/C#/Java shop, this is a good alternative to the “conventions” method noted above.
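
The same trick can be sketched in Python with an abstract base class from the standard abc module.  ReportView, Report, parse_lines, and render are all hypothetical names of my own; the idea is that client code is written against ReportView, while tests can still poke at parse_lines on the concrete class.

```python
from abc import ABC, abstractmethod


class ReportView(ABC):
    """Hypothetical interface: the only methods client code should rely on."""

    @abstractmethod
    def title(self):
        ...

    @abstractmethod
    def body(self):
        ...


class Report(ReportView):
    def __init__(self, raw):
        self.raw = raw

    def title(self):
        return self.parse_lines()[0]

    def body(self):
        return "\n".join(self.parse_lines()[1:])

    def parse_lines(self):
        # Public so tests can reach it, but deliberately absent from ReportView.
        return [line.strip() for line in self.raw.splitlines() if line.strip()]


def render(view: ReportView) -> str:
    # Client code only uses what the interface promises.
    return view.title().upper() + "\n" + view.body()
```

As with the C# version, this is not airtight.  Nothing stops a determined caller from using parse_lines, it just isn’t part of the contract they were handed.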

Using Inheritance

What is it? If you’re willing to make a method protected, you can test it by inheriting from the class and exposing a public method that calls the method.

How do I do it? Suppose you want to test SomeMethod:

class SomeClass
{
    private void SomeMethod() {...}
}

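You can make SomeMethod protected and then do this:

// Only compiles once SomeMethod is protected rather than private.
class SomeDerivedClass : SomeClass
{
    public void SomeMethod2() {SomeMethod();}
}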

When should I use it? If you’re working with legacy code, this may be your only option.  Typically this works best when you want to test a class that you don’t have the luxury of being able to change.  Keep in mind that you’re adding another layer between your code and your tests, though.  This is a bigger deal in some languages than it is in others, but in general it’s to be avoided if at all possible.
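
Here’s roughly what that looks like in Python, where “protected” is only the underscore convention.  PriceCalculator and its methods are invented for this sketch; imagine it’s a legacy class I’m not allowed to edit.

```python
class PriceCalculator:
    """Stand-in for a legacy class we can't change (hypothetical)."""

    def total(self, prices_in_cents):
        subtotal = sum(prices_in_cents)
        return subtotal + subtotal * self._tax_percent() // 100

    def _tax_percent(self):
        return 8


class TestablePriceCalculator(PriceCalculator):
    """Lives with the tests; exposes the protected method through a public one."""

    def tax_percent_for_test(self):
        return self._tax_percent()


calc = TestablePriceCalculator()
```

The tests then talk to TestablePriceCalculator, which is exactly the extra layer between code and tests that I warned about above.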

Make Another Class

What is it? Ok, so you’ve reviewed the above methods, and you’re still just not comfortable with making your methods public.  This is usually an indicator that your class is doing too much.  Not only does this make testing difficult, it results in more tightly coupled code that will turn into a maintenance nightmare.  The idea then, is to separate the extra functionality into a separate class.

How do I do it? Suppose I have a class to access a database that looks something like this:

class UserParser(object):
    def _fillInfo(self):
        self.Username = getUnameFromDB()
        self.EmailAddress = getEmailFromDB()
        self.Name = getNameFromDB()
    def ParseData(self):
        self._fillInfo()
        do_stuff_with_filled_data()

I can do this:

class UserAccessor(object):
    def FillInfo(self):
        self.Username = getUnameFromDB()
        self.EmailAddress = getEmailFromDB()
        self.Name = getNameFromDB()

class UserParser(object):
    def __init__(self):
        self.user = UserAccessor()
    def ParseData(self):
        self.user.FillInfo()
        do_stuff_with_filled_data(self.user)

When should I use it? There are a couple of situations when you would use this:

  1. When you can’t or won’t use any of the other methods.
  2. When you have a significant number of methods you need to test but don’t want to make public.
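
Once the lookup lives in its own class, a test can swap it out.  Below is a runnable sketch of the split under some assumptions of mine:  the fetch callable stands in for the getXFromDB helpers, the parser takes its accessor as a constructor argument, and the names are snake_case rather than the ones used above.

```python
class UserAccessor:
    """Owns the lookups; fetch is a hypothetical callable standing in for the DB."""

    def __init__(self, fetch):
        self._fetch = fetch

    def fill_info(self):
        self.username = self._fetch("username")
        self.email_address = self._fetch("email")
        self.name = self._fetch("name")


class UserParser:
    def __init__(self, user):
        # The accessor is passed in so a test can supply one backed by a dict.
        self.user = user

    def parse_data(self):
        self.user.fill_info()
        return "{} <{}>".format(self.user.name, self.user.email_address)


# In a test, a plain dict plays the part of the database.
fake_row = {"username": "jbaker", "email": "jason@example.com", "name": "Jason"}
accessor = UserAccessor(fake_row.__getitem__)
parser = UserParser(accessor)
```

Now fill_info is a public method on a small class that can be tested by itself, and UserParser can be tested without a database in sight.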

Conclusion

Wow, this post ended up being longer than I thought!  I’m sure that there are a lot of techniques for doing this, and I’m sure that I’m missing some stuff.  So let me know them in the comments.

Update

Thanks to Przemek Owczarek for pointing out another method!

Posted in Programming, TDD

Bold in blog posts NOT considered harmful

Posted by Jason Baker on January 1, 2009

Jimmy Bogard makes a post about the 10 things to retire in 2009.  Of course, within that list is the obligatory “scold bloggers for using bold” item:

One of my bigger pet peeves is anything in bold.  It’s a cheap trick to grab attention, yet it always works.  Yes, Atwood does it all the time and although it grabs attention, it’s the blogger equivalent of that dork that “quotes” “all” “his” “words”.  The last thing we need is another cookie-cutter Atwood knockoff.

Before I make any arguments, I’d like to let Jeff’s blog speak for itself (this post):

But not all user interface conventions are created equal. Some are timeless. Some are there by default, because nobody bothered to sufficiently question them. Some grow old and outlive their usefulness. How do we discriminate between conventions that actually help us and those that are merely.. expected?

The answer, of course, is to try multiple approaches and collect usage data to determine what works and what doesn’t. This is (relatively) easy for web apps, which is why Amazon, Yahoo and Google are all notorious for doing it. They’ll serve up experimental features to a tiny fraction of the user base, collect data on how those features are used, then feed that back into their decision making process.

If we built UI with an iron-clad guarantee that we would “do it like everyone else”, would we have ever experienced the ultra simple Mom-friendly Tivo UI? Or Windows Media Center’s amazing, utterly un-Windows-like ten foot UI? Would Office 12 be using the innovative new ribbon instead of traditional toolbars and menus? Heck, would we have ever made the transition from character mode to GUIs?

Did you read the quote?  Be honest.  Of course you didn’t.  It was a block of text totally unrelated to the subject of this article.  But do you know what it was about?  You probably have an idea.  It’s about figuring out what conventions we use for the sake of having conventions and the ones that are actually useful.  The rest of it is just supporting info.

To be fair, I don’t think that this is an issue that only Jimmy faces as I’ve seen this gripe before.  But I feel that a lot of people misunderstand what Jeff is trying to do (and maybe Jeff is doing something wrong that I’m missing to encourage that misunderstanding).  The thing is, if you’re just adding bold as a cheap trick to get attention, you’re doing it wrong.

In my first public speech class, I was taught to outline my speech something like this:

Today I plan to talk to you about x, y, and z.  Let’s talk about x first.

Now that I’ve talked about x, let’s move on to y.

Now that I’ve talked about y, let’s talk about z.

Ok, we’ve talked about z, so let’s wrap it up.

When you’re talking, all those transitions can get repetitive.  In fact, that’s the point.  No matter how effective a speaker you are, your listeners are going to space out or daydream or otherwise not pay attention to you at various places during your speech.  And if someone zones out, you want to make sure they won’t be lost when they start paying attention again.

The same is true about blogging.  No matter how good a writer you are, people aren’t going to read your whole damn post.  Thus, you should assume that at any given point your reader may go into “skim mode.”  And when a reader does go into skim mode, you want to make sure that they don’t miss something important.  So what, pray tell, is a blogger to do?  Well, there are multiple approaches.  But it seems that some of the most successful bloggers will choose to go the bold route fairly often.

And for the record, yes I did add bold phrases to the article for the sheer irony of it.  That’s how I roll.

Posted in Blogging

 