Jason in a Nutshell

All about programming and whatever else comes to mind

Posts Tagged ‘C#’

Java vs C# vs Python vs Ruby: an “objective” analysis

Posted by Jason Baker on April 21, 2009

At my place of employment, we’re looking to migrate some of our old Classic ASP applications to something newer (yes, we still have actively maintained Classic ASP code). So my boss asked me to write up an analysis of the different options we have available.

Now before I give you the link, I have a few disclaimers and other random thoughts:

  • I tried to avoid editorializing so that this can be as objective as possible, but it’s impossible to discuss these kinds of issues without being subjective.  Thus, don’t take this as the gospel truth.
  • There may be errors here.  In fact, I can almost guarantee that there are errors with Java and Ruby because I’m not terribly familiar with them.  If there are errors, feel free to leave comments.
  • My boss is a Java guy, so I left some blanks that I’m pretty sure he can fill in.
  • Some of these are blatant oversimplifications.  There’s only so much data that you can squeeze into a spreadsheet.
  • I’ll try and keep up with this for a while, but chances are that I won’t for long.  These languages are all being changed.
  • I’m biased towards Python.

Ok, without further ado, here’s the link:

http://spreadsheets.google.com/pub?key=p7efJLoHuYE-iw6JxBmpSQg&hl=en


Posted in Programming | 12 Comments »

Enemies of Test Driven Development part II: YAGNI

Posted by Jason Baker on January 11, 2009

Just like in this post’s predecessor, don’t take the title of this post to mean that YAGNI should be abandoned.  YAGNI is still a very good principle that has saved me from writing a lot of crap code.  But many of us are trained to apply it in such a manner that runs counter to Test Driven Development.

Let’s take the following scenario.  I have a wrapper around one of the components of my data model.  The idea is that you insert data into it and then flush it, like so:

wrapper.Insert(some_collection);
wrapper.Flush();

The idea is that as soon as this code is run, the contents of some_collection will be inserted into a database.  Let’s look at how the Flush method works at the backend:

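// Production entry point: creates the real data context and delegates.
public void Flush()
{
    using (var dc = new SomeDataContext())
    {
        Flush((ISomeDataContext)dc);
    }
}

// Takes any ISomeDataContext; public so that a test can pass in a mock.
public void Flush(ISomeDataContext dc)
{
    //do stuff with dc
}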

Just to clear up any doubts you may have, the latter overload of Flush is not used in the production code anywhere.  I added this in just to make the code more generic.

You are going to need it

For those of you properly trained in the arts of YAGNI, these last two sentences probably set off alarm bells.  But take a look at this code:

class MockDC : ISomeDataContext
{
    public bool MethodCalled {get; set;}
    public void MethodThatMustBeCalledByFlush () {MethodCalled = true;}
}

[TestMethod]
public void TestFlush()
{
    var wrapper = new SomeWrapper();
    wrapper.Insert(stuff);

    var dc = new MockDC();
    wrapper.Flush((ISomeDataContext)dc);
    // The mock, not the wrapper, records whether the method was called.
    Assert.IsTrue(dc.MethodCalled);
}

Newbies to Test Driven Development (myself included) tend to find it difficult to write code that serves no purpose other than to enhance testability.  The thing is: code that makes other code more testable is needed.  And the above tests either wouldn’t have been possible or would have been hideously complicated without overloading Flush to be more generic.
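
The same trick carries over to other languages.  Here is a rough Python sketch of it; the names Wrapper, RealDataContext, and MockDataContext are made up for illustration and are not the production code.  The no-argument call builds the real dependency, while a test hands in a stand-in:

```python
class RealDataContext:
    """Hypothetical stand-in for the production data context."""

    def submit(self, rows):
        raise RuntimeError("would talk to a real database")


class MockDataContext:
    """Test double that records what flush() handed to it."""

    def __init__(self):
        self.submitted = []

    def submit(self, rows):
        self.submitted.extend(rows)


class Wrapper:
    def __init__(self):
        self._pending = []

    def insert(self, rows):
        self._pending.extend(rows)

    def flush(self, dc=None):
        # Production callers pass nothing; tests inject their own context.
        if dc is None:
            dc = RealDataContext()
        dc.submit(self._pending)
        self._pending = []


def run_flush_with_mock():
    wrapper = Wrapper()
    wrapper.insert(["a", "b"])
    mock = MockDataContext()
    wrapper.flush(mock)
    return mock.submitted
```

The optional dc parameter plays the role of the second Flush overload: it exists only so a test can get at the code, and that is reason enough for it to exist.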

So what is YAGNI good for?

Some people feel that TDD has largely subsumed YAGNI.  I can definitely see where they’re coming from.  But I think that YAGNI is still a good basic principle.

YAGNI is much like optimization in that it’s very difficult to apply at a micro-level.  Thus, if you find that you’re not adding necessary methods because of YAGNI, you’re probably using it wrong.  YAGNI is best applied at a high level.

One of my first assignments at my current job was to write a program to transfer some stuff from one database to another.  I thought I would try to go a bit above and beyond the call of duty.  See, my job has a lot of programs like these.  Just think of how much duplicated code must be out there!  Why not write some kind of framework to make this kind of stuff easier and allow code reuse?

The sky was the limit from this point on.  Before too much longer, I had a model that included the pipeline and observer design patterns, along with a component model that allowed linking components in all sorts of neat ways.  I got about halfway through it.  At a certain point, it just became too difficult to make the .NET type system work with what I was wanting to do.

I eventually just decided to code it the “regular” way without any of that cool stuff.  I finished it within a few days.  Now, a judicious use of YAGNI would have saved me a lot of time here.  So what’s the lesson?  With apologies to Albert Einstein, the lesson is to apply YAGNI as much as possible, but no more.  Once you get a sane approach going, I think you’ll find that YAGNI is a tool that will help your programming rather than hurt it.


Posted in Programming | 1 Comment »

Enemies of Test Driven Development part I: encapsulation

Posted by Jason Baker on January 8, 2009

Before you leave a nasty comment below, hear me out.  I’m not saying that we need to abandon the idea of encapsulation.  That would be stupid.  Rather, I’m saying that to be able to do test driven development properly, you need to re-think how you handle encapsulation.

(I had thought of naming this post “Enemies of Test Driven Development part I:  The Ideas You Currently Have About Encapsulation”, but that was too long)

What’s that smell?

The most difficult part of dealing with Test Driven Development is learning how to test private methods.  The answer to that is simpler than you may think:  you don’t.

Here’s a quote from Michael Feathers (one of the gurus of testing):

It seems that reverse is true also.  Classes which are hard to instantiate and use in a test harness are more coupled than they could be, and classes with private methods that you feel the urge to test, invariably have some sort of cohesion problem: they have more than one responsibility.

All I can say is that in the community of people doing test-driven development there are a number of people who have found that this question of testing private methods doesn’t come up much in their practice.  They target both testability and good design and find that both goals nurture each other.

It seems to me that Feathers is stopping just short of calling private methods code smells.  I’m going to take it there:  private methods are a code smell.  Does this mean that every private method that’s ever been written is bad?  Of course not.  There are times when private methods are a good and wholesome thing.  But if you’re using a private method, you should really consider if your design is a good one.

Solutions

I won’t dwell much on why private methods can be indicative of bad code.  Plenty has been written on that already.  Rather, I want to focus on overcoming these challenges.  So I’ve come up with a list of solutions to this problem.  Keep in mind that these solutions are tools for the toolbox.  They may not be applicable in every situation, nor are they a complete list.

With that said, here are some possible solutions:

Make it public!

What is it? This is probably the simplest way to overcome the problem of untestability.  And in my opinion, it’s the best solution for the TDD newbie.  Why?  Nine times out of ten, it’s a result of doing what you’re told.  If your university is/was like mine, you were told to make everything private unless you had good reason to make it public.  While that is actually true, it’s not really very useful.  There’s a reason for that:  testability is a perfectly good reason to make something public.  And you should test most of your code.

How do I do it? It’s simple.  Suppose I want to test SomeMethod:


class SomeClass
{
    private void SomeMethod() {...}
}

I could do this:

class SomeClass
{
    public void SomeMethod() {...}
}

Simple, eh?

When should I use it? To decide if this is the avenue you should take, evaluate why you want to make the method private.  If this is just a case of not wanting to make it public because you want to simplify the class’s API, there’s a good chance you’re over-hiding and you should evaluate this solution.  If there’s a deeper reason why you don’t want to make it public, there are a few other solutions.

Use Conventions

What is it? Python’s gotten by on this method for a long time.  And it works pretty well.  The idea is that “we’re all adults here” (if anybody can tell me who to attribute that quote to, let me know!).  If you don’t want somebody to call a certain method, name it as such.  In Python, the convention is to prefix private methods with a single underscore.

C++, C#, and Java users will probably disagree with me here (and that’s fine), but I think that this is an important tactic to note.

How do I do it? Simple.  Suppose I want to test SomeMethod:

class SomeClass(object):
    def SomeMethod(self):
        ...

Then I would just do this:

class SomeClass(object):
    def _SomeMethod(self):
        ...

When should I use it? The most obvious case is if you’re in an environment where this is acceptable.  If you’re in a Python shop, chances are you’re ok with this.  If you’re in a “curly brace” shop, you may have problems doing this.  We can dispute whether or not the reasons for that are good, but that’s really not relevant.  If you’re in such an organization, you should probably try something else, if for no other reason than to not hear co-workers complain.
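
To make the convention concrete, here is a small sketch (Greeter and its methods are invented for this example).  The underscore tells callers to keep out, but nothing stops a test from walking right in:

```python
class Greeter:
    def greet(self, name):
        return "Hello, " + self._normalize(name) + "!"

    def _normalize(self, name):
        # Leading underscore: "internal, call at your own risk".
        return name.strip().title()


def check_normalize():
    # A test can call the underscore-prefixed method directly.
    return Greeter()._normalize("  jason baker ")
```

No access modifiers had to change to get the helper under test, which is the whole appeal of this approach.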

Access Denied!

What is it? Just because you aren’t making private methods doesn’t mean you can’t disallow their use.  This is where interfaces and abstract base classes come into play (for the sake of succinctness, I’ll use the word “interface” to refer to both of these unless otherwise noted for the rest of the post).  Don’t want to allow client code to access a certain method?  Don’t put it in the interface!  Granted, this isn’t a perfect way to prevent client code from calling a method.  But then again, neither is making the method private (even in C++, although it isn’t easy to break there).

Be careful here, though.  If you find yourself creating too many interfaces to allow for giving different classes access to different areas, you’re probably creating a God Object.

How do I do it? Suppose I have a class SomeClass and I want to expose everything but SomeMethod:

class SomeClass
{
    public void SomeMethod() {...}
    public void SomeOtherMethod() {...}
    public void SomeOtherOtherMethod() {...}
}

I could just create an interface ISomeClass:

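// Interface members take no access modifier and no body.
// SomeClass would then be declared as "class SomeClass : ISomeClass".
interface ISomeClass
{
    void SomeOtherMethod();
    void SomeOtherOtherMethod();
}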

When should I use it? If you’re working in a C++/C#/Java shop, this is a good alternative to the “conventions” method noted above.
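
For anyone following along in Python, a rough analog uses an abstract base class from the standard abc module.  This is only a sketch with made-up names (ReportApi, Report, make_report), and note the caveat: Python itself won’t stop anyone from calling the extra method.  Only a static checker reading the annotation would complain.

```python
from abc import ABC, abstractmethod


class ReportApi(ABC):
    """What client code is meant to see."""

    @abstractmethod
    def render(self):
        ...


class Report(ReportApi):
    def __init__(self, rows):
        self.rows = rows

    def render(self):
        return ", ".join(self.sorted_rows())

    def sorted_rows(self):
        # Public on the class so tests can reach it, but absent from ReportApi.
        return sorted(self.rows)


def make_report(rows) -> ReportApi:
    # Clients receive the narrow type; a static checker such as mypy would
    # flag sorted_rows() called through it. The interpreter would not.
    return Report(rows)
```

Tests construct Report directly and poke at sorted_rows, while everything else goes through make_report and only knows about render.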

Using Inheritance

What is it? If you’re willing to make a method protected, you can test it by inheriting from the class and exposing a public method that calls the method.

How do I do it? Suppose you want to test SomeMethod:

class SomeClass
{
    private void SomeMethod() {...}
}

You can then do this:

// Assumes SomeMethod has been changed from private to protected in SomeClass.
class SomeDerivedClass : SomeClass
{
    public void SomeMethod2() {SomeMethod();}
}

When should I use it? If you’re working with legacy code, this may be your only option.  Typically this works best when you want to test a class that you don’t have the luxury of being able to change.  Keep in mind that you’re adding another layer between your code and your tests, though.  This is a bigger deal in some languages than it is in others, but in general it’s to be avoided if at all possible.

Make Another Class

What is it? Ok, so you’ve reviewed the above methods, and you’re still just not comfortable with making them public.  This is usually an indicator that your class is doing too much.  Not only does this make testing difficult, it results in more tightly coupled code that will turn into a maintenance nightmare.  The idea, then, is to separate the extra functionality into a separate class.

How do I do it? Suppose I have a class to access a database that looks something like this:

class UserParser(object):
    def _fillInfo(self):
        self.Username = getUnameFromDB()
        self.EmailAddress = getEmailFromDB()
        self.Name = getNameFromDB()
    def ParseData(self):
        self._fillInfo()
        do_stuff_with_filled_data()

I can do this:

class UserAccessor(object):
    def FillInfo(self):
        self.Username = getUnameFromDB()
        self.EmailAddress = getEmailFromDB()
        self.Name = getNameFromDB()

class UserParser(object):
    def __init__(self):
        self.user = UserAccessor()
    def ParseData(self):
        self.user.FillInfo()
        do_stuff_with_filled_data(self.user)

When should I use it? There are a couple of situations when you would use this:

  1. When you can’t or won’t use any of the other methods.
  2. When you have a significant number of methods you need to test but don’t want to make public.
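
Once the database access lives in its own class, testing gets easier on both sides of the split.  Here’s a sketch of how that might look; the fetch callable, the field names, and FAKE_ROW are my own assumptions standing in for the getXFromDB functions above:

```python
class UserAccessor:
    def __init__(self, fetch):
        # fetch is any callable mapping a field name to its value.
        self._fetch = fetch

    def fill_info(self):
        self.username = self._fetch("username")
        self.email_address = self._fetch("email")
        self.name = self._fetch("name")


class UserParser:
    def __init__(self, user):
        self.user = user

    def parse_data(self):
        self.user.fill_info()
        return "{} <{}>".format(self.user.name, self.user.email_address)


FAKE_ROW = {"username": "jbaker", "email": "jason@example.com", "name": "Jason"}


def parse_with_fake_row():
    # A dict lookup stands in for the database during the test.
    accessor = UserAccessor(FAKE_ROW.get)
    return UserParser(accessor).parse_data()
```

The accessor can be tested against a fake data source, and the parser can be tested against a fake accessor, without either test touching a real database.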

Conclusion

Wow, this post ended up being longer than I thought!  I’m sure that there are a lot of techniques for doing this, and I’m sure that I’m missing some stuff.  So let me know them in the comments.

Update

Thanks to Przemek Owczarek for pointing out another method!


Posted in Programming, TDD | 5 Comments »

ETL in Subsonic

Posted by Jason Baker on December 12, 2008

Since my university has migrated to a new registration and enrollment system (based on Oracle), the department I work for has been scrambling to make all of our systems work with it.  So here’s the task:  pull data from the database, and put it in one of our legacy applications’ database (SQL Server).  Sounds simple right?  Well, actually it is.  But it may not be apparent at first glance how a person would go about doing this.

For inserting records into SQL Server, we’re using LINQ to SQL.  It’s just so easy to set up that there’s no reason not to.  Unfortunately, that isn’t the case for dealing with Banner (the new Oracle-based system), since LINQ to SQL doesn’t natively support Oracle.  There are plugins for this, but I have yet to encounter any that are of high enough quality to be used in a production system.  So my choice for this is SubSonic.

It’s easy to see why at first glance some people feel they have to “drag it around.”  But you can actually do some pretty interesting things with SubSonic if you spend some time and go through their (sub-par IMO) documentation.

Here’s the basic situation.  We need to be able to pull every newly admitted student into our system.  This is broken up into two tables, a person table and an admissions application table.  These can be represented in C# (using SubSonic) somewhat like this:

class Person
{
    public string FirstName {get; set;}
    public string LastName {get; set;}
    public string Gender {get; set;}
    public int PersonId {get; set;}
} 
class AdmissionApplication
{
    public string AdmitCode {get; set;}
    public string AdmitPeriod {get; set;}
    public string SiteName {get; set;}
    public int PersonId {get; set;}
}

I’ll spare you the business logic behind this, but essentially we’re joining these two tables together by PersonId and determining if the student was admitted since the last time we checked.  The data that we get will get put into a profile record in our SQL Server database.  This data will look something like this:

class Profile
{
    public string fname {get; set;}
    public string lname {get; set;}
    public char? gender {get; set;}
    public int pidm {get; set;}
}

There are a couple things I’d like to draw your attention to.  First of all, the data from the two tables is mainly the same, but the names are just different enough to make a difference.  Secondly, in our Oracle database, the gender is stored as a varchar (for some strange reason: it’s only one character).  In SQL Server, it’s stored as a single (nullable) character.  The first problem is easy enough to solve, but it takes a little bit of digging around on the SubSonic website to find it (I found the answer here: http://subsonicproject.com/2-1-pakala/subsonic-version-2-1-pakala-preview-the-new-query-tool/).

Here’s what the query looks like:

var sel = new Select(
    "ADMISSION_APPLICATION.PERSON_ID as 'pidm'",
    "FIRST_NAME as 'fname'",
    "LAST_NAME as 'lname'",
    "GENDER as 'gender'"
);
var results = sel
  .From<Person>()
  .InnerJoin(AdmissionApplication.Schema.Name, AdmissionApplication.Columns.PersonId,
             Person.Schema.Name, Person.Columns.PersonId)
  .ExecuteTypedList<Profile>();

(yes, I have been accused of abusing the var keyword)

This will automatically translate all of the column names from the result set into column names in Profile so that we have a List of Profiles.  But wait!  Don’t compile just yet.  We still have to tackle the issue of how to translate the string into that pesky nullable char.  That’s fairly simple to do.  We just have to add something to our Profile class:

class Profile
{
    public string fname {get; set;}
    public string lname {get; set;}
    public char? gender {get; set;}
    public int pidm {get; set;}
    public string StrGender
    {
        set
        {
            if (value == null || value.Length < 1){
                gender = null;
            }
            else{
                gender = value[0];
            }
        }
        get
        {
            return gender.ToString();
        }
    }
}

Then we just have to change that first part of the query above to this:

var sel = new Select(
    "ADMISSION_APPLICATION.PERSON_ID as 'pidm'",
    "FIRST_NAME as 'fname'",
    "LAST_NAME as 'lname'",
    "GENDER as 'StrGender'"
);

And we’re done!  Obviously, there are a few places where this code is fairly naive, but this is just an example.
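
If it helps to see the moving parts without SubSonic in the way, here is a plain-Python sketch of the same three steps: join on the person id, rename the columns, and squeeze the gender string into a single nullable character.  The dict-based rows and the join_profiles helper are assumptions made for the example, and it simplifies the join to one profile per matching person.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Profile:
    fname: str
    lname: str
    pidm: int
    gender: Optional[str] = None

    @property
    def str_gender(self):
        return self.gender or ""

    @str_gender.setter
    def str_gender(self, value):
        # Empty or missing strings become None; otherwise keep the first character.
        self.gender = value[0] if value else None


def join_profiles(people, applications):
    """Match people to applications on PERSON_ID and rename the columns."""
    admitted_ids = {app["PERSON_ID"] for app in applications}
    profiles = []
    for person in people:
        if person["PERSON_ID"] not in admitted_ids:
            continue
        profile = Profile(person["FIRST_NAME"], person["LAST_NAME"], person["PERSON_ID"])
        profile.str_gender = person.get("GENDER")
        profiles.append(profile)
    return profiles
```

The str_gender property does the same job as StrGender in the C# class: the loader writes a string, and the stored value ends up as one character or nothing.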

Posted in Programming | Leave a Comment »

The best tool for the job: Duck typing vs Interfaces

Posted by Jason Baker on December 11, 2008

A question on StackOverflow got me thinking today:  what are the advantages of duck typing in comparison to interface-based languages (like C# or Java) and vice versa?  Notice that I’m not talking about the differences between dynamic and static typing.  These two comparisons go hand-in-hand to be sure, but they are two different things in my opinion.

The basic idea behind duck typing is that “if it walks and talks like a duck, then it’s a duck.”  Thus, if you want to perform an operation on an object, you just perform it without first checking what type the object is.  If the object doesn’t support the operation, the interpreter/runtime will let you know (and hopefully you’ve got some error handling in place).

On the other hand, interface-based languages solve this problem by having a pre-established contract between objects and code.  Thus, an object will guarantee that it will implement certain methods if the client code will guarantee not to call anything else.  If that contract is broken, then the compiler will likely let you know at compile time (though that isn’t always the case).
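
Here’s a small Python sketch of both styles side by side.  All of the class names are invented for the example, and the “interface” half uses typing.Protocol (Python 3.8+) as a stand-in for a C# or Java interface, which only a static checker enforces:

```python
from typing import Protocol


class Duck:
    def quack(self):
        return "Quack"


class Robot:
    def quack(self):
        return "Beep (quack)"


class Rock:
    pass


def make_noise(thing):
    # Duck typing: no declared type, just call the method and see.
    return thing.quack()


class Quacker(Protocol):
    # An interface-like contract that a static checker can verify up front.
    def quack(self) -> str:
        ...


def make_noise_checked(thing: Quacker) -> str:
    return thing.quack()


def try_rock():
    try:
        return make_noise(Rock())
    except AttributeError:
        # The failure only shows up when the call actually runs.
        return "not a duck"
```

Duck and Robot share no base class, yet both work with make_noise.  Rock fails only when the call actually happens, which is exactly the kind of error a declared contract is meant to catch earlier.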

So what are the pros/cons of these two methods?

At first glance, it would seem that the benefits of type safety would make interface-based designs more worthwhile in almost all cases.  Indeed, interfaces have helped me prevent a lot of needless errors that I frequently encounter in using dynamic languages.  They do this by forcing you to think through your class hierarchy a long way in advance.

This is where C# and Java get their enterprise-y reputations.  If you need to develop a complex hierarchy of classes, interface-based typing is the only way to go.

This is simultaneously interface-based typing’s greatest strength and greatest weakness.  Indeed, it’s also duck typing’s greatest strength and weakness.  As wrong as it feels, not every class hierarchy needs to be so well-planned.  Heck, sometimes you don’t need class hierarchies at all (there seem to be a lot of Python programmers who are totally against inheritance altogether).  Of course, going without that up-front planning does require a lot more unit testing, but you’ve already signed onto the TDD bandwagon anyway, right?

This is where Python gets its famed extreme levels of productivity.  By having a good set of unit tests, a Python programmer can churn out good code at breakneck pace.

At any rate, both ways of programming have their strengths and weaknesses, and there’s not any good answer as to what way of thinking about it is the best way to go.  So go with your gut.  If your first reaction at tackling a task is to say “gee, it really would be nice to be able to write a lot of code fast without having to go through a lot of tedium” then go with Python or Ruby or most other dynamic languages.  If you find yourself saying “I need a very complex class hierarchy to do this” then chances are you should be using C# or Java.

Posted in Programming | 2 Comments »