Jason in a Nutshell

All about programming and whatever else comes to mind

ETL in Subsonic

Posted by Jason Baker on December 12, 2008

Since my university has migrated to a new registration and enrollment system (based on Oracle), the department I work for has been scrambling to make all of our systems work with it.  So here’s the task:  pull data from the database, and put it in one of our legacy applications’ database (SQL Server).  Sounds simple right?  Well, actually it is.  But it may not be apparent at first glance how a person would go about doing this.

For inserting records into SQL Server, we’re using LINQ to SQL.  It’s just so easy to set up that there’s no reason not to.  Unfortunately, that isn’t the case for dealing with Banner since LINQ to SQL doesn’t natively support Oracle.  There are plugins for this, but I have yet to encounter any that are of high enough quality to be used in a production system.  So my choice for this is SubSonic.

It’s easy to see why at first glance some people feel they have to “drag it around.”  But you can actually do some pretty interesting things with SubSonic if you spend some time and go through their (sub-par IMO) documentation.

Here’s the basic situation.  We need to be able to pull every newly admitted student into our system.  This is broken up into two tables, a person table and an admissions application table.  These can be represented in C# (using SubSonic) somewhat like this:

class Person
{
    public string FirstName {get; set;}
    public string LastName {get; set;}
    public string Gender {get; set;}
    public int PersonId {get; set;}
} 
class AdmissionApplication
{
    public string AdmitCode {get; set;}
    public string AdmitPeriod {get; set;}
    public string SiteName {get; set;}
    public int PersonId {get; set;}
}

I’ll spare you the business logic behind this, but essentially we’re joining these two tables together by PersonId and determining if the student was admitted since the last time we checked.  The data that we get will get put into a profile record in our SQL Server database.  This data will look something like this:

class Profile
{
    public string fname {get; set;}
    public string lname {get; set;}
    public char? gender {get; set;}
    public int pidm {get; set;}
}

There are a couple things I’d like to draw your attention to.  First of all, the data from the two tables is mainly the same, but the names are just different enough to make a difference.  Secondly, in our Oracle database, the gender is stored as a varchar (for some strange reason: it’s only one character).  In SQL Server, it’s stored as a single (nullable) character.  The first problem is easy enough to solve, but it takes a little bit of digging around on the SubSonic website to find it (I found the answer here: http://subsonicproject.com/2-1-pakala/subsonic-version-2-1-pakala-preview-the-new-query-tool/).

Here’s what the query looks like:

var sel = new Select(
    "ADMISSION_APPLICATION.PERSON_ID as 'pidm'",
    "FIRST_NAME as 'fname'",
    "LAST_NAME as 'lname'",
    "GENDER as 'gender'",
); 
var results = sel
  .From<Person>()
  .InnerJoin(AdmissionApplication.Schema.Name, AdmissionsApplication.Columns.PersonId,
             Person.Name, Person.PersonId);
  .ExecuteTypedList<Profile>()

(yes, I have been accused of abusing the var keyword)

This will automatically translate all of the column names from the result set into column names in Profile so that we have a List of Profiles.  But wait!  Don’t compile just yet.  We still have to tackle the issue of how to translate the string into that pesky nullable char.  That’s fairly simple to do.  We just have to add something to our Profile class:

class Profile
{
    public string fname {get; set;}
    public string lname {get; set;}
    public char? gender {get; set;}
    public int pidm {get; set;}
    public string StrGender
    {
        set
        {
            if (value == null || value.Length < 1){
                gender = null;
            }
            else{
                gender = value[0];
            }
        }
        get
        {
            return gender.ToString();
        }
    }
}

Then we just have to change that first part of the query above to this:

var sel = new Select(
    "ADMISSION_APPLICATION.PERSON_ID as 'pidm'",
    "FIRST_NAME as 'fname'",
    "LAST_NAME as 'lname'",
    "GENDER as 'StrGender'",
);

And we’re done!  Obviously, there are a few places where this code is fairly naive, but this is just an example.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: