Over the last couple of months a concept in object-oriented programming has been bothering me: the notion of ‘programming to an interface, not an implementation’. This is a principle I first learned about when reading “Design Patterns: Elements of Reusable Object-Oriented Software” and then further solidified when I read “Head First Design Patterns”. So when I say “programming to an interface, not an implementation” what exactly do I mean? Bill Venners interviewed Erich Gamma (interview here), one of the “Gang of Four” who co-authored “Design Patterns”, for Artima Developer and asked him the same question. The paraphrased answer is that it helps to limit the introduction of dependencies in your code. Limiting dependencies makes your code easier to test, easier to deploy and easier to change.
What is the problem?
You may now be asking, “Mike, why does this bother you? It seems like a straightforward concept and I can’t imagine that you would disagree with it.” Well, you are mostly correct. I don’t at all disagree with this principle of reusable object-oriented design. In fact I try to embrace it. It bothers me because I don’t believe it is really that simple a concept, and developers ‘accidentally’ ignore it quite often.
So, let’s go back a couple of months. As some of you know (and most of you don’t) I am currently a lead developer for a loan management platform for Selling Source, Inc. Part of my job as a lead developer is to oversee the architecture of our software and make sure it is going in the right direction. One of our projects at the moment is to revamp the search functionality of the software. I’ll try not to bore you with too many details (if you want more maybe I’ll write TWO blog posts this quarter). We are denormalizing some of the frequently updated (and obnoxiously wide) tables involved in searching so that we can move indexes off of the primary tables. This should improve the update and insert speeds on the primary tables while improving the search speed in the new tables. While doing this we also decided it would be a good time to clean up some of the code related to searching. One of the ideas was that, instead of building queries with string concatenation as we are currently doing, we could utilize a ‘query builder’ to create the query. This would allow for more extensibility, as responsibility for building the query would reside in an object that could be modified and manipulated much more elegantly than a string.
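To make the ‘query builder’ idea concrete, here is a minimal sketch of what I mean. SimpleSelect, the loans table and its columns are hypothetical names I made up for illustration; this is not our production code and it is not Zend_Db_Select. Quoting is deliberately left out here.

```php
<?php
// Hypothetical illustration only; quoting of values is intentionally omitted.
final class SimpleSelect
{
    private array $columns = ['*'];
    private array $conditions = [];

    public function __construct(private string $table) {}

    public function columns(array $columns): self
    {
        $this->columns = $columns;
        return $this;
    }

    public function where(string $condition): self
    {
        $this->conditions[] = $condition;
        return $this;
    }

    // The SQL text is only assembled when the object is used as a string.
    public function __toString(): string
    {
        $sql = 'SELECT ' . implode(', ', $this->columns) . ' FROM ' . $this->table;
        if ($this->conditions) {
            $sql .= ' WHERE ' . implode(' AND ', $this->conditions);
        }
        return $sql;
    }
}

$select = (new SimpleSelect('loans'))->columns(['id', 'status']);
$select->where("status = 'active'");
// A later piece of code can add a condition without parsing or re-concatenating a string.
$select->where('amount > 1000');
echo $select, PHP_EOL;
```

The point is that each part of the query lives in the object as data until the very end, so other code can keep modifying it.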
In the interest of not reinventing the wheel I had one of my developers research existing query builders. The results of his search pointed us to the Zend Framework query builder: Zend_Db_Select. Now I did have a few requirements in my head for what I would like to see out of this external query builder. The first was that I wanted it to be standalone. I did not want to have to bring in a bunch of extra baggage that I simply did not need. I was not looking for a framework; we already have one of those. I was looking for a component that we could plug into our existing framework.
Therein lay the problem with Zend_Db_Select. If you look at Zend_Db_Select from the viewpoint of “What are its dependencies?” you will quickly find that there is a class-wide dependency on Zend_Db_Adapter_Abstract. This is an abstract class that essentially acts as a wrapper for the existing PHP database libraries. It provides a wealth of functionality including a unified interface for database interaction, a hook for a query profiler and some utility functions. I am sure that in the grand scheme of the Zend Framework this is a very good and worthwhile class. However, I would like to make the argument that there is no reason for Zend_Db_Select to depend on this class, and the basis for that argument is that you limit your dependencies by truly programming to an interface.
How We Can Use It
So there are four ways that I would be able to use Zend_Db_Select as it is implemented right now.
- Change my code to utilize Zend’s db adapter for all of its connection needs. While the Zend adapter may well be better than the one I am using, our own database connection classes are already used throughout more than 100,000 lines of code. I just need to redesign our search code. Replacing every use would be completely outside the scope of the project, which in the business world is a bad place to be.
- Create a Zend_Db_Adapter instance whenever I want to use the query builder. This is not a very good option either. We are using InnoDB for our database. InnoDB performance suffers pretty much linearly with the number of open connections. Creating two connections to the same database on almost every request is just a bad idea.
- Create an extension of Zend_Db_Adapter that decorates (wraps) an existing instance of my connection object. Of the three options so far this one is the most appealing. That being said, I still do not think it is ideal. The database connection libraries we use at my office are much more encapsulated than the Zend counterpart. There is a lot that Zend_Db_Adapter does that is not in our connection class; that functionality is spread across other classes so that it can be loaded on an as-needed basis. This would make it take some time to write an adapter and would still necessitate bringing Zend_Db_Adapter into our code base for basically a single purpose.
- Modify Zend_Db_Select to remove the dependency on Zend_Db_Adapter_Abstract. This is actually the path we are currently exploring. The large downside is that we won’t be able to quickly pick up upgrades or bug fixes from Zend Framework. However, I am not too concerned. What we need this class to do is significantly more limited than its current feature set, so we will be able to shrink it significantly. With less code comes more stability, or at the very least fewer things that could break.
We have identified that this dependency on Zend_Db_Adapter_Abstract is going to cause us all sorts of grief. Let’s look at WHY Zend_Db_Select is dependent on Zend_Db_Adapter_Abstract. First, Zend_Db_Select is given the Zend_Db_Adapter_Abstract instance in the constructor. It is then assigned to the protected member variable $_adapter. So, let’s see which methods of $_adapter are called throughout the class, and how many times each:
- quoteInto(): 3
- quoteIdentifier(): 6
- quoteTableAs(): 1
- quoteColumnAs(): 3
- limit(): 1
- query(): 1
- getFetchMode(): 1
The first thing you will notice is that all but 3 calls are to methods that are responsible for quoting. Now, indulge me for a moment and imagine that those were the only calls made. Why should I need a full-fledged connection adapter object to perform quoting? I do realize that some quoting operations depend on the connection (character set, RDBMS differences, etc.), but this should not mean that I have to use (in some form or another) Zend’s database adapters. I don’t want to use them. Mine works extremely well for what I want to do and I am already using it throughout my application.
Moving Closer to Integration Utopia
Now, in a dream world Zend_Db_Select would already be in a state where we could easily plug it into our code without having to go with one of the above four options. What if we created an interface that had a public method for each of the Zend_Db_Adapter_Abstract quote methods? It would look something like this:
public function quote($value, $type = null);
public function quoteInto($text, $value, $type = null, $count = null);
public function quoteIdentifier($ident, $auto = false);
public function quoteColumnAs($ident, $alias, $auto = false);
public function quoteTableAs($ident, $alias = null, $auto = false);
Zend_Db_Adapter_Abstract could easily be made to implement this interface (you would just have to add the interface to the class definition). Then I could change the type hint in Zend_Db_Select’s constructor to use the new interface. So how would this help my cause? Well, now I have a new option: create an implementation of this interface that works with my existing connection objects. This would be significantly easier than the third option above. The interface would be well defined and significantly smaller than the Zend_Db_Adapter_Abstract class.
From a design standpoint this interface makes more sense too. The only thing we care about is quoting. So why not specify (via the constructor type hint) that this is really the only thing we are concerned with? It would make the true dependencies of our class much more transparent. From an extensibility and flexibility standpoint this is very useful.
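Here is a rough sketch of that new option. Every name in it (Quoter, MyConnection, MyConnectionQuoter, TinySelect) is a hypothetical stand-in that I made up; none of them are Zend Framework classes or our real connection classes. The interface is also cut down to two methods to keep the example short, and addslashes() is only a placeholder for whatever escaping a real connection provides.

```php
<?php
// All names below are hypothetical stand-ins, not Zend Framework or in-house classes.
interface Quoter
{
    public function quote($value);
    public function quoteIdentifier($ident);
}

// Stand-in for an existing in-house connection class.
final class MyConnection
{
    public function escape(string $raw): string
    {
        // Placeholder only: a real connection would delegate to its database driver.
        return addslashes($raw);
    }
}

// Adapts the existing connection to the small quoting interface.
final class MyConnectionQuoter implements Quoter
{
    public function __construct(private MyConnection $connection) {}

    public function quote($value)
    {
        if (is_int($value) || is_float($value)) {
            return (string) $value;
        }
        return "'" . $this->connection->escape((string) $value) . "'";
    }

    public function quoteIdentifier($ident)
    {
        // MySQL-style backtick quoting; embedded backticks are doubled.
        return '`' . str_replace('`', '``', $ident) . '`';
    }
}

// The constructor type hint names only the responsibility the builder needs.
final class TinySelect
{
    private array $where = [];

    public function __construct(private Quoter $quoter, private string $table) {}

    public function whereEquals(string $column, $value): self
    {
        $this->where[] = $this->quoter->quoteIdentifier($column)
            . ' = ' . $this->quoter->quote($value);
        return $this;
    }

    public function __toString(): string
    {
        $sql = 'SELECT * FROM ' . $this->quoter->quoteIdentifier($this->table);
        return $this->where ? $sql . ' WHERE ' . implode(' AND ', $this->where) : $sql;
    }
}

$select = new TinySelect(new MyConnectionQuoter(new MyConnection()), 'loans');
$select->whereEquals('name', "O'Brien")->whereEquals('id', 7);
echo $select, PHP_EOL;
```

Notice that TinySelect never sees a connection at all. Anything that can quote will do, which is exactly the kind of dependency I was hoping to find in Zend_Db_Select.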
So, we still have those three other pesky calls to the Zend_Db_Adapter_Abstract class.
- limit(): 1
- query(): 1
- getFetchMode(): 1
Let’s explore why these are there.
The reason limit() is being used is that different database platforms have different ways to specify limits and offsets. For MySQL you can use LIMIT <count>; LIMIT <offset>, <count>; or LIMIT <count> OFFSET <offset>. Oracle, from what I understand, doesn’t implement a LIMIT clause at all. Now one thing that I find uniquely curious is that all of a sudden in Zend they have moved query building back into their adapters in a very explicit way. So what would I do? Well, since we have already established that quoting really should be connection specific, and we have now identified that limits are connection specific too (by virtue of being RDBMS specific), we could make two small changes to the interface above. The first would be to add the limit() function. The next would be to change the name. It is no longer a ‘Quoter’; it is now essentially performing all connection-specific alterations for our queries. So let’s try out the name Zend_Db_IQueryModifier. Now before you go yelling at me for that name, let me first say that I suck at class naming. You are more than welcome to ‘insert your name here’.
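To show why limit() belongs behind an interface, here is a small sketch with two implementations. The names (LimitModifier, MysqlLimitModifier, StandardSqlLimitModifier) and the method signature are my own assumptions for illustration, and only the limit() part of the proposed interface is shown. The second class uses the SQL standard OFFSET ... FETCH syntax as an example of a platform that spells the same idea differently.

```php
<?php
// Hypothetical names; only the limit() part of the proposed interface is sketched.
interface LimitModifier
{
    public function limit(string $sql, int $count, int $offset = 0): string;
}

// MySQL-style: LIMIT <count> [OFFSET <offset>]
final class MysqlLimitModifier implements LimitModifier
{
    public function limit(string $sql, int $count, int $offset = 0): string
    {
        $sql .= ' LIMIT ' . $count;
        return $offset > 0 ? $sql . ' OFFSET ' . $offset : $sql;
    }
}

// SQL standard style: OFFSET <offset> ROWS FETCH NEXT <count> ROWS ONLY
final class StandardSqlLimitModifier implements LimitModifier
{
    public function limit(string $sql, int $count, int $offset = 0): string
    {
        return $sql . ' OFFSET ' . $offset . ' ROWS FETCH NEXT ' . $count . ' ROWS ONLY';
    }
}

$sql = 'SELECT id FROM loans ORDER BY id';
$mysql = (new MysqlLimitModifier())->limit($sql, 10, 20);
$standard = (new StandardSqlLimitModifier())->limit($sql, 10, 20);
echo $mysql, PHP_EOL, $standard, PHP_EOL;
```

The query builder would only ever call limit() on whichever implementation it was handed, so it never needs to know which platform it is talking to.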
This leaves us with finding a home for query() and getFetchMode(). I have a problem with these two method calls. They are both used in a method called query() on Zend_Db_Select. For the kind of object I am looking for, I don’t think these should be here. We already have a class that is responsible for executing queries. Instead of calling ‘query()’ on my query builder object I would much rather pass that object to my adapter, which is actually what the Zend Framework implementation of this method does. If you do not want to make your adapter dependent on the query object (which I personally would not want to do either) you could have Zend_Db_Select provide a method that returns the query. (Zend_Db_Select implements __toString().)
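In code, the arrangement I am describing might look something like the following. BuiltQuery and LoggingConnection are hypothetical stand-ins for a query builder and an in-house connection class; the connection here just records the SQL instead of talking to a database.

```php
<?php
// Hypothetical stand-ins for a query builder and an in-house connection class.
final class BuiltQuery
{
    public function __construct(private string $sql) {}

    public function __toString(): string
    {
        return $this->sql;
    }
}

final class LoggingConnection
{
    public array $executed = [];

    // Accepts plain SQL text, so the connection never depends on the builder class.
    public function query(string $sql): void
    {
        // A real connection would send $sql to the database here.
        $this->executed[] = $sql;
    }
}

$connection = new LoggingConnection();
$query = new BuiltQuery('SELECT id FROM loans');

// The builder only builds; the connection only executes.
$connection->query((string) $query);
```

Neither class knows the other exists. The only thing passed between them is a string.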
With these changes I (or anyone for that matter) could use the Zend_Db_Select class completely standalone with no need for any additional, non-exception Zend classes.
I am a Lover, Not a Hater (Oh and I’m Stupid Too)
It is not my purpose to pick on Zend Framework. I am confident that they are not the only example of not clearly thinking through dependencies and I am equally confident that many people find Zend Framework to be very useful for their needs. They just happened to have an easy real-world example of ways that code can be made better, and I happened to have to spend a significant amount of time inspecting the code for reasons other than writing an article. Just to prove I am not trying to point the finger, I have an example that is arguably worse than the one above and it is in code that I wrote. A bug was recently opened for PHPUnit that essentially asked me why I had type hints for the default database connection class of the database extension in several of my files. There was no real need for this. I wasn’t as interested in a wrapped version of the class as I was in simply retrieving metadata. In fact, to make things even worse, I ALREADY HAD an interface for this data. I simply did not use it in this case. It was a terrible oversight of mine and is an example of how, even when you know better, these kinds of dependencies can slip by. If you would like more details I can outline them in another post.
I am Almost Done
So, at the expense of ending this ridiculously long blog posting let’s go back to the Erich Gamma interview. In the spirit of programming to interfaces, not implementations, I believe you should limit dependencies to responsibilities, not complete classes. If a class truly has only one responsibility, then it is probably okay to use that class name as the type hint. If a class has multiple responsibilities and you have deemed that necessary, then I would strongly recommend you create interfaces to indicate the public methods for those responsibilities and use those in your type hints as appropriate.
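A minimal sketch of that recommendation, using names I invented for the purpose (QueryExecutor, TableMetadata, FullConnection, describeTable); this is not PHPUnit’s or Zend’s actual API, and the method bodies are placeholders:

```php
<?php
// Hypothetical names throughout; this is not PHPUnit's or Zend's actual API.
interface QueryExecutor
{
    public function execute(string $sql): array;
}

interface TableMetadata
{
    public function columnNames(string $table): array;
}

// One class, two responsibilities, each published as its own interface.
final class FullConnection implements QueryExecutor, TableMetadata
{
    public function execute(string $sql): array
    {
        return []; // placeholder: a real class would run the query
    }

    public function columnNames(string $table): array
    {
        return ['id', 'status']; // placeholder: a real class would inspect the schema
    }
}

// Depends only on the metadata responsibility, so a tiny stub also satisfies it.
function describeTable(TableMetadata $metadata, string $table): string
{
    return $table . '(' . implode(', ', $metadata->columnNames($table)) . ')';
}

$stub = new class implements TableMetadata {
    public function columnNames(string $table): array
    {
        return ['a', 'b'];
    }
};

$fromFull = describeTable(new FullConnection(), 'loans');
$fromStub = describeTable($stub, 'loans');
```

Because describeTable() is hinted with TableMetadata rather than FullConnection, anyone can hand it their own metadata source without dragging along query execution.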
Now, if you read through the entire interview (and I suggest you do) you will notice that Erich makes some points that might on the surface seem to contradict what I am saying. Erich states “An abstract class is good as well. In fact, an abstract class gives you more flexibility when it comes to evolution. You can add new behavior without breaking clients.” This is a very valid point. He does however go on to state, “As always there is a trade-off, an interface gives you freedom with regard to the base class, an abstract class gives you the freedom to add new methods later.” So you do have a decision to make. My issue with Zend_Db_Select turned out to be that the dependency in code was deeper than it needed to be. Sometimes you will have a dependency on an abstract class that will be perfectly fine. So long as you keep responsibilities to a minimum, via composition among other means (which the second point in that interview helps with), you should find it very easy to program to interfaces.