PHPUnit Database Extension (DBUnit Port)

I have completed the initial feature set for the Database extension to PHPUnit. This is a essentially a port of DBUnit to PHP.

For those that may not have read any of my previous postings on the subject the goal of this software is to extend the functionality of PHPUnit to allow using seed data to put a given database into a known state prior to executing each test. It also provides an easy mechanism to compare database contents with an expected dataset.

The database extension has recently been merged into the PHPUnit 3.2 branch and is scheduled to be released in that version. Sebastian Bergmann will be introducing the extension in his Advanced Testing with PHPUnit talk at PHP|Works 2007 in Atlanta September 13 – 14. If you would like to tinker around with the database extension prior to it’s release you can always download the latest copy of PHPUnit 3.2 from svn: svn://svn.phpunit.de/phpunit/phpunit/branches/3.2. The source can also be browsed at http://www.phpunit.de/browser/phpunit/branches/3.2/PHPUnit/Extensions/Database.

Please continue reading for an example of how you can now use PHPUnit to even more effectively test data-centric applications.

In this example I am going show you how to create a new test case to test the database functionality of a class providing functionality to edit the details of a bank account.

To use the database extension I will create a new test case extending the class PHPUnit_Extensions_Database_TestCase.

 

There are two methods that you must have in any database test case. Those methods are getConnection() and getDataSet(). The first method getConnection() must return an object that implements the PHPUnit_Extensions_Database_DB_IDatabaseConnection interface. In most cases you will return an instance of the PHPUnit_Extensions_Database_DB_DefaultDatabaseConnection. The constructor for this class takes two parameters. The first is a PDO object that is connected to the database you plan on using to test your functionality. The second parameter is the name of the database schema you will be testing against. I realize that PHPUnit_Extensions_Database_DB_DefaultDatabaseConnection is a handful to type so there is a helper method in the TestCase class that will instantiates a database connection with the same arguments: createDefaultDBConnection(PDO $connection, string $schema).

You can choose a variety of methods to get the PDO object to pass to this method. For this example I will create the connection in the test case constructor and add it as a protected class property. For large test suites you may want to use a static singleton method of a test utility class to help get rid of some of the overhead of opening and closing connections. Here is our test case with the getConnection() method implemented:

For my test I have simply chosen to use the memory database in sqlite. I have chosen this simply because it is faster than an sqlite database in the filesystem and this is just a tutorial so persistence of the file is simply not needed. I have called the BankAccount::createTable() method to create the schema that the BankAccount object uses. The downside to using the memory database is you have to create your schema every time. Normally you would have a persistant database for testing that would always have the necessary schema to test your code. Note that when using sqlite what you pass as the $schema parameter does not really matter since sqlite does not really have the concept of schemas.

This brings us to the next required method. The getDataSet() method is used to return the seed data that is used in setUp() and tearDown() operations for each test. There are currently two different ways to easily load datasets: the Flat XML DataSet and the XML DataSet. The flat XML data set is an extremely simple yet somewhat limited data set where the XML data set is slightly more complicated but much more robust when dealing with null values. For our example we will use the flat xml data set.

In the flat xml data set you create a root element named dataset and then create xml element for every row in every table of your database. The element name is equal to the table name the row resides in. The element will have one attribute for each column in the table. If there is an attribute missing for one element that exists in another element then that colum for the row with the missing attribute will be set to NULL. Please note that colname=”” will not set that column to null. It will set it to an empty string (or whatever that translates to for that articular column’s data type.

The flat xml data set we will be using is as follows:

The classname of the flat xml data set is PHPUnit_Extensions_Database_DataSet_FlatXMLDataSet and it’s only argument is the path to xml file that you are loading. Once again, this is an astronomically long class name so there is a helper method to instantiate this class: createFlatXMLDataSet().

Here is our test case with the getDataSet() method implemented:

Now in a default implementation the database extension will perform a ‘clean insert’ operation on your database using the the dataset returned by getDataSet. The clean insert operation will truncate any tables in the data set (not all tabes in the database) and then insert the rows in the dataset. The default implementation will do nothing on tear down. There are several other operations that can be used by adding the getSetUpOperation() and getTearDownOperation() methods. These methods should return a PHPUnit_Extensions_Database_Operation_* class. These operations can be returned using the test case’s getOperations() method. The default set up implementation is the equivelant of creating the following getSetUpOperation().

There is one more thing to take note of before we dive into our first test. The Database_Testcase is heavily dependant on it’s implementation of setUp() and tearDown() for it’s functionality to work. If you need to perform additional setup or teardown actions make sure you call the parent versions of these functions as well.

The functionality I would like to test in this example is testing new account creation. In the bank account class, whenever a new BankAccount class is instantiated it is supposed to create a new bank account with a zero balance in the database. To test this we will need to create another dataset representing our seed data with the expected changes.

Using this data set we can create a new test that instantiates a bank account object and then we will assert that the contents of the database are equal to the dataset we just created. In order to compare the database to your dataset you need to create a dataset containing the contents of your database. This is done by calling the createDataSet() method of the database connection object you created in the getConnection() method. You can then pass it to the assertDataSetsEqual() function. Also notice that you can just reuse the createFlatXMLDataSet() method to load the expected dataset.

If you have a large database data set you can also compare at the table level using the test case method assertTablesEqual(). You can pass it individual tables by calling the getTable($tableName) method. This method exists for any of the data set classes. You can also limit tables at the database level by passing an array of table names to the createDataSet() method. So if your database had multiple tables the above function could have been rewritten as below to only compare at the table level.

Or if you didn’t even want to bother loading tables you won’t be testing against:

Hopefully that is enough to pique your interest. As always, if you have any questions feel free to ask me.

35 thoughts on “PHPUnit Database Extension (DBUnit Port)”

  1. That is a good point and definantly something that should be kept in mind when working with your test suites. It also makes another good point of the fact that it is safest to use a seperate connection for your fixture. Again because of the potential of auto insert IDs, affected rows, current errors etc.

  2. Nice addition!

    I wonder if inheritance (dbtestcase extending the regular one) is the Right Thing; I think people would need to use databases in existing testcases etc. Probably using a composition to do this would be cleaner.

    Is it also possible to use a ‘fixture’ like approach such as is common in ruby? Or php-based datasets instead of XML? That would make it easier to create pseudo-recent datasets such as ‘create a record with its date set to yesterday’.

  3. I am actually planning on slipping a method to add DB tests to existing test cases in before the 3.2 release. That is actually the purpose of the PHPUnit_Extensions_Database_*DatabaseTester classes.

    I read a little bit about ruby fixtures and I am not sure what aspects of it you are wanting to use. I do have plans for adding a csv format to the dataset types as well as a composite dataset (to allow for combining multiple datasets.) I could also add a ruby fixture as a potential data set style. As far as loading fixtures in the ruby style, I don’t know that I would want to depart that much from how PHPUnit already does things. If you have some more info that would helpful in understanding ruby fixtures that would help.

    Lasty there is sort of a way to build datasets in php using the DefaultDataSet/DefaultTable/DefaultTableMetaData classes. I will probably refine this to allow using a multi-dimensional array to create these data sets. IE
    $phpDataSet = $this->createPHPDataSet(array(
    ‘table1′ => array(
    array(‘col1′ => 1, ‘col2′ => 2, ‘col3′ => 3),

    ),

    ));

    Thank you for the feedback it is extremely useful. If you have any other ideas feel free to open tickets at the PHPUnit Trac: http://www.phpunit.de/newticket

  4. In ruby it’s common to use ‘YAML’ as format for the fixtures; but the format isn’t really that important. The important feature of fixtures is that you can have ‘named sets’. For example, you could create 2 fixtures:

    account:
    balance: 100

    account_negative:
    balance: -100

    When running testcases that do tests on accounts that have a negative balance, you’d tell your testcase to load ‘account_negative’ and your other testcases to use ‘account’. This way, you can create specific sets of testdata for specific situations.

  5. This is going to be a really useful addition to PHPUnit. I’m glad to see this get rolling.

    Is there any significant advantage of using XML files to define the datasets, versus plain old PHP arrays? Personally, I would prefer just including PHP arrays instead.

    Also, I’m not familiar with Ruby and its testing systems, but I do like the idea of the named data sets mentioned by Ivo Jansch, above.

  6. Hi, Is there any possibility to use PHPUnit for FTP
    so I can test the connection and other stuff before
    using the real one.

    Greets ken,

  7. I wanted to be able to do this for use in our projects now so I have subclassed PHPUnit_Extensions_Database_DB_FilteredDataSet to create a new class WriteableFilteredDataset. To this I added a method writeFlatXmlDataset to write out an XML file that adheres to the FlatXMLDataSet XML structure. You can then load this using the standard Database TestCase delivered in Mike’s work (ie using the createFlatXmlDataSet method).

    I also subclassed PHPUnit_Extensions_Database_DB_DefaultDatabaseConnection to create a new class SeedDatabaseConnection and added a new method called createWriteableDataSet which returns an instance of the aforementioned subclass WriteableFilteredDataset.

    Props to you Mike for keeping it extensible!

    So in your test suite you can create a new SeedDatabaseConnection object that connects to a seed database containing tables full of extractable data. Then when you want to, in your test cases you can extract the data you want out from the seed database into xml files and then load those xml files into the database that your tests are running against.

    So I guess my idea is to have this seed database that contains a clone of data from the production install. Then if you need some test data to run a certain test against you go into the seed database, refine that set of test data manually using SQL, then extract that refined data to xml files that your tests will use to set up their fixtures. Then you can refresh the seed database again from production and repeat the process to capture different test data sets when you need to. Does that make sense?

    If anyone is interested in the aforementioned subclasses I can send em…I’m contactable via the contact link on the workingsoftware website.

    cheers,
    dan

  8. It’s probably me being stupid, but…

    I don’t really get it. You create a PDO-object and then you send it to the function you are testing. I am not prepared to rewrite my project to make all functions accept a PDO-object.

    What am I missing?

  9. you don’t have to use a pdo object in the code you are testing. In fact it is preferred (imo) that you use a completely different connection anyways. I just used the same connection for brevity of the example.

    The only requirement for PDO is for the setting up of the tests themselves. You are not required to use PDO in your code.

  10. Every public testWhatever() method tests some functionality of the class we want to test.

    Do they all share the dataset returned by getDataSet()? Is there a way to specify different initial state for different testWhatever() methods?

    The initial state dataset is class-level or can it be method-level? Or maybe the latter is not a good idea whatsoever?

    I want to test a class that is like a finite state machine, and every testWhatever() method is a transition. Therefore, every test method focuses on a different initial state. If I design a class-level dataset as the union of the initial testmethod-level datasets, phpunit forces me to include them again in every final dataset to compare with -except the corresponding to the current test method, substituted by the end state of the transition.
    That’s very verbose and redundant, and scales badly when you add test methods.

    Alternatively, I can use a test class for each test, but that’s class-verbose; for a system class to test, I need as many test classes as functionalities I want to test; It doesn’t look as a clean implementation of a unit test to me, does it?

    Thanks!

  11. Thanks a lot for your work. It is very helpful.

    I didn’t find a way to build quickly a dataset from the database, so I develop the following php script. I hope it will be useful to some people…

    Cheers !

    OS.

    [code]
    //Deals with the optional GET parameter
    $displayInBrowser = FALSE;
    $openMarkupChar = '< ';
    if (isset($_GET['display_in_browser'])) {
    $displayInBrowser = TRUE;
    }

    //Retrieve the list of the tables
    $queryResult = mysql_query('SHOW TABLES');
    $tablesListArray = array();
    while($tableNameArray = mysql_fetch_array($queryResult)) {
    $tablesListArray[] = $tableNameArray[0];
    }

    if ($displayInBrowser) {
    echo "

    ";
    }
    [/code]

  12. Im just wondering about performances on large sets of data.
    If the test is done on a database containing a copy of the application in production, it may contain several thousand if not more lines of data.
    Dumping that into xml and comparing it with a reference xml is almost certain to have very poor performances, no ?

  13. Hello!

    First I must say thanks for the great tutorial you wrote. I was ripping my hair off trying to get things to work, but I just can’t do it! I wrote a test case in order to test some db data, following your tutorial. Everything I wrote is by the book, nothing exotic. But everytime I run phpunit I get this error: Class PHPUnit_Extensions_Database_TestCase could not be found in classitem.php

    What’s wrong?? Please help me!

    Thanks in advance!!

  14. Oliver,
    you are correct, this would have pretty severe performance implications. The idea is that you would have a smaller database with small datasets that allow you to test specific scenarios from the data stand point.

    Of course that’s not to say you can’t use a larger data set. This is possible to, you just have to understand the performance implications you are bringing up.

    What we do at my company is create a clean database with only reference data and then using dbunit we only populate the data required for the code to run.

  15. Hi Mike,
    Thanks for this great missing feature in PHPUnit.

    Just as mjimenez I would like to know why I’m limited to only call one xml seed file per test class? Some test methods might require different sets of data.

    And also, when you seed your test DB with createFlatXMLDataSet() and you’re doing inserts, updates or deleting from a table that is not in the seed XML file, it will use the real database.
    IMO this causes confusion as failure to remember to include a table in the XML file will have unexpected results for the end-user.

    Could you please clarify these two issues Mike? Thank you.

  16. Oh and one more thing…

    The second argument on method createDefaultDBConnection() is required but doesn’t seem to be doing much when not using SQlite.

  17. The reason why you are limited to one dataset is for simplicity. Your test cases as far as database testing is concerned should revolve around using a common dataset.

    I don’t quite understand what your second comment is about. All I can say is PHPUnit can’t do what you don’t tell it to in regards to setting up the database. Imo it would cause significantly more confusion if tables were getting truncated when you told it not to.

    Lastly, the schema parameter is necessary to be able to do ither required introspection on the database. There is no sql standard way to pull your current schema nor is there a way to pull this information from PDO, and it is actually sqlite that this does nothing for :).

  18. I had this problem too. For me, it was resolved by ensuring I had the filename of my test case the same as the test class name, e.g. filename BankAccountDBTest.php. I notice your error message mentioned classitem.php, is this the name of your class?

  19. Thanks for the comment about the class name and file name needing to match! Very helpful, the error message isn’t helpful in determining this cause.

    Is there any reason for the requirement? I’m working with some legacy code, and now have to have multiple coding styles and naming schemes (renaming all existing files, functions, and classes to match this is not an option). I can understand the coding style, but I don’t understand why it should be enforced.

    Either way, it would be helpful to mention this in the documentation.

    Thanks for the extension, aside from the above, so far it looks good!

  20. i am getting this error ” Class ‘PHPUnit_Extensions_Database_DataSet_FlatXMLDataSet’ not found”

    i have that class avaialble ..
    not sure what is wrong.

    Please help
    Manasi

  21. Largely a lack of time on my part. There will be a fairly thorough description of this functionality in a soon to be released book. I am also attempting to put some more time in on my personal projects which would include some long awaited documentation and bug fixes. I have to feed the family first ;).

  22. Great example. I used this to get myself going. The manual seems to be lacking in examples. Perhaps this example should be included in the manual.

  23. Thanks for great tutorial.

    I’ve two question.
    1. I’m using XMLDataset and it works fine. But is there a way if I want to use getDataSet() method to use more than one xml files. For example, I’ve two xml files with has data eg; table1.xml and table2.xml. I’ve a test which uses both data(table1.xml and table2.xml). Rather than creating another xml file table1_table2.xml and use it within datasource, I want to use existing 2 datasources. It would be great to know if this is possible.

    2. I was trying to use return $this->getOperations()->TRUNCATE() method. Is anyone been able to make it work ? I’m using phpunit 3.6 Suggestion would be great.

    Thanks
    P

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">