PHPUnit beyond basics: Dataproviders

Testing a lot of data in a small test

Once you have set up your first unit tests and have a good configuration, it’s time to add a lot of tests. Let’s take a look at data providers as a way to test with a lot of data.

For this example we’ll test a piece of code that is supposed to do the following: given the array ['a', 'b', 'c'], return the string 'a a-b a-b-c b b-c c'. In other words, the function combines every run of consecutive array values into one space-separated string.
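
The post does not show the implementation under test. As a point of reference, here is a minimal sketch of what KeyCombiner::combine could look like, assuming the rule is "every run of consecutive values, joined with dashes, ordered by starting position". The class body is hypothetical; only the class name, the method name and the input/output pairs come from the examples in this post.

```php
<?php

declare(strict_types=1);

// Hypothetical implementation: the post only specifies inputs and expected outputs.
final class KeyCombiner
{
    /**
     * @param array<int, string> $keys
     */
    public static function combine(array $keys): string
    {
        $keys = array_values($keys); // normalise to 0-based indexes
        $count = count($keys);
        $combinations = [];

        // For every starting position, emit runs of length 1, 2, ... up to the end.
        for ($start = 0; $start < $count; $start++) {
            for ($length = 1; $start + $length <= $count; $length++) {
                $combinations[] = implode('-', array_slice($keys, $start, $length));
            }
        }

        return implode(' ', $combinations);
    }
}
```

An empty array never enters the loops, so the result is an empty string, which is exactly the kind of edge case the tests should pin down.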

Now we could write a test that tests this case. However, we need to make sure this works in more than just this case. It needs to handle duplicate entries, an empty array, an array with just one element, and so on. We could add a test method for every case, but there is an easier solution: dataproviders.

public function testItCombines(): void
{
  $this->assertSame('a a-b a-b-c b b-c c', KeyCombiner::combine(['a', 'b', 'c']));
}

A data provider supplies the input for your tests. Normally a test method does not take any arguments, but with a data provider you can execute it many times, with different data every time. So let’s change our test to use a data provider, for just this case.

/**
 * @dataProvider provideCombineCases
 */
public function testItCombines(array $input, string $expectedOutput): void
{
  $this->assertSame($expectedOutput, KeyCombiner::combine($input));
}

public static function provideCombineCases(): Generator
{
  yield [['a', 'b', 'c'], 'a a-b a-b-c b b-c c'];
}

We configure our test method to use a data provider with the annotation @dataProvider provideCombineCases. (If you are using PHPUnit 10, you can use the DataProvider attribute rather than the annotation.) This usually refers to a public method on the same class that returns an iterable; PHPUnit 10 and later also expect that method to be static. The iterable has to contain arrays, where every array matches the arguments of the method it provides data to.
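
To make the "every array matches the arguments" rule concrete, here is a rough plain-PHP sketch of what a test runner does with a provider. This is not PHPUnit’s actual code; runWithProvider and checkCombine are hypothetical names used only for illustration.

```php
<?php

declare(strict_types=1);

// Stand-in for a test method: its parameters mirror one row of the provider.
function checkCombine(array $input, string $expectedOutput): string
{
    return sprintf('%d keys -> "%s"', count($input), $expectedOutput);
}

function provideCombineCases(): Generator
{
    yield [['a', 'b', 'c'], 'a a-b a-b-c b b-c c'];
}

// One call per row, with the row unpacked into the test's arguments.
function runWithProvider(callable $test, iterable $provider): array
{
    $results = [];
    foreach ($provider as $dataSetName => $arguments) {
        $results[$dataSetName] = $test(...$arguments);
    }

    return $results;
}
```

If a row had three elements, or a string where the method expects an array, the call would fail, which is why each yielded array has to line up with the method signature.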

In our case, we have a Generator that yields an array containing an array and a string. These match our test method, which takes an array and a string. Now adding more test cases is as simple as adding a new case to the data provider, so let’s add the test cases we talked about. And if we discover a bug, we can add a new test case to the provider to make sure we cover that too.

public static function provideCombineCases(): Generator
{
  yield [['a', 'b', 'c'], 'a a-b a-b-c b b-c c'];
  yield [[], ''];
  yield [['a', 'a', 'b'], 'a a-a a-a-b a a-b b'];
  yield [['b'], 'b'];
}

But we’re not done yet. Currently, the output reads something along the lines of “Test it combines with dataset 0”. If we add keys to our data provider, we get that information when our tests fail, or when we use the --testdox flag.

public static function provideCombineCases(): Generator
{
  yield 'multiple keys' => [['a', 'b', 'c'], 'a a-b a-b-c b b-c c'];
  yield 'an empty list' => [[], ''];
  yield 'duplicate keys' => [['a', 'a', 'b'], 'a a-a a-a-b a a-b b'];
  yield 'a single key' => [['b'], 'b'];
}

Now, if we use the --testdox flag, we get output like “Test it combines with multiple keys”. And if we get a failure, the error message will also contain the key you used.
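
The names come straight from the keys of the iterable. Here is a small sketch, in plain PHP without PHPUnit, of the keys a runner receives from an unnamed and a named provider. The helper dataSetNames and the two provider names are hypothetical.

```php
<?php

declare(strict_types=1);

function provideUnnamedCases(): Generator
{
    // No explicit keys: PHP numbers the yields 0, 1, 2, ...
    yield [[], ''];
    yield [['b'], 'b'];
}

function provideNamedCases(): Generator
{
    // Explicit string keys become the data set names.
    yield 'an empty list' => [[], ''];
    yield 'a single key' => [['b'], 'b'];
}

// Collects the keys of a provider in iteration order.
function dataSetNames(iterable $provider): array
{
    $names = [];
    foreach ($provider as $name => $arguments) {
        $names[] = $name;
    }

    return $names;
}
```

Calling dataSetNames(provideUnnamedCases()) gives the integers 0 and 1, while the named provider gives the two descriptive strings.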

Additional things

Multiple providers

You can use multiple data providers on a single method. This can be useful for validation methods, where you can split the cases that pass validation from the cases that don’t.

/**
 * @dataProvider provideAlnumCases
 * @dataProvider provideNonAlnumCases
 */
public function testItValidatesAlnum(string $input, bool $isAlnum): void
{
  $this->assertSame($isAlnum, Validator::isAlnum($input));
}

public static function provideAlnumCases(): Generator
{
  yield ['a', true];
  // more
}

public static function provideNonAlnumCases(): Generator
{
  yield ['_', false];
  // more
}
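
As a sketch of how the two providers might look once filled in, assuming Validator::isAlnum accepts only non-empty strings of ASCII letters and digits. The Validator body, the extra cases and the helper allCasesMatch are hypothetical; the post only shows the test method. The providers are written as plain functions here so the example runs without PHPUnit.

```php
<?php

declare(strict_types=1);

// Hypothetical validator: non-empty, ASCII letters and digits only.
final class Validator
{
    public static function isAlnum(string $input): bool
    {
        return preg_match('/\A[A-Za-z0-9]+\z/', $input) === 1;
    }
}

function provideAlnumCases(): Generator
{
    yield 'a single letter' => ['a', true];
    yield 'letters and digits' => ['abc123', true];
}

function provideNonAlnumCases(): Generator
{
    yield 'an underscore' => ['_', false];
    yield 'an empty string' => ['', false];
    yield 'a space in the middle' => ['a b', false];
}

// Checks every row of one provider against the validator.
function allCasesMatch(iterable $cases): bool
{
    foreach ($cases as [$input, $isAlnum]) {
        if (Validator::isAlnum($input) !== $isAlnum) {
            return false;
        }
    }

    return true;
}
```

Keeping one expected outcome per provider makes it easy to see at a glance which side of the validation a new case belongs to.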

Code coverage

Any code that is executed inside a data provider does not count toward code coverage. So if your object under test is only constructed in data providers, its __construct method won’t be considered covered.
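
One way to keep the constructor inside the measured part of the run is to yield the raw constructor arguments and build the object in the test body. A minimal sketch, assuming a hypothetical Version value object; checkMajor stands in for a test method and is written as a plain function so the example runs without PHPUnit.

```php
<?php

declare(strict_types=1);

// Hypothetical class under test.
final class Version
{
    private int $major;

    public function __construct(string $version)
    {
        $this->major = (int) explode('.', $version)[0];
    }

    public function major(): int
    {
        return $this->major;
    }
}

function provideVersionStrings(): Generator
{
    // Raw constructor arguments only; no Version object is built here.
    yield 'a stable release' => ['1.2.3', 1];
    yield 'a two digit major' => ['10.0.1', 10];
}

function checkMajor(string $version, int $expectedMajor): bool
{
    // The constructor runs here, inside the test body.
    return (new Version($version))->major() === $expectedMajor;
}
```

The provider stays cheap and free of side effects, and the constructor is exercised where coverage is collected.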

Errors

When an Exception (or Error) is thrown in your data provider, none of its test cases are executed. Instead, the whole provider counts as a single (failed) test.

When providers run

Data providers are run before the setUp or setUpBeforeClass methods are called, which means you can’t use any properties set in those methods.
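
The ordering can be pictured with a simplified simulation. This is not PHPUnit’s actual code: ExampleTest is a hypothetical stand-in that does not extend TestCase, and simulateRun only mimics the order of events described above.

```php
<?php

declare(strict_types=1);

final class ExampleTest
{
    private string $fixture = 'not set up';

    public function setUp(): void
    {
        $this->fixture = 'ready';
    }

    // Static: there is no instance here, so nothing from setUp() is reachable.
    public static function provideValues(): Generator
    {
        yield 'first' => ['x'];
        yield 'second' => ['y'];
    }

    public function testValue(string $value): string
    {
        return $this->fixture . ':' . $value;
    }
}

// Simplified order of events: collect every row first, then run the tests.
function simulateRun(): array
{
    $rows = iterator_to_array(ExampleTest::provideValues());

    $results = [];
    foreach ($rows as $name => $arguments) {
        $test = new ExampleTest(); // fresh instance per case
        $test->setUp();            // runs only after the rows already exist
        $results[$name] = $test->testValue(...$arguments);
    }

    return $results;
}
```

If a case needs something that setUp builds, one option is to yield plain values and combine them with the fixture inside the test method.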

Gert de Pagter
Software Engineer
