What are the benefits of using Repositories?

Sep 08, 2014

Table of contents:

Round up of previous examples
What are Repositories?
Writing the first repository
Conclusion

So far in this series of blog posts we’ve looked at building and working with Repositories a couple of times.

A Repository is basically a layer that sits between your project’s domain and the database.

Whilst I think The Repository Pattern is becoming pretty well known, it seems to me that there is still confusion over why exactly you would use it, and what the real benefits of it are.

In today’s tutorial I’m going to be looking once again at The Repository Pattern to uncover the real beauty of the design.

Round up of previous examples

Before I jump into the exploration of The Repository Pattern, first we should look back at the two previous posts in this series that also mention the use of Repositories.

The first mention of Repositories was in the post Creating flexible Controllers in Laravel 4 using Repositories. This was the first tutorial that mentioned using a Repository as a layer between your controller and your database. By injecting an instance of an object that implements an interface, we can very easily switch out objects that also implement the same interface.

The next mention of using Repositories was in the post Eloquent tricks for better Repositories. In this post I looked at how you can create an abstract Repository to allow you to reuse common database querying methods. Many of the foundational aspects of a Repository will be consistent from implementation to implementation and so it makes sense to write reusable code.

What are Repositories?

So if you’ve already read those previous posts, I’m guessing you already have a fair idea of what a Repository is used for.

But do you understand the real reasons behind using a Repository in the first place? Whilst on the surface the reasoning behind The Repository Pattern seems pretty obvious, I think many people overlook the nuances.

I think there are basically four main benefits of using The Repository Pattern in an application.

Data storage as a detail of the application

The first big benefit of using The Repository Pattern is it moves you closer to thinking about the database as merely a detail of the overall application.

A lot of applications get their first burst of growth through the design of the database schema. Whilst that suits many CRUD-centric applications, which are very much database oriented, it is the wrong approach for applications that are driven by business logic rather than by their tables.

The database is a detail of your application. You should not design your application around how you intend to store the data.

The benefit of using The Repository Pattern in this instance is that you can write the Repository interface at the beginning of the project without really thinking about the actual technical details of how you are going to store your data.

For example, you might have the following UserRepository interface:

interface UserRepository
{
    public function findUserById($id);

    public function findUserByUsername($username);

    public function add(User $user);

    public function remove($id);
}

Rather than bothering to set up the database, you can write an in-memory implementation that simply stores the data in a really lightweight way:

class InMemoryUserRepository implements UserRepository
{
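    /** @var User[] users indexed by id */
    private $users = [];

    public function findUserById($id)
    {
        // Return null rather than triggering a notice when the id is unknown
        return isset($this->users[$id]) ? $this->users[$id] : null;
    }

    public function findUserByUsername($username)
    {
        // Return the first matching user, or null if there is none
        foreach ($this->users as $user) {
            if ($user->username === $username) {
                return $user;
            }
        }

        return null;
    }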

    public function add(User $user)
    {
        $this->users[$user->id] = $user;
    }

    public function remove($id)
    {
        unset($this->users[$id]);
    }
}

You can then continue to build out the real important bits of your application knowing that whenever you get to the point of really needing to store data, you can simply write a Repository implementation that satisfies the requirements of your Repository interface.
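
To make the "storage is a detail" point a little more concrete, here is a rough sketch of two interchangeable implementations behind one interface. The `User` class is a minimal stand-in, the interface is trimmed to two methods, and `JsonFileUserRepository` and `registerAndFetch()` are hypothetical names invented for this example, so treat it as an illustration rather than the code from this series:

```php
<?php
// Minimal stand-in for the application's User class (an assumption for this sketch).
class User
{
    public $id;
    public $username;

    public function __construct($id, $username)
    {
        $this->id = $id;
        $this->username = $username;
    }
}

// Trimmed-down version of the interface shown above.
interface UserRepository
{
    public function findUserById($id);

    public function add(User $user);
}

// Keeps users in a plain array for the lifetime of the script.
class InMemoryUserRepository implements UserRepository
{
    private $users = [];

    public function findUserById($id)
    {
        return isset($this->users[$id]) ? $this->users[$id] : null;
    }

    public function add(User $user)
    {
        $this->users[$user->id] = $user;
    }
}

// Hypothetical second implementation: stores id => username pairs in a JSON file.
class JsonFileUserRepository implements UserRepository
{
    private $path;

    public function __construct($path)
    {
        $this->path = $path;
    }

    public function findUserById($id)
    {
        $rows = $this->read();

        return isset($rows[$id]) ? new User($id, $rows[$id]) : null;
    }

    public function add(User $user)
    {
        $rows = $this->read();
        $rows[$user->id] = $user->username;
        file_put_contents($this->path, json_encode($rows));
    }

    // A missing or empty file is treated as an empty store.
    private function read()
    {
        if (!is_file($this->path)) {
            return [];
        }

        return json_decode(file_get_contents($this->path), true) ?: [];
    }
}

// Calling code depends only on the interface, so either implementation will do.
function registerAndFetch(UserRepository $users)
{
    $users->add(new User(1, 'alice'));

    return $users->findUserById(1)->username;
}
```

Passing `new InMemoryUserRepository()` or `new JsonFileUserRepository($path)` to `registerAndFetch()` should give the same result, which is the whole point: the calling code never needs to know which one it was handed.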

Much easier for testing

The second great reason for using The Repository Pattern, very much related to the first, is that it makes testing your code a lot easier.

Whenever you need to add or query data from the database in your application, instead of hard coding that dependency, you can inject an instance of an object that satisfies the requirements of your Repository interface.

For example, you wouldn’t want to write the following code in your application:

public function find($id)
{
    $repository = new EloquentUserRepository;

    return $repository->findUserById($id);
}

By creating a new instance of EloquentUserRepository directly in the method you’ve coupled that dependency to your code.

Instead you should inject an object that meets the requirements of an interface:

public function __construct(UserRepository $repository)
{
    $this->repository = $repository;
}

public function find($id)
{
    return $this->repository->findUserById($id);
}

By injecting an object that satisfies an interface we can very easily inject a different implementation during testing that does not require the test to hit the database:

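public function test_find_user()
{
    // Seed the in-memory repository so there is a user to find.
    // makeUser() is a hypothetical test helper that returns a User with the given id.
    $repository = new InMemoryUserRepository;
    $repository->add($this->makeUser(1));

    $controller = new UserController($repository);

    // The controller method defined above is find(), not findUserById()
    $user = $controller->find(1);

    $this->assertInstanceOf('User', $user);
}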
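
You don’t even need a full in-memory implementation to get this benefit. As a sketch, assuming PHP 7 or later for anonymous classes, any throwaway object that satisfies the interface can be injected. The `User` class and the one-method interface below are minimal stand-ins, and the `$requestedIds` property is something I’ve made up purely to show what the stub can record:

```php
<?php
// Minimal stand-ins so the sketch runs on its own (assumptions, not the real classes).
class User
{
    public $id;

    public function __construct($id)
    {
        $this->id = $id;
    }
}

interface UserRepository
{
    public function findUserById($id);
}

class UserController
{
    private $repository;

    // The controller only knows about the interface, never a concrete class.
    public function __construct(UserRepository $repository)
    {
        $this->repository = $repository;
    }

    public function find($id)
    {
        return $this->repository->findUserById($id);
    }
}

// An ad-hoc stub: it always "finds" a user and records which ids were requested.
$stub = new class implements UserRepository {
    public $requestedIds = [];

    public function findUserById($id)
    {
        $this->requestedIds[] = $id;

        return new User($id);
    }
};

$controller = new UserController($stub);
$user = $controller->find(1);
```

Because the stub records each id it was asked for, a test can check both what the controller returned and how it talked to the repository, all without touching a database.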

A one-way dependency

Good applications are comprised of a number of different layers that each have a single responsibility within the software stack.

The very top layer is the User Interface. The User Interface is used for displaying data to the user, accepting user input and sending that input back into the application.

Next we have the HTTP layer that accepts the user input and directs where requests should be sent.

Next we have the application layer that co-ordinates what services we need in order to satisfy the page request.

Next we have the domain layer where the real business logic of the application sits.

And finally at the very bottom we have the database. The database is essentially on the other side of a wall under the domain layer because it’s not really our responsibility.

So as you can see, an application is comprised of a number of different layers. Each of these layers has a single responsibility within the application.

Each layer is also essentially oblivious to the layers below. The User Interface does not care whether the application is written in PHP, Ruby or Java. As long as the HTTP layer can send and receive requests, that’s all that matters.

The HTTP layer does not care about what the Application layer does to satisfy the request. It only cares about sending an appropriate response.

The Application layer does not care how the Domain layer decides what is accepted and what is against the business rules. The Application layer has no knowledge of “business rules”.

The Domain layer does not care how the data is actually stored on the other side of the wall. It only cares about sending and receiving data to satisfy the requests from above.

So as you can see, each layer is totally oblivious to the layers below.

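Using The Repository Pattern allows us to create a one-way dependency between the domain and the data mapping layers: the storage code depends on the Repository interface that the domain defines, and the domain never depends on the storage code.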
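
Here is a rough sketch of what that one-way dependency can look like in code. The `Domain`, `Infrastructure` and `App` namespaces, the `RegisterUser` service and the `ArrayUserRepository` class are all hypothetical names chosen for the illustration. Notice that nothing in the `Domain` namespace refers to anything in `Infrastructure`:

```php
<?php
// In a real project each namespace would live in its own file;
// they share one file here only to keep the sketch self-contained.

// Domain layer: owns the interface and knows nothing about storage.
namespace Domain;

class User
{
    public $id;

    public function __construct($id)
    {
        $this->id = $id;
    }
}

interface UserRepository
{
    public function add(User $user);

    public function findUserById($id);
}

// A domain service that depends only on the domain's own interface.
class RegisterUser
{
    private $users;

    public function __construct(UserRepository $users)
    {
        $this->users = $users;
    }

    public function register($id)
    {
        $user = new User($id);
        $this->users->add($user);

        return $user;
    }
}

// Infrastructure layer: points "up" at the domain, never the other way around.
namespace Infrastructure;

class ArrayUserRepository implements \Domain\UserRepository
{
    private $users = [];

    public function add(\Domain\User $user)
    {
        $this->users[$user->id] = $user;
    }

    public function findUserById($id)
    {
        return isset($this->users[$id]) ? $this->users[$id] : null;
    }
}

// Wiring: the only place that knows which implementation is in use.
namespace App;

$repository = new \Infrastructure\ArrayUserRepository();
$service = new \Domain\RegisterUser($repository);
$registered = $service->register(42);
$stored = $repository->findUserById(42);
```

Swapping `ArrayUserRepository` for a database-backed class would only change the wiring at the bottom. The domain code would stay exactly as it is.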

The in-memory illusion

If we look at the definition of a Repository from the book Patterns of Enterprise Application Architecture we will find the following quote:

Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.

One of the most important characteristics of The Repository Pattern is the fact that it provides a “collection-like” interface.

This means that you should think of accessing the data in your database in the same way you would think of working with a standard collection object.

We use databases in applications simply because we need some way of keeping the data in persistent storage.

We should therefore think of the data as if it is stored in a collection, rather than letting the terminology of the database creep into our application.

This means instead of having methods such as save(User $user) we should use add(User $user).

Why is this important? Well at the end of the day we’re still going to be using databases for a long time yet. The fact remains though, that the database should not dictate the design or implementation of your application.

By hiding the interaction with the database behind the curtain of a collection-like interface, we move further away from the database-centric application design that has held us back for so long.
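
One way to take the collection metaphor quite literally is sketched below. It assumes PHP 7.1 or later, the `InMemoryUsers` class name and the `contains()` method are my own inventions for the example, and implementing `Countable` and `IteratorAggregate` is an optional extra rather than something the pattern requires:

```php
<?php
// Minimal stand-in for the application's User class (an assumption for this sketch).
class User
{
    public $id;
    public $username;

    public function __construct($id, $username)
    {
        $this->id = $id;
        $this->username = $username;
    }
}

// A repository whose public surface uses collection vocabulary only:
// add, remove, contains, count and iteration. No save(), insert() or query().
class InMemoryUsers implements Countable, IteratorAggregate
{
    private $users = [];

    public function add(User $user): void
    {
        $this->users[$user->id] = $user;
    }

    public function remove(User $user): void
    {
        unset($this->users[$user->id]);
    }

    public function contains(User $user): bool
    {
        return isset($this->users[$user->id]);
    }

    // Lets callers write count($users)
    public function count(): int
    {
        return count($this->users);
    }

    // Lets callers write foreach ($users as $user)
    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->users);
    }
}

$users = new InMemoryUsers();
$alice = new User(1, 'alice');
$users->add($alice);
$users->add(new User(2, 'bob'));
$users->remove($alice);

$usernames = [];
foreach ($users as $user) {
    $usernames[] = $user->username;
}
```

The calling code at the bottom reads as if it were working with an ordinary collection, and nothing in it would have to change if the array were replaced by a real database behind the same methods.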

Writing the first repository

A couple of weeks ago we looked at The Specification Pattern.

The Specification Pattern is a way of encapsulating business rules around the selection of objects within an application. In order to “select” those objects, we need a way of querying the database.

In that previous tutorial we injected an instance of the UserRepository interface into the Specification object.

This is a good example of not allowing the database to hold up progress of the really important bits of the application. We can simply inject an instance of the interface and worry about the database later.

Today we will look at the first tentative steps at writing the UserRepository interface.

Create a new file under Cribbb\Domain\Model\Identity called UserRepository.php:

<?php namespace Cribbb\Domain\Model\Identity;

interface UserRepository
{
}

Note: I’ve renamed the Users namespace to Identity so the purpose of the code is more explicit. All the code from the last couple of weeks is still there.

The first two methods I will add will be used to find a user by their email address or username. These two methods were required by the Specification object:

/**
 * Find a user by their email address
 *
 * @param Email $email
 * @return User
 */
public function userOfEmail(Email $email);

/**
 * Find a user by their username
 *
 * @param Username $username
 * @return User
 */
public function userOfUsername(Username $username);

The next method I will create will be for adding a new user to the application. I’m going to need this method when I write the code to register a new user.

As I mentioned above, we should think about the Repository as if it were an in-memory collection, rather than a gateway to a database:

/**
 * Add a new User
 *
 * @param User $user
 * @return void
 */
public function add(User $user);

As you can see I’m requiring that an instance of User is passed in. The Repository is responsible for storing and retrieving objects. The Repository is not responsible for taking a raw array of data attributes from the request and creating the User object internally.

And finally I will add a method to return the next identity to use. If you remember back to last week you will know that I’m going to be using UUIDs instead of auto-incrementing ids.

When you add an item to a collection, it is the collection that is responsible for providing the next identity to be used. Whilst in this case it is not the collection itself that is generating the id, we should conceptually follow the same principle:

/**
 * Return the next identity
 *
 * @return UserId
 */
public function nextIdentity();
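
To show how these four methods fit together, here is a sketch of an in-memory implementation, assuming PHP 7.0 or later. The `UserId`, `Email`, `Username` and `User` classes are stripped-down, hypothetical stand-ins for the real Cribbb domain objects, including their `toString()` and getter methods, and the hand-rolled random UUID is just one possible way of producing the next identity:

```php
<?php
// Hypothetical, stripped-down value objects; the real Cribbb classes are richer.
final class UserId
{
    private $value;
    public function __construct(string $value) { $this->value = $value; }
    public function toString(): string { return $this->value; }
}

final class Email
{
    private $value;
    public function __construct(string $value) { $this->value = $value; }
    public function toString(): string { return $this->value; }
}

final class Username
{
    private $value;
    public function __construct(string $value) { $this->value = $value; }
    public function toString(): string { return $this->value; }
}

final class User
{
    private $id;
    private $email;
    private $username;

    public function __construct(UserId $id, Email $email, Username $username)
    {
        $this->id = $id;
        $this->email = $email;
        $this->username = $username;
    }

    public function id(): UserId { return $this->id; }
    public function email(): Email { return $this->email; }
    public function username(): Username { return $this->username; }
}

interface UserRepository
{
    public function userOfEmail(Email $email);
    public function userOfUsername(Username $username);
    public function add(User $user);
    public function nextIdentity();
}

class InMemoryUserRepository implements UserRepository
{
    private $users = [];

    public function userOfEmail(Email $email)
    {
        foreach ($this->users as $user) {
            if ($user->email()->toString() === $email->toString()) {
                return $user;
            }
        }

        return null;
    }

    public function userOfUsername(Username $username)
    {
        foreach ($this->users as $user) {
            if ($user->username()->toString() === $username->toString()) {
                return $user;
            }
        }

        return null;
    }

    public function add(User $user)
    {
        $this->users[$user->id()->toString()] = $user;
    }

    public function nextIdentity()
    {
        // Build a random (version 4) UUID by hand from 16 random bytes.
        $bytes = random_bytes(16);
        $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40); // set the version to 4
        $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80); // set the RFC 4122 variant

        // 32 hex characters split into 8 groups of 4, joined as 8-4-4-4-12.
        return new UserId(vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4)));
    }
}
```

The calling code asks the repository for an identity first, builds the User with it, and only then calls `add()`, so the User object has its id before it is ever stored.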

You will have noticed that I’m creating the UserRepository right in the heart of the Identity portion of the application’s domain.

The UserRepository is very much part of the application’s business logic. However, the real implementations of the Repository are concerns of the infrastructure.

Therefore when we come to actually writing the implementation for the UserRepository we will house that file under the infrastructure namespace:

<?php namespace Cribbb\Infrastructure\Repositories;

use Doctrine\ORM\EntityRepository;
use Cribbb\Domain\Model\Identity\UserRepository;

class UserDoctrineORMRepository extends EntityRepository implements
    UserRepository
{
}

Conclusion

Repositories are important not only on a technical level, but also in how we conceptually think about the different layers of the application.

Using Repositories in our applications has a number of benefits.

Firstly, they prevent you from getting bogged down with the technical details of the infrastructure of the project.

Secondly, they make it much easier to test the various components of the application that interact with the database.

Thirdly, they provide a one-way dependency so the lines between layers are not blurred.

And finally, they provide the illusion of an in-memory collection so that the terminology of the persistent storage does not creep into the language of our applications.

For me, working with repositories just makes the data storage aspect of building web applications a whole lot easier!

This is a series of posts on building an entire Open Source application called Cribbb. All of the tutorials will be free to web, and all of the code is available on GitHub.