This article seeks to enhance your grasp of "Repository," a central idea in Domain-Driven Design (DDD). We will delve into the essential functions and significance of a repository and demonstrate an example of its implementation in the Go programming language.
- This article assumes an application that reads (and updates) data in a relational database.
- The example code is written in Go.
First, let's consider what a repository means.
Repository Pattern:
A repository organizes the data fetched from the database into a structured format and offers an interface for accessing domain objects.
This matches the common concept of what a repository is. Now, let's explore a more detailed definition within the framework of DDD.
Take a look at the DDD Reference. This resource is based on Eric Evans’s book on DDD and summarizes its concepts. As stated on page 17:
Query access to aggregates expressed in the ubiquitous language.
The widespread use of associations solely for locating information can complicate the model. In well-developed models, queries typically represent domain concepts. Nevertheless, queries can lead to issues. The immense technical difficulty of implementing most database access frameworks soon overwhelms the client code. This results in developers oversimplifying the domain layer, rendering the model insignificant.
A query framework can handle much of the technical intricacies, allowing developers to extract the precise data they require from the database more automatically or declaratively. However, this approach addresses only a portion of the issue.
Unrestricted queries can extract particular fields from objects, violating encapsulation, or create specific instances from within an aggregate. This sidesteps the aggregate root, preventing these objects from enforcing the domain model rules. As a result, domain logic shifts into queries and application layer code, reducing entities and value objects to mere data holders.
A repository's function is to offer access to aggregates. Merely "executing SQL to fill a structure with data and return it" only scratches the surface of what a repository does.
A repository is always paired with an aggregate. As such, a solid grasp of aggregates is crucial for effectively implementing Domain-Driven Design (DDD).
Let's delve into the concept of "aggregate." What exactly is an aggregate? If you turn to page 16 in the DDD Reference, right before the repository section, you'll find a detailed explanation of aggregates.
Group related entities and value objects into aggregates, clearly defining their boundaries. Select one entity to serve as the root of each aggregate and permit external objects to reference only this root entity (references to internal components should be limited to single operations). Establish properties and invariants for the entire aggregate, assigning the responsibility for enforcing these rules to the root entity or an appropriate framework mechanism.
Utilize consistent aggregate boundaries to manage transactions and distribution.
Within an aggregate boundary, ensure that consistency rules are enforced in real-time. When updating across different boundaries, manage these updates asynchronously.
Keep a single aggregate on one server and spread out different aggregates across various nodes.
An entity, as described in the DDD Reference, is a domain object distinguished by a unique ID. A domain model contains both entities and value objects; the key difference is that a value object has no ID.
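The distinction can be sketched in a few lines of Go. This is a minimal illustration, not code from the article's project: the `CartItemID` type, the `SameEntity` method, and the `Money` value object are hypothetical names, and a plain string stands in for the UUIDs used elsewhere in this article.

```go
package main

import "fmt"

// CartItemID identifies a cart item entity. A plain string keeps this
// sketch dependency-free; the article's examples use uuid.UUID.
type CartItemID string

// CartItem is an entity: its identity is its ID, not its attribute values.
type CartItem struct {
	ID       CartItemID
	Quantity int
}

// SameEntity reports whether two cart items are the same entity.
func (c CartItem) SameEntity(other CartItem) bool {
	return c.ID == other.ID
}

// Money is a value object: it has no ID and is defined only by its values.
type Money struct {
	Amount   int64 // minor units, e.g. cents
	Currency string
}

func main() {
	a := CartItem{ID: "item-1", Quantity: 1}
	b := CartItem{ID: "item-1", Quantity: 3}
	fmt.Println(a.SameEntity(b)) // same entity even though Quantity differs

	x := Money{Amount: 500, Currency: "USD"}
	y := Money{Amount: 500, Currency: "USD"}
	fmt.Println(x == y) // value objects with equal values are interchangeable
}
```

Entities are compared by ID, so two snapshots of the same cart item remain "the same item" even after its quantity changes, while value objects are compared field by field.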
The word "model" has a long history in application development, and its meaning varies with the speaker and the context. Historically, in web application development, "model" referred to the M in the MVC (Model-View-Controller) architecture.
Many of you might be familiar with a well-known pattern referred to as "Active Record." This design pattern is commonly used by Ruby on Rails and ORM libraries. In an Active Record structure, a "model" is utilized to store data for a single entry in a table.
In Ruby on Rails, the classes that use Active Record are called "models" and live in the models/ directory. Due to Rails' influence, the term "model" in application development typically denotes an object that holds the data of a single table row.
An Aggregate in Domain-Driven Design (DDD) is distinct from what Ruby on Rails refers to as a “model”.
As previously discussed, an aggregate in Domain-Driven Design (DDD) can contain the root entity as well as additional entities. Consequently, an aggregate does not map to a single table; it corresponds to n tables (n ≥ 1). In other words, an aggregate in DDD handles more than a single table record.
An aggregate serves as a domain model. Even if it's made up of just one entity, you should still treat it as a type of aggregate. A repository offers an interface to manage aggregates.
We understand that an aggregate consists of multiple entities. But the question remains: how do we determine which entities should form part of an aggregate?
The DDD Reference states:
An aggregate is a unit that establishes the attributes and invariants, and it's responsible for maintaining them.
Put simply, an aggregate refers to a component that characterizes the features of a domain model and ensures that the consistent rules required by that domain model are upheld.
Invariants are the relationships between data that must always hold. This is what we mean by data consistency. To preserve invariants, the updates they govern need to happen within a single transaction. This is the fundamental idea behind an aggregate. In summary:
Entities Composing an Aggregate: These are groups made up of entities and value objects that need to be managed as a single unit to ensure data consistency (invariants) in the database. One entity in the group acts as the root of the aggregate. Other aggregates can reference only this root entity, rather than the internal entities of the aggregate.
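As a rough sketch of what "the root maintains the invariants" can look like in Go, consider the following. The invariant itself (at most 10 units per product) and all names here are assumptions made up for illustration; the article does not specify any concrete rule.

```go
package main

import (
	"errors"
	"fmt"
)

// maxQuantityPerProduct is a hypothetical invariant chosen for this sketch.
const maxQuantityPerProduct = 10

var (
	ErrInvalidQuantity = errors.New("quantity must be positive")
	ErrQuantityLimit   = errors.New("quantity limit per product exceeded")
)

// cartItem is an internal entity. It is unexported, so code outside the
// package can only reach it through the ShoppingCart root.
type cartItem struct {
	productID string
	quantity  int
}

// ShoppingCart is the aggregate root and the only entry point for changes.
type ShoppingCart struct {
	items map[string]*cartItem
}

func NewShoppingCart() *ShoppingCart {
	return &ShoppingCart{items: make(map[string]*cartItem)}
}

// AddItem checks the invariants before changing any state, so a rejected
// call leaves the aggregate untouched.
func (c *ShoppingCart) AddItem(productID string, quantity int) error {
	if quantity <= 0 {
		return ErrInvalidQuantity
	}
	current := 0
	if item, ok := c.items[productID]; ok {
		current = item.quantity
	}
	if current+quantity > maxQuantityPerProduct {
		return ErrQuantityLimit
	}
	c.items[productID] = &cartItem{productID: productID, quantity: current + quantity}
	return nil
}

// Quantity returns the quantity held for a product (0 if absent).
func (c *ShoppingCart) Quantity(productID string) int {
	if item, ok := c.items[productID]; ok {
		return item.quantity
	}
	return 0
}

func main() {
	cart := NewShoppingCart()
	fmt.Println(cart.AddItem("p-1", 8)) // accepted
	fmt.Println(cart.AddItem("p-1", 5)) // rejected: would exceed the limit
	fmt.Println(cart.Quantity("p-1"))   // still 8
}
```

Because every change goes through the root, the rule lives in exactly one place, and the whole aggregate can be saved in one transaction after `AddItem` succeeds.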
There isn’t a one-size-fits-all solution for defining the boundaries of an aggregate. Establishing them requires careful consideration of factors such as database design, scalability, and domain knowledge. The task becomes even more demanding in complex domains such as inventory management systems.
Rule of Thumb: Minimize the boundaries of an aggregate as much as you can. This helps to decrease the size of database transactions and consequently lowers the technical debt of the domain model.
How can we maintain small aggregate boundaries? To achieve this, we need to think about it during the table design phase.
The boundaries set for an aggregate will serve as the limits for asynchronous processing as the service or product expands in the future. It's important to determine these boundaries carefully to prevent data inconsistency, even when transactions are divided. Occasionally, this necessitates splitting tables in ways that might not be initially anticipated.
When aggregates expand in size, the quantity of tables involved in synchronous transactions rises, which can cause performance to suffer and create maintenance challenges. Although it may initially appear that entities need to be updated within a single transaction, a more detailed look might reveal they can be handled separately. It's essential to avoid being limited by preconceived notions.
Note: When dividing transactions at aggregate boundaries, it's crucial to think about the potential for partial transaction failures. Ideally, a single operation in the user interface should not cross multiple aggregate boundaries. Since these boundaries are closely linked with the user interface, it's important to collaborate with product managers and designers from the outset to ensure everyone is on the same page.
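To make the partial-failure concern concrete, here is a small sketch under stated assumptions. The `OrderPlaced` event, the `Stock` aggregate, and the `Outbox` type are all hypothetical, and an in-memory slice stands in for whatever queue or messaging mechanism a real system would use. The point is only that each aggregate is updated in its own step and that a failed step stays visible instead of being silently lost.

```go
package main

import (
	"errors"
	"fmt"
)

// OrderPlaced is a hypothetical event describing a change that was already
// committed inside another aggregate's own transaction.
type OrderPlaced struct {
	ProductID string
	Quantity  int
}

var ErrOutOfStock = errors.New("out of stock")

// Stock is a second aggregate with its own boundary and its own invariant:
// the available quantity never goes negative.
type Stock struct {
	available map[string]int
}

func (s *Stock) Reserve(productID string, quantity int) error {
	if s.available[productID] < quantity {
		return ErrOutOfStock
	}
	s.available[productID] -= quantity
	return nil
}

// Outbox holds events that still have to be applied to other aggregates.
type Outbox struct {
	pending []OrderPlaced
}

func (o *Outbox) Publish(e OrderPlaced) {
	o.pending = append(o.pending, e)
}

// Drain applies each pending event in its own step. Events that fail stay
// in the outbox so they can be retried or compensated later.
func (o *Outbox) Drain(stock *Stock) {
	var failed []OrderPlaced
	for _, e := range o.pending {
		if err := stock.Reserve(e.ProductID, e.Quantity); err != nil {
			failed = append(failed, e)
		}
	}
	o.pending = failed
}

func main() {
	stock := &Stock{available: map[string]int{"p-1": 3}}
	outbox := &Outbox{}

	// The first aggregate has committed and recorded two events.
	outbox.Publish(OrderPlaced{ProductID: "p-1", Quantity: 2})
	outbox.Publish(OrderPlaced{ProductID: "p-1", Quantity: 2})

	// Later, a worker applies them to the Stock aggregate one at a time.
	outbox.Drain(stock)
	fmt.Println(stock.available["p-1"], len(outbox.pending))
}
```

In this run the first event succeeds and the second one fails, so the system is temporarily in a state that a single transaction would never produce. That is the situation the note above asks you to design for together with product managers and designers.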
Let's take a look at the shopping cart interface for an e-commerce website. Imagine there are tables representing both the shopping cart and the individual cart items. Use case X in the application layer works with the data from these tables (for instance, adding an item to the shopping cart).
ER Diagram
For a non-DDD approach, the interface would be designed like this, maintaining a direct relationship where each table corresponds to a specific repository.
Repository Interface: We set up repositories for both the ShoppingCart and CartItem tables.
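package repository

import (
	"github.com/google/uuid" // assumed UUID library; the article does not name one
	// plus the project's own model package
)

// ShoppingCart gives access to rows of the shopping cart table.
type ShoppingCart interface {
	GetByUserID(uuid.UUID) (*model.ShoppingCart, error)
	Insert(*model.ShoppingCart) error
	Update(*model.ShoppingCart) error
}

// CartItem gives access to rows of the cart item table.
type CartItem interface {
	GetByShoppingCartID(id uuid.UUID) ([]*model.CartItem, error)
	Insert(*model.CartItem) error
	Update(*model.CartItem) error
}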
Model Definition:
Models are defined as structs that each represent a single record of a table. Because they expose only fields and no behavior behind an interface, they cannot encapsulate domain logic.
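package model

import "github.com/google/uuid" // assumed UUID library

// ShoppingCart mirrors one row of the shopping cart table.
type ShoppingCart struct {
	ID     uuid.UUID
	UserID uuid.UUID
	Status ShoppingCartStatus // same package, so no model. qualifier
}

// CartItem mirrors one row of the cart item table.
type CartItem struct {
	ID        uuid.UUID
	ProductID string
	Quantity  int
}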
Potential Issues:
This approach has the following potential issues:
- Models become mere containers for table records.
- Domain logic ends up implemented in use case X.
- The application layer becomes responsible for maintaining aggregate invariants.
- Use case X must decide whether each database write is an insert or an update.
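The issues listed above can be seen in a sketch of use case X written against table-level repositories. Everything here is illustrative: the in-memory `cartTable` and `itemTable` types stand in for real repositories, string IDs replace UUIDs, and the "quantity must be positive" rule is an assumed example of domain logic.

```go
package main

import (
	"errors"
	"fmt"
)

// Row structs mirror one table record each and carry no behavior.
type ShoppingCartRow struct {
	ID     string
	UserID string
}

type CartItemRow struct {
	ShoppingCartID string
	ProductID      string
	Quantity       int
}

var errNotFound = errors.New("not found")

// cartTable and itemTable are in-memory stand-ins for the two
// table-level repositories.
type cartTable struct{ byUser map[string]*ShoppingCartRow }

type itemTable struct{ byCart map[string][]*CartItemRow }

func (t *cartTable) GetByUserID(userID string) (*ShoppingCartRow, error) {
	if c, ok := t.byUser[userID]; ok {
		return c, nil
	}
	return nil, errNotFound
}

func (t *cartTable) Insert(c *ShoppingCartRow) error {
	t.byUser[c.UserID] = c
	return nil
}

func (t *itemTable) GetByShoppingCartID(id string) ([]*CartItemRow, error) {
	return t.byCart[id], nil
}

func (t *itemTable) Insert(i *CartItemRow) error {
	t.byCart[i.ShoppingCartID] = append(t.byCart[i.ShoppingCartID], i)
	return nil
}

// Update is a no-op here because rows are shared pointers; a real
// implementation would issue an UPDATE statement.
func (t *itemTable) Update(i *CartItemRow) error { return nil }

// AddItemUseCase plays the role of use case X.
func AddItemUseCase(carts *cartTable, items *itemTable, userID, productID string, qty int) error {
	// A domain rule ends up in the application layer.
	if qty <= 0 {
		return errors.New("quantity must be positive")
	}
	// Insert-or-update decision for the cart table.
	cart, err := carts.GetByUserID(userID)
	if errors.Is(err, errNotFound) {
		cart = &ShoppingCartRow{ID: "cart-" + userID, UserID: userID}
		if err := carts.Insert(cart); err != nil {
			return err
		}
	} else if err != nil {
		return err
	}
	// Insert-or-update decision for the item table.
	existing, err := items.GetByShoppingCartID(cart.ID)
	if err != nil {
		return err
	}
	for _, it := range existing {
		if it.ProductID == productID {
			it.Quantity += qty
			return items.Update(it)
		}
	}
	return items.Insert(&CartItemRow{ShoppingCartID: cart.ID, ProductID: productID, Quantity: qty})
}

func main() {
	carts := &cartTable{byUser: map[string]*ShoppingCartRow{}}
	items := &itemTable{byCart: map[string][]*CartItemRow{}}
	fmt.Println(AddItemUseCase(carts, items, "u-1", "p-1", 2))
	fmt.Println(AddItemUseCase(carts, items, "u-1", "p-1", 1))
	fmt.Println(items.byCart["cart-u-1"][0].Quantity)
}
```

The use case works, but it knows about both tables, decides between insert and update twice, and owns the validation rule. Any other use case touching the cart would have to repeat the same knowledge.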
Now, let's create a shopping cart repository while adhering to DDD principles.
The ShoppingCart entity is the root of the aggregate, and the CartItem entity is contained within the ShoppingCart aggregate.
Begin by defining a repository that corresponds to the ShoppingCart aggregate. In contrast to the earlier example, there is no CartItem repository, because repositories are created per aggregate rather than per table. The ShoppingCart's AddItem method is an example of domain logic used in use case X.
Class Diagram:
Repository Interface:
The repository handles reading and updating aggregates, so callers typically do not need to distinguish between Insert and Update. That decision is handled by the repository’s Save method.
package domain

// ShoppingCartRepository loads and stores whole ShoppingCart aggregates.
type ShoppingCartRepository interface {
	GetByUserID(uuid.UUID) (ShoppingCart, error)
	Save(ShoppingCart) error
}
Aggregate Model Definition:
The ShoppingCart entity is the root of the aggregate. Aggregates encapsulate domain logic, so they are represented as interfaces rather than structs and live in the domain package. Below is the aggregate interface definition:
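package domain

// ShoppingCart is the aggregate root. Callers see behavior, not fields.
type ShoppingCart interface {
	ID() uuid.UUID
	// AddItem is the domain logic used by use case X.
	AddItem(product Product, quantity int) error
}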
Improved Points:
- The repository has the capability to completely manage the complexities of the database, including decisions related to insertion and updates.
- There's no necessity to set up a repository for every table, which minimizes the repetition of SQL code.
- By grouping related models into an aggregate, we can encapsulate the domain logic.
- The application layer no longer deals with domain logic and database complexity.
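With an aggregate-level repository, use case X can shrink to "load, call the domain method, save". The following is a minimal sketch of that shape rather than the article's actual implementation: string IDs replace UUIDs, `AddItem` takes a product ID instead of a `Product`, `Quantity` is an extra accessor added for the demo, and `memoryRepo` is a hypothetical in-memory stand-in for a database-backed repository.

```go
package main

import (
	"errors"
	"fmt"
)

// ShoppingCart is a simplified version of the aggregate interface.
type ShoppingCart interface {
	AddItem(productID string, quantity int) error
	Quantity(productID string) int
}

// ShoppingCartRepository has the same shape as the article's interface.
type ShoppingCartRepository interface {
	GetByUserID(userID string) (ShoppingCart, error)
	Save(cart ShoppingCart) error
}

// cart is the unexported aggregate implementation.
type cart struct {
	userID string
	items  map[string]int
}

// AddItem holds the domain rule that previously lived in the use case.
func (c *cart) AddItem(productID string, quantity int) error {
	if quantity <= 0 {
		return errors.New("quantity must be positive")
	}
	c.items[productID] += quantity
	return nil
}

func (c *cart) Quantity(productID string) int {
	return c.items[productID]
}

// memoryRepo is an in-memory stand-in for a real repository.
type memoryRepo struct {
	carts map[string]*cart
}

func (r *memoryRepo) GetByUserID(userID string) (ShoppingCart, error) {
	if c, ok := r.carts[userID]; ok {
		return c, nil
	}
	return &cart{userID: userID, items: map[string]int{}}, nil
}

func (r *memoryRepo) Save(sc ShoppingCart) error {
	c, ok := sc.(*cart)
	if !ok {
		return errors.New("unsupported ShoppingCart implementation")
	}
	r.carts[c.userID] = c
	return nil
}

// AddItemToCart is use case X: load the aggregate, call domain logic, save.
func AddItemToCart(repo ShoppingCartRepository, userID, productID string, quantity int) error {
	sc, err := repo.GetByUserID(userID)
	if err != nil {
		return err
	}
	if err := sc.AddItem(productID, quantity); err != nil {
		return err
	}
	return repo.Save(sc)
}

func main() {
	repo := &memoryRepo{carts: map[string]*cart{}}
	fmt.Println(AddItemToCart(repo, "u-1", "p-1", 2))
	fmt.Println(AddItemToCart(repo, "u-1", "p-1", 1))
	sc, _ := repo.GetByUserID("u-1")
	fmt.Println(sc.Quantity("p-1"))
}
```

The use case no longer mentions tables, inserts, or updates, and the validation rule sits inside the aggregate where other use cases can reuse it.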
Embracing DDD makes the roles of domain logic, database access, and the application layer more clear. This refinement lowers application complexity and eases cognitive strain, significantly boosting the pace of development in the long run.
Is It Necessary to Separate Aggregate and Repository Implementations?
Where should the implementations of the model and repository interfaces be located? While there's no absolute answer, I personally think it's better to keep the implementations of repositories and models within the same package.
Reasons for Defining Repository and Aggregate Model Implementations in the Same Package:
The purpose of the repository is to hide the complexity of accessing the database and of creating and saving aggregates. To build an aggregate, the repository must set the attributes the aggregate needs. Because some of those attributes are not exposed through the interface, the repository must live in the same package as the aggregate implementation.
For instance, take a look at the repository's Save method. This implementation encapsulates the decision of whether to insert a new record or update an existing one in the database.
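package shoppingcart

import (
	"errors"

	"github.com/google/uuid" // assumed UUID library
)

// repositoryImpl implements domain.ShoppingCartRepository.
type repositoryImpl struct{}

// shoppingCartImpl implements domain.ShoppingCart.
type shoppingCartImpl struct {
	id uuid.UUID // unexported: a field named ID would clash with the ID() method
	// ...other properties
}

// Save hides the insert-or-update decision from callers.
func (r *repositoryImpl) Save(model domain.ShoppingCart) error {
	instance, ok := model.(*shoppingCartImpl)
	if !ok {
		return errors.New("shoppingcart: unsupported ShoppingCart implementation")
	}
	if instance.id == uuid.Nil {
		// A zero-value ID means the cart has not been stored yet.
		return r.insert(instance)
	}
	return r.update(instance)
}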
Next, take a look at how the repository's GetByUserID method is implemented. When no data exists, the repository can return an empty shopping cart. Whether or not the aggregate has been stored, a valid aggregate instance is always returned, which simplifies the repository interface.
package shoppingcart

import (
	"database/sql"
	"errors"

	"github.com/google/uuid" // assumed UUID library
)

func (r *repositoryImpl) GetByUserID(userID uuid.UUID) (domain.ShoppingCart, error) {
	data, err := r.findByUserID(userID)
	// sql.ErrNoRows is what database/sql reports when a query returns no row.
	if errors.Is(err, sql.ErrNoRows) {
		// No stored cart: return an empty shopping cart aggregate.
		return newEmptyShoppingCart(), nil
	}
	if err != nil {
		return nil, err
	}
	// Rebuild the aggregate from the stored shopping cart data.
	return newShoppingCart(data), nil
}
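The two repository methods above can be tried out together in a self-contained sketch. It assumes an in-memory map instead of a SQL database, string IDs instead of UUIDs (an empty string plays the role of the zero-value ID), and hypothetical `inserts` and `updates` counters that exist only to make the hidden decision observable.

```go
package main

import (
	"errors"
	"fmt"
)

// ShoppingCart is a reduced aggregate interface for this sketch.
type ShoppingCart interface {
	ID() string
	UserID() string
}

// shoppingCart is the unexported implementation; only code in this
// package can read or set its fields.
type shoppingCart struct {
	id     string // empty until the cart is first saved
	userID string
}

func (c *shoppingCart) ID() string     { return c.id }
func (c *shoppingCart) UserID() string { return c.userID }

// repository is an in-memory stand-in for a SQL-backed repository.
type repository struct {
	rows    map[string]shoppingCart // keyed by user ID
	nextID  int
	inserts int
	updates int
}

func newRepository() *repository {
	return &repository{rows: make(map[string]shoppingCart)}
}

// GetByUserID always returns a usable aggregate, stored or not.
func (r *repository) GetByUserID(userID string) (ShoppingCart, error) {
	row, ok := r.rows[userID]
	if !ok {
		// No stored data: hand back an empty, not-yet-saved aggregate.
		return &shoppingCart{userID: userID}, nil
	}
	c := row // copy, so callers do not share the stored value
	return &c, nil
}

// Save decides between insert and update by looking at unexported state.
func (r *repository) Save(model ShoppingCart) error {
	c, ok := model.(*shoppingCart)
	if !ok {
		return errors.New("unsupported ShoppingCart implementation")
	}
	if c.id == "" {
		r.nextID++
		c.id = fmt.Sprintf("cart-%d", r.nextID)
		r.inserts++
	} else {
		r.updates++
	}
	r.rows[c.userID] = *c
	return nil
}

func main() {
	repo := newRepository()

	cart, _ := repo.GetByUserID("u-1") // nothing stored yet
	_ = repo.Save(cart)                // first save inserts

	cart, _ = repo.GetByUserID("u-1") // now loaded from the store
	_ = repo.Save(cart)               // second save updates

	fmt.Println(cart.ID(), repo.inserts, repo.updates)
}
```

The caller uses the same two calls in both rounds. Access to the unexported `id` field is what lets `Save` make the decision, which is the practical reason for keeping the repository and the aggregate implementation in one package.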
In his book *A Philosophy of Software Design*, Stanford University professor John Ousterhout introduces the idea of a Deep Module. This concept describes a module with a simple, narrow interface on the outside that hides extensive functionality and complexity within. Modules with complicated interfaces but little internal functionality, on the other hand, are called Shallow Modules. Deep Modules are valued for their low cognitive load, high reusability, and ease of comprehension. For instance, Go's net/http package provides a simple interface yet contains the many features and details needed to build an HTTP server, which makes it easy to use. A computer operating system's file system is another example of a Deep Module. In recent years, the concept of the Deep Module has gained significant support within the IT industry.
Source: Depth of module
When a model uses an interface, you can't directly access or modify the struct's values that represent the actual entity from outside the model. Nonetheless, creating interface methods for every attribute of the model is quite a hassle and can be inconvenient to manage.
Solution: Create a value object to hold attribute values.
Create a struct that can store the public characteristics of the domain model. Utilize this struct to retrieve and modify the model’s attributes.
Here’s a sample of a ShoppingCart model (interface) together with a struct that holds its attributes. Because the readable and updatable attributes are grouped in the struct, there is no need to create a separate interface method for each one.
Here is the interface definition. The Attrs() method gives access to the model’s attributes.
package domain

type (
	ShoppingCart interface {
		ID() uuid.UUID
		UserID() uuid.UUID
		Items() []CartItem
		Attrs() *valueobject.ShoppingCartAttrs
	}
)
The value object that holds the attributes is defined below.
package valueobject

type (
	ShoppingCartAttrs struct {
		Status ShoppingCartStatus
		// Other fields
	}
)
Here is an example of how the ShoppingCart model can be implemented.
package shoppingcart

// shoppingCart implements domain.ShoppingCart (Items() is not shown here).
type shoppingCart struct {
	id      uuid.UUID
	attrs   *valueobject.ShoppingCartAttrs
	userID  uuid.UUID // kept out of attrs, so callers cannot change it
	version int64
}

func (s *shoppingCart) ID() uuid.UUID {
	return s.id
}

func (s *shoppingCart) UserID() uuid.UUID {
	return s.userID
}

func (s *shoppingCart) Attrs() *valueobject.ShoppingCartAttrs {
	return s.attrs
}
In this approach, every model gets a struct called XxxAttrs and an Attrs() method (the struct could also be named something like XxxProps or XxxData).
Additionally, attributes that should stay hidden, such as immutable values, can be kept out of the attribute struct and defined directly in the model struct. For instance, in the example above, the userID of ShoppingCart cannot be altered externally.
Points:
- You can effortlessly add attributes by simply modifying the struct.
- You can manage how attributes are seen by either incorporating them into the attribute struct or by specifying them right within the model struct.
- The attribute container is easy to maintain because it is just a struct.
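From the caller's side, the pattern might be used as in the sketch below. It is a simplified, single-package version of the code above: the status constants and the `newShoppingCart` constructor are hypothetical, and a string replaces the UUID user ID.

```go
package main

import "fmt"

// ShoppingCartStatus and its values are assumptions made for this sketch.
type ShoppingCartStatus string

const (
	StatusActive     ShoppingCartStatus = "active"
	StatusCheckedOut ShoppingCartStatus = "checked_out"
)

// ShoppingCartAttrs groups the attributes callers may read and update.
type ShoppingCartAttrs struct {
	Status ShoppingCartStatus
}

// ShoppingCart exposes one accessor for all public attributes.
type ShoppingCart interface {
	UserID() string
	Attrs() *ShoppingCartAttrs
}

type shoppingCart struct {
	userID string // outside attrs: readable via UserID(), never writable
	attrs  *ShoppingCartAttrs
}

func (s *shoppingCart) UserID() string            { return s.userID }
func (s *shoppingCart) Attrs() *ShoppingCartAttrs { return s.attrs }

// newShoppingCart is a hypothetical constructor for the demo.
func newShoppingCart(userID string) ShoppingCart {
	return &shoppingCart{
		userID: userID,
		attrs:  &ShoppingCartAttrs{Status: StatusActive},
	}
}

func main() {
	cart := newShoppingCart("u-1")

	// Attributes in the struct are read and written through the pointer.
	cart.Attrs().Status = StatusCheckedOut

	fmt.Println(cart.Attrs().Status, cart.UserID())
}
```

One consequence worth keeping in mind: because `Attrs()` returns a pointer, callers can change those fields without passing through any validation. Attributes that take part in an invariant are therefore probably better kept in the model struct and changed only through domain methods.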
Applying repositories and aggregates in line with DDD principles offers a way to encapsulate domain logic, which, in turn, boosts the maintainability of applications. Figuring out the boundaries of aggregates demands a thorough examination of both the domain and table design, but it's often considered one of the more rewarding parts of software development. Moving forward, I plan to embrace DDD more frequently to delve into more enjoyable aspects of application development.