Implementing Event Store in C#

What is this all about?

When looking for examples of an Event Store implementation for Event Sourcing, you will find plenty of material that is detailed in theory but light on practical implementation. In this article my goal is to explain how to implement a simple, yet robust, event store, how to test it, and which pitfalls to expect based on my own experience.

The event store presented here is implemented in .NET Core with C#, using MS SQL LocalDB as the database.

The full project containing a working solution is available on my GitHub.

Short introduction to Event Sourcing

I will not go into much detail about what Event Sourcing is since this topic is covered in many books and articles around the web.

Instead, I will focus on what matters when implementing an actual Event Store. Nevertheless, I will quickly summarize the main idea behind Event Sourcing and its main challenges.

Aggregates are treated as a whole, represented by the Aggregate Root. Conceptually, an Aggregate is loaded and saved in its entirety (Evans, 2001).

Event Sourcing is a software architecture pattern which states that the state of the application should be persisted as a sequence of events. The main benefits of Event Sourcing are:

  • Auditability — since state is constructed from a sequence of events, it is possible to extract a detailed log from the very beginning up to the current date.
  • Projections/queries — since we have all the events from the beginning, state can be recreated for any point in time. For example, it is possible to check what a bank account's state was one year ago. This also allows generating queries and reports we never thought of when starting the system, since all the data is in the events.
  • Performance when inserting data — since the event store is an append-only store with optimistic locking, we can expect far fewer deadlocks (or concurrency exceptions). There are also no long-running transactions that insert a graph of data into multiple tables.
  • Flat database structure — we usually use one (or at most two) tables as the event store. In both cases it is a de-normalized form with a weakly serialized field, such as JSON, to store the event payload. This means that adding new fields does not require changing any table: simply adjusting the event and adding the required field saves it into the JSON. This allows much faster development on the write side.

As with every pattern, we must be aware of its limitations and challenges. If used incorrectly, Event Sourcing will probably cause more harm than good. The main challenges to keep in mind are:

  • Storage growth — since the data store is append only, the table grows indefinitely. This can be mitigated using snapshots or retention policies.
  • Replaying events — if the number of events needed to construct an aggregate is large, reconstructing its current state may cause performance issues. This can also be mitigated by using snapshots.
  • Event versioning and event handling — when changing an existing event, or adding/deleting features, the code which projects the old events MUST remain in place, since it is used to reconstruct the state. This means that if a feature is deprecated, its projection code cannot be removed, because it is still needed to reconstruct historical state. This challenge is harder to overcome, but it can be mitigated.

Event Store considerations

Requirements for the event store are the following:

  • It is append only, which means there are no updates, only inserts.
  • It should store the aggregate state and allow fetching the events for a given aggregate in the order they were saved.
  • It should use an optimistic concurrency check: an optimistic concurrency check does not lock at the database level, therefore reducing the risk of deadlocks. Instead, the check is performed when saving.

Optimistic concurrency check

When multiple clients write to the database, it can happen that two or more of them try to modify the same aggregate. Since we don't use a pessimistic concurrency check, there is no lock and no waiting; the check itself is applied when trying to persist the actual data.

To make things clear let us consider an example:

Assume that there are two requests that want to modify the same aggregate. The concurrency check itself is implemented at the database level.

  1. Both of them fetch the current version of the aggregate from the event store, which is 1.
  2. The first request saves successfully, and the aggregate version becomes 2.
  3. The second request also tries to save version 2. Since the data has changed since it was read (version 2 already exists for this aggregate), the concurrency check fails and saving the second aggregate should fail with a concurrency exception.

Example of optimistic check using Aggregate version

Database schema

The Event Store is a single append-only table which tracks a version per aggregate and implements the concurrency check at the database level.

SQL for the Event Store table, with a JSON field for data (this is where the event payload is serialized):

CREATE TABLE [dbo].[EventStore](
    [Id] [uniqueidentifier] NOT NULL,
    [CreatedAt] [datetime2] NOT NULL,
    [Sequence] [int] IDENTITY(1,1) NOT NULL,
    [Version] [int] NOT NULL,
    [Name] [nvarchar](250) NOT NULL,
    [AggregateId] [nvarchar](250) NOT NULL,
    [Data] [nvarchar](max) NOT NULL,
    [Aggregate] [nvarchar](250) NOT NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

SQL for EventStore example table

AggregateId and Version are the two fields used for the concurrency check, so we create a unique index over them. AggregateId is the id of our aggregate and can be whatever we want (which is why it is defined as a string). Depending on the domain it can be a GUID, an int, or a combination of the two; it doesn't really matter.

Note that AggregateId is defined as nvarchar(250).

CREATE UNIQUE NONCLUSTERED INDEX [ConcurrencyCheckIndex] ON [dbo].[EventStore]
([Version] ASC, [AggregateId] ASC) WITH (
    PAD_INDEX = OFF, 
    STATISTICS_NORECOMPUTE = OFF, 
    SORT_IN_TEMPDB = OFF, 
    IGNORE_DUP_KEY = OFF, 
    DROP_EXISTING = OFF, ONLINE = OFF, 
    ALLOW_ROW_LOCKS = ON, 
    ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

A unique index is enforced on the Version and AggregateId fields of the table at the database level

Using this, we ensure that the same AggregateId/Version combination is never saved twice. Instead, a unique index violation exception is thrown by the database. This is a transient error, which means a retry mechanism (see the Retry Pattern) can (and should) be implemented on the client side.
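A minimal sketch of such a client-side retry loop, assuming the violation surfaces as a SqlException with error number 2601 or 2627 (unique index/constraint violation); the helper and delegate names are illustrative and this code is not part of the example project:

using System;
using System.Data.SqlClient;
using System.Threading.Tasks;

public static class OptimisticRetry
{
    public static async Task ExecuteAsync(Func<Task> loadAndSave, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                // The delegate should re-load the aggregate and re-apply the command,
                // so each attempt saves against the latest version.
                await loadAndSave();
                return;
            }
            catch (SqlException ex) when ((ex.Number == 2601 || ex.Number == 2627) && attempt < maxAttempts)
            {
                // Another writer saved the same version first; loop and try again.
            }
        }
    }
}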

Example project short introduction

Project is built using .NET Core 3.1.

Project architecture is layered with inversion of control.

Layers:

  • RestAPI — Web API which contains DTO and REST Controller definitions
  • Infrastructure — Factories, database model and repository implementations are defined here
  • Core — contains the business logic as well as the repository interface for the aggregate. This project has no references to any other project or third-party library (except the Tactical DDD NuGet package, which is pure C# code)

Other projects:

  • DbMigration — migration project used to initialize the database
  • EventStoreTests — test project demonstrating integration tests for the event store

For the Core business logic there is only one aggregate, named Person, and two domain events:

  1. PersonCreated — this event is published when person is created
  2. AddressChanged — this event is published when address for given person has changed

Instructions on how to set up and run the project can be found in the readme file of the GitHub repository.

EventStore implementation

Let us take a look at the actual code that implements the event store. I will only include code snippets here; the fully functional project can be found on my GitHub.

Interface for the EventStore can be defined as:

public interface IEventStore
    {
        Task SaveAsync(IEntityId aggregateId, 
            int originatingVersion, 
            IReadOnlyCollection<IDomainEvent> events,
            string aggregateName = "Aggregate Name");

        Task<IReadOnlyCollection<IDomainEvent>> LoadAsync(IEntityId aggregateRootId);
    }

The interface defines two methods.

  • The SaveAsync method is used to persist an aggregate as a stream of events. The aggregate itself is described as a collection of domain events together with a unique name.
  • The LoadAsync method fetches an aggregate from the event store, using the AggregateId as a parameter, and returns it as a collection of events. This collection can be used to reload the aggregate.

IEntityId and IDomainEvent are both imported from the Tactical DDD NuGet package, which I strongly recommend for DDD in C#. Both are simple marker interfaces for the EntityId and DomainEvent classes.

Let us analyze the actual implementation of these two methods:

Persisting Events

For persisting an aggregate into the event store we need three parameters:

  1. AggregateId — the id of the aggregate. In our case it is a class which implements the IEntityId interface.
  2. OriginatingVersion — the version of the aggregate that is being saved. Once saved, the version is incremented by one. As explained, this is used in the optimistic concurrency check.
  3. IReadOnlyCollection<IDomainEvent> — the actual list of events that need to be persisted to the database. Each event is persisted as a new row.

Implementing SaveAsync

The full implementation of this method can be found in the EventStoreRepository.cs file.

First, the insert query is built from the provided parameters. For this we use the micro-ORM Dapper, which maps parameters using the @ notation.

var query = $@"INSERT INTO {EventStoreTableName} ({EventStoreListOfColumnsInsert}) VALUES (@Id,@CreatedAt,@Version,@Name,@AggregateId,@Data,@Aggregate);";
var listOfEvents = events.Select(ev => new
{
  Aggregate = aggregateName,
  ev.CreatedAt,
  Data = JsonConvert.SerializeObject(ev, Formatting.Indented, _jsonSerializerSettings),
  Id = Guid.NewGuid(),
  ev.GetType().Name,
  AggregateId = aggregateId.ToString(),
  Version = ++originatingVersion
});

A parameter name (@Name, for example) is matched with the corresponding object property and mapped. That is why, in the next line, a list of anonymous objects is created with the same property names as the parameters defined in the query.

Properties are:

  • Aggregate — the string name of the aggregate.
  • CreatedAt — the date/time when the event was created.
  • Data — the event payload, serialized as a JSON string. The complete event is serialized into JSON using the provided jsonSerializerSettings.
  • Id — this can be any type of id. For this example I used a Guid.
  • Name — the actual event name.
  • AggregateId — the id of the aggregate. Using this field, events for a given aggregate can be filtered.
  • Version — incremented for each event of a given aggregate. Used in the optimistic concurrency check.

List of events is then mapped to the actual query using:

await connection.ExecuteAsync(query, listOfEvents);

This line uses the ExecuteAsync method from Dapper, which maps the properties of listOfEvents to the parameters defined in the query string and creates the actual queries.

When persisted this way, each event becomes a new row in the EventStore table, with the Data column holding the payload of the actual event. Here is how it looks:

Each event is saved as a new row in the database. The version changes per aggregate, while the sequence is always incremental

When inspecting the Data column, this is the payload:

{
  "$type": "Core.Person.DomainEvents.PersonCreated, Core",
  "PersonId": "d91f903f-3fb1-4b68-9a59-c1818c94f104",
  "FirstName": "damir6",
  "LastName": "bolic7",
  "CreatedAt": "2020-02-20T07:24:54.0490305Z"
}
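The $type property in this payload is produced by the Newtonsoft.Json settings passed to SerializeObject. A minimal sketch of settings that would write it (the example project's actual _jsonSerializerSettings may differ):

using Newtonsoft.Json;

// Embeds the concrete CLR type as "$type" so the payload can later be
// deserialized back into its original event class.
var jsonSerializerSettings = new JsonSerializerSettings
{
    TypeNameHandling = TypeNameHandling.All
};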

The payload of this event is mapped from the PersonCreated event, which is emitted when a new person is created:

public class PersonCreated : DomainEvent
    {
        public string PersonId { get; }
        public string FirstName { get; }
        public string LastName { get; }

        public PersonCreated(
            string personId, 
            string firstName, 
            string lastName)
        {
            PersonId = personId;
            FirstName = firstName;
            LastName = lastName;
        }
    }

This domain event is published when a new Person aggregate is created

DomainEvent class can be defined as follows:

public class DomainEvent : IDomainEvent
    {

        public DomainEvent()
        {
            CreatedAt = DateTime.UtcNow;
        }

        public DateTime CreatedAt { get; set; }
    }

Basically, CreatedAt is added on top of the IDomainEvent interface from the Tactical DDD NuGet package.

Loading aggregate

An aggregate is loaded using its AggregateId. All events for the given aggregate are loaded, and the aggregate is then constructed by replaying those events, which results in a new aggregate object in memory.

The aggregate is loaded as a list of events via a SQL query in the EventStoreRepository.LoadAsync() method. The real magic happens when the events are deserialized from JSON and converted to DomainEvent objects:

var events = (await connection.QueryAsync<EventStoreDao>(
        query.ToString(),
        aggregateRootId != null ? new { AggregateId = aggregateRootId.ToString() } : null))
    .ToList();

var domainEvents = events.Select(TransformEvent).Where(x => x != null).ToList().AsReadOnly();
return domainEvents;

Selecting all events for the aggregate and transforming them to domain events

As we can see, the events are fetched as a list of EventStoreDao objects, which are then converted to domain events using the TransformEvent method:

private IDomainEvent TransformEvent(EventStoreDao eventSelected)
{
    var o = JsonConvert.DeserializeObject(eventSelected.Data, _jsonSerializerSettings);
    var evt = o as IDomainEvent;
    return evt;
}

Here, the actual payload of the event, eventSelected.Data, is deserialized into an object which is then cast to the IDomainEvent interface. Note that, should this conversion fail, it returns null.

Once the list of domain events is fetched, the Person aggregate can be constructed.
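As a rough sketch of what replaying those events could look like (the example project's actual rehydration code may differ, and the Address property on AddressChanged is an assumption):

using System.Collections.Generic;

public class Person
{
    public string PersonId { get; private set; }
    public string FirstName { get; private set; }
    public string LastName { get; private set; }
    public string Address { get; private set; }

    // Rebuild the aggregate by applying the events in the order they were saved.
    public static Person FromEvents(IReadOnlyCollection<IDomainEvent> events)
    {
        var person = new Person();
        foreach (var domainEvent in events)
            person.Apply(domainEvent);
        return person;
    }

    private void Apply(IDomainEvent domainEvent)
    {
        switch (domainEvent)
        {
            case PersonCreated e:
                PersonId = e.PersonId;
                FirstName = e.FirstName;
                LastName = e.LastName;
                break;
            case AddressChanged e:
                Address = e.Address;   // assumes AddressChanged carries an Address property
                break;
        }
    }
}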

Testing

Testing of the event store is not hard.

For unit testing, the IEventStore interface can be mocked.

For integration tests, an in-memory database can be used. In the example project, LocalDB is used both for testing and in the actual application. The tests are located in the EventStoreIntegrationTests.cs file.
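As a rough, xUnit-style sketch of what such a test could look like (PersonId and CreateEventStore are illustrative placeholders, not names taken from the project):

using System;
using System.Linq;
using System.Threading.Tasks;
using Xunit;

// Illustrative aggregate id; the article describes IEntityId as a simple marker interface.
public class PersonId : IEntityId
{
    private readonly string _value;
    public PersonId(string value) => _value = value;
    public override string ToString() => _value;
}

public class EventStoreIntegrationTestsSketch
{
    [Fact]
    public async Task Saved_events_can_be_loaded_back()
    {
        IEventStore eventStore = CreateEventStore();             // wire up EventStoreRepository + LocalDB here
        var personId = new PersonId(Guid.NewGuid().ToString());

        var created = new PersonCreated(personId.ToString(), "John", "Doe");
        await eventStore.SaveAsync(personId, 0, new[] { created });

        var loaded = await eventStore.LoadAsync(personId);

        Assert.Single(loaded);
        Assert.IsType<PersonCreated>(loaded.First());
    }

    private static IEventStore CreateEventStore()
    {
        // The concrete repository and its LocalDB connection string depend on the example project.
        throw new NotImplementedException();
    }
}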

Summary

The goal of this blog is to show, using a concrete example, how to implement a simple Event Store in C#. For this we used some of the DDD concepts like Aggregate, Repository, Entity and ValueObject.

The example project included with this blog aims to be a simple demonstration of the principles described here.

Author: Damir Bolić

Life after Event Sourcing

I am not going to talk about implementing Event Sourcing, pros and cons, or when and where to use it. I want to share my personal developer’s perspective and experience gathered from the two projects I worked on. Both were aiming to develop microservices using domain-driven design principles. The first project (let us call it A) had Event Sourcing and the second one (project B) did not. In both cases, a relational database was required to be used for data persistence.

Project A produced a microservice running in production without major issues. After resolving a few challenges with projection mechanisms, we ended up implementing a faithful domain model representing the business processes, domain events holding important domain facts, and a well-established ubiquitous language spoken by the team and the business experts. When the team was assigned to develop microservice B, the same practices were carried over. But then I realized that it would not go as smoothly as before.

Useless Domain Events

When I first heard "Once you get the hang of using Domain Events, you will be addicted and wonder how you survived without them until now", it sounded a bit pretentious and exaggerated, but that is exactly how I have felt since I was introduced to Domain-Driven Design. In my opinion, the greatest power of DDD comes from domain events.

When using event sourcing, everything revolves around domain events. Working on project A could be described as talking, thinking, storming, modeling, projecting, storing, trusting and investigating domain events. On project B, however, we were saving only the latest state of the aggregate and were conveniently deriving the read models from that as well. So, no storing and no domain events needed for projections. To make things worse, nothing needed to subscribe to the domain events the aggregate was publishing. Naturally, we decided not to use the concept of domain events, but as a consequence we had to change the mindset we were used to and continue with a weaker DDD.

Painful refactoring

As domain knowledge gradually expanded, continuous refactoring and adjustment of the domain model were common practices during the development of microservice A. With event sourcing, one can completely remodel the aggregates without having to change anything in the data persisting mechanism. Aggregate properties are just placeholders filled with data loaded from stored facts — domain events. But when one stores only the latest state of aggregates, every change drags along database adjustments, altering or creating new tables, or migrating existing data. Even though we had created suitable data access objects mapped from the aggregate, changes to the database or the mappers were inevitable and often time-consuming.

Less focus on the core domain

It did not take long for the team to feel that our focus had moved from the core domain to infrastructure while working on project B. Even for the slightest change we were spending too much time, the precious time we used to spend discussing and remodeling the business domain. Not to mention pull requests with more files to review and more code to cover with tests.

Wrapping up

After going through these pain points, I concluded that event sourcing is a natural fit and a powerful supporting tool that strengthens domain-driven design. Event sourcing brings easier and faster development, testing and debugging. It helps to focus on what is really important: the facts happening in the system, shaped as domain events. It facilitates adjusting and continuously improving the core domain model. The life of a developer is much easier with event sourcing. But I need to say, that does not mean it should be the primary factor when deciding whether a system should be event sourced or not. But that is another story for another time.

Author: Nusreta Sinanović

Let’s build a house

While staring out of the window of an Airbus A319-100, traveling to Amsterdam for "DDD / Event Sourcing Europe", I had a sudden rush of inspiration and decided to try and scribble a quick blog post.

Here goes nothing…

When talking about software, we hear the term "architecture" mentioned a lot. That inevitably reminds us of buildings and construction. While this may be true to some extent, there are a lot of discrepancies between the approach we take towards architecting a building (a house, for example) and a typical web application software system. Much more so if we employ agile practices while doing so (which I hope we all do).

Software architecture is more about defining and formalizing constraints and gathering quality attributes that we need to satisfy while implementing business requirements.

A good architect tries to defer most of the architectural decisions that don't directly affect the most important quality attributes. These decisions can then be made downstream by the developers in charge of building the actual solution (the architect himself should be one of them).
This indeed means that software architecture is not about making all of the decisions upfront, but rather only the important ones.

Let’s look at a simple example. Let us contrast building a house in the real world versus building a virtual house (a hypothetical software system, might be a simple web application).

Some things of concern (but not limited to) might be:

  1. Land (Infrastructure)
  2. Floor plans (Architecture / technical diagrams)
  3. Foundations (Initial project setup)
  4. Floors and the roof, walls, bulkheads, etc… (Software Layers)
  5. Functional rooms + furniture (Actual features that bring value)

So how do these compare?

Land (Infrastructure)

When building a house we need to purchase (allocate) the entire piece of land that our house will sit on top of. Plus, we need to account for some more if we decide to make some extensions later (a garage maybe, or a garden...).

This implies that most of the time we need to know the size of the land upfront, which in turn means we are required to make a large upfront payment to purchase the land before we even start the actual construction.

In contrast, when building a software system the agile way, we try to avoid these upfront commitments. Why? Because we can!

Why is this the case?

Ask yourself, what is software? If you think hard enough, you will come to the conclusion that software, in essence, is just a pile of text, compiled to machine language (instructions) in order to be executed to (hopefully) perform some useful work for us people.

And what is the nature of "text"?
Well, most importantly, text can be changed, expanded, erased, reorganized, burned and thrown away. It is subject to change.

If you think otherwise, you are living a lie and becoming aware of this asap will do you much, much good.

But even if you choose not to be aware of this, I can tell you one thing for sure: The clients are!

So, going back to our example, how much land (infrastructure) do we allocate for our software system? The answer is, as little as possible, or to put it differently, only as much as we need to.

Staying agile is all about iterations, tight feedback loops and building as little as needed when we need it.

We try to avoid making big upfront commitments and incurring costs that are uncalled for. As a side note, if you’ve ever wondered what cloud computing is all about, now you know (pay only for what you use)…

For the software system in our example (our virtual house), we would allocate or purchase only as many resources as we need for our first iteration, feature or sprint. Nothing more, nothing less, because unlike with land, we can always buy more.

Floor plans

When building a house in the real world, in 99% of the cases we have the whole plan ready upfront. We hire an architect, decide on almost all features (rooms, layout), materials, etc., and again we make another upfront payment, and only then can we start building the foundations of our house.

After this point, pretty much every decision is immutable, and any change to the initial plan is hard to make or will surely incur a lot of extra unplanned cost, because once concrete hardens it is really tough to break.
That's just the nature of it.

We take this for granted because we understand the reasoning behind it and have learned to live with it.

But surprisingly this isn't true for software. Many people still compare building software with building houses, and this couldn't be further from the truth.

As we said, software is different. A good architect does not make all of the decisions up front. Why? Because software is easy to change (and will change, it's just text), so we don't want to limit the legroom that we will most surely need later.

Most importantly, we don’t want to spend any more money than necessary. Will we like that bedroom on the second floor? Will we even need it?!

Everything else

After the foundations are laid down, it's time to start building our floors.

Floor by floor, layer by layer, weeks and months pass while we wait with our fingers crossed and hope for the best.

Finally, when the house is done, final work can begin and only then can we start throwing in the furniture.

A few months later, we finally get to use our house.

I think you can see the pattern by now.

Building a house deals with horizontal slices. Floor by floor, layer by layer, because it's easier to do it this way (and it's cost-effective).

But when building software systems, we deal with vertical slices (or should). What does this mean?

When creating software solutions, we start from the most important thing (the core), build all the vertical slices for it, and then work our way out (horizontally) by adding new features to the core or improving the core itself.

In our virtual house example, this could be anything really, but let’s take a bedroom as an example.

Let’s pretend that we sat with the customer and decided that having a place to sleep in asap would be of the highest priority.

Thus, our main goal should be to deliver a single useful feature for our customer which can be tested out and for which we can get constructive feedback at the earliest moment possible.

I hope you can understand why this can be useful (in contrast to waiting for months for the first feedback).

So, some of our basic goals are:

How would we go about doing this? We do the simplest thing. We simply follow the path of the least resistance. In analogy to the real house, building our virtual house might follow steps similar to these:

We are now ready to call in our client and receive some constructive feedback about it. Hell, he could even spend the night.

Once we have our feedback, the client decides what is the next most important thing for him…

This way we ensure that the customer always gets what he wants, and not what he asked for! Remember what we said about the nature of software: customers have a nose for this and will always end up wanting something other than what they initially specified.

So in a nutshell:

Final words

If you take away anything, let it be this:

Building software is nothing like building a single house/building.

Building software is much more involved than this. To build, evolve and maintain a software system is more similar to city planning than to building a single house. (Let’s face it, it’s never a single house anyway).

Cities, like software, have lives of their own. Cities evolve, grow (or even shrink) in size. New buildings (features) are added all the time, old ones are refurbished or demolished (refactoring).

New water and heating lines are added, removed and replaced on a daily basis. Metro lines are bored under the city.

While doing all of this, you can never shut down a city. You can close down some parts of it but as a whole, it must always remain functional.

This is why building software is neither simple nor easy, but only by embracing change and failure (that might be a topic for another blog post) can we be prepared, and triumph!

Author: Anes Hasičić

Generic repository pattern using Dapper

The Repository pattern can be implemented with Dapper in many ways. In this blog I will explain what the repository pattern is and how to implement it using reflection in C#.

When searching around the web I found various implementations, none of which was satisfactory — mostly due to having to manually enter table and field names, or overly complex implementations with Unit of Work, etc.

A repository similar to the one presented here is in use in a production CQRS/ES system and works quite well.

Even though the use case depicted here is quite specific, I think this repository implementation can be applied to most relational database systems with minimal refactoring.

What is Repository pattern?

When talking about the Repository pattern, it is important to distinguish between the DDD repository and the generic repository pattern.

A generic repository is a simple contract defined as an interface on a per-object basis. That means that one repository object relates to one table in the database.

The DDD repository pattern works with an aggregate root object and persists it into one or more tables or, if event sourcing is used, as a series of events in an event store. So in this instance the repository is related not to one table but to one aggregate root, which can map to one or more tables. This is a complex process due to the impedance mismatch effect, which is better handled with ORMs, but that is not our use case.

Generic repository UML diagram:

  • The GenericRepository abstract class implements the IGenericRepository interface. All shared functionality is implemented in this class.
  • The ISpecificRepository interface contains the methods required for a specific use case (if any).
  • The SpecificRepository class inherits from the GenericRepository abstract class and implements any specific methods from ISpecificRepository.

Unit Of Work and transaction handling

The Unit of Work pattern implements a single transaction for multiple repository objects, making sure that all INSERT/UPDATE/DELETE statements are executed in order and atomically.

I will not be using Unit of Work, but will rather save through each repository directly using its Update/Insert methods. The reason for this is that these repositories are designed towards a specific use case, detailed below.

All transaction handling is done manually by wrapping multiple repository commands in a .NET transaction (TransactionScope). This gives a lot more flexibility without adding additional complexity.
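A minimal sketch of what that looks like, with illustrative repository and entity names:

using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
{
    // Both writes commit atomically; disposing the scope without Complete() rolls them back.
    await orderRepository.InsertAsync(order);
    await orderLineRepository.SaveRangeAsync(orderLines);

    scope.Complete();
}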

Repository implementation

Let us define interface first:

public interface IGenericRepository<T>
    {
        Task<IEnumerable<T>> GetAllAsync();
        Task DeleteRowAsync(Guid id);
        Task<T> GetAsync(Guid id);
        Task<int> SaveRangeAsync(IEnumerable<T> list);
        Task UpdateAsync(T t);
        Task InsertAsync(T t);
    }

The bootstrap code for the repository class is responsible for creating the SqlConnection object and opening the connection to the database. After that, Dapper uses this connection to execute SQL queries against the database.

public abstract class GenericRepository<T> : IGenericRepository<T> where T: class
    {
        private readonly string _tableName;

        protected GenericRepository(string tableName)
        {
            _tableName = tableName;
        }
        /// <summary>
        /// Generate new connection based on connection string
        /// </summary>
        /// <returns></returns>
        private SqlConnection SqlConnection()
        {
            return new SqlConnection(ConfigurationManager.ConnectionStrings["MainDb"].ConnectionString);
        }

        /// <summary>
        /// Open new connection and return it for use
        /// </summary>
        /// <returns></returns>
        private IDbConnection CreateConnection()
        {
            var conn = SqlConnection();
            conn.Open();
            return conn;
        }

        private IEnumerable<PropertyInfo> GetProperties => typeof(T).GetProperties();

Make sure you have a connection string named MainDb. I am using MSSQL LocalDB, a lightweight MSSQL database shipped with Visual Studio.

<add name="MainDb"
connectionString="Data Source=(localdb)\mssqllocaldb;Integrated Security=true;Initial Catalog=dapper-examples;"
providerName="System.Data.SqlClient"/>

Implementing most of these methods, except for Insert and Update, is quite straightforward using Dapper.

public async Task<IEnumerable<T>> GetAllAsync()
        {
            using (var connection = CreateConnection())
            {
                return await connection.QueryAsync<T>($"SELECT * FROM {_tableName}");
            }
        }

        public async Task DeleteRowAsync(Guid id)
        {
            using (var connection = CreateConnection())
            {
                await connection.ExecuteAsync($"DELETE FROM {_tableName} WHERE Id=@Id", new { Id = id });
            }
        }

        public async Task<T> GetAsync(Guid id)
        {
            using (var connection = CreateConnection())
            {
                var result = await connection.QuerySingleOrDefaultAsync<T>($"SELECT * FROM {_tableName} WHERE Id=@Id", new { Id = id });
                if (result == null)
                    throw new KeyNotFoundException($"{_tableName} with id [{id}] could not be found.");

                return result;
            }
        }

        public async Task<int> SaveRangeAsync(IEnumerable<T> list)
        {
            var inserted = 0;
            var query = GenerateInsertQuery();
            using (var connection = CreateConnection())
            {
                inserted += await connection.ExecuteAsync(query, list);
            }

            return inserted;
        }

 

SaveRangeAsync receives a list of items, saves them to the database and returns the number of items saved. This can be made atomic by wrapping the execution in a transaction.

Implementing Insert and Update queries

Implementing insert and update requires a bit more work. The general idea is to use reflection to extract the field names from the model class and then generate the insert/update query based on those names. The field names are used as parameter names for Dapper, therefore it is important that the DAO class property names match the column names in the actual table.

In both cases the idea is the same: take the object provided as an input parameter and generate a SQL query string with parameters. The only difference is the query generated, INSERT or UPDATE.

Both methods use reflection to extract the property names from the model object. This helper can be made static, since it does not use any instance variables, and also for performance reasons.

private static List<string> GenerateListOfProperties(IEnumerable<PropertyInfo> listOfProperties)
        {
            return (from prop in listOfProperties
                    let attributes = prop.GetCustomAttributes(typeof(DescriptionAttribute), false)
                    where attributes.Length <= 0 || (attributes[0] as DescriptionAttribute)?.Description != "ignore"
                    select prop.Name).ToList();
        }

 

What this does is extract a list of property names into a List<string> using reflection. Properties marked with a Description attribute set to "ignore" are not extracted.
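For example, a hypothetical DAO could exclude a computed property from the generated queries like this (the class below is illustrative, not taken from the example project):

using System;
using System.ComponentModel;

public class User
{
    public Guid Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }

    // Not a table column: the [Description("ignore")] attribute makes
    // GenerateListOfProperties skip it when building INSERT/UPDATE queries.
    [Description("ignore")]
    public string FullName => $"{FirstName} {LastName}";
}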

Once we have this list, we can iterate over it and generate the actual query:

public async Task InsertAsync(T t)
        {
            var insertQuery = GenerateInsertQuery();

            using (var connection = CreateConnection())
            {
                await connection.ExecuteAsync(insertQuery, t);
            }
        }

private string GenerateInsertQuery()
        {
            var insertQuery = new StringBuilder($"INSERT INTO {_tableName} ");
            
            insertQuery.Append("(");

            var properties = GenerateListOfProperties(GetProperties);
            properties.ForEach(prop => { insertQuery.Append($"[{prop}],"); });

            insertQuery
                .Remove(insertQuery.Length - 1, 1)
                .Append(") VALUES (");

            properties.ForEach(prop => { insertQuery.Append($"@{prop},"); });

            insertQuery
                .Remove(insertQuery.Length - 1, 1)
                .Append(")");

            return insertQuery.ToString();
        }
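For a DAO exposing Id, FirstName and LastName properties (like the User object used in the usage example further below) and a table named Users, this method produces a query along the lines of:

INSERT INTO Users ([Id],[FirstName],[LastName]) VALUES (@Id,@FirstName,@LastName)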

 

Update method has some small differences:

public async Task UpdateAsync(T t)
        {
            var updateQuery = GenerateUpdateQuery();

            using (var connection = CreateConnection())
            {
                await connection.ExecuteAsync(updateQuery, t);
            }
        }

private string GenerateUpdateQuery()
        {
            var updateQuery = new StringBuilder($"UPDATE {_tableName} SET ");
            var properties = GenerateListOfProperties(GetProperties);

            properties.ForEach(property =>
            {
                if (!property.Equals("Id"))
                {
                    updateQuery.Append($"{property}=@{property},");
                }
            });

            updateQuery.Remove(updateQuery.Length - 1, 1); //remove last comma
            updateQuery.Append(" WHERE Id=@Id");

            return updateQuery.ToString();
        }

 

An additional thing we need to take care of here is what happens if the record to update is not found. There are a couple of solutions for this; some involve throwing an exception, others returning an empty object or otherwise notifying the calling code that the update was not performed.

In this case we rely on Dapper's ExecuteAsync method, which returns an int: the number of affected rows.
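A minimal sketch of a variation of UpdateAsync that surfaces a missing row through that return value (this check is not part of the implementation above):

public async Task UpdateAsync(T t)
{
    var updateQuery = GenerateUpdateQuery();

    using (var connection = CreateConnection())
    {
        // ExecuteAsync returns the number of affected rows; zero means no row matched the Id.
        var affectedRows = await connection.ExecuteAsync(updateQuery, t);
        if (affectedRows == 0)
            throw new KeyNotFoundException($"{_tableName} row with the given Id could not be found.");
    }
}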

Example of generic repository usage:

public static async Task Main(string[] args)
        {
            var userRepository = new UserRepository("Users");
            Console.WriteLine(" Save into table users ");
            var guid = Guid.NewGuid();
            await userRepository.InsertAsync(new User()
            {
                FirstName = "Test2",
                Id = guid,
                LastName = "LastName2"
            });


            await userRepository.UpdateAsync(new User()
            {
                FirstName = "Test3",
                Id = guid,
                LastName = "LastName3"
            });


            List<User> users = new List<User>();

            for (var i = 0; i < 100000; i++)
            {
                var id = Guid.NewGuid();
                users.Add(new User
                {
                    Id = id,
                    LastName = "aaa",
                    FirstName = "bbb"
                });
            }

            var stopwatch = new Stopwatch();
            stopwatch.Start();
           
           
            Console.WriteLine($"Inserted {await userRepository.SaveRangeAsync(users)}");

            stopwatch.Stop();
            var elapsed_time = stopwatch.ElapsedMilliseconds;
            Console.WriteLine($"Elapsed time {elapsed_time} ms");
            Console.ReadLine();
        }

 

Use case in CQRS/ES Architecture

CQRS stands for Command Query Responsibility Segregation and is an architectural pattern which separates the read model from the write model. The idea is to have two models which can scale independently and are optimized for either reads or writes.

Event Sourcing (ES) is a pattern which states that the state of an object is persisted in the database as a list of events, which can later be replayed in order to reconstruct the latest state.

I will not go into explaining what these two patterns are or how to implement them, but will rather focus on a specific use case I've dealt with in one of my projects: how to use a relational database (MSSQL) for the read model and the event store, utilizing the Dapper data mapper and the generic repository pattern. I will also touch, albeit briefly, on event sourcing using the same generic repository.

Example architecture:

Before going any further let us consider why using data mapper would be more beneficial than using ORM for this particular case:

Impedance Mismatch in Event Sourcing

An object-relational impedance mismatch refers to a range of problems representing data from relational databases in object-oriented programming languages.

The impedance mismatch has a large cost associated with it. The reason is that the developer has to know both the relational model and the object-oriented model. Object Relational Mappers (ORMs) are used to mitigate this issue, but they do not eliminate it. They also tend to introduce new problems of their own: the virtual properties required by EF, issues with mapping private properties, pollution of the domain model, etc.

When using events as the storage mechanism in an event store, there is no impedance mismatch. The reason is that events are a domain concept and are persisted directly in the event store without any need for object-relational mapping. Therefore the need for an ORM is minimal, and using Dapper with a generic repository becomes more practical.

Database model considerations

In this use case MSSQL is used for both the write and the read side, which adds to the reusability of the Dapper repository since it can be used on both sides.

Primary key consideration

In this example I used a Guid (the .NET Guid and the uniqueidentifier MSSQL datatype) as the primary key identifier. It could have been something else, like a long, an int, or a string.

In any case, that would require some additional work on the Dapper repository. First, the interface would need to change to accept an additional primary key type. Then, depending on the type, there might be some additional work to modify the generated queries.

Having more than one column as the primary key would also imply additional work, and in that case using the Dapper/generic repository pattern would probably be counterproductive. We should opt for a full-blown ORM in that case!

Bulk inserts with Dapper

Dapper is NOT suitable for bulk inserts, i.e. performing a large number of INSERT statements. The reason is that the ExecuteAsync method will internally use a loop to generate the insert statements and execute them. For a large number of records this is not optimal, and I would recommend using either the SQL bulk copy functionality, a Dapper extension which supports bulk copy (a commercial extension), or simply bypassing Dapper and working with the database directly.
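For completeness, a minimal sketch of the SqlBulkCopy alternative (table and column names are illustrative and must match the destination table):

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Threading.Tasks;

public static async Task BulkInsertUsersAsync(string connectionString, IEnumerable<User> users)
{
    // Build an in-memory table whose columns mirror the destination table.
    var table = new DataTable();
    table.Columns.Add("Id", typeof(Guid));
    table.Columns.Add("FirstName", typeof(string));
    table.Columns.Add("LastName", typeof(string));

    foreach (var user in users)
        table.Rows.Add(user.Id, user.FirstName, user.LastName);

    using (var bulkCopy = new SqlBulkCopy(connectionString))
    {
        bulkCopy.DestinationTableName = "Users";
        await bulkCopy.WriteToServerAsync(table);   // streams all rows in one bulk operation
    }
}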

Transactions handling

A typical use case for applying a transaction is saving into more than one table atomically. Saving into the event store is one example: saving into the AggregateRoot table and the Events table as one transaction.

Transactions should be controlled manually, either at the command level (in the CQRS command implementation) or inside the repository.

This example with two tables is inspired by Greg Young’s design which can be found here: https://cqrs.files.wordpress.com/2010/11/cqrs_documents.pdf

using (var transaction = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
            {
                // Fetch Aggregate root if exists, create new otherwise
                var aggRoot = await _aggregateRootRepository.FindById(aggregateId.Identity);
                if (aggRoot == null)
                {
                    aggRoot = new AggregateRootDao
                    {
                        Id = Guid.NewGuid(),
                        CreatedAt = DateTime.Now,
                        AggregateId = aggregateId.Identity,
                        Version = 0,
                        Name = "AggregateName"
                    };

                    await _aggregateRootRepository.InsertAsync(aggRoot);
                }
                else
                {
                    if (originatingVersion != aggRoot.Version)
                        throw new EventStoreConcurrencyException($"Failed concurrency check for aggregate. Incoming version: {originatingVersion} Current version: {aggRoot.Version}");
                }

                // Materialize the events so they are enumerated only once
                var domainEvents = events as IDomainEvent[] ?? events.ToArray();

                foreach (var e in domainEvents)
                {
                    // Increment Aggregate version for each event
                    aggRoot.Version++;
                    e.Version = aggRoot.Version;

                    // Store each event with incremented Aggregate version 
                    var eventForRoot = new EventDao()
                    {
                        CreatedAt = e.CreatedAt,
                        AggregateRootFk = aggRoot.Id,
                        Data = JsonConvert.SerializeObject(e, Formatting.Indented, _jsonSerializerSettings),
                        Version = aggRoot.Version,
                        Name = e.GetType().Name,
                        Id = e.Id,
                    };

                    await _eventsRepository.InsertAsync(eventForRoot);
                }

                // Update the Aggregate
                await _aggregateRootRepository.UpdateAggregateVersion(aggRoot.Id, aggRoot.Version);


                transaction.Complete();
            }

 

If the aggregate root is not found, it is created and inserted into the AggregateRoot table; otherwise its stored version is checked against the incoming version. After that, each domain event is mapped to an EventDao and saved into the Events table. All of this is wrapped in a transaction and will either fail or succeed as an atomic operation. Note that the transaction uses the TransactionScopeAsyncFlowOption.Enabled option, which allows it to invoke async code inside its body.

Conclusion

The implementation presented here can be further optimized for use in CQRS/ES systems, however that is outside the scope of this post. It gives enough flexibility to easily extend a specific repository with new functionality, simply by adding custom SQL queries to the repository.

Author: Damir Bolić

Domain Driven Design and the art of Pushing Back

Closing thoughts

In order to successfully practice domain driven design, developers can't just be given requirements. They need to be actively involved in refining business processes and in suggesting changes to them. Period.

Author: Anes Hasičić

TDD & Pair Programming in practice

When I was a Java software engineer with a few months of experience, I had encountered TDD only through participating in a few workshops or from books & blogs. I thought "yeah, yeah, it is all nice in theory, but who really uses TDD in practice"? C'mon, really? Just do a story or a task as fast and as well as you can, maybe write some integration or unit tests, maybe leave it "for later"… meh… who really cares… It's a legacy project so we don't want to put too much effort into it and hopefully we'll rewrite it soon in a better technology and/or framework. What about pair programming? LOL. No way am I going to work with some random dude all the time. What if the person annoys me, or they will not want to do anything, or even worse, what if they are so much better and faster that they will not want to explain anything and I will not get the chance to learn? Also, who has the responsibility for the story in pair programming?

All these were my prejudices and questions about the aforementioned concepts. In this blog, I'm going to tell you how working on a Belgian public sector project gave me the opportunity to work in a pair and made me disciplined about writing tests before the implementation.

TDD

TDD (Test Driven Development) is, simply put, a way of developing where we first write a test and the implementation comes afterwards. To avoid repeating or copying definitions, I’ve provided some useful links for you to check out, and the rest you can find easily on the Internet.

Here you can watch what Uncle Bob has to say about it:

https://www.youtube.com/watch?v=AoIfc5NwRks

If you wish to practice TDD, there are some cool examples on the web like this:

https://osherove.com/tdd-kata-1

Also, it’s always nice to read a book:

https://www.goodreads.com/book/show/387190.Test_Driven_Development

Pair programming

Pair programming is, in short, two people developing at the same time. This means that one developer types the code while the other gives suggestions and warns about possible bugs in the code. Usually they take turns writing code and assisting. Sometimes one writes a test and the other writes the implementation, or one writes code for the first half of the working day and the other for the second half. It all depends on how the two people like to work, but the idea is that both members of the pair are focused on the work at the same time, in order to produce quality, bug-free code.

You can read more about pair programming here:

http://www.extremeprogramming.org/rules/pair.html

https://www.codementor.io/pair-programming

How does it look in practice?

Way back, when I had only a few months of experience as a Java web developer, I didn't even know there was any way of programming other than: "do the implementation and after that make sure it works by writing tests for it". The first time I heard about TDD was during a Spring framework introduction workshop I attended, but I wasn't familiar with pair programming until I started working at my current company, Tacta.io. There I was told that such an approach leads to better code quality and a smaller number of bugs in the code. OK, I understood that TDD is in theory the "right" way to develop code, but at first I was skeptical about whether people really develop in a TDD way, and about how pair programming even works or helps.

My first real experience with TDD and pair programming happened while working on a big project with around 1,890,000 lines of code (1,007,000 of which were test code). We are also talking about a project that had its first commit in 2010, after two failed development attempts that hadn't followed the concepts mentioned earlier. So how do you work on that kind of project and keep it maintainable for all these years?

Some of the most important ways to keep the code base maintainable are not only using TDD and pair programming, but also developing according to clean code and DDD (Domain Driven Design) principles. All of this pays off in the long run, especially when the project has complicated domain and business logic.

Alongside things mentioned above, we also have a strict development process that we follow:

  • Daily standups – say what you did yesterday, what you are doing today and whether something is blocking you
  • Pair modeling – the process of creating a document, written by two people, containing a functional and technical summary of a story that needs to be implemented
  • Story kickoff – after the pair modeling, the story is divided into tasks and presented to the team. The team asks questions and gives suggestions if something can be done better
  • Pair programming – two developers actively develop a story; one takes the lead on the story, which means he/she will work on it until it is done, while the other member of the pair changes on a daily basis
  • TDD – we strongly try to keep the discipline of writing tests first. Unit tests usually come first and the implementation afterwards. Production code is also backed up with integration and end-to-end tests
  • DDD – while developing, we must be aware that the project has to follow DDD concepts and rules
  • Proxy check – check the story with the proxies (proxies are people who talk to users and make sure the stories they wrote are functionally OK)
  • Merge/pull requests – create a merge request so that other developers can review your code and approve it or request changes if necessary

Yeah, yeah, it is all nice in theory, but in practice who really uses TDD?

We do. After a story kickoff you sit with your other half of the pair for that day and start working on the story. First you create a test, then the implementation, then some more tests, and so on. Of course, do not forget to refactor! Alongside unit tests, you often need to add integration tests and/or end-to-end Selenium tests to check that everything is OK when a user works with the web application in a browser.

No way am I going to work with some random dude all the time!

IMHO, working with people is sometimes the hardest part of my job, but working in a pair has more benefits than disadvantages. Yes, it can happen that you aren't so fond of the person who is pairing with you that day, but that happens rarely and you switch pairs every day. Sometimes discussions can take a lot of time, but in the end they also teach you how to defend and explain why you do what you do, and they develop your communication "soft" skills. I also learned a lot about how other people think and work. With pair programming you get to share a lot of knowledge, because you are literally working next to another developer on the same thing. In the end you do not escape responsibility. You share it with your pairs, but if you are the story lead, then you should ensure that the story is implemented as well as possible, both in the code and functionally.

TDD & pair programming often work as shown in this simplified scenario:

Developer A: Oh, this story is easy like we talked on the kickoff! Let’s add a method to the service to search for files that have status “ACTIVE”.

Developer B: Great, we are working according to TDD so let’s start with a unit test! We should add a test that creates two mock files. First one should have “INACTIVE” status and the other file should have “ACTIVE” status. This test will then call our method and assert that it returns a list with only one “ACTIVE” file.

Developer A: Nice, I’ll do the test and you can do the implementation and after that we switch?

Developer B: It’s a deal. I’ll help you by watching and commenting and I expect from you the same when I’ll be writing code.

Hopefully our two developers created some quality code with good test coverage. But even if they didn’t, they still have to pass the proxy check and merge request reviews. 🙂

Conclusion

After working for more than a year according to TDD and in pair, some of my opinions on those subjects have changed:

  • TDD changes the way you think and work. Being able to create a test before the implementation proves that you understand what needs to be done. With each test you improve and refactor your code, and the tests ensure that the code does what it was intended to do.
  • If you are working on a big project, it is hard to know everything, and working in a pair speeds up development.
  • Two people usually have their discussions and explanations out loud, which maybe wouldn't happen so often with "solo" development. This leads to better knowledge sharing, code quality and edge case test coverage.
  • I met some amazing people and each of them has their own unique way of thinking. I learned a lot from them – sometimes it was the use of some new IDE shortcuts, sometimes it was how to think in a DDD way, and sometimes we learned together about the project and technologies like the Java language, the Spring framework or Hibernate.
  • I've found my preferred way of pair programming and, in my experience, I remain most focused when I switch roles based on the test/implementation cycle.

I hope that this blog has given you some idea how TDD and pair programming look in practice. If you have more questions about it, feel free to send me an email at: luka.sekovanic@tacta.io.

Author: Luka Sekovanić