Cross-Cutting Data Requirements in Microservices

In a microservices architecture, each microservice manages only data related to its bounded context. The entire domain data is spread across multiple databases and sometimes across multiple storage technologies — relational databases and different NoSQL variants.

Handling data requirements that cross microservice boundaries is not an easy task. There are several approaches you can take — getting data from multiple sources together when required, making the data available where required, or moving the data.

Aggregate Data When Required

The most direct approach is to fetch data from each microservice and combine it at the point of use. For instance, prices from one microservice and stock levels from another can be combined to show a table with both. However, if you want to sort out-of-stock items by price, you cannot limit your stock level results to one page: your client or aggregator has to retrieve all out-of-stock items, then use their identifiers to retrieve a list sorted by price. You can avoid this problem by sacrificing functionality, allowing sorting and filtering only on data from one of the sources.
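The sorting problem can be sketched as follows. This is only an illustration: the two dictionaries stand in for the price and stock microservices, and the function names are hypothetical rather than part of any real API.

```python
# Hypothetical in-memory stand-ins for the price and stock microservices.
PRICES = {"A": 30.0, "B": 10.0, "C": 20.0, "D": 5.0}
STOCK = {"A": 0, "B": 0, "C": 4, "D": 0}


def fetch_out_of_stock_ids():
    """Stock service call: returns every out-of-stock item id, unpaged."""
    return [item_id for item_id, quantity in STOCK.items() if quantity == 0]


def fetch_prices(item_ids):
    """Price service call: returns the price of each requested item."""
    return {item_id: PRICES[item_id] for item_id in item_ids}


def out_of_stock_sorted_by_price(page, page_size):
    """Aggregate both sources, then sort and page the combined result."""
    # The full out-of-stock set is needed up front: the stock service
    # cannot know which of its items fall on the cheapest page.
    ids = fetch_out_of_stock_ids()
    prices = fetch_prices(ids)
    ordered = sorted(ids, key=lambda item_id: prices[item_id])
    start = page * page_size
    return [(item_id, prices[item_id]) for item_id in ordered[start:start + page_size]]
```

Paging is applied only after both result sets have been combined, which is why the cost grows with the number of out-of-stock items rather than with the page size.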

Complex aggregations that use values calculated from different result sets are best done in an aggregator service. Simpler ones, like looking up a small set of values, can be done on the client side.

Letting the client aggregate is architecturally simpler but introduces additional complexity in the client code. This will not be a good experience if you are exposing your API for external use. On the other hand, aggregator services take on the availability requirements of all the microservices they combine. So, if the same service aggregates user details and pending orders in addition to prices and stock levels, it will need to handle a simultaneous peak in load for both use cases.

A Properly Managed Cache

Caching may seem like a poor fit for many use cases. But a well-thought-out page can leverage caching without degrading user experience. For instance, if you have a watch list of items on an auction site, one option is to retrieve the list and read the leading bids for items expiring later from a server-side cache. Bids for items expiring soon can be retrieved directly from the bid service.
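A minimal sketch of the watch list idea follows. The ten-minute cutoff, the shape of the watch list, and the `bid_service` callable are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumed cutoff for "expiring soon"; tune to how fast bids move near the end.
SOON = timedelta(minutes=10)


def leading_bids(watch_list, now, cache, bid_service):
    """Return the leading bid per watched item.

    watch_list  -- list of (item_id, expires_at) pairs
    cache       -- dict of item_id -> cached leading bid (may be stale)
    bid_service -- hypothetical callable that returns the live leading bid
    """
    result = {}
    for item_id, expires_at in watch_list:
        expiring_soon = expires_at - now <= SOON
        if expiring_soon or item_id not in cache:
            # Freshness matters here, so go to the bid service directly.
            result[item_id] = bid_service(item_id)
        else:
            # A slightly stale bid is acceptable for items expiring later.
            result[item_id] = cache[item_id]
    return result
```

Only the items where staleness would hurt the user reach the bid service, so its load follows the number of auctions about to close rather than the size of every watch list.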

Caching has its own challenges — keeping the cache up to date being one of them. The simplest way is to reload the entire cache periodically. This may not be feasible with large data volumes since most caches are not optimised for writes. Updating only what changed fixes this problem but is a bit more complex to implement.
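The two refresh strategies can be compared with a small sketch. The cache is modelled as a plain dictionary, and the change feed format (a value of `None` meaning "deleted") is an assumption for illustration.

```python
def full_reload(cache, source):
    """Replace the whole cache with the source; returns the number of writes."""
    cache.clear()
    cache.update(source)
    return len(source)


def apply_changes(cache, changes):
    """Apply only what changed; returns the number of writes.

    changes -- iterable of (key, value) pairs; a value of None means the
               key was deleted at the source (an assumed convention).
    """
    writes = 0
    for key, value in changes:
        if value is None:
            cache.pop(key, None)
        else:
            cache[key] = value
        writes += 1
    return writes
```

A full reload writes every entry regardless of how little changed, while the incremental version writes once per change. The price is that the source has to publish a reliable change feed.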

Caches also need to be factored into your disaster recovery strategy. If your source is restored to an earlier point in time, your cache is suddenly “invalid”. If your source is a facade for the actual source of truth, this may not really be a problem. For instance, if we lose the last hour’s worth of work in a data entry system, the cache will simply be ahead of the intermediate source, because the actual source of truth is the paper trail. Otherwise, ways to solve the problem include keeping track of changes made to the cache within the recovery window so that they can be rolled back, or restoring the cache to an earlier point in time.
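One way to keep track of changes for rollback is an undo journal, sketched below. The class and method names are hypothetical, and the sketch assumes writes arrive with non-decreasing timestamps.

```python
class JournaledCache:
    """A cache that remembers previous values so recent writes can be undone."""

    def __init__(self):
        self.data = {}
        # Each entry: (timestamp, key, key_existed_before, previous_value)
        self.journal = []

    def put(self, key, value, timestamp):
        """Write a value and record what it replaced."""
        self.journal.append((timestamp, key, key in self.data, self.data.get(key)))
        self.data[key] = value

    def rollback_to(self, restore_point):
        """Undo, newest first, every write made after restore_point."""
        while self.journal and self.journal[-1][0] > restore_point:
            _, key, existed, previous = self.journal.pop()
            if existed:
                self.data[key] = previous
            else:
                del self.data[key]

    def prune(self, oldest_restore_point):
        """Drop journal entries that fall outside the recovery window."""
        self.journal = [entry for entry in self.journal if entry[0] > oldest_restore_point]
```

Calling `prune` regularly keeps the journal bounded by the recovery window, at the cost of not being able to roll back past the pruned point.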

Is Data Where It’s Supposed to Be?

Data can end up in the wrong microservice — a result of incorrect design choices, an incomplete migration when a larger microservice was broken up, or an evolving understanding of the domain. The solution is straightforward, though sometimes tedious — move the data to the microservice it belongs to.

More often, similar data belongs in multiple microservices. Incorporating data into a bounded context is different from caching, where we merely duplicate the data. The difference may be subtle, but it determines when and how the data is updated. Cache updates are periodic or driven by updates in the source. Data incorporated into a bounded context is updated by commands on the aggregate.

For instance, in a financial reporting solution, if you hit the account microservice to get the account contacts every time you generate a report, you should have the contacts in the reporting microservice too — as a dated list of contacts.
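A dated list of contacts might look like the sketch below. The class, its methods, and the example identifiers are hypothetical; the point is only that the reporting side owns its own history and is updated by explicit commands.

```python
from bisect import bisect_right
from datetime import date


class ReportingContacts:
    """Contacts as the reporting context sees them: a dated history per account."""

    def __init__(self):
        # account_id -> list of (effective_from, contact), sorted by date
        self._history = {}

    def record_contact_change(self, account_id, effective_from, contact):
        """Command handler: add a contact that applies from a given date."""
        entries = self._history.setdefault(account_id, [])
        entries.append((effective_from, contact))
        entries.sort(key=lambda entry: entry[0])

    def contact_as_of(self, account_id, on_date):
        """Return the contact in effect on on_date, or None if none applies."""
        entries = self._history.get(account_id, [])
        effective_dates = [entry[0] for entry in entries]
        index = bisect_right(effective_dates, on_date)
        return entries[index - 1][1] if index else None
```

Because each entry carries its effective date, a report for a past period can be regenerated with the contact that applied at the time, which a plain cache of the current contact could not provide.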

Sometimes you can be too enthusiastic in breaking up a monolith and end up with microservices that are too small. For instance, if you have bank accounts and bank transactions in separate microservices, you can find yourself hitting accounts every time you need to change a transaction. Combining them into one, with the account as the aggregate, will be a better design.
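As a rough sketch of the combined design, the account below is the aggregate and transactions can only change through it. The field names, the use of integer minor units, and the no-overdraft rule are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Aggregate root: transactions are only ever changed through the account."""

    account_id: str
    balance: int = 0  # minor units, e.g. cents
    transactions: list = field(default_factory=list)

    def post(self, amount, description):
        """Record a transaction and update the balance in one local step."""
        # Assumed business rule: the account may not be overdrawn.
        if self.balance + amount < 0:
            raise ValueError("insufficient funds")
        self.balance += amount
        self.transactions.append((amount, description))
```

The balance check and the transaction record now live in the same service, so no call to another microservice is needed to keep them consistent.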

Conclusion

Microservices are not silver bullets. They solve many problems but throw up other challenging ones to ponder over and solve.
