Communication in a service-oriented architecture

Joan Zapata

04 December 2020

Service-oriented architecture has proved its strengths in virtually every industry as a way to build systems that can evolve and scale. Although the term encompasses several definitions, its core idea of breaking a complex application into smaller, reusable, loosely coupled services has come a long way since the late 1990s.

We embraced the concept from day one. After three years, we now have about twenty domain-driven, decoupled services structuring our backend applications. This architecture let us use the languages we liked, such as Kotlin and Elixir, and the tools we wanted for the task at hand, such as event sourcing (keeping track of all the events that modify the state of a given application).

But having multiple services comes with its own challenges, the first one being: how do you make services communicate with each other? There are a lot of strategies out there, and this article reports on our experience with the ones we use at Memo Bank.

Let’s begin with two services and this simple use case as an example:

REST over HTTPS

Like many backend applications we started with REST calls over HTTPS.

Article image

What goes around in circles and is something we strive to avoid at all costs? Cyclic dependencies, of course (when A depends on B and B depends on A). In our example, contracts can’t directly tell customers to activate the new customer as soon as the contract is signed. Instead, customers has to poll contracts, sending requests at regular intervals, to know when the task is finished.

HTTPS is a great start, and definitely something to keep in the toolbox: it’s simple to set up, widely supported, and also used to expose APIs to frontend applications. But there is one big issue with this scheme: it’s synchronous (although not necessarily blocking). While contracts is working, customers is waiting with an active connection. At scale, this means that it only takes one slow service to slow down the entire system. It also means that when the contracts service is unavailable, the customers service is effectively unavailable too, which is called a cascading failure.
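To make the cost of polling concrete, here is a minimal Python sketch. It is not Memo Bank's code: `FakeContracts`, its `status` method, and `poll_until_signed` are hypothetical names, and a plain method call stands in for each HTTPS round trip.

```python
import time
from typing import Callable


class FakeContracts:
    """Hypothetical stand-in for the contracts service behind an HTTPS API."""

    def __init__(self, signed_after_calls: int) -> None:
        self._remaining = signed_after_calls
        self.calls = 0

    def status(self, contract_id: str) -> str:
        # Each call represents one synchronous request/response round trip.
        self.calls += 1
        self._remaining -= 1
        return "SIGNED" if self._remaining <= 0 else "PENDING"


def poll_until_signed(
    contracts: FakeContracts,
    contract_id: str,
    max_attempts: int = 10,
    delay_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Ask contracts again and again until the contract is signed."""
    for _ in range(max_attempts):
        if contracts.status(contract_id) == "SIGNED":
            return True
        sleep(delay_seconds)
    return False


contracts = FakeContracts(signed_after_calls=3)
signed = poll_until_signed(contracts, "contract-1")
print(signed, contracts.calls)
```

Every `PENDING` answer is a wasted round trip, and if `status` were slow or failing, the caller would be slow or failing with it.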

How did we tackle this issue? We used two different patterns: discrete and continuous messaging.

Discrete messaging

Let’s take the first use case: customers wants contracts to send a contract.

For this, we use a discrete messaging mechanism: queues. You can think of a queue as a highly available third-party mailbox. Even when the receiving service is unavailable, the mailbox accepts messages. Multiple instances of the same service can consume the same queue, in which case the queue acts as a load balancer and ensures every message is handled at least once. Once a message has been handled, the mailbox can forget about it.

Article image

We use queues to send commands from one service to another (SendContract, for example).

This has several advantages:

  • The caller service doesn’t rely on the availability of the receiving service: once the command is in the queue, it can resume its work, regardless of the state of the receiver.
  • The receiver can handle the commands at its own pace. We can easily scale the receiver service up or down depending on the load of its queues.
  • As a bonus, failure can easily be isolated and handled manually (we hope to cover the dead letter queues topic in another article).
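The mailbox idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the standard library's in-process `queue.Queue` stands in for a real message broker (so it does not show redelivery or at-least-once semantics), and `contracts_worker` is a hypothetical name.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class SendContract:
    """Command sent to the contracts service."""
    customer_id: str


def contracts_worker(commands: queue.Queue, handled: list, lock: threading.Lock) -> None:
    """One instance of the (hypothetical) contracts service consuming the queue."""
    while True:
        command = commands.get()
        if command is None:  # sentinel: stop this worker
            commands.task_done()
            return
        with lock:
            handled.append(command.customer_id)
        commands.task_done()  # acknowledge: the queue can forget the message


commands: queue.Queue = queue.Queue()
handled: list = []
lock = threading.Lock()

# The sender only needs the queue to be up; no receiver is running yet.
for customer_id in ("c1", "c2", "c3", "c4"):
    commands.put(SendContract(customer_id))

# Two instances of the receiver then share the load of the same queue.
workers = [
    threading.Thread(target=contracts_worker, args=(commands, handled, lock))
    for _ in range(2)
]
for worker in workers:
    worker.start()
for _ in workers:
    commands.put(None)  # one stop sentinel per worker
commands.join()
for worker in workers:
    worker.join()

print(sorted(handled))
```

The commands are enqueued before any worker exists, which mirrors the first advantage above, and adding or removing workers is the scaling knob from the second.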

Continuous messaging

Now let’s look at the second use case: when the contract is signed, contracts needs to let customers know so that it can activate the customer.

It’s tempting to use another queue here, but as we said before, we don’t want a cyclic dependency. To avoid one, contracts must stay unaware of what happens inside customers. It therefore cannot send an ActivateCustomer command to customers, because it knows neither the command nor its recipient.

We solved this problem using a continuous messaging mechanism: streams. What is a stream? You can think of a stream as an ordered sequence of events that can be consumed in real time but also remains available over time. Unlike queues, which carry unrelated, ephemeral commands, streams tell a persistent story.

Article image

Each service at Memo Bank broadcasts a stream of events describing the lifecycle of its resources. Building and maintaining these events is an integral part of any development, whether or not a given event is immediately needed. It’s part of the routine, just like automated tests and metrics.

These events thus become a reliable, timestamped log of immutable facts. And since they are part of the API of the emitting service, they can be consumed by any other service, both in real time (one by one) and later on. What’s the point of keeping such a record? These events have immense value for traceability and data analysis.
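The difference with a queue is easier to see in code. The following Python sketch is an assumption-laden illustration, not a description of the streaming technology Memo Bank uses: `EventStream`, `append`, and `read_from` are hypothetical names for an in-memory, append-only log.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Event:
    """An immutable, timestamped fact."""
    name: str
    contract_id: str
    recorded_at: datetime


class EventStream:
    """Append-only, ordered log. Reading never removes anything."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, name: str, contract_id: str) -> int:
        event = Event(name, contract_id, datetime.now(timezone.utc))
        self._events.append(event)
        return len(self._events) - 1  # offset of the new event

    def read_from(self, offset: int) -> List[Event]:
        # Each consumer tracks its own offset, so it can follow in real time
        # or replay the whole story later.
        return self._events[offset:]


stream = EventStream()
stream.append("ContractSent", "contract-1")
stream.append("ContractSigned", "contract-1")

# A consumer that had already read the first event only sees the new one.
live = [event.name for event in stream.read_from(1)]

# A consumer created later replays everything from the beginning.
replayed = [event.name for event in stream.read_from(0)]

print(live)
print(replayed)
```

Because reading does not consume anything, a service written next year could still start from offset 0 and rebuild its own view of every contract.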

To sum up:

  • queues are used to send commands to a specific service;
  • streams are used to expose facts from a specific service.

All about dependencies

Looking at the arrows in the diagram above, you might wonder: doesn’t this introduce a cyclic dependency between customers and contracts? Fortunately, it does not. A dependency is not defined by the direction of the data, but by the knowledge services have of one another. Here, customers knows about contracts: it tells it what to do and it listens to its story. But contracts knows nothing about customers: it doesn’t need to know who is sending it commands, nor who is listening to its story.

Article image

Both the queue and the stream are part of the contracts API, and customers depends on this API. Having a naming convention for commands and facts is very important to convey this idea. To be consistent from one API to another, we always use:

  • A base form for commands. For example: SendContract;
  • A past participle form for facts. For example: ContractSent.

Note that this conveniently matches our Core Banking System architecture, which is based on CQRS/ES. In that terminology, commands keep the same name, and facts are called events.
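Here is a small Python sketch of what "dependency as knowledge" looks like in code. It is a simplification with hypothetical class and method names (`Contracts`, `Customers`, `subscribe`, `handle`, `sign`, `onboard`), and direct calls stand in for the queue and the stream. The point to notice is that the contracts side never mentions customers, while customers refers to the contracts API for both the command and the fact.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


# --- contracts API: the only thing other services need to know ---

@dataclass(frozen=True)
class SendContract:      # command, base form
    customer_id: str


@dataclass(frozen=True)
class ContractSigned:    # fact, past participle
    customer_id: str


class Contracts:
    """Knows its own commands and facts, and nothing about its callers."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[ContractSigned], None]] = []
        self.sent: List[str] = []

    def subscribe(self, listener: Callable[[ContractSigned], None]) -> None:
        self._listeners.append(listener)

    def handle(self, command: SendContract) -> None:
        self.sent.append(command.customer_id)

    def sign(self, customer_id: str) -> None:
        # Publish the fact; whoever listens is not contracts' concern.
        for listener in self._listeners:
            listener(ContractSigned(customer_id))


# --- customers: depends on the contracts API, whatever the data direction ---

class Customers:
    def __init__(self, contracts: Contracts) -> None:
        self.active: Set[str] = set()
        self._contracts = contracts
        contracts.subscribe(self._on_contract_signed)

    def onboard(self, customer_id: str) -> None:
        self._contracts.handle(SendContract(customer_id))

    def _on_contract_signed(self, fact: ContractSigned) -> None:
        self.active.add(fact.customer_id)


contracts = Contracts()
customers = Customers(contracts)
customers.onboard("c1")
contracts.sign("c1")
print(sorted(customers.active))
```

Data flows both ways between the two objects, yet deleting the `Customers` class would leave the contracts side compiling and working unchanged.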

How to determine the direction of dependence

Given the principles explained before, this solution would be just as valid.

Article image

But if both solutions are valid, how do you choose one over the other? Well, it’s up to you. It all comes down to the direction of the dependency you want to set.

Article image

Here are some questions we usually ask ourselves:

  • Can one service easily be agnostic of others? Here for example, as long as we give the content to contracts, contracts can be totally agnostic of which service is using it. On the other hand, customers can’t afford to be agnostic, because it cannot ignore contracts. That’s in favor of A.
  • What if one service was a third party? For instance, it would not make sense for Memo Bank to outsource customers, but it could make sense for contracts. Thus, this is in favor of A as well.
  • Is one service orchestrating other services? Implicit orchestration is bad; you can find more about it in this talk by Bernd Ruecker. Creating a customer is a complex workflow which involves many services (sending emails, notifications, creating a bank account, etc.), so customers is probably an orchestrator here. Making the orchestrator depend on other services, and not the other way around, makes the code a lot easier to understand, because the full workflow can be found in a single place. That’s also in favor of A.
  • Does it create a cycle in the overall architecture? Even if there’s no direct cycle between the two services, they both depend on other services. Let’s say customers depends on users, and users already depends on contracts. If we chose solution B, it would create a cycle across the three services. That’s also in favor of A.
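The last question can be checked mechanically. Below is a minimal Python sketch of a depth-first cycle check over a "service depends on" map. The `has_cycle` function is an illustrative helper, and the two maps encode the example above on the assumption that solution A means customers depends on contracts while solution B means the reverse.

```python
from typing import Dict, List, Set


def has_cycle(dependencies: Dict[str, List[str]]) -> bool:
    """Depth-first search for a cycle in a 'service -> its dependencies' map."""
    visiting: Set[str] = set()   # services on the current search path
    done: Set[str] = set()       # services fully explored, known cycle-free

    def visit(service: str) -> bool:
        if service in done:
            return False
        if service in visiting:
            return True  # we came back to a service on the current path
        visiting.add(service)
        for dependency in dependencies.get(service, []):
            if visit(dependency):
                return True
        visiting.remove(service)
        done.add(service)
        return False

    return any(visit(service) for service in dependencies)


# Existing dependencies from the example: customers -> users -> contracts.
solution_a = {
    "customers": ["users", "contracts"],  # A adds customers -> contracts
    "users": ["contracts"],
    "contracts": [],
}
solution_b = {
    "customers": ["users"],
    "users": ["contracts"],
    "contracts": ["customers"],           # B adds contracts -> customers
}

print(has_cycle(solution_a))
print(has_cycle(solution_b))
```

With solution A the check finds no cycle; with solution B it finds the loop through users, even though neither service depends on the other in both directions.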

Conclusion

When embarking on the design of a service-oriented architecture, it’s best to think very early about the flow of messages between the services that compose it. Using HTTPS and REST seems like the most straightforward solution at first, but it has its limitations. We completed our arsenal with queues and streams. For the rest, we stick to two simple principles.

First principle: each service must broadcast events as soon as its state changes, even if the events in question are not useful at the time they are introduced. Like a black box, the recorded events, such as ContractSent or ContractSigned, should allow us to trace back through time and reconstruct the various states each service has gone through. These streams contribute to the traceability of our services, an obligation we have as a bank, but they also help consolidate the API of each service, making the system easier for all teams to work with.

Second principle: it’s all about the dependencies. Dependencies shape the system, and cyclic dependencies are the number one enemy. Once the dependencies are properly set, the messaging tools are just there to let the data flow in any direction.
