MICROSERVICES AND RABBITMQ
Some time ago, defining a service was a costly task: you had to write the WSDL and prepare a container to serve the service reliably.
And consuming SOAP is expensive too. The server has to parse the request, validate it, process it and return a valid XML response. That's a lot!
Normally you also had to load balance, and again, doing that was costly, both in time and in maintenance.
Things changed slowly…
New frameworks appeared and the cost of creating a service decreased greatly.
Then REST+JSON brightened the picture, because consuming services became cheaper. And a lot easier.
So services grew in number but also in complexity, turning servers into big, hard-to-maintain beasts. You change one service and voilà! A new deployment has to be considered and planned. Now you have an operational problem.
Since creating services is easy now, why not split the server and deploy each service standalone?
Coordination becomes a problem.
You can still end up with a heavy container that makes deployments and management a bit complex.
Sometimes it gets slow because it has to handle many applications running inside, and it has to manage connection pools among other resources.
In a short time this situation becomes unreliable.
Frameworks evolved (see Grizzly vs Mina) and we can avoid using a container at all. This eases updating the services and overall management, and also improves performance and reliability.
You will lose some things along the way, but as long as you are smart enough there's always a good workaround.
Things will get even better if you use specific frameworks for microservices.
But the more I read, the more obvious the problem with this kind of services and frameworks becomes.
This is just an opinion, so ideas are welcome. When you enter that world you are tied to a synchronous model, wasting your time waiting for responses and processing requests one by one. (Yes, I know each request is processed by its own thread, but you know what I mean.)
THE MARVELOUS WORLD OF ASYNC PROCESSING
We took a different approach for our microservices.
We put RabbitMQ in the middle, though it could just as well be something else, like ZeroMQ or Kafka. That way you also get load balancing for free, plus high availability and resilience, depending on how you configure it, of course. The price you pay is a little extra latency in responses.
The idea is simple: a frontend (which can even be removed) pushes a message to the MQ. One or more consumers wait for messages and process each request as soon as it arrives. Then a response comes back to your frontend, which returns it in whatever format you want.
This totally decouples the processing from serving the service. You can switch frontends at will, and the processors (consumers) handle messages in a neutral format. For reference, we use JSON, but we are migrating to Apache Thrift. That way we can switch from one internal representation to another, but that's another story…
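As a minimal sketch of that flow, the snippet below fakes the broker with in-memory queues: a frontend publishes a JSON request carrying a correlation id, a consumer processes it and publishes the reply. All names are illustrative, and upper-casing stands in for real business logic; a real setup would publish to RabbitMQ exchanges instead.

```python
# In-memory sketch of the frontend -> queue -> consumer -> reply pattern.
# The two queue.Queue objects stand in for the broker's request and reply
# queues; there is no real RabbitMQ involved here.
import json
import queue

requests = queue.Queue()   # request queue (would live in the broker)
responses = queue.Queue()  # reply queue

def frontend_publish(payload, corr_id):
    # The frontend serializes to a neutral format (JSON here) and publishes.
    requests.put(json.dumps({"corr_id": corr_id, "body": payload}))

def consumer_step():
    # A consumer takes the next message, processes it and publishes a reply.
    msg = json.loads(requests.get())
    result = msg["body"].upper()          # placeholder processing
    responses.put(json.dumps({"corr_id": msg["corr_id"], "body": result}))

frontend_publish("hello", corr_id=1)
consumer_step()
print(json.loads(responses.get()))  # {'corr_id': 1, 'body': 'HELLO'}
```

Because the frontend and the consumer only share the queue and the message format, either side can be replaced without the other noticing — which is exactly the decoupling described above.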
This way of deploying services has some great advantages over the traditional approach:
- Zero deployment downtime: you can deploy the new version of a service while the old version is still working, then shut down the old version and you are done. This can even be automated.
- Load balancing included: RabbitMQ does round robin by default.
- Resilience: no messages are lost if one of the consumers fails, because we implement ACKs.
- Scales well: if one part of the system is taking too much time, just deploy a few more processors to help with the 'extra' load, on the same or a different host!
- Intercommunication also uses RabbitMQ: we don't have to switch between protocols; everything is handled the same way.
- It's fast: we discovered that RabbitMQ is so fast that a transaction in one of the processors had not yet closed when the response reached its destination. Even committing a small transaction is slower than relaying a message.
- We can connect to any system: since frontends are pluggable, we can switch protocols easily and in a lightweight manner.
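To illustrate the load-balancing and resilience points, here is a toy model of round-robin dispatch with ACK semantics. Nothing here is RabbitMQ API; a consumer returning False stands in for a crashed consumer that never ACKs, whose message the broker would redeliver.

```python
# Toy broker loop: deliver messages round-robin to competing consumers; a
# message leaves the queue only when its consumer ACKs it (returns True),
# otherwise it is requeued and redelivered, so nothing is lost.
from itertools import cycle

def dispatch(messages, consumers):
    processed, pending = [], list(messages)
    turn = cycle(consumers)
    while pending:
        msg = pending.pop(0)
        if next(turn)(msg):      # True = handled and ACKed
            processed.append(msg)
        else:
            pending.append(msg)  # no ACK: broker redelivers later
    return processed

state = {"flaky_calls": 0}
def healthy(msg):
    return True
def flaky(msg):
    state["flaky_calls"] += 1
    return state["flaky_calls"] > 1   # "crashes" on its first delivery

print(dispatch(["m1", "m2", "m3"], [healthy, flaky]))  # ['m1', 'm3', 'm2']
```

Note that m2 is not lost when the flaky consumer fails; it simply comes back around and is processed once the consumer recovers — the behavior the ACK gives us in production.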
Of course there are cons. One is that exposing all this as a traditional REST or SOAP service takes a lot more programming. Also, some languages do not have good MQ libraries, which pushes you back to the traditional way.
Writing MQ services is a little more complex than writing REST/SOAP services.
OUR SOFTWARE STACK
To make things simpler we built a whole new software stack around RabbitMQ and CXF. Some parts are being migrated to new tech; some are there to stay for a long time…
Let me describe it a little:
- We use Spring for configuration management and IoC. It simplifies configuration and lets us write plain config files so we can switch components without recompiling.
- We use Hibernate for database access, plus Hibernate Tools + HibernateDAO to generate code for different databases. It greatly simplifies database access; we never wrote a query. It keeps the system performant across different databases, and you get a lot of goodies.
- We use JBoss as the container for the services. This is mainly a backwards-compatibility option; I will explain later.
- We use the Level2 MQ framework to build the services. It lets us concentrate on the services themselves, and it handles for us:
- Serialization/Deserialization of messages.
- Route mapping between exchanges, queues, etc.
- ACK handling.
- Apache CXF for building the REST/SOAP API. We create CXF services that match MQ methods. With our current configuration we get for free:
- A REST service that understands XML and JSON.
- A SOAP service that understands XML.
- API documentation in the OpenStack API format.
- The Level2 MQ Maven plugin, which allows us to generate all CXF services automatically. Yes, you read that right: we automatically generate all API frontends from the MQ code. It's still under construction, but we are using it in the next project for the full API.
That's almost all. What do you get with this? A lot of useful abilities. This is how we work.
First we create a database that fits our requirements. For us data is the most important thing, so we make sure the database is designed first. Then we generate the code to access it. This code is shared among the microservices, so it is fully accessible everywhere and loaded with useful utilities, like an optimized search engine.
Then we design our internal API. We normally structure the services in what we call modules (basic pieces of operations on the DB); they do database access and small operations over our entities. We also have module frontends, which define more complex operations and compose the small services when required. We have a few rules for module frontends; for example, direct DB access is forbidden.
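As an illustration of that layering, here is a hypothetical sketch (the entity, module and frontend names are invented, and a dict stands in for the real database): modules touch storage, frontends only compose modules.

```python
# Hypothetical sketch of the module / module-frontend split described above.
# A module does the direct database access; a frontend composes modules and
# is forbidden from touching the database itself.
class UserModule:
    """Module: small, direct operations over the storage layer."""
    def __init__(self, db):
        self._db = db                      # dict stands in for the real DB
    def get(self, user_id):
        return dict(self._db[user_id])
    def save(self, user_id, data):
        self._db[user_id] = data

class UserFrontend:
    """Module frontend: composes module calls; no DB access allowed here."""
    def __init__(self, users):
        self._users = users
    def rename(self, user_id, new_name):
        user = self._users.get(user_id)    # always goes through the module
        user["name"] = new_name
        self._users.save(user_id, user)
        return user

db = {1: {"name": "old"}}
frontend = UserFrontend(UserModule(db))
print(frontend.rename(1, "new"))  # {'name': 'new'}
```

Keeping DB access out of the frontends means the storage layer can be regenerated or swapped without touching the composition logic.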
Once everything is done, we use the automatic code generation tool to generate the API. The tool parses the code generated in the previous steps and creates fully configured CXF services for SOAP and REST, with JSON and XML for free.
The generate-sources Maven target on those projects automatically generates API docs in HTML format, fully commented and with examples (WIP)!
Isn’t that great?
The great thing about it is that it hides all the complexity: the Level2 MQ library creates and configures the RabbitMQ exchanges and queues on boot, so no manual configuration is required. The programmer only has to define the name of the queue, and even that can be omitted and you still get a fully compliant service. It takes far less configuration than a CXF service.
We have to update the project wiki: now only a class annotation is required to make everything work.
We are never quite happy with what we do; there's always room for improvement. So in the coming months we will be changing a lot of things to address possible problems.
STILL SYNC ON API FRONTEND
We are still synchronous on the API frontend: you send a request and wait for the response or a timeout. This is more about how to deal with processing than about microservices themselves, but we want to do it another way.
We are thinking about a streaming mode: you send a request and, instead of the response, you receive a token that points to the response.
That way the reply is immediate; no more waiting. The request is placed in Kafka and processed in order. The caller can do other things while processing is ongoing, or just wait for the token to be fulfilled and the response to become available; it depends on the action.
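A rough sketch of the token idea, using a thread pool in place of the Kafka-backed pipeline (the names and the upper-casing "work" are illustrative): the call returns a token immediately, and the caller exchanges the token for the response later.

```python
# Token-based responses: submit() returns at once with a token while the
# work runs in the background; fetch() trades the token for the result.
# The thread pool stands in for the Kafka consumers described above.
import uuid
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=2)
pending = {}                         # token -> future holding the response

def submit(request):
    token = str(uuid.uuid4())
    pending[token] = pool.submit(str.upper, request)  # placeholder work
    return token                     # immediate: the caller never waits here

def fetch(token, timeout=None):
    return pending.pop(token).result(timeout=timeout)

token = submit("hello")              # returns right away
# ...the caller is free to do other work here...
print(fetch(token))                  # HELLO
```

Between submit and fetch the caller is never blocked, which is exactly the "no more waiting" property; the timeout on fetch keeps the old synchronous behavior available when an action really needs it.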
We are still thinking about this, but with this way of working it might make sense to migrate everything to something like Apache Storm. As I said, we are still considering it, but it would remove a lot of waiting in some places. We are reading a lot about backpressure.
We know that serializing and deserializing so many messages in JSON makes the system spend a lot of time on that task, and it will only increase as traffic and services grow. But we don't want to lose the ability to inspect messages on the fly.
This is why we are adopting Apache Thrift for serialization/deserialization. That way we will have a binary, framed (and maybe encrypted) protocol for production, and plain JSON in test or whenever needed.
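That switch can sit behind a tiny interface so the codec becomes a configuration choice, as in this sketch. Thrift is not in the Python standard library, so `pickle` stands in here for the binary side; in the real stack it would be a Thrift framed protocol.

```python
# Pluggable serialization: services talk to a common dumps/loads interface,
# so JSON (human-inspectable) and a binary codec (fast) can be swapped by
# configuration without touching the services themselves.
import json
import pickle

class JsonSerializer:
    def dumps(self, obj):
        return json.dumps(obj).encode("utf-8")
    def loads(self, data):
        return json.loads(data.decode("utf-8"))

class BinarySerializer:              # stand-in for a Thrift framed protocol
    def dumps(self, obj):
        return pickle.dumps(obj)
    def loads(self, data):
        return pickle.loads(data)

def make_serializer(env):
    # test/dev keeps readable JSON; production switches to the binary codec
    return JsonSerializer() if env == "test" else BinarySerializer()

message = {"corr_id": 7, "body": "hello"}
for env in ("test", "production"):
    codec = make_serializer(env)
    assert codec.loads(codec.dumps(message)) == message  # codec-independent
```

Because both codecs round-trip the same neutral message, flipping the environment flag changes the wire format without any change to producers or consumers.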
We are using JBoss for historical reasons, as I said before. It used to manage transactions, connection pools and connections, but with the RabbitMQ+Spring+Hibernate stack we don't need that anymore. In fact we currently run inside the container, but connections are managed by our library, and transactions can be managed by the Spring transaction manager.
Why does it still make sense to use JBoss? Because it's a well-maintained piece of software that provides a lot of useful libraries, which makes it much more attractive to financial companies: they have Red Hat behind it to respond to security issues in those libraries.
Honestly, this is the only reason we still keep it.
IMPROVE DEPLOYMENTS AND OPERATIONS
When you have a thousand services to maintain, it is a little difficult to keep control. We are thinking about a Docker infrastructure, and maybe OpenStack for server and resource management. The problem is that not many system administrators have the know-how.
So sometimes it is difficult to introduce these kinds of technologies to companies, because there's an inherent risk for them. Again, Apache Storm can help here, but operating it requires qualified personnel.
Even though there are other ways to do things, it makes sense, at least for us, to detach the frontend from the real processing. It keeps our programmers focused, without having to think about REST, SOAP or anything else; they just concentrate on what they are doing.
It also lets unit testing be done precisely. Even integration testing can be done without hassle, because the test framework can deploy only what a test requires, run the test, and tear the services down.
We saw an important improvement in maintainability and productivity, because everyone seems more focused and there are fewer distractions.
The customer didn't notice any slowdown, and reported that things are working better than with the monolithic version. So far, it's working well.
We'll see, when the thing scales, whether our efforts pay off…