The neat thing about this project is that it’s greenfield but… not. We understand our domain pretty well and we have an existing system to use as a reference; we just need to bring it up to date in terms of technology and practices. Now, obviously, rebuilding a system isn’t something you generally want to do. Ideally you’d extend the existing system and only write the genuinely new functionality on the new platform. That’s going to be tough here, but we’re going to look for those opportunities as we go. Everything else we may just have to replace outright. I have a feeling it’s going to end up being mostly replace…
The more I learn about the legacy system, the more impressed and appalled I become. You see, the details of this system have been hidden from me since I started working here. The team lead and two other veterans always took the maintenance tasks for themselves, and they never shared anything about the system, let alone documented any of it. Now at least two of them are leaving the company, so naturally we’re rushing to glean as much information as we can, enough to at least put out fires in the legacy system.
This system is probably the most intense one I’ve ever had to deal with. The original authors reinvented the wheel at every turn. Why? The apparent motivation was cost. The system was written in PowerShell. Yes. POWERSHELL. Business logic, message queues, clustering… with the exception of a SQL database and Azure queues, everything was built on Windows and PowerShell. Nothing is in source control, and there are thousands of .ps1 files strewn across a number of servers. Workflows are handled by a SQL job acting as a heartbeat: it runs procs that assess the state of things and then places the names of PowerShell scripts into tables; other jobs wake up and have servers run those scripts, which process files and move them into other queues, and the cycle repeats until a workflow completes. That’s the short story… the reality is pure madness. Obviously this isn’t maintainable.
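To make that concrete, here’s roughly the shape of that dispatch loop, sketched in TypeScript rather than the actual PowerShell/T-SQL (which I don’t have in front of me anyway, since none of it is in source control). Every table, column, and function name here is made up; the point is just the pattern: a heartbeat job writes script names into a table, and worker jobs poll that table and shell out to PowerShell.

```typescript
// Conceptual sketch of the legacy workflow engine, NOT the real code.
// All table, column, and function names are hypothetical.

interface Db {
  query(sql: string, params?: Record<string, unknown>): Promise<any[]>;
  execute(sql: string, params?: Record<string, unknown>): Promise<void>;
}

// Stand-ins for a stored proc's decision logic and for shelling out to PowerShell.
declare function nextScriptFor(step: string): string;          // e.g. "ProcessBatch.ps1"
declare function runPowerShell(scriptName: string): Promise<void>;

// The "heartbeat": a scheduled SQL job that assesses workflow state
// and queues up the names of the scripts that should run next.
async function heartbeat(db: Db): Promise<void> {
  const inProgress = await db.query(
    "SELECT WorkflowId, CurrentStep FROM Workflows WHERE Status = 'InProgress'"
  );
  for (const wf of inProgress) {
    await db.execute(
      "INSERT INTO WorkQueue (WorkflowId, ScriptName) VALUES (@id, @script)",
      { id: wf.WorkflowId, script: nextScriptFor(wf.CurrentStep) }
    );
  }
}

// On each server: another job that polls the table and runs whatever it finds.
// (No locking shown; this is only meant to illustrate the shape of the thing.)
async function workerLoop(db: Db): Promise<void> {
  const [item] = await db.query(
    "SELECT TOP (1) WorkQueueId, ScriptName FROM WorkQueue ORDER BY WorkQueueId"
  );
  if (!item) return;
  await db.execute("DELETE FROM WorkQueue WHERE WorkQueueId = @id", { id: item.WorkQueueId });
  await runPowerShell(item.ScriptName); // processes files, then queues the next step
}
```

A database table doing double duty as a message queue, with scheduled jobs as the producers and consumers; that’s the heart of it.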
Here are some of the things we’re hoping to address:
- Much of our IP is very specific to working in a particular industry. We’d like to re-tool it to be flexible enough to extend into other industries.
- We should be able to scale appropriately instead of just throwing money at hardware to increase our ability to process.
- We lack a productive culture. Our development process is almost non-existent, everything is top priority, there’s little design and zero documentation, there’s no coherent architecture or coding discipline/standard.
So back to the interesting stuff. Right off the bat, here’s what I was thinking:
- We now know where we stand and where we want to go. In other words, we’re beyond the “don’t put the cart before the horse” phase in terms of architecture.
- We have enough complexity inside of each “topic” of what we do to warrant a microservice architecture. It’s time to do it.
- Microservice architecture introduces a lot of other complexities, such as eventual consistency and async messaging. The obvious fit in terms of application model is CQRS. This works out well because CQRS copes well with an ever-changing domain model, and that describes us pretty well, I think (there’s a sketch of what I mean after this list).
- Many argue that you shouldn’t do event sourcing unless you have a real use case for it. My proofs of concept have repeatedly shown me that if you’re going to do CQRS, you really should do event sourcing as well. I’ll talk about this more later on.
- We’ve been talking a lot about some pretty intricate systems we’d like to implement that would warrant a graph database, so we’ll be looking into Neo4j.
- There are a lot of custom search-type applications in our customer dashboard, so we’ll be looking into Elasticsearch.
- We have serious problems with object-relational impedance mismatch, so we’ll be looking into NoSQL databases in general.
- With CQRS, I’m finding fewer and fewer strong use cases for a relational database.
- Given those last few points, we’re clearly looking at a polyglot data architecture. Perfect. CQRS facilitates this very well.
- We need a service bus. NServiceBus it is!
- We want to fix a matching system that’s currently done with some intense regex sorcery. We could probably improve its accuracy, speed, and maintainability by using some machine learning.
- Managing scaling is gonna be a biatch… should probably leverage Docker.
- Logging is gonna be a bit different… Loggly or Seq, maybe?
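Since CQRS and event sourcing come up several times above, here’s a minimal, library-agnostic sketch of the shape I have in mind. Nothing below uses the actual NServiceBus or EventStore APIs, and all of the names (OrderPlaced, shipOrder, the EventStore interface, and so on) are hypothetical. It just shows the flow: the write side rebuilds an aggregate’s state by replaying its events and appends new ones, while projections fan the same events out into whatever read store suits each query, which is exactly what makes the polyglot data story work.

```typescript
// Minimal CQRS + event sourcing sketch. All names are hypothetical illustrations,
// not any particular framework's API.

type Event =
  | { type: "OrderPlaced"; orderId: string; customerId: string; total: number }
  | { type: "OrderShipped"; orderId: string };

interface EventStore {
  load(streamId: string): Promise<Event[]>;
  append(streamId: string, events: Event[]): Promise<void>;
}

// Write side: current state is just a left-fold (replay) over the stream's events.
interface OrderState { placed: boolean; shipped: boolean }

function replay(events: Event[]): OrderState {
  return events.reduce<OrderState>((state, e) => {
    switch (e.type) {
      case "OrderPlaced":  return { ...state, placed: true };
      case "OrderShipped": return { ...state, shipped: true };
      default:             return state;
    }
  }, { placed: false, shipped: false });
}

// A command handler: validate against replayed state, then append new events.
async function shipOrder(store: EventStore, orderId: string): Promise<void> {
  const state = replay(await store.load(`order-${orderId}`));
  if (!state.placed || state.shipped) throw new Error("Order can't be shipped");
  await store.append(`order-${orderId}`, [{ type: "OrderShipped", orderId }]);
}

// Read side: projections subscribe to the same events and write denormalized
// views into whichever store suits the query.
async function projectToReadModels(e: Event): Promise<void> {
  switch (e.type) {
    case "OrderPlaced":
      // e.g. upsert a document view and index it for search
      break;
    case "OrderShipped":
      // e.g. update the customer's order-history view
      break;
  }
}
```

The write model stays small and behavior-focused, while each read model is free to be denormalized and stored wherever it makes sense: MongoDB documents, an Elasticsearch index, a Neo4j graph. That’s why the relational use cases keep shrinking.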
This is going to be… educational… challenge accepted.
Todo — learn:
- MongoDB
- Neo4j
- EventStore
- Elasticsearch
- NServiceBus
- CQRS
- Docker
- Loggly or Seq
The CTO and I feel pretty good about these instincts. Next step: getting buy-in from the team since EVERYONE is going to have to change the way they think about software…
In the meantime, I’ll be proving out these concepts and writing core libraries to help abstract away the complexity of some of these practices. After all, if I’m going to push these things on the team, I have to be able to champion them.