Monday, September 26, 2011

A few answers to CQRS questions

I have tried to pick a few common CQRS questions and supply my own answers:

1) I'm really missing some definition of WHEN to use CQRS. Collaboration, I know, but 90% is collaboration in my book. A Customer table for example. And Orders, Invoices and shipping tables. Can those be accessed and edited by more than one user? Yes - well there's collaboration!

The point here is that, yes, they can be edited by more than one user, but are they really? How high is the probability of having more than one user updating the same Customer simultaneously? I would say close to zero. The same goes for the other entities.

Sure, all of the users are working together to keep the customer table in sync with reality, that is collaboration. But it is not collaboration in the way of working on the same shared resource!

But we still need to handle that 0.001% chance of simultaneous updates! Sure - so add a simple version check (first update wins, second update gets a notification). That is not CQRS with one-way commands - that's plain old synchronous, locking database updates.
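A minimal sketch of such a version check, assuming a hypothetical `customer` table with a `version` column and using SQLite only to keep the example self-contained. The names `update_customer_name` and `StaleUpdateError` are made up for illustration:

```python
import sqlite3


class StaleUpdateError(Exception):
    """Raised when the row changed (or vanished) after the caller read it."""


def update_customer_name(conn, customer_id, new_name, expected_version):
    # The UPDATE only matches if the version is still the one the user read.
    cur = conn.execute(
        "UPDATE customer SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, customer_id, expected_version),
    )
    conn.commit()
    if cur.rowcount == 0:
        # Nothing was updated: someone else got there first.
        raise StaleUpdateError(f"customer {customer_id} was modified by someone else")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
conn.execute("INSERT INTO customer VALUES (1, 'Acme', 1)")
conn.commit()

# Two users both read version 1, then both try to save.
update_customer_name(conn, 1, "Acme Ltd", expected_version=1)  # first update wins
try:
    update_customer_name(conn, 1, "Acme Inc", expected_version=1)
    second_update_rejected = False
except StaleUpdateError:
    second_update_rejected = True  # this is where the user gets the notification
```

The whole thing is synchronous: the second user learns about the conflict immediately, in the same request that tried to save.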

So when are we collaborating on shared resources and expecting lots of simultaneous work? Good question (and I do not have a good answer). The fact that so few examples exist indicates that CQRS is not normally required.

2) What about the fact that many implementors feel that things get much simpler just by having read and write separation? Isn't that a reason in itself to use CQRS?

You can separate reading and writing without CQRS and eventual consistency. All you need is two different code bases working on the same database table. Very low tech. Maybe throw in multiple database views to support more or less complex queries.
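As a rough sketch of what that low-tech separation could look like, here are two small classes that share one database: one only writes, the other only reads through a view. The `orders` table, the `order_totals` view and both class names are assumptions made up for this example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        amount REAL NOT NULL
    );
    -- A view gives the read side its own shape without a second database.
    CREATE VIEW order_totals AS
        SELECT customer, COUNT(*) AS order_count, SUM(amount) AS total
        FROM orders
        GROUP BY customer;
""")


class OrderCommands:
    """Write side: validates input and changes state."""

    def __init__(self, conn):
        self.conn = conn

    def place_order(self, customer, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO orders (customer, amount) VALUES (?, ?)",
                (customer, amount),
            )


class OrderQueries:
    """Read side: SELECTs only, shaped for what the screens need."""

    def __init__(self, conn):
        self.conn = conn

    def totals_for(self, customer):
        row = self.conn.execute(
            "SELECT order_count, total FROM order_totals WHERE customer = ?",
            (customer,),
        ).fetchone()
        return row or (0, 0.0)


commands = OrderCommands(conn)
queries = OrderQueries(conn)
commands.place_order("alice", 10.0)
commands.place_order("alice", 5.5)
alice_totals = queries.totals_for("alice")
```

Reads are consistent with writes as soon as the transaction commits, so there is no eventual consistency to deal with.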

3) But having different read and write databases helps scaling out.

Yes. So does plain single-write-master/multiple-read-slave database replication. If scaling out the query part is your problem then use master/slave on the database level. If scaling out writing is your problem then look into CQRS for that particular use case which requires it!
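On the application side, master/slave replication mostly comes down to choosing which connection to use. The sketch below assumes the database handles the replication itself; `ConnectionRouter` and the server names are hypothetical placeholders, not part of any real library:

```python
import itertools


class ConnectionRouter:
    """Sends writes to the single master and spreads reads over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slaves are configured.
        self._slaves = itertools.cycle(slaves or [master])

    def for_write(self):
        return self.master

    def for_read(self):
        # Simple round-robin over the read replicas.
        return next(self._slaves)


router = ConnectionRouter("db-master", ["db-slave-1", "db-slave-2"])
write_target = router.for_write()
read_targets = [router.for_read() for _ in range(3)]
```

Keep in mind that slaves typically lag slightly behind the master, so a read right after a write may need to go to the master.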

Scaling out writing is required if you have lots of issues with database deadlocks and timeouts due to intensive locking of the database. This won't happen if all your users do is work on different entities all the time.

4) Adding some repository and an event store might be hard the first time, but after that it's really simple. Right?

In my limited experience - No. But I may certainly be wrong and have used the wrong tools. In theory it is simple, yes; in practice, no. It adds complexity - not that much, but enough to add friction to your project. If you are reading this for some advice on CQRS - well, you might just have experienced that friction and started wondering why.

CQRS adds extra time spent on infrastructure for simple problems, when you should be spending time on complex problems, while keeping simple stuff, well, simple. See also my previous post: http://soabits.blogspot.com/2011/09/why-cqrs-may-not-be-answer-you-are.html.


Please, go ahead and use CQRS! I am not saying "no, do not" (who am I to do that?) - I am just sharing my experience.

/Jørn

3 comments:

  1. Hi Jorn.

    It's good to see that not everybody is jumping on the hype without thinking about it first.

    As far as I can tell, you are not very far off the truth, but I wanted to quickly help clarify things.

    First and foremost: CQRS as an architectural pattern has nothing to do with event sourcing, eventual consistency, messaging, pub/sub, denormalized views and whatever other patterns are being confused with the term CQRS nowadays. CQRS is simply having two separate models, one for reads and one for writes, that are accessed via queries and commands respectively. So basically what you say in 2) is pretty much what CQRS is all about.

    As you described in 2), 3) and 4) you don't need all the fuss. Just use common sense and apply complementary patterns like view denormalization or event sourcing as needed. That's what CQRS is all about: Simplification by viewing Read and Write as separate concerns.

    Concerning collaboration: It's not about concurrent access of the same type, it's about modelling the domain. Different roles might modify the same aggregate with different intentions at different points in time/the aggregate's lifecycle.

    Whenever different people or different roles have use cases concerning a common aggregate, it might be considered collaborative imho.

    So when do we apply DDD/CQRS?

    Whenever a bounded context has a really complex model underneath AND it has a significant impact on our application's value. The blue book calls it the "core domain". For example, don't use CQRS on the accounting context, since this is pretty much a solved problem. (Except when you're creating that new accounting app and thus it's your core domain, of course.)

    When do we apply Event Sourcing?

    Most teams probably shouldn't at all. Persisting state in a relational db and feeding your events into an event log pretty much gives you all you need:

    The state-based db remains your single (and well-known) source of truth. And if you need to go back in time or project some new views or reports off historical data, go through your event log and take what you need. But even an event log is an addition to, not a core part of, CQRS.

    When do we denormalize our views into a separate datastore?

    We already do most of the time. It's called caching. It helps with performance issues and scalability. But as you correctly stated, it's not needed until it's needed. And it's yet another pattern being confused with CQRS.

    The same goes for messaging, integration patterns, etc.

    Long story short: The difficult part is not (and shouldn't be) the infrastructure. The difficult part is modelling your domain. Focus on that and use CQRS in whatever bounded context it may help reduce the complexity of trying to stuff reads and writes into one big model.

    Everything else, I'd consider premature optimization.

    Any thoughts?

    Dennis

  2. > That's what CQRS is all about: Simplification by viewing Read and Write as separate concerns.

    Yep. Just looked it up at Fowler http://martinfowler.com/bliki/CQRS.html and Greg Young http://codebetter.com/gregyoung/2010/02/16/cqrs-task-based-uis-event-sourcing-agh/. At its very core it is certainly neither difficult nor an overhead to anything.

    So why are people having trouble with it? Because by that definition you only get cleaner code, which of course is a good thing in itself, but you don't get the scalability that people buy into CQRS for, which is where CQRS gets difficult.

    The fact that we have "simple" CQRS and "advanced" CQRS will certainly lead people astray. When one person says CQRS is simple and another says it helps scalability - well, then most of us will fall into the trap and believe it is easy to do the advanced CQRS stuff. I did.

    Since the advanced CQRS is built on distributed computing using messaging and so on, we might call it Distributed CQRS (DCQRS) to clarify what we are talking about. For the rest of this text, that is what I am referring to.

    > Whenever different people or different roles have use cases concerning a common aggregate, it might be considered collaborative imho.

    Yep. That is what I mean when I talk about "working on the same shared resource". You just got it more correct :-)

    > So when do we apply DDD/CQRS?
    > Whenever a bounded context has a really complex model underneath AND it has a significant impact on our application's value.

    Yep. Keep the simple stuff simple, and throw your resources into the complex stuff. Sounds easy.

    It is unfortunately a pitfall for many people - to understand something we usually try it out on something simple before jumping on a bigger project. So we apply the DCQRS techniques to the Customer Care / Blog Post / Tweet / What-ever-CRUD scenarios and smash into a wall of technical issues. Then we think, Duh! What am I doing wrong? And it is *so* hard to see that the technical stuff may in fact be done right while it is the choice of business example which is wrong.

    As Fowler puts it: "CQRS is a significant mental leap for all concerned, so shouldn't be tackled unless the benefit is worth the jump".

    If a developer goes fully DCQRS with eventual consistency they quickly turn to a mailing list with the question "But how do I make sure my UI is consistent with my input?". Then someone says "You don't!" and that leaves the poor guy with that "Duh?" feeling. Eventual consistency and one-way commands that never fail seem like such simple concepts, but turn out to be a lot more tricky than expected.

    What makes the tricky part even more tricky to developers is that you avoid it by choosing your business domain very carefully. Unfortunately that means understanding business and *that* is not something we are trained to do as developers.

    > When do we denormalize our views into a separate datastore?
    > We already do most of the time. It's called caching.

    Great point!

    /Jørn

  3. The state-based db remains your single (and well-known) source of truth. And if you need to go back in time or project some new views or reports off historical data, go through your event log and take what you need. But even an event log is an addition to, not a core part of, CQRS.
