Ideas on how to use CIM NMM together with Event Sourcing

Introduction

In the DSO world in Denmark, we’ve seen an increased focus on optimizing the flow from planning to construction, to operation, and to maintenance and asset management.

Also, as part of the CIM Network Model Management (NMM) initiative, new concepts such as projects, stages, and change sets have been introduced in the CIM standard to support coordination and modification of data across processes and systems in a utility.

That is, the use and sharing of network model data between business units (aka business silos – which are difficult to avoid due to Conway's law), while making sure data quality is not broken in processes spanning multiple departments. The goal is that more efficient data-driven decisions and automation can be made in the utility as a whole, with the end goal of saving costs.

One concrete example of functionality that several DSO users have asked for recently is the ability to get near real-time feedback on whether they conform to the data quality rules set out by the DSO – i.e. whether the connectivity and topology are correct, whether the needed electrical attributes are present, etc. They want this feedback while they do their network modelling, to prevent bad data from flowing into other processes – construction, operation, network analysis, asset management, etc.

In other words, the users would like a fast QA feedback loop while they're creating and modifying network models – e.g. when projects are sketched, detailed for use in the construction process, implemented, documented, etc.

Also, the request for more processes to be automated – e.g. automatically generating a full node-breaker model, BOMs, etc. from a project sketch in near real time – has put more pressure on the IT architecture.

The aforementioned functionality request has been on my mind for some time now, and it gives me a headache, because right now, in the current production systems, we cannot offer this kind of functionality in near real time. We can give them feedback with a delay of half an hour or less, by reading the whole network model from GIS, converting it to CIM, doing topology processing, generating derived data, checking for errors, etc. But that's not good enough!

One problem is that the current systems used for network modelling are built around traditional relational (SQL) databases that hold only the current state of the network model. It's not possible to get changes out of these systems in a fast and reliable way – at least not without a little hacking and tweaking, which you can read more about at the end of this post.

In Danish DSOs, network modelling is typically done in GIS initially, and these systems have so-called versioning functionality, but many times it is disabled, or history is thrown away (e.g. every night), because otherwise the system becomes very slow.

Also, the internal event streams are typically not available to the outside world, at least not in a fast and reliable way. The systems are not built with event sourcing in mind – i.e. events are not first-class citizens that reflect the business domain. Furthermore, the state presented to the user is not derived from events, the way a true event-based system would do it.

I believe many of the IT problems we have in the utility domain (especially if we look at the whole chain of processes and not just the editing of the as-built network) are due to accidental complexity (in many cases we use the wrong methods, information modelling and technologies for the job), and that switching our mindset and systems to event stream processing will help with that.

It's not that SQL databases are bad. They are very powerful, and we need them. However, many of the problems we've been spending tons of resources solving could be solved way more elegantly using event sourcing as the fundamental philosophy, in combination with other technologies like SQL databases, graph databases, OLAP systems, etc.

The problem is that if we are forced to use only SQL databases, only focus on modelling data structure (i.e. what datasets and attributes the business needs), and don't focus on modelling behavior (what events are going on in the different business processes), then we are pretty much stuck. Shoehorning paper forms and paper maps into systems with no clue about the business processes worked in the 90s – not today!

If we put some effort into modelling our processes in terms of events, it will be a lot easier to build variant schemas (to be used for efficient and advanced querying – e.g. temporal queries, graph processing, event projections), handle occasionally connected clients, and create scalable and highly available systems.

Looking at four promising trends in dealing with complex enterprise systems – microservices, continuous delivery, domain-driven design and event sourcing – I think event sourcing is the most interesting in terms of dealing with the core problems and challenges we struggle with right now in our industry.

Event sourcing encourages you to model the business domain from a behavioral and temporal viewpoint, instead of a structural and state viewpoint. Event storming, some people like to call this process. This is really powerful.

Like the feeling scientists must have had experimenting with the different ways Newton and Einstein modeled the world, I feel butterflies in my stomach: event sourcing is a very powerful way to model and reason about utility business processes in a more fundamental way than the traditional modelling techniques we have been using for decades – e.g. UML.

When babbling about these ideas of mine to my friend Mogens Heller Grabe, creator of Rebus, he said to me several times: Do you know about Greg Young? You should take a look at Greg's event store. It's really interesting… Okay, I get it, I'll look into it – and I finally did 🙂

And thanks to Mogens for that, because it gave me a lot of ideas on how to actually implement this concept – ideas that I'd like to present here, in the hope that someone else in the power utility world might think they are great, so we can try them out in real life on some real (innocent) users 🙂

Actually, I have already sneaked some event sourcing into an ongoing project – a tool for creating power engineering network models and documenting the as-built network in the field. Now I just hope the customer will be excited when they see the possibilities event sourcing gives them. But I'm not really worried about that, because it's really powerful! I'm very happy that I went down the rabbit hole into the event-sourced world, even though I had to refactor a lot of code.

Background

It's important to understand the utility domain and its processes to make the event sourcing concept work in the utility world. In other words, we need to be sharp when modelling all the categories of events going on in a utility.

I will not go into detail about utility processes here, but will explain some of the basic ones, and how they, in my view, relate to event sourcing.

However, before doing that: if you're new to event sourcing, I can highly recommend looking at Greg Young's and Martin Fowler's recent presentations on YouTube.

For a more fundamental introduction, Rich Hickey has an awesome talk about values vs. places.

You can search YouTube to see if newer event sourcing presentations exist. These smart people get wiser every day, so it's always a good idea to look for their latest presentations.

If you're an IT geek, then I can recommend Greg Young's "In The Brain" session about using projections for complex event processing, and his 6.5-hour CQRS class on YouTube.

Another great, but slightly geeky, post on event sourcing is Jay Kreps' "The Log: What every software engineer should know about real-time data's unifying abstraction", which I personally think is genius, because he is really good at explaining the problem space and the fundamentals of event sourcing.

I've searched for dedicated books on this topic, but didn't find any "must have" ones. If you know a good book on event sourcing, I'm all ears.

The as-built network model

The as-built network is a central concept that is really important to understand and take seriously as an IT architect when modelling IT systems in the power system domain. I see many IT people who don't understand the importance of this concept, and it leads to all kinds of problems.

With our event hat on, the physical as-built network model is all the construction events that led up to the network we have out in the real world right now.

That is, every time a substation, cable box, cable, meter, etc., including the equipment inside these containers, is added, modified or removed in the network, this is to be modeled and stored as an event in the IT system, reflecting the work done by some field crew in the real world.

In the CIM world this is known as the as-built physical network model, which is modeled around the equipment information model in IEC 61970. The reason I say around is that the equipment model functions as the electrical view of the network, whereas other models in IEC 61968 are needed as well – e.g. asset information.
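To make this a bit more concrete, here is a minimal sketch (Python, purely illustrative) of what such as-built construction events could look like. The class names and fields are my own assumptions, loosely aligned with the IEC 61970 equipment model – nothing here is prescribed by the CIM standard itself.

```python
from dataclasses import dataclass, field
from datetime import datetime
from uuid import uuid4

# Hypothetical as-built construction events. Names and fields are illustrative
# assumptions only, loosely aligned with IEC 61970 equipment classes.

@dataclass(frozen=True)
class EquipmentAddedToNetwork:
    equipment_mrid: str          # CIM master resource identifier
    cim_class: str               # e.g. "PowerTransformer", "ACLineSegment"
    container_mrid: str          # the substation, bay or line containing it
    constructed_at: datetime     # when the field crew did the work
    work_order_id: str | None = None
    event_id: str = field(default_factory=lambda: str(uuid4()))

@dataclass(frozen=True)
class EquipmentRemovedFromNetwork:
    equipment_mrid: str
    removed_at: datetime
    work_order_id: str | None = None
    event_id: str = field(default_factory=lambda: str(uuid4()))

# These events would be appended to a dedicated "as-built physical network"
# stream, separate from measurements, switching actions and project events.
```

Each event records work actually performed in the field; the current as-built model is then just a left fold over this stream.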

So, for sure, we should have a physical as-built network event stream. But that's just one category of events that can happen in a real-life network. It's important to be able to handle the other categories of events as well.

E.g. opening or closing a switch is normally considered a more dynamic thing, and is not a change to the physical network model in most power engineers' book. They like to think in terms of static vs. dynamic network changes.

A measurement or alarm is also an event that happens in a real-life network, but this again is another event species that we typically need to separate out into its own event stream. Imagine millions of measurement events per hour, in contrast to as-built physical network changes, occurring maybe 10-50 times a day.

Also, what about replacing some asset – i.e. replacing a power transformer or a breaker, but not changing its specs? Is this a change to the physical network or not? Well, it depends on whom you ask. If you ask a power engineer in planning, she might say: I don't really care. If you ask a power engineer in operation, he might say: Sure, I need to know when stuff is replaced, by whom and when.

We should understand and document carefully these different contexts, concepts and types of network model changes that power engineers and other utility professionals reason about, when modelling our events.

The event categories I just mentioned are actually so-called bounded contexts, according to DDD principles. They typically belong to a particular domain model – i.e. equipment, asset, work, etc.

It's crucial to understand these contexts, and IT and domain experts (users) should agree on them, creating a ubiquitous language.

CIM can help us a lot, because the CIM experts have already done a great job defining a language for many contexts, in terms of CIM models and profiles (e.g. CGMES). The real problem is to get people to invest time in understanding CIM and the lessons learned from other CIM projects. But that's another story.

But CIM is not perfect. It's not a silver bullet. We get into trouble if we think of CIM as a perfect domain model or some plug'n'play kind of interface specification. It's an evolving standard, and that is actually what makes it powerful. However, most of the standard is modeled around UML class diagrams, which can be dangerous, because it might lead to an anemic domain model.

To make a long story short, I believe event sourcing in combination with the CIM standard can prevent many problems and lead us to a more flexible architecture. I'm not saying that the domain models and model-driven architecture that CIM is based on are bad. I'm just saying that we need more than that to deal with the increasing need for semantic interoperability, and that event sourcing looks really promising in terms of achieving such a goal.

Project data

The power engineer plans changes that have to be applied to the as-built network in the future. Such a project might be constructed, and/or abandoned, in stages and in parts. It might be revised and changed before it is constructed in real life, and often also changed while being constructed.

Projects may depend on each other too – i.e. be stacked on top of each other. Also, the as-built network might change (in a previous project) before the construction work is started in a following project, rendering the later project's specification invalid, because the first project had to change due to conflicts with reality.

In other words, projects and the construction process are much more fun in terms of information modelling than just the process of registering the as-built network 🙂

The big question is: with our event hat on, how should we deal with this kind of "mess"?

First of all, because of the changing nature of projects, and because they are thought-stuff as opposed to real-life stuff, I would personally make very sure that they are totally separated from the events going on in the as-built network world. We should not mix these two types of events, at least not conceptually. There must be some kind of strong boundary here, and we should keep it in mind in terms of performance and reliability as well.

Many users in the utility don't really care about project events. The operations/SCADA people don't want this mess in their systems before the stuff is actually constructed, or at least not before they are sure it will be constructed in a certain way.

It's crucial for operations people to have a consistent as-built network model to be able to operate the network. Also, it's important that the systems used by operations are up and running all the time. The messy nature of projects should not be able to mess up operations, so to speak.

However, this is how the processes run right now, and I'm not sure it will stay that way in the future. The project mess will no doubt sneak into operations processes too – so-called operation-related projects, i.e. when they have to do some complex switching in combination with construction work going on, etc.

So we need event streams for projects, where we track everything that goes on in all the projects, including when the engineer changes his/her mind – e.g. because the as-built network has changed in the meantime, or whatever the reason might be. And we need to foresee that project data/events will have to be dealt with outside the traditional planning department too.

These kinds of project changes/events are important and very valuable information – but not many systems track them today, at least not in the DSO world. And if you think about it, we sure don't want to lose that information.

The project streams should be a history of all work done on all projects, no matter whether it's just experiments/research, who is doing the changes, whether only a few percent of it is actually going to be constructed, or whether the project is completely abandoned. We need these events to drive automation and quality checking, and to examine how to optimize processes.

When people in the field take over, those are also events we need to store and handle, of course. When the project is actually being constructed, new events are created in the as-built event stream, related to the planned project events. The field crew also creates events, such as enriching data with serial numbers, test reports, etc. We have to deal with all this stuff.
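As a small sketch of how that relation could look, an as-built event could simply carry a reference back to the planned project event it realizes. The field names here are my own assumptions, not from any standard profile.

```python
from dataclasses import dataclass
from datetime import datetime

# Illustrative only: an as-built construction event that references the
# planned project change it realizes. The correlation fields are assumptions.

@dataclass(frozen=True)
class PlannedEquipmentConstructed:
    equipment_mrid: str
    project_id: str                    # which project the work belongs to
    project_event_id: str              # the planned change this work realizes
    constructed_at: datetime
    serial_number: str | None = None   # enrichment captured in the field
    test_report_ref: str | None = None
```

With the correlation id in place, a projection can answer questions like "which planned changes have actually been constructed, and by whom?" without ever mutating the project stream.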

The beauty of this is that, instead of just modifying the current state in some transactional database, we now have a full audit trail, and it's easier to create automated processes that help the users check things, automatically create model parts according to calculations, etc.

Instead of transactions, we think in terms of a flow of events!

E.g. when some engineer is tapping on her/his iPad in the field, creating a sketch project together with the city planner, the initial network dimensioning, BOMs, etc. automatically get created (as well as the automatic process can do it, but it can make quite good guesses), and they instantly have insight into the price, the time to build the project, etc. Just an example.
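Just to sketch what such an automated process could look like, here is a hypothetical handler that reacts to project sketch events and produces a rough BOM and cost estimate. All the names, event shapes and the pricing logic are invented for illustration – a real implementation would call proper dimensioning and price-book services.

```python
# Hypothetical subscriber: reacts to project sketch events and emits a rough
# BOM / cost estimate event. Names and pricing logic are invented.

UNIT_PRICES = {"ACLineSegment": 120.0, "PowerTransformer": 15000.0}  # assumed price book

def handle_project_sketch_updated(event: dict) -> dict:
    """Turn a sketched project into a rough BOM estimate in near real time."""
    bom: dict[str, int] = {}
    # e.g. event["sketched_equipment"] == [{"cim_class": "ACLineSegment", "quantity": 340}, ...]
    for item in event["sketched_equipment"]:
        bom[item["cim_class"]] = bom.get(item["cim_class"], 0) + item["quantity"]

    estimated_cost = sum(UNIT_PRICES.get(cls, 0.0) * qty for cls, qty in bom.items())

    return {
        "event_type": "ProjectEstimateProduced",
        "project_id": event["project_id"],
        "bom": bom,
        "estimated_cost": estimated_cost,
    }
```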

Just like we developers depend on source code versioning as a tool to go back in time, resolve conflicts, etc., power engineers and other people in the utility struggle with the same needs all the time – i.e. getting project plans and real life to match, getting work/plans from different teams and departments merged together, finding out who should be punished, what went wrong, which activities take the most resources and time, etc.

The workspace

This is an interesting concept that I'm still not sure how to deal with, event-modelling-wise. So I need your help! 🙂

The idea is that when a person fiddles with some network modelling, they can do it in an isolated workspace, so they don't annoy anyone else before they are sure that the quality of their modelling is okay. That is, they can check things and experiment before they "release" the information.

But how do we model that elegantly in the event world?

I'm not 100% sure about this one. It's a bit more difficult, because the workspace concept spans different contexts and processes, in my thinking.

When refining a project, it's pretty easy, because we just add changes to the "thought-stuff" event stream. It's in the same context – the planning context – and it's going on inside a project "container". Therefore it's easy to understand, isolate and reason about.

But when we start to construct the stuff in the real world, we are creating new events that reflect what is built according to the project. And here we might run into conflicts.

E.g. we place some equipment in the real world that is rated 200 Amp, but the project says it only needs to be rated 150 Amp, for some reason (the project guy was wrong, or we ran out of 150 Amp equipment, etc.). How do we model such a conflict, and how do we preserve model consistency across the project and the as-built model in the process? We need some kind of continuous-integration-like process modeled using events here.

I believe we should not touch the project data (i.e. add change events to the project event stream to make the project reflect the real world), but just create events in the as-built stream that are related to the project event, or something like that. What the field crew did should not be mixed with what the project guy thought should be done. It's valuable to know what in real life did not match the project expectations, etc.
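One way to capture that, without touching the project stream, could be an explicit deviation event in the as-built stream that records both the planned and the actual value. A sketch, with assumed names and fields:

```python
from dataclasses import dataclass
from datetime import datetime

# Illustrative deviation event: the as-built stream records what was actually
# installed and how it differs from the plan; the project stream stays untouched.

@dataclass(frozen=True)
class AsBuiltDeviatedFromProject:
    equipment_mrid: str
    project_event_id: str        # the planned change this deviates from
    attribute: str               # e.g. "ratedCurrent"
    planned_value: str           # e.g. "150 A"
    actual_value: str            # e.g. "200 A"
    reason: str                  # e.g. "no 150 A units in stock"
    recorded_at: datetime
```

A downstream QA or continuous-integration-like process can then subscribe to these deviation events and decide whether the project needs to be revised, or whether the deviation is acceptable.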

In terms of the workspace, it's important that when the crew in the field is adapting/creating the as-built network model to reflect their work in real life (with respect to some project telling them what to do), they can do this "fiddling" inside a workspace before the model is "pushed" to the as-built model. In other words, the workspace idea is not only useful in the planning context. It's useful in all processes!

In the software world we have Git. Martin Fowler talks about it in the YouTube video mentioned previously. Diff, merge, commit, push, pull, etc. are Git concepts, and we can learn a lot from them. However, as Martin also said, Git processes text documents, and those are simpler to deal with. Domain models in the utility are much more complex.
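One possible way to model the workspace – very much a sketch, not a settled design – is to treat it as its own short-lived stream of candidate events, which only get appended to the shared stream on an explicit "push", a bit like a Git branch. The store interface below is an invented in-memory stand-in, not the API of any real event store.

```python
# Sketch only: a workspace as a short-lived private stream of candidate events.
# On "push" the candidates are validated and appended to the shared stream.

class InMemoryStreams:
    def __init__(self) -> None:
        self._streams: dict[str, list[dict]] = {}

    def append(self, stream: str, event: dict) -> None:
        self._streams.setdefault(stream, []).append(event)

    def read(self, stream: str) -> list[dict]:
        return list(self._streams.get(stream, []))

def push_workspace(store: InMemoryStreams, workspace: str, target: str, validate) -> None:
    """Copy candidate events into the shared stream if they pass validation."""
    candidates = store.read(workspace)
    errors = [err for c in candidates for err in validate(c)]  # validate returns a list of problems
    if errors:
        raise ValueError(f"workspace rejected: {errors}")
    for candidate in candidates:
        store.append(target, candidate)
```

The hard part – the diff/merge side of the Git analogy – sits in the validation step, which is exactly where the conflicts described above show up.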

I'm still not sure how to model this workspace thing using events, and sometimes I wake up at night because my brain got a stack overflow. I need a whiteboard, some strong coffee and some creative domain experts and IT geeks to help out here. And after that, beers and good food at Cafe Gran are on me 🙂

Conclusion

I believe the key to success is to play and experiment with the aforementioned use case scenarios together with some domain experts, on a whiteboard first – figuring out the most elegant way to model our events, and then trying the stuff out in a proof-of-concept test environment, in an agile process together with real users afterwards.

I'm confident that we can solve many of the problems mentioned by using an efficient and flexible event store, such as the one Greg Young created. But as always, it's crucial to get a good grip on the domain and bounded contexts first, and this can only be done the agile way, if you ask me.

After that, we have a foundation of easy-to-reason-about event streams that we can use to feed event stream processors and build projections in whatever technology suits our problem – e.g. graph databases, NoSQL databases, in-memory object structures, etc.

Also, the value we can get out of such an event store in terms of analytics is really fascinating to think about. Give the event streams to some BI/machine-learning geek, and they can do amazing things, I'm pretty sure!

It's depressing to think of all the information we throw away right now – information that we'll likely miss (I'm 99.9% sure we will miss it) when we need to optimize processes and asset management in the future. Right now most of our information is thrown away, only the current state is updated in some monolithic silos, and the business will sooner or later lose money because of this.

If you are from the utility world and would like to try this event stuff out in real life, please contact me.

Also, if you have ideas on how to model the aforementioned concepts using events, I would be very happy to hear from you.

Avoiding the BBoM (Big Ball of Mud)

In Greg Young's event store, it's easy to divide events into multiple streams – e.g. using indexing projections. We must, however, think carefully about how we do this, and not go into technology mode too quickly.

The simpler the projections, the better, I would argue. And they should be aligned with business concepts, so that the power engineer can relate to them; otherwise the system may be very difficult to use and extend, and for IT to operate.

I'm very afraid that we will end up too soon with a BBoM containing many complex projections with tons of complex JavaScript expressions – just like we've seen with SQL, where tons of views, stored procedures, etc. mess up our lives for good.

But done right, I believe these event projections are an extremely powerful concept!
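To illustrate what I mean by a simple, business-aligned projection, here is a small sketch that folds as-built events into a "current equipment per container" view a power engineer can relate to. I'm assuming events are read from the store as plain dictionaries with an event_type field (matching the assumed event shapes earlier); in Greg Young's event store the same thing could be expressed as a projection inside the store itself.

```python
# Sketch of a deliberately simple projection: fold as-built events into a
# "current equipment per container" view. Event shapes are assumed.

def project_current_equipment(events: list[dict]) -> dict[str, set[str]]:
    equipment_by_container: dict[str, set[str]] = {}
    located_in: dict[str, str] = {}

    for e in events:
        if e["event_type"] == "EquipmentAddedToNetwork":
            located_in[e["equipment_mrid"]] = e["container_mrid"]
            equipment_by_container.setdefault(e["container_mrid"], set()).add(e["equipment_mrid"])
        elif e["event_type"] == "EquipmentRemovedFromNetwork":
            container = located_in.pop(e["equipment_mrid"], None)
            if container is not None:
                equipment_by_container[container].discard(e["equipment_mrid"])

    return equipment_by_container
```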

The GIS hack

[Figure: gis_event_hack – rough sketch of the proposed GIS event extraction idea]

As mentioned earlier, we have a problem with the current systems that house our modelling data – e.g. the GIS systems typically used in the DSO world.

To get the benefits of event sourcing, we need an event store containing all events going on in the GIS (where projects and the as-built network are created and refined) that different processes can subscribe to in near real time. That is, we should be able to build up state/projections by processing network change events from an event store.

I would very much like to get started with this event stuff, also in DSOs that have modeled everything in GIS.

Because they will not be able to replace their transaction-minded GIS systems in the near future, we have to create a workaround to get events out of these systems in a fast and reliable way. After that, I believe it will be a lot easier to move the models out of the GIS grip and into an event store, using the strangler pattern. The GIS should only be used for geographical data, and the model in GIS should be a projection from the event store, IMO.

Anyway, the rough sketch above is my proposal for how to get started with that idea.

First, I would like to get changes (CRUD events) out of the GIS database as-is – using SQL Server event notifications or something like that. That is, no transformation. This is because it will be fairly easy to create such an event extractor, and then we no longer lose information. Also, the output will serve as a great event stream for testing the GIS-to-CIM event processing/transformation, etc.

Having a "raw" set of events from GIS, to be used for debugging, experimenting with event processors, etc., is invaluable. Also, it's easier to be sure that we record all events, because the emitter we need to create is much simpler to develop, test and debug.
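A rough sketch of what such an extractor could look like: poll the pending changes in the GIS database and wrap each row into an untransformed CRUD event envelope, one stream per source table. The table/column names and the get_pending_changes helper are placeholders – how the changes are actually surfaced (event notifications, change tracking, triggers) depends on the concrete GIS and database setup.

```python
import json

# Sketch only: wrap raw GIS row changes into untransformed CRUD event envelopes,
# one stream per source table. The change-capture mechanism is injected and
# depends on the GIS/database in use; all names are placeholders.

def to_raw_crud_event(table: str, operation: str, row: dict) -> dict:
    return {
        "event_type": f"Gis{operation.capitalize()}",   # GisInsert / GisUpdate / GisDelete
        "source_table": table,
        "row": row,                                      # the proprietary GIS columns, as-is
        "gis_version": row.get("version_name"),          # which GIS version/session it happened in
        "gis_user": row.get("edited_by"),
    }

def extract(get_pending_changes, append_to_stream) -> None:
    """Poll pending GIS changes and append them, untransformed, to raw streams."""
    for table, operation, row in get_pending_changes():
        event = to_raw_crud_event(table, operation, row)
        append_to_stream(f"gis-raw-{table}", json.dumps(event))
```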

But of course, we don't want to expose these proprietary GIS CRUD events to other systems – that would be insane! We need to map them to CIM change set (delta) events. This is not as easy as it sounds, because we need to create objects that don't exist in the GIS, etc.

In GIS we might have one transformer table with all information regarding a transformer. In CIM we have PowerTransformer, PowerTransformerEnd, RatioTapChanger, etc. This is just an example, and actually one of the easy ones to solve.

What I'm trying to say is that the mapping from a proprietary data model to CIM is not straightforward! And that's why we need the 1:1 event stream, to test and debug the GIS-to-CIM event transformation processor.
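Here is a sketch of the transformer example: one raw GIS row fans out into several CIM objects inside a single change set (delta) event. The GIS column names and the change set shape are assumptions; the CIM class names are the real IEC 61970 ones.

```python
from uuid import uuid4

# Sketch only: map one proprietary GIS transformer row into a CIM change set
# (delta) event that creates several CIM objects. GIS column names are assumed.

def gis_transformer_to_cim_changeset(row: dict) -> dict:
    transformer_mrid = str(uuid4())   # the PowerTransformer does not exist as an object in GIS
    hv_end_mrid = str(uuid4())
    changes = [
        {"op": "create", "cim_class": "PowerTransformer", "mrid": transformer_mrid,
         "attributes": {"name": row["name"]}},
        {"op": "create", "cim_class": "PowerTransformerEnd", "mrid": hv_end_mrid,
         "attributes": {"PowerTransformer": transformer_mrid,
                        "ratedU": row["primary_voltage_kv"],
                        "ratedS": row["rated_power_mva"]}},
        {"op": "create", "cim_class": "PowerTransformerEnd", "mrid": str(uuid4()),
         "attributes": {"PowerTransformer": transformer_mrid,
                        "ratedU": row["secondary_voltage_kv"],
                        "ratedS": row["rated_power_mva"]}},
    ]
    if row.get("has_tap_changer"):
        changes.append({"op": "create", "cim_class": "RatioTapChanger", "mrid": str(uuid4()),
                        "attributes": {"TransformerEnd": hv_end_mrid}})
    return {"event_type": "CimChangeSet", "source_row_id": row["objectid"], "changes": changes}
```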

But after we have successfully created a stream of CIM change set events, life will be a lot easier for us!

It will be fairly easy to create different projections to serve the different needs in near real time – e.g. on-the-fly QA checks when users modify the network model. To do that, we must emit all events from GIS, including the events that occur inside versions or user sessions – i.e. by marking the events with which project/user created them, etc.

We have to deal with the fact that the GIS is compressed at night, which means that all history is gone. So we need some kind of consistency check after that run, checking whether the final state in the event store matches the GIS by comparing all objects, and then creating any missing change events. This is necessary because we cannot trust the way the GIS handles versioning and changes. We need to make sure we don't lose the ability to create a correct as-built state from events; otherwise we will get into trouble.
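A sketch of such a consistency check: rebuild the current state from the event store, compare it object by object with what the GIS holds after the compress, and emit corrective change events for any drift. The object shapes, keys and event names are assumptions.

```python
# Sketch only: compare the state derived from the event store with the current
# GIS state after the nightly compress, and emit corrective events for drift.

def reconcile(event_store_state: dict[str, dict], gis_state: dict[str, dict]) -> list[dict]:
    corrections: list[dict] = []
    for object_id, gis_obj in gis_state.items():
        stored = event_store_state.get(object_id)
        if stored is None:
            corrections.append({"event_type": "MissingObjectDetected",
                                "object_id": object_id, "gis": gis_obj})
        elif stored != gis_obj:
            corrections.append({"event_type": "AttributeDriftDetected",
                                "object_id": object_id, "stored": stored, "gis": gis_obj})
    for object_id in event_store_state.keys() - gis_state.keys():
        corrections.append({"event_type": "ObjectNoLongerInGis", "object_id": object_id})
    return corrections
```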

 

 
