Description
Key Learnings
- Gain insight into how model-based design can transition to fileless design with a common object model at the core.
- Learn how automation and calculation services can capture engineering design knowledge for improved design workflows.
- Gain skills and inspiration around using Neo4j graph databases and GraphQL APIs.
Speaker
- Will Reynolds: Will is a software developer and solutions architect with over 17 years' experience automating workflows and leading the development of a multitude of applications and their supporting data platforms. He has extensive experience of the Revit API, as well as design automation knowledge in the context of MEP design. Will has a particular interest in the application of graph data to MEP design data, and has spoken at AU and other conferences about the potential and practical application of this approach.
WILL REYNOLDS: Hello, everyone. Thank you for listening to my talk on MEP Redefined: using automation and file-less, related data for event-driven MEP design.
So a quick intro about me. I am the applications developer and solutions architect at Hawley, where I've been for 17 years. So 17 years' experience in the AEC industry. Over that time, I started developing Revit add-ins but quickly became frustrated with all of the data silos and the issues I was seeing with MEP design, and I wanted to change things. So that's when I started, about five years ago, with a series of talks.
So outside of work, I enjoy hiking and walking in the countryside. Just this year, I was in the Lake District with my children. A couple of pictures there. It's a beautiful setting. If you're ever in the UK, it's definitely worth visiting the Lake District.
So a quick agenda of what I'm going through today. I've done the intro. Then we're going to move on to a bit of context about the talks that have led up to this one. Then we're going to go into what we mean by file-less architecture. Then we're going to move into GraphQL and how that achieves a file-less system, really. And then we're going to look at the AEC Data Model and the Origin data model, two GraphQL-enabled endpoints.
And then we're going to look at Neo4j and how that's a very good companion for GraphQL. Then we're going to look into the actual subject of this talk, which is event-driven design, and how GraphQL achieves that. Then we're going to have a look at Forma and some other application services. And finally, we're going to look at some example workflows.
So some useful context to this talk. So in AU 2018, I did this talk on Graph data and how Neo4j was a good fit. And then in 2019, I did another talk on GraphQL. And again we'll touch on this a bit later. Just some useful context leading up to this one.
But first, let's go into defining what file-less actually is and what it solves. So what is file-less? Broadly, it means not saving to a specific file or model. Generally, it's the migration of data from what we've called the edge to the core, and in doing so it enables fine-grained access to the objects within the data.
So in this circle over here on the left, the outer ring signifies the file stores, which the applications in the inner ring save their data against. In the case of Revit, it has a view on the data which it stores in its own files or cloud storage. In the case of Excel, the engineers might have files stored on network drives. Again, that has another view on the data, as do Grasshopper and Rhino and [INAUDIBLE] simulation services and other applications as well. They all have their independent views on the data but store their data in possibly siloed places.
So the idea with file-less is that it moves data from the edge into the core. Each of these applications retrieves the data from a single source of truth, which contains the entire data model. Instead of having separate views on the data at the edge, it's now as one in the middle.
So what problem are we really trying to solve with this? As I mentioned, it's about the silos of data. It's about data which is locked away in proprietary formats, which is inaccessible or in less discoverable locations. There's also the issue of translation of data between application object models, and most of this data is only accessible to users who have licenses and skills with the software.
And then another particular issue is around model performance. Many of you will be familiar with opening Revit models and waiting for ages for those models and their dependencies to download. With a file-less architecture, that shouldn't be an issue, because you're only ever going to get the data you're interested in working on at that particular time.
And then there's also the repetition: if you're working between two different applications, you might need to do some model translation, some import and export, and then some repair tasks to maintain the integrity of that data. So how do file-less principles solve this? Well, they enable object-level data query and update, where each application is responsible for translation to its own object model.
So the core object model doesn't necessarily have to know about the application object models in each of the applications concerned. There would then be no requirement for users to translate or fix model data. But interoperability is still important, because files are needed for data transfer between some organizations and some applications which have very complex geometry generation, as well as for archive. So in 50 years' time, when the applications that read and write that data, and the services which run those applications, no longer exist, we still need to get access to that data.
So the common object model. This raises a new problem: file-less data requires a paradigm shift in who's actually responsible for the data. All of these applications need to agree on mutually agreed types in a shared schema. There might need to be some mapping between the native object model and the shared design model, which would be the responsibility of each of the applications. There might also need to be some mapping between file formats and the shared design model, for archive purposes as well as for sharing design data.
And with this, we would also need an API to push and pull that data. So, introducing GraphQL. What does GraphQL have to offer for this? This is covered elsewhere, and also in the previous talk, but to recap, it provides a framework for a shared object model API.
It can adopt any model as its basis; it doesn't necessarily need to invent a new one. And of course, the nature of GraphQL is that it will only return the data that you've asked for. It can support fine-grained filtering and pagination. And it's self-documenting through the introspection API of GraphQL itself.
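By way of a small illustration, that introspection is just another query against the endpoint. This is standard GraphQL, not specific to any of the APIs discussed here:

```graphql
# Standard GraphQL introspection: ask the endpoint to describe its own schema.
query ListTypes {
  __schema {
    types {
      name         # every object, interface, enum and scalar the API exposes
      kind
      description
    }
  }
}
```

Tools like Apollo Studio use this mechanism to build their schema browsers.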
So let's have a look at GraphQL in practice. This example is looking at the AEC Data Model API. This particular API has a very succinct level of abstraction. Starting from the project: a project can contain many element groups, each with many versions; each element group contains elements, which can reference other elements through a references property; and each element contains many properties, each with a property definition.
So let's quickly get into an example. These examples are already provided through the AEC Data Model Explorer. Once you've gone through all of these stages, you'll be able to get the ID of a particular element group. So you can just run that query against the endpoint, and it will authenticate for you.
And then we get back the data that you just saw. It's the same data, and it's only the data that you've asked for. The particular difference with this one is that it has an element property for the category, so we've asked for all elements with the category Ducts.
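Roughly, that query looks like the sketch below. The operation and field names are approximate (the AEC Data Model Explorer provides the canonical, working examples), but the shape is the point: you pass the element group ID you retrieved earlier, filter with a string query, and only the fields you ask for come back.

```graphql
# Sketch only: names are approximate, copy the exact syntax from the AEC Data Model Explorer.
query DuctsInElementGroup($elementGroupId: ID!) {
  elementsByElementGroup(
    elementGroupId: $elementGroupId
    filter: { query: "'property.name.category'==Ducts" }   # RSQL-style string filter
  ) {
    results {
      id
      name
      properties {
        results {
          name
          value
        }
      }
    }
  }
}
```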
So going back to this. That's the Autodesk AEC Data Model API, and this is an example I used to explore it with Apollo Studio. I've already covered that in the previous slides. So let's look at the Origin GraphQL API.
So Origin is the name of a product that we're developing internally at Hawley. We actually went a little bit nuts, possibly: we went for a very detailed level of abstraction. Similarly, it starts from the project. A project can contain many buildings, each building can contain many models, and each of those models can contain spaces and levels. But we also went and created these abstract interfaces for an entity and a distribution.
And a distribution can be pipes, ducts, and pipe fittings. So that's your MEP distributions. An entity can be equipment and terminals, like supply air terminals, extract air terminals, and their types, as well as electrical outlets and [INAUDIBLE] units, all that kind of thing. This doesn't represent all of them; I've reduced it significantly for brevity. In terms of how we define that schema, we use GraphQL SDL files where we include the fields and the relationships, something like the sketch below.
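As a flavour of what those SDL files contain, here is a minimal, illustrative sketch. The type and field names are simplified stand-ins for the real Origin schema, and the @relationship and @id directives are the Neo4j GraphQL library's way of mapping fields to graph relationships and auto-generated IDs:

```graphql
# Illustrative SDL only: a heavily reduced stand-in for the real Origin schema.
interface Entity {
  id: ID!
  name: String
}

interface Distribution {
  id: ID!
  name: String
  length: Float
}

type Model {
  id: ID! @id
  name: String
  levels: [Level!]! @relationship(type: "HAS_LEVEL", direction: OUT)
  spaces: [Space!]! @relationship(type: "HAS_SPACE", direction: OUT)
}

type Level {
  id: ID! @id
  name: String
  spaces: [Space!]! @relationship(type: "ON_LEVEL", direction: IN)
}

type Space {
  id: ID! @id
  name: String
  area: Float
  volume: Float
  sockets: Int
  level: Level @relationship(type: "ON_LEVEL", direction: OUT)
  model: Model @relationship(type: "HAS_SPACE", direction: IN)
}

type DuctSegment implements Distribution {
  id: ID! @id
  name: String
  length: Float
}

type PipeSegment implements Distribution {
  id: ID! @id
  name: String
  length: Float
}
```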
Let's go through another practical example of how we're doing that. In this one, we're going to do a query for all spaces in a particular project and a particular model. So we're going to look at the sample project and get all spaces in the model. We've called this model LoadsASpaces model 2024, and we want all spaces which have an area greater than 50, something like the sketch below.
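A sketch of that query, using the illustrative field names from the schema above; the where filter and the _GT operator follow the style the Neo4j GraphQL library generates, and the exact operator names vary between library versions:

```graphql
# Sketch: filter operators follow the Neo4j GraphQL library's generated style.
query LargeSpaces {
  spaces(
    where: {
      area_GT: 50                                   # area greater than 50
      model: { name: "LoadsASpaces model 2024" }    # scoped to one model
    }
  ) {
    id
    name
    area
    level {
      name
    }
  }
}
```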
So this is another virtue of having that detailed schema: we can create these aggregations, and we can create the fields and filters based on the field names, which are discoverable through the introspection API. So let's go and run that query. There we go, it comes back with the data we've asked for. Now we're going to explore how we can subscribe to any changes in that data.
So this is going to set up a subscription. We're going to look at all spaces, and we want to know when an area or a volume changes; when a space changes, we want the information about how it has changed. Initially, we set up the subscription, which creates a WebSocket. Then we're going to go to the next stage, where we send a mutation to change those spaces, and we should see the response come through here.
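The subscription itself is roughly the sketch below. The spaceUpdated, previousState and updatedSpace names follow the pattern the Neo4j GraphQL library generates for a Space type, so treat the exact names as illustrative:

```graphql
# Sketch: event and field names follow the Neo4j GraphQL library's generated pattern.
subscription OnSpaceUpdated {
  spaceUpdated {
    event                 # e.g. UPDATE
    previousState {       # field values before the change
      area
      volume
      sockets
    }
    updatedSpace {        # field values after the change
      id
      name
      area
      volume
      sockets
    }
  }
}
```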
So we're going to take these two spaces, whose IDs we retrieved in the previous stage. We're going to change the area to 1.50, change the sockets to 3, and multiply the volume by 2. It's a bit of an odd thing to do, you wouldn't usually do that, but it's just by way of example. And I run that query. It's pretty quick.
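That mutation is roughly the following sketch, again in the updateSpaces / where / update shape the Neo4j GraphQL library generates. The doubled volume is written here as a plain value for simplicity; depending on the library version you could instead use its generated math operators, or compute the new value on the client:

```graphql
# Sketch: mutation shape follows the Neo4j GraphQL library's generated updateSpaces mutation.
mutation UpdateTwoSpaces($ids: [ID!]!) {
  updateSpaces(
    where: { id_IN: $ids }    # the two space IDs retrieved in the previous query
    update: {
      area: 1.5               # new area
      sockets: 3              # new socket count
      volume: 24.0            # doubled volume, pre-computed for this sketch
    }
  ) {
    spaces {
      id
      area
      volume
      sockets
    }
  }
}
```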
So this was the result of the mutation, and this was what came through the subscription. You can see that for the two spaces we just changed, it's given us the volume before and the volume after, as well as the before and after state of the other fields we changed. Another thing we can do, by virtue of having this detailed schema, is look at distribution: MEP distribution.
So in this example, we're going to find all duct segments and all pipe segments. We can also get other information: we're getting the overall length, and you could also get the name. So let me send the POST request. And there we go, the data comes back as expected.
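A sketch of that query, reusing the illustrative type names from earlier; the aggregate field is the kind the Neo4j GraphQL library generates, so the exact naming may differ:

```graphql
# Sketch: illustrative type names; aggregate queries as generated by the Neo4j GraphQL library.
query DistributionLengths {
  ductSegments {
    id
    name
    length
  }
  pipeSegments {
    id
    name
    length
  }
  ductSegmentsAggregate {
    count
    length {
      sum        # overall duct length
    }
  }
}
```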
So the idea here is obviously that we're not expecting engineers to get involved with GraphQL. This is just to illustrate how an application could consume those events and how it could respond to the data.
So how did we achieve that common object model with GraphQL? As I mentioned, we chose a very detailed level of abstraction. The GraphQL types themselves were in source control, and that was intentional, to establish a democratized object model for AEC types within our organization. We also made heavy use of the Neo4j GraphQL library, which does all of the boilerplate stuff for us, including the mutations and the subscriptions.
So we can get a complete API just from those schema files. It also provides authentication and authorization out of the box, including for subscriptions, which I'll go into in more detail a bit later. And it uses a Neo4j database backend. More on that a bit later too.
Those GraphQL schema files also have the potential to map to existing schemas such as IFC and other file formats. So, in terms of advantages and disadvantages of each level of abstraction: the AEC Data Model is highly flexible, in that any changes to types and properties don't require schema updates. It's a very concise schema, and element properties and filters are handled through string queries, or RSQL. But it does mean that the detailed types and the filter options are not discoverable through introspection.
And then on to Origin, the one that we developed. As you saw, it has a more discoverable object model, with discoverable filters and options. It enables field-level aggregations, so you can get sums and so on. In the example I showed, we were summing, or finding the max of, the area across spaces, but you could use any other field and any other mathematical operation you wanted as an aggregation, something like the sketch below.
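For instance, an aggregation over space areas might look like this sketch (illustrative field names again, and the generated aggregate query shape depends on the Neo4j GraphQL library version):

```graphql
# Sketch: top-level aggregate query in the style generated by the Neo4j GraphQL library.
query SpaceAreaStats {
  spacesAggregate(where: { model: { name: "LoadsASpaces model 2024" } }) {
    count
    area {
      max        # largest space area in the model
      sum
      average
    }
  }
}
```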
And it also enables type-level connection queries. From a model, we can find all spaces in that model, and we can find all levels that those spaces are on. All of that information is discoverable through the introspection API in terms of what we can query. The types are visible in source control, but this does lead to a very complex schema.
And one thing we found is that applications like Apollo Studio sometimes struggle with the level of detail in our schema, especially for spaces, where we have hundreds of properties, or hundreds of fields. So, coming on to the issues we found.
So, in building that Origin API, in terms of the code base, we found traditional plain class objects or object-relational mapping tools generally require a complete object. But the advantage of GraphQL is that we only query the fields we need, and that leads to some of the properties on those objects being null when they actually do have data in the database.
And depending on where your object ends up being used within your stack, you wouldn't necessarily know which fields were requested in the first place. It also means that some related objects will be missing. So again, in the case of models and the spaces in that model, if you just retrieve a model, then its spaces will be empty, even though the model does actually have spaces. And this raised testing and object model challenges.
Some other issues we found: creating multiple nodes, particularly when there are related objects, can be very tricky. For example, we needed to create levels and spaces at the same time, in a single mutation. Say we had 10 spaces on one level: from one mutation, a naive nested create would potentially create 10 levels, one per space, rather than 10 spaces on a single level. So in order for that to work, we would have to send a level mutation first, and then send the space mutation to relate those spaces to the level, something like the sketch below.
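Something like the following sketch, using the create mutations and connect operations the Neo4j GraphQL library generates (names illustrative):

```graphql
# Step 1: create the level on its own and capture its ID.
mutation CreateLevel {
  createLevels(input: [{ name: "Level 01" }]) {
    levels {
      id
    }
  }
}

# Step 2: create the spaces and connect each one to the existing level by ID,
# rather than nesting a level create inside every space (which would duplicate it).
mutation CreateSpacesOnLevel($levelId: ID!) {
  createSpaces(
    input: [
      { name: "Office 1", area: 20, level: { connect: { where: { node: { id: $levelId } } } } }
      { name: "Office 2", area: 25, level: { connect: { where: { node: { id: $levelId } } } } }
    ]
  ) {
    spaces {
      id
      name
    }
  }
}
```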
And that causes some issues as well, because partial mutations are possible. If one of those mutations fails, even though the database itself supports atomic transactions, it would leave the data in an inconsistent state. So that necessitates building some kind of transaction model at the application or API level.
Another thing we found when mutating large amounts of data at once is that it would hit a request limit on the Node.js servers we're using. We just had to increase the request limits to get over that, but there was always going to be a hard limit on the size of that initial API call.
Another thing which is great about GraphQL is GraphQL Federation. That enables us to combine data sources into a single endpoint. It also handles the cross-referencing and allows separate APIs and microservices to appear as one, but be maintained separately. And this is useful in AEC because it can bring together different libraries of data within your organization.
Again, it's supporting that file-less architecture by doing so. It could even combine the AEC Data Model from Autodesk with your own, and supplement it with data, from a single endpoint. So that's all very well for data within individual organizations, but the nature of the data we work with is that we share it between separate organizations.
We don't have a final solution for that at the moment, but one possible example could be a centralized design, where each organization pushes and pulls the data from a single repository. That raises issues of who actually owns the data. So there's another option of a distributed model, where each organization is responsible for storing their own data, and there's a gated exchange of information between them, so each organization would control what data can be shared with other organizations.
So I actually wanted to recap at this point on Neo4j as a file-less BIM data store, and this links back to the first talk I did, the case for semantic modeling. This is a web app we built back in 2018, and it used Autodesk Forge, as it was called at the time. In this case, we uploaded our Revit model of our London office, as well as publishing it to a graph database.
So there's our London office. In this case, we were running Cypher queries directly and rendering them in the web app. The idea behind this is to illustrate how the graph represents systems semantically.
So in that first one you saw, going back to this one, this is the entire electrical network within this model. We have two circuits running off this distribution panel, these are all the luminaires on this circuit, and these are all the sockets on this circuit, as well as some other electrical flow back to some other distribution panels and some connection units. And then the next example that I want to show is finding the cable tray routes between two rooms.
So we're going to look at this room here and this room here. OK, let's just move on. There it is. So the space with this number, 112, and this one there, 127: you run that query and it gives us the graph, then we click on the graph and it highlights. There we go.
So these are all the cable trays between these two rooms. What if we wanted to find the shortest path, say to run a cable down that cable tray? We have a graph algorithm for that. So it's this one, and again, that graph algorithm has found the shortest route.
And in this particular example, we wanted to find the airflow paths between these two spaces. Again, it's a fairly simple query to do that. And it's returned to us the network between those two spaces. Another one that we looked at is the airflow paths between this meeting room and the air handling units. So this is the supply and extract routes.
And then a final thing I wanted to show is how it can also represent adjacencies. So this is the space here, the meeting room, and the surfaces which bound that space, and the walls that those surfaces are on. That includes walls, windows, doors and ceilings. So as you saw, it supports semantic modeling. The idea also enables continuous design thinking from database to physical systems. It supports graph algorithms, and being a graph database, AI loves it, of course.
And this is broadly how it looks. An electrical system looks something like this, as I showed just before, and a duct network could look something like this: a duct to a duct transition. The added benefit is we can establish this relationship: from a terminal, it establishes the airflow relationship between the terminal and the space.
OK, so introducing, then, event-driven design. What is it and what does it actually solve? Let's start again by defining what event-driven design is. So what's an event? Amongst programmers it'd be very familiar what it means, but broadly it can be any change: an add, remove, update, or delete, or a signal such as a button press or a sensor value. It can be a change to a field on an entity, the completion of a computation, such as a heat loss simulation, or a project stage transition, so it could go from a design stage to a more detailed design stage.
So these events have subscribers which do something in response to the event. Those subscribers register for specific changes and then get notified of changes of interest.
So what problems do event-driven principles solve? Again, it's very similar to the data silo issue, but this time with disparate workflows. We're solving application constraints where users must manually visit an application to contribute to a design workflow. If they wanted to change something, they would have to open up Revit, make some changes, and then save and synchronize those changes with the central model. And doing that requires specialist skills and knowledge.
So in the case of Dynamo, we only have a select number of people who are skilled enough to actually design those scripts, though we have overcome that to some extent with Relay and Dynamo Player. But again, those scripts are only accessible by opening the software itself, by people who have the software installed and have a license. And that introduces the problem of knowledge silos, where even though Dynamo is very accessible, how a script is designed is usually known by one person. If they were to leave, we would potentially lose the ability to maintain that script, and potentially lose its functionality completely.
And then interdependencies are not always clear either. For example, we might have a script that's designed to do something with mechanical equipment and another which is designed to do electrical loads. There could be a relationship between the two which might not be clear, and those two scripts might operate independently.
And then there are some repetitive design issues as well. Following a model update, often the same tasks are re-performed manually. In the case of our lighting engineers, there's a change to the model, and they need to change the luminaires placed in all rooms of a specific type, or luminaires of a specific type, across numerous levels.
And our engineers may also need to update a parameter of a space across hundreds of spaces of the same type. Again, there are scripts and ways of doing that through automation, but they also require opening up the software, running it, and then synchronizing. So how do event-driven principles solve this? It's the migration of functions from application-bound processes to microservices.
Well, there's no need to open the application to actually run these. So we're moving things from this outer ring and the applications that run them; we're inverting that. It looks something more like this, where those applications just form runtime environments for those functions, and the functions respond to changes in the object model: the single-source-of-truth object model.
So in the case of Revit, where before you might have had to open it and run an add-in, the add-in could respond to the change. This is idealized; some applications don't currently work this way. In the case of Revit, that could be through design automation.
But with Excel you could switch this around. Where engineers have built a spreadsheet with some calculations in, we can turn that spreadsheet into a function. So it's kind of a headless spreadsheet, with those calculations as bits of code. And it's similar with Grasshopper.
Again, we can use Rhino Compute to achieve something like this, where those functions or those scripts listen for changes to the model and then make changes directly without having to open the application itself. And that's similar for other applications as well.
So then, as I mentioned, the calculations become microservices. Each calc or logic service can be tested and verified independently, each calc or logic service can be versioned, and it supports sandbox development. So you could build [INAUDIBLE] pipelines where, when there's a change to a particular calculation, you can run it through, check that it works, and then report whether it works or not through a simple interface.
So is there more to data than its properties and relationships? I think so, and that's pretty much what we're showing here: as well as the properties and relationships, there's what to do with the data, and when. It's about the logic that's required to do that, and creating that logic involves creating and deleting elements as well as changing their parameters.
And it captures the engineering knowledge for automation and reuse. And it's never a final solution. So once you've built these automations, they still need to be continually evolved by building, testing, and maintaining the calculations. And this is broadly where I think Autodesk Forma is headed.
So it comes to something like this: we have the idea of intelligent entities. Each entity has fields and it has behavior. The fields are captured by the single source of truth, the file-less platform, and the behavior is captured by this event-driven architecture.
So the entities themselves usually are in a system. So the system is comprised of these entities and that itself also has fields and behaviors. So we have the concept of intelligent systems as well.
So this is an example of an event-driven workflow with GraphQL subscriptions. In the case of Forma, you could use that to create a building outline and create some units or rooms. You can convert those to spaces through a custom application and update the graph with a graph computation. And then you could have something listening to those changes, which could kick off a heating and cooling requirements calculation, and it could also kick off a process to set some equipment requirements.
Again, they will update the graph, and something else will respond to those changes. So we could have a function that will add equipment to the spaces and update the graph again. Then we can have some more services using one of our Excel-based calculations to calculate some electrical loads, and we can have another service which could be using Rhino Compute to do some other mechanical loads. One link in that chain might look like the sketch below.
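In GraphQL terms, one link in that chain might look roughly like the sketch below: a loads service holds a subscription open, and when a space change arrives it writes its result back with a mutation, which in turn raises the next event for the equipment service. The heatingLoad and coolingLoad fields are purely illustrative, and the operation shapes follow the Neo4j GraphQL library's generated API as before:

```graphql
# 1. The loads service keeps this subscription open over a WebSocket.
subscription SpaceChangedForLoads {
  spaceUpdated {
    previousState { area volume }
    updatedSpace { id area volume }
  }
}

# 2. When an area or volume has actually changed, the service recomputes the loads
#    and writes them back; this mutation raises the event the next service listens for.
#    heatingLoad and coolingLoad are illustrative field names.
mutation WriteLoads($id: ID!, $heating: Float!, $cooling: Float!) {
  updateSpaces(
    where: { id: $id }
    update: { heatingLoad: $heating, coolingLoad: $cooling }
  ) {
    spaces { id heatingLoad coolingLoad }
  }
}
```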
So this is what I showed you before in terms of the GraphQL subscription in action. This is the subscription you saw me sending before, and this is the data that comes back once the mutation is sent. So you can actually do this now; there are solutions. As I mentioned before, we have Autodesk Forma.
Forma is file-less and event-driven from the platform UI, as is the Autodesk Platform Services AEC Data Model we've seen, which is GraphQL. That also allows fine-grained access to model data, which can be presented in any application. But obviously there are no mutations through that API, so all the changes would go through the applications themselves, such as Revit.
But we also have Speckle Automate, and that also now has GraphQL to some extent. It enables controlled sharing of data streams between organizations; it also enables event-driven automations and has many integrations with other software. And another excellent service is VIKTOR. VIKTOR and CalcTree are calculation platforms; again, they allow event-driven calculations from the user interface and also have many integrations.
As I mentioned before, to support those calculation services, you can use things like Rhino Compute, or Autodesk Platform Services Design Automation. And of course, cloud services from Azure and AWS for any containerized applications, lambdas, and functions, and Microsoft Fabric could be used for that too.
So why does this matter to engineers? Well, it enables them to design in a way which, as we saw with the graph data, semantically matches the things that they're designing. The idea is we would then be able to give them easy-to-use user interfaces with no awkward data translation between tabulated data or different applications. It enables more systems-design thinking, so they can get back to the engineering and not have to wrestle with different applications.
So, moving on to the workflow examples. For this talk there are workflow examples on my GitHub repository to look at. So, great. Now what? How do you apply this within your company?
You can try out a Neo4j GraphQL Python example from the repo. You can use the AEC Data Model API. It's definitely worth trying out Forma as an example of interconnected data. Definitely give Speckle a go, and VIKTOR and CalcTree. If you really want to get into it, I would recommend the Neo4j GraphQL package. That is the most complete package that we've found, but it does require some foundations, so you will need to set up a Neo4j database.
There are various options around that, but probably the best option is to use Neo4j Aura, which is a cloud-based database-as-a-service offering from Neo4j. In building on that, you will have access to the full range of GraphQL subscriptions offered by that service. So thank you for listening.