Description
Key Learnings
- Learn how to assess raw data and identify opportunities to normalize the data for consumption by tools
- Learn how to define a process for converting raw data to a usable data model
- Discover how users can migrate data processing out of their scripts and into tools better suited for data manipulation
- Learn how to build a process for generating a variety of graphic outputs based on data
Speakers
- Robert Manna: Robert works for Stantec in the Boston office. He has been a key team member on multiple projects, and he now serves as the BIM R&D leader for the firm, as well as providing business consulting services for clients implementing BIM. He has taught internally and provides high-level support as well as planning and implementation of new tools. In 2013 he led the organization of the first annual RTC Design Technology Summit; he has spoken at RTC NA and Autodesk University, has been a guest lecturer at the BAC, and has presented at BIM events hosted by the AIA, ACEC, Autodesk, and resellers. He has written two articles about Revit for the AUGI AEC Edge magazine and has written assessment questions for KnowledgeSmart. He maintains a personal blog dedicated to Revit and BIM. When he has time he hangs out with his wife and two-year-old daughter, and enjoys skiing, swimming, and biking.
- Esra Abumounshar: I am the generative and computational design lead at Stantec. I have been working on adopting these tools and implementing them across my company. My work includes Healthcare, Higher Ed, Mixed-Use, and Master Planning.
ROBERT MANNA: Welcome, everybody, to our class today. My name is Robert Manna, and I have with me today Esra Abumounshar to talk to you about Planning Driven by Data: Organize, Evaluate, and Illustrate. So we're here today to talk about how we can use data to help drive our designs. I'm a project manager and product manager at Stantec, and I get involved with helping our project teams adopt and use technology in a wide variety of ways. Esra?
ESRA ABUMOUNSHAR: Hello, again. I am Esra Abumounshar. I am an architectural designer and the generative design lead at Stantec, so thank you all for joining us. Hope you find this useful.
ROBERT MANNA: OK, so what do we really want you to take away from this session today? Autodesk has asked us to try and keep this session short so there's a lot of time for Q&A as part of the official Autodesk University presentation. So what are the key things that we want to talk about? One big thing is: don't process the data in scripts. Move your data processing out of your scripts and into some other tool. We'll talk about the tools that we used, but ultimately, there are lots of tools that you can use.
And so when you start saying, well, how do we deal with data, and how do we think about data? There's really a triangle of specialties that you need to be thinking of. One is your data architect or wrangler. The second is somebody or a role who's going to help develop scripts, and maybe also do some straight-up coding as well. And then the third role that we have sort of in this triumvirate is somebody who's dealing with the output and visualization.
Depending on project size and scope, this doesn't necessarily mean you need three people, but it's a useful context to use when thinking about what do you need to accomplish, and what are you trying to accomplish, and who has what responsibilities where? Even on our own team, we did have three people involved, but there was overlap between some of those specialties or some of those roles. So it doesn't have to be perfect, but it is a good way to sort of put your project in context.
And there are just a lot of gains to be had from taking the time to architect your data model and build a good data model that you can use with a variety of tools, be it Dynamo, Grasshopper, Power BI, or whatever. And really, as an industry, this is a place where I believe there continues to be underinvestment on our projects when we go to deal with data. And because there's underinvestment, we also underestimate the amount of effort that's required to go about doing this.
This is still work. There's opportunity to do that work efficiently and effectively, but it is something that we historically underinvest in, in the design space, when we're talking about what we're trying to accomplish. So our case study, which we're using to demo things today and talk about what our process was, was a large master plan for an urban health care campus. The goal from the outset, at least, was that our project team was going to develop a 10-plus-year master plan for this particular client.
And the client has multiple roles, right? The client owns some of their own property. They're landlords for some of their properties. They lease some of their property out. And then they're actually tenants themselves in other property, all on this campus. And it's a massive campus. As I like to say, depending on how you count it, it's either 9 million or 12 million square feet. So just a minor difference there, and it comes down to how you're counting your square feet and what you're counting.
And our goal ultimately was to be able to provide a 30,000-foot master plan view of where this campus needs to go in the future. And one of their issues is they are just flat out of space right now. And that was where we started is what are their existing conditions? Where are they today? Is it really as bad as everybody thinks it is? And then the project team could go ahead and develop that master plan looking forward over the next decade or so.
So Esra and I were not actually part of the project team per se, though we obviously got to know the project team quite well and worked with them quite closely through this process. But the project team came to the group that I'm part of and said, you know, we want to be able to do this better. We want to be able to do it faster. We want it to be driven by data. We need it to be visually compelling. We want it to be interesting, and we want our clients to really engage with it.
They also talked about things like multi-platform support for being able to view and access the data. There were also some thoughts around being able to navigate models in real time, and potentially manipulate those models in real time in client meetings, particularly for the master planning piece. The reality is we didn't achieve all the objectives, but that's where we started.
So given that that's where we started, my first goal, as more of a project manager in this particular case, but also a product manager, was to really figure out what our scope was. What are we being asked to do? What are the constraints? That led immediately into assessing, organizing, and validating the data that the client had provided to us. Another couple of key things we needed to accomplish: we needed to be able to create some 3D models, so we needed to have a context model.
We also needed to have a campus model. We needed to write some scripts. We knew we were going to very likely use Dynamo. We looked at Grasshopper, but ultimately, decided Dynamo was the right choice for us in this particular case. So we knew we were going to need to write some scripts. We also suspected we would need to do some development in Excel, as well as developing that overall data model and creating the outputs, and actually defining what those outputs need to be coming out of this data that we were going to be generating.
So the reality is that this is not a single tool, right? And that's where I always get questions. Everybody is like can you do this as one tool? And what I like to say is it's a toolbox, not a tool. You wouldn't go and combine a hammer and a saw, though you can actually buy a hammersaw. I looked it up. It's on Amazon, though I don't think that would be a very useful tool if you're trying to do fine woodworking. So the reality is we had a bunch of different tools.
We had Excel. We had Power BI. We had Dynamo. We had Revit. And then lastly, we had the Adobe product suite for processing. It wasn't linear either, so we didn't go from Power BI into Dynamo. We actually went from Excel into Dynamo and Excel into [INAUDIBLE] But the point is that we built a toolbox and not a tool. And it's about making these things work together to generate the different outputs that the team is looking for.
So let's talk about data, because that's really the central message that we want to bring to you today: data, data, data. And by the way, Excel turns 40 next year in 2022, so I'm a little older than Excel. But I'm guessing a lot of people who are watching this video are probably a bit younger than Excel at this point. So Excel has been around for a long time. It's a good tool, but there are other tools out there as well, obviously.
So client data. They give us data all the time. Most of the time, it's not as good as they tell us it's going to be, right? In this case, the client gave us a 300-row Excel file and told us it was all the data we needed, and it was perfectly good, and we would be all set. Yeah, that wasn't the case. There was very limited hierarchy embedded in the data. There were very limited dimensions, so there was no category. There was no care service definition.
And the granularity of the data was at the suite level, which is not necessarily a bad thing. But again, we were given data that we were told was perfect, you know, we were all set. And again, that wasn't the case. We more than tripled the amount of data, to where our Excel file now has around 1,000 rows of facts defined in it, and that file formed the core of our data model. This is not unique to this client. Anybody who's listening, I'm sure you've dealt with this before, where you get data from people and it's not what you were promised, or it's not complete.
But this is another example. This is a spreadsheet that was given to Esra, actually, as an example of a functional space program for a health care project. And again, this is a great example of, yeah, this is maybe really useful for the person who created this data initially, but it's not useful if we're trying to build a really solid data model. Rather, we've got to extract the data out of this and get it into a format that's going to work really well for what we're trying to accomplish.
So in this case, you've got multiple worksheets with similar or repeating data across all those different worksheets. There's no idea of relationships between the data. There's no lookups being used. If you're old-school, you probably know about VLOOKUP in Excel. I'd highly recommend XLOOKUP if you don't know about XLOOKUP. And there's not any sort of explicit hierarchy or dimensions to the data in these spreadsheets either. So this is going to certainly be a starting point. It's not unique. We see it all the time.
And this comes back to my earlier comment that we underinvest in terms of making sure that our data is clean and well-organized so that we can produce good results. So what is this whole thing that I keep talking about with dimensions, facts, normalization? So normalization is taking this data that is in some sort of denormalized state-- like the Excel spreadsheet that we were just looking at-- and getting the data normalized into a state that is actually going to be useful for building a data model.
So normalization is basically a fancy way of saying remove duplication. That is the core of data normalization. If you're interested, there's a lot of vocabulary and theory that goes with data and data normalization. You can spend hours reading about it on the web or watching YouTube videos. So I'm going to try and keep this fairly simple, but relatively high-level. But the key to data normalization is looking for repetitive data that can be broken out into its own table. And then it's going to be related back to your primary fact table.
So what we're looking at here is, again, that raw data that we received. And everything that's highlighted-- and some of the stuff is blurred out because of confidentiality concerns-- but anything that's highlighted is data that's repeating. And I didn't even highlight all the repeating data here, but I highlighted some key portions of the data that is repeating. So that's data that we want to break out into separate tables and relate back to our facts. And what are our facts?
Our facts are that we have individual suites, and those suites have a certain amount of area. That's the fact. The fact is that it has a certain amount of area, and maybe it has a suite number or some other unique identifier, and that's it. All the other information about that suite is going to be what we refer to as dimensions, because that information repeats, whether it's the level that the suite is on, the building that the level is part of, the department the suite is part of, the entity, et cetera.
Those are all dimensions. They help us organize, filter, slice the data that we're interested in. Additionally, as we can see here in this particular example, you've got this last column that says department slash use. So that column is combining two different dimensions of data into a single column. So again, a great example of where we need to split some of this information out into its own columns and into its own tables.
Lastly, the stuff in the middle that's crossed out is data we just don't care about. It's not useful for our purposes, and we can get rid of it and not have it cluttering up our data model. So going forward with data normalization, start looking for these patterns: data that repeats, and what can I break out into its own tables. Levels are a good example. We can break levels out into their own table. And individual and unique levels may belong to individual and unique buildings, again, depending on your circumstances. So that's one example of normalization.
And that's two levels of normalization. So we're saying a suite is on a level, and a level is in a building. Additionally, departments are another good example. You might have subdepartments that roll up into departments, and then departments might roll up into service lines. And service lines may roll up into, in our case, what we called areas of care. So again, this is a really good example of starting to normalize the data, and also starting to establish what we're going to call hierarchies of data.
So a good example would be service line, department, subdepartment. That's a hierarchy. Building, level, suite, that's another hierarchy. And what we can also see here is another good example: when you're cleaning your data up, you never want to have nulls. You don't want to have empty values, because empty values like this oftentimes make issues really hard to track down.
So it's really good to put in things like not assigned, not defined, missing, because that will help you clean up your data and spot issues in your data that you didn't catch initially when you were starting to organize your data. So really the key is that repetition, breaking data out, and looking for those patterns, looking for those relationships between the different pieces of data. So what are the keys to good data normalization? Unique IDs. We need to be able to uniquely identify all those rows.
Oftentimes, people want to use name as unique ID because there's not going to be another department named cardiology. The problem is what happens when somebody says, well, we want to shorten that to cardio. Well, what if your client has a totally different name for the cardio department that doesn't even have cardio in the name?
So use unique IDs to identify all these dimensions and facts. Names are just additional columns and that way you can have multiple columns with different names, shorter names, longer names, abbreviated names, whatever. Those all just become different columns. They are not the key to your relationships. So unique IDs are really critical.
Deduplication, as I mentioned earlier, normalization is all about removing that data that repeats. Maintain those relationships. As you clean up the data and organize the data, you've got to maintain those relationships between these tables that you're adding to, otherwise, your data model is going to break, and you won't have something that you can work with.
Lastly, names. In this case, not the names of the departments or the names of the suites, but the names of your columns and the names of your tables. Establishing a naming convention for yourself will help with maintaining consistency. It will help you when you're working on the data model. It will also help your partners and your colleagues when they go to interact with the data, because that naming convention will hopefully help convey what the different columns and tables are intended for and what their purposes are.
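For readers who like to see the same idea in code, here is a minimal sketch of those normalization steps in Python with pandas: break a repeating value out into its own dimension table, deduplicate it, assign surrogate IDs, keep the relationship back to the fact table, and replace nulls with an explicit placeholder. The table and column names are made up for illustration; they are not the project's actual schema.

```python
# Minimal sketch of the normalization steps described above (pandas).
# Column and table names here are illustrative, not the project's actual schema.
import pandas as pd

# A denormalized fact table, roughly like the raw client spreadsheet.
raw = pd.DataFrame({
    "SuiteID":    ["A-101", "A-102", "B-201", "B-202"],
    "Area":       [1200.0, 850.0, 430.0, 990.0],
    "Department": ["Cardiology", "Cardiology", "Oncology", None],
    "Level":      ["Level 1", "Level 1", "Level 2", "Level 2"],
    "Building":   ["Bldg A", "Bldg A", "Bldg B", "Bldg B"],
})

# Never leave nulls: replace them with an explicit placeholder so issues stay visible.
raw["Department"] = raw["Department"].fillna("Not Assigned")

# Break the repeating value out into a dimension table with a surrogate unique ID.
dept = (raw[["Department"]]
        .drop_duplicates()                      # deduplication
        .reset_index(drop=True)
        .rename(columns={"Department": "DepartmentName"}))
dept["DepartmentID"] = dept.index + 1

# Maintain the relationship: replace the name in the fact table with the ID.
facts = (raw.merge(dept, left_on="Department", right_on="DepartmentName")
            .drop(columns=["Department", "DepartmentName"]))

print(dept)
print(facts[["SuiteID", "Area", "DepartmentID", "Level", "Building"]])
```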
So when we're talking about data normalization, the ultimate goal is to get to a data model that looks something like this. And this is what Power BI happens to prefer. So there are several other data models out there. This is what's called a star. There's also a snowflake, and there's at least another one that I can't think of off the top of my head. So there are different common data models out there. But if you're interested in using Power BI or even Tableau, those products are really interested in stars.
And stars are sort of the easiest thing to wrap your head around too as you're getting started. And so with a star, what we have is our facts in the middle. So there's our spaces, or suites, or rooms, or whatever the case may be. And then we have all of the dimensions around the outside. And this is very literal. You should be able to apply a star to your relationship diagram, and it should work. You should be able to say, yeah, that looks like a star. I have all these points with my facts in the middle.
In addition, this is also where you can start to think about your hierarchies, because if you have another table, and another table, all linking off each other, there's a good chance you can consolidate that down to some sort of hierarchy, again, be it neighborhood, building, level, suite, or areas of care, service line, department, suite, et cetera. There are hierarchies that we use all the time. Again, Power BI, in particular, deals with hierarchies quite well. And hierarchies are going to be the key to getting into aggregations.
And your dimensions are keys to getting into aggregations. How do we want to aggregate this data? And what I mean by aggregate is either sum or total, or add things up together. And so that is our ultimate goal. How do we want to aggregate the data because you probably have an executive or somebody saying I want to see X by Y. And when you hear that "by" word, that is an important telltale that they are looking for an aggregation. I want to see all of the departments by floor, or I want to see all of the departments by building.
That is an aggregation of the data to answer that question. And so other words that you might hear people say are: I want the sum. I want the total. I want the average. These are all going to be aggregations of your data in some way, shape, or form. And so a tool that's really, really facile and really powerful in terms of aggregating data is Power Query. And so Power Query is what we'll show you in a moment when we get into the demo. Power Query is in Excel and Power BI, so it's a transferable skill.
And it is a very, very powerful tool for transforming, aggregating, and manipulating data. And so the real key here is, in our case, we had suites. We weren't really interested in suites. Nobody cared about suites except when it came to validation of the data. What our executives, the leadership on the project team, were asking is: I want to see all the area by category, by department, per level of each building.
So we need to aggregate that suite data together into those sums or those totals, and that's the information that we then pushed into Revit through Dynamo or that we just evaluated in Power BI and/or a couple of other tools that we looked at. So, again, what are we all talking about? As one of my favorite YouTube channels likes to say, let's take it to the computer.
Let's take it to my laptop, and let's take a look at what I'm talking about in terms of what we're doing, particularly in Excel, and how we're aggregating data. So I've got an example Excel spreadsheet here. This will be available as part of the class materials. There's a handout too that hopefully you can review. So you'll be able to download this file. And you can play with it all you want, and it will literally be in the format that you see it today.
And so you'll see that I actually have some tabs in here for demonstration, as opposed to the complete one, so if anything goes wrong during this demo, we're covered. We can pick up and keep going from where we are. So in this Excel spreadsheet, I've got this Rooms table here. And we want to have area by department, which, conveniently, I've done once already. But we want to add up all the area by department.
But when we look at rooms here, we say, well, we've got subdepartment, but how do I add up my areas by department when I don't even see department? So I've already done some work here, obviously. And in this case, if we go over to this table here, you can see we've got SubDepartment Demo. And so here in the SubDepartment table, we've got those IDs. Again, those IDs come back right here to those IDs right there. And then you can see we've got Department, so maybe I want to break this out into an actual Department table.
So I want to normalize my data further. I don't want to just have department and subdepartment in the same table, I actually want a separate Department table. And again, normalization also helps with keeping things up to date, adding new information, or fixing a typo in a name. Normalization helps you manage all of that. So, all right, I want a Department table. So I'm going to come in here, I'm just going to copy that data, I'm going to come to my Department table here, and I'm going to paste my data.
And I want only values. We don't need to bring that formatting along. So all right, so there's my departments. And I'm going to follow my naming convention where I'm going to call that Department Name, and I'm going to call this column Department ID. And I'm going to say, OK, I need to reduce the duplicates. I know I have duplicates here. I can select my data. On my data tab, I can come over here, and I can go to Remove Duplicates. It's going to say did I mean to select that? I'm like, nope, I want to just continue with what I've got selected.
I'm just going to dedupe that one table there, and there we go. We've removed our duplicates, so now I can come in and I can create my IDs. And there we go. We've got my IDs. We've got my department names. Now, I am a huge, huge fan of formatting everything as tables when you are doing this kind of exercise. Use this wonderful tool here and format everything as a table. Yes, my table has headers. And I'm immediately going to come in here, go to Table Design, and I'm going to give this a name.
And I'm going to call it Table.DeptDemo, so I don't have any overlap with the table that I've already got, and there we go. Now we've got the table. Now, you're saying to yourself, Robert, you said maintaining relationships. How do we relate department back to the subdepartment? Now, in the handout, I cover a more formulaic way to do this, but since we're only dealing with three departments here, I am going to cheat a little bit in the interest of time. And I'm going to just change the column name here to Department ID.
And let's see, Oncology is going to be 1 because it was the first one. I know Cardiology is 2, so I'm going to plug in 2. And I know Public is 3. And I can come back here, and I can double-check on it and say, yep, Oncology is 1, Cardio is, there we go. So I have now normalized the department data further. I've created this department table, and I've made sure that I've maintained the relationship of the department back to the subdepartment.
So now I'm in a position to say, OK, now, I can go ahead and aggregate the data to answer the question, which is how much area is there by department? How do I go and do that? So again, a really easy way to get started here. I can right-click on a formatted table, and I can come down here, I can say get data from table range. I'm going to go ahead and do this. Again, because this is a cooking show, things are already in the oven and baking for me, so we don't have to wait for everything to bake.
I just have to wait for Power Query to load up here. Hopefully, this won't take too, too long. In the meanwhile, so Power Query, again, really, really powerful tool, really, really helpful. You'll find that in the handout, there's a page with a bunch of URLs to some recommended additional material to introduce you to all these topics, particularly Power Query, data modeling.
There is a fantastic one-hour course from a Microsoft conference where the gentleman does an excellent job of explaining star schemas and data normalization, far better job than I can ever do. So I encourage you to go watch his presentation about data normalization for Power BI and just in general because it's a great course. OK, so here we are in Power Query and it's still loading the data. Come on.
And normally, it wouldn't be this slow. I'm going to blame Zoom, or maybe my computer, one or the other. All right, so here we go. So we're in Power Query, and we can see that we've got multiple queries here. And I can see that this has now loaded that table that I just created as a new query. So now we've got table Department Demo. And if I come here, I can see that I've got my table Rooms here. So that's my original Rooms table. And so what I'm going to do is I'm going to just reference this.
And again, in the handout, I talk about differences between reference and duplicate a little bit. Reference is usually what you probably want to do because it helps minimize overall processing time and impact, but sometimes you need to duplicate queries as well. So I'm going to reference this, and I'm going to rename this, and I'm just going to call it Area by Department Demo because we already have the finished one in here as well. And so what do I need? I need to ultimately add this information up by department.
That's what we need to do here, but we don't have a department value in here. So I'm going to come up here, and I'm going to actually, I'm going to select on that column. I'm going to come up here and go to Merge Queries. Come on. There we go. So I'm going to come up here to Merge Queries, and I'm going to click on that SubDepartment ID. I'm going to select my table SubDepartment, and I'm going to click on my ID here, and I'm going to say OK.
And I'm going to expand this column, and then all I'm interested in is the Department ID. So there is my Department ID, and then I'm going to come up here, and I'm going to say group by, and again this is the power of Power Query. You could do this in Excel certainly, but it's going to be a lot more work. I want to group on the Department ID. I'm going to call my new value Area, and I want to sum. What do I want to sum? I want to sum my Area column, so Room.Area. Click OK.
And there we go. We've got our Department ID and our area. The last thing I'm going to do is I really want my department name in here. So I'm going to Merge Queries one more time. You'll note that off to the right, there's a series of applied steps there. And so the great thing about Power Query is you're basically building a modifier stack, so you can always go back and change things later if you need to, or undo things, or add additional steps. There's never anything lost in this process. It's all there.
You can always rewind if you have to. Again, I'm going to come in here. And so now I've got my department names. I've got the area per department. And I still have the ID here. I could clean this up more. I could reorder the columns if I want to. I could rename the columns. There are additional steps that I could do to do more with this, but lastly, I can come here.
I can do Close & Load. I always recommend Close & Load To, and not just clicking the Close & Load button, because oftentimes, when you're doing these more complex data modeling exercises, you do not want all those queries loaded. And so what's nice is if you do the Close & Load To, you can choose to only create connections. And it won't create sheets for all those queries, because otherwise, by default, Power Query wants to go ahead and create new sheets and new tables for every single query you have.
Again, more complex data models and richer environments where maybe we're doing multiple aggregations, you don't necessarily want all those loaded back into your Excel spreadsheet. And so there you go. We've got Area by Department Demo, and we've got our table here with the data. So ultimately, what's this leading to? This is what we did to aggregate the data for Dynamo to go ahead and consume. And so the point was we built a rich data model in Excel.
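For anyone who prefers scripting over Power Query, the steps from the demo translate roughly as follows. This is just an illustrative pandas sketch with made-up table contents; it is not the workbook's actual Power Query M.

```python
# Rough pandas equivalent of the Power Query steps in the demo:
# merge Rooms -> SubDepartment to pick up DepartmentID, then group and sum Area,
# then merge the Department table back in for the display name.
# Table and column names loosely mirror the demo workbook and are illustrative only.
import pandas as pd

rooms   = pd.DataFrame({"RoomID": [1, 2, 3, 4],
                        "SubDepartmentID": [10, 10, 20, 30],
                        "Area": [500.0, 300.0, 450.0, 700.0]})
subdept = pd.DataFrame({"SubDepartmentID": [10, 20, 30],
                        "DepartmentID": [1, 1, 2]})
dept    = pd.DataFrame({"DepartmentID": [1, 2],
                        "DepartmentName": ["Oncology", "Cardiology"]})

area_by_dept = (rooms
                .merge(subdept, on="SubDepartmentID")     # the "Merge Queries" step
                .groupby("DepartmentID", as_index=False)  # the "Group By" step
                .agg(Area=("Area", "sum"))
                .merge(dept, on="DepartmentID"))          # bring in the display name

print(area_by_dept)
```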
That data model was all ready to be loaded into Power BI, and in addition, we could then aggregate the data, just like what I was showing to then be able to load that data into Dynamo, and then Dynamo could go ahead and process that data for us. And so Esra is going to take us through that processing, and what we were trying to accomplish with this processing, and why pulling the data out was so valuable. Esra, I think you might still be muted.
ESRA ABUMOUNSHAR: All right, here we go. Can you hear me now?
ROBERT MANNA: We can.
ESRA ABUMOUNSHAR: OK, so again, I'm going to walk you guys through how we utilize Dynamo. Thank you, Rob, for walking us through how the data setup happened. That was kind of the heavy lifting part of this master plan project. And that kind of paved the way for how we're going to be using Dynamo, and what the workflow is going to start to look like. So I just want to go over some of the highlights and what we were able to achieve through this workflow.
In terms of time, we were able to do the 3D modeling of all of these components from an Excel sheet in less than 25 minutes for each push. And I'll go over that in detail in a little bit. In terms of how many people were doing the modeling, it was basically the person who was pushing the data from the Excel sheet through Dynamo into Revit, so it's basically one person.
And then also, how many components? It was over 350 mass families and their parameters per push, per iteration. And the main benefit, what we walked away with, was a graph that we can utilize and adopt on other projects. And that was really key for us going into this workflow initially. So the first thing we wanted to figure out was how do we develop our workflow? We identified four factors that would drive the development of the workflow itself.
And those gave us an understanding of what the objectives and the restrictions were going to be going forward. So the first factor, as in most projects, is the time and the budget constraints. Understanding what the time limits are and what the budget constraints are raises the question of what the client's deliverable requirements are. So we tied these two together and assessed the software capabilities: what software are we currently using?
And also, our technical support team, meaning not just what help do we need on a project, but assessing what the current workflow is here at Stantec. What is the go-to software that we use in order to document and come up with the illustrations that the client was asking for? And in this case, specifically, they were looking for a stacking diagram, so section cuts of buildings that show them how much of each floor is being taken up by a specific program or category.
So we took a look at the software we have, what our toolbox is, as Robert was mentioning earlier. And how comfortable are our staff in terms of using it? And do we need to introduce different software? In this case, yes. We identified the restrictions: there was an excessive amount of 3D modeling being asked of us by the client, so doing it manually wasn't an option.
So on the previous slide, there was a statement about an objective: our objective was to automate the process of pushing a lot of geometry into Revit, or into the 3D modeling platform, in a short amount of time, for different iterations. In terms of the software that we have here, we are heavily reliant on Autodesk products within our company. So our go-to is Revit, and we have recently started to use FormIt. Both of these are known to be compatible with Dynamo.
So, OK, which one do we use? Initially, our go-to was FormIt, because we're not really doing any heavy modeling. But we quickly realized that if we were to streamline not just the 3D modeling but also the production of illustrations, given the time and the budget constraints, maybe it was best if we went with Revit. And that was quickly realized when we started to generate the geometry in FormIt. FormIt was, yes, compatible with Dynamo, but the geometry that was produced was pretty limiting.
It was too simple for the tasks that we were trying to accomplish, in the sense that we couldn't create groups of geometry with properties associated with them. In Revit, that came naturally, as an inherent feature: if you're pushing mass families, you can host and manipulate the properties of each family, add, delete, modify, and also tag and filter once we move forward with the illustrations and the section cuts on the plans. And that was pretty limiting in FormIt. So going forward, we decided Revit and Dynamo were our go-to.
And now it's time to start looking at how we develop our Dynamo logic. You know, that's step one of the process now that we know what the workflow starts to look like. And this is really best practice. For those of you who are familiar with Dynamo, this is really key when you're trying to create something that needs to be adopted by other users. Especially if you're the expert, and you're someone who's going to walk away after your role is done, keeping your Dynamo logic clear and straightforward is going to help a lot in allowing others to adopt it, utilize it, and move forward.
So we divided our logic into six specific components, color-coded them, and left notes across the graph to allow people to navigate through it on their own after my role was done. So the first one is inputs. Our inputs were basically everything that Robert just covered in terms of data. All of the data was coming from those Excel sheets, and all of those Excel sheets contained more information than we needed for this exercise.
And when we're talking about the Excel sheet for Power BI, that's not the same Excel sheet that was brought into Dynamo. We simplified it heavily in order to bring in only the information that we needed to push the 3D modeling into Revit, or to create the 3D version of the data in Revit. So Input is basically your Excel sheet and Excel data. Import, however, is anything you're importing from Revit.
So now that we have an Excel sheet with data in it, what Revit elements can we start to hold onto in order to generate the 3D models of that Excel data? Our imports, in this instance, were our linked Revit models. Each model represented one building within our site, or within the client's campus, and we'll go further into that in the next slide. But Import was anything that you're importing from the Revit model, which also included levels and mass floors.
Create Geometry is really straightforward: all the geometry within Dynamo that you're pushing into Revit, which in this instance, or further along, would be your mass families. And next was the Dynamo visualization. This was really, really important: if you're utilizing Dynamo as a platform to streamline and automate the 3D modeling, then you probably should look into visualizing what you're making, or what you're trying to push into Revit, directly in Dynamo before pushing it into Revit.
This is also best practice, because especially with this amount of heavy information, this amount of data, you don't want to push, wait 25 minutes, and then realize, oh, this is not showing up correctly, or, oh, I forgot to take this component out, or I didn't want to show this in this iteration. So visualizing it in Dynamo prior to waiting the 25 minutes is part of saving time and pushing only what you need to the Revit model.
So the last two things are pretty straightforward. Again, Output 1 and Output 2. Output 1 would be all of the geometry pushed in as mass families, and Output 2 is all of the parameters associated with each one of these families that you're pushing into the 3D model. So taking a step back, now that we have established a logic for this graph, and knowing that we needed to import things from Revit, what was the best way to set up a Revit model?
Our Revit model consisted of the main model, which would be the main launcher for the Dynamo script, so we were going to launch our Dynamo from the same model that hosts all of the linked models. And in this instance, we were working with over 30 buildings total, building a Revit model, a mass model, for each single one of these buildings, and then bringing that in as a linked model into, again, the model that we were going to launch Dynamo from.
This helped keep the model clean. This helped kind of drive the structure for our graph in terms of the import portion and keeping that clean as well. So kind of this was what we agreed on as a team to bring in the host for our 3D geometry later on.
And just one last note. When setting up the model-- and I'll go over this slide as well-- when hosting to elements from the linked models, one thing that Robert covered early on is the naming process. Keep everything consistent. Knowing that we were going to jump from Excel to Power BI to Dynamo to Revit, we knew early on that we needed to keep the naming consistent.
So if my levels read a certain way in Excel, those levels had to read the same way within the linked models within my site model. So understand that consistency is key to handling data and to handling Dynamo in general. So make sure that you're being consistent and straightforward when you're naming these components. Now that we figured out the logic and the best setup for our model to handle that many buildings, we needed to figure out what the graph structure was going to look like.
And always keep your Dynamo kind of simple and straightforward. And if you start to get in the weeds, or your graph starts to look really spaghetti-like and really complex, take a step back and revise or re-evaluate what your graph structure starts to look like. For us, we try to keep it simple. Again, this is for the sake of further adoption on other projects, so we knew early on that we wanted to have one source to go back to for information.
And in this case, since we're importing from an Excel sheet, we needed to create a dictionary. Dictionaries are by far the most powerful component of our graph. They behaved as a one-stop shop for us. All of the information that was coming from the Excel sheet was tied into this dictionary early on in the graph. And most of you who have used a dictionary before know that once we establish this, we can tie the data directly from Excel to those components in Revit.
So again, this is where the naming process, the consistency in naming, was key. If those names were the same across the Excel and the Revit models, the dictionary was able to develop keys and values for them and correlate that relationship within the graph itself. So lastly was developing the geometry.
And when we're developing the geometry, we had to ask ourselves, since we were hosting those mass families onto the floors themselves within Revit, what's the best way to kind of do the math? What's the best way to kind of dedicate the areas to each individual family mass that will represent how much of the floor within Revit is being occupied by a specific program that's coming from Excel?
Well, you can do all the math you need in Dynamo, and you can divvy up the geometry and make your graph extremely complex and extremely long. But if we're already doing the clean-up in Excel, why not do that math in Excel as well? So we went ahead and assigned a value, a percentage, to each single one of these components within the Excel sheet initially, and we just told Dynamo to grab that column from the Excel sheet itself. And all we did was tell Dynamo, OK, here is the percentage, now divvy up the geometry.
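As a rough illustration of that idea, outside of the actual Dynamo graph, the logic inside the script boils down to something like the following Python sketch. The names, the key format, and the numbers are hypothetical; the real graph used Dynamo nodes and the linked Revit models rather than hard-coded values.

```python
# Simplified sketch of the "let Excel do the math" idea: each Excel row already
# carries the percentage of its floor that a program occupies, so the script
# only has to multiply, not compute. Names and structure are illustrative.

# Rows as they might come out of the simplified Excel sheet for Dynamo.
excel_rows = [
    # (building, level, department, percent_of_floor)
    ("Bldg A", "Level 1", "Cardiology", 0.40),
    ("Bldg A", "Level 1", "Oncology",   0.35),
    ("Bldg A", "Level 1", "Public",     0.25),
]

# Floor areas imported from the linked Revit mass models (hypothetical values).
floor_areas = {("Bldg A", "Level 1"): 25000.0}

# Build a dictionary, the "one-stop shop": key = building|level|department,
# value = the data needed to size and parameterize each mass family.
masses = {}
for building, level, department, pct in excel_rows:
    key = "|".join([building, level, department])      # consistent naming is the join key
    masses[key] = {
        "Building": building,
        "Level": level,
        "Department": department,
        "Area": floor_areas[(building, level)] * pct,   # divvy up the floor by percentage
    }

for key, data in masses.items():
    print(key, round(data["Area"], 1))
```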
Rather than doing all of the math and all of the excessive or heavy lifting in the Dynamo graph, we kept the Dynamo graph clean, and we allowed Excel to handle that portion of the process. We can go to the next one. So I'm just going to conclude the Dynamo portion by saying the main benefit, our takeaway from the process, was separating the data, understanding that we needed to separate the data from the script.
Even for upcoming projects and further pursuits of data in Dynamo, we always need to separate the data from the script and handle the data prior to bringing it into the script: clean it up, shorten it, and take only what you need, understanding always that less is more when handling that much data through Dynamo to develop your 3D model. And that made it so much easier for us to troubleshoot. A lot of the people who were on the project had zero experience in Dynamo or coding.
And the ability for me, as the Dynamo expert, to walk away, hand this over to them, and have them troubleshoot on their own was really, really essential, and was the main driver for this type of graph and the way it was laid out. Troubleshooting was so much easier because those who were running the script knew that the graph was already set up in such a way that if anything went wrong in terms of the end geometry, it did not require the designer to go into the script and adjust the script.
Because the heavy lifting was done on the data side, we knew to look for the error or the notes within the Excel sheet, or within the Revit model itself, because there must have been something wrong with the Excel or with the Revit model. And most of the time, that was the case. It was something that was missing from the datasheet, or something that was mistyped in the datasheet, or something that was misnamed in the Revit model, or parameters where Dynamo could not link the two elements together because of either the Excel or the Revit side. Dynamo was just acting as the middleman, obviously.
The last thing you needed to look at was the graph. Review your Excel first. Review your Revit model, and then look at your Dynamo. And that made it so much easier for them to troubleshoot, because they're comfortable with Excel. They're comfortable with Revit, which they've been using for most of their projects. And if there was something in the Dynamo graph, they could reach back, or oftentimes, they were able to handle it on their own. So again, that drove the scalability and the flexibility.
We were able to adopt this on multiple projects, the reason being that handling Excel was easy in terms of creating a template for people to follow on other projects, and then just pushing that Excel into Dynamo. And again, Dynamo is not perfect for every single project, and neither is Excel, neither is Revit, so all you needed to do was tweak it a little bit to fit your project's needs, knowing that you have a template to start with, the graph is working, and your Revit naming has to be consistent throughout the process. So that's it for me. Rob, take it away.
ROBERT MANNA: Cool, thank you very much. So, yeah, Esra has now sort of taken us through what we did to generate outputs in Revit. And so now we just want to talk a little bit about what all the outputs were, right? So visualizing data, we want to put data in context. And so Revit was a great way to put data in a physical context to say here's the relationship of your departments to their physical locations. But at the same time, Power BI, which is great for online data interrogation, was super powerful from a data validation and exploration perspective.
And then Power BI Report Builder was also really, really useful for building or creating those PDFs that people always want. Everybody always wants a table that's in a PDF, that looks nice, that they can put into the massive multi-page master plan document. And that's fine. That's another valid output. Again, we were able to achieve all three outputs by having a centralized data model. So I'm going to move through some of this stuff pretty quickly in the interest of time. I want to be able to leave enough time for good Q&A.
You know, this looks pretty, right? Everybody loves this graphic, ooh, ah, that's awesome. I want some of that on my project. Well, the reality is it's not useful. It looks pretty. You can say, oh, wow, there's a lot of parking here, and there's a lot of red stuff here, and there's a lot of purple stuff here, and you can look and say, OK, you got some idea. But if you really want to understand the data in context, these images are not useful. What is perhaps more useful is starting to create a story about your data.
So being able to put the data in some sort of context where you can tell a story with these different dimensions and people can look and say, oh, that's interesting. Like I look at these graphs, and I say, oh, that's interesting. There's as much parking as there is health care. I don't know. Personally, that's kind of weird to me. It's not my job to make that decision. I'm not the master planner. I'm not the client. But this data tells a story. It's not hard for you to jump to that understanding when you start to look at these charts and graphs here.
In the same way, really good for validation, really good for illustration, particularly if you don't need the architectural illustration, this is a building stacking chart. Because it's in Power BI, it can be all the buildings, or it can be one building, or it can be three buildings. It doesn't matter. Power BI lets us do that. And so if you are just interested in the relationship of your program in your buildings, this becomes a really powerful tool. Not only that, it becomes really powerful for data validation.
So our client was able to use Power BI, and then if they saw something and they were like, wait, why is there this much inpatient on that floor? Why is there only 18k? What's this other space? They could actually use the drill-through capabilities of Power BI to say, OK, show me where this number is coming from. What's generating that number? And that drill-through then gets you to, oh, here are all the suites that are adding up to that number that you weren't sure about.
So if there was any question, they could go and look and say, wait, this got classified incorrectly, or this really should be in this department, or it should be this category. Whatever the case is, they were able to help us validate their data because we don't know their campus. They know their campus. And in fact, we exported this information out to Excel so they could actually mark it up, so they could use Power BI to interrogate, investigate. Then they used Excel to provide markups back to us and say, hey, this needs to change to that. This is wrong, whatever the case is.
In addition, we can start to build more complex visuals. So there are visuals in Power BI where you might need to do more data modeling or additional data transformation. These Sankey diagrams, for example, are also very popular because they show you the relationship of the program to the buildings. But we couldn't just use the data model that we had as our core data model. This actually required additional data modeling to be able to make this visual work, but again, it's very powerful and very helpful.
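To give a sense of the kind of extra transformation a Sankey visual typically needs, here is a small, purely illustrative pandas sketch that reshapes a fact table into the flat source/target/weight list a Sankey expects. The column names and values are hypothetical, not the project's actual model.

```python
# Sketch of the extra shaping a Sankey visual typically needs: a flat
# source -> target -> weight table rather than the star-schema fact table.
import pandas as pd

facts = pd.DataFrame({
    "Department": ["Cardiology", "Cardiology", "Oncology", "Parking"],
    "Building":   ["Bldg A", "Bldg B", "Bldg A", "Bldg C"],
    "Area":       [12000.0, 8000.0, 15000.0, 30000.0],
})

# Aggregate area for each department -> building flow.
sankey_links = (facts.groupby(["Department", "Building"], as_index=False)["Area"].sum()
                     .rename(columns={"Department": "Source",
                                      "Building": "Target",
                                      "Area": "Weight"}))
print(sankey_links)
```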
Calculating additional values, again, you could either do this in Excel, much like we aggregated the data together. In this case, this was done in Power BI. We actually calculated the grossing factors. We knew how much usable space there was in all the suites, and we knew what the gross floor plate was. With that data available, we could calculate what the grossing factor was for each individual floor plate of each individual building as well as the grossing factor for the buildings themselves.
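As a concept sketch of that calculation: one common definition of a grossing factor is gross floor-plate area divided by the usable (suite) area on that floor. The project computed this as a measure in Power BI; the exact definition used there may differ, and the numbers below are invented.

```python
# One common way to compute a grossing factor: gross floor-plate area divided by
# the sum of usable (suite) area on that floor. Definitions and numbers here are
# illustrative; the project computed this as a measure in Power BI.
suite_area_by_floor = {("Bldg A", "Level 1"): 18000.0,
                       ("Bldg A", "Level 2"): 21000.0}
gross_area_by_floor = {("Bldg A", "Level 1"): 25000.0,
                       ("Bldg A", "Level 2"): 26500.0}

for floor, usable in suite_area_by_floor.items():
    grossing = gross_area_by_floor[floor] / usable
    print(floor, round(grossing, 2))
```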
So again, having that centralized data model allows you to do this math in the place where it makes the most sense, depending on the outcome you're looking for. Again, because we're pushing data into Revit, we're not just creating geometry, we're actually attaching parameters to that geometry and filling those parameters out. So all of these views could be generated with your traditional tools in Revit: filters, color schemes, overrides, schedules, tagging, et cetera. So there's actually tagging going on. It's a little bit hard to see here, but there are tags here.
We've got a schedule here. So all of this was accomplished by the fact that we were able to push that data through Dynamo into the geometry that Dynamo was creating, giving us a lot of flexibility. So when they said, well, we want to see a certain department, and where is that department on the campus, and we don't want the noise of all the other departments, that was relatively easy to accomplish.
At one point, the client said, well, help us understand, big picture, what our options are for new site development. So there were three possible sites, three possible sizes, and three high-level program scenarios: inpatient only, outpatient only, or outpatient and inpatient combined. So that's 27 different iterations that our one designer had to explore in a two-week window.
The script became hugely valuable for being able to quickly process those options, because all she had to do was create an Excel spreadsheet that represented the different combinations of program for the different sizes and the different scenarios. Then she had to create envelopes for the three sizes on the three different sites, and really the envelope just got taller, so it was sort of like 3.5 envelopes or something like that that she had to create.
And so she created our envelopes, created the spreadsheets. She used the script to quickly generate all the different pieces of geometry with the data in it. And then, of course, your traditional Revit tools for quickly creating all the graphics. Super, super powerful in terms of what she was able to do. And again, a great example: because there was no data processing embedded in the Dynamo script, it was really easy to adapt.
Lastly, paginated reports, or Report Builder, gets you your PDFs. This is all data-driven formatting, so all the color-coding was coming out of the data, the same colors that we were using in Revit as well; that information was actually in the data model. Unfortunately, Power BI, the online service, does not give you as much control over conditional formatting as Report Builder does, but, again, this is a great example of reusing that data source in multiple locations and for multiple purposes.
So really, one of the big implications here, right? What did we learn? And what didn't we do? We did all this stuff on the left, and that was wicked awesome. What we actually didn't do is we never got to the master planning. The client put the whole thing on pause after we did our report of the existing conditions and those schemes or scenarios for the very high-level development possibilities.
And really, in my opinion, we did such a good job that the client was able to step back and say I need my strategy department to think more about this to answer the questions that Stantec is asking in order to proceed with the actual master planning. So our goal was to really look at growth and distribution over time. We actually developed some macros in Excel to help with data entry. We had to be thinking about multiple options over multiple-year periods, and multiple benchmarks. We were all set up for this. We just haven't done it yet.
The really interesting thing is that this is when it starts to get really interesting in Power BI, and even in Excel, because you actually start to think about time, and that adds a level of complexity in terms of your data model. You have to start to write more complex measures to aggregate the values appropriately, because you want to add things up over time cumulatively. Again, unfortunately, we didn't get there in this project, but we're certainly well prepared for it when the client does come back and say, hey, we're ready to go now.
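As a concept-only sketch of what cumulative aggregation over time looks like, here is a small pandas example of a running total per department. On the project this would be written as DAX measures in Power BI rather than Python, and the data below is invented.

```python
# Concept sketch of cumulative aggregation over time (the project would do this
# as measures in Power BI; this just illustrates the idea with pandas).
import pandas as pd

growth = pd.DataFrame({
    "Year":       [2022, 2023, 2024, 2022, 2023, 2024],
    "Department": ["Cardiology"] * 3 + ["Oncology"] * 3,
    "AddedArea":  [2000.0, 1500.0, 0.0, 500.0, 2500.0, 1000.0],
})

# Running total of added area per department, so a given year shows everything
# built up to and including that year.
growth = growth.sort_values(["Department", "Year"])
growth["CumulativeArea"] = growth.groupby("Department")["AddedArea"].cumsum()
print(growth)
```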
So final thoughts: taking the time to manage and transform data in an independent toolset is just super, super valuable. We used Excel, and Power BI, and Dynamo, and Revit. I don't care what you use. You can use Airtable. You can use Crystal Reports. You can use Grasshopper. You can use Hypar. You can combine those with the tools that we used. There are lots of tools out there that help you do this. The point is that it is very much worth it to spend the time to organize your data and build that data model, because it gives you so much more flexibility.
Paying attention to your granularity is important. So again, we kept that suite granularity even though we were summing these things up into higher levels. Having that granularity gives you more flexibility with what you do with the data. Always, always ask yourself: what are the stories that we're trying to tell with this data? What are the questions we need to be able to answer? What do I think the client wants to know? Whether the client is your own internal project team or the actual client that's hired your company to do the work for them.
Geometry is really just another outcome, right? Geometry is data at the end of the day, so that's another outcome for us in this particular case. And then ultimately, again, having separate data helps set the stage for more advanced work. So whether that's what we didn't get to do in this case, or optioneering with the data model we have, with the information that we now are able to ingest, it is entirely conceivable that we could optioneer those design scenarios. I'm not saying it would be easy, but we're in a place where we could look to tackle that if we wanted to because we have well-organized data.
And it's not only in one place. It's not confined in a script. It's in a tool that's accessible to a wide variety of folks at the end of the day. So with that, thank you very much for tuning in. I'm looking forward to the live Q&A if you're attending this as part of the live AU event this year. And we will certainly be prepared to do additional on-screen demonstration in Excel during that Q&A process, so I look forward to some good questions. Thank you.
ESRA ABUMOUNSHAR: Thank you all for joining.