Key Learnings
- Discover how AI and machine learning can be applied to the AEC industry.
- Learn how to use machine learning to build better BIM data.
- Learn how to use historical data to enhance company processes.
Speaker
- Patrick Eldridge
Patrick Eldridge is a seasoned professional currently serving as a Software Engineer and the AI Team Lead at a midsized MEP firm in the Midwest. With a robust background in Mechanical Engineering spanning seven years, Patrick transitioned into the realm of software development two years ago, driven by a passion for technology and innovation. In their current role, Patrick focuses on integrating Artificial Intelligence into the AEC industry, specializing in deep neural networks, computer vision, and large language models. They have successfully deployed numerous AI-focused tools internally, significantly enhancing the efficiency and capabilities of their company. A tech enthusiast and optimist, Patrick believes that while a bright future for the industry is possible, it requires deliberate and intentional efforts to achieve. Their commitment to advancing technology in engineering reflects this belief, combining technical expertise with a visionary outlook.
PATRICK ELDRIDGE: All right, thank you for clicking on the link to check out this talk. I'm just going to show the Autodesk safe harbor statement very quickly. I'm sure you're used to looking at it by now. This is From Lines to Life-- Automating CAD-to-Revit Workflow With Machine Learning.
This talk is mostly going to be about the end-to-end process that we went through. If you're trying to automate the CAD-to-Revit workflow, then you'll probably learn quite a bit about how we did it. But even if that's not the particular thing that you're trying to do, even if you're watching this and you want to start innovation at your company, you should still get a lot out of this.
I'm going to try to make sure that there are plenty of takeaways so that when you leave this talk, you feel ready and more prepared to do things. So let's go ahead and get started.
My name is Patrick Eldridge. I went to the University of Cincinnati, and I was a mechanical engineering technology student. I co-oped for a plumbing contractor. That's where I sort of fell in love with the AEC industry. It's still a little bit rough around the edges. It's not totally settled. It wasn't in the 2000s when I was co-oping, and it still kind of isn't.
After I graduated college, I wasn't able to get right into it, but I joined KLH in 2015 as a plumbing engineer and started doing project work. I'd been trying to make a video game. And while I still have yet to make that, I was learning how to program.
And then I kind of caught the programming bug at work and slowly, over time, worked on project work less and on coding more until, in 2022, I transitioned to full-time software developer. A lot of the tools that I make have an AI or a smart algorithm component to them. And I would consider myself to be a tech optimist, but only if we guide it there. We are the ones that need to control that conversation. It's not just going to happen on its own.
Just a little bit about KLH. We're a small-to-medium-sized firm out of Fort Thomas, Kentucky. We serve a wide range of market segments. As of 2017, all of our work is done in Revit. And we have a software department where we write custom tools that support the workflows that we are building.
Here's a quick review of the learning objectives. And all right, let's go ahead and get into it. So this all started when the mechanical department head saw a post on LinkedIn.
Someone said, hey, my company is starting a journey from 2D to 3D. We have a lot of existing blueprints for our buildings. And we're looking to transition those to BIM models. If anyone has any advice, we would love to get together.
The department head responded. He probably didn't think too much of it, but that led to a call just to feel each other out. And then that led to a demo of the existing tool that we had for the CAD-to-Revit process.
And I have a quick video showing a demo of what that looks like. I don't know if you guys can hear the audio, but it's inspirational tech music. That's not the super important part. I'm going to go ahead and play the video. That kind of goes into it.
So we're a 100% Revit company, but we're still happy to work with architects who are in AutoCAD. But our model builder in Revit needs KLH standard layer names. And that meant-- and here, it's showing it-- we have to map the architect layer name to the standard layer name.
What you're looking at right now is a sped-up version of this tool. We used to have to do the mapping manually. Then we made a machine learning model that would map those layers automatically, and it drastically reduced the amount of time we spent doing that. It displayed a quick confidence score, and the user could correct any mistakes.
And then, once we had brought it into Revit, we linked the DWGs to each of the levels, and then we ran the tool. And what we were left with was a functioning Revit model. So that went well, but there's something I want to point out. On the left is the plan view. It looks really great, but the reality of the situation is on the right. There, I've selected one of the walls--or one half of one of the walls. We just took the wall lines and extruded them up to the next level. And then you can see the furniture plan and everything is still just a 2D CAD plan that was extracted out of the original and linked in here.
So these are paper-thin walls--we went with paper-thin walls because it was easy, and we just needed something to host our hosted families to. So this had been working for us.
And then, the windows and doors are just extruded lines. Like I said, on the left, you can see it looks really good in a floor plan. But on the right, that's what's actually there.
So when the client saw that, they said, we're not just going to be able to pick up your tool and use it. We need a better model. The whole conversation culminated in a proposal to them with a per-store dollar amount. Here's how much it's going to take to do a store. If we landed that proposal, there would be hundreds and hundreds of stores, and they wanted them very quickly.
So we're an MEP firm, and we had built this for ourselves. Now we were going to have to figure out how to do this for someone else, using a different process. That department head came to me and asked, can we do this with AI?
And so I had to really look at it and try to understand the problem. And he caught me up on what the situation was. But I want to quickly dive into why he would think to ask that question, and then why he came to me.
So KLH started the AI journey in the summer of 2018. There are YouTube videos out there on getting into machine learning--basic model setups on things like classifying handwritten digits, or pictures of clothes, or whatever. Those data sets are really, really clean. There's a big gap between those kinds of tutorials, where the other person has done all the hard work and you're just following along, and actually going out and doing it yourself.
So that summer, I made a quick polynomial curve, sampled it, and then trained a machine learning model on those samples. It was just to get comfortable with that process. And then, from there, I was able to move on.
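To make that warm-up concrete, here's a minimal sketch of the same kind of exercise--sample a known polynomial with some noise and fit a small model to the samples. This is illustrative, not the original code:

```python
# Warm-up sketch: sample a known polynomial, add noise, fit a small model.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Ground truth: y = 0.5x^3 - 2x + 1, sampled with a little noise.
x = rng.uniform(-3, 3, size=(500, 1))
y = 0.5 * x[:, 0] ** 3 - 2 * x[:, 0] + 1 + rng.normal(0, 0.1, size=500)

model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000)
model.fit(x, y)

# The model should roughly reproduce the curve it never saw directly.
print(model.predict([[1.0]]))  # true value is -0.5
```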
So the first model that KLH released--I built it as part of a team--came out in February 2019. It was that layer translator. Just a reminder, that's what it looks like.
And then, the next thing that we moved on to was 3D classification of Revit families. I actually talked about this in another Autodesk University talk that was posted in 2020--the all-remote year. We've done a talk on the layer translator and then that other talk in 2020; I'll put links to both in the handout if you want to dive into those more specific versions.
And then--sorry, this is an example of classifying tables and chairs from that talk in 2020. And then, in November 2022, I started doing 2D object detection.
And you can see the output of one of those here. This is a very simple tech demo that I had that I got running on a Python notebook. I'm sorry if that doesn't make any sense. But this is just identifying outlets in a picture.
And I had been giving lunch and learns about all this stuff and showing everyone. So everyone had an idea about what me and my team had been doing. So that was just a brief background--that's how he knew to come to me.
So that's some quick background. Back to the problem, 2D to 3D. I broke the problem into two different parts. We needed better walls, and then we needed to actually place window and door families.
We didn't know exactly what to do with walls at first, but it very quickly became obvious that we should use some kind of algorithm to look at the actual wall lines, find the parallel pairs, and go from there. And for the door and window families, since those are actual objects that you can identify--not just a wall of arbitrary length--drawing a bounding box around each one and then somehow placing the family based on where the model had identified it, like with the electrical outlets, was going to be the way to go.
So we came up with a game plan. We need the walls first, because the windows and doors are going to be hosted to them. And then we're going to start with windows for the machine learning part. Most windows use that window icon. They're symmetrical, so if they're rotated, or flipped, or mirrored, it's still going to look like that.
So that was going to be the less complicated of the two. Doors--I'm showing that basic symbol, but there can be double doors. There can be double doors that open the full 180 degrees. There are a lot of different symbols for doors. So we were going to have to come up with a way to identify not just doors, but all the different kinds of doors.
So that was how we were going to tackle it from a feature standpoint--what we were going to build out and when. Almost immediately, we needed to figure out how to architect the solution. It's a funny word--code architecture, not building architecture.
So we had two main choices. We could either deploy all the code locally, or we could set up a client-server architecture. Deploying locally is good because it makes the app less complicated from a developer standpoint. Everything you need is right on the same machine, and you don't have to worry about some other machine. If the server goes down--if something goes wrong with the Windows kernel and suddenly three quarters of the world's servers shut down and you can't do your work--that won't affect you if everything is just on your machine and your machine is up.
So that's deployed locally. Client-server does make the code more complicated, and you need two machines working, connected via internet or intranet. But it's easier to update the server code, because there's only one server and not hundreds of individual computers that you'd somehow need to make sure get updated, or that you have to make sure the users update.
They have to either restart or run some command to update their code. They're not super good about doing that. I don't know about all the engineers, architects at your company, but they really just need to get their job done and may not necessarily have time or remember to keep all their software up to date.
So those are the strengths and weaknesses if you're looking at these two architectures. We went with client-server, and I've got a quick diagram here to hopefully make it a little more clear. On the left, there's a user computer. It has a Revit version installed--Revit 2024--and there are add-ins for Revit 2024 that are installed.
One of those add-ins at KLH is MMA, our Model Management Assistant. Inside the actual add-in itself is the wall algorithm. And then there is the client's code that bundles up all the information that the machine learning model is going to need and sends that to the server for doors and windows.
That does whatever it's going to do. It's a black box. The engineers and model managers don't need to know. And then it sends back something that's expected, which is the bounding box around the objects and whether or not it's a door or window for now. We expect to add more later, but for right now, this is all it can do because it's all that we need it to do.
So hopefully that makes it a little bit more clear. Those dashed lines are actual physical Cat6 cables. So these are running on different machines.
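For anyone who wants to picture the server half of that diagram, here's a hedged sketch of what a minimal detection endpoint could look like. The route name, payload shape, and run_inference stub are all hypothetical--the actual KLH service isn't public:

```python
# A minimal sketch of a detection server; names here are hypothetical.
from flask import Flask, request, jsonify

app = Flask(__name__)

def run_inference(image_data):
    # Placeholder for the actual model call--the "black box" above.
    # A real implementation would decode the image and run the model.
    return []  # list of (bbox, category, score) detections

@app.route("/detect", methods=["POST"])
def detect():
    # The client bundles up the exported plan image plus metadata.
    payload = request.get_json()
    detections = run_inference(payload["image"])
    # Return only what the add-in expects: box, category, confidence.
    return jsonify([
        {"bbox": box, "category": cat, "score": score}
        for (box, cat, score) in detections
    ])

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```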
All right, so how can I actually do this, or what did this look like? We had to add functionality to the existing tool. This is the existing UI for actually performing the CAD convert. You load the DWG files in and set them to be on the correct level.
We're going to have to add two new buttons, one for walls, and then one for windows and doors. Why not just one button? Well, the wall algorithm that we ended up with is pretty good, all things considered. It runs pretty quickly, and it's about 85% to 90% accurate.
That's good enough, but we still do need a little bit of user cleanup. And we didn't want to try to run the windows and doors until we had the walls finalized by the user. So that's why we split it up. So yeah, once we were able to add those buttons, we were able to start testing things and get into the algorithm.
So this is me as Buzz, explaining to the software co-op that I had the problem he was going to be solving over the next couple of months. It was just taking parallel lines and figuring out how to draw walls in between them. I'm not going to show the entire algorithm--that would take way too long. But just to give you a taste, the algorithm mainly focuses on sorting all the wall lines into groups by direction.
So we take all the lines that are straight up and down, all the lines that are left-right, all the lines at the same angle, and we group them together. Then, in those groups, we just pick a line to start with and find the nearest line that's facing the same way. And then there's some logic to say whether or not those two lines are a wall.
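As a rough sketch of that grouping step--assuming each line is just a pair of (x, y) endpoints--this is a simplified illustration, not the production algorithm:

```python
# Group lines by direction. Lines whose angles match (mod 180 degrees,
# within a tolerance) land in the same bucket. The simple rounding here
# can split angles that straddle a bucket edge; a real version is fuzzier.
import math
from collections import defaultdict

def direction_key(line, tol_deg=1.0):
    (x1, y1), (x2, y2) = line
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    return round(angle / tol_deg) * tol_deg

def group_by_direction(lines):
    groups = defaultdict(list)
    for line in lines:
        groups[direction_key(line)].append(line)
    return groups
```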
And then you draw the wall centerline. In the Revit API that we'd be using to actually construct the walls ourselves, you need to know the starting and ending points of the centerlines of all the walls.
So OK, no worries. There's two lines right next to each other. We find the center line-- easy peasy.
You also end up needing to specify the thickness. So if these two gray lines are 15 feet apart from each other--if that's the nearest pair--it's probably not going to be a 15-foot-thick wall in most places. But 15 inches is a decent size for an exterior wall that you'd come across. We had to pick an arbitrary maximum, and 15 inches, I think, was the width that we ended up going with.
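Here's a simplified sketch of the pairing step for one easy case--two horizontal lines--with that maximum-thickness cutoff. The real code handles arbitrary angles and far more edge cases:

```python
# Pairing sketch for two parallel horizontal lines with overlapping spans.
# MAX_THICKNESS_FT is the arbitrary cutoff mentioned above (15 inches,
# expressed in feet); anything farther apart is not treated as one wall.
MAX_THICKNESS_FT = 15.0 / 12.0

def wall_from_pair(a, b):
    """Return (centerline, thickness) for two horizontal lines, or None."""
    (ax1, ay), (ax2, _) = a          # assume horizontal: constant y
    (bx1, by), (bx2, _) = b
    thickness = abs(ay - by)
    if thickness == 0 or thickness > MAX_THICKNESS_FT:
        return None                   # same line, or too far apart
    mid_y = (ay + by) / 2
    x1 = max(min(ax1, ax2), min(bx1, bx2))   # keep the overlapping span only
    x2 = min(max(ax1, ax2), max(bx1, bx2))
    if x1 >= x2:
        return None                   # no overlap, so no wall between them
    return ((x1, mid_y), (x2, mid_y)), thickness
```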
So no problem. A long T wall--again, we had to write some custom intersection code. The intersection API with lines is not super straightforward. There's a little bit of a learning curve there, and even then, there was a bug.
So once we were able to tell where these lines intersected, we could suddenly have intersecting walls, and we were all good to go. Again, easy peasy. No problem. A short T was where things started to get a little interesting.
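For reference, the core of a hand-rolled 2D segment-intersection test looks something like this--a standard parametric formulation; the tricky edge cases mentioned above live in the parallel/collinear branch:

```python
# Standard 2D segment intersection using the parametric form p + t*r and
# q + u*s; a hit requires 0 <= t <= 1 and 0 <= u <= 1.
def cross(ax, ay, bx, by):
    return ax * by - ay * bx

def segment_intersection(p1, p2, q1, q2):
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    sx, sy = q2[0] - q1[0], q2[1] - q1[1]
    denom = cross(rx, ry, sx, sy)
    if denom == 0:
        return None  # parallel or collinear; the edge cases live here
    qpx, qpy = q1[0] - p1[0], q1[1] - p1[1]
    t = cross(qpx, qpy, sx, sy) / denom
    u = cross(qpx, qpy, rx, ry) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (p1[0] + t * rx, p1[1] + t * ry)
    return None
```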
So just for reference, the gap between this line and that line--the end of that short left-right wall and the wall that it's teeing off from--is shorter than 15 inches. So the two lines running left-right will make the wall the way that I've shown it, with that red centerline on the x-axis. But we'll also get a very short and very fat wall, because those two lines are within 15 inches of each other.
So we'll have overlapping walls. That's one of the things we need to clean up--one of many edge cases. And then we need to deal with this: when just the wall layers get exported and you don't look at doors, this is what that looks like.
Now, there's supposed to be a door there, and Revit needs a wall to host a door to. So you end up needing to close that gap. And then that logic gets really tricky, because is it just supposed to be an opening?
If it's exactly 36 inches, OK, maybe it's a door. But if it's 29.5, is it an existing 29.5-inch door that we should place something for? Or is it just a weird old basement in a really old building in New York somewhere, and that's the gap that people could fit through back in the day, when there were no codes to govern anything?
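A toy version of that gap heuristic, just to show the shape of the logic--the real rules and thresholds were worked out with the model managers:

```python
# Illustrative gap classification; widths and thresholds are assumptions.
def classify_gap(width_in):
    standard_doors = (24, 28, 30, 32, 36)   # common nominal widths, inches
    if any(abs(width_in - w) < 0.5 for w in standard_doors):
        return "door"        # close enough to a standard leaf width
    if width_in < 12:
        return "close"       # too narrow to be anything; heal the wall
    return "opening"         # leave it as an opening and flag for review
```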
So it started to get really tricky. But we got algorithms that, with minimum cleanup, would get us there. And then, once we were satisfied with that-- and all the model managers, more importantly, were satisfied with that-- we had to move on to Step 2.
And that involved computer vision. There are lots of different models for computer vision, and for this particular task, we knew we were going to need to fine-tune a model, building on the electrical outlet tech demo that I had working in November 2022.
These models are supposed to find things in photos--I'm sure you've seen all sorts of demos where people do "find all the hard hats and safety vests in photos from a construction site" to make sure all the workers are wearing proper PPE. That's what these models are supposed to do.
So trying to find the set of parallel lines that is a window in a sea of other parallel lines--that is, the whole rest of the blueprint--was way outside of what these models were trained to do. So you have to fine-tune them quite a bit, which is where you take a base model and put it back into school: you put it back into training mode and run a bunch more training data through it. And then you have a model that's pretty good at what we needed it to do.
We ended up going with Mask R-CNN, or MRCNN. It produces a bounding box around the objects that it finds, and it also outputs a pixel mask and the object category that it thinks each one is. That was all the information we thought we needed to be able to place the family. So the outputs from the model were a bounding box; a pixel mask, which is more fine-grained location data about where the actual object is; the category of that object; and a confidence score.
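For readers who want to try the same approach, here's a hedged sketch of a Mask R-CNN fine-tuning setup using torchvision. The class count and details are assumptions, not KLH's actual code; it follows the standard torchvision pattern of swapping the prediction heads for your own categories:

```python
# Fine-tuning sketch: load a COCO-pretrained Mask R-CNN and replace the
# box and mask heads so it predicts our categories instead of COCO's.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

def build_model(num_classes=3):  # background + door + window (assumed)
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    # Swap the box head for our class count.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    # Swap the mask head the same way.
    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, num_classes)
    return model
```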
Knowing what the model outputs is important, but then we need to talk about inputs, because these models have to get exactly the same shape of input every single time or they just won't work. Just about every machine learning model is like that. The model that we chose expects an image that is 1,024 pixels wide by 1,024 pixels tall, and I need you to hold onto that thought because it's going to be important later.
Quickly bouncing back to outputs, this is very indicative of what the outputs look like. It's obviously not on blueprint data, but you can see the different bounding boxes, normally in different colors--these are all different "person" objects, each with the person category and a confidence score.
And then, on the right is the actual pixel mask. In the image, it's a pixel-by-pixel categorization--these are the different people in this image--as opposed to the less granular bounding box. So all right, we have a model. What do we do now? Because we needed to take that thing that was a tech demo running in a Python notebook and actually get it to what we needed it to be--a model in production.
So the model development process is something that has kind of evolved over the years. We've already gone over the first few steps: identify the business case; land on machine learning as a solution, which we did; and decide what data we need as input--1,024-by-1,024 images that have been exported out of Revit.
We're going to return bounding boxes that represent where each object is and what it is, because that's the information we need to actually place the element in Revit. So then, we needed to gather the data set. The client we were going to start doing this work for--because this was still before the job was actually awarded--sent us a couple of stores just so we could get a feel for what the data would be like.
We were able to generate a data set of images based on that. And then we had to do a lot of cleaning and pre-processing. That's most of the time spent in this process--it's about 80% getting the data, cleaning it, and labeling it if you need to. And we did need to: once we had that data set, we had to manually label every single door and window.
And then you get into actually building, training, and evaluating the model. My previous talks--the ones that'll be in the handout--go into that more, because it's its own whole process. But it usually doesn't take nearly as long as pre-processing the data.
And then, Step 9 is the really important one--that's what you did all the rest of it for. You have a trained model that you need to save somewhere. That's very important, because the process on the right needs it: you're going to write software, deployed either locally or on some server, that gets the input data, does whatever pre-processing you need, feeds it through the model, and then actually returns the information.
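In its simplest PyTorch form, Step 9 is just a save on the training side and a load on the serving side--assuming the build_model sketch above; the filename is made up:

```python
import torch

# Training side: persist the fine-tuned weights.
torch.save(model.state_dict(), "door_window_maskrcnn.pt")

# Serving side: rebuild the architecture and load the saved weights.
model = build_model(num_classes=3)
model.load_state_dict(torch.load("door_window_maskrcnn.pt"))
model.eval()  # inference mode
```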
Normally at KLH, it's kind of standard that we do the client-server architecture. So someone needs to write the actual web server that's going to receive the requests from the client and return the responses. And then we also need to write the tool--the client code in whatever add-in we're trying to incorporate it into--and get that all up and working. So at a very high level, that's the whole thing end-to-end.
This was driven by a business need, and it was something that we were pretty sure we could do. Total development of this entire process took about three months for me and two software engineer co-ops: one co-op working on the wall algorithm, the other on the object detection, and me bouncing around as needed. So again, it took three people about three months, just to give you some sense of the scale.
So I want to dive into the challenges that we faced. This is a little in the weeds, but hopefully you benefit from seeing the things that can crop up when you do stuff like this. The first thing we really, really struggled with was actually getting the images out of Revit. There's a built-in way to do that in Revit, but it ended up not being very good for our purposes.
It was kind of slow, and the image it exported was really, really big, because we were trying to do an entire level at a time and send it over. For a really small job, like a strip mall, that would probably be fine. But we were really running into an issue.
It's kind of weird--we ended up having to specify a resolution, a conversion rate between a foot in model space and how many pixels that would be in image space. Because the bounding box that the model outputs is in pixels, and on its own that's pretty useless: you can't put pixels into the Revit API to tell it where something should go. So when you export that image, you need to know, pixel by pixel, exactly where each pixel is in model space.
And so we had to write our own custom exporter that took all the lines we needed and made them an SVG. That way--like when you plot out of Revit to a PDF--you can zoom in and out and it won't get blurry.
It's an SVG format--"scalable vector graphics" is literally what that stands for. So you can zoom in as far as you want and the lines stay crisp, which we really needed.
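A stripped-down version of the idea behind a custom exporter--writing line geometry straight into an SVG so nothing gets rasterized. The real exporter pulls its lines from the Revit API; here they're plain coordinate tuples:

```python
# Write ((x1, y1), (x2, y2)) line tuples into a minimal SVG file.
def lines_to_svg(lines, width, height, path):
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" '
             f'width="{width}" height="{height}">']
    for (x1, y1), (x2, y2) in lines:
        parts.append(f'<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" '
                     f'stroke="black" stroke-width="0.1"/>')
    parts.append("</svg>")
    with open(path, "w") as f:
        f.write("\n".join(parts))
```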
Once we converted that image, we still needed the windows to look like windows. Before that, we were getting JPEGs out of Revit, and that compression algorithm was degrading the lines, so we were having a lot of trouble.
And the SVG that our custom exporter produced and that we sent over the wire to the server was really small--less than 50 KB--whereas that exported image was going to be megabytes and megabytes. It would have been a lot slower to send.
If anybody's lost, I completely understand, so I want to do a quick example. It isn't going to be quite to scale, but it should get the idea across. Say this image is too big--we're not going to be able to feed the whole thing into the MRCNN model as one 1,024-by-1,024-pixel image.
Each pixel would represent feet in model space, which would be way, way too inaccurate to place a window where it needs to be based on the architectural background. So we need to chunk out the image, making 1,024-by-1,024-pixel images starting from that first one.
You can see that this image has half of a door at the bottom and less than half of a door on the right. We realized that was going to happen. The training data had clipped doors like that, so we knew we would still be able to correctly identify them--but it was going to make figuring out where they really were difficult.
So we ended up not sliding all the way over to the next window; we slid halfway, so that anything clipped in the first image would be whole in the second image, and so on and so forth. Now, that means that across all these images, when you bring all the bounding boxes back, you're going to have a ton of overlapping bounding boxes, and you need logic to handle that.
So that was fun. You keep sliding halfway to get the next image, and so on. And again, you don't jump all the way down to the second row: you slide down halfway, then halfway across, until you finally hit the bottom row and slide all the way across it.
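Here's a sketch of that half-stride chunking plus the overlap cleanup it forces--a greedy, IoU-based deduplication (essentially non-maximum suppression). Tile size is from the talk; the threshold and edge handling are illustrative:

```python
# Half-stride tiling: anything clipped at one tile edge shows up whole
# in a neighboring tile, at the cost of duplicate detections to clean up.
TILE = 1024
STRIDE = TILE // 2

def tile_origins(img_w, img_h):
    for top in range(0, max(img_h - STRIDE, 1), STRIDE):
        for left in range(0, max(img_w - STRIDE, 1), STRIDE):
            yield left, top   # crop the TILE x TILE window at (left, top)

def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix = max(0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

def dedupe(boxes, scores, thresh=0.5):
    # Greedy NMS: keep the highest-scoring box, drop anything from
    # another tile that mostly overlaps it.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < thresh for j in kept):
            kept.append(i)
    return [boxes[i] for i in kept]
```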
So that's how we're chunking the images. Now, let's just say the top-left corner of whatever this is--store, strip mall, bowling alley, whatever--is at (0, 0, 100) in feet, because Level 1 is at elevation 100. It's architecture. And that is the bounding box for that window.
And then, I tried to color-coordinate this, and I probably could have picked two colors that contrast a little more. But the green is going to be model space, and the blue is going to be pixels. So this bounding box is in pixels, in that image.
So the image isn't even relative to anything--it's just floating out there, in pixels. I know that the bounding box is 180 pixels over and 430 pixels down from the top-left corner. OK, that's great. What do I do now?
Well, you have to know that that image is the fifth image over, because of the way we do the half-grid slide. And you need to know that so many squares over is so many feet, because you set up that exchange rate--that conversion from pixels to feet. That's what lets you say, OK, I know where the origin point (0, 0, 100) is, and I know how many images away I am.
I know the width of those images, and I know the conversion rate between pixels and feet. That's what lets you say, OK, that window is 18 feet away.
Say the insertion point for that window is at the top-left corner of its bounding box. Now I know the exact x, y, z coordinate that Revit needs to actually place that window instance when I'm back in Revit. So hopefully that made the challenge a little more clear. And yeah, it was a huge challenge.
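Putting the coordinate math together, a hedged sketch might look like this. The pixels-per-foot rate and the tile indexing convention are assumptions standing in for the real export settings, and the Y axis is flipped because image rows grow downward while model-space Y grows upward:

```python
# Convert (tile index, pixel offset) back into model-space feet.
PX_PER_FT = 10.0          # the "exchange rate" chosen at export time (assumed)

def pixel_to_model(tile_col, tile_row, px, py, origin, stride=512):
    ox, oy, oz = origin    # e.g. (0.0, 0.0, 100.0) for Level 1 at 100 ft
    abs_x_px = tile_col * stride + px   # pixels from the export origin
    abs_y_px = tile_row * stride + py
    return (ox + abs_x_px / PX_PER_FT,
            oy - abs_y_px / PX_PER_FT,  # image Y grows down, model Y grows up
            oz)

# The window 180 px over and 430 px down in the fifth tile of the top row:
print(pixel_to_model(tile_col=4, tile_row=0, px=180, py=430,
                     origin=(0.0, 0.0, 100.0)))
```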
The second challenge that I want to talk about is where we sourced our training data. For that client, they were able to give us a couple of stores, and we were able to build a large enough training data set to train a model that was pretty good at doing doors and windows. But we quickly realized something. Say that was client A, and they're a national bowling alley chain upgrading their stores. If client B--also a national bowling alley chain, and a major competitor of client A--says, hey, we're also looking to go from DWGs to Revit models, how is client A going to feel about a model that was trained on their data now suddenly helping their competitor?
We talked to a lawyer about it, and we came to the conclusion that, OK, we don't think it would actually be illegal to do that, but it would not be a great way to build a relationship with client A if they found out that that was what we were doing. And that question was going to come up everywhere--we were going to start using this to upgrade every single model that we make, whether it's for a client or not.
So what we ended up doing was going online and finding out: does anyone own the rights to what a window and a door typically look like in a plan? This is what windows and doors look like; everybody uses these symbols. No one really has a copyright on them, I don't think. They're standard symbols, and even though they're in a standard, nobody really owns the symbol itself--what it actually looks like, pixel by pixel.
So we ended up making a synthetic data set that includes these, and that's the model that we're using now. That cleared everything up--we're not using anything that was specific to client A--and I think that's how we're going to go from now on.
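As a sketch of the synthetic-data idea: paste standard door and window symbols, randomly rotated and flipped, onto backgrounds of random lines--the "sea of parallel lines"--and record the ground-truth boxes as you go. The symbol image files are assumed to be pre-drawn standard symbols; this is illustrative, not KLH's generator:

```python
# Generate one synthetic training sample with ground-truth boxes.
import random
from PIL import Image, ImageDraw

def make_sample(symbols, size=1024, n_lines=40):
    img = Image.new("L", (size, size), 255)     # white canvas
    draw = ImageDraw.Draw(img)
    for _ in range(n_lines):                    # noise: random wall-like lines
        x, y = random.randrange(size), random.randrange(size)
        if random.random() < 0.5:
            draw.line((x, y, x, y + random.randrange(50, 300)), fill=0)
        else:
            draw.line((x, y, x + random.randrange(50, 300), y), fill=0)
    labels = []
    for name, symbol in symbols:                # e.g. [("door", img), ("window", img)]
        sym = symbol.rotate(random.choice([0, 90, 180, 270]), expand=True)
        if random.random() < 0.5:
            sym = sym.transpose(Image.Transpose.FLIP_LEFT_RIGHT)
        x = random.randrange(size - sym.width)
        y = random.randrange(size - sym.height)
        img.paste(sym, (x, y))
        labels.append((name, (x, y, x + sym.width, y + sym.height)))
    return img, labels
```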
But it was a kind of an interesting question. And it's something that, if you get into machine learning, you need to think about-- where are you getting your data from? So those were the major challenges.
What do I want you to leave this talk with? First of all, get more familiar with the space. In 2016, my company did a retreat where we asked ourselves some really hard questions, and that started a really big journey of tech and innovation that we are still on and that has been bearing fruit for us.
And I joined the company, like I said, in 2015, so I was really on the ground floor for this innovation journey. It has been a really wild ride, and it's been extremely fun. But one of the questions we asked ourselves was: 10 years from now--so in 2026, because this happened in 2016--is there going to be more 3D or less 3D? Is there going to be more data or less data?
So those are the big picture questions you need to ask at an industry level. And we obviously came to the conclusion, OK, there's going to be more BIM models and less DWG files. And there's going to be more data, not less data.
So with that in mind, 10 years from now, is there going to be more AI or less AI? I certainly think there's going to be more AI. There may also be more regulation of AI, which there probably should be anyway. Learning about this now will make it easier.
So yeah, the second part is, algorithms are better than models where you can use them--they're more predictable. Model outputs are statistical; algorithm outputs are not, and that statistical nature makes things a little bit more difficult.
I talked some about how to determine code architecture and why we made the decisions that we made. And then there's weighing the costs and rewards of creating the AI: three people for three months is money that the company spent on payroll, plus the opportunity cost, because we weren't working on other things. And then, lean into the messiness. We're changing process, talking to other people, and trying to get things going.
So there's a lot going on. It's messy, but it's fun. And if your process just stays stale, you're not going to be able to really get the advancement that you want.
So those are my major takeaways. Just a parting thought: if you're not generating data when you generate your drawings, you need to get started doing that, because there are no public data sets for any of this stuff. A great way to get started on your journey is to start generating data in your process.
So with that, if you want to reach out to me, I would love to hear from you. I'll leave my contact information in the handout. And yeah, thank you so much. Hopefully you got a lot out of this, and I'll see you either somewhere out in the industry or at AU. Thanks.