Description
Key Learnings
- Learn how out-of-the-box Autodesk Construction Cloud features can be used at scale.
- Gain a high-level understanding of how Autodesk Construction Cloud features can be integrated and automated.
- Learn how Autodesk Construction Cloud tasks scale and feed AWS for data-pattern inference.
Speakers
- Anthony Meleo is a Senior Product Manager at Amazon, where he leads the integration of innovative cloud-based solutions to enhance project delivery and portfolio management. With over 12 years of experience in the construction industry, Anthony has a proven track record of leveraging technology to drive efficiency and collaboration across complex project environments. In his current role, Anthony has spearheaded the implementation of Autodesk Construction Cloud and Amazon Web Services (AWS) to streamline data management, improve team coordination, and scale project portfolio oversight. He is highly proficient in Autodesk Platform Services (APS) and AWS's suite of construction-focused tools, and has successfully deployed these technologies to optimize decision-making and project performance across Amazon's diverse project pipeline. Anthony holds a Bachelor of Science in Civil Engineering from Rutgers University.
- Nima Jafari: My name is Nima Jafari, and I'm a Senior BIM/Product Manager for North America at Amazon. I have broad experience in process improvement and innovation, building information modeling (BIM), technical program management, and implementing emerging technologies on multi-billion-dollar projects such as East Side Access, Hudson Yards, and Amazon distribution center and data center infrastructure. I define and execute roadmaps, product implementation strategy, and software alignment and integration across the entire AEC lifecycle based on customer and organizational needs. I hold a post-professional master's degree in architecture with coursework in construction management. I have 18 years of experience in construction, design, and engineering, with broad knowledge of BIM, coordination, project management, and tools like laser scanning and new construction software. I am proficient in Revit, Navisworks, MicroStation, Bentley Navigator, Tekla, AutoCAD, and Primavera P6. I have a passion for mentorship, AI, large language models, data analytics, and data integration.
- Yao Chen
ANTHONY MELEO: All right. Hey, everyone, and welcome to our presentation. Today we're going to be presenting on Autodesk Construction Cloud and Amazon. It's a really good marriage, using APS and AWS to manage a project portfolio at scale at Amazon.
Particularly in the org we work in, TES, Transportation Engineering Services, we build a lot, and since coming to Amazon we have really had to leverage the tools at AWS, but also APS, to do certain tasks.
So our main objective was getting everyone into a common data environment. We chose ACC, initially BIM 360, so many slides will reference BIM 360, and we're in the middle of actually migrating to ACC. But, yeah, these two tools are working really well together, and we're really excited to go into a little bit of detail about how we're using them.
A bit about myself and my background. I graduated from Rutgers University with a degree in civil engineering and started working for general contractors, initially as a field engineer and then project engineer. Really out of necessity, I found myself utilizing BIM and became a BIM manager, and I was really excited to work on some really big projects in New York City and the metropolitan area.
That's where I met Nima. I really want to give Nima kudos. I literally would not be at Amazon if it were not for him. He initially came to Amazon and then brought me over.
And it's been really great-- learned a lot, learned some coding, again, kind of born out of necessity just because of the scale. And, yeah, I'll turn it over to you, Nima, if you want to go ahead and introduce yourself.
NIMA JAFARI: Thank you very much, Anthony. I appreciate your very nice introduction. Yeah, as Anthony mentioned, my background is in architecture. Then I moved a bit toward engineering.
And after that, I worked for about 10 years in construction, acting as a kind of translator between the engineers, architects, and general contractors through the use of building information modeling.
And after joining Amazon, Anthony and I started gathering all of our construction data -- not just working on the BIM part or 3D modeling, but managing, extracting, and analyzing the data we have on our distribution centers and sorting centers.
So a little bit about Anthony -- as he introduced me, now I should do the same. He's one of the pioneers in extracting the data and working with the APIs for BIM 360 and ACC. A lot of the APIs that you or your team may use right now, or the feature requests behind them, come from Anthony and our team -- so we're very excited to go through a little bit of the history of how we did it, where we are right now, and where we are going.
Ant, do you want to go through the agenda and safe harbor?
ANTHONY MELEO: Yeah, I'll steal a joke from you, Nima. If we had to take a shot every time we saw the safe harbor statement, I don't know if we would get through AU. But I'll pause here, and I do want to give credit where credit's due.
I mean, obviously, Nima, myself, we didn't do all of the legwork that you'll see in this presentation. We're very proud of the work that we did do, but I want to give credit to our manager, Dave Burke, our skip-level manager, [INAUDIBLE], our team, and-- yeah, our BI team, Rebecca Stewart, and, yeah, her engineers, yeah, all of us working really well together and getting it done.
All right. Today's agenda-- we're going to go through the history, as I mentioned earlier. It does take a bit of legwork to get into APS and AWS. We're going to talk about the work that we had to put in to get where we are now. We're going to then go into the toolbox, the tools that we are using, both from the Autodesk side, but also from the Amazon AWS side.
And then how did we build it out from there? We started at the foundation. That's what we do as civil engineers and architects. And our foundation was leveraging the out-of-the-box features of ACC or BIM 360.
Next came the frame. I think at Amazon, we would consider that the precast walls, but we'll call it the frame in this presentation. And that's going to be building the automations and integrations. We did run into some obstacles at the scale that we're building, and we'll talk about how we invented and simplified our way through those obstacles.
And then, finally, probably the most fun part, the advanced machine learning and AI solutions that we leverage, of course, using AWS.
NIMA JAFARI: Awesome. So, a little bit about our history and how we got here. I really like the story time in TED Talks and in the movies I watch, and I like that part of our history -- how, for example, we went from working in Excel to now working fully with AI and Amazon Q.
And I think if you can take away one or two small things from it, it can save six or eight months of time for a lot of people going through this same process of modernization, automation, and machine learning.
When I joined Amazon three and a half years ago, I didn't really know what I was getting into, but I wanted to implement building information modeling and digitize all of the information we have on our distribution centers in the transportation engineering team.
So the first thing I did was look at the workflows we have, the tools we are using, how our customers work through these workflows in their daily construction processes, and how they feel about the tools we have.
On the right side, the unhappy emoji you see represents our user feedback: 80% of users were unhappy with the tools they were using. They wanted to see more automation, they felt they were doing a lot of manual data entry across the tools, and data was getting lost.
And when I looked at it, I saw on the left side this crazy diagram of tools that are not integrated with each other -- people entering data with lots of duplication and no integration mechanism behind it. I counted, at that time, around 40 different tools.
And maybe the first thought for me was how we could integrate these tools so they talk with each other. But if you look at the graph on the right side, integrating all 40 tools directly with each other -- every tool talking to every other tool -- pushed the number of integrations to around 700. That is definitely not something we wanted to do.
So what did we do? We started working backward, looking at the different workflows we have and how we can improve them with technology or with a better workflow.
On the screen, you see the RFI workflow and what we had at that time as our current state -- now our past state. Our GCs were sending their RFIs to our architect either through email or through a third-party tool like Procore. Then the architect was sending it by email or third-party tool to the RCM.
And at the end of this process, after the project is done, you will see that this RFI data is lost. You cannot get the lessons learned from it, so in the future you cannot use it for machine learning.
And how did we try to resolve it? By implementing it in a construction management tool like BIM 360. And you can see how this process improved with the use of the new tool.
That being said, we understood the first part is limiting the tools we use and choosing a tool that can manage most of these workflows. So we looked at a lot of the construction management tools out there and graded them -- I won't go through the whole list.
But long story short, we chose BIM 360/ACC as our main construction management tool, and we are very happy that we chose it. On the left side, you see how many different tools we were using -- for data management, cost management, design review -- because different teams were using different tools. And we reduced that, on the right side, to ACC and a couple of small tools we still needed to keep the workflow smooth.
So that first part was our tool reduction. And then we noticed that on top of ACC, we have some internal tools we still have to use, or some small workflows that BIM 360 was not supporting at that time.
So to have all of the data in one place and a real common data environment, we pour all of the data from these different tools into our data lake, which we created on Amazon Web Services. And now that we had all of the data in one place, we could build a single UI, or user interface, for our portfolio management and reporting.
At that time, we noticed that in BIM 360 it is really hard to get reporting for all of the projects combined, and it's much easier to get that reporting from our data lake.
That being said, we now had all of the information in one place. We had planned, from three years ago, to utilize machine learning and artificial intelligence, and to do that, we went through four steps that most of this presentation is going to cover.
The first one was our customers' needs and use cases. We really need to know what our end users need and how to support that. The second was training and user guides. We are implementing all of this new technology and these new processes and process improvements -- how can we train our users, and how can new hires who join us later learn them and use them?
And number three -- how to optimize our data through a common data environment, which we just walked through, and, in the end, establishing the data lake so that we have all of the information in one place and can use machine learning on that data.
Here is a little bit of history of how we managed these four pillars from 2022 through today, and what we're going to do in 2025.
And it is going to be very useful for the companies and users with a bigger scope who always ask me, how did you do that, where did you start? As I showed you a couple of slides ago, we started by talking with our users -- what do they need, what are their workflows, and how can we improve them -- and then choosing the technology.
So in 2022, we did a proof of concept on BIM 360 and built all of the workflows that our users were using at Amazon in BIM 360. We tested it with the users -- we call it the pilot phase -- and then took the feedback from that pilot phase and fixed our issues.
We also worked on the integration between the other business-critical software we have at Amazon and BIM 360. And we started gathering all of our historical data.
And then, in 2023, we migrated about 5,000 to 6,000 of our projects, whose data was all over the place, into BIM 360. We trained over 400 internal stakeholders, created training for our external partners, and we stopped using all of those other tools and started using BIM 360.
That being said, Anthony is going to get into the big part of our work in 2023: noticing that, out of the box, BIM 360 cannot support this magnitude of projects -- 7,000 users getting added and projects being mass-created. We're going to go through that and show how we use out-of-the-box Autodesk Construction Cloud tools and add our customized tools to manage the scope and portfolio of projects.
And in 2024, we started aggregating and analyzing all of the data we had brought into BIM 360 -- creating dashboards and reporting, creating sophisticated tools for project creation and user addition, and automating that process with the security that we have at Amazon.
We also started to investigate and implement machine learning using the data and the APIs available to us, for lessons learned from the data we have in BIM 360.
And where are we going from here? In 2025, we want to focus more on establishing use cases and extracting more data from our 5D and 6D models. We want to develop mechanisms to track and enforce 3D model standards, in some cases even using AI.
And we want to use AI for risk management and analysis, not just a chatbot that answers our questions about our data in BIM 360.
I know I spent a little bit of time on this slide, but I think it is very important to understand the step-by-step process that we went through. It cannot happen overnight.
So what are our tools? All of the tools we started using are on the cloud -- cloud is king -- because we don't want an Excel file with a lot of our cost data sitting on a user's desktop where it could get lost.
We are using Autodesk Construction Cloud, Autodesk Platform Services (formerly Forge), Python, and AWS services like Redshift, Lambda, Amazon Q, S3, and QuickSight. So these are all of the tools we are using, and that is our toolbox.
And we use different tools for different reasons. For example, for our common data environment, we are heavily using BIM 360 right now and, in the near future, ACC Build. For our data lake, most of our information is in S3 buckets.
The foundation -- for implementing AI or any kind of technology and automation, we have three main pillars. And neglecting one pillar is going to leave you with a weak foundation. Let's put it this way: the roots of the tree determine its height.
And I've noticed that in a lot of places there are people who work really well with the technology but totally ignore their users. They don't talk with their users daily and see what their users need. So what are the three pillars? People, process, and technology.
So what is people? At the end of the day, all of these tools are going to be used by the users. You are creating these tools for your customers, so you should listen to their feedback, make the implementation changes, and see what they need and what they don't need.
And it is totally OK if that goes against what we studied in school or what we study as data scientists or BIM managers -- implement what they need, not just what we think is better, and explain to them why we think a particular process or tool is better. Also, train those people and develop a network of champions who can train other people on their teams.
About the process -- we talked about it. We're going to have a lot of workflows and a lot of steps in our process. And we should work with our users to see how we can make each process a lean process, so it takes less time, maintains data fidelity, and avoids data loss.
And that can come, again, through the other pillars -- through the people, and through the technology and tools we are using.
And technology -- I think for this part you're all familiar with how important technology is in helping us with all of these processes for our foundation. So for the people part, what did we do? I went through that really fast.
We developed a network of champions, we trained them, and they could help their teams because, for example, right now we may have 8,000 users on our projects. We cannot train all of them. So the best way is to train the trainers. Those are the people who are going to answer the questions inside their own teams. Or if they have a big request, they can come to us, and we can help them move it forward.
At the start, we also did university-style classroom training. We went to different teams across North America and trained them in three-day classes.
It was very fun -- getting to know people, getting to know our users, and training them. We worked through the workflows together in the tools. We showed them, OK, this is how you were doing it in the past, and this is how we can do it in the future, and showed them the advantages of the new workflows.
We also made step-by-step training, like a user guide, as you see: click here, then click here. So even people who were not in that environment with us can use it afterward, or if they forget something, they can go back to that training.
For example, I think the snapshot we have here is for clash detection, and then we have a training for RFIs. So don't assume that your users know everything, or that if you train them once, they will remember everything. We learned that you should have training for each small piece.
And for the technology, we said, OK, we have this workflow. For example, we need a checklist for construction safety. That is the workflow. The tool we want to use is BIM 360.
Let's look at how we can implement all of these workflows in the different tools we have and make them ready for our users. Again, as an example, I mentioned checklists: we made the template for the construction safety checklist ready, as well as contract agreement requests and punch lists.
And we sent it to all of our projects. So having the base of the project, with the base tools and technology ready for our users, is a way to set them up for success -- same thing with RFIs, meeting minutes, issues, and document management.
And how did we build all of these automations? We had problems with BIM 360 around auto-generating issues and safety checklists. We had problems adding users. We had problems with mass-creating projects and users.
How we automated all of those processes and the integrations with other tools is the next part we want to talk about. And Anthony was one of the superheroes on that. So I'm going to give the floor to Anthony to talk about: OK, that is our foundation -- what is going on with our frame?
ANTHONY MELEO: Thanks, Nima. And I think the analogy of a foundation, the frame, and then the roof is really convenient, obviously, coming from the construction industry. But it also speaks to the importance of the foundation.
If you have a weak foundation, the rest of the tools you want to build, eventually AI, is going to be very difficult. And we'll talk about the cleaning of the data, things along those lines. But investing time into speaking with the end users, finding out their pain points, and resolving them together, I think, essentially, building trust goes a long way.
So I would almost say it's very important to-- I wouldn't almost say, I would say-- it's very important to focus on that foundation and really invest the time and resources because it will help you tenfold in the next steps that we'll talk about.
So let's talk about some of the problems that we were faced with and then how we were able to leverage APS, Autodesk Platform Services' APIs and their endpoints to resolve them. Pretty much right away, what we found is that we were manually creating too many projects. I mean, there are times, depending on the season, that we are generating dozens of projects in a week.
And having a human resource, human capital, actually spending time on that data entry was not really necessary. Nima talked about it a little bit in his portion of the presentation. We do have other business-critical software that we need to interface with, and lots of that information or data is already stored in them.
So, for example, project address-- we shouldn't be copying the project address from one software to another software. We can go in, we can grab it, and we can use it to develop projects when we need them.
So, through the solution that we developed, right now our end users just provide us the ID from that business-critical software -- that's where we establish the integration -- and their email, so we know who to add to the project. We also have a running list of users that get added to every project. So we know all of that information, and once we know it, we can apply it.
So I won't go through every single endpoint that we leveraged; here are the main ones. We even used the PATCH image endpoint so that when projects are created, they also have an image corresponding to their site code.
Here are some important fields. This is directly from the JSON request, I believe. We used the job number to build that integration with the other critical software. So where is the project first released? For us, it's our real estate software.
So once it's released and it qualifies for an ACC or BIM 360 project, we can use the same ID that's established in the real estate software and apply it to all other software downstream, including ACC/BIM 360.
And then, for us, it's not as simple as we're building the same type of building or the same retrofit every single time. So another important field that we're leveraging is businessUnitId. This way we can establish what type of building this particular project is. Or maybe we're doing an EV project, and we could establish that through leveraging the business unit ID.
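As a rough illustration of what that looks like in practice, here is a minimal sketch of creating a project through the ACC Admin API with those fields populated from the upstream system. The endpoint path, the project type value, and the naming convention shown are assumptions to verify against the current APS documentation; only jobNumber and businessUnitId come from the slides, and our production tooling is more involved.

```python
import requests

APS_BASE = "https://developer.api.autodesk.com"

def create_project(token, account_id, site_code, job_number, business_unit_id):
    """Sketch: create an ACC project from data pulled out of the upstream
    real estate system. Field values and the name pattern are illustrative."""
    url = f"{APS_BASE}/construction/admin/v1/accounts/{account_id}/projects"
    payload = {
        "name": f"{site_code} - Retrofit",        # naming convention is a placeholder
        "jobNumber": job_number,                  # ties the project back to the real estate software
        "businessUnitId": business_unit_id,       # encodes the building/program type (e.g., EV)
        "type": "Warehouse (non-manufacturing)",  # must match an allowed ACC project type
    }
    resp = requests.post(
        url,
        json=payload,
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("id")  # id of the newly created project
```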
So, yeah, pretty much right after realizing we needed an automated way to create projects, we also needed a way for people to self-serve access to those projects. So we know, particularly internally at Amazon, who needs to be added to the projects. But we don't always know our external partners or vendors that need to be added.
And also, our team shouldn't have the responsibility, from a security perspective, of understanding who should be added to each project. So what we've done is partner with one of our, I guess, sibling organizations, BIM Central, who developed this solution, which we call WW-BIM.
And, again, that allows both internal and external users to self-serve access to projects in a very secure way and interfaces with some of our third-party security to make sure that those companies do have an NDA signed. It also initiates a workflow to have an Amazon point of contact approve that request.
What's really nice about this tool is that it's not limited to one project. We find that, lots of times, both internal and external users might need access to several projects at the same time. So rather than filling out a form every single time they need project access, they can request many at once -- I don't even know if there is a limitation.
I know at one time it was 500 projects. I'm not sure anyone actually clicked 500 projects to add themselves to. But nonetheless, the ability is there. Again, this is all using APS API endpoints and AWS in the background.
And then you can see, in the bottom row, where an external vendor would actually add an Amazonian to approve their request. That's done through email.
A little bit more sophisticated here-- so there are more API endpoints that were used. I'm not going to go through each one. And also, I think I would need a magnifying glass to do so. But, again, these slides are going to be posted, and you're free to review.
OK, so here we're going to get into how we're interfacing with our data that's stored initially in ACC. What Autodesk does a really good job of is allowing you to go in and extract your data -- you're basically doing data entry through the lifespan of a project. And the first place, I think, where the most data is stored is the data connector tables.
So we needed to set up an automation to where we can go in, we can go get that data at will, now multiple times a day, we could establish relationships between those tables using Amazon Redshift, and then we could draw business-critical insights using our visualization dashboarding tool QuickSight.
Here's just a quick mapping of how we are extracting the data connector tables. We do it on a schedule. This is showing, I think, four different times. I believe now we're up to eight different times through the workday. Of course, we're an international company, so we need to make sure that, at least coast to coast, the data is refreshed as much as possible.
And, of course, that goes from the data connector tables. We use AWS Lambda functions to drive that data into an S3 bucket. And then later in the presentation we'll talk about what we do with it from there.
I guess I'll take this opportunity to note in the top right corner of many slides, it'll actually show the tools that are applicable to what's being discussed in the slide itself. So you can see in the top right here we're using APS, of course, getting the data connector tables via API, Python with Lambda, SQL in Redshift, and then also S3.
OK, so I'm going to try to go through these slides fairly quickly. They will be posted so you can review the code line by line if you would really like to. I don't want to put everyone to sleep. Nima had to remind me that in the audience there are going to be some coding people, very tech-savvy, and then people who maybe are more people-facing.
So I'll try to run through this. I'm just going to talk about what the code is achieving in each slide. So the first thing that you'll need to do is authenticate. To extract data connector tables, it requires three-legged authentication.
So first, what we're doing here is defining the scope. There are a few other variables like authentication_url, client_id, redirect_url, and others. What you do is construct a URL that you click, and then probably most of you have seen the window on the right-hand side. You need to allow the app to integrate with your data.
Once you click that link, you're going to generate an authorization code, which is the variable you need in the next step. Once you define that variable, you're going to then generate the access tokens. And three-legged authentication will generate two tokens, both an access token and also a refresh token.
And a helpful hint on that end is that the refresh token is valid for 15 days. So as long as you use the refresh token to generate a new access token within those 15 days, your access remains active. What that translates to is that you can skip the previous steps by just using the refresh token within 15 days, which is what we do.
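For reference, here is a minimal sketch of that three-legged flow against the APS OAuth v2 endpoints: build the authorization URL, exchange the one-time code for tokens, and refresh later. The client credentials, redirect URI, and scopes are placeholders; verify the exact parameters against the APS authentication documentation.

```python
import requests
from urllib.parse import urlencode

AUTH_BASE = "https://developer.api.autodesk.com/authentication/v2"
CLIENT_ID = "your-aps-client-id"          # placeholder
CLIENT_SECRET = "your-aps-client-secret"  # placeholder
REDIRECT_URI = "http://localhost:8080/callback"
SCOPE = "data:read account:read"          # scopes are an assumption; match your app's needs

def authorization_url():
    """Step 1: build the URL the user clicks to grant the app access."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": SCOPE,
    }
    return f"{AUTH_BASE}/authorize?{urlencode(params)}"

def exchange_code(authorization_code):
    """Step 2: trade the one-time authorization code for an access + refresh token."""
    resp = requests.post(
        f"{AUTH_BASE}/token",
        auth=(CLIENT_ID, CLIENT_SECRET),
        data={
            "grant_type": "authorization_code",
            "code": authorization_code,
            "redirect_uri": REDIRECT_URI,
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()  # contains access_token and refresh_token

def refresh(refresh_token):
    """Later runs: use the refresh token (valid ~15 days) to skip the interactive steps."""
    resp = requests.post(
        f"{AUTH_BASE}/token",
        auth=(CLIENT_ID, CLIENT_SECRET),
        data={"grant_type": "refresh_token", "refresh_token": refresh_token},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```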
So once you have access to the data -- you just generated the key, so you're in -- you can access every single data connector table, and there are many, probably close to 100, maybe more. And I wouldn't advise downloading them all at once. I mean, if you're working for a small or medium firm and you're generating a small or medium amount of data, maybe it's advantageous to just download everything.
If that's the case, I probably would just do that directly through the UI. But in our case, downloading all those tables would be gigabytes worth of data eight times a day. It's not scalable. That's something AWS does really well. And that's why we're using it. So what this solution does, it actually allows you to pick the specific tables you want to download for whatever it is you're looking to use them for.
OK, here, I don't know if there's much to go into. You could have multiple data connector jobs created by different organizations, so we need to define which job we want to extract from, and that's what this is doing here.
I'm not going to go into detail about that slide. And then this is where it's actually showing all of the data connector tables. It's listing them for you. It gives the parent table and then how many subtables below.
So you can see the first row in activities, there's actually 15 separate data connector tables related to activities occurring on projects, checklists. You can see there's actually 16 data connector tables related to checklists. So, again, by downloading all of them, you would be trying to boil the ocean, which is something we want to avoid if possible.
In this slide, you actually have the ability, through a widget, to select the individual tables. I don't think it's pictured here, but you can see the tables that I had selected in my notebook, which shows activities, admin, assets, checklists, and it looks like there was at least one other one. I selected to download the tables associated with those parent tables, as I mentioned.
It's showing-- let me jump back. It's showing the actual tables that I'm downloading here in the next step. And then finally, here, it's aggregating the tables and the size of those tables. This way, you could see, what am I about to download-- is it gigabytes of data or is it megabytes?-- and then decide if you want to go ahead and download.
I think here is where we're kind of putting all the pieces together and downloading. And the last cell, you can see it actually has a download bar so you could track the progress. But what you receive after this is a zip file containing all of the tables that you selected.
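To make that flow concrete, here is a rough sketch of pulling only the selected parent tables out of the most recent Data Connector job. The endpoint paths and response field names (results, status, signedUrl) are assumptions based on the Data Connector API and should be checked against the current documentation before use.

```python
import requests

DC_BASE = "https://developer.api.autodesk.com/data-connector/v1"

def latest_job_id(token, account_id):
    """Find the most recent completed Data Connector extract job for the account."""
    resp = requests.get(
        f"{DC_BASE}/accounts/{account_id}/jobs",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    jobs = resp.json()
    if isinstance(jobs, dict):                 # response shape is assumed; may be wrapped
        jobs = jobs.get("results", [])
    done = [j for j in jobs if j.get("status") == "complete"]
    return done[0]["id"] if done else None

def download_selected_tables(token, account_id, job_id, wanted_prefixes, out_dir="."):
    """Download only the CSVs whose names start with the selected parent tables
    (e.g. 'activities', 'checklists') instead of the whole multi-gigabyte extract."""
    headers = {"Authorization": f"Bearer {token}"}
    listing = requests.get(
        f"{DC_BASE}/accounts/{account_id}/jobs/{job_id}/data-listing",
        headers=headers, timeout=30,
    )
    listing.raise_for_status()
    items = listing.json()
    if isinstance(items, dict):
        items = items.get("results", [])
    saved = []
    for item in items:
        name = item["name"]
        if not any(name.startswith(p) for p in wanted_prefixes):
            continue
        # The data endpoint is assumed to return a short-lived signed URL for the file.
        signed = requests.get(
            f"{DC_BASE}/accounts/{account_id}/jobs/{job_id}/data/{name}",
            headers=headers, timeout=30,
        )
        signed.raise_for_status()
        url = signed.json()["signedUrl"]
        path = f"{out_dir}/{name}"
        with open(path, "wb") as f:
            f.write(requests.get(url, timeout=120).content)
        saved.append(path)
    return saved
```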
OK, so you have the tables. What do you want to do with them? We go directly to an S3 bucket -- keeping everything in AWS just makes your life that much easier. So in this slide, we're showing how to authenticate into AWS S3 and then send those tables there.
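A minimal version of that upload step with boto3 might look like the following; the bucket name and key prefix are placeholders, and credentials are assumed to come from the environment or an IAM role.

```python
import boto3

def upload_extract_to_s3(local_paths, bucket, prefix="acc-data-connector/"):
    """Push the downloaded Data Connector CSVs into the data lake bucket."""
    s3 = boto3.client("s3")  # credentials resolved from environment / IAM role
    for path in local_paths:
        key = prefix + path.split("/")[-1]
        s3.upload_file(path, bucket, key)
        print(f"uploaded {path} -> s3://{bucket}/{key}")
```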
Here's a flow chart of what we just discussed. The data connector tables land in an S3 bucket, a Lambda function sends those tables into Redshift, where we establish relationships using SQL, and then, from Redshift, we publish to QuickSight tables.
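As one hedged example of that Lambda-to-Redshift step, a COPY issued through the Redshift Data API could look like this. The cluster, database, user, and IAM role identifiers are placeholders, and the target table is assumed to already exist with a schema matching the CSV.

```python
import boto3

def load_table_into_redshift(table, bucket, key, iam_role_arn,
                             cluster_id="acc-analytics", database="dev", db_user="awsuser"):
    """Illustrative COPY of one Data Connector CSV from S3 into Redshift
    via the Redshift Data API."""
    sql = f"""
        COPY {table}
        FROM 's3://{bucket}/{key}'
        IAM_ROLE '{iam_role_arn}'
        FORMAT AS CSV
        IGNOREHEADER 1;
    """
    client = boto3.client("redshift-data")
    resp = client.execute_statement(
        ClusterIdentifier=cluster_id,
        Database=database,
        DbUser=db_user,
        Sql=sql,
    )
    return resp["Id"]  # statement id; poll describe_statement() for completion
```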
Here's an example. I think the next few slides actually will go through examples in QuickSight of real tables that we have and visuals that we have and that we use daily to draw business insights. This is actually our team dashboard. Here we can track activity. As I mentioned earlier, one of those data connector tables, lots of them are related to activities on projects.
On this slide, we take a little bit more granular look into RFIs. We could see how quickly RFIs are being resolved on our projects. We could even break that down by vendor.
So if you're an Amazon vendor and you're doing RFIs, we're tracking that. Do it faster. So, yeah, and I think a little bit of this goes into checklists. Nima, do you want to jump in on any of these slides?
NIMA JAFARI: No, actually, I wanted to go exactly there. When you have all of the data in the one common data environment I was explaining, extracting this data is not really hard. The hard part is getting everybody to use the same tool and putting all of the information in one place.
As an example, what Anthony is showing on the screen -- the portfolio-level view of where the files are and which files we have on each project -- was really hard to get with the out-of-the-box tool.
So with the APIs available for BIM 360 and a QuickSight dashboard -- which effectively becomes our reporting tool, built on AWS QuickSight -- we can see where all of the files are across our different projects and what their names are. And we use that for our document collection.
And we could just click on it. If you are searching for the as-built of project 20, you click on it, and it shows you all of the files.
And it is very easy to see which documents our general contractor or architect has given us and uploaded to the folders, and which ones they have not, as you see with the green and red. So, Anthony, back to you.
ANTHONY MELEO: I'll try to wrap up quickly here because now we're getting into some of the fun stuff. The roof-- why did we do everything that we talked about so far? We want to use machine learning and AI to better understand our inefficiencies on one hand, but also what are we doing well.
And what we've chosen to use is Amazon Q, really born out of necessity. We don't have a data scientist team at TES. Some of the more sophisticated AWS tools offer more features you could use, but Amazon Q is a very simplified approach to getting very sophisticated insights.
This was kind of an incremental step that we took toward that solution. First, we were doing a little bit of manual work -- no real heavy lifting -- but we were going straight to our QuickSight CSV tables. That's where our teams were already generating very critical insights, the way they wanted to see them.
So what the team felt was important to them, it was in their QuickSight dashboard. That's a great place to start to train a machine learning or AI model. We just downloaded them as CSV files. We had the data right there.
We saved it initially on our local hard drive, and then we used a Jupyter Notebook to clean up that data. That's always going to be an important step when you're leveraging AI or machine learning. But once we cleaned up the data and we added a bit of context to it, we sent it again right back to S3. Amazon Q could be trained directly from an S3 bucket-- so no better way to store it.
So the code in the following slides is going to focus on where I have the dashed line here. I think it'll offer some insight into what that process looks like. We're going to go through downloading QuickSight tables as CSV, importing and converting those CSVs into dataframes -- a Python term -- and then converting the cleaned dataframes into JSON files.
Why JSON files? We found that Amazon Q trains better on JSON format than on CSV. Again, I'm not going to go line by line here -- it's showing you how to import CSV files from your hard drive, convert them into dataframes, rename them into more legible dataframes, load them into the notebook, rename the columns, and, as I mentioned earlier, clean the data.
So what do I mean by cleaning the data? You might have used-- as an example, you might have used one issue subtype and then another maybe a year later. But essentially, those two issue subtypes mean the same thing. So you want to map them together, you want to aggregate them into one place, and maybe you want to explain to Amazon Q what that issue subtype means.
Yeah, I won't go into a lot of detail there. Feel free to ask questions or meet up with us later. But add a file description to the JSON that will help Amazon Q understand what it's reading or what it's being trained on. Describe the column context.
So we went one step further: we're actually describing each column and what it depicts. And then, finally, we convert the dataframe into a JSON file.
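Put together, that cleanup step might look something like the sketch below, assuming a QuickSight issues export. The column names, subtype mapping, and descriptions are illustrative, not our actual schema.

```python
import json
import pandas as pd

# Map legacy issue subtypes onto the current vocabulary so Amazon Q sees one label.
SUBTYPE_MAP = {"Safety - PPE": "PPE", "Missing PPE": "PPE"}  # illustrative values

def issues_csv_to_q_json(csv_path, json_path):
    """Clean a QuickSight issues export and wrap it with file- and column-level context."""
    df = pd.read_csv(csv_path)
    df = df.rename(columns={"issue_subtype": "subtype", "assigned_to_name": "assignee"})
    df["subtype"] = df["subtype"].replace(SUBTYPE_MAP)   # aggregate equivalent subtypes
    df = df.dropna(subset=["subtype"])

    payload = {
        "file_description": "Construction issues exported from BIM 360, one record per issue.",
        "columns": {
            "subtype": "Normalized issue subtype (e.g. PPE, excavation, rigging).",
            "assignee": "Person responsible for resolving the issue.",
        },
        "records": df.to_dict(orient="records"),
    }
    with open(json_path, "w") as f:
        json.dump(payload, f, indent=2, default=str)
```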
And then, as we did earlier, you want to bring that information directly into the S3 bucket so that Amazon Q can be trained on it. So this is where we're headed or maybe even where we are at now, maybe somewhere in between. But a lot of that can be automated and should be automated.
Previously, as I mentioned, we were downloading the files to our hard drive, cleaning them in a notebook. But there's no better way to do that than use AWS Lambda functions, which could be run on a schedule, or they have other triggers that could impact how the Lambda is grabbing the data.
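A scheduled Lambda replacing that manual loop could be as simple as the following sketch, which reads CSV exports from one S3 prefix, cleans them with pandas, and writes JSON to the prefix an Amazon Q data source points at. The bucket and prefix names are placeholders, and the real cleanup logic would be richer than the one-liner shown.

```python
import json
import boto3
import pandas as pd

s3 = boto3.client("s3")
BUCKET = "tes-acc-data-lake"          # placeholder bucket name
RAW_PREFIX = "quicksight-exports/"    # where the CSV exports land
Q_PREFIX = "amazon-q/training/"       # prefix the Amazon Q data source reads from

def lambda_handler(event, context):
    """Runs on a schedule (e.g. an EventBridge rule): picks up new CSV exports,
    cleans them with pandas, and writes JSON where Amazon Q is trained from."""
    for obj in s3.list_objects_v2(Bucket=BUCKET, Prefix=RAW_PREFIX).get("Contents", []):
        key = obj["Key"]
        if not key.endswith(".csv"):
            continue
        body = s3.get_object(Bucket=BUCKET, Key=key)["Body"]
        df = pd.read_csv(body).dropna(how="all")   # stand-in for the full cleanup step
        out_key = Q_PREFIX + key.split("/")[-1].replace(".csv", ".json")
        s3.put_object(
            Bucket=BUCKET,
            Key=out_key,
            Body=json.dumps(df.to_dict(orient="records"), default=str),
            ContentType="application/json",
        )
    return {"status": "ok"}
```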
I think a really important takeaway from this slide that I hadn't mentioned in the previous one is that we have two streams that are training our Amazon Q. One is the structured data, which is what I just talked about, related to the code, the JSON files, the dataframe, data connector tables. That's all considered structured data.
How structured? We're making structured data more structured by creating those JSON files. But you will also have a lot of unstructured data in the form of PDF, Excel. For us, our Standard Operating Procedures, SOPs, are in PDF format, design criteria.
So there's lots of very critical information that is important to train your Amazon Q model on. And there's another API for that, particularly the Data Management API from Autodesk Platform Services. So for us, we chose this two-stream approach to make sure that our Amazon Q is being trained not only in the structured data directly from our projects, but also high-level PDFs that are very impactful as well.
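For the unstructured stream, a hedged sketch of listing the PDFs in a project folder via the Data Management API might look like this; actually downloading each document (via its tip version's storage location) and pushing it to S3 is omitted for brevity.

```python
import requests

DM_BASE = "https://developer.api.autodesk.com/data/v1"

def list_pdfs_in_folder(token, project_id, folder_id):
    """List PDF documents (SOPs, design criteria, specs) in one ACC/BIM 360 folder.
    Downloading each file would use the storage link on its tip version."""
    resp = requests.get(
        f"{DM_BASE}/projects/{project_id}/folders/{folder_id}/contents",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    pdfs = []
    for item in resp.json().get("data", []):
        name = item.get("attributes", {}).get("displayName", "")
        if item.get("type") == "items" and name.lower().endswith(".pdf"):
            pdfs.append({"id": item["id"], "name": name})
    return pdfs
```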
NIMA JAFARI: Yeah.
ANTHONY MELEO: I'll give it over to you, Nima.
NIMA JAFARI: Yeah, just building on what Anthony said, from a more nontechnical standpoint, I want to show a little bit of the fun of actually using it.
So just imagine you have a design standard. It comes as a PDF. Or you have your safety regulations, like your OSHA regulations, or your spec.
All of them are PDFs, right? And you can save them in BIM 360. And with everything Anthony showed in the slides, it goes to the S3 bucket and then to Amazon Q. At the same time, if you have RFIs, punch lists, checklists, and all of that kind of metadata and structured data, it can go to Amazon Q as well, and Q can analyze both of them together and even combine those two sets of data.
So it is really easy once you've made your data lake. It is as easy as uploading the files into BIM 360 -- PDF drawings, whatever it is -- and then having your users use checklists, punch lists, and RFIs. And in the backbone, after setting it up, Amazon Q is going to do the rest.
So let's look at an actual example and do one test together. Here, this is our interface, and you can modify that interface.
The use case is that one of our safety managers wanted to see the issues our inspectors logged on their iPads during walkthroughs on project 1 -- what issues were they having? And we made up a couple of project names, like project 1 and project 2, to keep our data confidential at Amazon.
So as you see, it asks, what are our safety concerns in project 1? And it answers from the issues that were generated in BIM 360 -- for example, the excavation. Here we had an issue where people were working without PPE, or rigging equipment was not being properly inspected, stored, and taken out of service.
Then you can ask the follow-up question: how many safety issues do we have on this project? For example, on this project, it says we have 26 safety issues. Again, that's coming from the issues in BIM 360.
And who are these issues assigned to? Who is responsible for taking care of them? In this example, it's saying John is responsible. Where is it getting that from? From the issue assignments in BIM 360.
And now that we know who they're assigned to: who is the person with the most issues? This way, we can see in our project who the general contractor is, or who the person responsible for our safety issues is, and we can plan ahead to rectify those issues.
And now we have a lot of issues. We're saying, OK, can you summarize the issues that are assigned to John? We want to figure out what the issues are and what the next step is. And Amazon Q is helping us with all of that.
So the raw data was basically a couple of issues that we asked our inspector or safety manager to input into BIM 360, which then flows to Amazon Q. And the rest, as you are seeing, is Amazon Q itself telling us, OK, how we can rectify the issues this person is responsible for, and what the sources of these issues are, like lack of oversight or improper equipment.
And we ask it, OK, now we understand who is responsible. What are the pain points and the results? And can you write us an email that we want to send to him?
It even does that for you, too -- writes the email with all of the issues, exactly what they are, categorized and summarized, and your email to that person is ready. As you see at the bottom -- and this is a fun part of Amazon Q -- it looks at our pattern of questions and asks us, hey, for these types of questions, do you want me to create an app?
What is an app in Amazon Q? Basically, instead of people chatting and asking all of these questions, it will generate an application for you. So you see it: it says, I'm going to build an app that takes the project identifier, the name of the project, and it will tell you what kind of safety issues it has, how many safety issues it has -- exactly the questions we were asking -- who is responsible for those safety issues, and how we can rectify them.
So the only thing our user needs to do is go in and enter Project 22, or whatever the name of that project is. As an example, we deleted the actual project names, the actual issues, and the actual people's names. And then Amazon Q does the rest.
You saw that even building the application took about five seconds. And you can see how to make the application yourself.
So we put in Project 22. As you see, it generates the total issue count and the list of safety issues on our project -- again, from BIM 360. And you can have other sources for that, too. If you're using another program for another checklist, again, all you need to do is bring it into your data lake and, from your data lake, into the S3 bucket and Amazon Q.
And as you see, it comes back with the top assignee and the issue details. Who is it assigned to? What was our issue list? What is the suggested remediation action for fixing each one on the project?
And you can expand it for RFIs. What are the RFIs on this project? Who is responsible for them? Which ones are overdue?
Same with submittals, meeting minutes, and change orders -- it's as easy as that, even creating apps on top of it. So, yeah, I hope that gives you a quick demo overview of Amazon Q. We can show you more examples in person.
And, yeah, I find it very fascinating, and we are working on the next steps: how we can reduce risk on our projects -- our safety risk, our schedule risk, our change order risk -- with all of this data that we have. Ant, any last words?
ANTHONY MELEO: No, I think I'll just double down on earlier sentiments about spending a lot of time earning trust with the end users, understanding their pain points. Get your data into a data lake on AWS. And from there, the flashy and very cool tools will come.
We chose Amazon Q. We have had a lot of success with that. There are others. There's CEDRIC. They're built on Bedrock. The list goes on and on. And the innovations will continue to come. And, yeah, we're just excited about the future and excited to meet with you all.
NIMA JAFARI: Thank you very much, Anthony. Thank you very much, everybody. Have a good day.