Description
Key Learnings
- Learn about quality management challenges and the growing importance of data in the AEC industry.
- Discover the risks of a reactive data-validation approach, and its negative impact on project delivery and your organization.
- Learn about defining methods of proactive and automated data validation using Autodesk products, business intelligence, and custom tools.
- Learn about validating data in real time using the Revit API, Autodesk Construction Cloud, and event-driven programming to ensure consistency.
Speaker
- Mateusz Lukasiewicz: Mateusz Lukasiewicz has over 12 years of experience in the AEC industry. Throughout his career, he has successfully led digital delivery of large-scale projects and developed a number of modern digital engineering solutions by combining BIM expertise, computer programming skills, and project management principles. Mateusz plays a vital role in driving the company's clear vision towards achieving a leading digital innovator position in the market and its long-term digital capability goals.
MATEUSZ LUKASIEWICZ: Hi, everyone, and welcome to my presentation. Today we will talk about data validation methods: reactive, proactive, automated, and real time. My name is Mateusz Lukasiewicz. I'm a digital projects manager at KEO International Consultants, based in Dubai. In my role, I focus on BIM, computer programming and computational design, project and construction management, and digital twins.
This class contains two parts. The first, theoretical part will emphasize the importance of data and the meaning and types of data validation. We'll talk about quality and how to define data requirements. In the second part, we will explore different methods of data validation and show multiple practical examples, starting from manual visual methods, through semi-automated ones, to advanced methods using custom tools and event-driven programming.
Before we move to data validation, let's answer a question: what is data? It is information, facts, or statistics collected for analysis, decision making, or communication. Data can be quantitative, meaning it is countable and gathered by measuring. Another type of data is qualitative, which is descriptive, subjective, and gathered through observations and surveys rather than by measurements.
Knowing that this definition of data covers pretty much everything we can imagine, we can easily agree with the statement that data is used throughout the project life cycle in the AEC industry. Starting with design, data is used in project planning, decision making, and iterative design. It can help in achieving sustainability goals and in cost estimation.
Then through construction, where we use data to optimize resources, improve health and safety, track progress, manage the supply chain, and increase the quality of deliverables. Finally, in operations, to analyze the performance of assets, improve maintenance, predict demand, and ultimately feed AI and machine learning tools.
Are we successful in using data? Based on multiple surveys, no. Most surveys say that anywhere between 70% and 99% of data is not used. What is the truth? It is really hard to say.
If we look at this model, it is very difficult to answer the question: how much data is in this model? Is it the size in megabytes? Is it the number of elements, or maybe something else, perhaps the number of parameters? Another question is: how do we define used and unused data?
So even if we cannot agree on the exact percentage, we can probably agree that the potential of data is not fully utilized in the AEC industry. Why is data not used? Because of poor quality. What is quality then? It is conformance with specific standards and requirements. Poor quality data increases project cost and risk, and may lead to delays.
As expected, quality comes at a cost. The cost of conformance is the cost of achieving quality during delivery. It includes prevention costs such as QA/QC procedures, data validation (the primary topic of this class), research, and training, as well as appraisal costs such as inspections, testing, and fixing. As shown on the chart, the more we invest, the higher the quality that can be achieved.
The second component is the cost of non-conformance, which is the cost of achieving quality after product delivery. It includes rework, liabilities, lost opportunities, and reputational damage. The relation here is: the worse the quality of the delivered product, the higher the cost. The total cost, as expected, is the sum of these two.
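Written as a simple relation, in my notation rather than from the slides, with q the target quality level:

```latex
C_{\text{total}}(q) = C_{\text{conformance}}(q) + C_{\text{non-conformance}}(q)
```

Conformance cost rises with q while non-conformance cost falls, so the total cost curve has a minimum somewhere between the extremes.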
What should our target be then? That really depends on the type of project and your organization's risk tolerance. However, in my opinion, we should probably aim to be on the right side of the intersection of these two curves, meaning we invest more in prevention than in fixing errors.
Quality can be achieved by validating the data. What is it then? It's a practice of checking the integrity, accuracy, and structure of data before it is used for a business operation. There are a few types of data validation.
The first is data type validation. A value can be a number, it can be 100%, or a number stored as text; it can be a fraction or a ratio. From a data validation point of view, all these types are different. Then there is constraint validation, which is checking whether the data is within a certain range.
Next is consistency validation, for example, decimal point precision. For structure validation, let's have a look at these two objects, or classes, to be more specific. Both describe a reinforced concrete floor. However, you can notice that the data structure is different, both in terms of the properties and the data types.
Finally, we have code validation, where we validate whether certain data is compliant with a code. In this case, an 800-millimeter column strip is not compliant with a specific section of the ACI code. This is probably the most complex type of data validation.
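To make the first three types concrete, here is a minimal sketch, not from the class materials, that checks a single raw value for data type, constraint, and consistency issues; the range and precision limits are illustrative only.

```csharp
using System;
using System.Collections.Generic;

public static class ValueChecks
{
    // Checks one raw value for data type, constraint, and consistency issues.
    // The range (0-240) and one-decimal precision rule are illustrative.
    public static List<string> Validate(string raw)
    {
        var issues = new List<string>();

        // Data type validation: "100", "100%", and "1/2" are different types.
        if (!double.TryParse(raw, out double value))
        {
            issues.Add($"'{raw}' is not a plain number (it may be text, a percentage, or a fraction).");
            return issues;
        }

        // Constraint validation: is the value within the allowed range?
        if (value < 0 || value > 240)
            issues.Add($"{value} is outside the allowed range 0-240.");

        // Consistency validation: here, at most one decimal place.
        if (Math.Abs(Math.Round(value, 1) - value) > 1e-9)
            issues.Add($"{value} has more decimal places than allowed.");

        return issues;
    }
}
```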
To validate data, we need to define the requirements. The requirements are typically defined by the appointing party, which is the client or developer. The documentation is then shared with the lead appointed party, for example the lead consultant, who reviews the requirements and creates a BIM execution plan that is followed by the other appointed parties.
To streamline data validation, we would like to use structured data, which is tabulated, specific, and restricted, rather than operate on unstructured data such as text and images. In practice, looking at a sample BIM execution plan, we'll notice some data requirements, such as project coordinates captured as an image, and a naming convention description captured as a sentence.
This data is very difficult for an external application to read. To fix this, we would like to replace these requirements with structured data. This can easily be done by using, for example, a spreadsheet saved in a shared area of the common data environment, which is Autodesk Docs in our case.
So this is a file that everyone can access, and it contains the project information, the models list, and the drawings list, so it may serve as your MIDP. The location information is captured similarly.
You can agree it's easier to access this data programmatically than by reading an image in the BIM execution plan. Next we look at the naming system and, finally, the parameters, where we can see details about the parameter name, parameter type, applicable Revit category, unit, and data restrictions. This file will be fundamental to multiple data validation methods that we'll explore in the second part of this presentation.
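As a sketch of why this structure matters downstream, here is one hypothetical way the parameters sheet could look once exported to CSV and loaded by a tool; the column layout and the pipe-separated allowed values are my assumptions, not the actual project file.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

// One row of the parameter requirements sheet, exported to CSV as:
// Name,Type,RevitCategory,Unit,AllowedValues (allowed values pipe-separated)
public class ParameterRequirement
{
    public string Name;
    public string Type;
    public string RevitCategory;
    public string Unit;
    public string[] AllowedValues;
}

public static class Requirements
{
    // Loads the requirements so validation tools can compare model data against them.
    public static List<ParameterRequirement> Load(string csvPath) =>
        File.ReadLines(csvPath)
            .Skip(1) // skip the header row
            .Select(line => line.Split(','))
            .Select(f => new ParameterRequirement
            {
                Name = f[0],
                Type = f[1],
                RevitCategory = f[2],
                Unit = f[3],
                AllowedValues = f[4].Split('|')
            })
            .ToList();
}
```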
Now, let's move to the second part of the presentation, the examples of data validation. We are moving into the practical implementation of the methods. Data can be validated using various tools and methods, which can be classified based on engagement.
We can do it reactively, based on comments received from other stakeholders or on observed [INAUDIBLE] model and process performance. Or it can be done proactively, which is the preferred way and the one we will focus on today. We can do it inside and outside the models. In regard to scope, we can read data, write data (meaning fix errors), or block software functionality.
Data validation can be done manually or automated. In our case, we'll be using the event-driven programming concept, which means that scripts, such as Revit plugins, are executed once a certain event occurs in Revit. There are more than 100 events available in the Revit API. We will see examples of running scripts on the model sync event, family load, view change, and when a certain button is clicked in the default Revit user interface.
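For orientation, here is a minimal sketch of what wiring up such events looks like in a Revit add-in; the handler bodies are left empty, and the validation logic discussed later in the class would live inside them.

```csharp
using Autodesk.Revit.DB.Events;
using Autodesk.Revit.UI;
using Autodesk.Revit.UI.Events;

public class App : IExternalApplication
{
    public Result OnStartup(UIControlledApplication app)
    {
        // Pre-sync event: runs before the model is synchronized with central.
        app.ControlledApplication.DocumentSynchronizingWithCentral += OnSynchronizing;
        // Pre-load event: runs when a family is being loaded into the document.
        app.ControlledApplication.FamilyLoadingIntoDocument += OnFamilyLoading;
        // Post-activation event: runs whenever the user changes the active view.
        app.ViewActivated += OnViewActivated;
        return Result.Succeeded;
    }

    public Result OnShutdown(UIControlledApplication app) => Result.Succeeded;

    private void OnSynchronizing(object sender, DocumentSynchronizingWithCentralEventArgs e) { /* validate here */ }
    private void OnFamilyLoading(object sender, FamilyLoadingIntoDocumentEventArgs e) { /* check family origin */ }
    private void OnViewActivated(object sender, ViewActivatedEventArgs e) { /* remind about missing data */ }
}
```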
Verification can be done visually, by visual model inspection, or analytically, by using formulas or custom functions. In terms of user awareness, users can be aware of the data validation happening, or be alerted on non-compliance, or the process can happen silently in the background without the user being aware of anything happening.
Now we'll move to the first example. This is the most basic way of checking the data. We are in a federated model, visually inspecting the models and object properties. We are currently in the desktop application, sectioning the models, selecting elements, and visually checking the properties.
If we don't have a license for the software, this exercise can be done directly in the common data environment. For example, we can use the model viewer in Autodesk Docs in Autodesk Construction Cloud. You'll see exactly the same model and can obtain the elements' properties.
And finally, if we don't have a license for the desktop application, and maybe the project is not hosted in Autodesk Construction Cloud, we can also use a free web model viewer: simply upload the file and see exactly the same content without having any license or any access to the project. In summary, visual inspection is very easy to perform by anyone on the project. It may not require any software installation, and a person can be easily trained.
The previous method, although easy, is not very effective. In this example, we'll use basic formulas to validate the data in the native model and also outside it, using schedules. If you are a model author, for example a Revit user, probably the easiest way of verifying data is to use schedules.
In this case, we'll be looking at a [INAUDIBLE] schedule verifying the fire rating parameter. We can see some blank entries and also some invalid entries. And a structural column schedule, where we verify the concrete grade parameter. This validation is happening visually, as we are just looking at the schedules.
A better way of doing it manually is to export the data and automate the validation using formulas. We can simply export a CSV file from the Revit model, open it in Excel, and use formulas to validate the data. Basically, we compare the data extract with the data restrictions from the project requirements spreadsheet.
In this case, I asked ChatGPT to write this formula for us, and I was pretty much successful. It was more of a fun exercise, doing it in a slightly different way.
So even if you are not very familiar with Excel, you can easily get this formula from AI tools. Obviously, you may need to change some references, in this case, the reference to a specific cell in the project requirements spreadsheet. So here we are. The data is already validated. We can now create some charts and build reports that can be shared with other project stakeholders.
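The exact formula depends on your sheet layout, but the shape is simple: look each extracted value up in the allowed list and flag anything that misses. For example, with extracted values in column C and a named range AllowedValues in the requirements sheet (both names hypothetical), something like =IF(ISNUMBER(MATCH(C2, AllowedValues, 0)), "Valid", "Invalid") does the comparison.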
So we can simply select the results, create a pivot table, and build some charts. This is probably the most popular way of validating data. It does not require much skill to export data and use formulas to build reports.
OK, let's move on. Manual data extraction is not the most efficient way of obtaining data. In this example, model metrics are exported automatically whenever a model is synchronized, and the data validation happens in the background without explicitly writing any formulas.
This is an example where we use the Revit API. We are using the sync event, which means that whenever the user synchronizes the model, a certain script is executed in the background. In this case, the script extracts and validates the data and saves it to ACC.
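A minimal sketch of such a handler, assuming the post-sync event is used, the checked parameters are instance parameters named "Concrete Grade" and "Fire Rating", and the output folder syncs to ACC (for example via Desktop Connector); all of these names and paths are illustrative.

```csharp
using System.IO;
using System.Text;
using Autodesk.Revit.DB;
using Autodesk.Revit.DB.Events;

public static class MetricsExport
{
    // Registered in OnStartup:
    // app.ControlledApplication.DocumentSynchronizedWithCentral += OnSynchronized;
    public static void OnSynchronized(object sender, DocumentSynchronizedWithCentralEventArgs e)
    {
        Document doc = e.Document;
        var csv = new StringBuilder();
        csv.AppendLine("ElementId,ConcreteGrade,FireRating");

        foreach (Element el in new FilteredElementCollector(doc)
                     .OfCategory(BuiltInCategory.OST_StructuralColumns)
                     .WhereElementIsNotElementType())
        {
            string grade = el.LookupParameter("Concrete Grade")?.AsString() ?? "";
            string rating = el.LookupParameter("Fire Rating")?.AsString() ?? "";
            csv.AppendLine($"{el.Id},{grade},{rating}");
        }

        // A folder that syncs to ACC, e.g. through Desktop Connector.
        File.WriteAllText(@"C:\ACC\ProjectMetrics\columns.csv", csv.ToString());
    }
}
```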
In this case, you can see the Power BI dashboard. It is hosted directly in the common data environment and is accessible to all project team members. This is the data for the [INAUDIBLE] models. The data source is also saved in a common location, so we are all operating on the same data.
Now we are in the Revit files. Once the model is synchronized, we see a pop-up notification informing us about the data extraction. Obviously, we can disable this pop-up. We can then see the updated file uploaded in ACC. This is the output of the data validation executed at model sync. Now we can simply refresh the data source in Power BI and republish the dashboard.
In summary, this method is very effective. First of all, the data is exported automatically. It is also validated automatically, as we are not using any formulas explicitly. And the results are available to everyone, as the dashboard is hosted in the common data environment. So there is an obvious benefit over the previous method.
What if we don't want to open the models? Maybe we don't have the license. Maybe we don't know how to interact with models. What would be the best method then? We can obtain the data from outside the model environment by using Autodesk Platform Services, formerly known as Forge.
In the first part of this video, we obtain multiple object properties by using a very simple Node.js application. Here we are running the application in a terminal window. Obviously, we could build a nicer user interface and host this application on the web. We can see a very comprehensive list of parameters for all objects in the models.
Obviously, there is a lot of data that we don't want displayed. So in the second part, we set up a filter to obtain only specific elements that belong to models in the ABC organization, and we focus only on the concrete grade and fire rating parameters.
We are still not validating data; we are simply accessing it from outside the model environment. In this case, we can see that the results look quite different. We just see the list of elements that contain the concrete grade and fire rating parameters, with the corresponding element IDs. This proves that we can access data outside the model environment, as we never opened Revit. And finally, we are going to validate these four models for multiple parameters.
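The class video uses a Node.js app; for consistency with the other examples here is the same idea sketched in C#, calling the Model Derivative API to pull all object properties for one model view. The access token, URN, and view GUID are placeholders you would obtain from your own APS application and hub, and the model is assumed to have already been processed by ACC.

```csharp
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class ApsProperties
{
    // Returns the raw JSON with the properties of all objects in one model
    // view of a translated design (Model Derivative API).
    public static async Task<string> GetAllPropertiesAsync(
        string accessToken, string urn, string modelViewGuid)
    {
        using var http = new HttpClient();
        http.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", accessToken);

        string url = "https://developer.api.autodesk.com/modelderivative/v2/" +
                     $"designdata/{urn}/metadata/{modelViewGuid}/properties";

        HttpResponseMessage resp = await http.GetAsync(url);
        resp.EnsureSuccessStatusCode();
        return await resp.Content.ReadAsStringAsync();
    }
}
```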
We are now accessing the project-specific folder. You can see that four models have now been validated, and we will see the results shortly. So now we can see the results. This method is very powerful: in a matter of a few seconds, we can validate multiple models containing thousands of objects without even downloading the files. So if you are on the client side or have subconsultants working under your organization, this is probably the perfect way to validate the data.
Now, let's go further. As clients may also have their own data validation tools to give us feedback on our deliverables, how can we ensure 100% data compliance and avoid negative feedback?
In this case, we are going to explore a solution that prevents users from saving incorrect data by validating it when the user attempts to synchronize the model or change the view. So these are our events: model sync and view change. If there is any non-compliance, the user is alerted by a warning and prevented from saving their work.
In the first part of the video, we can see that the organization name parameter is not filled in, whereas it is required as per our data requirements spreadsheet. Now, whenever the user changes the view, they are reminded about the missing data by this pop-up message. The only way to get rid of the pop-up is to add the missing data. Once we add it, the pop-up message disappears.
In the second example, we'll observe incorrect parameter values. So again, we are looking at the project requirements, specifically the restrictions for fire rating, concrete grade, and air velocity.
You can see incorrect entries for structural columns: there is no such grade as 50-60. Similarly for fire rating, per the naming convention we implemented there is no 45-minute fire rating, as we operate on 30-minute intervals. And similarly for the duct, the velocity is outside of the range.
So now, once we attempt to synchronize the model, we receive detailed information about the noncompliant elements. We also observe that the model sync is canceled. So basically, the user is not able to synchronize their work, and the only way to do it is to fix the data.
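A minimal sketch of this blocking pattern on the pre-sync event; the allowed grades are hard-coded here for brevity, whereas in the real tool they would come from the requirements spreadsheet, and the parameter name is an assumption.

```csharp
using System.Linq;
using Autodesk.Revit.DB;
using Autodesk.Revit.DB.Events;
using Autodesk.Revit.UI;

public static class SyncGate
{
    // Registered in OnStartup:
    // app.ControlledApplication.DocumentSynchronizingWithCentral += OnSynchronizing;
    public static void OnSynchronizing(object sender, DocumentSynchronizingWithCentralEventArgs e)
    {
        string[] allowedGrades = { "C30/37", "C40/50", "C50/60" };

        // Collect the IDs of columns whose concrete grade is not on the list.
        var invalid = new FilteredElementCollector(e.Document)
            .OfCategory(BuiltInCategory.OST_StructuralColumns)
            .WhereElementIsNotElementType()
            .Where(el => !allowedGrades.Contains(
                el.LookupParameter("Concrete Grade")?.AsString()))
            .Select(el => el.Id.ToString())
            .ToList();

        if (invalid.Count > 0 && e.Cancellable)
        {
            TaskDialog.Show("Data validation",
                "Sync blocked. Fix Concrete Grade on elements: " + string.Join(", ", invalid));
            e.Cancel(); // the model cannot be synchronized until the data is fixed
        }
    }
}
```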
In summary, you can probably see that this method is extremely effective. We can prevent incorrect data from being saved in the model and can actually ensure 100% data compliance, as wrong data cannot be saved. Now, after fixing the data, we can see that the model synchronizes correctly. And as you noticed, we didn't click any button. All these scripts run on events; there is no need to click anything in Revit to execute them.
In all the previous examples, we focused on reading data and validating it using various methods. In this example, we will explore a different scenario: rather than informing the user about non-compliances, we will use the tool not only to verify the data but also to rectify the issues found.
In addition, this process happens silently, without the user being aware of a script running in the background. In the previous examples, we saw all these pop-up messages telling the user to please fix this and that. In this case, we will see a slightly different scenario.
In the video, we can see 12 sheets in the model out of the 29 required as per the sheets list for this particular file. In the previous example, we simply informed the user about the missing sheets. In this case, we will simply create them at the model sync event.
So we synchronize the model. There is no pop-up notification, but we notice that 17 sheets have been automatically added. Obviously, this tool is dynamic: if more sheets are added to the spreadsheet in the future, they will be created automatically at the next model sync event. So basically, from the user's perspective, they were not even aware that anything happened, but we actually ensured 100% compliance against the [INAUDIBLE] requirements.
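A sketch of that silent fix, assuming the required sheet numbers and names have already been read from the spreadsheet and that the event context allows opening a transaction; the title block selection is simplified to the first type found.

```csharp
using System.Collections.Generic;
using System.Linq;
using Autodesk.Revit.DB;

public static class SheetCreator
{
    // Compares existing sheet numbers against the required list and
    // creates whatever is missing, without any user interaction.
    public static void CreateMissingSheets(
        Document doc, IList<(string Number, string Name)> required)
    {
        var existing = new FilteredElementCollector(doc)
            .OfClass(typeof(ViewSheet))
            .Cast<ViewSheet>()
            .Select(s => s.SheetNumber)
            .ToHashSet();

        // Use the first title block type in the model (simplification).
        ElementId titleBlockId = new FilteredElementCollector(doc)
            .OfCategory(BuiltInCategory.OST_TitleBlocks)
            .WhereElementIsElementType()
            .FirstElementId();

        using (var t = new Transaction(doc, "Create missing sheets"))
        {
            t.Start();
            foreach (var (number, name) in required.Where(r => !existing.Contains(r.Number)))
            {
                ViewSheet sheet = ViewSheet.Create(doc, titleBlockId);
                sheet.SheetNumber = number;
                sheet.Name = name;
            }
            t.Commit();
        }
    }
}
```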
So in summary, in certain cases, instead of validating data and informing the user about non-compliances, we can automatically add the correct data to the model. In the last example, I will introduce an extreme method of ensuring data compliance: instead of checking, we will simply block certain functionality.
Looking at BIM execution plans, we can very often notice that the use of DWG files is prohibited. There may also be special requirements about the origin of parametric families, such as an approved location in Autodesk Construction Cloud or another common data environment.
These requirements can be captured in the project requirements spreadsheet: this is the folder path, and we also state that the CAD import functionality is blocked. When the user attempts to use the import CAD functionality, they are alerted that it is blocked, and there is no way to import the CAD file.
Also, the load family command is blocked whenever the user attempts to load a family that is not from the approved location. So in this case, we are trying to load a family from the work-in-progress folder rather than from the approved library.
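A sketch of both blocking mechanisms: overriding the built-in Import CAD command through an add-in command binding, and canceling family loads that do not come from the approved location. The approved library path is illustrative, and whether a command can be bound should be checked as shown.

```csharp
using Autodesk.Revit.DB.Events;
using Autodesk.Revit.UI;

public static class Blocking
{
    // Called from IExternalApplication.OnStartup.
    public static void Register(UIControlledApplication app)
    {
        // Override the built-in Import CAD command so it only shows a warning.
        RevitCommandId importCad =
            RevitCommandId.LookupPostableCommandId(PostableCommand.ImportCAD);
        if (importCad.CanHaveBinding)
        {
            AddInCommandBinding binding = app.CreateAddInCommandBinding(importCad);
            binding.Executed += (s, e) =>
                TaskDialog.Show("Blocked", "Importing DWG files is prohibited on this project.");
        }

        // Cancel any family load that is not from the approved library.
        app.ControlledApplication.FamilyLoadingIntoDocument += (s, e) =>
        {
            const string approved = @"C:\ACC\Project\Approved Library"; // illustrative path
            if (e.Cancellable && !e.FamilyPath.StartsWith(approved))
            {
                TaskDialog.Show("Blocked",
                    $"Load '{e.FamilyName}' from the approved library only.");
                e.Cancel();
            }
        };
    }
}
```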
So in summary, this is a very simple blocking mechanism, but it guarantees 100% compliance with the data requirements, in this case, the DWG file format restriction and the origin of Revit families.
In conclusion, data is an asset in the AEC industry. Structured and specific data requirements are essential for efficient data validation. Data can be validated using different approaches, and we have explored various examples, starting from manual, through semi-automated, to fully automated event-based custom solutions.
This slide concludes my presentation. I hope you enjoyed the presentation, and thank you for watching.