
Introducing AI Rendering: From Okay to Wow!


Description

Join this session to learn how a web-based configurator built with Autodesk Platform Services can help improve sales. With an easy-to-use web tool, we'll see how nontechnical users can design interior floor plans and produce high-quality renderings. From enhancing traditional path-tracer renderings to offering new WebGL optioneering workflows, let's explore what new AI rendering techniques can do in a fraction of the traditional time and cost.

Key Learnings

  • Find the most suitable approach to enabling 3D rendering from a web-based app.
  • Learn how to enhance traditional path-tracer renderings with AI.
  • Learn how to generate 3D renderings from a WebGL-based app with prompt-style control.

Speakers

  • Alexandre Piro
    I studied mechanical engineering and industrial organisation. After graduating in 2012, I started working in aeronautics on tooling and efficiency improvement. In 2016, I began Augmented Reality development with CAD models, and after 6 months I joined PIRO CIE. I spent one year on research and POC development in Augmented and Virtual Reality. Since 2018, I have led AR/VR projects and web workflows to be more flexible with data exchange. I also started working on Autodesk Platform Services to explore another aspect of CAD and 3D manipulation outside of editors. I work actively with the Forge Dev Team and have participated in 4 APS Accelerators. I am able to develop on different parts of the APS API. At AU 2022, I presented a class about point cloud integration in a construction workflow with APS.
  • Michael Beale
    Michael Beale has been a Senior Developer Consultant since July 2017 for the Autodesk Developer Network and Forge Development Partner Program. Before joining the Forge Platform Team, Michael worked on Autodesk Homestyler, Cloud Rendering, the Stereo Panorama Service (pano.autodesk.com), and the A360 Interactive Cloud Renderer before working on the Forge Viewer and Viewing APIs. Twitter: @micbeale Blog: https://aps.autodesk.com/author/michael-beale
      Transcript

      MICHAEL BEALE: Hi, everyone. And welcome to today's talk-- "AI Rendering: From OK to Wow!" I'm Michael Beale, developer advocate for Autodesk Platform Services. And today, we've got an awesome speaker with us, Alexandre Piro from PIRO CIE. And together, we're going to dig into how traditional rendering workflows are evolving with AI. And trust me, the results have a bit of a wow factor.

      So we'll kick off with a real-world case study from Alex. His company, PIRO CIE, worked with Werqwise, a company that leases office spaces, to build a floor plan configurator that works in your browser. It lets you create wall partitions, CAD drawings, and high-quality renderings. They use Enscape as their rendering engine, so we're going to see how we can use AI to take these visuals to the next level.

      And here's the rest of the agenda. After we talk about traditional rendering, we'll start with the AI rendering part and explore some new workflows that can speed up your creativity. I'll introduce AI ControlNets and a tool called ComfyUI. And we'll explore how to generate high-quality AI renderings with depth and normal maps.

      Finally, we'll automate what we've learnt into a sample web app that you can try yourself. Now, like any new disruption, there are pros and cons, and Alex is going to cover these in the future directions and conclusions. So with that, let's get into it. Alex, the floor is yours.

      ALEXANDRE PIRO: Thank you, Michael. Good morning, everyone. So my name is Alex. I'm a software engineer at PIRO CIE, based in France. So who is PIRO CIE? It is a software development company providing consulting services, helping customers in their BIM and CAD workflows, whether in industry or in construction.

      They provide dedicated software such as desktop plugins for Autodesk products--Revit, AutoCAD, Navisworks--visualization applications with augmented or mixed reality, as well as web solutions using Autodesk Platform Services for different purposes: configurators, digital twins or IoT, or even just handling the data. And for the last seven years, I've been focusing on workflows outside of editors, and I've been leading all the augmented reality and web-based projects for the company.

      And today, we are talking about one of our customers, Werqwise, whom PIRO CIE has been supporting. Werqwise is a San Francisco-based company offering workspaces. They revolutionize workspace organization by creating their own system of modular walls. It gives them the ability to maximize space utilization and provide adaptable workspaces. They offer scalable and customizable solutions to their customers to ensure productivity and well-being.

      So where did we start? How are they doing that? And which tools are they using today to design their workspaces? Today, their workflow is based on Revit to generate 3D views as well as 2D floor plans.

      They provide the complete 3D model of the building, but also each level separately. To help sales promote their workspaces, they are currently using Enscape to create beautiful renderings as top-down views to replace traditional blueprints, or as 3D interior views.

      So here we can see two different views: a top-down rendering compared to a traditional 2D line-style floor plan. We can easily understand that most people will prefer, use, and present this image over traditional blueprints, which were originally made for technicians and builders.

      Unfortunately, this workflow is not very intuitive for salespeople, or any other non-technical users. Revit is a building design software and requires some knowledge to be manipulated easily. The Enscape rendering also requires some visualization and rendering concepts. Handling lighting and textures is a necessary step to create the most realistic images, and it is not always trivial. The Enscape plugin helps to manage some of these parameters with predefined functionality, but it still requires some manual work.

      So to help with these two pieces of software, here's an example of a guidance page--a step-by-step tutorial for non-technical users to be able to navigate and create a floor plan in Revit and then render it with Enscape: how to select a level, where to click for each function, how the navigation tools work--every feature they need to manage this workflow themselves. It's the kind of document that highlights why we want to move to a more intuitive approach and reduce the learning curve. In addition, the more we automate, the less time people spend on non-value-added tasks, even for technical people.

      So here's a breakdown of the platform we will be introducing. We have three main components. First, the configuration part: having the ability to create new layouts, edit wall positions, add or remove furniture, and a few helpers like duplicating or mirroring rooms. Of course, they can save multiple design options and apply the current design to the Revit model with the help of Design Automation for Revit.

      The second component is the digital twin part. They want to keep track of the current state of the building through the layout, but also collect data from sensors and manual entries and be able to run analytics on this data. This helps them optimize the building usage. But we are not going to talk about this part in much detail in this session.

      Finally, the rendering component--this is the one we are most interested in for this session. As we said before, to help salespeople, they need high-quality images of the layout proposal. Top-down views and 3D interior views can both be generated and sent to the final user.

      So here's a quick look at the platform we built using Autodesk Platform Services. Some of you may recognize the Autodesk viewer. In this platform, we can see different views, like the 3D views and the 2D views. We can use the built-in navigation tools provided by the viewer, and we can navigate through the model like any other model.

      Here's an example of a separated level, which helps us to edit a single level. And this is the editor--this is how it looks. On the left part, we have an element catalog from which we can select elements and drop them into the floor plan. You can see on the view that we have a grid system to place the walls because they are predefined-length elements. Like this, we have the ability to place the elements manually.

      So let's watch a demo of this platform. Starting from the viewing mode, we can switch to the Edit mode. And we can see all the elements already placed in this layout, as well as the grid, which appears here with every element placed on top of it. So we are deleting some walls to create a new space. We can see the ghost of the previous element, which reminds us where the previous layout was.

      Starting from the first element, we can select the plus symbol to add an adjacent element, or drag an element from the left side. We also see that we have a collision feature which prevents elements from overlapping--it turns red when it overlaps other elements. And we can easily duplicate existing elements, such as a group of desks in this example, or add different furniture from the catalog here. Then this element can be rotated and moved to its final position.

      In this example, we edited everything from the top view, like a 2D editor, but we are still in 3D. So if we want to edit in 3D, this is possible, and we can edit our plan that way. And finally, we can save our design.

      Now, let's talk about rendering, and more specifically, the traditional techniques--what results we can get from these renderers, and how we can use them in an automated pipeline. First is the result with Enscape. This is the rendering plugin used by Werqwise in their original workflow.

      We can see this image is very smooth and sharp. The lighting is quite subtle, and the texturing is pretty flat. We can note that the rendering time for this image is about 30 seconds.

      V-Ray is rendering software provided by Chaos, the same company that produces Enscape. This engine is different, and the setup is also different. In this case, we can see the lighting is more taken into account and provides more depth to the different textures. The rendering time is about the same, around 20 seconds in this case.

      GPU path tracing is another technique that we can use to render. It takes advantage of the ability of WebGL 2 to access the GPU and run path tracing and rendering jobs from a web viewer. In this case, it is made with three.js. We can see that the result is pretty similar to the other renderings, except for some materials and colors, which are different due to file conversions. But the rendering time is much higher for the same quality, and we'll see that later.

      And last but not least, Blender. It's a very famous open-source, multi-purpose 3D software. It allows you to render and can be automated with Python scripts. The rendered image also looks very smooth and sharp. The lighting and textures are very similar to the others, even if this image looks darker. This is the best performance in terms of rendering time that we got.
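
      As a rough illustration, here is a minimal sketch of what such a headless Blender render script could look like--the file paths, resolution, and sample count are illustrative assumptions, not the exact script used in this project.

      # Run with:  blender --background --python render_floorplan.py
      import bpy

      SCENE_FILE = "/data/floorplan.glb"        # hypothetical glTF export of the layout
      OUTPUT_IMAGE = "/data/renders/topdown.png"

      # Start from an empty scene and import the converted model
      # (a camera and lights can come along with the glTF file).
      bpy.ops.wm.read_factory_settings(use_empty=True)
      bpy.ops.import_scene.gltf(filepath=SCENE_FILE)

      scene = bpy.context.scene
      scene.render.engine = "CYCLES"
      scene.cycles.samples = 128                # quality vs. render-time trade-off
      scene.render.resolution_x = 1920
      scene.render.resolution_y = 1080
      scene.render.filepath = OUTPUT_IMAGE

      # Render the active camera to a still image.
      bpy.ops.render.render(write_still=True)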

      So speaking about performance, I ran these renderings on the same hardware, and we can see that they are pretty close to each other--less than 30 seconds to render this top-down view--except for GPU path tracing, which requires more samples to get a similar level of quality and takes about one to two minutes to render.

      Let's take a look at some 3D-rendered images. Here we have three different scenes: an office scene with multiple desks, a break room with a library, and another common area with tables and different seats--there are chairs and a couch. You can appreciate some details in the library, such as the different books and armchairs. In all these scenes, we can see the multiple lights affecting the material reflections and the shadows, even if it still looks a little bit flat.

      Now, let's compare the different capabilities of this software. When comparing these different software options, several important points come to light, especially about automation, file compatibility, cost, and performance--keeping in mind that our objective is to automate this workflow.

      So first, Enscape. One of the main limitations with Enscape is that it doesn't offer any public API. It's a plug-in, so automating it would require running Revit in headless mode, and unfortunately, this is not allowed, which makes it difficult or impossible to automate the rendering process. Even if the performance and the quality are great, and it has the ability to work with native Revit files, it's not a good choice for automation, so we must find another option.

      Next, we have V-Ray, another piece of software from Chaos, which is a bit more flexible. V-Ray does provide an API through its SDK, making automation possible, which is a huge plus. However, we need to convert Revit files to glTF or another compatible format before rendering. V-Ray comes with licensing costs as well and requires more setup than Enscape. As seen before, quality and performance are great.

      Now, moving to Design Automation for 3ds Max. We didn't talk about this solution before, but when it comes to automation, we remember that Autodesk provides this kind of service, and 3ds Max would be a great solution for rendering and importing Revit files.

      We need to keep in mind that Design Automation relies on CPU-only servers. That means slower performance, and it results in higher cost with lower-quality results. So while it can be a useful tool in some cases, using it for realistic rendering of complex scenes is not worth it. So we need to find another solution in this case.

      Then there's GPU path tracing with three.js. With a quality that can be close to ray tracing, GPU path tracing with three.js runs on the client side, which is a bit trickier to automate. We also need to convert the file into a supported format such as glTF. And the performance is below the other solutions, as we saw in the performance comparison before. So we won't consider this solution for this purpose, but it can be interesting in some use cases.

      Finally, let's talk about Blender. Blender is a great option, as it is a very popular open-source software with a great API, a lot of support, and it's highly customizable. Even if it requires file conversion, it's a good competitor to V-Ray.

      Despite the automation capabilities, rendering high-quality and realistic images still requires scene preparation, such as lighting, material setup, and texturing. These steps involve time-consuming work. And now Michael is going to show us how AI can help us in this process.

      MICHAEL BEALE: So Alex has been showing some ray-tracing results and ray tracing has been the gold standard in rendering for decades. But what if AI could take this a step further? We're going to see how AI can enhance existing ray-traced renderings to help generate something that looks even more realistic with less effort.

      So let's try it. Let's take that existing workflow from the Werqwise configurator. Here it is. Here's a rendering. And we're going to enhance it with AI. That's what it looks like. So let's go back.

      This is the original rendering. You can see that it looks very clean. It's a traditional computer-generated rendering, and a little bit too empty. How could we make it more realistic? Now, I'm not an artist, but the time and effort to make things look more realistic goes up exponentially--to make things more realistic, it costs us exponentially more.

      So let's see what AI is doing here. So when I flip back and forth here, you can see it immediately just looks better, but it's hard to pinpoint exactly what looks better. There's the original, and this is the AI-rendered image. So what changed? Well, let's take a closer look.

      I've highlighted a few things with a circle that's colored in green. First, you can see improvements in all of the texture materials. You can see the green chair in the top left--the leather now looks a little bit more reflective. The Steelcase chairs in the middle now have a more pronounced fabric backing. And the table textures, the wood grain, is now a little bit more pronounced.

      And finally, if you look in the whiteboard in the top left, you can see that there's actually new smudges there. Now, this took me all of about three minutes of effort. And that's a big time saving, even for an artist. The amount of effort it would take me to clean up these UV coordinates and then add high-quality textures would take me hours.

      So how did I do it? Well, I could say I cheated a little bit. I used an AI tool called Krea AI. And this is another example of that Krea AI tool with Blender. Here I've got an interior scene, and I'm using Krea AI; this one is based on a podcast.

      And you can see that AI has enhanced the rendering quality of this carpet. You can zoom in here and take a closer look at what's changed exactly. It's changed the way the carpet is shifted around, and that would take a lot of artist effort to make happen. And AI has just done that with a simple 30-second conversion or enhancement.

      So this is AI enhancement. It can enhance the realism of a traditional ray-traced rendering with just a few clicks. But what if AI could do all of it? What if it could also do the ray tracing, too? Now, this is what we're calling AI rendering. And this is part 2, "Exploring AI Rendering Techniques."

      So let's give it a go. We're going to give AI a few PBR (physically based rendering) layers from our viewport. Specifically, we'll give it the RGB map, maybe the edge map, and the depth map, or maybe the normal map. And we'll provide a simple text prompt, like, make this kitchen scene nice and sunny and look like an IKEA catalog. So we'll go from this to this.

      We can guide the style we want with natural language. We can just give it a text description that describes our style. And it goes and does the rendering. It's kind of like Clippy, Clippy for rendering. Does anyone remember Clippy? Hey, Clippy, can you stylize this house 100 different ways really quickly?

      Well, let's see. So here it is here. It's stylizing lots of different options. Some look realistic; some, not so much. They look more like a toy. Or sometimes they look like an oil painting. And other times, they look like a toy model.

      Or maybe I've got a 2D floor isometric view. I can also render that out, or maybe an interior scene like this one. I can have something that's quite subtle like this with dappled sunlight or something that's much more amplified that looks like this.

      So one other thing to remember, though--always thank AI. Thank Clippy, because you never know: maybe one day they'll say, keep this one alive. He always said please.

      All right, so how does this work? This is all based on a technology called Stable Diffusion, a text-to-image diffusion model that combines a VAE, a UNet, and a sampler. And it operates in two distinct phases, as you can see from this diagram.

      The first phase, the forward phase, is also called the training phase. You can see a diagram here, an animation on the right. We're going to feed the model millions of images, like these cats and dogs, and we're gradually going to add noise until they become unrecognizable. The goal of training is for the model to learn how to remove that noise.

      The model also learns to associate these images with text prompts. So for example, it might see an image of a white cat and learn to link that image with the prompt, white cat. The UNet is used to predict and remove noise from images during this diffusion process. And it plays a key role in refining the image at each time step, as you can see in this animation. OK, so that's the first phase.

      The reverse phase we also call the sampling phase. Now, this is where the magic happens. When you give the model a prompt, like, show me a white cat, it starts from pure random noise, which you can see in the bottom right, and it gradually removes that noise step by step. The model uses what it learned during the training phase to refine the noise into a clear image of a white cat, guided by our text prompt.

      So the simple takeaway: during training, the model learns from real images of cats and dogs and their text prompts. And then during this phase, the sampling phase, we can give it a text prompt like white cat, and it uses what it learned to generate an image of something that matches a white cat. This is pretty familiar if you're used to using things like Midjourney or DALL-E--it's that same process.
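
      To make the sampling phase concrete, here is a minimal sketch using the Hugging Face diffusers library; the checkpoint name, prompt, and settings are just example values, not anything specific to this class.

      import torch
      from diffusers import StableDiffusionPipeline

      # Load a Stable Diffusion checkpoint (example model ID).
      pipe = StableDiffusionPipeline.from_pretrained(
          "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
      ).to("cuda")

      # Sampling: start from random noise and iteratively denoise,
      # guided by the text prompt.
      image = pipe(
          prompt="a white cat sitting on a sunny windowsill",
          num_inference_steps=30,   # number of denoising steps
          guidance_scale=7.5,       # how strongly the prompt steers the result
      ).images[0]

      image.save("white_cat.png")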

      So, so far, we've talked about text prompts, and those can guide the image generation. But sometimes a text prompt isn't enough. We want to control our image a little bit more so it can match an exact camera angle, and we want to be able to fine-tune the textures that it chooses.

      We can add this extra information through the UNet model--the part responsible for removing the noise--and influence it using the cross-attention layers. That will use these new input layers, like the depth map, the edge map, or a segmentation map, and that'll guide the image generation more precisely.

      This is a ControlNet. It goes beyond text prompts, and it can help us generate things based on, say, a person in a specific pose. And you can see that here in the image at the top right: I've given a particular pose map, and that's going to control how the person is positioned. Or I can give it, say, an edge map or an outline of an image, like this little bird, and that edge map will guide the ControlNet into producing that final image of a bird. The last one is, if we give it a depth map, it can understand the perspective in the scene, and that will ensure the model understands the 3D structure.
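
      As a small sketch of the depth-map case, again with the diffusers library--the model IDs, the input file name, and the prompt are illustrative assumptions:

      import torch
      from PIL import Image
      from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

      # A depth-conditioned ControlNet paired with a Stable Diffusion base model.
      controlnet = ControlNetModel.from_pretrained(
          "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
      )
      pipe = StableDiffusionControlNetPipeline.from_pretrained(
          "runwayml/stable-diffusion-v1-5",
          controlnet=controlnet,
          torch_dtype=torch.float16,
      ).to("cuda")

      depth_map = Image.open("viewport_depth.png")   # e.g. a depth map exported from the viewer

      image = pipe(
          prompt="modern office interior, soft sunlight, photorealistic",
          image=depth_map,                      # ControlNet conditioning image
          num_inference_steps=30,
          controlnet_conditioning_scale=1.0,    # how strongly the depth map constrains the result
      ).images[0]
      image.save("office_render.png")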

      So this all sounds really complex and academic, right? But thankfully, there are some tools out there that make this stuff easy to get up to speed with. The open-source community has moved fast and has provided a number of great tools that strike a balance between power-user control and quick, easy-to-use workflows.

      One of these tools is ComfyUI. This is a graph node system. Now, it can run on a powerful laptop computer with a GPU, or it can run in the cloud. Comfy embeds your graph nodes inside the PNG or JPEG that's generated, and that makes it really easy for you to share your work with other folks. Let me walk you through a quick example.

      Now, in this demo, I'm just running Comfy from my browser, but I'm going to make it submit the jobs to a GPU running in the cloud. So let's start with, say, an empty canvas like this one, and I'm going to add some nodes. I double-click, I search for the node that I want, and then I want to join these two nodes together. So I can click on their little interface here and join these two nodes together, or I can create a new node by just dragging it out.

      Now notice here in this-- I'm going to clear the canvas here. And you're going to notice that I'm going to just drag an image onto the canvas. So here's an image I generated earlier. I drag it onto the canvas, and I can see my node graph is already there. So that's a really easy way to share things.

      One other thing-- I want to just quickly pause the video here and show you the text prompt. In the top left, it says, "wide angle camera, modern office with open workspace, add some sunlight," et cetera. So that's the text prompt. Now, before I run this, let's also change the resolution, which I just did--I changed the height from 1,024 to 512. Now, I'm going to run this. Oops, let me go back, and I'm going to run this.

      OK, so let me drag that image back again. I'll drag that image onto the canvas. You can see that workspace that I was using, that workflow, my text prompt. I changed one of the nodes, one of the values. Now, I click the Run button. And you can see that it's now running on an NVIDIA L4 GPU. And it's running that process, and it's generated an image. Great.

      Now, let's generate a few more options. I'll change the batch size here to four, and I'll run this job again. Now, because it's running on a serverless GPU, it's going to perform more operations a little bit more efficiently. So here's the result: I've got four different random images of my interior office space, and I can compare them side by side like this. So that's the basic workflow.
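
      For reference, the same kind of job can be queued against a running ComfyUI server over its local HTTP API instead of clicking Run in the browser. This is a minimal sketch, assuming ComfyUI is listening on localhost:8188 and the workflow was exported with ComfyUI's "Save (API Format)" option; the node ID and prompt text are placeholders for whatever your own graph uses.

      import json
      import requests

      # Load a workflow exported in API format from the ComfyUI menu.
      with open("office_workflow_api.json") as f:
          workflow = json.load(f)

      # Tweak node inputs programmatically, e.g. the positive text prompt.
      # Node IDs depend on your own graph; "6" is just an illustrative key.
      workflow["6"]["inputs"]["text"] = "wide angle camera, modern office, soft sunlight"

      # Queue the job on the local ComfyUI server.
      resp = requests.post("http://127.0.0.1:8188/prompt", json={"prompt": workflow})
      resp.raise_for_status()
      print("Queued prompt:", resp.json()["prompt_id"])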

      Now, let's add our depth and normal maps. This is a little bit more of a complex workflow, and I just wanted to show you the additional node graphs that are required. So, a little bit more complex--you can see there's a lot more going on here. And this time, I'm going to add a depth map. This is the job running now. I click Run, and it's going to produce the result.

      So specifically, there are a few areas that I want to dig into with you. The first one is the inputs. Down in the bottom left there, you can see I've added a depth map and a normal map. And at the top left, I've also provided a style image. So this could be an image from an IKEA catalog that I like. That style image is going to get converted into latent space using the VAE, and the model will come from a checkpoint. And then finally, this all comes together in our KSampler to render out the final image.

      Now, what if we combine this with our APS viewer? So here it is, my APS viewer. I've got the RGB image, but how do I get these other layers? So specifically, I want to get the RGB layer, the depth map, and the normal map. And I can also get the segmentation map if I want to. And then I'm also going to provide a text prompt to give it that style that I'm looking for.

      And when I do all of that and feed it into a ControlNet, I can get this great result. Let's just compare those two things. You can see the before, and then you can see the after. And it's really added some nice global illumination with sunlight coming in from the side.

      But how did I actually get those depth and normal maps out of the viewer? That's not quite part of our API right now. And here's the source code on the right to do that. But let's walk through a full example.

      Let's start with the basic viewer. We'll go to viewer.autodesk.com. I'm going to quickly log in, and I'm going to upload a Revit file that I've got of a house. And I'll use the viewer, the LMV viewer, to position my viewport.

      So I'll set it to walk mode, I'll walk into this kitchen, and position things. And now I'm going to open up the Chrome debug console, and this is where I can enter some source code. In particular, I want to capture the depth map. So here's the code for grabbing that depth map, and inside is encoded the normal map as well.

      And so once I've captured that buffer, I'm going to add some code to decode the normal map. And I'm also going to decode the depth map and finally create two links that I can just quickly click on. So now I'll just run that code on my current canvas. And it's going to generate two files. It's going to generate my normal map file-- here it is-- and my depth map file. And I'll run that through my control net to get this great result.

      So let's move the camera around a little bit. Let's try a different camera angle, maybe. So I'll zoom out. And let's maybe try a clip plane so I can get a floor-level view. So I'll add the section plane, and I'll position it down to that same kitchen area a little bit. And then I'll run that same command set again to capture the depth and normal maps.

      Here they are-- the normal map, the depth map, and then the final result. And it looks pretty good. Let's do a close comparison. Here's the final result, and there's the normal map. So switching back, and that normal map. It really added that nice global illumination and soft shadow that would take a long time for me to render.

      All right, who's heard of Midjourney here? Midjourney makes everything look fantastic. It adds that wow factor to everything. It's kind of a saturation dial. Well, it turns out there is a very similar Midjourney-style node that we can add into our Comfy workspace.

      So I'm going to add that to our Comfy workspace in this example, and I'm going to slowly dial up the results. So we'll start with this. We'll add some very subtle shadow lighting here, and we'll add a little bit more shadow lighting, maybe add some wood textures, maybe some dappled sunlight.

      And then finally, we'll really add that Midjourney effect. It's going to start adding a bowl of apples. It's going to add some kitchen utensils in the background. It's starting to add leatherette seating and concrete shiny floors. And then now, now it's really going crazy. It's starting to add marble floor tops. And now we've just got a completely wild scene here.

      Let's dial it back down a little bit. We'll come back to this scene. And now, inside Comfy, I can also add a few more nodes--actually quite a lot more nodes--to add an animation using AnimateDiff and a few other things. And that means I can prompt AI to generate what it understands of the 3D world and generate a walking video like this one.

      And going back to our configurator result, we can go from this to something like this, very subtle, to this, to our other scene, which is the office layout scene. We can also create a video of this one. We get a result like this. And AI has just figured out the scene. It understands structure, and it's created a walking video.

      Another example could be this kitchenette scene. I'll ask it to add much shinier floors and try to make the scene look more realistic. Here it's added some great coffee pots and metallic appliances. Or maybe I want to add some red accent to this scene. I can just simply say, add some red accent, and it comes up with something like this. Or I can dial it back down and say, create a walking video of this one. And so it goes ahead and it can create a walking video of my scene.

      OK, so that's all well and good with ComfyUI workspaces on my laptop. But what if I actually want to automate something? What if I want to build an application, and I want to build it with, say, the APS viewer? Well, let's go through it.

      Here's a quick example. I've built this simple web application with a viewer, and I can compare my rendered image with my APS viewer, so I can just double-check things look the same. Let's create a new image. I'll use the walking-mode camera, and I'll set up a scene that I like. I'll provide a text prompt or maybe a pre-canned style, maybe add a bit of hazy sunlight, and then I'll click the Render button here, which will trigger a job and finally produce my rendered result, which I can compare with a left-and-right split panel.

      So how do we build this? With the APS viewer, it's pretty straightforward. We start with one of the Getting Started guides--the hubs browser example. This means I can log into my ACC account or my BIM 360 account and choose any one of my models. I'm then going to add an extension to the viewer, and I'm going to add a button that handles the click-to-render option.

      And then I'll finally add a little bit of UI. I'm going to add a pull-out drawer so I can open up my BIM 360 files and folders. I'll add a text prompt input up at the top, three pre-canned visual styles, a preview gallery down the bottom, and a split-panel slider.

      And then here's the result again that I was showing you. So here you can see that preview panel down the bottom. I can click on different previews. I've got that split panel, so I can preview things and compare the results. And then finally, I've added that Render button option: I can set up my scene, change the styling with some natural language, and click on the Render button. And it's going to send all of those layers to my AI workflow and generate this rendered result.

      So let's quickly take a look at the architectural diagram behind this. In a nutshell, the APS viewer has an on-click render event. It sends the request to some GPU servers that run the Comfy.ICU API, and that generates a final rendering. Specifically, when I click on Render, I'm capturing those three layers: the depth, the normal, and the RGB.

      I send that to the Comfy.ICU API with my workflow, a serverless GPU runs the job, and it renders an output and sends it via a signed URL to my OSS storage. From there, my viewer pulls that result from OSS storage into the preview panel.

      Now, I keep talking about Comfy.ICU. What is that? Well, I mentioned that ComfyUI can run locally on my laptop, but I can also run it in the cloud. And that's what I've been showing you here.

      This is Comfy.ICU. It also happens to have an API. This is the API--it's a simple POST job request. You provide the node graph workflow, which is a JSON file, you provide all of the signed URL links to the input images, and you provide a signed URL for the result image that ComfyUI will save the final result to. And that's it.
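
      In Python, such a job submission could look roughly like the sketch below. The endpoint, header, and field names here are placeholders based only on the description above--check the Comfy.ICU documentation for the actual API shape.

      import json
      import requests

      API_URL = "https://comfy.icu/api/v1/workflows/<WORKFLOW_ID>/runs"   # placeholder endpoint
      API_KEY = "<your-api-key>"

      # The node graph workflow, exported from ComfyUI as JSON.
      with open("render_workflow_api.json") as f:
          workflow = json.load(f)

      payload = {
          "prompt": workflow,                        # the workflow JSON
          "files": {                                 # signed URLs for the captured layers
              "rgb.png": "<signed-url-to-rgb>",
              "depth.png": "<signed-url-to-depth>",
              "normal.png": "<signed-url-to-normal>",
          },
          "output_url": "<signed-url-for-result>",   # where the rendered image gets written
      }

      resp = requests.post(API_URL, headers={"Authorization": f"Bearer {API_KEY}"}, json=payload)
      resp.raise_for_status()
      print(resp.json())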

      Here's a quick code walkthrough of where you can find the code. So if you go to-- there we go. Here you can see the GitHub repo for our APS viewer AI renderer. You'll go to the wwwroot folder, for example, and specifically, go to the image render extension. And here you can see that fetch request that I mentioned, and also how to capture the canvas--for example, with getScreenShot--and how to add that create-toolbar method. And that's how you can get started with using the APS viewer and AI rendering.

      So I've been showing a lot of AEC renderings, but the AI models have also been trained on a large corpus of images, so they can render a lot of other things, too. For example, let's say I have this chemical plant, which is very difficult with all the detail. It can produce results like this.

      Or say, a remote control car like this one--it can render out manufactured models like this, make it fancy like this, or even upscale things to super resolution like this, which is 16K by 8K. Now, that would normally crash my GPU. But thanks to upscaling techniques that use tiling, we can upscale to very high resolutions.
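
      One small piece of that tiling idea can be sketched with diffusers: refine an upscaled image with img2img while decoding the VAE in tiles, so a large output doesn't exhaust GPU memory. Full tiled upscalers (for example, the Ultimate SD Upscale custom node in ComfyUI) also tile the diffusion pass itself; the resolutions and file names below are just examples.

      import torch
      from PIL import Image
      from diffusers import StableDiffusionImg2ImgPipeline

      pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
          "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
      ).to("cuda")
      pipe.vae.enable_tiling()   # decode the latent in tiles instead of all at once

      low_res = Image.open("render_1024.png")
      target = low_res.resize((2048, 2048))   # naive resize; the pass below adds detail back

      image = pipe(
          prompt="modern office interior, highly detailed, photorealistic",
          image=target,
          strength=0.3,            # low strength: keep the structure, refine the detail
          num_inference_steps=30,
      ).images[0]
      image.save("render_2048.png")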

      And also, there are new AI models coming. There's a new AI model called Flux that's come from Black Forest Labs, some of the ex-engineering folks from Stability AI. And this new model can handle and understand text prompt semantics much better than before. It can also generate higher resolution and higher quality results because it has a greater understanding of the scene.

      Now, this field is changing rapidly. So when I first started on this, about 10 months ago, Stable Diffusion 1.5 was in full effect and Stable Diffusion XL had just been released. And now Flux has come in. And with each version comes a new node graph setup, which is a little bit tricky to learn.

      And these latest Flux models require a lot more memory, so you'll need a bigger GPU--a much bigger GPU--to make some of these things work. The good thing, though, is that because these models understand more--they have a 16-channel VAE, for example--they can produce much higher-quality results. You can also take advantage of the training: they've been trained on specific resolutions, so you can now get unusual aspect ratios.

      All right, so that's future directions. Now, here's Alex to wrap up with the pros and cons of AI rendering in our workflows. Over to you, Alex.

      ALEXANDRE PIRO: Wow, thank you, Michael, for this very impressive deep dive into the details, with promising new AI capabilities. In this session, we explored how the APS viewer can be used to view and edit layouts through a practical customer case study. We compared various traditional rendering workflows, evaluating the performance, quality, cost, and potential for automation. This helped us understand the trade-offs between different approaches and how they fit different project needs.

      AI rendering is disrupting our traditional rendering workflows. On one hand, we have traditional rendering, which provides exactly what you asked for--it has accuracy, but it takes a lot of manual tweaking to get that result. On the other hand, you have AI creativity to help you generate hundreds of options with a text prompt, but it lacks accuracy. So we are in a hybrid mode where we pick the right tool for the job. We want to keep the best of both worlds.

      Now, we welcome you to dive in and explore these tools yourself. Check the handout to get access to the code samples and some useful information to get started with ComfyUI. And we also provided some rendering results so you can see what you can expect from AI. By experimenting, you will be able to find the right tools to improve your workflows. So do not hesitate--try AI rendering today.

      Thank you for attending this session. And thank you, Michael. And remember to always thank AI. So thank you, Clippy.