02: Findings Summary
Version 2.0, Updated March 2023 using Octane 2022.1 and C4D 2023.1.3
About This Guide
This is part of an ongoing series on the exciting world of resource management!
The goal of this series is to explore what C4D and Octane are doing under the hood, and how to tune our system resources and habits to make our workflow as zen and frustration-free as possible. Part 01 is an overview of how the whole system works. Part 02 (this one) is a rollup of all the findings of Part 03 and beyond, which will each be a deep-dive into a particular area of interest.
Guides in this series
- 01: Overview
- 02: Findings Summary <--This one
- 03: Polygons
- 04: Instances (coming soon)
- 05: Textures (coming soon)
- 06: Displacement (coming soon)
This guide is also available in PDF format here.
This guide will serve as a running summary of the takeaways learned in the other parts of the series, and will be updated every time a new part comes out, or something new is discovered.
We’ll be using a 2021 Razer Blade 15 for these guides. Specs: i9-11900H CPU/64GB RAM/RTX3080 Laptop GPU/16GB VRAM/4K 60Hz OLED. OS: Win 11 Home 22H2. Cinema 4D 2023.1/Octane Render 2022.1
Test scenes are 1280x1280, 30FPS. All unnecessary apps and processes have been shut down.
The Gold Standard
An Octane render is at its absolute best when all of the scene data (polygons, textures, instances, overhead, and everything else) fits completely in VRAM, and the system RAM does not fill up to 100% while processing.
C4D’s performance is at its absolute best when the frames per second reported in the viewport does not drop below the target FPS of the scene. This is covered in detail at the end of the Intro guide for this series.
When we first launch C4D (assuming we remembered to shut everything else down), we have ~12.5 GB out of 16 GB of VRAM to use for Octane, ~57 GB out of 64 GB of system RAM to use for pre-processing, and are running at 300-350 FPS in the viewport. This leaves us a lot of room to add geometry, textures, and other stuff so our render isn’t just a white square.
II. Findings from the polygons guide
A more detailed breakdown of this information can be found in the Polygons Guide in this series.
- In C4D, one million quads takes about 70 MB (0.07GB) of RAM.
- Octane can only use triangles, not quads or n-gons.
- Two million triangles use 117 MB of RAM in C4D.
- Octane uses ~360 MB (0.36 GB) of VRAM for the same two million tris.
- 10-16 million quads is where performance starts to take a noticeable hit - time to start looking into optimizing.
- ~32 million baked quads or ~40 million subdivided quads fills the VRAM of this test machine.
- Out-of-core memory is much slower, but allows us to render even more polygons if necessary.
In C4D, polygons are either baked (editable) and stored in a C4D file, or generated on the fly by generators such as primitives (spheres, cubes, etc.), NURBS objects (Sweeps, Lathes, etc.), or other stuff like Subdivision Surface objects. All generated polygons are baked as part of the pre-processing phase of a render.
Important: C4D recognizes triangles, quads (4-sided) and n-gon (n-sided) polygons, where Octane only recognizes triangles, so all geometry is converted to triangles prior to being sent to Octane to render.
C4D’s Subdivision Surface object creates the newly subdivided polygons and stores them in RAM, which allows us to see the results of the subdivided mesh in the viewport, but takes up a little more RAM and VRAM when processing.
Octane’s Object Tag allows for subdividing at render time, which lets C4D take up less VRAM during the render, and keeps the viewport FPS up, but does not show what the mesh will really look like until the scene is rendered.
Polygons and Memory
Ctrl-I gives a good idea of how much RAM the polygon objects in the scene will use. To see the amount of VRAM used (after a render), we can go to Octane Settings>Settings Tab>Device Settings.
The same number of polygons takes up less RAM than VRAM, but we want to keep them in VRAM (not push them to Out-of-core RAM) during the render to make sure the render is as fast as possible.
0.37 GB per 1,000,000 quads (2M tris) is a good guideline for estimating VRAM usage.
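As a rough sketch of how that guideline can be used for planning, the snippet below turns a quad count into an estimated VRAM footprint. The 0.37 GB figure and the ~12.5 GB of free VRAM come from this guide's test machine. The function names are made up for illustration, and real usage will vary with textures, instances, and overhead.

```python
# Rule of thumb from this guide: ~0.37 GB of VRAM per million quads (2M tris).
VRAM_GB_PER_MILLION_QUADS = 0.37
# VRAM left for Octane on the test machine right after launching C4D.
AVAILABLE_VRAM_GB = 12.5


def estimate_vram_gb(quads: int) -> float:
    """Estimate the VRAM used by geometry alone, in GB (hypothetical helper)."""
    return quads / 1_000_000 * VRAM_GB_PER_MILLION_QUADS


def max_quads_for_budget(budget_gb: float = AVAILABLE_VRAM_GB) -> int:
    """Largest quad count whose estimated VRAM still fits in the budget."""
    return int(budget_gb / VRAM_GB_PER_MILLION_QUADS * 1_000_000)


if __name__ == "__main__":
    for quads in (1_000_000, 16_000_000, 32_000_000):
        print(f"{quads:>12,} quads -> ~{estimate_vram_gb(quads):.2f} GB VRAM")
    print(f"A {AVAILABLE_VRAM_GB} GB budget holds ~{max_quads_for_budget():,} quads")
```

With these numbers, 32 million quads works out to a little under 12 GB, which is in line with the ~32 million baked quads that fill the VRAM on this machine.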
Getting a Polygon Count
Lots of polygons means lower performance, so it’s a good idea to keep an eye on how many are in the scene. Both Octane and C4D have methods for seeing how many polygons are in a scene.
In Octane, the Live Viewer has an overlay that displays the number of triangles the most recent render used. There is a more accurate count (to the triangle) in the GPU Information popup in the Help menu. The Octane Log will also display this value if the scene was rendered to the Picture Viewer.
In C4D, Ctrl-I (Cmd-I on the Mac) pops open an information window that shows how many polygons are in the scene. This isn’t super accurate for Octane because C4D treats a triangle, a quad, and an n-gon all as one polygon, so after splitting these into triangles, the number could be vastly different if it’s an all-quad or all-n-gon scene.
A Total Polycount can be turned on in the HUD (Shift-V > HUD tab). This relies on being in Polygons mode and selecting some geometry. This only works for baked polygons.
The Structure Manager will show a breakdown into Tris/Quads/N-gons for selected geometry. This also only works for baked polygons. After being in Polygons mode and selecting the baked geo, the Structure Manager can be found by going to the Attributes Manager, choosing the Mode menu, and Project Info, then clicking the Structure tab.
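To get from C4D's polygon count to something closer to Octane's triangle count, the Tris/Quads/N-gons breakdown can be converted by hand. The sketch below assumes each quad becomes two triangles and each n-sided n-gon becomes n - 2 triangles, which is the usual result of triangulating a simple polygon. The exact triangulation C4D and Octane perform may differ, and `triangle_count` is a hypothetical helper, not part of either application.

```python
from typing import Sequence


def triangle_count(tris: int, quads: int, ngon_sides: Sequence[int] = ()) -> int:
    """Estimate the triangle count after every polygon is split into triangles.

    ngon_sides holds the number of sides of each n-gon in the selection.
    Assumption: an n-sided polygon splits into n - 2 triangles.
    """
    for sides in ngon_sides:
        if sides < 3:
            raise ValueError("a polygon needs at least 3 sides")
    return tris + 2 * quads + sum(sides - 2 for sides in ngon_sides)


if __name__ == "__main__":
    # An all-quad mesh doubles its count once it is converted to triangles.
    print(triangle_count(tris=0, quads=1_000_000))
    # A mixed selection: 10 tris, 5 quads, one hexagon and one octagon.
    print(triangle_count(tris=10, quads=5, ngon_sides=[6, 8]))
```

This is why an all-quad scene reported as one million polygons in C4D shows up as roughly two million triangles in Octane.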
Tips for Using Out-of-core (OOC) memory
Avoid using the Live Viewer when OOC is needed. Instead, use a lower polygon proxy object, fewer instances, and/or lower resolution textures when doing lighting and setup, and then only turn on the final high poly geometry or high res textures when rendering to the Picture Viewer.
Before the final render using OOC memory, restart the computer, and then pause ALL extra processes (cloud storage apps, anti-malware, etc.). This frees up the maximum amount of resources to limit the amount of swapping and pre-processing time.
RTX is not enabled while using OOC memory, so this may further slow the render down.
Polygon milestones of pain
- 1 million quads (2M tris) or less: Almost no impact on the system.
- 4 million quads (8M tris): Render pre-processing time is 5-6 seconds. Everything else is still fine.
- 16 million quads (32M tris): Render pre-processing time is 20-25 seconds. RAM and VRAM don’t fill up, and the viewport and playback FPS is still fine. It’s just a matter of how long we want to wait to see something happen if the geo changes.
- Somewhere between 16-32 million quads: RAM maxes out, so things slow way down. Viewport performance drops below 60, but it’s still above 30, so it’s real-time.
- 32 million quads (64M tris): This is about the max for baked polygons. RAM maxes out, pre-processing time is high.
- 40 million quads (80M tris): This is about the max for subdivided polygons. FPS in the viewport is in danger of dropping below 30, pre-processing time is really high.
- 64 million quads (~128M tris) will actually render if we change the settings to give Octane 20GB of Out-of-core memory. This takes 13 minutes of pre-processing time before it can even start to render. Viewport response is below 30FPS, and it’s laggy and stuttery. Good as a last resort, but not recommended. Definitely render to Picture Viewer with a polycount this high.
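The milestones above can be folded into a small lookup for sanity-checking a scene before committing to it. This is only a sketch: the numbers are the ones measured on this guide's test machine, the wording of each note is condensed from the list, and `milestone_for` is a hypothetical name. A count between two milestones is reported against the next milestone up.

```python
import bisect

# (milestone in millions of quads, condensed note) from the list above.
MILESTONES = [
    (1, "Almost no impact on the system."),
    (4, "Pre-processing takes around 5-6 seconds; everything else is fine."),
    (16, "Pre-processing takes around 20-25 seconds; RAM and VRAM don't fill up."),
    (32, "RAM maxes out in this range; 32M is about the max for baked polygons."),
    (40, "About the max for subdivided polygons; viewport may drop below 30 FPS."),
    (64, "Only renders with Out-of-core memory; expect minutes of pre-processing."),
]


def milestone_for(quads: int) -> str:
    """Return the note for the first milestone at or above the given quad count."""
    bounds = [bound for bound, _ in MILESTONES]
    index = bisect.bisect_left(bounds, quads / 1_000_000)
    if index == len(MILESTONES):
        return "Beyond anything tested in this guide."
    return MILESTONES[index][1]


if __name__ == "__main__":
    for quads in (500_000, 4_000_000, 20_000_000, 100_000_000):
        print(f"{quads:>12,} quads: {milestone_for(quads)}")
```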
III. Findings from the instances guide
A more detailed breakdown of this information can be found in the Instances Guide in this series.
Instances are procedural copies of objects. These are usually found in Cloners and referred to as “clones”, but C4D has a standalone Instance Object which is essentially a single clone that’s not attached to an internal Matrix system.
Cinema 4D has three flavors of instances that can be chosen in a Cloner or Instance object: regular ol’ Instance, Render Instance, and Multi-instance. Octane doesn’t play well with Render Instance, so we’re going to ignore it here.
Regular Instances are similar to making copies by hand. Each one consumes the same amount of RAM and VRAM, and has the same impact on the viewport as the original source object. This means we’re limited in the number we can have due to how C4D handles individual objects before our system grinds to a halt. Most of what we can do to a regular geometry object, we can do to an instance (deformation, animation timing, texturing, etc.).
Multi-instances exist as a system which is treated as a single object. This gets around C4D’s limitation of the number of objects in a scene, but at the cost of versatility when it comes to individual animation, deformation, and texturing. The source object is only loaded once into RAM and VRAM, making it efficient with lots of high poly objects. Very large systems (millions of instances) will start to take a while to build and pre-process before a render, and have a negative impact on the viewport performance.
Octane Scatter is Octane’s native instance system. It works mostly like Multi-instance, but there are some key differences that make it better when visualizing huge systems (millions of instances +). The Display Mode actually makes a difference in whether it acts like regular Instances or Multi-Instance, and how fluid the viewport experience is. Since it doesn’t have a built-in grid system, it relies on the Matrix Object if a regular 3D grid is needed, and the Matrix itself can start slowing things down with lots of instances.
When to use what
We can get away with regular Instances when we just have a few hundred low-to-medium poly instances that don’t really impact our performance much. We need to use this mode if we want to deform the instances individually (say, set up a twist deformer that only affects certain clones as they get closer to it), or if our source geometry has keyframed animation in it that we want to offset per clone. There are also cases that we’ll run into where texturing doesn’t work as expected in other modes, so we may have to revert to regular Instances, and optimize our setup more if that happens.
We’ll want to switch to Multi-instances if the performance starts to lag due to having a ton of clones in the scene (thousands +). We may have to make some concessions about animation offsets or deformation (or find workarounds and hacks), but our working environment will be a lot smoother and we’ll still be able to reliably time animations or view the actual object in the viewport up to tens of thousands of clones.
We’ll want to investigate Scatter if we have a system that has tens of thousands of instances and we want better viewport performance, or if we reach a point where timing an animation is crucial and Multi-instance keeps stuttering at the loop point (frame 0) when we’re playing back in the viewport. There will also be times where Shader Effectors/Fields don’t work so well in Multi-instance, but we have too many objects to use a regular Instance - it may work ok in Scatter.
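The reasoning above can be sketched as a small decision helper. The cutoffs used here (1,000 and 30,000 clones) are illustrative assumptions picked to match "a few hundred", "thousands" and "tens of thousands" on this test machine, not hard limits, and `suggest_instance_system` is a hypothetical function rather than anything built into C4D or Octane.

```python
# Illustrative cutoffs (assumptions, tune per machine and per scene).
INSTANCE_LIMIT = 1_000         # "a few hundred" regular Instances are fine
MULTI_INSTANCE_LIMIT = 30_000  # frame 0 rebuild becomes noticeable around here


def suggest_instance_system(
    count: int,
    per_clone_deformation: bool = False,
    loop_timing_critical: bool = False,
) -> str:
    """Suggest an instancing approach for a given number of clones."""
    # Individual deformation or per-clone animation offsets need real Instances.
    if per_clone_deformation:
        return "Instance"
    if count < INSTANCE_LIMIT:
        return "Instance"
    # Scatter doesn't stutter at the loop point the way Multi-instance can.
    if loop_timing_critical or count >= MULTI_INSTANCE_LIMIT:
        return "Octane Scatter"
    return "Multi-instance"


if __name__ == "__main__":
    print(suggest_instance_system(300))
    print(suggest_instance_system(5_000))
    print(suggest_instance_system(5_000, loop_timing_critical=True))
    print(suggest_instance_system(50_000))
```

When per-clone deformation is needed on a large system, the helper still answers "Instance", which is the cue to cut the clone count or optimize the setup instead.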
Individual Object/Instance Mode Limit
Regardless of which systems are used, C4D can only handle about 20,000 individual objects of any kind in the viewport before the FPS drops below 30 on this computer. If any of those objects are made of polygons, the number drops significantly. A Cloner or Instance object in Instance mode treats each clone as an individual object, and is therefore subject to this limitation.
The following applies to both individual objects and clones in Instance mode.
3,000 visible polygon objects (up to ~3k polys apiece) drops below 30 FPS in the viewport.
10,000 visible polygon objects is about where things get too sluggish to work with (~10 FPS). It also adds 8 seconds to the pre-processing time prior to a render for a one polygon source object.
At this point, we need to start seriously considering Multi-instance or Scatter if at all possible.
20,000 visible non-polygon objects (nulls, etc) is about the limit for working in real time. These can be used as placeholders that show position and swapped for real geometry prior to a render.
30,000 hidden objects (polygon or non-polygon) should still get us about real time performance. Pre-processing time gets up to 40 seconds with a 1 polygon source object, and RAM and VRAM can quickly max out if the polycount isn’t kept low.
100,000 polygon objects puts us into the single digits for FPS, even when they’re hidden from the viewport. The ridiculous pre-processing time (several minutes) means we should avoid getting our count up this high.
C4D treats a Multi-instance system as a single object, which vastly increases the viewport and render performance. It comes at a cost - individual multi-instances can’t be individually deformed, have their properties modified or animation offset, and a lot of versatility in texturing is lost.
A Cloner also has to rebuild itself on frame 0 of the animation, so if this takes any significant amount of time, it becomes difficult to time the first few frames of animation while C4D is rebuilding. On this computer, it’s usually around the 30,000 clone mark when the Viewport Mode is set to Object and the source is fairly low poly, higher in the other modes.
Multi-instance has different Viewport Modes which are good for increasing viewport performance without affecting the render.
<30,000 multi-instances is usually a pretty fluid experience unless the source object is particularly heavy.
30,000 multi-instances is where the rebuild time will start to be noticeable and affect the FPS of the first few frames of the animation. If this timing is essential, we should consider moving to Octane Scatter where this isn’t an issue.
80,000 multi-instances is about the limit of real-time navigation (and playback after the first few frames) in the viewport if the Viewport Mode is set to Object.
---Consider setting Viewport Mode to Points---
170,000 multi-instances with the Viewport Mode set to Points is where we can no longer reliably time the first few frames of an animation because of the rebuild time at frame 0.
400,000 multi-instances drops us below real time navigation if the Viewport Mode is set to Points.
---Consider setting Viewport Mode to None---
2,000,000-5,000,000 multi-instances is where viewport lag is bad enough that we’re going to want to switch the Viewport Mode to None. Navigation FPS in this mode is never an issue since it’s always just one hidden object. Animation FPS will continue to suffer more and more at the loop point as we add more clones from here.
10,000,000 multi-instances is where we start seeing significant pre-processing time.
15,000,000 multi-instances takes 40 seconds to redraw, 30 seconds of pre-processing, and gets close to maxing out the RAM during a render.
25,000,000 multi-instances is probably our upper working limit. It takes a minute to redraw, and 1:10 of pre-processing time. VRAM is at nearly 9GB at this point without any extra geometry, textures, etc.
25,000,000-30,000,000 multi-instances is where Octane stops being able to render the scene.
40,000,000 multi-instances is where C4D runs out of memory.
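As a quick reference, the Viewport Mode thresholds above can be expressed as a lookup. This sketch picks the most detailed mode that should still navigate in real time on this test machine, and also reports whether the first few frames can still be timed reliably. The function name is hypothetical and the numbers only apply to a fairly low poly source object. Points mode remains usable well past 400,000 if some lag is acceptable.

```python
from typing import Tuple


def multi_instance_viewport_mode(count: int) -> Tuple[str, bool]:
    """Return (viewport mode, whether frame 0 timing is still reliable)."""
    if count <= 80_000:
        # Rebuild time at frame 0 starts to be noticeable around 30,000.
        return "Object", count < 30_000
    if count < 400_000:
        # In Points mode, frame 0 timing becomes unreliable around 170,000.
        return "Points", count < 170_000
    # Navigation is never an issue in None, but the loop point keeps suffering.
    return "None", False


if __name__ == "__main__":
    for count in (10_000, 50_000, 100_000, 200_000, 3_000_000):
        mode, timing_ok = multi_instance_viewport_mode(count)
        print(f"{count:>10,} clones: mode={mode}, frame 0 timing ok={timing_ok}")
```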
Scatter is Octane’s native instancing system. Most of the time it’s used in a similar fashion to the Cloner in Multi-instance mode, but it can be set up to work more like Instance mode as well.
Things to consider with Scatter
Scatter does not have a built-in grid system like the Cloner. It’s most efficient with the Distribution Type set to Surface, using a low-poly piece of geometry to spread instances across. If a grid pattern is desired, it works with a C4D Matrix object using the Vertex Distribution type. With huge systems, the Matrix can consume resources and slow everything down.
Scatter’s Display Mode drastically impacts performance. Line mode (default) is the most efficient and can display the most instances in the viewport before slowing down below real time (~200k on this test machine). Box, Circle, and Sphere modes can display progressively fewer instances, but show more information about the source object. In Object mode, Scatter acts like an Instance Cloner. Only 3,000 or so can be shown, but they can be individually deformed.
Display Rate is one of Octane’s super powers. The Cloner in Multi-instance mode can only display 100,000 points or so before we need to turn off viewport feedback altogether. Scatter can display about 200,000 lines, but if we have a system with say, two million instances, we can set the display rate to 10%, meaning only one in ten instances are shown for a total of 200k lines again. Two million instances will still render, but now we can at least see some sort of representation of where the instances are.
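The Display Rate arithmetic is simple enough to sketch: divide the number of instances the viewport can comfortably show by the total number of instances. The per-mode limits below are the approximate 30 FPS figures this guide reports for the test machine, and `display_rate_percent` is a hypothetical helper. For very large systems, the measured rates in this guide (2.5% at 5 million, 0.7% at 10 million) come out lower than this formula suggests, so the result is best treated as a starting point to dial down from.

```python
# Approximate max visible instances for 30 FPS on this guide's test machine.
MAX_VISIBLE = {
    "Object": 3_000,
    "Sphere": 20_000,
    "Circle": 25_000,
    "Box": 100_000,
    "Line": 200_000,
}


def display_rate_percent(total_instances: int, display_mode: str = "Line") -> float:
    """Display Rate (in percent) that keeps visible instances near the mode's limit."""
    limit = MAX_VISIBLE[display_mode]
    if total_instances <= limit:
        return 100.0  # everything can be shown
    return 100.0 * limit / total_instances


if __name__ == "__main__":
    # Two million instances in Line mode: show one in ten.
    print(display_rate_percent(2_000_000, "Line"))
    print(display_rate_percent(2_000_000, "Box"))
    print(display_rate_percent(50_000, "Line"))
```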
Build time with Scatter can take longer initially, but it doesn’t have to redraw when the animation loops around to frame 0 like Multi-instance. That combined with the Display Rate makes it a much better candidate for visualizing and timing the animation of very large systems.
Scatter doesn’t hide the source geometry (the thing being cloned) or the Surface (the thing it’s putting the clones on) from the viewport. They can be hidden via the traffic lights which works fine in most Display Modes, but it ends up hiding all the clones in Object mode. When Object mode is being used, just make the Y or Z coordinate of the source object some crazy high number to push it off the viewport.
When using a Matrix object as our Surface, things can get unstable over 5 million instances or so. To mitigate crashes, this is the best order to do things when we need to change the number of instances or the size, or something else related to the Matrix.
- Stop the Octane Live Viewer
- Hit the green check next to the Scatter to disable it.
- Hit the green check next to the Matrix to disable it.
- Adjust the Matrix count or size, or whatever.
- If any significant number of matrices were added, consider lowering the Display Rate in the Scatter
- Re-enable the Matrix.
- Re-enable the Scatter.
3,000 Scatter Instances is about the max for real time performance when the Display Mode is set to Object.
--- Adjust the Display Rate to get the number of visible instances close to these limits ---
20,000 Scatter Instances is about the max for 30 FPS using Display Mode: Sphere
25,000 is the max visible instances for 30 FPS using Display Mode: Circle
100,000 is the max visible instances for 30 FPS using Display Mode: Box
200,000 is the max visible instances for 30 FPS using Display Mode: Line
5,000,000 instances is where the Matrix starts to get unwieldy. Follow the steps above to avoid crashes. Build time for Scatter is about 17 seconds, 30 FPS is achievable with a Display Rate of 2.5%.
10,000,000 instances takes a full minute to build on this machine. 30 FPS at 0.7%. Pre-processing is at 40 seconds. VRAM is at 4.46 GB
15,000,000 instances takes 2:30 to build. Pre-processing is at 40 seconds, VRAM is at 5.6 GB
25,000,000 instances is about the upper limit. It takes 3:10 to build the Scatter (and more for the Matrix). Pre-processing is at 50 seconds, and VRAM is at 9.9 GB. RAM starts to max out as well. 30FPS can still be had at 0.1% Display Rate.
30,000,000 instances won’t render.
At 25,000,000 multi-instances, the system still isn’t maxing out the VRAM since Octane only needs to load the source geometry into memory once. That means the source object can have a lot more polygons. The upper limit appears to be about 13.5 million triangles per instance for a grand total of 338 trillion polygons in the render.
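The grand total is just the instance count multiplied by the triangles in the source object. The sketch below reproduces that arithmetic and, as a rough cross-check, estimates the VRAM the source object alone would need using the 0.37 GB per million quads rule of thumb from the polygons findings. Treating two triangles as one quad for that estimate is an assumption, and the function names are hypothetical.

```python
VRAM_GB_PER_MILLION_QUADS = 0.37  # rule of thumb from the polygons findings


def total_rendered_triangles(instances: int, tris_per_instance: int) -> int:
    """Triangles in the final render when every instance shares one source."""
    return instances * tris_per_instance


def source_vram_gb(tris_per_instance: int) -> float:
    """Estimated VRAM for the single source object (assumes 2 tris per quad)."""
    quads = tris_per_instance / 2
    return quads / 1_000_000 * VRAM_GB_PER_MILLION_QUADS


if __name__ == "__main__":
    total = total_rendered_triangles(25_000_000, 13_500_000)
    print(f"{total:,} triangles (~{total / 1e12:.1f} trillion)")
    print(f"Source object: ~{source_vram_gb(13_500_000):.1f} GB of VRAM")
```

The product is 337.5 trillion, which rounds to the 338 trillion quoted above. The source object is only loaded once, so its estimated footprint sits on top of whatever the instances themselves take up.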
That’s it for right now, but this will get fleshed out more as more guides are written in this series, so stay tuned!
OG029 Resource Management: Findings, version 1.0, Last modified January 2023.
This guide originally appeared on https://be.net/scottbenson and https://help.otoy.com/hc/en-us/articles/212549326-OctaneRender-for-CINEMA-4D-Cheatsheet
All rights reserved.
The written guide may be distributed freely and can be used for personal or professional training, but not modified or sold. The assets distributed within this guide are either generated specifically for this guide and released as cc0, or sourced from cc0 sites, so they may be used for any reason, personal or commercial. The emoji font used here is Noto Color Emoji.