Different platforms have vastly different performance capabilities; a high-end PC GPU can handle much more in terms of graphics and shadersA program that runs on the GPU. More info
See in Glossary than a low-end mobile GPU. The same is true even on a single platform; a fast GPU is dozens of times faster than a slow integrated GPU.
GPU performance on mobile platforms and low-end PCs is likely to be much lower than on your development machine. It’s recommended that you manually optimize your shaders to reduce calculations and texture reads, in order to get good performance across low-end GPU machines. For example, some built-in Shader objectsAn instance of the Shader class, a Shader object is container for shader programs and GPU instructions, and information that tells Unity how to use them. Use them with materials to determine the appearance of your scene. More info
See in Glossary have “mobile” equivalents that are much faster, but have some limitations or approximations.
This page contains information on optimizing your shaders for runtime performance.
The more computations and processing your shader code needs to do, the more it will impact the performance of your game. For example, supporting color per material is nice to make a shader more flexible, but if you always leave that color set to white then useless computations are performed for each vertex or pixelThe smallest unit in a computer image. Pixel size depends on your screen resolution. Pixel lighting is calculated at every screen pixel. More info
See in Glossary rendered on screen.
The frequency of computations will also impact the performance of your game. Usually there are many more pixels rendered (and subsequently more pixel shader executions) than there are vertices (vertex shader executions), and more vertices than objects being rendered. Where possible, move computations out of the pixel shader code into the the vertex shaderA program that runs on each vertex of a 3D model when the model is being rendered. More info
See in Glossary code, or move them out of shaders completely and set the values in a script.
When writing shaders in Cg/HLSL, there are three basic number types: float
, half
and fixed
(see Data Types and Precision).
For good performance, always use the lowest precision that is possible. This is especially important on lower-end hardware. Good rules of thumb are:
float
precision.half
precision. Increase only if necessary.fixed
precision.In practice, exactly which number type you should use for depends on the platform and the GPU. Generally speaking:
float
precision, so float/half/fixed
end up being exactly the same underneath. This can make testing difficult, as it’s harder to see if half/fixed precision is really enough, so always test your shaders on the target device for accurate results.half
precision support. This is usually faster, and uses less power to do calculations.Fixed
precision is generally only useful for older mobile GPUs. Most modern GPUs (the ones that can run OpenGL ES 3 or Metal) internally treat fixed
and half
precision exactly the same.See Data Types and Precision for more details.
Transcendental mathematical functions (such as pow
, exp
, log
, cos
, sin
, tan
) are quite resource-intensive, so avoid using them where possible on low-end hardware. Consider using lookup textures as an alternative to complex math calculations if applicable.
Avoid writing your own operations (such as normalize
, dot
, inversesqrt
). Unity’s built-in options ensure that the driver can generate much better code. Remember that the Alpha Test (discard
) operation often makes your fragment shader slower.
Surface ShadersA streamlined way of writing shaders for the Built-in Render Pipeline. More info
See in Glossary are great for writing shaders that interact with lighting. However, their default options are tuned to cover a broad number of general cases. Tweak these for specific situations to make shaders run faster or at least be smaller:
approxview
directive for shaders that use view direction (i.e. Specular) makes the view direction normalized per vertex instead of per pixel. This is approximate, but often good enough.halfasview
for Specular shader types is even faster. The half-vector (halfway between lighting direction and view vector) is computed and normalized per vertex, and the lighting function receives the half-vector as a parameter instead of the view vector.noforwardadd
makes a shader fully support one-directional light in Forward renderingA rendering path that renders each object in one or more passes, depending on lights that affect the object. Lights themselves are also treated differently by Forward Rendering, depending on their settings and intensity. More infonoambient
disables ambient lighting and spherical harmonics lights on a shader. This can make performance slightly faster.The fixed-function AlphaTest - or its programmable equivalent, clip()
- has different performance characteristics on different platforms:
On some platforms (mostly mobile GPUs found in iOS and Android devices), using ColorMask to leave out some channels (e.g. ColorMask RGB
) can be resource-intensive, so only use it if really necessary.