A game is a mathematical model of a virtual world simulated in real time on a computer of some kind. Therefore, mathematics pervades everything we do in the game industry. Game programmers make use of virtually all branches of mathematics, from trigonometry to algebra to statistics to calculus. However, by far the most prevalent kind of mathematics you’ll be doing as a game programmer is 3D vector and matrix math (i.e., 3D linear algebra).
游戏是在某种计算机上实时模拟的虚拟世界的数学模型。因此,数学渗透到我们在游戏行业所做的一切。游戏程序员几乎利用了数学的所有分支,从三角学到代数,从统计学到微积分。然而,到目前为止,作为游戏程序员,最流行的数学类型是 3D 向量和矩阵数学(即 3D 线性代数)。
Even this one branch of mathematics is very broad and very deep, so we cannot hope to cover it in any great depth in a single chapter. Instead, I will attempt to provide an overview of the mathematical tools needed by a typical game programmer. Along the way, I’ll offer some tips and tricks, which should help you keep all of the rather confusing concepts and rules straight in your head. For an excellent in-depth coverage of 3D math for games, I highly recommend Eric Lengyel’s book [32] on the topic. Chapter 3 of Christer Ericson’s book [14] on real-time collision detection is also an excellent resource.
即使数学的这一分支也非常广泛和深刻,因此我们不能指望在一章中详细介绍它。相反,我将尝试概述典型游戏程序员所需的数学工具。在此过程中,我将提供一些提示和技巧,这应该可以帮助您将所有相当混乱的概念和规则牢记在心。对于游戏 3D 数学的精彩深入报道,我强烈推荐 Eric Lengyel 关于该主题的书 [32]。 Christer Ericson 关于实时碰撞检测的书 [14] 第 3 章也是一个很好的资源。
Many of the mathematical operations we’re going to learn about in the following chapter work equally well in 2D and 3D. This is very good news, because it means you can sometimes solve a 3D vector problem by thinking and drawing pictures in 2D (which is considerably easier to do!) Sadly, this equivalence between 2D and 3D does not hold all the time. Some operations, like the cross product, are only defined in 3D, and some problems only make sense when all three dimensions are considered. Nonetheless, it almost never hurts to start by thinking about a simplified two-dimensional version of the problem at hand. Once you understand the solution in 2D, you can think about how the problem extends into three dimensions. In some cases, you’ll happily discover that your 2D result works in 3D as well. In others, you’ll be able to find a coordinate system in which the problem really is two-dimensional. In this book, we’ll employ two-dimensional diagrams wherever the distinction between 2D and 3D is not relevant.
我们将在下一章中学习的许多数学运算在 2D 和 3D 中同样有效。这是一个非常好的消息,因为这意味着您有时可以通过在 2D 中思考和绘制图片来解决 3D 矢量问题(这要容易得多!)遗憾的是,2D 和 3D 之间的这种等价性并不总是成立。有些运算(例如叉积)仅在 3D 中定义,而有些问题只有在考虑所有三个维度时才有意义。尽管如此,从考虑手头问题的简化二维版本开始几乎没有什么坏处。一旦你理解了二维的解决方案,你就可以思考问题如何扩展到三个维度。在某些情况下,您会很高兴地发现 2D 结果也适用于 3D。在其他情况下,您将能够找到一个坐标系,其中问题实际上是二维的。在本书中,只要 2D 和 3D 之间的区别不相关,我们将使用二维图。
The majority of modern 3D games are made up of three-dimensional objects in a virtual world. A game engine needs to keep track of the positions, orientations and scales of all these objects, animate them in the game world, and transform them into screen space so they can be rendered on screen. In games, 3D objects are almost always made up of triangles, the vertices of which are represented by points. So, before we learn how to represent whole objects in a game engine, let’s first take a look at the point and its closely related cousin, the vector.
大多数现代 3D 游戏都是由虚拟世界中的三维物体组成。游戏引擎需要跟踪所有这些对象的位置、方向和比例,在游戏世界中为它们设置动画,并将它们转换到屏幕空间,以便可以在屏幕上渲染它们。在游戏中,3D 对象几乎总是由三角形组成,三角形的顶点由点表示。因此,在我们学习如何在游戏引擎中表示整个对象之前,让我们首先看一下点及其密切相关的表亲——向量。
Technically speaking, a point is a location in n-dimensional space. (In games, n is usually equal to 2 or 3.) The Cartesian coordinate system is by far the most common coordinate system employed by game programmers. It uses two or three mutually perpendicular axes to specify a position in 2D or 3D space. So, a point P is represented by a pair or triple of real numbers, (P_x, P_y) or (P_x, P_y, P_z) (see Figure 5.1).
从技术上讲,点是 n 维空间中的一个位置。(在游戏中,n 通常等于 2 或 3。)笛卡尔坐标系是迄今为止游戏程序员最常用的坐标系。它使用两个或三个相互垂直的轴来指定 2D 或 3D 空间中的位置。因此,点 P 由一对或三个实数表示:(P_x, P_y) 或 (P_x, P_y, P_z)(见图 5.1)。
Of course, the Cartesian coordinate system is not our only choice. Some other common systems include:
当然,笛卡尔坐标系并不是我们唯一的选择。其他一些常见系统包括:
Cartesian coordinates are by far the most widely used coordinate system in game programming. However, always remember to select the coordinate system that best maps to the problem at hand. For example, in the game Crank the Weasel by Midway Home Entertainment, the main character Crank runs around an art-deco city picking up loot. I wanted to make the items of loot swirl around Crank’s body in a spiral, getting closer and closer to him until they disappeared. I represented the position of the loot in cylindrical coordinates relative to the Crank character’s current position. To implement the spiral animation, I simply gave the loot a constant angular speed in θ, a small constant linear speed inward along its radial axis r, and a very slight constant linear speed upward along the h-axis so the loot would gradually rise up to the level of Crank’s pants pockets. This extremely simple animation looked great, and it was much easier to model using cylindrical coordinates than it would have been using a Cartesian system.
笛卡尔坐标是迄今为止游戏编程中使用最广泛的坐标系。但是,请始终记住选择最适合当前问题的坐标系。例如,在 Midway Home Entertainment 的游戏《Crank the Weasel》中,主角 Crank 在一座装饰艺术城市中奔跑,捡拾战利品。我想让战利品在 Crank 的身体周围呈螺旋状旋转,离他越来越近,直到消失。我用柱坐标表示战利品相对于 Crank 角色当前位置的位置。为了实现螺旋动画,我简单地给了战利品一个沿 θ 方向的恒定角速度,一个沿着其径向轴 r 向内的小恒定线速度,以及一个沿着 h 轴向上的非常小的恒定线速度,这样战利品就会逐渐上升到 Crank 裤子口袋的高度。这个极其简单的动画看起来很棒,而且使用柱坐标系建模比使用笛卡尔坐标系要容易得多。
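The spiral described above can be sketched in a few lines. This is only an illustration, not the code used in the game: the function names, the three speed values, and the choice of y as the up axis are all assumptions made for the example.

```python
import math

def cylindrical_to_cartesian(r, theta, h):
    """Convert cylindrical (r, theta, h) to Cartesian (x, y, z).

    Assumes y is the up axis, so the height h maps to y.
    """
    return (r * math.cos(theta), h, r * math.sin(theta))

def step_loot(r, theta, h, dt,
              angular_speed=6.0,   # radians per second (hypothetical value)
              inward_speed=0.5,    # units per second toward the character (hypothetical)
              rise_speed=0.1):     # units per second upward (hypothetical)
    """Advance one loot item by dt seconds using three constant speeds."""
    theta += angular_speed * dt
    r = max(0.0, r - inward_speed * dt)  # never overshoot the central axis
    h += rise_speed * dt
    return r, theta, h

# Simulate two seconds at 30 frames per second.
r, theta, h = 2.0, 0.0, 0.0
for _ in range(60):
    r, theta, h = step_loot(r, theta, h, dt=1.0 / 30.0)

# Offset of the loot relative to the character's current position.
offset = cylindrical_to_cartesian(r, theta, h)
```

Adding `offset` to the character's world-space position each frame would keep the spiral centered on the character as he moves.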
In three-dimensional Cartesian coordinates, we have two choices when arranging our three mutually perpendicular axes: right-handed (RH) and left-handed (LH). In a right-handed coordinate system, when you curl the fingers of your right hand around the z-axis with the thumb pointing toward positive z coordinates, your fingers point from the x-axis toward the y-axis. In a left-handed coordinate system the same thing is true using your left hand.
在三维笛卡尔坐标中,我们在排列三个相互垂直的轴时有两种选择:右手(RH)和左手(LH)。在右手坐标系中,当您将右手手指绕 z 轴卷曲且拇指指向正 z 坐标时,您的手指从 x 轴指向 y 轴。在左手坐标系中,使用左手也是如此。
The only difference between a left-handed coordinate system and a right-handed coordinate system is the direction in which one of the three axes is pointing. For example, if the y-axis points upward and x points to the right, then z comes toward us (out of the page) in a right-handed system, and away from us (into the page) in a left-handed system. Left- and right-handed Cartesian coordinate systems are depicted in Figure 5.4.
左手坐标系和右手坐标系之间的唯一区别是三个轴之一指向的方向。例如,如果 y 轴指向上方,x 指向右侧,则在右手系统中 z 朝向我们(离开页面),而在左手系统中远离我们(进入页面) 。左手和右手笛卡尔坐标系如图 5.4 所示。
It is easy to convert from left-handed to right-handed coordinates and vice versa. We simply flip the direction of any one axis, leaving the other two axes alone. It’s important to remember that the rules of mathematics do not change between left-handed and right-handed coordinate systems. Only our interpretation of the numbers—our mental image of how the numbers map into 3D space—changes. Left-handed and right-handed conventions apply to visualization only, not to the underlying mathematics. (Actually, handedness does matter when dealing with cross products in physical simulations, because a cross product is not actually a vector—it’s a special mathematical object known as a pseudovector. We’ll discuss pseudovectors in a little more depth in Section 5.2.4.9.)
从左手坐标转换为右手坐标很容易,反之亦然。我们只需翻转任意一个轴的方向,而保留其他两个轴。重要的是要记住,数学规则在左手坐标系和右手坐标系之间不会改变。只是我们对数字的解释(我们对数字如何映射到 3D 空间的心理图像)发生了变化。左手和右手约定仅适用于可视化,不适用于基础数学。 (实际上,在物理模拟中处理叉积时,惯用手确实很重要,因为叉积实际上并不是一个向量,它是一个称为伪向量的特殊数学对象。我们将在第 5.2.4.9 节中更深入地讨论伪向量.)
The mapping between the numerical representation and the visual representation is entirely up to us as mathematicians and programmers. We could choose to have the y-axis pointing up, with z forward and x to the left (RH) or right (LH). Or we could choose to have the z-axis point up. Or the x-axis could point up instead—or down. All that matters is that we decide upon a mapping, and then stick with it consistently.
数字表示和视觉表示之间的映射完全取决于我们作为数学家和程序员。我们可以选择 y 轴朝上,z 向前,x 向左 (RH) 或向右 (LH)。或者我们可以选择将 z 轴指向上方。或者 x 轴可以指向上方或下方。重要的是我们决定一个映射,然后始终坚持下去。
That being said, some conventions do tend to work better than others for certain applications. For example, 3D graphics programmers typically work with a left-handed coordinate system, with the y-axis pointing up, x to the right and positive z pointing away from the viewer (i.e., in the direction the virtual camera is pointing). When 3D graphics are rendered onto a 2D screen using this particular coordinate system, increasing z-coordinates correspond to increasing depth into the scene (i.e., increasing distance away from the virtual camera). As we will see in subsequent chapters, this is exactly what is required when using a z-buffering scheme for depth occlusion.
话虽如此,对于某些应用程序,某些约定确实比其他约定更有效。例如,3D 图形程序员通常使用左手坐标系,其中 y 轴朝上,x 朝右,正 z 指向远离观察者的方向(即虚拟相机指向的方向)。当使用此特定坐标系将 3D 图形渲染到 2D 屏幕上时,增加 z 坐标对应于增加场景深度(即增加距虚拟相机的距离)。正如我们将在后续章节中看到的,这正是使用 z 缓冲方案进行深度遮挡时所需要的。
A vector is a quantity that has both a magnitude and a direction in n-dimensional space. A vector can be visualized as a directed line segment extending from a point called the tail to a point called the head. Contrast this to a scalar (i.e., an ordinary real-valued number), which represents a magnitude but has no direction. Usually scalars are written in italics (e.g., v) while vectors are written in boldface (e.g., v).
矢量是在 n 维空间中同时具有大小和方向的量。向量可以可视化为从称为尾部的点延伸到称为头的点的有向线段。将此与标量(即普通实数值)进行对比,标量表示大小但没有方向。通常标量以斜体书写(例如,v),而向量以粗体书写(例如,v)。
A 3D vector can be represented by a triple of scalars (x, y, z), just as a point can be. The distinction between points and vectors is actually quite subtle. Technically, a vector is just an offset relative to some known point. A vector can be moved anywhere in 3D space—as long as its magnitude and direction don’t change, it is the same vector.
3D 向量可以用三个标量 (x, y, z) 表示,就像点一样。点和向量之间的区别实际上非常微妙。从技术上讲,矢量只是相对于某个已知点的偏移。向量可以在 3D 空间中移动到任何位置——只要它的大小和方向不改变,它就是同一个向量。
A vector can be used to represent a point, provided that we fix the tail of the vector to the origin of our coordinate system. Such a vector is sometimes called a position vector or radius vector. For our purposes, we can interpret any triple of scalars as either a point or a vector, provided that we remember that a position vector is constrained such that its tail remains at the origin of the chosen coordinate system. This implies that points and vectors are treated in subtly different ways mathematically. One might say that points are absolute, while vectors are relative.
向量可以用来表示一个点,只要我们将向量的尾部固定到坐标系的原点即可。这样的矢量有时称为位置矢量或半径矢量。出于我们的目的,我们可以将任何标量三元组解释为点或向量,前提是我们记住位置向量受到约束,使其尾部保持在所选坐标系的原点处。这意味着点和向量在数学上的处理方式略有不同。有人可能会说点是绝对的,而向量是相对的。
The vast majority of game programmers use the term “vector” to refer both to points (position vectors) and to vectors in the strict linear algebra sense (purely directional vectors). Most 3D math libraries also use the term “vector” in this way. In this book, we’ll use the term “direction vector” or just “direction” when the distinction is important. Be careful to always keep the difference between points and directions clear in your mind (even if your math library doesn’t). As we’ll see in Section 5.3.6.1, directions need to be treated differently from points when converting them into homogeneous coordinates for manipulation with 4 × 4 matrices, so getting the two types of vector mixed up can and will lead to bugs in your code.
绝大多数游戏程序员使用术语“向量”来指代点(位置向量)和严格线性代数意义上的向量(纯方向向量)。大多数 3D 数学库也以这种方式使用术语“向量”。在本书中,当区别很重要时,我们将使用术语“方向向量”或仅使用“方向”。请务必始终牢记点和方向之间的差异(即使您的数学库没有区分)。正如我们将在第 5.3.6.1 节中看到的,在将方向转换为齐次坐标以使用 4 × 4 矩阵进行操作时,需要对方向与点进行不同的处理,因此混淆这两种类型的向量会导致代码中出现错误。
It is often useful to define three orthogonal unit vectors (i.e., vectors that are mutually perpendicular and each with a length equal to one), corresponding to the three principal Cartesian axes. The unit vector along the x-axis is typically called i, the y-axis unit vector is called j, and the z-axis unit vector is called k. The vectors i, j and k are sometimes called Cartesian basis vectors.
定义对应于三个笛卡尔主轴的三个正交单位向量(即相互垂直且每个长度等于 1 的向量)通常很有用。沿 x 轴的单位向量通常称为 i,y 轴单位向量称为 j,z 轴单位向量称为 k。向量 i、j 和 k 有时称为笛卡尔基向量。
Any point or vector can be expressed as a sum of scalars (real numbers) multiplied by these unit basis vectors. For example,
任何点或向量都可以表示为标量(实数)乘以这些单位基向量的总和。例如,
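As an illustration with arbitrarily chosen components (the specific numbers are an assumption for the example), a vector with components 5, 3 and −2 expands as:

```latex
(5,\, 3,\, -2) = 5\,\mathbf{i} + 3\,\mathbf{j} - 2\,\mathbf{k}
```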
Most of the mathematical operations that you can perform on scalars can be applied to vectors as well. There are also some new operations that apply only to vectors.
您可以对标量执行的大多数数学运算也可以应用于向量。还有一些仅适用于向量的新操作。
Multiplication of a vector a by a scalar s is accomplished by multiplying the individual components of a by s:
向量 a 与标量 s 的乘法是通过将 a 的各个分量乘以 s 来完成的:
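Written out component by component, assuming a = (a_x, a_y, a_z), the rule reads:

```latex
s\,\mathbf{a} = (s\,a_x,\; s\,a_y,\; s\,a_z)
```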
Multiplication by a scalar has the effect of scaling the magnitude of the vector, while leaving its direction unchanged, as shown in Figure 5.5. Multiplication by −1 flips the direction of the vector (the head becomes the tail and vice versa).
乘以标量可以缩放矢量的大小,同时保持其方向不变,如图 5.5 所示。乘以 -1 会翻转向量的方向(头部变为尾部,反之亦然)。
The scale factor can be different along each axis. We call this nonuniform scale, and it can be represented as the component-wise product of a scaling vector s and the vector in question, which we’ll denote with the ⊗ operator. Technically speaking, this special kind of product between two vectors is known as the Hadamard product. It is rarely used in the game industry—in fact, nonuniform scaling is one of its only commonplace uses in games:
每个轴的比例因子可以不同。我们称之为非均匀尺度,它可以表示为缩放向量 s 和相关向量的按分量乘积,我们用 ⊗ 运算符表示。从技术上讲,两个向量之间的这种特殊乘积称为哈达玛乘积。它很少在游戏行业中使用——事实上,非均匀缩放是它在游戏中唯一常见的用途之一:
As we’ll see in Section 5.3.7.3, a scaling vector s is really just a compact way to represent a 3 × 3 diagonal scaling matrix S. So another way to write Equation (5.1) is as follows:
正如我们将在第 5.3.7.3 节中看到的,缩放向量 s 实际上只是表示 3 × 3 对角缩放矩阵 S 的一种紧凑方式。因此,编写方程 (5.1) 的另一种方法如下:
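The two formulations can be compared directly. The sketch below assumes the component-wise product is s ⊗ a = (s_x a_x, s_y a_y, s_z a_z) and that vectors are treated as rows multiplied on the left of the matrix; for a diagonal matrix the result is the same either way. All function names and the sample values are hypothetical.

```python
def hadamard(s, a):
    """Component-wise (Hadamard) product of two 3D vectors."""
    return tuple(si * ai for si, ai in zip(s, a))

def scaling_matrix(s):
    """3x3 diagonal matrix with the scale factors on its diagonal."""
    return (
        (s[0], 0.0, 0.0),
        (0.0, s[1], 0.0),
        (0.0, 0.0, s[2]),
    )

def row_vector_times_matrix(v, m):
    """Multiply a row vector v by a 3x3 matrix m (v on the left)."""
    return tuple(sum(v[i] * m[i][j] for i in range(3)) for j in range(3))

s = (2.0, 1.0, 0.5)   # hypothetical per-axis scale factors
a = (3.0, 4.0, 8.0)   # hypothetical vector to scale

scaled_by_vector = hadamard(s, a)
scaled_by_matrix = row_vector_times_matrix(a, scaling_matrix(s))
```

Both paths produce the same nonuniformly scaled vector, which is why a scaling vector can stand in for the full diagonal matrix.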
We’ll explore matrices in more depth in Section 5.3.
我们将在 5.3 节中更深入地探讨矩阵。
The addition of two vectors a and b is defined as the vector whose components are the sums of the components of a and b. This can be visualized by placing the head of vector a onto the tail of vector b—the sum is then the vector from the tail of a to the head of b (see also Figure 5.6):
两个向量 a 和 b 的和被定义为其分量是 a 和 b 的分量之和的向量。这可以通过将向量 a 的头部放在向量 b 的尾部上来可视化 - 那么总和就是从 a 的尾部到 b 的头部的向量(另请参见图 5.6):
Vector subtraction a − b is nothing more than addition of a and −b (i.e., the result of scaling b by −1, which flips it around). This corresponds to the vector whose components are the difference between the components of a and the components of b:
向量减法 a − b 无非就是 a 和 −b 的加法(即将 b 按 −1 缩放的结果,将其翻转)。这对应于其分量是 a 分量与 b 分量之差的向量:
Vector addition and subtraction are depicted in Figure 5.6.
矢量加法和减法如图 5.6 所示。
You can add and subtract direction vectors freely. However, technically speaking, points cannot be added to one another—you can only add a direction vector to a point, the result of which is another point. Likewise, you can take the difference between two points, resulting in a direction vector. These operations are summarized below:
您可以自由地添加和减去方向向量。然而,从技术上讲,点不能相互相加——只能将方向向量与一个点相加,其结果是另一个点。同样,您可以获取两点之间的差异,从而得到方向向量。这些操作总结如下:
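One way to make these rules concrete is to give points and directions separate types, so that an invalid combination fails loudly. This is a sketch only; the `Point` and `Direction` classes are hypothetical and, as noted earlier, most math libraries use a single vector type for both.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Direction:
    x: float
    y: float
    z: float

    def __add__(self, other):
        # direction + direction = direction
        if isinstance(other, Direction):
            return Direction(self.x + other.x, self.y + other.y, self.z + other.z)
        return NotImplemented

    def __sub__(self, other):
        # direction - direction = direction
        if isinstance(other, Direction):
            return Direction(self.x - other.x, self.y - other.y, self.z - other.z)
        return NotImplemented

@dataclass(frozen=True)
class Point:
    x: float
    y: float
    z: float

    def __add__(self, other):
        # point + direction = point; point + point is deliberately unsupported
        if isinstance(other, Direction):
            return Point(self.x + other.x, self.y + other.y, self.z + other.z)
        return NotImplemented

    def __sub__(self, other):
        # point - point = direction
        if isinstance(other, Point):
            return Direction(self.x - other.x, self.y - other.y, self.z - other.z)
        # point - direction = point
        if isinstance(other, Direction):
            return Point(self.x - other.x, self.y - other.y, self.z - other.z)
        return NotImplemented

moved = Point(1.0, 2.0, 3.0) + Direction(1.0, 0.0, 0.0)
offset = Point(4.0, 5.0, 6.0) - Point(1.0, 2.0, 3.0)
```

Attempting `Point(...) + Point(...)` raises a `TypeError`, which mirrors the rule that adding two points is not meaningful.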
The magnitude of a vector is a scalar representing the length of the vector as it would be measured in 2D or 3D space. It is denoted by placing vertical bars around the vector’s boldface symbol. We can use the Pythagorean theorem to calculate a vector’s magnitude, as shown in Figure 5.7:
矢量的大小是一个标量,表示在 2D 或 3D 空间中测量的矢量长度。它通过在向量的粗体符号周围放置竖线来表示。我们可以使用毕达哥拉斯定理来计算向量的大小,如图 5.7 所示:
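For a 3D vector a = (a_x, a_y, a_z), applying the Pythagorean theorem gives the following (the 2D case simply drops the z term):

```latex
|\mathbf{a}| = \sqrt{a_x^2 + a_y^2 + a_z^2}
```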
Believe it or not, we can already solve all sorts of real-world game problems given just the vector operations we’ve learned thus far. When trying to solve a problem, we can use operations like addition, subtraction, scaling and magnitude to generate new data out of the things we already know. For example, if we have the current position vector of an AI character P1, and a vector v representing her current velocity, we can find her position on the next frame P2 by scaling the velocity vector by the frame time interval Δt, and then adding it to the current position. As shown in Figure 5.8, the resulting vector equation is P2 = P1 + v Δt. (This is known as explicit Euler integration—it’s actually only valid when the velocity is constant, but you get the idea.)
不管你相信与否,仅凭我们迄今为止学到的向量运算,我们已经可以解决现实世界中的各种游戏问题。当尝试解决问题时,我们可以使用加法、减法、缩放和幅度等运算从我们已知的事物中生成新数据。例如,如果我们有 AI 角色 P 1 的当前位置向量,以及代表她当前速度的向量 v,我们可以通过以下方式找到她在下一帧 P 2 上的位置:将速度矢量缩放帧时间间隔 Δt,然后将其添加到当前位置。如图 5.8 所示,得到的向量方程为 P 2 = P 1 + v Δt。 (这被称为显式欧拉积分——实际上只有当速度恒定时才有效,但你明白了。)
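The update P2 = P1 + v Δt translates almost directly into code. This is a minimal sketch using plain tuples; the function name `euler_step` and the sample values are assumptions for the example.

```python
def euler_step(position, velocity, dt):
    """Return the next position: P2 = P1 + v * dt (explicit Euler)."""
    return tuple(p + v * dt for p, v in zip(position, velocity))

p1 = (0.0, 0.0, 0.0)     # current position (hypothetical)
v = (3.0, 0.0, -6.0)     # current velocity in units per second (hypothetical)
p2 = euler_step(p1, v, dt=0.5)
```

In a real game loop, `dt` would be the measured frame time, and the step would be repeated once per frame.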
As another example, let’s say we have two spheres, and we want to know whether they intersect. Given that we know the center points of the two spheres, C1 and C2, we can find a direction vector between them by simply subtracting the points, d = C2 − C1. The magnitude of this vector d = |d| determines how far apart the spheres’ centers are. If this distance is less than the sum of the spheres’ radii, they are intersecting; otherwise they’re not. This is shown in Figure 5.9.
再举个例子,假设我们有两个球体,我们想知道它们是否相交。鉴于我们知道两个球体的中心点 C 1 和 C 2 ,我们可以通过简单地减去这些点来找到它们之间的方向向量 d = C 2 - C 1 。该向量的大小 d = |d|确定球体中心的距离。如果该距离小于球体半径之和,则它们相交;否则他们就不是。如图 5.9 所示。
Square roots are expensive to calculate on most computers, so game programmers should always use the squared magnitude whenever it is valid to do so:
在大多数计算机上计算平方根的成本很高,因此游戏程序员应始终使用平方大小(只要有效):
Using the squared magnitude is valid when comparing the relative lengths of two vectors (“is vector a longer than vector b?”), or when comparing a vector’s magnitude to some other (squared) scalar quantity. So in our sphere-sphere intersection test, we should calculate d^2 = |d|^2 and compare this to the squared sum of the radii, (r1 + r2)^2, for maximum speed. When writing high-performance software, never take a square root when you don’t have to!
在比较两个向量的相对长度(“向量 a 比向量 b 长吗?”)或将向量的大小与其他(平方)标量进行比较时,使用平方大小是有效的。因此,在我们的球体与球体相交测试中,为了获得最大速度,我们应该计算 d^2 = |d|^2,并将其与半径之和的平方 (r1 + r2)^2 进行比较。在编写高性能软件时,除非必要,切勿求平方根!
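The sphere-sphere test can be written without any square root. The sketch below assumes spheres are given as a center tuple and a radius; the function names are hypothetical.

```python
def squared_magnitude(a):
    """Sum of the squared components: |a|^2, with no square root."""
    return sum(c * c for c in a)

def spheres_intersect(c1, r1, c2, r2):
    """True if the distance between centers is less than r1 + r2.

    Compares squared quantities on both sides to avoid a square root.
    """
    d = tuple(b - a for a, b in zip(c1, c2))   # d = C2 - C1
    radii_sum = r1 + r2
    return squared_magnitude(d) < radii_sum * radii_sum

overlapping = spheres_intersect((0.0, 0.0, 0.0), 1.0, (1.5, 0.0, 0.0), 1.0)
separate = spheres_intersect((0.0, 0.0, 0.0), 1.0, (3.0, 0.0, 0.0), 1.0)
```

Squaring both sides preserves the comparison because distances and radii are never negative.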
A unit vector is a vector with a magnitude (length) of one. Unit vectors are very useful in 3D mathematics and game programming, for reasons we’ll see below.
单位向量是大小(长度)为 1 的向量。单位向量在 3D 数学和游戏编程中非常有用,原因我们将在下面看到。
Given an arbitrary vector v of length v = |v|, we can convert it to a unit vector u that points in the same direction as v, but has unit length. To do this, we simply multiply v by the reciprocal of its magnitude. We call this normalization:
给定一个长度为 v = |v| 的任意向量 v,我们可以将其转换为与 v 指向相同方向但具有单位长度的单位向量 u。为此,我们只需将 v 乘以其大小的倒数即可。我们称之为标准化:
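In symbols, normalization is u = v / |v| = (1 / v) v. A minimal sketch follows; the guard against near-zero-length vectors and the `epsilon` threshold are assumptions added for the example, since a zero vector has no direction to preserve.

```python
import math

def normalize(v, epsilon=1e-12):
    """Return a unit-length vector pointing in the same direction as v."""
    length = math.sqrt(sum(c * c for c in v))
    if length < epsilon:
        # Assumption: treat (near-)zero vectors as an error rather than guess a direction.
        raise ValueError("cannot normalize a zero-length vector")
    inv_length = 1.0 / length      # multiply by the reciprocal of the magnitude
    return tuple(c * inv_length for c in v)

u = normalize((3.0, 0.0, 4.0))
```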
A vector is said to be normal to a surface if it is perpendicular to that surface. Normal vectors are highly useful in games and computer graphics. For example, a plane can be defined by a point and a normal vector. And in 3D graphics, lighting calculations make heavy use of normal vectors to define the direction of surfaces relative to the direction of the light rays impinging upon them.
如果矢量垂直于该表面,则称该矢量垂直于该表面。法向量在游戏和计算机图形学中非常有用。例如,平面可以由点和法向量定义。在 3D 图形中,照明计算大量使用法线向量来定义表面相对于照射到表面的光线方向的方向。
Normal vectors are usually of unit length, but they do not need to be. Be careful not to confuse the term “normalization” with the term “normal vector.” A normalized vector is any vector of unit length. A normal vector is any vector that is perpendicular to a surface, whether or not it is of unit length.
法线向量通常具有单位长度,但并非必须如此。请注意不要将术语“归一化”与术语“法向量”混淆。归一化向量是任何单位长度的向量。法向量是垂直于表面的任何向量,无论它是否具有单位长度。
Vectors can be multiplied, but unlike scalars there are a number of different kinds of vector multiplication. In game programming, we most often work with the following two kinds of multiplication:
向量可以相乘,但与标量不同的是,向量乘法有多种不同类型。在游戏编程中,我们最常使用以下两种乘法:
The dot product of two vectors yields a scalar; it is defined by adding the products of the individual components of the two vectors:
两个向量的点积产生一个标量;它是通过将两个向量的各个分量的乘积相加来定义的:
The dot product can also be written as the product of the magnitudes of the two vectors and the cosine of the angle between them:
点积也可以写成两个向量的幅值与它们之间角度的余弦的乘积:
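The two forms, a · b = a_x b_x + a_y b_y + a_z b_z and a · b = |a| |b| cos θ, can be combined to recover the angle between two vectors. The following is a sketch with hypothetical function names; clamping the cosine is an added safeguard against floating-point round-off.

```python
import math

def dot(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(a):
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle in radians between two non-zero vectors."""
    cos_theta = dot(a, b) / (magnitude(a) * magnitude(b))
    # Clamp so that round-off cannot push the value outside acos's domain.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

d = dot((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))
right_angle = angle_between((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```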
The dot product is commutative (i.e., the order of the two vectors can be reversed) and distributive over addition:
点积是可交换的(即,两个向量的顺序可以颠倒)并且可分配于加法:
And the dot product combines with scalar multiplication as follows:
点积与标量乘法结合如下:
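Stated symbolically, for vectors a, b, c and a scalar s, these properties are:

```latex
\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}
\qquad
\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}
\qquad
s\,\mathbf{a} \cdot \mathbf{b} = \mathbf{a} \cdot s\,\mathbf{b} = s\,(\mathbf{a} \cdot \mathbf{b})
```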
If u is a unit vector (|u| = 1), then the dot product (a · u) represents the length of the projection of vector a onto the infinite line defined by the direction of u, as shown in Figure 5.10. This projection concept works equally well in 2D or 3D and is highly useful for solving a wide variety of three-dimensional problems.
如果u是单位向量(|u| = 1),则点积(a·u)表示向量a在u方向定义的无限直线上投影的长度,如图5.10所示。这种投影概念在 2D 或 3D 中同样有效,对于解决各种三维问题非常有用。
The squared magnitude of a vector can be found by taking the dot product of that vector with itself. Its magnitude is then easily found by taking the square root:
向量的平方大小可以通过该向量与其自身的点积得到。然后通过求平方根很容易找到它的大小:
This works because the angle between a vector and itself is zero, and the cosine of zero degrees is 1, so |a| |a| cos θ = |a| |a| = |a|^2.
这是可行的,因为向量与其自身的夹角为零,而 0 度的余弦为 1,所以 |a| |a| cos θ = |a| |a| = |a|^2。
Dot products are great for testing if two vectors are collinear or perpendicular, or whether they point in roughly the same or roughly opposite directions. For any two arbitrary vectors a and b, game programmers often use the following tests, as shown in Figure 5.11:
点积非常适合测试两个向量是否共线或垂直,或者它们是否指向大致相同或大致相反的方向。对于任意两个任意向量a和b,游戏程序员经常使用以下测试,如图5.11所示:
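Such tests follow from a · b = |a| |b| cos θ: the dot product equals +|a||b| or −|a||b| for collinear vectors, is zero for perpendicular ones, and its sign otherwise tells us whether the vectors point roughly the same way. The sketch below is one possible arrangement; the labels, the function name `classify`, and the tolerances are assumptions, and both inputs are assumed to be non-zero.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def magnitude(a):
    return math.sqrt(dot(a, a))

def classify(a, b, tol=1e-9):
    """Describe the relative direction of two non-zero vectors."""
    d = dot(a, b)
    ab = magnitude(a) * magnitude(b)
    if math.isclose(d, ab, rel_tol=tol, abs_tol=tol):
        return "collinear, same direction"        # angle of 0 degrees
    if math.isclose(d, -ab, rel_tol=tol, abs_tol=tol):
        return "collinear, opposite direction"    # angle of 180 degrees
    if abs(d) <= tol:
        return "perpendicular"                    # angle of 90 degrees
    return "roughly same direction" if d > 0 else "roughly opposite direction"

result = classify((1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
```

Floating-point results are rarely exactly zero, so real code compares against a small tolerance, as done here.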
Dot products can be used for all sorts of things in game programming. For example, let’s say we want to find out whether an enemy is in front of the player character or behind him. We can find a vector from the player’s position P to the enemy’s position E by simple vector subtraction (v = E − P). Let’s assume we have a vector f pointing in the direction that the player is facing. (As we’ll see in Section 5.3.10.3, the vector f can be extracted directly from the player’s model-to-world matrix.) The dot product d = v · f can be used to test whether the enemy is in front of or behind the player—it will be positive when the enemy is in front and negative when the enemy is behind.
点积可用于游戏编程中的各种用途。例如,假设我们想找出敌人是在玩家角色的前面还是后面。我们可以通过简单的向量减法找到从玩家位置 P 到敌人位置 E 的向量 (v = E − P)。假设我们有一个向量 f 指向玩家面对的方向。 (正如我们将在第 5.3.10.3 节中看到的,向量 f 可以直接从玩家的模型到世界矩阵中提取。)点积 d = v · f 可用于测试敌人是否在前方或在玩家身后——敌人在前面时为正值,敌人在后面时为负值。
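The front/behind test reduces to checking the sign of one dot product. A minimal sketch, with hypothetical names and sample positions:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def is_in_front(player_pos, facing, enemy_pos):
    """True if the enemy lies in the half-space the player is facing."""
    v = tuple(e - p for e, p in zip(enemy_pos, player_pos))   # v = E - P
    return dot(v, facing) > 0.0

player = (0.0, 0.0, 0.0)
facing = (0.0, 0.0, 1.0)   # hypothetical facing direction f

ahead = is_in_front(player, facing, (1.0, 0.0, 5.0))
behind = is_in_front(player, facing, (1.0, 0.0, -5.0))
```

Because only the sign matters, neither v nor f needs to be normalized for this test.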
The dot product can also be used to find the height of a point above or below a plane (which might be useful when writing a moon-landing game for example). We can define a plane with two vector quantities: a point Q lying anywhere on the plane, and a unit vector n that is perpendicular (i.e., normal) to the plane. To find the height h of a point P above the plane, we first calculate a vector from any point on the plane (Q will do nicely) to the point in question P. So we have v = P − Q. The dot product of vector v with the unit-length normal vector n is just the projection of v onto the line defined by n. But that is exactly the height we’re looking for. Therefore,
点积还可用于查找平面上方或下方的点的高度(例如,在编写登月游戏时这可能很有用)。我们可以用两个向量定义一个平面:位于平面上任意位置的点 Q 和垂直于该平面(即法线)的单位向量 n。为了找到平面上方点 P 的高度 h,我们首先计算从平面上的任意点(Q 就可以)到所讨论的点 P 的向量。因此我们有 v = P − Q。具有单位长度法向量 n 的向量 v 只是 v 到由 n 定义的直线上的投影。但这正是我们正在寻找的高度。所以,
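Following the reasoning above, the height is h = (P − Q) · n. The sketch below assumes n is already of unit length; the function name and sample values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def height_above_plane(p, q, n):
    """Signed height of point p relative to the plane through q with unit normal n.

    Positive values are on the side the normal points toward, negative values below.
    """
    v = tuple(pi - qi for pi, qi in zip(p, q))   # v = P - Q
    return dot(v, n)

q = (0.0, 2.0, 0.0)        # any point on the plane
n = (0.0, 1.0, 0.0)        # unit normal (assumed already normalized)

above = height_above_plane((5.0, 7.0, -3.0), q, n)
below = height_above_plane((1.0, 0.0, 1.0), q, n)
```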
This is illustrated in Figure 5.12.
如图 5.12 所示。
The cross product (also known as the outer product or vector product) of two vectors yields another vector that is perpendicular to the two vectors being multiplied, as shown in Figure 5.13. The cross product operation is only defined in three dimensions:
两个向量的叉积(也称为外积或向量积)会产生另一个与相乘的两个向量垂直的向量,如图 5.13 所示。叉积运算仅在三个维度上定义:
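In components, the standard definition is a × b = (a_y b_z − a_z b_y, a_z b_x − a_x b_z, a_x b_y − a_y b_x). A short sketch with a hypothetical `cross` helper shows that the result is perpendicular to both inputs:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3D vectors."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx)

a = (1.0, 2.0, 3.0)
b = (4.0, 5.0, 6.0)
c = cross(a, b)

# The result is perpendicular to both inputs, so both dot products are zero.
perp_to_a = dot(c, a)
perp_to_b = dot(c, b)
```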
The magnitude of the cross product vector is the product of the magnitudes of the two vectors and the sine of the angle between them. (This is similar to the definition of the dot product, but it replaces the cosine with the sine.)
叉积向量的大小是两个向量的大小与它们之间的角度的正弦值的乘积。 (这与点积的定义类似,但它用正弦代替了余弦。)
The magnitude of the cross product |a × b| is equal to the area of the parallelogram whose sides are a and b, as shown in Figure 5.14. Since a triangle is one half of a parallelogram, the area of a triangle whose vertices are specified by the position vectors V1, V2 and V3 can be calculated as one half of the magnitude of the cross product of any two of its sides:
叉积的大小 |a × b| 等于以 a 和 b 为边的平行四边形的面积,如图 5.14 所示。由于三角形是平行四边形的一半,因此顶点由位置向量 V1、V2 和 V3 指定的三角形的面积可以计算为其任意两条边的叉积大小的一半:
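Taking the two sides that share vertex V1, the area is A = ½ |(V2 − V1) × (V3 − V1)|. A minimal sketch, with hypothetical function names:

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def magnitude(a):
    return math.sqrt(sum(c * c for c in a))

def triangle_area(v1, v2, v3):
    """Half the magnitude of the cross product of two sides of the triangle."""
    return 0.5 * magnitude(cross(sub(v2, v1), sub(v3, v1)))

area = triangle_area((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0))
```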
When using a right-handed coordinate system, you can use the right-hand rule to determine the direction of the cross product. Simply cup the fingers of your right hand such that they point in the direction you’d rotate vector a to move it on top of vector b, and the cross product (a × b) will be in the direction of your thumb.
使用右手坐标系时,可以使用右手定则来确定叉积的方向。只需弯曲右手手指,使其指向将向量 a 旋转到向量 b 上所需的方向,叉积 (a × b) 就指向拇指的方向。
Note that the cross product is defined by the left-hand rule when using a left-handed coordinate system. This means that the direction of the cross product changes depending on the choice of coordinate system. This might seem odd at first, but remember that the handedness of a coordinate system does not affect the mathematical calculations we carry out—it only changes our visualization of what the numbers look like in 3D space. When converting from a right-handed system to a left-handed system or vice versa, the numerical representations of all the points and vectors stay the same, but one axis flips. Our visualization of everything is therefore mirrored along that flipped axis. So if a cross product just happens to align with the axis we’re flipping (e.g., the z-axis), it needs to flip when the axis flips. If it didn’t, the mathematical definition of the cross product itself would have to be changed so that the z-coordinate of the cross product comes out negative in the new coordinate system. I wouldn’t lose too much sleep over all of this. Just remember: when visualizing a cross product, use the right-hand rule in a right-handed coordinate system and the left-hand rule in a left-handed coordinate system.
请注意,使用左手坐标系时,叉积是由左手定则定义的。这意味着叉积的方向根据坐标系的选择而变化。一开始这可能看起来很奇怪,但请记住,坐标系的旋向性不会影响我们执行的数学计算,它只会改变我们对 3D 空间中数字的可视化。当从右手系统转换为左手系统时,反之亦然,所有点和向量的数值表示保持不变,但一个轴会翻转。因此,我们对一切的可视化都是沿着翻转的轴镜像的。因此,如果叉积恰好与我们要翻转的轴(例如 z 轴)对齐,则它需要在轴翻转时翻转。如果没有,则必须更改叉积本身的数学定义,以便叉积的 z 坐标在新坐标系中显示为负值。我不会因为这一切而失去太多睡眠。请记住:在可视化叉积时,在右手坐标系中使用右手法则,在左手坐标系中使用左手法则。
The cross product is not commutative (i.e., order matters):
叉积不可交换(即顺序很重要):
However, it is anti-commutative:
然而,它是反交换的:
The cross product is distributive over addition:
叉积对于加法是可分配的:
And it combines with scalar multiplication as follows:
它与标量乘法结合如下:
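Written out for vectors a, b, c and a scalar s, the algebraic properties listed above are:

```latex
\mathbf{a} \times \mathbf{b} \neq \mathbf{b} \times \mathbf{a}
\qquad
\mathbf{a} \times \mathbf{b} = -(\mathbf{b} \times \mathbf{a})
\qquad
\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = (\mathbf{a} \times \mathbf{b}) + (\mathbf{a} \times \mathbf{c})
\qquad
(s\,\mathbf{a}) \times \mathbf{b} = \mathbf{a} \times (s\,\mathbf{b}) = s\,(\mathbf{a} \times \mathbf{b})
```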
The Cartesian basis vectors are related by cross products as follows:
笛卡尔基向量通过叉积相关,如下所示:
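In terms of the basis vectors i, j and k defined earlier, the standard relations are:

```latex
\mathbf{i} \times \mathbf{j} = -(\mathbf{j} \times \mathbf{i}) = \mathbf{k}
\qquad
\mathbf{j} \times \mathbf{k} = -(\mathbf{k} \times \mathbf{j}) = \mathbf{i}
\qquad
\mathbf{k} \times \mathbf{i} = -(\mathbf{i} \times \mathbf{k}) = \mathbf{j}
```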
These three cross products define the direction of positive rotations about the Cartesian axes. The positive rotations go from x to y (about z), from y to z (about x) and from z to x (about y). Notice how the rotation about the y-axis is “reversed” alphabetically, in that it goes from z to x (not from x to z). As we’ll see below, this gives us a hint as to why the matrix for rotation about the y-axis looks inverted when compared to the matrices for rotation about the x- and z-axes.
这三个叉积定义了绕笛卡尔轴正旋转的方向。正旋转从 x 到 y(关于 z)、从 y 到 z(关于 x)以及从 z 到 x(关于 y)。请注意绕 y 轴的旋转如何按字母顺序“反转”,因为它是从 z 到 x(而不是从 x 到 z)。正如我们将在下面看到的,这给了我们一个提示,为什么与绕 x 轴和 z 轴旋转的矩阵相比,绕 y 轴旋转的矩阵看起来是倒置的。
The cross product has a number of applications in games. One of its most common uses is for finding a vector that is perpendicular to two other vectors. As we’ll see in Section 5.3.10.2, if we know an object’s local unit basis vectors (i_local, j_local and k_local), we can easily find a matrix representing the object’s orientation. Let’s assume that all we know is the object’s k_local vector (i.e., the direction in which the object is facing). If we assume that the object has no roll about k_local, then we can find i_local by taking the cross product between k_local (which we already know) and the world-space up vector j_world (which equals [0 1 0]). We do so as follows: i_local = normalize(j_world × k_local). We can then find j_local by simply crossing i_local and k_local as follows: j_local = k_local × i_local.
叉积在游戏中有很多应用。它最常见的用途之一是查找与其他两个向量垂直的向量。正如我们将在第 5.3.10.2 节中看到的,如果我们知道对象的局部单位基向量(i_local、j_local 和 k_local),我们可以很容易地找到一个表示物体方向的矩阵。假设我们只知道物体的 k_local 向量,即物体面向的方向。如果我们假设该对象没有绕 k_local 的滚转,那么我们可以通过取 k_local(我们已经知道)与世界空间向上向量 j_world(等于 [0 1 0])的叉积来找到 i_local。我们这样做如下:i_local = normalize(j_world × k_local)。然后,我们可以通过简单地将 i_local 和 k_local 叉乘来找到 j_local,如下所示:j_local = k_local × i_local。
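The construction described above can be sketched as follows. The function names are hypothetical, and the sketch assumes the facing direction is not parallel to the world up vector (in that case j_world × k_local is the zero vector and cannot be normalized).

```python
import math

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / length for c in v)

def basis_from_facing(k_local, j_world=(0.0, 1.0, 0.0)):
    """Build (i_local, j_local, k_local) from a facing direction, assuming no roll."""
    k = normalize(k_local)
    i = normalize(cross(j_world, k))   # i_local = normalize(j_world x k_local)
    j = cross(k, i)                    # j_local = k_local x i_local (already unit length)
    return i, j, k

i_local, j_local, k_local = basis_from_facing((0.0, 0.0, 2.0))
```

Because k and i are perpendicular unit vectors, their cross product already has unit length, so j needs no further normalization.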
A very similar technique can be used to find a unit vector normal to the surface of a triangle or some other plane. Given three points on the plane, P1, P2 and P3, the normal vector is just n = normalize((P2 − P1) × (P3 − P1)).
可以使用非常相似的技术来查找垂直于三角形或其他平面的表面的单位向量。给定平面上的三个点 P1、P2 和 P3,法向量就是 n = normalize((P2 − P1) × (P3 − P1))。
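A direct sketch of this formula follows, with hypothetical helper names. It assumes the three points are not collinear, since otherwise the cross product is the zero vector.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("points are collinear; no unique normal")
    return tuple(c / length for c in v)

def triangle_normal(p1, p2, p3):
    """Unit normal n = normalize((P2 - P1) x (P3 - P1))."""
    return normalize(cross(sub(p2, p1), sub(p3, p1)))

n = triangle_normal((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

Swapping P2 and P3 reverses the winding order and therefore flips the normal.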
Cross products are also used in physics simulations. When a force is applied to an object, it will give rise to rotational motion if and only if it is applied off-center. This rotational force is known as a torque, and it is calculated as follows. Given a force F, and a vector r from the center of mass to the point at which the force is applied, the torque N = r × F.
叉积也用于物理模拟。当力施加到物体上时,当且仅当力偏离中心时,它才会产生旋转运动。该旋转力称为扭矩,其计算如下。给定力 F 和从质心到力施加点的矢量 r,扭矩 N = r × F。
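A small sketch of N = r × F, with hypothetical names and values, also shows why a force applied along r (through the center of mass) produces no torque:

```python
def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def torque(r, force):
    """Torque N = r x F, where r runs from the center of mass to the point of application."""
    return cross(r, force)

off_center = torque((1.0, 0.0, 0.0), (0.0, 10.0, 0.0))
through_center = torque((1.0, 0.0, 0.0), (5.0, 0.0, 0.0))
```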
We mentioned in Section 5.2.2 that the cross product doesn’t actually produce a vector—it produces a special kind of mathematical object known as a pseudovector. The difference between a vector and a pseudovector is pretty subtle. In fact, you can’t tell the difference between them at all when performing the kinds of transformations we normally encounter in game programming—translation, rotation and scaling. It’s only when you reflect the coordinate system (as happens when you move from a left-handed coordinate system to a right-handed system) that the special nature of pseudovectors becomes apparent. Under reflection, a vector transforms into its mirror image, as you’d probably expect. But when a pseudovector is reflected, it transforms into its mirror image and also changes direction.
我们在第 5.2.2 节中提到,叉积实际上并不产生向量——它产生一种特殊的数学对象,称为伪向量。矢量和伪矢量之间的区别非常微妙。事实上,当执行我们在游戏编程中通常遇到的转换(平移、旋转和缩放)时,您根本无法区分它们之间的区别。只有当您反映坐标系时(就像从左手坐标系移动到右手坐标系时发生的那样),赝向量的特殊性质才会变得明显。正如您可能期望的那样,在反射下,矢量会转换为其镜像。但是当赝矢量被反射时,它会变成镜像并且方向也会改变。
Positions and all of the derivatives thereof (linear velocity, acceleration, jerk) are represented by true vectors (also known as polar vectors or contravariant vectors). Angular velocities and magnetic fields are represented by pseudovectors (also known as axial vectors, covariant vectors, bivectors or 2-blades). The surface normal of a triangle (which is calculated using a cross product) is also a pseudovector.
位置及其所有导数(线速度、加速度、加加速度)均由真向量(也称为极向量或逆变向量)表示。角速度和磁场由赝矢量(也称为轴向矢量、协变矢量、双矢量或 2 叶片)表示。三角形的表面法线(使用叉积计算)也是伪向量。
It’s pretty interesting to note that the cross product (A × B), the scalar triple product (A · (B × C)) and the determinant of a matrix are all inter-related, and pseudovectors lie at the heart of it all. Mathematicians have come up with a set of algebraic rules, called an exterior algebra or Grassmann algebra, which describe how vectors and pseudovectors work and allow us to calculate areas of parallelograms (in 2D), volumes of parallelepipeds (in 3D), and so on in higher dimensions.
有趣的是,叉积 (A × B)、标量三重积 (A · (B × C)) 和矩阵的行列式都是相互关联的,而伪向量是这一切的核心。数学家提出了一组代数规则,称为外代数或格拉斯曼(Grassmann)代数,它们描述了向量和伪向量的工作原理,并允许我们计算平行四边形的面积(2D)、平行六面体的体积(3D),并以此类推到更高的维度。
We won’t get into all the details here, but the basic idea of Grassman algebra is to introduce a special kind of vector product known as the wedge product, denoted A ∧ B. A pairwise wedge product yields a pseudovector and is equivalent to a cross product, which also represents the signed area of the parallelogram formed by the two vectors (where the sign tells us whether we’re rotating from A to B or vice versa). Doing two wedge products in a row, A ∧ B ∧ C, is equivalent to the scalar triple product A · (B ϗ C) and produces another strange mathematical object known as a pseudoscalar (also known as a trivector or a 3-blade), which can be interpreted as the signed volume of the parallelepiped formed by the three vectors (see Figure 5.15). This extends into higher dimensions as well.
我们不会在这里详细讨论所有细节,但格拉斯曼代数的基本思想是引入一种特殊的向量积,称为楔积,表示为 A ∧ B。成对楔积产生一个伪向量,相当于叉积,它也表示由两个向量形成的平行四边形的有符号面积(其中符号告诉我们是否从 A 旋转到 B,反之亦然)。连续进行两个楔积 A ∧ B ∧ C,相当于标量三重积 A · (B ϗ C),并产生另一个奇怪的数学对象,称为伪标量(也称为三向量或 3 刀片) ,可以解释为由三个向量形成的平行六面体的带符号体积(见图 5.15)。这也延伸到更高的维度。
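The relationship between the cross product, the scalar triple product and the determinant is easy to check numerically. The following is only a minimal sketch in plain Python; the helper names (dot, cross, scalar_triple, det3) and the three sample vectors are made up for illustration and do not come from any particular math library.

```python
def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product a x b of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scalar_triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))


def det3(rows):
    """Determinant of a 3 x 3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Hypothetical sample vectors, chosen only for illustration.
A = (2.0, 0.0, 0.0)
B = (1.0, 3.0, 0.0)
C = (0.5, 1.0, 4.0)

parallelogram = cross(A, B)        # its length is the area spanned by A and B
volume = scalar_triple(A, B, C)    # signed volume of the parallelepiped
flipped = scalar_triple(A, C, B)   # swapping two vectors flips the sign

print(parallelogram, volume, det3((A, B, C)), flipped)
```

The scalar triple product and the determinant of the matrix whose rows are A, B and C agree, and swapping two of the vectors negates the result, which is the "signed" part of the signed volume.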
What does all this mean for us as game programmers? Not too much. All we really need to keep in mind is that some vectors in our code are actually pseudovectors, so that we can transform them properly when changing handedness, for example. Of course if you really want to geek out, you can impress your friends by talking about exterior algebras and wedge products and explaining how cross products aren’t really vectors. Which might make you look cool at your next social engagement …or not.
这一切对我们游戏程序员意味着什么?不是太多。我们真正需要记住的是,代码中的某些向量实际上是伪向量,因此我们可以在改变惯用手等情况时正确地转换它们。当然,如果你真的想极客,你可以通过谈论外代数和楔积并解释叉积为什么不是真正的向量来给你的朋友留下深刻的印象。这可能会让你在下一次社交活动中看起来很酷……也可能不会。
For more information, see http://en.wikipedia.org/wiki/Pseudovector, http://en.wikipedia.org/wiki/Exterior_algebra, and http://www.terathon.com/gdc12_lengyel.pdf.
有关更多信息,请参阅 http://en.wikipedia.org/wiki/Pseudovector、http://en.wikipedia.org/wiki/Exterior_algebra 和 http://www.terathon.com/gdc12_lengyel.pdf。
In games, we often need to find a vector that lies somewhere between two known vectors. For example, if we want to smoothly animate an object from point A to point B over the course of two seconds at 30 frames per second, we would need to find 60 intermediate positions between A and B.
在游戏中,我们经常需要找到一个位于两个已知向量中间的向量。例如,如果我们想要在两秒内以每秒 30 帧的速度平滑地制作一个对象从 A 点到 B 点的动画,我们需要在 A 和 B 之间找到 60 个中间位置。
A linear interpolation is a simple mathematical operation that finds an intermediate point between two known points. The name of this operation is often shortened to LERP. The operation is defined as follows, where β ranges from 0 to 1 inclusive:
$$\mathrm{LERP}(\mathbf{A}, \mathbf{B}, \beta) = (1 - \beta)\,\mathbf{A} + \beta\,\mathbf{B}$$
线性插值是一种简单的数学运算,可找到两个已知点之间的中间点。此操作的名称通常缩写为 LERP。该操作定义如下,其中 β 的范围为 0 到 1(包括 0 和 1):
Geometrically, L = LERP(A, B, β) is the position vector of a point that lies a fraction β of the way (that is, 100β percent) along the line segment from point A to point B, as shown in Figure 5.16. Mathematically, the LERP function is just a weighted average of the two input vectors, with weights (1 − β) and β, respectively. Notice that the weights always add to 1, which is a general requirement for any weighted average.
几何上,L = LERP(A, B, β) 是从 A 点到 B 点的线段上 β% 处的点的位置向量,如图 5.16 所示。从数学上讲,LERP 函数只是两个输入向量的加权平均值,权重分别为 (1 − β) 和 β。请注意,权重总和为 1,这是任何加权平均值的一般要求。
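As a rough sketch of the animation example, the weighted average can be written in a few lines of plain Python. The function name lerp, the end points and the frame count below are illustrative assumptions rather than part of any engine API.

```python
def lerp(a, b, beta):
    """Weighted average of a and b with weights (1 - beta) and beta."""
    return tuple((1.0 - beta) * ai + beta * bi for ai, bi in zip(a, b))


# Hypothetical end points for the animation.
A = (0.0, 0.0, 0.0)
B = (10.0, 20.0, -5.0)

# Two seconds at 30 frames per second gives 60 frame steps from A to B.
num_frames = 60
positions = [lerp(A, B, frame / num_frames) for frame in range(num_frames + 1)]

print(positions[0], positions[30], positions[-1])
```

Sampling β at frame / num_frames yields one position per frame, starting exactly at A (β = 0) and ending exactly at B (β = 1).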
A matrix is a rectangular array of m × n scalars. Matrices are a convenient way of representing linear transformations such as translation, rotation and scale.
矩阵是 m ϗ n 标量的矩形阵列。矩阵是表示线性变换(例如平移、旋转和缩放)的便捷方法。
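A matrix M is usually written as a grid of scalars M_rc enclosed in square brackets, where the subscripts r and c represent the row and column indices of the entry, respectively. For example, if M is a 3 × 3 matrix, it could be written as follows:
$$\mathbf{M} = \begin{bmatrix} M_{11} & M_{12} & M_{13} \\ M_{21} & M_{22} & M_{23} \\ M_{31} & M_{32} & M_{33} \end{bmatrix}$$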
矩阵 M 通常写为括在方括号中的标量 M rc 的网格,其中下标 r 和 c 分别表示条目的行索引和列索引。例如,如果 M 是一个 3 ϗ 3 矩阵,则可以写成如下:
We can think of the rows and/or columns of a 3 × 3 matrix as 3D vectors. When all of the row and column vectors of a 3 × 3 matrix are of unit magnitude and mutually perpendicular (and the determinant is +1), we call it a special orthogonal matrix. This is also known as an isotropic matrix, or an orthonormal matrix. Such matrices represent pure rotations.
我们可以将 3 ϗ 3 矩阵的行和/或列视为 3D 向量。当 3 ϗ 3 矩阵的所有行向量和列向量均为单位量值时,我们将其称为特殊正交矩阵。这也称为各向同性矩阵或正交矩阵。这样的矩阵代表纯旋转。
Under certain constraints, a 4 × 4 matrix can represent arbitrary 3D transformations, including translations, rotations, and changes in scale. These are called transformation matrices, and they are the kinds of matrices that will be most useful to us as game engineers. The transformations represented by a matrix are applied to a point or vector via matrix multiplication. We’ll investigate how this works below.
在某些约束下,4 ϗ 4 矩阵可以表示任意 3D 变换,包括平移、旋转和比例变化。这些称为变换矩阵,它们是对我们游戏工程师最有用的矩阵。矩阵表示的变换通过矩阵乘法应用于点或向量。我们将在下面研究它是如何工作的。
An affine matrix is a 4 × 4 transformation matrix that preserves parallelism of lines and relative distance ratios, but not necessarily absolute lengths and angles. An affine matrix is any combination of the following operations: rotation, translation, scale and/or shear.
仿射矩阵是一个 4 ϗ 4 变换矩阵,它保留线的平行度和相对距离比,但不一定保留绝对长度和角度。仿射矩阵是以下操作的任意组合:旋转、平移、缩放和/或剪切。
The product P of two matrices A and B is written P = AB. If A and B are transformation matrices, then the product P is another transformation matrix that performs both of the original transformations. For example, if A is a scale matrix and B is a rotation, the matrix P would both scale and rotate the points or vectors to which it is applied. This is particularly useful in game programming, because we can precalculate a single matrix that performs a whole sequence of transformations and then apply all of those transformations to a large number of vectors efficiently.
两个矩阵 A 和 B 的乘积 P 写作 P = AB。如果 A 和 B 是变换矩阵,则乘积 P 是执行这两个原始变换的另一个变换矩阵。例如,如果 A 是缩放矩阵,B 是旋转矩阵,则矩阵 P 将缩放和旋转其所应用到的点或向量。这在游戏编程中特别有用,因为我们可以预先计算执行整个变换序列的单个矩阵,然后将所有这些变换有效地应用于大量向量。
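To calculate a matrix product, we simply take dot products between the rows of the n_A × m_A matrix A and the columns of the n_B × m_B matrix B. Each dot product becomes one component of the resulting matrix P. The two matrices can be multiplied as long as the inner dimensions are equal (i.e., m_A = n_B). For example, if A and B are 3 × 3 matrices, then P = AB may be expressed as follows:
$$\mathbf{P} = \mathbf{AB} = \begin{bmatrix} \mathbf{A}_{\mathrm{row}\,1} \cdot \mathbf{B}_{\mathrm{col}\,1} & \mathbf{A}_{\mathrm{row}\,1} \cdot \mathbf{B}_{\mathrm{col}\,2} & \mathbf{A}_{\mathrm{row}\,1} \cdot \mathbf{B}_{\mathrm{col}\,3} \\ \mathbf{A}_{\mathrm{row}\,2} \cdot \mathbf{B}_{\mathrm{col}\,1} & \mathbf{A}_{\mathrm{row}\,2} \cdot \mathbf{B}_{\mathrm{col}\,2} & \mathbf{A}_{\mathrm{row}\,2} \cdot \mathbf{B}_{\mathrm{col}\,3} \\ \mathbf{A}_{\mathrm{row}\,3} \cdot \mathbf{B}_{\mathrm{col}\,1} & \mathbf{A}_{\mathrm{row}\,3} \cdot \mathbf{B}_{\mathrm{col}\,2} & \mathbf{A}_{\mathrm{row}\,3} \cdot \mathbf{B}_{\mathrm{col}\,3} \end{bmatrix}$$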
要计算矩阵乘积,我们只需取 n_A × m_A 矩阵 A 的各行与 n_B × m_B 矩阵 B 的各列之间的点积。每个点积成为结果矩阵 P 的一个分量。只要内部维度相等(即 m_A = n_B),这两个矩阵就可以相乘。例如,如果 A 和 B 是 3 × 3 矩阵,则 P = AB 可以表示如下:
Matrix multiplication is not commutative. In other words, the order in which matrix multiplication is done matters:
$$\mathbf{AB} \neq \mathbf{BA}$$
矩阵乘法不可交换。换句话说,矩阵乘法的执行顺序很重要:
We’ll see exactly why this matters in Section 5.3.2.
我们将在第 5.3.2 节中确切地了解为什么这很重要。
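To make the row-times-column rule and the lack of commutativity concrete, here is a small sketch in plain Python. The function mat_mul and the two sample matrices are illustrative assumptions (the rotation is a 90-degree turn about z, laid out for row vectors); this is not meant as an optimized implementation.

```python
def mat_mul(a, b):
    """Product P = AB: entry P[r][c] is row r of A dotted with column c of B."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions must match")
    return [[sum(a[r][k] * b[k][c] for k in range(inner))
             for c in range(len(b[0]))]
            for r in range(len(a))]


# Hypothetical sample matrices: a scale along x and a 90-degree rotation about z.
scale = [[2, 0, 0],
         [0, 1, 0],
         [0, 0, 1]]
rot90_z = [[0, 1, 0],
           [-1, 0, 0],
           [0, 0, 1]]

scale_then_rotate = mat_mul(scale, rot90_z)
rotate_then_scale = mat_mul(rot90_z, scale)

print(scale_then_rotate)
print(rotate_then_scale)
```

The two products differ, so scaling and then rotating is not the same transformation as rotating and then scaling.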
Matrix multiplication is often called concatenation, because the product of n transformation matrices is a matrix that concatenates, or chains together, the original sequence of transformations in the order the matrices were multiplied.
矩阵乘法通常称为串联,因为 n 个变换矩阵的乘积是按照矩阵相乘的顺序将原始变换序列串联或链接在一起的矩阵。
Points and vectors can be represented as row matrices (1 × n) or column matrices (n × 1), where n is the dimension of the space we’re working with (usually 2 or 3). For example, the vector v = (3, 4, −1) can be written either as
$$\mathbf{v}_1 = \begin{bmatrix} 3 & 4 & -1 \end{bmatrix}$$
点和向量可以表示为行矩阵 (1 ϗ n) 或列矩阵 (n ϗ 1),其中 n 是我们正在使用的空间的维度(通常为 2 或 3)。例如,向量 v = (3, 4, − 1) 可以写为
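or as
$$\mathbf{v}_2 = \begin{bmatrix} 3 \\ 4 \\ -1 \end{bmatrix} = \begin{bmatrix} 3 & 4 & -1 \end{bmatrix}^{\mathrm{T}}.$$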
或作为
Here, the superscripted T represents matrix transposition (see Section 5.3.5).
这里,上标 T 表示矩阵转置(参见第 5.3.5 节)。
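The choice between column and row vectors is a completely arbitrary one, but it does affect the order in which matrix multiplications are written. This happens because when multiplying matrices, the inner dimensions of the two matrices must be equal, so
$$\mathbf{v}'_{1 \times n} = \mathbf{v}_{1 \times n}\,\mathbf{M}_{n \times n} \quad \text{(row vectors)}, \qquad \mathbf{v}'_{n \times 1} = \mathbf{M}_{n \times n}\,\mathbf{v}_{n \times 1} \quad \text{(column vectors)}.$$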
列向量和行向量之间的选择是完全任意的,但它确实会影响矩阵乘法的写入顺序。发生这种情况是因为矩阵相乘时,两个矩阵的内部维度必须相等,所以
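If multiple transformation matrices A, B and C are applied in order to a vector v, the transformations “read” from left to right when using row vectors, but from right to left when using column vectors. The easiest way to remember this is to realize that the matrix closest to the vector is applied first. This is illustrated by the parentheses below:
$$\mathbf{v}' = (((\mathbf{v}\mathbf{A})\mathbf{B})\mathbf{C}) \quad \text{(row vectors: read left to right)}$$
$$\mathbf{v}'^{\mathrm{T}} = (\mathbf{C}^{\mathrm{T}}(\mathbf{B}^{\mathrm{T}}(\mathbf{A}^{\mathrm{T}}\mathbf{v}^{\mathrm{T}}))) \quad \text{(column vectors: read right to left)}$$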
如果将多个变换矩阵 A、B 和 C 按顺序应用于向量 v,则在使用行向量时,变换从左到右“读取”,而在使用列向量时,变换从右到左“读取”。记住这一点的最简单方法是首先应用最接近向量的矩阵。下面的括号说明了这一点:
In this book we’ll adopt the row vector convention, because the left-to-right order of transformations is most intuitive to read for English-speaking people. That said, be very careful to check which convention is used by your game engine, and by other books, papers or web pages you may read. You can usually tell by seeing whether vector-matrix multiplications are written with the vector on the left (for row vectors) or the right (for column vectors) of the matrix. When using column vectors, you’ll need to transpose all the matrices shown in this book.
在本书中,我们将采用行向量约定,因为从左到右的转换顺序对于英语国家的人来说是最直观的阅读方式。也就是说,请务必仔细检查您的游戏引擎以及您可能阅读的其他书籍、论文或网页使用的约定。通常,您可以通过查看向量矩阵乘法是使用矩阵左侧(对于行向量)还是右侧(对于列向量)的向量来进行判断。使用列向量时,您需要转置本书中显示的所有矩阵。
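The relationship between the two conventions can be sketched in plain Python. The helpers transpose and mat_mul and the sample matrix M (a 90-degree rotation about z, laid out for row vectors) are assumptions made purely for illustration.

```python
def transpose(m):
    """Swap the rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def mat_mul(a, b):
    """Matrix product AB for matrices stored as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))]
            for r in range(len(a))]


# Hypothetical 90-degree rotation about z, written for the row-vector convention.
M = [[0, 1, 0],
     [-1, 0, 0],
     [0, 0, 1]]

v_row = [[3, 4, -1]]        # 1 x 3 row matrix
v_col = transpose(v_row)    # 3 x 1 column matrix

row_result = mat_mul(v_row, M)              # v' = v M (vector on the left)
col_result = mat_mul(transpose(M), v_col)   # v'^T = M^T v^T (vector on the right)

print(row_result, transpose(col_result))
```

Both conventions produce the same rotated vector; the only differences are which side the vector sits on and whether the matrix is stored transposed.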
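The identity matrix is a matrix that, when multiplied by any other matrix, yields the very same matrix. It is usually represented by the symbol I. The identity matrix is always a square matrix with 1’s along the diagonal and 0’s everywhere else:
$$\mathbf{I}_{3 \times 3} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad \mathbf{AI} = \mathbf{IA} \equiv \mathbf{A}.$$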
单位矩阵是一个矩阵,当与任何其他矩阵相乘时,会产生完全相同的矩阵。它通常用符号 I 表示。单位矩阵始终是一个方阵,对角线上为 1,其他位置为 0:
The inverse of a matrix A is another matrix (denoted A⁻¹) that undoes the effects of matrix A. So, for example, if A rotates objects by 37 degrees about the z-axis, then A⁻¹ will rotate by −37 degrees about the z-axis. Likewise, if A scales objects to be twice their original size, then A⁻¹ scales objects to be half-sized. When a matrix is multiplied by its own inverse, the result is always the identity matrix, so A(A⁻¹) ≡ (A⁻¹)A ≡ I. Not all matrices have inverses. However, all affine matrices (combinations of pure rotations, translations, nonzero scales and shears) do have inverses. Gaussian elimination or lower-upper (LU) decomposition can be used to find the inverse, if one exists.
矩阵 A 的逆矩阵是另一个矩阵(表示为 A −1 ),它消除了矩阵 A 的影响。因此,例如,如果 A 将对象绕 z 轴旋转 37 度,则 A −1 将绕 z 轴旋转 -37 度。同样,如果 A 将对象缩放为其原始大小的两倍,则 A −1 将对象缩放为原来大小的一半。当一个矩阵乘以它自己的逆矩阵时,结果始终是单位矩阵,因此 A(A −1 ) ≡ (A −1 )A ≡ I。并非所有矩阵都有逆矩阵。然而,所有仿射矩阵(纯旋转、平移、缩放和剪切的组合)都具有逆矩阵。高斯消去法或下上(LU)分解可用于求逆矩阵(如果存在)。
Since we’ll be dealing with matrix multiplication a lot, it’s important to note here that the inverse of a sequence of concatenated matrices can be written as the reverse concatenation of the individual matrices’ inverses. For example,
$$(\mathbf{ABC})^{-1} = \mathbf{C}^{-1}\,\mathbf{B}^{-1}\,\mathbf{A}^{-1}.$$
由于我们将大量处理矩阵乘法,因此在此需要注意的是,串联矩阵序列的逆可以写为各个矩阵逆的反向串联。例如,
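The transpose of a matrix M is denoted Mᵀ. It is obtained by reflecting the entries of the original matrix across its diagonal. In other words, the rows of the original matrix become the columns of the transposed matrix, and vice versa:
$$\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}^{\mathrm{T}} = \begin{bmatrix} a & d & g \\ b & e & h \\ c & f & i \end{bmatrix}.$$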
矩阵 M 的转置表示为 M T 。它是通过反映原始矩阵对角线的条目而获得的。换句话说,原始矩阵的行变成转置矩阵的列,反之亦然:
The transpose is useful for a number of reasons. For one thing, the inverse of an orthonormal (pure rotation) matrix is exactly equal to its transpose—which is good news, because it’s much cheaper to transpose a matrix than it is to find its inverse in general. Transposition can also be important when moving data from one math library to another, because some libraries use column vectors while others expect row vectors. The matrices used by a row-vector-based library will be transposed relative to those used by a library that employs the column vector convention.
出于多种原因,转置很有用。一方面,正交(纯旋转)矩阵的逆矩阵完全等于其转置——这是个好消息,因为转置矩阵比求其逆矩阵通常要便宜得多。将数据从一个数学库移动到另一个数学库时,转置也很重要,因为有些库使用列向量,而其他库则使用行向量。基于行向量的库使用的矩阵将相对于采用列向量约定的库使用的矩阵进行转置。
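The claim that a pure rotation’s inverse equals its transpose can be checked numerically. This is a minimal sketch in plain Python; the helper names (transpose, mat_mul, rotation_z, is_close) are hypothetical, and the 37-degree angle is simply borrowed from the example in the discussion of inverses.

```python
import math


def transpose(m):
    """Swap the rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def mat_mul(a, b):
    """Matrix product AB for matrices stored as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))]
            for r in range(len(a))]


def rotation_z(degrees):
    """3 x 3 rotation about the z-axis, laid out for row vectors."""
    angle = math.radians(degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [[cos_a, sin_a, 0.0],
            [-sin_a, cos_a, 0.0],
            [0.0, 0.0, 1.0]]


def is_close(m, n, tol=1e-12):
    """Element-wise comparison of two matrices within a small tolerance."""
    return all(math.isclose(x, y, abs_tol=tol)
               for row_m, row_n in zip(m, n)
               for x, y in zip(row_m, row_n))


R = rotation_z(37.0)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

undoes_rotation = is_close(mat_mul(R, transpose(R)), identity)
matches_negative_angle = is_close(transpose(R), rotation_z(-37.0))

print(undoes_rotation, matches_negative_angle)
```

Transposing costs nothing more than swapping entries, whereas a general inverse requires something like Gaussian elimination, which is why this shortcut matters in practice.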
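As with the inverse, the transpose of a sequence of concatenated matrices can be rewritten as the reverse concatenation of the individual matrices’ transposes. For example,
$$(\mathbf{ABC})^{\mathrm{T}} = \mathbf{C}^{\mathrm{T}}\,\mathbf{B}^{\mathrm{T}}\,\mathbf{A}^{\mathrm{T}}.$$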
与逆矩阵一样,串联矩阵序列的转置可以重写为各个矩阵转置的反向串联。例如,
This will prove useful when we consider how to apply transformation matrices to points and vectors.
当我们考虑如何将变换矩阵应用于点和向量时,这将证明是有用的。
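You may recall from high-school algebra that a 2 × 2 matrix can represent a rotation in two dimensions. To rotate a vector r through an angle of ϕ degrees (where positive rotations are counterclockwise), we can write
$$\begin{bmatrix} r'_x & r'_y \end{bmatrix} = \begin{bmatrix} r_x & r_y \end{bmatrix} \begin{bmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{bmatrix}.$$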
您可能还记得高中代数中的 2 ϗ 2 矩阵可以表示二维旋转。要将矢量 r 旋转 phi 度的角度(其中正旋转是逆时针),我们可以写
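It’s probably no surprise that rotations in three dimensions can be represented by a 3 × 3 matrix. The two-dimensional example above is really just a three-dimensional rotation about the z-axis, so we can write
$$\begin{bmatrix} r'_x & r'_y & r'_z \end{bmatrix} = \begin{bmatrix} r_x & r_y & r_z \end{bmatrix} \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$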
三维旋转可以用 3 ϗ 3 矩阵表示,这可能并不奇怪。上面的二维例子实际上只是绕 z 轴的三维旋转,所以我们可以写
The question naturally arises: Can a 3 × 3 matrix be used to represent translations? Sadly, the answer is no. The result of translating a point r by a translation t requires adding the components of t to the components of r individually:
$$\mathbf{r} + \mathbf{t} = \begin{bmatrix} (r_x + t_x) & (r_y + t_y) & (r_z + t_z) \end{bmatrix}.$$
问题自然而然地出现了:可以用 3 ϗ 3 矩阵来表示平移吗?遗憾的是,答案是否定的。通过平移 t 平移点 r 的结果需要将 t 的分量分别添加到 r 的分量中:
Matrix multiplication involves multiplication and addition of matrix elements, so the idea of using multiplication for translation seems promising. But, unfortunately, there is no way to arrange the components of t within a 3 × 3 matrix such that the result of multiplying it with the column vector r yields sums like (r_x + t_x).
矩阵乘法涉及矩阵元素的乘法和加法,因此使用乘法进行平移的想法似乎很有前途。但不幸的是,没有办法将 t 的分量排列在 3 × 3 矩阵内,使其与列向量 r 相乘的结果得到类似 (r_x + t_x) 的和。
The good news is that we can obtain sums like this if we use a 4 × 4 matrix. What would such a matrix look like? Well, we know that we don’t want any rotational effects, so the upper 3 × 3 should contain an identity matrix. If we arrange the components of t across the bottom-most row of the matrix and set the fourth element of the r vector (usually called w) equal to 1, then taking the dot product of the vector r with column 1 of the matrix will yield (1 · r_x) + (0 · r_y) + (0 · r_z) + (t_x · 1), which is exactly what we want. If the bottom right-hand corner of the matrix contains a 1 and the rest of the fourth column contains zeros, then the resulting vector will also have a 1 in its w component. Here’s what the final 4 × 4 translation matrix looks like:
$$\mathbf{r} + \mathbf{t} = \begin{bmatrix} r_x & r_y & r_z & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ t_x & t_y & t_z & 1 \end{bmatrix} = \begin{bmatrix} (r_x + t_x) & (r_y + t_y) & (r_z + t_z) & 1 \end{bmatrix}.$$
好消息是,如果我们使用 4 ϗ 4 矩阵,我们可以获得这样的总和。这样的矩阵会是什么样子?好吧,我们知道我们不想要任何旋转效应,所以上面的 3 ϗ 3 应该包含一个单位矩阵。如果我们将 t 的分量排列在矩阵的最底行,并将 r 向量的第四个元素(通常称为 w)设置为 1,则将向量 r 与矩阵的第 1 列进行点积:产量 (1 · r x ) + (0 · r y ) + (0 · r z ) + (t x · 1),这正是我们想要的。如果矩阵的右下角包含 1 并且第四列的其余部分包含 0,则所得向量的 w 分量中也将包含 1。最终的 4 ϗ 4 平移矩阵如下所示:
When a point or vector is extended from three dimensions to four in this manner, we say that it has been written in homogeneous coordinates. A point in homogeneous coordinates always has w = 1. Most of the 3D matrix math done by game engines is performed using 4 × 4 matrices with four-element points and vectors written in homogeneous coordinates.
当一个点或向量以这种方式从三维扩展到四维时,我们说它已经被写成齐次坐标。齐次坐标中的点始终具有 w = 1。游戏引擎完成的大多数 3D 矩阵数学运算都是使用 4 ϗ 4 矩阵(具有四元素点和以齐次坐标编写的向量)来执行。
Mathematically, points (position vectors) and direction vectors are treated in subtly different ways. When transforming a point by a matrix, the translation, rotation and scale of the matrix are all applied to the point. But when transforming a direction by a matrix, the translational effects of the matrix are ignored. This is because direction vectors have no translation per se—applying a translation to a direction would alter its magnitude, which is usually not what we want.
从数学上讲,点(位置向量)和方向向量的处理方式略有不同。当通过矩阵变换点时,矩阵的平移、旋转和缩放都会应用于该点。但是当通过矩阵变换方向时,矩阵的平移效应被忽略。这是因为方向向量本身没有平移——对方向应用平移会改变其大小,这通常不是我们想要的。
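In homogeneous coordinates, we achieve this by defining points to have their w components equal to one, while direction vectors have their w components equal to zero. In the example below, notice how the w = 0 component of the vector v multiplies with the t vector in the matrix, thereby eliminating translation in the final result:
$$\begin{bmatrix} \mathbf{v} & 0 \end{bmatrix} \begin{bmatrix} \mathbf{U} & \mathbf{0} \\ \mathbf{t} & 1 \end{bmatrix} = \begin{bmatrix} (\mathbf{vU} + 0\,\mathbf{t}) & 0 \end{bmatrix} = \begin{bmatrix} \mathbf{vU} & 0 \end{bmatrix}.$$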
在齐次坐标中,我们通过定义点使其 w 分量等于 1,而方向向量使其 w 分量等于 0 来实现这一点。在下面的示例中,请注意向量 v 的 w = 0 分量如何与矩阵中的 t 向量相乘,从而消除最终结果中的平移:
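The effect of the w component can be seen in a short plain-Python sketch. The function vec_mat and the sample matrix (a 90-degree rotation about z combined with a made-up translation in the bottom row, following the row-vector layout used here) are illustrative assumptions.

```python
def vec_mat(v, m):
    """Row vector times matrix: result[c] = sum over r of v[r] * m[r][c]."""
    return [sum(v[r] * m[r][c] for r in range(len(v)))
            for c in range(len(m[0]))]


# Hypothetical transform: upper 3 x 3 rotates 90 degrees about z,
# bottom row translates by (5, -2, 7).
transform = [[0, 1, 0, 0],
             [-1, 0, 0, 0],
             [0, 0, 1, 0],
             [5, -2, 7, 1]]

point = [1, 2, 3, 1]       # w = 1: rotation and translation both apply
direction = [1, 2, 3, 0]   # w = 0: translation is multiplied by zero

moved_point = vec_mat(point, transform)
rotated_direction = vec_mat(direction, transform)

print(moved_point, rotated_direction)
```

Both results are rotated, but only the point picks up the translation stored in the bottom row.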
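Technically, a point in homogeneous (four-dimensional) coordinates can be converted into non-homogeneous (three-dimensional) coordinates by dividing the x, y and z components by the w component:
$$\begin{bmatrix} x & y & z & w \end{bmatrix} \equiv \begin{bmatrix} \dfrac{x}{w} & \dfrac{y}{w} & \dfrac{z}{w} \end{bmatrix}.$$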
从技术上讲,齐次(四维)坐标中的点可以通过将 x、y 和 z 分量除以 w 分量来转换为非齐次(三维)坐标:
This sheds some light on why we set a point’s w component to one and a vector’s w component to zero. Dividing by w = 1 has no effect on the coordinates of a point, but dividing a pure direction vector’s components by w = 0 would yield infinity. A point at infinity in 4D can be rotated but not translated, because no matter what translation we try to apply, the point will remain at infinity. So in effect, a pure direction vector in three-dimensional space acts like a point at infinity in four-dimensional homogeneous space.
这揭示了为什么我们将点的 w 分量设置为 1,将向量的 w 分量设置为零。除以 w = 1 对点的坐标没有影响,但将纯方向向量的分量除以 w = 0 将产生无穷大。 4D 中的无穷远点可以旋转但不能平移,因为无论我们尝试应用什么平移,该点都将保持在无穷远。因此,实际上,三维空间中的纯方向向量就像四维齐次空间中的无穷远点。
Any affine transformation matrix can be created by simply concatenating a sequence of 4 × 4 matrices representing pure translations, pure rotations, pure scale operations and/or pure shears. These atomic transformation building blocks are presented below. (We’ll omit shear from these discussions, as it tends to be used only rarely in games.)
任何仿射变换矩阵都可以通过简单地连接表示纯平移、纯旋转、纯缩放操作和/或纯剪切的 4 ϗ 4 矩阵序列来创建。这些原子转换构建块如下所示。 (我们将在这些讨论中省略剪切,因为它在游戏中很少使用。)
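Notice that all affine 4 × 4 transformation matrices can be partitioned into four components: an upper 3 × 3 matrix U, which represents the rotation and/or scale; a 1 × 3 translation vector t; a 3 × 1 vector of zeros; and a scalar 1 in the bottom-right corner:
$$\mathbf{M}_{\text{affine}} = \begin{bmatrix} \mathbf{U}_{3 \times 3} & \mathbf{0}_{3 \times 1} \\ \mathbf{t}_{1 \times 3} & 1 \end{bmatrix}.$$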
请注意,所有仿射 4 ϗ 4 变换矩阵都可以分为四个分量:
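When a point is multiplied by a matrix that has been partitioned like this, the result is as follows:
$$\begin{bmatrix} \mathbf{r}_{1 \times 3} & 1 \end{bmatrix} \begin{bmatrix} \mathbf{U}_{3 \times 3} & \mathbf{0}_{3 \times 1} \\ \mathbf{t}_{1 \times 3} & 1 \end{bmatrix} = \begin{bmatrix} (\mathbf{rU} + \mathbf{t}) & 1 \end{bmatrix}.$$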
当一个点乘以这样划分的矩阵时,结果如下:
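The following matrix translates a point by the vector t:
$$\mathbf{r} + \mathbf{t} = \begin{bmatrix} r_x & r_y & r_z & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ t_x & t_y & t_z & 1 \end{bmatrix} = \begin{bmatrix} (r_x + t_x) & (r_y + t_y) & (r_z + t_z) & 1 \end{bmatrix},$$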
下面的矩阵通过向量 t 平移一个点:
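or in partitioned shorthand:
$$\begin{bmatrix} \mathbf{r} & 1 \end{bmatrix} \begin{bmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{t} & 1 \end{bmatrix} = \begin{bmatrix} (\mathbf{r} + \mathbf{t}) & 1 \end{bmatrix}.$$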
或分区速记:
To invert a pure translation matrix, simply negate the vector t (i.e., negate t_x, t_y and t_z).
要反转纯平移矩阵,只需对向量 t 取负(即对 t x 、 t y 和 t z 求反)。
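A pure translation matrix and its inverse can be sketched in plain Python as follows. The helper names (translation, vec_mat, mat_mul) and the sample offsets are hypothetical; the layout follows the row-vector convention, with t in the bottom row.

```python
def translation(tx, ty, tz):
    """4 x 4 pure translation for row vectors: t occupies the bottom row."""
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [tx, ty, tz, 1]]


def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[r] * m[r][c] for r in range(len(v)))
            for c in range(len(m[0]))]


def mat_mul(a, b):
    """Matrix product AB for matrices stored as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))]
            for r in range(len(a))]


T = translation(5, -2, 7)
T_inv = translation(-5, 2, -7)   # inverse: negate t_x, t_y and t_z

moved = vec_mat([1, 2, 3, 1], T)
restored = vec_mat(moved, T_inv)
round_trip = mat_mul(T, T_inv)

print(moved, restored, round_trip)
```

Concatenating the translation with its negated counterpart yields the identity matrix, so no general-purpose inversion routine is needed for this case.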
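All 4 × 4 pure rotation matrices have the form
$$\begin{bmatrix} \mathbf{r} & 1 \end{bmatrix} \begin{bmatrix} \mathbf{R} & \mathbf{0} \\ \mathbf{0} & 1 \end{bmatrix} = \begin{bmatrix} \mathbf{rR} & 1 \end{bmatrix}.$$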
所有 4 ϗ 4 纯旋转矩阵具有以下形式
The t vector is zero, and the upper 3 × 3 matrix R contains cosines and sines of the rotation angle, measured in radians.
t 向量为零,上面的 3 ϗ 3 矩阵 R 包含旋转角的余弦和正弦(以弧度为单位)。
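The following matrix represents rotation about the x-axis by an angle ϕ:
$$\mathrm{rotate}_x(\mathbf{r}, \phi) = \begin{bmatrix} r_x & r_y & r_z & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\phi & \sin\phi & 0 \\ 0 & -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$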
以下矩阵表示绕 x 轴旋转角度 phi。
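The matrix below represents rotation about the y-axis by an angle θ. (Notice that this one is transposed relative to the other two: the positive and negative sine terms have been reflected across the diagonal.)
$$\mathrm{rotate}_y(\mathbf{r}, \theta) = \begin{bmatrix} r_x & r_y & r_z & 1 \end{bmatrix} \begin{bmatrix} \cos\theta & 0 & -\sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ \sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$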
下面的矩阵表示绕 y 轴旋转角度 θ。 (请注意,这一项相对于其他两项进行了调换——正正弦项和负正弦项已反映在对角线上。)
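The following matrix represents rotation about the z-axis by an angle γ:
$$\mathrm{rotate}_z(\mathbf{r}, \gamma) = \begin{bmatrix} r_x & r_y & r_z & 1 \end{bmatrix} \begin{bmatrix} \cos\gamma & \sin\gamma & 0 & 0 \\ -\sin\gamma & \cos\gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \tag{5.6}$$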
以下矩阵表示绕 z 轴旋转角度 γ:
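The three axis rotations can be written out in plain Python to see which way each one turns. This is only a sketch under the row-vector convention; the function names rotate_x, rotate_y, rotate_z and vec_mat are made up for illustration.

```python
import math


def rotate_x(phi):
    """4 x 4 rotation about the x-axis by phi radians (row-vector layout)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, s, 0.0],
            [0.0, -s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotate_y(theta):
    """4 x 4 rotation about the y-axis; note the flipped sine signs."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, -s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotate_z(gamma):
    """4 x 4 rotation about the z-axis by gamma radians."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[c, s, 0.0, 0.0],
            [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[r] * m[r][c] for r in range(len(v)))
            for c in range(len(m[0]))]


quarter_turn = math.pi / 2.0

# Direction vectors (w = 0) along the coordinate axes.
about_x = vec_mat([0.0, 1.0, 0.0, 0.0], rotate_x(quarter_turn))  # y turns toward z
about_y = vec_mat([0.0, 0.0, 1.0, 0.0], rotate_y(quarter_turn))  # z turns toward x
about_z = vec_mat([1.0, 0.0, 0.0, 0.0], rotate_z(quarter_turn))  # x turns toward y

print(about_x, about_y, about_z)
```

Up to floating-point round-off, a positive quarter turn carries y onto z, z onto x and x onto y, respectively.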
Here are a few observations about these matrices:
以下是对这些矩阵的一些观察:
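The following matrix scales the point r by a factor of s_x along the x-axis, s_y along the y-axis and s_z along the z-axis:
$$\mathbf{rS} = \begin{bmatrix} r_x & r_y & r_z & 1 \end{bmatrix} \begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} s_x r_x & s_y r_y & s_z r_z & 1 \end{bmatrix},$$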
以下矩阵将点 r 沿 x 轴缩放为 s x 倍,沿 y 轴缩放为 s y 倍,沿 z 轴缩放为 s z 倍-轴:
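or in partitioned shorthand:
$$\begin{bmatrix} \mathbf{r} & 1 \end{bmatrix} \begin{bmatrix} \mathbf{S}_{3 \times 3} & \mathbf{0} \\ \mathbf{0} & 1 \end{bmatrix} = \begin{bmatrix} \mathbf{rS}_{3 \times 3} & 1 \end{bmatrix}.$$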
或分区速记:
Here are some observations about this kind of matrix:
以下是关于此类矩阵的一些观察:
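One property of scale matrices that is easy to verify is that the inverse simply uses the reciprocal of each scale factor. The sketch below uses plain Python; the helper names (scale, vec_mat, mat_mul) and the sample factors are assumptions chosen for illustration.

```python
def scale(sx, sy, sz):
    """4 x 4 pure scale matrix: the factors sit on the diagonal."""
    return [[sx, 0.0, 0.0, 0.0],
            [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[r] * m[r][c] for r in range(len(v)))
            for c in range(len(m[0]))]


def mat_mul(a, b):
    """Matrix product AB for matrices stored as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))]
            for r in range(len(a))]


S = scale(2.0, 4.0, 0.5)              # nonuniform scale (hypothetical factors)
S_inv = scale(1 / 2.0, 1 / 4.0, 1 / 0.5)   # reciprocal of each factor

scaled = vec_mat([1.0, 1.0, 1.0, 1.0], S)
round_trip = mat_mul(S, S_inv)

print(scaled, round_trip)
```

When all three factors are equal, the scale is uniform and shapes keep their proportions; with differing factors, as here, the scale is nonuniform.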
The rightmost column of an affine 4 × 4 matrix always contains the vector [0 0 0 1]ᵀ. As such, game programmers often omit the fourth column to save memory. You’ll encounter 4 × 3 affine matrices frequently in game math libraries.
仿射 4 ϗ 4 矩阵的最右列始终包含向量 [0 0 0 1] T 。因此,游戏程序员经常省略第四列以节省内存。在游戏数学库中,您会经常遇到 4 ϗ 3 仿射矩阵。
We’ve seen how to apply transformations to points and direction vectors using 4 × 4 matrices. We can extend this idea to rigid objects by realizing that such an object can be thought of as an infinite collection of points. Applying a transformation to a rigid object is like applying that same transformation to every point within the object. For example, in computer graphics an object is usually represented by a mesh of triangles, each of which has three vertices represented by points. In this case, the object can be transformed by applying a transformation matrix to all of its vertices in turn.
我们已经了解了如何使用 4 ϗ 4 矩阵对点和方向向量应用变换。我们可以通过认识到刚性物体可以被认为是点的无限集合来将这个想法扩展到刚性物体。对刚性对象应用变换就像对对象内的每个点应用相同的变换一样。例如,在计算机图形学中,对象通常由三角形网格表示,每个三角形具有由点表示的三个顶点。在这种情况下,可以通过依次将变换矩阵应用于对象的所有顶点来变换对象。
We said above that a point is a vector whose tail is fixed to the origin of some coordinate system. This is another way of saying that a point (position vector) is always expressed relative to a set of coordinate axes. The triplet of numbers representing a point changes numerically whenever we select a new set of coordinate axes. In Figure 5.17, we see a point P represented by two different position vectors: the vector P_A gives the position of P relative to the “A” axes, while the vector P_B gives the position of that same point relative to a different set of axes “B.”
上面我们说过,点是一个向量,其尾部固定在某个坐标系的原点。这是另一种说法,点(位置向量)总是相对于一组坐标轴来表示。每当我们选择一组新的坐标轴时,表示点的三元组数字就会发生数字变化。在图 5.17 中,我们看到一个点 P 由两个不同的位置向量表示——向量 P A 给出了 P 相对于“A”轴的位置,而向量 P B 给出同一点相对于不同轴组“B”的位置。
In physics, a set of coordinate axes represents a frame of reference, so we sometimes refer to a set of axes as a coordinate frame (or just a frame). People in the game industry also use the term coordinate space (or simply space) to refer to a set of coordinate axes. In the following sections, we’ll look at a few of the most common coordinate spaces used in games and computer graphics.
在物理学中,一组坐标轴代表一个参考系,因此我们有时将一组轴称为坐标系(或简称为参考系)。游戏行业的人们也使用术语坐标空间(或简称空间)来指代一组坐标轴。在下面的部分中,我们将了解游戏和计算机图形学中使用的一些最常见的坐标空间。
When a triangle mesh is created in a tool such as Maya or 3DStudioMAX, the positions of the triangles’ vertices are specified relative to a Cartesian coordinate system, which we call model space (also known as object space or local space). The model-space origin is usually placed at a central location within the object, such as at its center of mass, or on the ground between the feet of a humanoid or animal character.
当在 Maya 或 3DStudioMAX 等工具中创建三角形网格时,三角形顶点的位置是相对于笛卡尔坐标系指定的,我们将其称为模型空间(也称为对象空间或局部空间)。模型空间原点通常放置在对象内的中心位置,例如其质心,或者人形或动物角色的脚之间的地面上。
Most game objects have an inherent directionality. For example, an airplane has a nose, a tail fin and wings that correspond to the front, up and left/right directions. The model-space axes are usually aligned to these natural directions on the model, and they’re given intuitive names to indicate their directionality as illustrated in Figure 5.18.
大多数游戏对象都有固有的方向性。例如,飞机有机头、尾翼和机翼,分别对应于前、上、左/右方向。模型空间轴通常与模型上的这些自然方向对齐,并且它们被赋予直观的名称来指示它们的方向性,如图 5.18 所示。
The mapping between the (front, up, left) labels and the (x, y, z) axes is completely arbitrary. A common choice when working with right-handed axes is to assign the label front to the positive z-axis, the label left to the positive x-axis and the label up to the positive y-axis (or in terms of unit basis vectors, F = k, L = i and U = j). However, it’s equally common for +x to be front and +z to be right (F = i, R = k, U = j). I’ve also worked with engines in which the z-axis is oriented vertically. The only real requirement is that you stick to one convention consistently throughout your engine.
(前、上、左)标签和(x、y、z)轴之间的映射是完全任意的。使用右手轴时的常见选择是将标签前面分配给正 z 轴,将标签分配给左侧正 x 轴,将标签分配给正 y 轴(或按照单位基向量) ,F = k,L = i 且 U = j)。然而,+x 位于前面且 +z 位于右侧同样常见(F = i、R = k、U = j)。我还使用过 z 轴垂直定向的引擎。唯一真正的要求是您在整个引擎中始终坚持一种约定。
As an example of how intuitive axis names can reduce confusion, consider Euler angles (pitch, yaw, roll), which are often used to describe an aircraft’s orientation. It’s not possible to define pitch, yaw, and roll angles in terms of the (i, j, k) basis vectors because their orientation is arbitrary. However, we can define pitch, yaw and roll in terms of the (L, U, F) basis vectors, because their orientations are clearly defined. Specifically, pitch is rotation about L (or R), yaw is rotation about U, and roll is rotation about F.
作为直观的轴名称如何减少混乱的示例,请考虑欧拉角(俯仰角、偏航角、横滚角),它们通常用于描述飞机的方向。不可能根据 (i, j, k) 基向量定义俯仰角、偏航角和滚动角,因为它们的方向是任意的。然而,我们可以根据 (L, U, F) 基向量来定义俯仰、偏航和滚转,因为它们的方向是明确定义的。具体来说,
World space is a fixed coordinate space, in which the positions, orientations and scales of all objects in the game world are expressed. This coordinate space ties all the individual objects together into a cohesive virtual world.
世界空间是一个固定的坐标空间,其中表达了游戏世界中所有物体的位置、方向和尺度。这个坐标空间将所有单独的对象连接在一起,形成一个有凝聚力的虚拟世界。
The location of the world-space origin is arbitrary, but it is often placed near the center of the playable game space to minimize the reduction in floating-point precision that can occur when (x, y, z) coordinates grow very large. Likewise, the orientation of the x-, y- and z-axes is arbitrary, although most of the engines I’ve encountered use either a y-up or a z-up convention. The y-up convention was probably an extension of the two-dimensional convention found in most mathematics textbooks, where the y-axis is shown going up and the x-axis going to the right. The z-up convention is also common, because it allows a top-down orthographic view of the game world to look like a traditional two-dimensional xy-plot.
世界空间原点的位置是任意的,但它通常放置在可玩游戏空间的中心附近,以最大限度地减少当 (x, y, z) 坐标变得非常大时可能发生的浮点精度降低。同样,x、y 和 z 轴的方向是任意的,尽管我遇到的大多数引擎都使用 y 向上或 z 向上约定。 y 轴向上约定可能是大多数数学教科书中发现的二维约定的扩展,其中 y 轴显示为向上,x 轴显示为右侧。 z 向上约定也很常见,因为它允许游戏世界的自上而下的正交视图看起来像传统的二维 xy 绘图。
As an example, let’s say that our aircraft’s left wingtip is at (5, 0, 0) in model space. (In our game, front vectors correspond to the positive z-axis in model space with y up, as shown in Figure 5.18.) Now imagine that the jet is facing down the positive x-axis in world space, with its model-space origin at some arbitrary location, such as (−25, 50, 8). Because the F vector of the airplane, which corresponds to +z in model space, is facing down the +x-axis in world space, we know that the jet has been rotated by 90 degrees about the world y-axis. So, if the aircraft were sitting at the world-space origin, its left wingtip would be at (0, 0, −5) in world space. But because the aircraft’s origin has been translated to (−25, 50, 8), the final position of the jet’s left wingtip in world space is (−25, 50, [8 − 5]) = (−25, 50, 3). This is illustrated in Figure 5.19.
举个例子,假设我们飞机的左翼尖位于模型空间中的 (5, 0, 0) 处。 (在我们的游戏中,前向量对应于模型空间中 y 向上的正 z 轴,如图 5.18 所示。)现在想象喷气机面向世界空间中的正 x 轴,其模型空间原点在某个任意位置,例如 (−25, 50, 8)。由于飞机的 F 向量(对应于模型空间中的 +z)面向世界空间中的 +x 轴,因此我们知道飞机已绕世界 y 轴旋转了 90 度。因此,如果飞机位于世界空间原点,则其左翼尖将位于世界空间中的 (0, 0, − 5) 处。但由于飞机的原点已转换为 (−25, 50, 8),因此飞机左翼尖在世界空间中的最终位置为 (−25, 50, [8 − 5]) = (−25, 50, 3 )。如图 5.19 所示。
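The wingtip arithmetic above can be reproduced with a single model-to-world matrix. The following plain-Python sketch assumes the row-vector convention, builds the matrix by hand from a 90-degree rotation about y plus the translation (−25, 50, 8), and uses made-up names (vec_mat, model_to_world) purely for illustration.

```python
import math


def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[r] * m[r][c] for r in range(len(v)))
            for c in range(len(m[0]))]


angle = math.radians(90.0)
c, s = math.cos(angle), math.sin(angle)

# Upper 3 x 3: rotation about the y-axis; bottom row: the aircraft's world position.
model_to_world = [[c, 0.0, -s, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [s, 0.0, c, 0.0],
                  [-25.0, 50.0, 8.0, 1.0]]

wingtip_model = [5.0, 0.0, 0.0, 1.0]   # a point, so w = 1
front_model = [0.0, 0.0, 1.0, 0.0]     # a direction, so w = 0

wingtip_world = vec_mat(wingtip_model, model_to_world)
front_world = vec_mat(front_model, model_to_world)

print(wingtip_world, front_world)
```

Up to floating-point round-off, the wingtip lands at (−25, 50, 3) and the front direction points down the world +x-axis, matching the reasoning in the text.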
We could of course populate our friendly skies with more than one Lear jet. In that case, all of their left wingtips would have coordinates of (5, 0, 0) in model space. But in world space, the left wingtips would have all sorts of interesting coordinates, depending on the orientation and translation of each aircraft.
当然,我们可以用不止一架李尔喷气机来填充我们友好的天空。在这种情况下,它们的所有左翼尖在模型空间中的坐标均为 (5, 0, 0)。但在世界空间中,左翼尖将具有各种有趣的坐标,具体取决于每架飞机的方向和平移。
View space (also known as camera space) is a coordinate frame fixed to the camera. The view space origin is placed at the focal point of the camera. Again, any axis orientation scheme is possible. However, a y-up convention with z increasing in the direction the camera is facing (left-handed) is typical because it allows z coordinates to represent depths into the screen. Other engines and APIs, such as OpenGL, define view space to be right-handed, in which case the camera faces towards negative z, and z coordinates represent negative depths. Two possible definitions of view space are illustrated in Figure 5.20.
视图空间(也称为相机空间)是固定在相机上的坐标系。视空间原点放置在相机的焦点处。同样,任何轴方向方案都是可能的。然而,y 向上约定(z 沿相机面向的方向(左手)增加)是典型的,因为它允许 z 坐标表示屏幕的深度。其他引擎和 API(例如 OpenGL)将视图空间定义为右手坐标系,在这种情况下相机面向负 z,并且 z 坐标表示负深度。视图空间的两种可能的定义如图 5.20 所示。
In games and computer graphics, it is often quite useful to convert an object’s position, orientation and scale from one coordinate system into another. We call this operation a change of basis.
在游戏和计算机图形学中,将对象的位置、方向和比例从一种坐标系转换为另一种坐标系通常非常有用。我们称此操作为基础变更。
Coordinate frames are relative. That is, if you want to quantify the position, orientation and scale of a set of axes in three-dimensional space, you must specify these quantities relative to some other set of axes (otherwise the numbers would have no meaning). This implies that coordinate spaces form a hierarchy—every coordinate space is a child of some other coordinate space, and the other space acts as its parent. World space has no parent; it is at the root of the coordinate-space tree, and all other coordinate systems are ultimately specified relative to it, either as direct children or more-distant relatives.
坐标系是相对的。也就是说,如果要量化三维空间中一组轴的位置、方向和比例,则必须相对于其他一组轴指定这些量(否则这些数字将没有意义)。这意味着坐标空间形成层次结构 - 每个坐标空间都是其他某个坐标空间的子空间,而另一个空间充当其父空间。世界空间没有父空间;它位于坐标空间树的根部,所有其他坐标系最终都是相对于它指定的,无论是直接子坐标系还是更远的亲戚。
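The matrix that transforms points and directions from any child coordinate system C to its parent coordinate system P can be written M_{C→P} (pronounced “C to P”). The subscript indicates that this matrix transforms points and directions from child space to parent space. Any child-space position vector P_C can be transformed into a parent-space position vector P_P as follows:
$$\mathbf{P}_P = \mathbf{P}_C\,\mathbf{M}_{C \to P}, \qquad \mathbf{M}_{C \to P} = \begin{bmatrix} \mathbf{i}_C & 0 \\ \mathbf{j}_C & 0 \\ \mathbf{k}_C & 0 \\ \mathbf{t}_C & 1 \end{bmatrix} = \begin{bmatrix} i_{Cx} & i_{Cy} & i_{Cz} & 0 \\ j_{Cx} & j_{Cy} & j_{Cz} & 0 \\ k_{Cx} & k_{Cy} & k_{Cz} & 0 \\ t_{Cx} & t_{Cy} & t_{Cz} & 1 \end{bmatrix}.$$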
将点和方向从任何子坐标系 C 变换到其父坐标系 P 的矩阵可以写为 M C →P (发音为“C to P”)。下标表示该矩阵将点和方向从子空间变换到父空间。任何子空间位置向量 P C 都可以变换为父空间位置向量 P P ,如下所示:
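In this equation, i_C is the unit basis vector along the child-space x-axis, expressed in parent-space coordinates; j_C is the unit basis vector along the child-space y-axis, in parent space; k_C is the unit basis vector along the child-space z-axis, in parent space; and t_C is the translation of the child coordinate system relative to parent space.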
在这个等式中,
This result should not be too surprising. The t_C vector is just the translation of the child-space axes relative to parent space, so if the rest of the matrix were identity, the point (0, 0, 0) in child space would become t_C in parent space, just as we’d expect. The i_C, j_C and k_C unit vectors form the upper 3 × 3 of the matrix, which is a pure rotation matrix because these vectors are of unit length. We can see this more clearly by considering a simple example, such as a situation in which child space is rotated by an angle γ about the z-axis, with no translation. Recall from Equation (5.6) that the matrix for such a rotation is given by
$$\mathrm{rotate}_z(\mathbf{r}, \gamma) = \begin{bmatrix} r_x & r_y & r_z & 1 \end{bmatrix} \begin{bmatrix} \cos\gamma & \sin\gamma & 0 & 0 \\ -\sin\gamma & \cos\gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
这个结果应该不会太令人意外。 t C 向量只是子空间轴相对于父空间的平移,因此如果矩阵的其余部分是恒等的,则子空间中的点 (0, 0, 0) 将变为 t C 在父空间中,正如我们所期望的那样。 i C 、 j C 和 k C 单位向量形成矩阵的上 3 ϗ 3 ,这是一个纯旋转矩阵,因为这些向量是单位长度。通过考虑一个简单的例子,我们可以更清楚地看到这一点,例如子空间绕 z 轴旋转角度 γ 的情况,而不进行平移。回想一下方程 (5.6),这种旋转的矩阵由下式给出
But in Figure 5.21, we can see that the coordinates of the i_C and j_C vectors, expressed in parent space, are i_C = [cos γ  sin γ  0] and j_C = [−sin γ  cos γ  0]. When we plug these vectors into our formula for M_{C→P}, with k_C = [0 0 1], it exactly matches the matrix rotate_z(r, γ) from Equation (5.6).
但在图 5.21 中,我们可以看到 i C 和 j C 向量在父空间中表示的坐标为 i C = [cos γ sin γ 0] 和 j C = [−sin γ cos γ 0]。当我们将这些向量代入 M C→P 的公式时,其中 k C = [0 0 1],它与矩阵旋转 z (r, γ) 来自方程 (5.6)。
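Building a change of basis matrix from the child-space basis vectors and translation can be sketched in plain Python. The function names change_of_basis and vec_mat, the 30-degree angle and the translation values are hypothetical; the basis vectors go in as rows because we are using row vectors.

```python
import math


def change_of_basis(i_c, j_c, k_c, t_c):
    """Child-to-parent matrix: basis vectors and translation stored as rows."""
    return [list(i_c) + [0.0],
            list(j_c) + [0.0],
            list(k_c) + [0.0],
            list(t_c) + [1.0]]


def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[r] * m[r][c] for r in range(len(v)))
            for c in range(len(m[0]))]


# Child space rotated by gamma about z, expressed in parent-space coordinates.
gamma = math.radians(30.0)
i_c = (math.cos(gamma), math.sin(gamma), 0.0)
j_c = (-math.sin(gamma), math.cos(gamma), 0.0)
k_c = (0.0, 0.0, 1.0)
t_c = (4.0, 5.0, 6.0)   # hypothetical child origin in parent space

child_to_parent = change_of_basis(i_c, j_c, k_c, t_c)

origin_in_parent = vec_mat([0.0, 0.0, 0.0, 1.0], child_to_parent)
x_axis_in_parent = vec_mat([1.0, 0.0, 0.0, 0.0], child_to_parent)

print(origin_in_parent, x_axis_in_parent)
```

The child-space origin maps to t_C, and the child-space x direction maps to i_C, which is exactly how the rows of the matrix were defined.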
Scaling of the child coordinate system is accomplished by simply scaling the unit basis vectors appropriately. For example, if child space is scaled up by a factor of two, then the basis vectors i_C, j_C and k_C will be of length 2 instead of unit length.
子坐标系的缩放通过简单地适当缩放单位基向量来完成。例如,如果子空间放大两倍,则基向量 i C 、 j C 和 k C 的长度将为 2而不是单位长度。
The fact that we can build a change of basis matrix out of a translation and three Cartesian basis vectors gives us another powerful tool: Given any affine 4 × 4 transformation matrix, we can go in the other direction and extract the child-space basis vectors i_C, j_C and k_C from it by simply isolating the appropriate rows of the matrix (or columns if your math library uses column vectors).
事实上,我们可以通过平移和三个笛卡尔基向量构建基矩阵的变化,这一事实为我们提供了另一个强大的工具:给定任何仿射 4 ϗ 4 变换矩阵,我们可以朝另一个方向提取子空间基向量i C 、 j C 和 k C 通过简单地隔离矩阵的相应行(如果您的数学库使用列向量,则为列)。
This can be incredibly useful. Let’s say we are given a vehicle’s model-to-world transform as an affine 4 × 4 matrix (a very common representation). This is really just a change of basis matrix, transforming points in model space into their equivalents in world space. Let’s further assume that in our game, the positive z-axis always points in the direction that an object is facing. So, to find a unit vector representing the vehicle’s facing direction, we can simply extract k_C directly from the model-to-world matrix (by grabbing its third row). Provided the matrix contains no scale, this vector will already be normalized and ready to go.
这非常有用。假设我们将车辆的模型到世界的变换作为仿射 4 ϗ 4 矩阵(一种非常常见的表示)。这实际上只是基础矩阵的变化,将模型空间中的点转换为世界空间中的等价点。我们进一步假设在我们的游戏中,正 z 轴始终指向对象所面对的方向。因此,为了找到代表车辆面向方向的单位向量,我们可以简单地直接从模型到世界矩阵中提取 k C (通过抓取其第三行)。该向量已经标准化并准备就绪。
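Extracting the basis vectors is just a matter of slicing rows. The plain-Python sketch below assumes a row-vector, 4 × 4 list-of-rows layout and the (L, U, F) = (i, j, k) labeling; the function extract_basis and the sample matrix (a 90-degree rotation about y with a made-up translation) are illustrative only.

```python
import math


def extract_basis(m):
    """Split a row-vector affine matrix into (i_c, j_c, k_c, t_c)."""
    i_c, j_c, k_c, t_c = (tuple(row[:3]) for row in m)
    return i_c, j_c, k_c, t_c


def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


# Hypothetical model-to-world matrix with no scale.
model_to_world = [[0.0, 0.0, -1.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0, 0.0],
                  [-25.0, 50.0, 8.0, 1.0]]

left, up, front, position = extract_basis(model_to_world)

print(front, length(front), position)
```

If the matrix did contain scale, the extracted rows would need to be normalized before being used as unit directions.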
We’ve said that the matrix M_{C→P} transforms points and directions from child space into parent space. Recall that the fourth row of M_{C→P} contains t_C, the translation of the child coordinate axes relative to the parent-space axes. Therefore, another way to visualize the matrix M_{C→P} is to imagine it taking the parent coordinate axes and transforming them into the child axes. This is the reverse of what happens to points and direction vectors. In other words, if a matrix transforms vectors from child space to parent space, then it also transforms coordinate axes from parent space to child space. This makes sense when you think about it: moving a point 20 units to the right with the coordinate axes fixed is the same as moving the coordinate axes 20 units to the left with the point fixed. This concept is illustrated in Figure 5.22.
我们说过矩阵 M C→P 将点和方向从子空间转换到父空间。回想一下,M C→P 的第四行包含 t C ,即子坐标轴相对于世界空间轴的平移。因此,可视化矩阵 M C→P 的另一种方法是想象它采用父坐标轴并将它们转换为子轴。这与点和方向向量的情况相反。换句话说,如果矩阵将向量从子空间变换到父空间,那么它也会将坐标轴从父空间变换到子空间。仔细想想,这是有道理的——在坐标轴固定的情况下将点向右移动 20 个单位与在该点固定的情况下将坐标轴向左移动 20 个单位相同。这个概念如图 5.22 所示。
Of course, this is just another point of potential confusion. If you’re thinking in terms of coordinate axes, then transformations go in one direction, but if you’re thinking in terms of points and vectors, they go in the other direction! As with many confusing things in life, your best bet is probably to choose a single “canonical” way of thinking about things and stick with it. For example, in this book we’ve chosen the following conventions: transformations apply to vectors (not coordinate axes), and vectors are written as rows (not columns).
当然,这只是另一个潜在的混乱点。如果您考虑坐标轴,则变换会朝一个方向进行,但如果您考虑点和向量,则变换会朝另一个方向进行!与生活中许多令人困惑的事情一样,你最好的选择可能是选择一种“规范”的思考事物的方式并坚持下去。例如,在本书中我们选择了以下约定:
Taken together, these two conventions allow us to read sequences of matrix multiplications from left to right and have them make sense (e.g., in the expression $\mathbf{r}_D = \mathbf{r}_A \mathbf{M}_{A \to B} \mathbf{M}_{B \to C} \mathbf{M}_{C \to D}$, the B’s and C’s in effect “cancel out,” leaving only $\mathbf{r}_D = \mathbf{r}_A \mathbf{M}_{A \to D}$). Obviously if you start thinking about the coordinate axes moving around rather than the points and vectors, you either have to read the transforms from right to left, or flip one of these two conventions around. It doesn’t really matter what conventions you choose as long as you find them easy to remember and work with.
That said, it’s important to note that certain problems are easier to think about in terms of vectors being transformed, while others are easier to work with when you imagine the coordinate axes moving around. Once you get good at thinking about 3D vector and matrix math, you’ll find it pretty easy to flip back and forth between conventions as needed to suit the problem at hand.
A normal vector is a special kind of vector, because in addition to (usually!) being of unit length, it carries with it the additional requirement that it should always remain perpendicular to whatever surface or plane it is associated with. Special care must be taken when transforming a normal vector to ensure that both its length and perpendicularity properties are maintained.
In general, if a point or (non-normal) vector can be rotated from space A to space B via the 3 × 3 matrix $\mathbf{M}_{A \to B}$, then a normal vector $\mathbf{n}$ will be transformed from space A to space B via the inverse transpose of that matrix, $(\mathbf{M}_{A \to B}^{-1})^{\mathrm{T}}$. We will not prove or derive this result here (see [32, Section 3.5] for an excellent derivation). However, we will observe that if the matrix $\mathbf{M}_{A \to B}$ contains only uniform scale and no shear, then the angles between all surfaces and vectors in space B will be the same as they were in space A. In this case, the matrix $\mathbf{M}_{A \to B}$ will actually work just fine for any vector, normal or non-normal. However, if $\mathbf{M}_{A \to B}$ contains nonuniform scale or shear (i.e., is non-orthogonal), then the angles between surfaces and vectors are not preserved when moving from space A to space B. A vector that was normal to a surface in space A will not necessarily be perpendicular to that surface in space B. The inverse transpose operation accounts for this distortion, bringing normal vectors back into perpendicularity with their surfaces even when the transformation involves nonuniform scale or shear. Another way of looking at this is that the inverse transpose is required because a surface normal is really a pseudovector rather than a regular vector (see Section 5.2.4.9).
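The effect of the inverse transpose can be sketched in a few lines. The code below is one possible implementation, not the method of any particular engine: it assumes row vectors, an invertible 3 × 3 matrix, and hypothetical helper names (`transformVector`, `transformNormal`). It relies on the identity that the inverse transpose equals the cofactor matrix divided by the determinant, and that the rows of the cofactor matrix are cross products of the rows of the original matrix.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 cross(Vec3 a, Vec3 b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Transforms a row vector by a 3x3 matrix: v' = v M.
Vec3 transformVector(Vec3 v, const float M[3][3])
{
    return { v.x * M[0][0] + v.y * M[1][0] + v.z * M[2][0],
             v.x * M[0][1] + v.y * M[1][1] + v.z * M[2][1],
             v.x * M[0][2] + v.y * M[1][2] + v.z * M[2][2] };
}

// Transforms a normal by the inverse transpose of M, then renormalizes.
// Assumes M is invertible (nonzero determinant).
Vec3 transformNormal(Vec3 n, const float M[3][3])
{
    const Vec3 r0{ M[0][0], M[0][1], M[0][2] };
    const Vec3 r1{ M[1][0], M[1][1], M[1][2] };
    const Vec3 r2{ M[2][0], M[2][1], M[2][2] };

    // Rows of the cofactor matrix C, where (M^-1)^T = C / det(M).
    const Vec3 c0 = cross(r1, r2);
    const Vec3 c1 = cross(r2, r0);
    const Vec3 c2 = cross(r0, r1);
    const float det = dot(r0, c0);

    // n' = n (C / det), again treating n as a row vector.
    const Vec3 out{ (n.x * c0.x + n.y * c1.x + n.z * c2.x) / det,
                    (n.x * c0.y + n.y * c1.y + n.z * c2.y) / det,
                    (n.x * c0.z + n.y * c1.z + n.z * c2.z) / det };

    const float len = std::sqrt(dot(out, out));
    return { out.x / len, out.y / len, out.z / len };
}
```

For a nonuniform scale of 2 along x, a normal passed through `transformVector` is no longer perpendicular to the transformed surface tangent, whereas the result of `transformNormal` is.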
In the C and C++ languages, a two-dimensional array is often used to store a matrix. Recall that in C/C++ two-dimensional array syntax, the first subscript is the row and the second is the column, and the column index varies fastest as you move through memory sequentially.
float m[4][4]; // [row][col], col varies fastest

// "flatten" the array to demonstrate ordering
float* pm = &m[0][0];
ASSERT(&pm[0] == &m[0][0]);
ASSERT(&pm[1] == &m[0][1]);
ASSERT(&pm[2] == &m[0][2]);
// etc.
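We have two choices when storing a matrix in a two-dimensional C/C++ array. We can either
1. store the vectors ($\mathbf{i}_C$, $\mathbf{j}_C$, $\mathbf{k}_C$, $\mathbf{t}_C$) contiguously in memory (i.e., each row of the array contains a single vector), or
2. store the vectors strided in memory (i.e., each column of the array contains one vector).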
The benefit of approach (1) is that we can address any one of the four vectors by simply indexing into the matrix and interpreting the four contiguous values we find there as a 4-element vector. This layout also has the benefit of matching up exactly with row vector matrix equations (which is another reason why I’ve selected row vector notation for this book). Approach (2) is sometimes necessary when doing fast matrix-vector multiplies using a vector-enabled (SIMD) microprocessor, as we’ll see later in this chapter. In most game engines I’ve personally encountered, matrices are stored using approach (1), with the vectors in the rows of the two-dimensional C/C++ array. This is shown below:
float M[4][4];
M[0][0]=ix; M[0][1]=iy; M[0][2]=iz; M[0][3]=0.0f;
M[1][0]=jx; M[1][1]=jy; M[1][2]=jz; M[1][3]=0.0f;
M[2][0]=kx; M[2][1]=ky; M[2][2]=kz; M[2][3]=0.0f;
M[3][0]=tx; M[3][1]=ty; M[3][2]=tz; M[3][3]=1.0f;
The matrix looks like this when viewed in a debugger:
M[][]
  [0]
    [0] ix
    [1] iy
    [2] iz
    [3] 0.0000
  [1]
    [0] jx
    [1] jy
    [2] jz
    [3] 0.0000
  [2]
    [0] kx
    [1] ky
    [2] kz
    [3] 0.0000
  [3]
    [0] tx
    [1] ty
    [2] tz
    [3] 1.0000
One easy way to determine which layout your engine uses is to find a function that builds a 4 × 4 translation matrix. (Every good 3D math library provides such a function.) You can then inspect the source code to see where the elements of the t vector are being stored. If you don’t have access to the source code of your math library (which is pretty rare in the game industry), you can always call the function with an easy-to-recognize translation like (4, 3, 2), and then inspect the resulting matrix. If row 3 contains the values 4.0f, 3.0f, 2.0f, 1.0f, then the vectors are in the rows; otherwise, the vectors are in the columns.
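We’ve seen that a 3 × 3 matrix can be used to represent an arbitrary rotation in three dimensions. However, a matrix is not always an ideal representation of a rotation, for a number of reasons:
1. We need nine floating-point values to represent a rotation, which seems excessive considering that a rotation has only three degrees of freedom (pitch, yaw and roll).
2. Rotating a vector requires a vector-matrix multiplication, which involves three dot products, or a total of nine multiplications and six additions. We would like a rotational representation that is less expensive to apply, if possible.
3. In games and computer graphics, we often need to find rotations that are some percentage of the way between two known rotations. This is difficult to do when the rotations are expressed as matrices.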
Thankfully, there is a rotational representation that overcomes these three problems. It is a mathematical object known as a quaternion. A quaternion looks a lot like a four-dimensional vector, but it behaves quite differently. We usually write quaternions using non-italic, non-boldface type, like this: $\mathrm{q} = [q_x \ q_y \ q_z \ q_w]$.
Quaternions were developed by Sir William Rowan Hamilton in 1843 as an extension to the complex numbers. (Specifically, a quaternion may be interpreted as a four-dimensional complex number, with a single real axis and three imaginary axes represented by the imaginary numbers i, j and k. As such, a quaternion can be written in “complex form” as follows: $\mathrm{q} = iq_x + jq_y + kq_z + q_w$.) Quaternions were first used to solve problems in the area of mechanics. Technically speaking, a quaternion obeys a set of rules known as a four-dimensional normed division algebra over the real numbers. Thankfully, we won’t need to understand the details of these rather esoteric algebraic rules. For our purposes, it will suffice to know that the unit-length quaternions (i.e., all quaternions obeying the constraint $q_x^2 + q_y^2 + q_z^2 + q_w^2 = 1$) represent three-dimensional rotations.
There are a lot of great papers, web pages and presentations on quaternions available on the web for further reading. Here’s one of my favorites: http://graphics.ucsd.edu/courses/cse169_w05/CSE169_04.ppt.
A unit quaternion can be visualized as a three-dimensional vector plus a fourth scalar coordinate. The vector part $\mathbf{q}_V$ is the unit axis of rotation, scaled by the sine of the half-angle of the rotation. The scalar part $q_S$ is the cosine of the half-angle. So the unit quaternion q can be written as follows:
$$\mathrm{q} = [\mathbf{q}_V \ \ q_S] = \left[\mathbf{a}\sin\tfrac{\theta}{2} \ \ \cos\tfrac{\theta}{2}\right],$$
where a is a unit vector along the axis of rotation, and θ is the angle of rotation. The direction of the rotation follows the right-hand rule, so if your thumb points in the direction of a, positive rotations will be in the direction of your curved fingers.
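Of course, we can also write q as a simple four-element vector:
$$\mathrm{q} = [q_x \ q_y \ q_z \ q_w], \quad \text{where} \quad q_x = q_{Vx} = a_x \sin\tfrac{\theta}{2}, \quad q_y = q_{Vy} = a_y \sin\tfrac{\theta}{2}, \quad q_z = q_{Vz} = a_z \sin\tfrac{\theta}{2}, \quad q_w = q_S = \cos\tfrac{\theta}{2}.$$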
A unit quaternion is very much like an axis+angle representation of a rotation (i.e., a four-element vector of the form [a θ]). However, quaternions are more convenient mathematically than their axis+angle counterparts, as we shall see below.
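The relationship between the axis+angle form and the unit quaternion can be sketched directly from the definition above. The types `Vec3` and `Quat` and the function `quatFromAxisAngle` are hypothetical names used only for this illustration; the axis is assumed to be of unit length and the angle to be in radians.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };   // stored as [x y z w]

// Builds q = [a sin(theta/2), cos(theta/2)].
// Assumes `axis` is already unit length.
Quat quatFromAxisAngle(Vec3 axis, float radians)
{
    const float s = std::sin(0.5f * radians);
    const float c = std::cos(0.5f * radians);
    return { axis.x * s, axis.y * s, axis.z * s, c };
}
```

Because the axis is unit length and sin² + cos² = 1, the resulting quaternion is automatically of unit length.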
Quaternions support some of the familiar operations from vector algebra, such as magnitude and vector addition. However, we must remember that the sum of two unit quaternions does not represent a 3D rotation, because such a quaternion would not be of unit length. As a result, you won’t see any quaternion sums in a game engine, unless they are scaled in some way to preserve the unit length requirement.
One of the most important operations we will perform on quaternions is that of multiplication. Given two quaternions p and q representing two rotations P and Q, respectively, the product pq represents the composite rotation (i.e., rotation Q followed by rotation P). There are actually quite a few different kinds of quaternion multiplication, but we’ll restrict this discussion to the variety used in conjunction with 3D rotations, namely the Grassmann product. Using this definition, the product pq is defined as follows:
$$\mathrm{pq} = \left[(p_S \mathbf{q}_V + q_S \mathbf{p}_V + \mathbf{p}_V \times \mathbf{q}_V) \ \ (p_S q_S - \mathbf{p}_V \cdot \mathbf{q}_V)\right].$$
Notice how the Grassmann product is defined in terms of a vector part, which ends up in the x, y and z components of the resultant quaternion, and a scalar part, which ends up in the w component.
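A direct transcription of the product into code might look like the sketch below. The type `Quat` and the function `quatMul` are hypothetical names; the layout follows this book’s [x y z w] ordering, with the vector part expanded component by component.

```cpp
struct Quat { float x, y, z, w; };   // [x y z w]: vector part, then scalar part

// Grassmann product pq (applies rotation q first, then rotation p).
Quat quatMul(const Quat& p, const Quat& q)
{
    return {
        // vector part: pS*qV + qS*pV + (pV x qV)
        p.w * q.x + q.w * p.x + (p.y * q.z - p.z * q.y),
        p.w * q.y + q.w * p.y + (p.z * q.x - p.x * q.z),
        p.w * q.z + q.w * p.z + (p.x * q.y - p.y * q.x),
        // scalar part: pS*qS - (pV . qV)
        p.w * q.w - (p.x * q.x + p.y * q.y + p.z * q.z)
    };
}
```

A quick sanity check is that the imaginary units behave as expected: multiplying i by j yields k, while multiplying j by i yields −k, which shows that the product is not commutative.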
The inverse of a quaternion q is denoted $\mathrm{q}^{-1}$ and is defined as a quaternion that, when multiplied by the original, yields the scalar 1 (i.e., $\mathrm{q}\mathrm{q}^{-1} = 0i + 0j + 0k + 1$). The quaternion [0 0 0 1] represents a zero rotation (which makes sense since sin(0) = 0 for the first three components, and cos(0) = 1 for the last component).
In order to calculate the inverse of a quaternion, we must first define a quantity known as the conjugate. This is usually denoted $\mathrm{q}^*$ and it is defined as follows:
$$\mathrm{q}^* = [-\mathbf{q}_V \ \ q_S].$$
In other words, we negate the vector part but leave the scalar part unchanged.
Given this definition of the quaternion conjugate, the inverse quaternion $\mathrm{q}^{-1}$ is defined as follows:
$$\mathrm{q}^{-1} = \frac{\mathrm{q}^*}{|\mathrm{q}|^2}.$$
Our quaternions are always of unit length (i.e., |q| = 1), because they represent 3D rotations. So, for our purposes, the inverse and the conjugate are identical:
$$\mathrm{q}^{-1} = \mathrm{q}^* = [-\mathbf{q}_V \ \ q_S] \quad \text{when } |\mathrm{q}| = 1.$$
This fact is incredibly useful, because it means we can always avoid doing the (relatively expensive) division by the squared magnitude when inverting a quaternion, as long as we know a priori that the quaternion is normalized. This also means that inverting a quaternion is generally much faster than inverting a 3 × 3 matrix, a fact that you may be able to leverage in some situations when optimizing your engine.
The conjugate of a quaternion product (pq) is equal to the reverse product of the conjugates of the individual quaternions:
$$(\mathrm{pq})^* = \mathrm{q}^* \mathrm{p}^*.$$
Likewise, the inverse of a quaternion product is equal to the reverse product of the inverses of the individual quaternions:
$$(\mathrm{pq})^{-1} = \mathrm{q}^{-1} \mathrm{p}^{-1}. \qquad (5.8)$$
This is analogous to the reversal that occurs when transposing or inverting matrix products.
How can we apply a quaternion rotation to a vector? The first step is to rewrite the vector in quaternion form. A vector is a sum involving the unit basis vectors i, j and k. A quaternion is a sum involving i, j and k, but with a fourth scalar term as well. So it makes sense that a vector can be written as a quaternion with its scalar term $q_S$ equal to zero. Given the vector $\mathbf{v}$, we can write a corresponding quaternion $\mathrm{v} = [\mathbf{v} \ \ 0] = [v_x \ v_y \ v_z \ 0]$.
In order to rotate a vector $\mathbf{v}$ by a quaternion q, we premultiply the vector (written in its quaternion form v) by q and then post-multiply it by the inverse quaternion $\mathrm{q}^{-1}$. Therefore, the rotated vector $\mathbf{v}'$ can be found as follows:
$$\mathrm{v}' = \mathrm{rotate}(\mathrm{q}, \mathbf{v}) = \mathrm{q}\mathrm{v}\mathrm{q}^{-1}.$$
This is equivalent to using the quaternion conjugate, because our quaternions are always unit length:
$$\mathrm{v}' = \mathrm{rotate}(\mathrm{q}, \mathbf{v}) = \mathrm{q}\mathrm{v}\mathrm{q}^*. \qquad (5.9)$$
The rotated vector $\mathbf{v}'$ is obtained by simply extracting it from its quaternion form $\mathrm{v}'$ (i.e., by taking the x, y and z components and discarding the scalar term, which is zero).
Quaternion multiplication can be useful in all sorts of situations in real games. For example, let’s say that we want to find a unit vector describing the direction in which an aircraft is flying. We’ll further assume that in our game, the positive z-axis always points toward the front of an object by convention. So the forward unit vector of any object in model space is always $\mathbf{F}_M \equiv [0 \ 0 \ 1]$ by definition. To transform this vector into world space, we can simply take our aircraft’s orientation quaternion q and use it with Equation (5.9) to rotate our model-space vector $\mathbf{F}_M$ into its world-space equivalent $\mathbf{F}_W$ (after converting these vectors into quaternion form, of course):
$$\mathrm{F}_W = \mathrm{q}\,\mathrm{F}_M\,\mathrm{q}^{-1} = \mathrm{q}\,[0 \ 0 \ 1 \ 0]\,\mathrm{q}^{-1}.$$
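The aircraft example can be sketched as follows. This is an illustrative implementation rather than any engine’s actual API: `Vec3`, `Quat`, `quatMul`, `quatConjugate` and `rotate` are hypothetical names, and the orientation quaternion is assumed to be of unit length so that the conjugate can stand in for the inverse.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };

// Grassmann product pq.
Quat quatMul(const Quat& p, const Quat& q)
{
    return { p.w * q.x + q.w * p.x + (p.y * q.z - p.z * q.y),
             p.w * q.y + q.w * p.y + (p.z * q.x - p.x * q.z),
             p.w * q.z + q.w * p.z + (p.x * q.y - p.y * q.x),
             p.w * q.w - (p.x * q.x + p.y * q.y + p.z * q.z) };
}

// Conjugate: negate the vector part, keep the scalar part.
Quat quatConjugate(const Quat& q) { return { -q.x, -q.y, -q.z, q.w }; }

// Rotates v by the unit quaternion q: v' = q v q*.
Vec3 rotate(const Quat& q, Vec3 v)
{
    const Quat vq{ v.x, v.y, v.z, 0.0f };            // vector in quaternion form
    const Quat r = quatMul(quatMul(q, vq), quatConjugate(q));
    return { r.x, r.y, r.z };                        // scalar part is zero
}
```

For an aircraft yawed 90 degrees about the y-axis, rotating the model-space forward vector [0 0 1] in this way yields a world-space forward vector along the positive x-axis.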
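Rotations can be concatenated in exactly the same way that matrix-based transformations can, by multiplying the quaternions together. For example, consider three distinct rotations, represented by the quaternions $\mathrm{q}_1$, $\mathrm{q}_2$ and $\mathrm{q}_3$, with matrix equivalents $\mathbf{R}_1$, $\mathbf{R}_2$ and $\mathbf{R}_3$. We want to apply rotation 1 first, followed by rotation 2 and finally rotation 3. The composite rotation matrix $\mathbf{R}_{\mathrm{net}}$ can be found and applied to a vector $\mathbf{v}$ as follows:
$$\mathbf{R}_{\mathrm{net}} = \mathbf{R}_1 \mathbf{R}_2 \mathbf{R}_3; \qquad \mathbf{v}' = \mathbf{v}\mathbf{R}_1 \mathbf{R}_2 \mathbf{R}_3 = \mathbf{v}\mathbf{R}_{\mathrm{net}}.$$
Likewise, the composite rotation quaternion $\mathrm{q}_{\mathrm{net}}$ can be found and applied to vector $\mathbf{v}$ (in its quaternion form, v) as follows:
$$\mathrm{q}_{\mathrm{net}} = \mathrm{q}_3 \mathrm{q}_2 \mathrm{q}_1; \qquad \mathrm{v}' = \mathrm{q}_3 \mathrm{q}_2 \mathrm{q}_1 \, \mathrm{v} \, \mathrm{q}_1^{-1} \mathrm{q}_2^{-1} \mathrm{q}_3^{-1} = \mathrm{q}_{\mathrm{net}} \, \mathrm{v} \, \mathrm{q}_{\mathrm{net}}^{-1}.$$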
Notice how the quaternion product must be performed in an order opposite to that in which the rotations are applied ($\mathrm{q}_3 \mathrm{q}_2 \mathrm{q}_1$). This is because quaternion rotations always multiply on both sides of the vector, with the uninverted quaternions on the left and the inverted quaternions on the right. As we saw in Equation (5.8), the inverse of a quaternion product is the reverse product of the individual inverses, so the uninverted quaternions read right-to-left while the inverted quaternions read left-to-right.
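The ordering rule is easy to get backwards, so a small experiment can help. The sketch below uses hypothetical helpers (`quatMul`, `rotate`) and assumes unit quaternions; it applies a 90-degree rotation about x followed by a 90-degree rotation about z, once step by step and once through a single composite quaternion.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };

// Grassmann product pq.
Quat quatMul(const Quat& p, const Quat& q)
{
    return { p.w * q.x + q.w * p.x + (p.y * q.z - p.z * q.y),
             p.w * q.y + q.w * p.y + (p.z * q.x - p.x * q.z),
             p.w * q.z + q.w * p.z + (p.x * q.y - p.y * q.x),
             p.w * q.w - (p.x * q.x + p.y * q.y + p.z * q.z) };
}

// v' = q v q*, valid for unit-length q.
Vec3 rotate(const Quat& q, Vec3 v)
{
    const Quat vq{ v.x, v.y, v.z, 0.0f };
    const Quat qc{ -q.x, -q.y, -q.z, q.w };
    const Quat r = quatMul(quatMul(q, vq), qc);
    return { r.x, r.y, r.z };
}

// Composite of "q1 first, then q2": note the reversed order, q2 * q1.
Quat concatenate(const Quat& q1, const Quat& q2)
{
    return quatMul(q2, q1);
}
```

Rotating [0 1 0] with `concatenate(q1, q2)` gives the same result as `rotate(q2, rotate(q1, v))`, namely [0 0 1], whereas multiplying in the other order gives [−1 0 0].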
We can convert any 3D rotation freely between a 3 × 3 matrix representation $\mathbf{R}$ and a quaternion representation q. If we let $\mathrm{q} = [\mathbf{q}_V \ \ q_S] = [q_{Vx} \ q_{Vy} \ q_{Vz} \ q_S] = [x \ y \ z \ w]$, then we can find $\mathbf{R}$ as follows:
$$\mathbf{R} = \begin{bmatrix} 1 - 2y^2 - 2z^2 & 2xy + 2zw & 2xz - 2yw \\ 2xy - 2zw & 1 - 2x^2 - 2z^2 & 2yz + 2xw \\ 2xz + 2yw & 2yz - 2xw & 1 - 2x^2 - 2y^2 \end{bmatrix}.$$
Likewise, given $\mathbf{R}$, we can find q as follows (where q[0] = $q_{Vx}$, q[1] = $q_{Vy}$, q[2] = $q_{Vz}$ and q[3] = $q_S$). This code assumes that we are using row vectors in C/C++ (i.e., that the rows of the array R[][] correspond to the rows of the matrix $\mathbf{R}$ shown above). The code was adapted from a Gamasutra article by Nick Bobic, published on July 5, 1998, which is available here: http://www.gamasutra.com/view/feature/3278/rotating_objects_using_quaternions.php. For a discussion of some even faster methods for converting a matrix to a quaternion, leveraging various assumptions about the nature of the matrix, see http://www.euclideanspace.com/maths/geometry/rotations/conversions/matrixToQuaternion/index.htm.
#include <math.h>

void matrixToQuaternion(
    const float R[3][3],
    float q[/*4*/])
{
    float trace = R[0][0] + R[1][1] + R[2][2];

    // check the diagonal
    if (trace > 0.0f)
    {
        // w is well away from zero here: trace + 1 = 4w^2
        float s = sqrtf(trace + 1.0f);
        q[3] = s * 0.5f;

        float t = 0.5f / s;
        // row-vector convention: R[1][2] - R[2][1] = 4xw, and so on
        q[0] = (R[1][2] - R[2][1]) * t;
        q[1] = (R[2][0] - R[0][2]) * t;
        q[2] = (R[0][1] - R[1][0]) * t;
    }
    else
    {
        // trace is zero or negative: pick the largest diagonal element
        int i = 0;
        if (R[1][1] > R[0][0]) i = 1;
        if (R[2][2] > R[i][i]) i = 2;

        static const int NEXT[3] = {1, 2, 0};
        int j = NEXT[i];
        int k = NEXT[j];

        // R[i][i] - R[j][j] - R[k][k] + 1 = 4 * q[i]^2
        float s = sqrtf((R[i][i]
                         - (R[j][j] + R[k][k]))
                        + 1.0f);
        q[i] = s * 0.5f;

        float t;
        if (s != 0.0f) t = 0.5f / s;
        else           t = s;

        q[3] = (R[j][k] - R[k][j]) * t;
        q[j] = (R[j][i] + R[i][j]) * t;
        q[k] = (R[k][i] + R[i][k]) * t;
    }
}
Let’s pause for a moment to consider notational conventions. In this book, we write our quaternions like this: [x y z w]. This differs from the [w x y z] convention found in many academic papers on quaternions as an extension of the complex numbers. Our convention arises from an effort to be consistent with the common practice of writing homogeneous vectors as [x y z 1] (with the w = 1 at the end). The academic convention arises from the parallels between quaternions and complex numbers. Regular two-dimensional complex numbers are typically written in the form c = a + jb, and the corresponding quaternion notation is q = w + ix + jy + kz. So be careful out there—make sure you know which convention is being used before you dive into a paper head first!
Rotational interpolation has many applications in the animation, dynamics and camera systems of a game engine. With the help of quaternions, rotations can be easily interpolated just as vectors and points can.
The easiest and least computationally intensive approach is to perform a four-dimensional vector LERP on the quaternions you wish to interpolate. Given two quaternions $\mathrm{q}_A$ and $\mathrm{q}_B$ representing rotations A and B, we can find an intermediate rotation $\mathrm{q}_{\mathrm{LERP}}$ that is β percent of the way from A to B as follows:
$$\mathrm{q}_{\mathrm{LERP}} = \mathrm{LERP}(\mathrm{q}_A, \mathrm{q}_B, \beta) = \frac{(1 - \beta)\,\mathrm{q}_A + \beta\,\mathrm{q}_B}{\left|(1 - \beta)\,\mathrm{q}_A + \beta\,\mathrm{q}_B\right|}.$$
Figure 5.23. Linear interpolation (LERP) between two quaternions $\mathrm{q}_A$ and $\mathrm{q}_B$.
Notice that the resultant interpolated quaternion had to be renormalized. This is necessary because the LERP operation does not preserve a vector’s length in general.
Geometrically, $\mathrm{q}_{\mathrm{LERP}} = \mathrm{LERP}(\mathrm{q}_A, \mathrm{q}_B, \beta)$ is the quaternion whose orientation lies β percent of the way from orientation A to orientation B, as shown (in two dimensions for clarity) in Figure 5.23. Mathematically, the LERP operation results in a weighted average of the two quaternions, with weights (1 − β) and β (notice that these two weights sum to 1).
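A normalized quaternion LERP might be sketched as follows. The names `Quat` and `quatLerp` are hypothetical, and the sketch assumes the weighted sum is never zero (which would only happen for exactly opposite quaternions at β = 0.5).

```cpp
#include <cmath>

struct Quat { float x, y, z, w; };

// Weighted average of a and b with weights (1 - beta) and beta,
// renormalized so that the result is again a unit quaternion.
Quat quatLerp(const Quat& a, const Quat& b, float beta)
{
    const float wa = 1.0f - beta;
    const Quat r{ wa * a.x + beta * b.x,
                  wa * a.y + beta * b.y,
                  wa * a.z + beta * b.z,
                  wa * a.w + beta * b.w };

    // LERP does not preserve length, so divide by the magnitude.
    const float len = std::sqrt(r.x * r.x + r.y * r.y + r.z * r.z + r.w * r.w);
    return { r.x / len, r.y / len, r.z / len, r.w / len };
}
```

Halfway between the identity and a 90-degree rotation about z, this produces a 45-degree rotation about z; at the exact midpoint, LERP and SLERP agree.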
The problem with the LERP operation is that it does not take account of the fact that quaternions are really points on a four-dimensional hypersphere. A LERP effectively interpolates along a chord of the hypersphere, rather than along the surface of the hypersphere itself. This leads to rotation animations that do not have a constant angular speed when the parameter β is changing at a constant rate. The rotation will appear slower at the end points and faster in the middle of the animation.
To solve this problem, we can use a variant of the LERP operation known as spherical linear interpolation, or SLERP for short. The SLERP operation uses sines and cosines to interpolate along a great circle of the 4D hypersphere, rather than along a chord, as shown in Figure 5.24. This results in a constant angular speed when β varies at a constant rate.
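The formula for SLERP is similar to the LERP formula, but the weights (1 − β) and β are replaced with weights $w_p$ and $w_q$ involving sines of the angle between the two quaternions.
$$\mathrm{SLERP}(\mathrm{p}, \mathrm{q}, \beta) = w_p\,\mathrm{p} + w_q\,\mathrm{q},$$
where
$$w_p = \frac{\sin\big((1 - \beta)\theta\big)}{\sin\theta}, \qquad w_q = \frac{\sin(\beta\theta)}{\sin\theta}.$$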
The cosine of the angle between any two unit-length quaternions can be found by taking their four-dimensional dot product. Once we know cos θ, we can calculate the angle θ and the various sines we need quite easily:
$$\cos\theta = \mathrm{p} \cdot \mathrm{q} = p_x q_x + p_y q_y + p_z q_z + p_w q_w; \qquad \theta = \cos^{-1}(\mathrm{p} \cdot \mathrm{q}).$$
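Putting these formulas together gives a compact SLERP sketch. The names `Quat` and `quatSlerp` are hypothetical. Two details are additions not discussed in the text: the dot product is clamped to guard against rounding error before calling `acos`, and the weights fall back to plain LERP weights when sin θ is close to zero to avoid a division by zero. The sketch does not negate one input to force the shortest path.

```cpp
#include <algorithm>
#include <cmath>

struct Quat { float x, y, z, w; };

// Spherical linear interpolation between unit quaternions p and q.
Quat quatSlerp(const Quat& p, const Quat& q, float beta)
{
    // cos(theta) is the 4D dot product; clamp to keep acos() well defined.
    float cosTheta = p.x * q.x + p.y * q.y + p.z * q.z + p.w * q.w;
    cosTheta = std::min(1.0f, std::max(-1.0f, cosTheta));

    const float theta = std::acos(cosTheta);
    const float sinTheta = std::sin(theta);

    float wp, wq;
    if (sinTheta < 1.0e-4f)
    {
        // Nearly parallel inputs: use LERP weights (result is only
        // approximately unit length in this branch).
        wp = 1.0f - beta;
        wq = beta;
    }
    else
    {
        wp = std::sin((1.0f - beta) * theta) / sinTheta;
        wq = std::sin(beta * theta) / sinTheta;
    }

    return { wp * p.x + wq * q.x,
             wp * p.y + wq * q.y,
             wp * p.z + wq * q.z,
             wp * p.w + wq * q.w };
}
```

One third of the way from the identity to a 90-degree rotation about z, this yields exactly a 30-degree rotation about z, illustrating the constant angular speed that LERP lacks.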
The jury is still out on whether or not to use SLERP in a game engine. Jonathan Blow wrote a great article positing that SLERP is too expensive, and LERP’s quality is not really that bad—therefore, he suggests, we should understand SLERP but avoid it in our game engines (see http://number-none.com/product/Understanding%20Slerp,%20Then%20Not%20Using%20It/index.html). On the other hand, some of my colleagues at Naughty Dog have found that a good SLERP implementation performs nearly as well as LERP. (For example, on the PS3’s SPUs, Naughty Dog’s Ice team’s implementation of SLERP takes 20 cycles per joint, while its LERP implementation takes 16.25 cycles per joint.) Therefore, I’d personally recommend that you profile your SLERP and LERP implementations before making any decisions. If the performance hit for SLERP isn’t unacceptable, I say go for it, because it may result in slightly better-looking animations. But if your SLERP is slow (and you cannot speed it up, or you just don’t have the time to do so), then LERP is usually good enough for most purposes.
We’ve seen that rotations can be represented in quite a few different ways. This section summarizes the most common rotational representations and outlines their pros and cons. No one representation is ideal in all situations. Using the information in this section, you should be able to select the best representation for a particular application.
We briefly explored Euler angles in Section 5.3.9.1. A rotation represented via Euler angles consists of three scalar values: yaw, pitch and roll. These quantities are sometimes represented by a 3D vector $[\theta_Y \ \theta_P \ \theta_R]$.
The benefits of this representation are its simplicity, its small size (three floating-point numbers) and its intuitive nature: yaw, pitch and roll are easy to visualize. You can also easily interpolate simple rotations about a single axis. For example, it’s trivial to find intermediate rotations between two distinct yaw angles by linearly interpolating the scalar $\theta_Y$. However, Euler angles cannot be interpolated easily when the rotation is about an arbitrarily oriented axis.
In addition, Euler angles are prone to a condition known as gimbal lock. This occurs when a 90-degree rotation causes one of the three principal axes to “collapse” onto another principal axis. For example, if you rotate by 90 degrees about the x-axis, the y-axis collapses onto the z-axis. This prevents any further rotations about the original y-axis, because rotations about y and z have effectively become equivalent.
Another problem with Euler angles is that the order in which the rotations are performed around each axis matters. The order could be PYR, YPR, RYP and so on, and each ordering may produce a different composite rotation. No one standard rotation order exists for Euler angles across all disciplines (although certain disciplines do follow specific conventions). So the rotation angles $[\theta_Y \ \theta_P \ \theta_R]$ do not uniquely define a particular rotation; you need to know the rotation order to interpret these numbers properly.
A final problem with Euler angles is that they depend upon the mapping from the x-, y- and z-axes onto the natural front, left/right and up directions for the object being rotated. For example, yaw is always defined as rotation about the up axis, but without additional information we cannot tell whether this corresponds to a rotation about x, y or z.
A 3 × 3 matrix is a convenient and effective rotational representation for a number of reasons. It does not suffer from gimbal lock, and it can represent arbitrary rotations uniquely. Rotations can be applied to points and vectors in a straightforward manner via matrix multiplication (i.e., a series of dot products). Most CPUs and all GPUs now have built-in support for hardware-accelerated dot products and matrix multiplication. Rotations can also be reversed by finding an inverse matrix, which for a pure rotation matrix is the same thing as finding the transpose, a trivial operation. And 4 × 4 matrices offer a way to represent arbitrary affine transformations (rotations, translations and scaling) in a totally consistent way.
However, rotation matrices are not particularly intuitive. Looking at a big table of numbers doesn’t help one picture the corresponding transformation in three-dimensional space. Also, rotation matrices are not easily interpolated. Finally, a rotation matrix takes up a lot of storage (nine floating-point numbers) relative to Euler angles (three floats).
We can represent a rotation as a unit vector defining the axis of rotation, plus a scalar for the angle of rotation. This is known as an axis+angle representation, and it is sometimes denoted by the four-dimensional vector $[\mathbf{a} \ \ \theta] = [a_x \ a_y \ a_z \ \theta]$, where $\mathbf{a}$ is the axis of rotation and θ the angle in radians. In a right-handed coordinate system, the direction of a positive rotation is defined by the right-hand rule, while in a left-handed system, we use the left-hand rule instead.
The benefits of the axis+angle representation are that it is reasonably intuitive and also compact. (It only requires four floating-point numbers, as opposed to the nine required for a 3 × 3 matrix.)
One important limitation of the axis+angle representation is that rotations cannot be easily interpolated. Also, rotations in this format cannot be applied to points and vectors in a straightforward way—one needs to convert the axis+angle representation into a matrix or quaternion first.
As we’ve seen, a unit-length quaternion can represent 3D rotations in a manner analogous to the axis+angle representation. The primary difference between the two representations is that a quaternion’s axis of rotation is scaled by the sine of the half-angle of rotation, and instead of storing the angle in the fourth component of the vector, we store the cosine of the half-angle.
The quaternion formulation provides two immense benefits over the axis+angle representation. First, it permits rotations to be concatenated and applied directly to points and vectors via quaternion multiplication. Second, it permits rotations to be easily interpolated via simple LERP or SLERP operations. Its small size (four floating-point numbers) is also a benefit over the matrix formulation.
By itself, a quaternion can only represent a rotation, whereas a 4 × 4 matrix can represent an arbitrary affine transformation (rotation, translation and scale). When a quaternion is combined with a translation vector and a scale factor (either a scalar for uniform scaling or a vector for nonuniform scaling), then we have a viable alternative to the 4 × 4 matrix representation of affine transformations. We sometimes call this an SRT transform, because it contains a scale factor, a rotation quaternion and a translation vector. (It’s also sometimes called an SQT, because the rotation is a quaternion.)
$$\mathrm{SRT} = [\,s \ \ \mathrm{q} \ \ \mathbf{t}\,] \quad \text{(uniform scale } s\text{)},$$
or
$$\mathrm{SRT} = [\,\mathbf{s} \ \ \mathrm{q} \ \ \mathbf{t}\,] \quad \text{(nonuniform scale vector } \mathbf{s}\text{)}.$$
SRT transforms are widely used in computer animation because of their smaller size (eight floats for uniform scale, or ten floats for nonuniform scale, as opposed to the 12 floating-point numbers needed for a 4 × 3 matrix) and their ability to be easily interpolated. The translation vector and scale factor are interpolated via LERP, and the quaternion can be interpolated with either LERP or SLERP.
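To make the storage and interpolation claims concrete, here is a minimal sketch of a uniform-scale SQT. All type and function names (`SqtTransform`, `lerpFloat`, `lerpVec3`, `nlerp`, `blend`) are hypothetical. The rotation is blended with a normalized LERP rather than SLERP, purely to keep the example short.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };

// Hypothetical uniform-scale SQT layout: 1 + 4 + 3 = 8 floats.
struct SqtTransform
{
    float scale;        // S: uniform scale factor
    Quat  rotation;     // Q: unit-length rotation quaternion
    Vec3  translation;  // T: translation vector
};
static_assert(sizeof(SqtTransform) == 8 * sizeof(float), "uniform-scale SQT packs into eight floats");

inline float lerpFloat(float a, float b, float t) { return a + t * (b - a); }

inline Vec3 lerpVec3(const Vec3& a, const Vec3& b, float t)
{
    return Vec3{ lerpFloat(a.x, b.x, t), lerpFloat(a.y, b.y, t), lerpFloat(a.z, b.z, t) };
}

// Component-wise LERP followed by renormalization. The sign flip selects the
// shorter arc, because q and -q represent the same rotation.
inline Quat nlerp(const Quat& a, Quat b, float t)
{
    const float d = a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
    if (d < 0.0f) { b = Quat{ -b.x, -b.y, -b.z, -b.w }; }
    const Quat q{ lerpFloat(a.x, b.x, t), lerpFloat(a.y, b.y, t),
                  lerpFloat(a.z, b.z, t), lerpFloat(a.w, b.w, t) };
    const float len = std::sqrt(q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w);
    return Quat{ q.x / len, q.y / len, q.z / len, q.w / len };
}

// Blend two transforms: LERP for scale and translation, normalized LERP for rotation.
inline SqtTransform blend(const SqtTransform& a, const SqtTransform& b, float t)
{
    return SqtTransform{ lerpFloat(a.scale, b.scale, t),
                         nlerp(a.rotation, b.rotation, t),
                         lerpVec3(a.translation, b.translation, t) };
}
```

Each component is interpolated independently, which is what makes this representation convenient for animation blending; doing the same with raw matrix elements would not generally produce a valid rotation.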
A rigid transformation is a transformation involving a rotation and a translation—a “corkscrew” motion. Such transformations are prevalent in animation and robotics. A rigid transformation can be represented using a mathematical object known as a dual quaternion. The dual quaternion representation offers a number of benefits over the typical vector-quaternion representation. The key benefit is that linear interpolation blending can be performed in a constant-speed, shortest-path, coordinate-invariant manner, similar to using LERP for translation vectors and SLERP for rotational quaternions (see Section 5.4.5.1), but in a way that is easily generalizable to blends involving three or more transforms.
A dual quaternion is like an ordinary quaternion, except that its four components are dual numbers instead of regular real-valued numbers. A dual number can be written as the sum of a non-dual part and a dual part as follows: $\hat{a} = a + \varepsilon b$. Here $\varepsilon$ is a magical number called the dual unit, defined in such a way that $\varepsilon^2 = 0$ (yet without $\varepsilon$ itself being zero). This is analogous to the imaginary number $j = \sqrt{-1}$ used when writing a complex number as the sum of a real and an imaginary part: $c = a + jb$.
Because each dual number can be represented by two real numbers (the non-dual and dual parts, $a$ and $b$), a dual quaternion can be represented by an eight-element vector. It can also be represented as the sum of two ordinary quaternions, where the second one is multiplied by the dual unit, as follows: $\hat{\mathbf{q}} = \mathbf{q}_a + \varepsilon\,\mathbf{q}_b$.
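The defining rule that the dual unit squares to zero is easy to see in code. The following is a minimal sketch of dual-number arithmetic; the `Dual` type and its operators are hypothetical names chosen for illustration, and it covers only addition and multiplication.

```cpp
#include <cassert>

// A dual number a + eps*b, stored as its two real coefficients.
struct Dual
{
    double real;  // non-dual part a
    double dual;  // dual part b (the coefficient of eps)
};

constexpr Dual operator+(Dual lhs, Dual rhs)
{
    return Dual{ lhs.real + rhs.real, lhs.dual + rhs.dual };
}

// (a + eps*b)(c + eps*d) = ac + eps*(ad + bc) + eps^2*bd, and eps^2 = 0,
// so the bd term vanishes.
constexpr Dual operator*(Dual lhs, Dual rhs)
{
    return Dual{ lhs.real * rhs.real, lhs.real * rhs.dual + lhs.dual * rhs.real };
}
```

Multiplying `Dual{0.0, 1.0}` by itself gives `Dual{0.0, 0.0}`, even though neither factor is zero. A dual quaternion could be built along the same lines by pairing two ordinary quaternions, one per coefficient.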
A full discussion of dual numbers and dual quaternions is beyond our scope here. However, the excellent paper entitled “Dual Quaternions for Rigid Transformation Blending” by Kavan et al. outlines the theory and practice of using dual quaternions to represent rigid transformations; it is available online at https://bit.ly/2vjD5sz. Note that in this paper, a dual number is written in the form $\hat{a} = a_0 + \varepsilon a_\varepsilon$, whereas I have used $a + \varepsilon b$ above to underscore the similarity between dual numbers and complex numbers.¹
The term “degrees of freedom” (or DOF for short) refers to the number of mutually independent ways in which an object’s physical state (position and orientation) can change. You may have encountered the phrase “six degrees of freedom” in fields such as mechanics, robotics and aeronautics. This refers to the fact that a three-dimensional object (whose motion is not artificially constrained) has three degrees of freedom in its translation (along the x-, y- and z-axes) and three degrees of freedom in its rotation (about the x-, y- and z-axes), for a total of six degrees of freedom.
The DOF concept will help us to understand how different rotational representations can employ different numbers of floating-point parameters, yet all specify rotations with only three degrees of freedom. For example, Euler angles require three floats, but axis+angle and quaternion representations use four floats, and a 3 × 3 matrix takes up nine floats. How can these representations all describe 3-DOF rotations?
The answer lies in constraints. All 3D rotational representations employ three or more floating-point parameters, but some representations also have one or more constraints on those parameters. The constraints indicate that the parameters are not independent: a change to one parameter induces changes to the other parameters in order to maintain the validity of the constraint(s). If we subtract the number of constraints from the number of floating-point parameters, we arrive at the number of degrees of freedom, and this number should always be three for a 3D rotation:
$$N_{\mathrm{DOF}} = N_{\mathrm{parameters}} - N_{\mathrm{constraints}}. \qquad (5.10)$$
The following list shows Equation (5.10) in action for each of the rotational representations we’ve encountered in this book.
Euler angles: 3 parameters − 0 constraints = 3 DOF.
Axis+angle: 4 parameters − 1 constraint = 3 DOF. Constraint: Axis is constrained to be unit length.
Quaternion: 4 parameters − 1 constraint = 3 DOF. Constraint: Quaternion is constrained to be unit length.
3 × 3 matrix: 9 parameters − 6 constraints = 3 DOF. Constraints: All three rows and all three columns must be of unit length (when treated as three-element vectors).
As game engineers, we will encounter a host of other mathematical objects in addition to points, vectors, matrices and quaternions. This section briefly outlines the most common of these.
An infinite line can be represented by a point $\mathbf{P}_0$ plus a unit vector $\mathbf{u}$ in the direction of the line. A parametric equation of a line traces out every possible point $\mathbf{P}$ along the line by starting at the initial point $\mathbf{P}_0$ and moving an arbitrary distance $t$ along the direction of the unit vector $\mathbf{u}$. The infinitely large set of points $\mathbf{P}$ becomes a vector function of the scalar parameter $t$:
$$\mathbf{P}(t) = \mathbf{P}_0 + t\,\mathbf{u}, \quad \text{where } -\infty < t < \infty. \qquad (5.11)$$
This is depicted in Figure 5.25.
A ray is a line that extends to infinity in only one direction. This is easily expressed as P(t) with the constraint t ≥ 0, as shown in Figure 5.26.
A line segment is bounded at both ends by $\mathbf{P}_0$ and $\mathbf{P}_1$. It too can be represented by $\mathbf{P}(t)$, in either one of the following two ways (where $\mathbf{L} = \mathbf{P}_1 - \mathbf{P}_0$, $L = |\mathbf{L}|$ is the length of the line segment, and $\mathbf{u} = (1/L)\,\mathbf{L}$ is a unit vector in the direction of $\mathbf{L}$):
1. $\mathbf{P}(t) = \mathbf{P}_0 + t\,\mathbf{u}$, where $0 \le t \le L$, or
2. $\mathbf{P}(t) = \mathbf{P}_0 + t\,\mathbf{L}$, where $0 \le t \le 1$.
The latter format, depicted in Figure 5.27, is particularly convenient because the parameter $t$ is normalized; in other words, $t$ always goes from zero to one, no matter which particular line segment we are dealing with. This means we do not have to store the length $L$ (the upper bound on $t$ in the first format) in a separate floating-point parameter; it is already encoded in the vector $\mathbf{L} = L\,\mathbf{u}$ (which we have to store anyway).
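The parametric forms for lines, rays and segments translate almost directly into code. The sketch below uses a hypothetical `Vec3` type and hypothetical helper names; clamping out-of-range values of t in `pointOnSegment` is a design choice made for this example, not something the text prescribes.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(const Vec3& a, const Vec3& b) { return Vec3{ a.x + b.x, a.y + b.y, a.z + b.z }; }
inline Vec3 operator-(const Vec3& a, const Vec3& b) { return Vec3{ a.x - b.x, a.y - b.y, a.z - b.z }; }
inline Vec3 operator*(float s, const Vec3& v)       { return Vec3{ s * v.x, s * v.y, s * v.z }; }

// P(t) = P0 + t*u. For an infinite line t is unrestricted;
// for a ray, callers are expected to pass t >= 0.
inline Vec3 pointOnLine(const Vec3& p0, const Vec3& unitDir, float t)
{
    return p0 + t * unitDir;
}

// Segment in normalized form: P(t) = P0 + t*L with L = P1 - P0 and 0 <= t <= 1.
// Out-of-range t is clamped so the result always lies on the segment.
inline Vec3 pointOnSegment(const Vec3& p0, const Vec3& p1, float t)
{
    const float tc = (t < 0.0f) ? 0.0f : (t > 1.0f ? 1.0f : t);
    return p0 + tc * (p1 - p0);
}
```

Note that `pointOnSegment` never needs the segment length: the unnormalized vector P1 − P0 carries both the direction and the extent.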
Spheres are ubiquitous in game engine programming. A sphere is typically defined as a center point $\mathbf{C}$ plus a radius $r$, as shown in Figure 5.28. This packs nicely into a four-element vector, $[\,C_x \;\; C_y \;\; C_z \;\; r\,]$. As we saw when we discussed SIMD vector processing, there are distinct benefits to being able to pack data into a vector containing four 32-bit floats (i.e., a 128-bit package).
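As an illustration of the packed layout, a sphere might be declared as below. The `Sphere` type and `sphereContainsPoint` function are hypothetical, and the 16-byte alignment is an assumption that suits typical 128-bit SIMD loads rather than a requirement stated in the text.

```cpp
#include <cassert>

// Center (cx, cy, cz) and radius r packed into four 32-bit floats.
struct alignas(16) Sphere
{
    float cx, cy, cz;
    float r;
};
static_assert(sizeof(Sphere) == 16, "sphere should occupy one 128-bit package");

// Point-in-sphere test; comparing squared distances avoids a square root.
inline bool sphereContainsPoint(const Sphere& s, float px, float py, float pz)
{
    const float dx = px - s.cx;
    const float dy = py - s.cy;
    const float dz = pz - s.cz;
    return dx * dx + dy * dy + dz * dz <= s.r * s.r;
}
```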
A plane is a 2D surface in 3D space. As you may recall from high-school algebra, the equation of a plane is often written as follows:
$$Ax + By + Cz + D = 0.$$
This equation is satisfied only for the locus of points P = [x y z] that lie on the plane.
Planes can be represented by a point P0 and a unit vector n that is normal to the plane. This is sometimes called point-normal form, as depicted in Figure 5.29.
It’s interesting to note that when the parameters $A$, $B$ and $C$ from the traditional plane equation are interpreted as a 3D vector, that vector lies in the direction of the plane normal. If the vector $[\,A \;\; B \;\; C\,]$ is normalized to unit length, then the normalized vector $[\,a \;\; b \;\; c\,] = \mathbf{n}$, and the normalized parameter $d = D / \sqrt{A^2 + B^2 + C^2}$ is just the distance from the plane to the origin. The sign of $d$ is positive if the plane’s normal vector $\mathbf{n}$ is pointing toward the origin (i.e., the origin is on the “front” side of the plane) and negative if the normal is pointing away from the origin (i.e., the origin is “behind” the plane).
Another way of looking at this is that the plane equation and the point-normal form are really just two ways of writing the same equation. Imagine testing whether or not an arbitrary point $\mathbf{P} = [\,x \;\; y \;\; z\,]$ lies on the plane. To do this, we find the signed distance from point $\mathbf{P}$ to the origin along the normal $\mathbf{n} = [\,a \;\; b \;\; c\,]$, and if this signed distance is equal to the signed distance $d = -\mathbf{n} \cdot \mathbf{P}_0$ of the plane from the origin, then $\mathbf{P}$ must lie on the plane. So let’s set them equal and expand some terms:
$$\begin{aligned}
-\mathbf{n} \cdot \mathbf{P} &= -\mathbf{n} \cdot \mathbf{P}_0 \\
\mathbf{n} \cdot \mathbf{P} - \mathbf{n} \cdot \mathbf{P}_0 &= 0 \\
ax + by + cz - \mathbf{n} \cdot \mathbf{P}_0 &= 0 \\
ax + by + cz + d &= 0. \qquad (5.12)
\end{aligned}$$
Equation (5.12) only holds when the point $\mathbf{P}$ lies on the plane. But what happens when the point $\mathbf{P}$ does not lie on the plane? In this case, the left-hand side of the plane equation ($ax + by + cz + d$, which is equal to $\mathbf{n} \cdot \mathbf{P} - \mathbf{n} \cdot \mathbf{P}_0$) tells how far “off” the point is from being on the plane. This expression calculates the difference between the distance from $\mathbf{P}$ to the origin and the distance from the plane to the origin, both measured along $\mathbf{n}$. In other words, the left-hand side of Equation (5.12) gives us the perpendicular distance $h$ between the point and the plane! This is just another way to write Equation (5.2) from Section 5.2.4.7: $h = (\mathbf{P} - \mathbf{P}_0) \cdot \mathbf{n} = ax + by + cz + d$.
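The link between point-normal form and the signed distance can be sketched as follows. The `Vec3` and `Plane` types and both function names are hypothetical; the sketch assumes the normal passed in is already unit length, with d = −(n · P0) as in the derivation above.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Plane stored as [a b c d]: unit normal n = [a b c] and d = -(n . P0).
struct Plane
{
    Vec3  n;
    float d;
};

// Convert point-normal form into the packed [n d] form.
inline Plane planeFromPointNormal(const Vec3& p0, const Vec3& unitNormal)
{
    return Plane{ unitNormal, -dot(unitNormal, p0) };
}

// h = n . P + d = ax + by + cz + d.
// Zero on the plane, positive on the side the normal points toward.
inline float signedDistance(const Plane& plane, const Vec3& p)
{
    return dot(plane.n, p) + plane.d;
}
```

For a plane through (0, 5, 0) with normal (0, 1, 0), d is −5, which matches the sign rule: the normal points away from the origin, so the origin is "behind" the plane.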
A plane can actually be packed into a four-element vector, much like a sphere can. To do so, we observe that to describe a plane uniquely, we need only the normal vector n = [a b c] and the distance from the origin d. The four-element vector L = [n d] = [a b c d] is a compact and convenient way to represent and store a plane in memory. Note that when P is written in homogeneous coordinates with w = 1, the equation (L · P) = 0 is yet another way of writing (n · P) = −d. These equations are satisfied for all points P that lie on the plane L.
Planes defined in four-element vector form can be easily transformed from one coordinate space to another. Given a matrix $\mathbf{M}_{A \to B}$ that transforms points and (non-normal) vectors from space A to space B, we already know that to transform a normal vector such as the plane’s $\mathbf{n}$ vector, we need to use the inverse transpose of that matrix, $(\mathbf{M}_{A \to B}^{-1})^{\mathrm{T}}$. So it shouldn’t be a big surprise to learn that applying the inverse transpose of a matrix to a four-element plane vector $\mathbf{L}$ will, in fact, correctly transform that plane from space A to space B. We won’t derive or prove this result any further here, but a thorough explanation of why this little “trick” works is provided in Section 4.2.3 of [32].
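A general 4 × 4 inverse transpose is too long to show here, but the special case of a rigid transform (rotation plus translation) makes the result easy to check. In that case the inverse transpose of the rotation part is the rotation itself, so the normal rotates like any other vector and only d needs adjusting. The sketch below is written under that assumption; `RigidTransform` and the function names are hypothetical, and the code uses a column-vector convention (x_B = R x_A + t) purely for brevity.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Plane { Vec3 n; float d; };  // n . x + d = 0

// Hypothetical rigid transform from space A to space B: x_B = R * x_A + t,
// with R stored as the three rows of an orthonormal rotation matrix.
struct RigidTransform
{
    Vec3 row0, row1, row2;
    Vec3 t;
};

inline Vec3 rotate(const RigidTransform& m, const Vec3& v)
{
    return Vec3{ dot(m.row0, v), dot(m.row1, v), dot(m.row2, v) };
}

inline Vec3 transformPoint(const RigidTransform& m, const Vec3& p)
{
    const Vec3 r = rotate(m, p);
    return Vec3{ r.x + m.t.x, r.y + m.t.y, r.z + m.t.z };
}

// Substituting x_A = R^T (x_B - t) into n_A . x_A + d_A = 0 gives
//   (R n_A) . x_B + (d_A - (R n_A) . t) = 0,
// so n_B = R n_A and d_B = d_A - n_B . t.
inline Plane transformPlane(const RigidTransform& m, const Plane& planeA)
{
    const Vec3 nB = rotate(m, planeA.n);
    return Plane{ nB, planeA.d - dot(nB, m.t) };
}
```

A quick sanity check is to transform a point known to lie on the plane and confirm that it satisfies the transformed plane equation. Transforms that include nonuniform scale need the full inverse transpose instead.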
An axis-aligned bounding box (AABB) is a 3D cuboid whose six rectangular faces are aligned with a particular coordinate frame’s mutually orthogonal axes. As such, an AABB can be represented by a six-element vector containing the minimum and maximum coordinates along each of the three principal axes, $[\,x_{\min},\ y_{\min},\ z_{\min},\ x_{\max},\ y_{\max},\ z_{\max}\,]$, or two points $\mathbf{P}_{\min}$ and $\mathbf{P}_{\max}$.
This simple representation allows for a particularly convenient and inexpensive method of testing whether a point $\mathbf{P}$ is inside or outside any given AABB. We simply test if all of the following conditions are true:
$P_x \ge x_{\min}$ and $P_x \le x_{\max}$ and
$P_y \ge y_{\min}$ and $P_y \le y_{\max}$ and
$P_z \ge z_{\min}$ and $P_z \le z_{\max}$.
Because intersection tests are so speedy, AABBs are often used as an “early out” collision check; if the AABBs of two objects do not intersect, then there is no need to do a more detailed (and more expensive) collision test.
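Both the point test and the box-versus-box "early out" check reduce to a handful of comparisons. The following sketch uses hypothetical names (`Aabb`, `aabbContainsPoint`, `aabbOverlaps`) and treats touching boxes as overlapping, which is an arbitrary choice for this example.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// AABB stored as its two extreme corners.
struct Aabb
{
    Vec3 pmin;
    Vec3 pmax;
};

// A point is inside if it lies within the box's extent on every axis.
inline bool aabbContainsPoint(const Aabb& box, const Vec3& p)
{
    return p.x >= box.pmin.x && p.x <= box.pmax.x
        && p.y >= box.pmin.y && p.y <= box.pmax.y
        && p.z >= box.pmin.z && p.z <= box.pmax.z;
}

// Two AABBs overlap only if their extents overlap on all three axes.
inline bool aabbOverlaps(const Aabb& a, const Aabb& b)
{
    return a.pmin.x <= b.pmax.x && a.pmax.x >= b.pmin.x
        && a.pmin.y <= b.pmax.y && a.pmax.y >= b.pmin.y
        && a.pmin.z <= b.pmax.z && a.pmax.z >= b.pmin.z;
}
```

If `aabbOverlaps` returns false, the two objects cannot be touching, and any more expensive test can be skipped.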
An oriented bounding box (OBB) is a cuboid that has been oriented so as to align in some logical way with the object it bounds. Usually an OBB aligns with the local-space axes of the object. Hence, it acts like an AABB in local space, although it may not necessarily align with the world-space axes.
Various techniques exist for testing whether or not a point lies within an OBB, but one common approach is to transform the point into the OBB’s “aligned” coordinate system and then use an AABB intersection test as presented above.
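One way to realize the "transform, then test as an AABB" approach is to project the point onto the box's local axes. The sketch below assumes the OBB is stored as a center, three orthonormal axes expressed in world space, and half-extents along those axes; this layout and all names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3  operator-(const Vec3& a, const Vec3& b) { return Vec3{ a.x - b.x, a.y - b.y, a.z - b.z }; }
inline float dot(const Vec3& a, const Vec3& b)       { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Obb
{
    Vec3 center;
    Vec3 axis[3];      // orthonormal local axes, expressed in world space
    Vec3 halfExtents;  // half-size of the box along each local axis
};

inline bool obbContainsPoint(const Obb& box, const Vec3& worldPoint)
{
    // Express the point in the box's local frame by projecting onto each axis.
    const Vec3 rel = worldPoint - box.center;
    const float lx = dot(rel, box.axis[0]);
    const float ly = dot(rel, box.axis[1]);
    const float lz = dot(rel, box.axis[2]);

    // In local space the OBB is an AABB centered on the origin.
    return std::fabs(lx) <= box.halfExtents.x
        && std::fabs(ly) <= box.halfExtents.y
        && std::fabs(lz) <= box.halfExtents.z;
}
```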
As shown in Figure 5.30, a frustum is a group of six planes that define a truncated pyramid shape. Frusta are commonplace in 3D rendering because they conveniently define the viewable region of the 3D world when rendered via a perspective projection from the point of view of a virtual camera. Four of the planes bound the edges of the screen space, while the other two planes represent the near and far clipping planes (i.e., they define the minimum and maximum z coordinates possible for any visible point).
One convenient representation of a frustum is as an array of six planes, each of which is represented in point-normal form (i.e., one point and one normal vector per plane).
Testing whether a point lies inside a frustum is a bit involved, but the basic idea is to use dot products to determine whether the point lies on the front or back side of each plane. If it lies inside all six planes, it is inside the frustum.
A helpful trick is to transform the world-space point being tested by applying the camera’s perspective projection to it. This takes the point from world space into a space known as homogeneous clip space. In this space, the frustum is just an axis-aligned cuboid (AABB). This permits much simpler in/out tests to be performed.
A convex polyhedral region is defined by an arbitrary set of planes, all with normals pointing inward (or outward). The test for whether a point lies inside or outside the volume defined by the planes is relatively straightforward; it is similar to a frustum test, but with possibly more planes. Convex regions are very useful for implementing arbitrarily shaped trigger regions in games. Many engines employ this technique; for example, the Quake engine’s ubiquitous brushes are just volumes bounded by planes in exactly this way.
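The same loop serves for a frustum (six planes) and for an arbitrary convex region (any number of planes). The sketch below assumes every plane is stored in packed [n d] form with its normal pointing toward the interior; the types and the function name are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Plane { Vec3 n; float d; };  // n . x + d = 0, with n pointing into the region

// A point is inside the convex region if it is on the front side
// (or exactly on the surface) of every bounding plane.
inline bool convexRegionContainsPoint(const std::vector<Plane>& planes, const Vec3& p)
{
    for (const Plane& plane : planes)
    {
        if (dot(plane.n, p) + plane.d < 0.0f)
        {
            return false;  // behind this plane, so outside the region
        }
    }
    return true;
}
```

Returning as soon as one plane rejects the point keeps the common "outside" case cheap.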
Random numbers are ubiquitous in game engines, so it behooves us to have a brief look at the two most common random number generators (RNG), the linear congruential generator and the Mersenne Twister. It’s important to realize that random number generators don’t actually generate random numbers—they merely produce a complex, but totally deterministic, predefined sequence of values. For this reason, we call the sequences they produce pseudorandom, and technically speaking we should really call them “pseudorandom number generators” (PRNG). What differentiates a good generator from a bad one is how long the sequence of numbers is before it repeats (its period), and how well the sequences hold up under various well-known randomness tests.
Linear congruential generators are a very fast and simple way to generate a sequence of pseudorandom numbers. Depending on the platform, this algorithm is sometimes used in the C standard library’s rand() function. However, your mileage may vary, so don’t count on rand() being based on any particular algorithm. If you want to be sure, you’ll be better off implementing your own random number generator.
The linear congruential algorithm is explained in detail in the book Numerical Recipes in C, so I won’t go into the details of it here.
What I will say is that this random number generator does not produce particularly high-quality pseudorandom sequences. Given the same initial seed value, the sequence is always exactly the same. The numbers produced do not meet many of the criteria widely accepted as desirable, such as a long period, low- and high-order bits that have similarly long periods, and absence of sequential or spatial correlation between the generated values.
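For reference, a linear congruential generator fits in a few lines. This sketch uses the widely published "quick and dirty" constants from Numerical Recipes (multiplier 1664525, increment 1013904223, modulus 2^32); the class name `Lcg32` is hypothetical, and no claim is made that any particular `rand()` implementation looks like this.

```cpp
#include <cassert>
#include <cstdint>

// Minimal linear congruential generator: state = a * state + c (mod 2^32).
class Lcg32
{
public:
    explicit Lcg32(std::uint32_t seed) : state_(seed) {}

    std::uint32_t next()
    {
        // Unsigned 32-bit wraparound supplies the "mod 2^32" for free.
        state_ = 1664525u * state_ + 1013904223u;
        return state_;
    }

private:
    std::uint32_t state_;
};
```

The weakness in the low-order bits is easy to observe: with an odd multiplier and an odd increment, the lowest bit of successive outputs simply alternates between 0 and 1.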
The Mersenne Twister pseudorandom number generator algorithm was designed specifically to improve upon the various problems of the linear congruential algorithm. Wikipedia provides the following description of the benefits of the algorithm:
Various implementations of the Twister are available on the web, including a particularly cool one that uses SIMD vector instructions for an extra speed boost, called SFMT (SIMD-oriented fast Mersenne Twister). SFMT can be downloaded from http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html.
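If you don't need SFMT specifically, the C++ standard library ships a Mersenne Twister engine, `std::mt19937`, in `<random>`. The helper below is a hypothetical example of using it; it does not use or represent the SFMT library.

```cpp
#include <cassert>
#include <random>

// Roll a die with the given number of sides using the standard library's
// 32-bit Mersenne Twister engine. The engine is passed in so that callers
// control seeding and can reproduce a sequence deterministically.
inline int rollDie(std::mt19937& rng, int sides)
{
    std::uniform_int_distribution<int> dist(1, sides);
    return dist(rng);
}
```

A typical call site would construct one `std::mt19937` with a fixed seed for reproducible runs and pass it by reference wherever random values are needed.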
In 1994, George Marsaglia, a computer scientist and mathematician best known for developing the Diehard battery of tests of randomness (http://www.stat.fsu.edu/pub/diehard), published a pseudorandom number generation algorithm that is much simpler to implement and runs faster than the Mersenne Twister algorithm. He claimed that it could produce a sequence of 32-bit pseudorandom numbers with a period of non-repetition of $2^{250}$. It passed all of the Diehard tests and still stands today as one of the best pseudorandom number generators for high-speed applications. He called his algorithm the Mother of All Pseudorandom Number Generators, because it seemed to him to be the only random number generator one would ever need.
Later, Marsaglia published another generator called Xorshift, which is between Mersenne and Mother-of-All in terms of randomness, but runs slightly faster than Mother.
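To give a sense of how small Xorshift is, here is a sketch of a 32-bit variant using the (13, 17, 5) shift triple, one of the combinations listed in Marsaglia's paper. The function name is hypothetical, and this is only one member of the family.

```cpp
#include <cassert>
#include <cstdint>

// One step of a 32-bit Xorshift generator with shifts (13, 17, 5).
// The state must be nonzero: a zero state maps to zero forever.
inline std::uint32_t xorshift32(std::uint32_t& state)
{
    assert(state != 0u);
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}
```

The whole generator is three shifts and three XORs on a single word of state, which is why it runs so quickly.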
Marsaglia also developed a series of random number generators that are collectively called KISS (Keep It Simple Stupid). The KISS99 algorithm is a popular choice, because it has a large period ($2^{123}$) and passes all tests in the TestU01 test suite (https://bit.ly/2r5FmSP).
You can read about George Marsaglia at http://en.wikipedia.org/wiki/George_Marsaglia, and about the Mother-of-All generator at ftp://ftp.forth.org/pub/C/mother.c and at http://www.agner.org/random. You can download a PDF of George’s paper on Xorshift at http://www.jstatsoft.org/v08/i14/paper.
Another very popular and high-quality family of pseudorandom number generators is called PCG. It works by combining a congruential generator for its state transitions (the “CG” in PCG) with permutation functions to generate its output (the “P” in PCG). You can read more about this family of PRNGs at http://www.pcg-random.org/.
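The two-stage structure (congruential state update, then an output permutation) can be seen in the following sketch, which is modeled on the minimal 32-bit variant published on the PCG site. The struct and function names here are hypothetical, and the code is meant as an illustration rather than a substitute for the reference implementation.

```cpp
#include <cassert>
#include <cstdint>

struct Pcg32
{
    std::uint64_t state;  // LCG state
    std::uint64_t inc;    // stream selector; forced odd when used
};

inline std::uint32_t pcg32Next(Pcg32& rng)
{
    const std::uint64_t old = rng.state;

    // "CG": advance the state with a 64-bit linear congruential step.
    rng.state = old * 6364136223846793005ULL + (rng.inc | 1u);

    // "P": permute the old state into a 32-bit output using an
    // xorshift followed by a data-dependent rotation.
    const std::uint32_t xorshifted = static_cast<std::uint32_t>(((old >> 18u) ^ old) >> 27u);
    const std::uint32_t rot = static_cast<std::uint32_t>(old >> 59u);
    return (xorshifted >> rot) | (xorshifted << ((32u - rot) & 31u));
}
```

The output function hides the weak low-order bits of the underlying congruential generator, since the returned value is derived from the state's high bits.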
_______________
¹ Personally I would have preferred the symbol $a_1$ over $a_0$, so that a dual number would be written $\hat{a} = (1)\,a_1 + (\varepsilon)\,a_\varepsilon$. Just as when we plot a complex number in the complex plane, we can think of the real unit as a “basis vector” along the real axis, and the dual unit $\varepsilon$ as a “basis vector” along the dual axis.