Every game engine requires some low-level support systems that manage mundane but crucial tasks, such as starting up and shutting down the engine, configuring engine and game features, managing the engine’s memory usage, handling access to file system(s), providing access to the wide range of heterogeneous asset types used by the game (meshes, textures, animations, audio, etc.), and providing debugging tools for use by the game development team. This chapter will focus on the lowest-level support systems found in most game engines. In the chapters that follow, we will explore some of the larger core systems, including resource management, human interface devices and in-game debugging tools.
A game engine is a complex piece of software consisting of many interacting subsystems. When the engine first starts up, each subsystem must be configured and initialized in a specific order. Interdependencies between subsystems implicitly define the order in which they must be started—i.e., if subsystem B depends on subsystem A, then A will need to be started up before B can be initialized. Shut-down typically occurs in the reverse order, so B would shut down first, followed by A.
Since the programming language used in most modern game engines is C++, we should briefly consider whether C++’s native start-up and shut-down semantics can be leveraged in order to start up and shut down our engine’s subsystems. In C++, global and static objects are constructed before the program’s entry point (main(), or WinMain() under Windows) is called. However, these constructors are called in a totally unpredictable order. The destructors of global and static class instances are called after main() (or WinMain()) returns, and once again they are called in an unpredictable order. Clearly this behavior is not desirable for initializing and shutting down the subsystems of a game engine, or indeed any software system that has interdependencies between its global objects.
This is somewhat unfortunate, because a common design pattern for implementing major subsystems such as the ones that make up a game engine is to define a singleton class (often called a manager) for each subsystem. If C++ gave us more control over the order in which global and static class instances were constructed and destroyed, we could define our singleton instances as globals, without the need for dynamic memory allocation. For example, we could write:
class RenderManager
{
public:
RenderManager()
{
// start up the manager…
}
~RenderManager()
{
// shut down the manager…
}
// …
};
// singleton instance
static RenderManager gRenderManager;
Alas, with no way to directly control construction and destruction order, this approach won’t work.
There is one C++ “trick” we can leverage here. A static variable that is declared within a function will not be constructed before main() is called, but rather on the first invocation of that function. So if our global singleton is function-static, we can control the order of construction for our global singletons.
class RenderManager
{
public:
// Get the one and only instance.
static RenderManager& get()
{
// This function-static will be constructed on the
// first call to this function.
static RenderManager sSingleton;
return sSingleton;
}
RenderManager()
{
// Start up other managers we depend on, by
// calling their get() functions first…
VideoManager::get();
TextureManager::get();
// Now start up the render manager.
// …
}
~RenderManager()
{
// Shut down the manager.
// …
}
};
You’ll find that many software engineering textbooks suggest this design or a variant that involves dynamic allocation of the singleton as shown below.
static RenderManager& get()
{
static RenderManager* gpSingleton = nullptr;
if (gpSingleton == nullptr)
{
gpSingleton = new RenderManager;
}
ASSERT(gpSingleton);
return *gpSingleton;
}
Unfortunately, this still gives us no way to control destruction order. It is possible that C++ will destroy one of the managers upon which the RenderManager depends for its shut-down procedure, prior to the RenderManager’s destructor being called. In addition, it’s difficult to predict exactly when the RenderManager singleton will be constructed, because the construction will happen on the first call to RenderManager::get()—and who knows when that might be? Moreover, the programmers using the class may not be expecting an innocuous-looking get() function to do something expensive, like allocating and initializing a heavyweight singleton. This is an unpredictable and dangerous design. Therefore, we are prompted to resort to a more direct approach that gives us greater control.
Let’s presume that we want to stick with the idea of singleton managers for our subsystems. In this case, the simplest “brute-force” approach is to define explicit start-up and shut-down functions for each singleton manager class. These functions take the place of the constructor and destructor, and in fact we should arrange for the constructor and destructor to do absolutely nothing. That way, the start-up and shut-down functions can be explicitly called in the required order from within main() (or from some overarching singleton object that manages the engine as a whole). For example:
class RenderManager
{
public:
RenderManager()
{
// do nothing
}
~RenderManager()
{
// do nothing
}
void startUp()
{
// start up the manager…
}
void shutDown()
{
// shut down the manager…
}
// …
};
class PhysicsManager { /* similar… */ };
class AnimationManager { /* similar… */ };
class MemoryManager { /* similar… */ };
class FileSystemManager { /* similar… */ };
// …
RenderManager gRenderManager;
PhysicsManager gPhysicsManager;
AnimationManager gAnimationManager;
TextureManager gTextureManager;
VideoManager gVideoManager;
MemoryManager gMemoryManager;
FileSystemManager gFileSystemManager;
// …
int main(int argc, const char* argv[])
{
// Start up engine systems in the correct order.
gMemoryManager.startUp();
gFileSystemManager.startUp();
gVideoManager.startUp();
gTextureManager.startUp();
gRenderManager.startUp();
gAnimationManager.startUp();
gPhysicsManager.startUp();
// …
// Run the game.
gSimulationManager.run();
// Shut everything down, in reverse order.
// …
gPhysicsManager.shutDown();
gAnimationManager.shutDown();
gRenderManager.shutDown();
gFileSystemManager.shutDown();
gMemoryManager.shutDown();
return 0;
}
There are “more elegant” ways to accomplish this. For example, you could have each manager register itself into a global priority queue and then walk this queue to start up all the managers in the proper order. You could define the manager-to-manager dependency graph by having each manager explicitly list the other managers upon which it depends and then write some code to calculate the optimal start-up order given their interdependencies. You could use the construct-on-demand approach outlined above. In my experience, the brute-force approach always wins out, because of the following:
One minor disadvantage to the brute-force manual start-up and shut-down method is that you might accidentally shut things down in an order that isn’t strictly the reverse of the start-up order. But I wouldn’t lose any sleep over it. As long as you can start up and shut down your engine’s subsystems successfully, you’re golden.
Let’s take a brief look at some examples of engine start-up and shut-down taken from real game engines.
OGRE is by its authors’ admission a rendering engine, not a game engine per se. But by necessity it provides many of the low-level features found in full-fledged game engines, including a simple and elegant start-up and shutdown mechanism. Everything in OGRE is controlled by the singleton object Ogre::Root. It contains pointers to every other subsystem in OGRE and manages their creation and destruction. This makes it very easy for a programmer to start up OGRE—just new an instance of Ogre::Root and you’re done.
Here are a few excerpts from the OGRE source code so we can see what it’s doing:
class _OgreExport Root : public Singleton<Root>
{
// <some code omitted…>
// Singletons
LogManager* mLogManager;
ControllerManager* mControllerManager;
SceneManagerEnumerator* mSceneManagerEnum;
SceneManager* mCurrentSceneManager;
DynLibManager* mDynLibManager;
ArchiveManager* mArchiveManager;
MaterialManager* mMaterialManager;
MeshManager* mMeshManager;
ParticleSystemManager* mParticleManager;
SkeletonManager* mSkeletonManager;
OverlayElementFactory* mPanelFactory;
OverlayElementFactory* mBorderPanelFactory;
OverlayElementFactory* mTextAreaFactory;
OverlayManager* mOverlayManager;
FontManager* mFontManager;
ArchiveFactory *mZipArchiveFactory;
ArchiveFactory *mFileSystemArchiveFactory;
ResourceGroupManager* mResourceGroupManager;
ResourceBackgroundQueue* mResourceBackgroundQueue;
ShadowTextureManager* mShadowTextureManager;
// etc.
};
OgreRoot.cpp
Root::Root(const String& pluginFileName,
const String& configFileName,
const String& logFileName) :
mLogManager(0),
mCurrentFrame(0),
mFrameSmoothingTime(0.0f),
mNextMovableObjectTypeFlag(1),
mIsInitialised(false)
{
// superclass will do singleton checking
String msg;
// Init
mActiveRenderer = 0;
mVersion
= StringConverter::toString(OGRE_VERSION_MAJOR)
+ StringConverter::toString(OGRE_VERSION_MINOR)
+ StringConverter::toString(OGRE_VERSION_PATCH)
+ OGRE_VERSION_SUFFIX + " "
+ "(" + OGRE_VERSION_NAME + ")";
mConfigFileName = configFileName;
// create log manager and default log file if there
// is no log manager yet
if(LogManager::getSingletonPtr() == 0)
{
mLogManager = new LogManager();
mLogManager->createLog(logFileName, true, true);
}
// dynamic library manager
mDynLibManager = new DynLibManager();
mArchiveManager = new ArchiveManager();
// ResourceGroupManager
mResourceGroupManager = new ResourceGroupManager();
// ResourceBackgroundQueue
mResourceBackgroundQueue
= new ResourceBackgroundQueue();
// and so on…
}
OGRE provides a templated Ogre::Singleton base class from which all of its singleton (manager) classes derive. If you look at its implementation, you’ll see that Ogre::Singleton does not use deferred construction but instead relies on Ogre::Root to explicitly new each singleton. As we discussed above, this is done to ensure that the singletons are created and destroyed in a well-defined order.
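To make this concrete, here is a sketch in the spirit of Ogre::Singleton (simplified, and not the actual OGRE source): the base class never constructs the instance itself; it merely records the one instance that an owner such as Ogre::Root explicitly news, and asserts if a second instance is ever created.
#include <cassert>
template <typename T>
class Singleton
{
public:
    Singleton()
    {
        // Only one instance of each derived type may ever exist.
        assert(ms_instance == nullptr);
        ms_instance = static_cast<T*>(this);
    }
    ~Singleton()
    {
        ms_instance = nullptr;
    }
    static T& getSingleton()
    {
        // The owner must have created the instance explicitly already.
        assert(ms_instance != nullptr);
        return *ms_instance;
    }
    static T* getSingletonPtr()
    {
        return ms_instance;
    }
private:
    static T* ms_instance;
};
template <typename T>
T* Singleton<T>::ms_instance = nullptr;
A manager class would then derive from this base (e.g., class LogManager : public Singleton<LogManager>), and Ogre::Root would new it at the appropriate point during start-up, exactly as in the constructor excerpt above.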
The engine created by Naughty Dog, Inc. for its Uncharted and The Last of Us series of games uses a similar explicit technique for starting up its subsystems. You’ll notice by looking at the following code that engine start-up is not always a simple sequence of allocating singleton instances. A wide range of operating system services, third-party libraries and so on must all be started up during engine initialization. Also, dynamic memory allocation is avoided wherever possible, so many of the singletons are statically allocated objects (e.g., g_fileSystem, g_languageMgr, etc.). It’s not always pretty, but it gets the job done.
Err BigInit()
{
init_exception_handler();
U8* pPhysicsHeap = new(kAllocGlobal, kAlign16)
U8[ALLOCATION_GLOBAL_PHYS_HEAP];
PhysicsAllocatorInit(pPhysicsHeap,
ALLOCATION_GLOBAL_PHYS_HEAP);
g_textDb.Init();
g_textSubDb.Init();
g_spuMgr.Init();
g_drawScript.InitPlatform();
PlatformUpdate();
thread_t init_thr;
thread_create(&init_thr, threadInit, 0, 30,
64*1024, 0, "Init");
char masterConfigFileName[256];
snprintf(masterConfigFileName,
sizeof(masterConfigFileName),
MASTER_CFG_PATH);
{
Err err = ReadConfigFromFile(
masterConfigFileName);
if (err.Failed())
{
MsgErr("Config file not found (%s).\n",
masterConfigFileName);
}
}
memset(&g_discInfo, 0, sizeof(BootDiscInfo));
int err1 = GetBootDiscInfo(&g_discInfo);
Msg("GetBootDiscInfo() : 0x%x\n", err1);
if(err1 == BOOTDISCINFO_RET_OK)
{
printf("titleId : [%s]\n",
g_discInfo.titleId);
printf("parentalLevel : [%d]\n",
g_discInfo.parentalLevel);
}
g_fileSystem.Init(g_gameInfo.m_onDisc);
g_languageMgr.Init();
if (g_shouldQuit) return Err::kOK;
// and so on…
As game developers, we are always trying to make our code run more quickly. The performance of any piece of software is dictated not only by the algorithms it employs, or the efficiency with which those algorithms are coded, but also by how the program utilizes memory (RAM). Memory affects performance in two ways:
1. Dynamic memory allocation via malloc() or C++’s global operator new is a very slow operation. We can improve the performance of our code by either avoiding dynamic allocation altogether or by making use of custom memory allocators that greatly reduce allocation costs.
2. On modern CPUs, the performance of a piece of software is often dominated by its memory access patterns. Data that is located in small, contiguous blocks of memory can be operated on much more efficiently by the CPU than data that is spread out across a wide range of memory addresses.
In this section, we’ll learn how to optimize our code’s memory utilization along these two axes.
Dynamic memory allocation via malloc() and free() or C++’s global new and delete operators—also known as heap allocation—is typically very slow. The high cost can be attributed to two main factors. First, a heap allocator is a general-purpose facility, so it must be written to handle any allocation size, from one byte to one gigabyte. This requires a lot of management overhead, making the malloc() and free() functions inherently slow. Second, on most operating systems a call to malloc() or free() must first context-switch from user mode into kernel mode, process the request and then context-switch back to the program. These context switches can be extraordinarily expensive. One rule of thumb often followed in game development is:
Keep heap allocations to a minimum, and never allocate from the heap within a tight loop.
Of course, no game engine can entirely avoid dynamic memory allocation, so most game engines implement one or more custom allocators. A custom allocator can have better performance characteristics than the operating system’s heap allocator for two reasons. First, a custom allocator can satisfy requests from a preallocated memory block (itself allocated using malloc() or new, or declared as a global variable). This allows it to run in user mode and entirely avoid the cost of context-switching into the operating system. Second, by making various assumptions about its usage patterns, a custom allocator can be much more efficient than a general-purpose heap allocator.
In the following sections, we’ll take a look at some common kinds of custom allocators. For additional information on this topic, see Christian Gyrling’s excellent blog post, http://www.swedishcoding.com/2008/08/31/are-we-out-of-memory.
Many games allocate memory in a stack-like fashion. Whenever a new game level is loaded, memory is allocated for it. Once the level has been loaded, little or no dynamic memory allocation takes place. At the conclusion of the level, its data is unloaded and all of its memory can be freed. It makes a lot of sense to use a stack-like data structure for these kinds of memory allocations.
A stack allocator is very easy to implement. We simply allocate a large contiguous block of memory using malloc() or global new, or by declaring a global array of bytes (in which case the memory is effectively allocated out of the executable’s BSS segment). A pointer to the top of the stack is maintained. All memory addresses below this pointer are considered to be in use, and all addresses above it are considered to be free. The top pointer is initialized to the lowest memory address in the stack. Each allocation request simply moves the pointer up by the requested number of bytes. The most recently allocated block can be freed by simply moving the top pointer back down by the size of the block.
It is important to realize that with a stack allocator, memory cannot be freed in an arbitrary order. All frees must be performed in an order opposite to that in which they were allocated. One simple way to enforce these restrictions is to disallow individual blocks from being freed at all. Instead, we can provide a function that rolls the stack top back to a previously marked location, thereby freeing all blocks between the current top and the roll-back point.
It’s important to always roll the top pointer back to a point that lies at the boundary between two allocated blocks, because otherwise new allocations would overwrite the tail end of the top-most block. To ensure that this is done properly, a stack allocator often provides a function that returns a marker representing the current top of the stack. The roll-back function then takes one of these markers as its argument. This is depicted in Figure 6.1. The interface of a stack allocator often looks something like this.
class StackAllocator
{
public:
// Stack marker: Represents the current top of the
// stack. You can only roll back to a marker, not to
// arbitrary locations within the stack
typedef U32 Marker;
// Constructs a stack allocator with the given total
// size.
explicit StackAllocator(U32 stackSize_bytes);
// Allocates a new block of the given size from stack
// top
void* alloc(U32 size_bytes);
// Returns a marker to the current stack top
Marker getMarker();
// Rolls the stack back to a previous marker
void freeToMarker(Marker marker);
// Clears the entire stack (rolls the stack back to
// zero).
void clear();
private:
// …
};
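For concreteness, here is one minimal way the class above might be implemented (repeated as a complete class for readability). This is only a sketch: it assumes the U8/U32 integer typedefs used throughout this chapter, ignores alignment of the returned blocks and handles out-of-memory conditions with nothing more than asserts; the member names are illustrative.
#include <cassert>
#include <cstdint>
#include <cstdlib>
typedef std::uint8_t  U8;   // assumed typedefs matching the chapter's conventions
typedef std::uint32_t U32;
class StackAllocator
{
public:
    typedef U32 Marker;
    explicit StackAllocator(U32 stackSize_bytes)
        : m_pBottom(static_cast<U8*>(std::malloc(stackSize_bytes))),
          m_size(stackSize_bytes),
          m_top(0)
    { }
    ~StackAllocator() { std::free(m_pBottom); }
    void* alloc(U32 size_bytes)
    {
        assert(m_top + size_bytes <= m_size);  // out of stack memory?
        void* pBlock = m_pBottom + m_top;
        m_top += size_bytes;                   // bump the top of the stack
        return pBlock;
    }
    Marker getMarker() { return m_top; }
    void freeToMarker(Marker marker)
    {
        assert(marker <= m_top);               // can only roll back, never forward
        m_top = marker;
    }
    void clear() { m_top = 0; }
private:
    U8* m_pBottom;  // base of the preallocated memory block
    U32 m_size;     // total size of the block in bytes
    U32 m_top;      // offset of the current stack top
};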
A single memory block can actually contain two stack allocators—one that allocates up from the bottom of the block and one that allocates down from the top of the block. A double-ended stack allocator is useful because it uses memory more efficiently by allowing a trade-off to occur between the memory usage of the bottom stack and the memory usage of the top stack. In some situations, both stacks may use roughly the same amount of memory and meet in the middle of the block. In other situations, one of the two stacks may eat up a lot more memory than the other stack, but all allocation requests can still be satisfied as long as the total amount of memory requested is not larger than the block shared by the two stacks. This is depicted in Figure 6.2.
In Midway’s Hydro Thunder arcade game, all memory allocations are made from a single large block of memory managed by a double-ended stack allocator. The bottom stack is used for loading and unloading levels (race tracks), while the top stack is used for temporary memory blocks that are allocated and freed every frame. This allocation scheme worked extremely well and ensured that Hydro Thunder never suffered from memory fragmentation problems (see Section 6.2.1.4). Steve Ranck, Hydro Thunder’s lead engineer, describes this allocation technique in depth in [8, Section 1.9].
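A double-ended stack allocator can be sketched as follows (hypothetical names; alignment, roll-back markers and error handling are omitted). Each end keeps its own top offset, and an allocation fails only when the two tops would cross.
#include <cassert>
#include <cstddef>
#include <cstdlib>
class DoubleEndedStackAllocator
{
public:
    explicit DoubleEndedStackAllocator(std::size_t sizeBytes)
        : m_pBlock(static_cast<unsigned char*>(std::malloc(sizeBytes))),
          m_bottomTop(0),
          m_topTop(sizeBytes)
    { }
    ~DoubleEndedStackAllocator() { std::free(m_pBlock); }
    // Allocate from the stack that grows up from the bottom of the block.
    void* allocBottom(std::size_t bytes)
    {
        assert(bytes <= m_topTop - m_bottomTop);  // would the two stacks collide?
        void* p = m_pBlock + m_bottomTop;
        m_bottomTop += bytes;
        return p;
    }
    // Allocate from the stack that grows down from the top of the block.
    void* allocTop(std::size_t bytes)
    {
        assert(bytes <= m_topTop - m_bottomTop);  // would the two stacks collide?
        m_topTop -= bytes;
        return m_pBlock + m_topTop;
    }
private:
    unsigned char* m_pBlock;     // the single shared memory block
    std::size_t    m_bottomTop;  // top of the lower stack (offset into the block)
    std::size_t    m_topTop;     // top of the upper stack (offset into the block)
};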
It’s quite common in game engine programming (and software engineering in general) to allocate lots of small blocks of memory, each of which are the same size. For example, we might want to allocate and free matrices, or iterators, or links in a linked list, or renderable mesh instances. For this type of memory allocation pattern, a pool allocator is often the perfect choice.
A pool allocator works by preallocating a large block of memory whose size is an exact multiple of the size of the elements that will be allocated. For example, a pool of 4 × 4 matrices would be an exact multiple of 64 bytes—that’s 16 elements per matrix times four bytes per element (assuming each element is a 32-bit float). Each element within the pool is added to a linked list of free elements; when the pool is first initialized, the free list contains all of the elements. Whenever an allocation request is made, we simply grab the next free element off the free list and return it. When an element is freed, we simply tack it back onto the free list. Both allocations and frees are O(1) operations, since each involves only a couple of pointer manipulations, no matter how many elements are currently free. (The notation O(1) is an example of “big O” notation. In this case it means that the execution times of both allocations and frees are roughly constant and do not depend on things like the number of elements currently in the pool. See Section 6.3.3 for an explanation of “big O” notation.)
The linked list of free elements can be a singly-linked list, meaning that we need a single pointer (four bytes on 32-bit machines or eight bytes on 64-bit machines) for each free element. Where should we obtain the memory for all these pointers? Certainly they could be stored in a separate preallocated memory block, occupying (sizeof(void*) * numElementsInPool) bytes. However, this is unduly wasteful. The key is to realize that the memory blocks residing on the free list are, by definition, free memory blocks. So why not store each free list “next” pointer within the free block itself? This little “trick” works as long as elementSize >= sizeof(void*). We don’t waste any memory, because our free list pointers all reside inside the free memory blocks—in memory that wasn’t being used for anything anyway!
If each element is smaller than a pointer, then we can use pool element indices instead of pointers to implement our linked list. For example, if our pool contains 16-bit integers, then we can use 16-bit indices as the “next pointers” in our linked list. This works as long as the pool doesn’t contain more than 2¹⁶ = 65,536 elements.
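To make the free-list trick concrete, here is a minimal sketch of the pointer-based variant, which threads a singly linked free list through the unused elements themselves. The class and member names are illustrative, and alignment and thread safety are ignored.
#include <cassert>
#include <cstddef>
#include <cstdlib>
class PoolAllocator
{
public:
    PoolAllocator(std::size_t elementSize, std::size_t elementCount)
        : m_pMemory(static_cast<unsigned char*>(std::malloc(elementSize * elementCount))),
          m_pFreeHead(nullptr)
    {
        assert(elementSize >= sizeof(void*));  // needed to store the "next" links
        // Build the initial free list: every element starts out free.
        for (std::size_t i = 0; i < elementCount; ++i)
        {
            void* pElement = m_pMemory + i * elementSize;
            *static_cast<void**>(pElement) = m_pFreeHead;  // link to previous head
            m_pFreeHead = pElement;
        }
    }
    ~PoolAllocator() { std::free(m_pMemory); }
    void* alloc()
    {
        if (m_pFreeHead == nullptr)
            return nullptr;                                // pool exhausted
        void* pElement = m_pFreeHead;
        m_pFreeHead = *static_cast<void**>(pElement);      // pop the free list
        return pElement;
    }
    void free(void* pElement)
    {
        *static_cast<void**>(pElement) = m_pFreeHead;      // push back onto the free list
        m_pFreeHead = pElement;
    }
private:
    unsigned char* m_pMemory;    // preallocated block holding all the elements
    void*          m_pFreeHead;  // head of the singly linked free list
};
Both alloc() and free() are just a push or a pop on the free list, which is where the O(1) behavior described above comes from.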
As we saw in Section 3.3.7.1, every variable and data object has an alignment requirement. An 8-bit integer variable can be aligned to any address, but a 32-bit integer or floating-point variable must be 4-byte aligned, meaning its address can only end in the nibbles 0x0, 0x4, 0x8 or 0xC. A 128-bit SIMD vector value generally has a 16-byte alignment requirement, meaning that its memory address can end only in the nibble 0x0. On the PS3, memory blocks that are to be transferred to an SPU via the direct memory access (DMA) controller should be 128-byte aligned for maximum DMA throughput, meaning they can only end in the bytes 0x00 or 0x80.
All memory allocators must be capable of returning aligned memory blocks. This is relatively straightforward to implement. We simply allocate a little bit more memory than was actually requested, shift the address of the memory block upward slightly so that it is aligned properly, and then return the shifted address. Because we allocated a bit more memory than was requested, the returned block will still be large enough, even with the slight upward shift.
In most implementations, the number of additional bytes allocated is equal to the alignment minus one, which is the worst-case alignment shift we can make. For example, if we want a 16-byte aligned memory block, the worst case would be to get back an unaligned pointer that ends in 0x1, because it would require us to apply a shift of 15 bytes to bring it to a 16-byte boundary.
Here’s one possible implementation of an aligned memory allocator:
// Shift the given address upwards if/as necessary to
// ensure it is aligned to the given number of bytes
inline uintptr_t AlignAddress(uintptr_t addr, size_t align)
{
const size_t mask = align - 1;
assert((align & mask) == 0); // pwr of 2
return (addr + mask) & ~mask;
}
// Shift the given pointer upwards if/as necessary to
// ensure it is aligned to the given number of bytes
template<typename T>
inline T* AlignPointer(T* ptr, size_t align)
{
const uintptr_t addr = reinterpret_cast<uintptr_t>(ptr);
const uintptr_t addrAligned = AlignAddress(addr, align);
return reinterpret_cast<T*>(addrAligned);
}
// Aligned allocation function. IMPORTANT: ‘align’
// must be a power of 2 (typically 4, 8 or 16).
void* AllocAligned(size_t bytes, size_t align)
{
// Determine worst case number of bytes we’ll need.
size_t worstCaseBytes = bytes + align - 1;
// Allocate unaligned block.
U8* pRawMem = new U8[worstCaseBytes];
// Align the block.
return AlignPointer(pRawMem, align);
}
The alignment “magic” is performed by the function AlignAddress(). Here’s how it works: Given an address and a desired alignment L, we can align that address to an L-byte boundary by first adding L − 1 to it, and then stripping off the N least-significant bits of the resulting address, where N = log₂(L). For example, to align any address to a 16-byte boundary, we shift it up by 15 bytes and then mask off the N = log₂(16) = 4 least-significant bits.
To strip off these bits, we need a mask that we can apply to the address using the bitwise AND operator. Because L is always a power of two, L − 1 will be a mask with binary 1s in the N least-significant bits and binary 0s everywhere else. So all we need to do is invert this mask and then AND it with the address (addr & ~mask).
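As a quick worked example (the address is arbitrary): to align addr = 0x20C4A5C7 to L = 16 bytes, we form mask = 0xF, compute addr + mask = 0x20C4A5D6, and then clear the four least-significant bits with & ~mask, giving 0x20C4A5D0. That address is 16-byte aligned and only 9 bytes above the original, well within the 15 extra bytes allocated by AllocAligned().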
When an aligned block is later freed, we will be passed the shifted address, not the original address that we allocated. But in order to free the memory, we need to free the address that was actually returned by new. How can we convert an aligned address back into its original, unaligned address?
One simple approach is to store the shift (i.e., the difference between the aligned address and the original address) some place where our free function will be able to find it. Recall that we actually allocate align - 1 extra bytes in AllocAligned(), in order to give us some room to align the pointer. Those extra bytes are a perfect place to store the shift value. The smallest shift we’ll ever make is one byte, so that’s the minimum space we’ll have to store the offset. Therefore, given an aligned pointer p, we can simply store the shift as a one-byte value at address p - 1.
However, there’s one problem: It’s possible that the raw address returned by new will already be aligned. In that case, the code we presented above would not shift the raw address at all, meaning there’d be no extra bytes into which to store the offset. To overcome this, we simply allocate L extra bytes, instead of L − 1, and then we always shift the raw pointer up to the next L-byte boundary, even if it was already aligned. Now the maximum shift will be L bytes, and the minimum shift will be 1 byte. So there will always be at least one extra byte into which we can store our shift value.
Storing the shift in a single byte works for alignments up to and including 128. We never ever shift a pointer by zero bytes, so we can make this scheme work for up to 256-byte alignment by interpreting the impossible shift value of zero as a 256-byte shift. (For larger alignments, we’d have to allocate even more bytes, and shift the pointer even further up, to make room for a wider “header”.)
Here’s how the modified AllocAligned() function and its corresponding FreeAligned() function could be implemented. The process of allocating and freeing aligned blocks is illustrated in Figure 6.3.
// Aligned allocation function. IMPORTANT: ‘align’
// must be a power of 2 (typically 4, 8 or 16).
void* AllocAligned(size_t bytes, size_t align)
{
// Allocate ‘align’ more bytes than we need.
size_t actualBytes = bytes + align;
// Allocate unaligned block.
U8* pRawMem = new U8[actualBytes];
// Align the block. If no alignment occurred,
// shift it up the full ‘align’ bytes so we
// always have room to store the shift
U8* pAlignedMem = AlignPointer(pRawMem, align);
if (pAlignedMem == pRawMem)
pAlignedMem += align;
// Determine the shift, and store it.
// (This works for up to 256-byte alignment.)
ptrdiff_t shift = pAlignedMem - pRawMem;
assert(shift > 0 && shift <= 256);
pAlignedMem[-1] = static_cast<U8>(shift & 0xFF);
return pAlignedMem;
}
void FreeAligned(void* pMem)
{
if (pMem)
{
// Convert to U8 pointer.
U8* pAlignedMem = reinterpret_cast<U8*>(pMem);
// Extract the shift.
ptrdiff_t shift = pAlignedMem[-1];
if (shift == 0)
shift = 256;
// Back up to the actual allocated address,
// and array-delete it.
U8* pRawMem = pAlignedMem - shift;
delete[] pRawMem;
}
}
Virtually all game engines allocate at least some temporary data during the game loop. This data is either discarded at the end of each iteration of the loop or used on the next frame and then discarded. This allocation pattern is so common that many engines support single-frame and double-buffered allocators.
A single-frame allocator is implemented by reserving a block of memory and managing it with a simple stack allocator as described above. At the beginning of each frame, the stack’s “top” pointer is cleared to the bottom of the memory block. Allocations made during the frame grow toward the top of the block. Rinse and repeat.
StackAllocator g_singleFrameAllocator;
// Main Game Loop
while (true)
{
// Clear the single-frame allocator’s buffer every
// frame.
g_singleFrameAllocator.clear();
// …
// Allocate from the single-frame buffer. We never
// need to free this data! Just be sure to use it
// only this frame.
void* p = g_singleFrameAllocator.alloc(nBytes);
// …
}
One of the primary benefits of a single-frame allocator is that allocated memory needn’t ever be freed—we can rely on the fact that the allocator will be cleared at the start of every frame. Single-frame allocators are also blindingly fast. The one big negative is that using a single-frame allocator requires a reasonable level of discipline on the part of the programmer. You need to realize that a memory block allocated out of the single-frame buffer will only be valid during the current frame. Programmers must never cache a pointer to a single-frame memory block across the frame boundary!
A double-buffered allocator allows a block of memory allocated on frame i to be used on frame (i + 1). To accomplish this, we create two single-frame stack allocators of equal size and then ping-pong between them every frame.
class DoubleBufferedAllocator
{
U32 m_curStack;
StackAllocator m_stack[2];
public:
void swapBuffers()
{
m_curStack = (U32)!m_curStack;
}
void clearCurrentBuffer()
{
m_stack[m_curStack].clear();
}
void* alloc(U32 nBytes)
{
return m_stack[m_curStack].alloc(nBytes);
}
// …
};
// …
DoubleBufferedAllocator g_doubleBufAllocator;
// Main Game Loop
while (true)
{
// Clear the single-frame allocator every frame as
// before.
g_singleFrameAllocator.clear();
// Swap the active and inactive buffers of the double-
// buffered allocator.
g_doubleBufAllocator.swapBuffers();
// Now clear the newly active buffer, leaving last
// frame’s buffer intact.
g_doubleBufAllocator.clearCurrentBuffer();
// …
// Allocate out of the current buffer, without
// disturbing last frame’s data. Only use this data
// this frame or next frame. Again, this memory never
// needs to be freed
void* p = g_doubleBufAllocator.alloc(nBytes);
// …
}
This kind of allocator is extremely useful for caching the results of asynchronous processing on a multicore game console like the Xbox 360, Xbox One, PlayStation 3 or PlayStation 4. On frame i, we can kick off an asynchronous job on one of the PS4’s cores, for example, handing it the address of a destination buffer that has been allocated from our double-buffered allocator. The job runs and produces its results some time before the end of frame i, storing them into the buffer we provided. On frame (i + 1), the buffers are swapped. The results of the job are now in the inactive buffer, so they will not be overwritten by any double-buffered allocations that might be made during this frame. As long as we use the results of the job before frame (i + 2), our data won’t be overwritten.
Another problem with dynamic heap allocations is that memory can become fragmented over time. When a program first runs, its heap memory is entirely free. When a block is allocated, a contiguous region of heap memory of the appropriate size is marked as “in use,” and the remainder of the heap remains free. When a block is freed, it is marked as such, and adjacent free blocks are merged into a single, larger free block. Over time, as allocations and deallocations of various sizes occur in random order, the heap memory begins to look like a patchwork of free and used blocks. We can think of the free regions as “holes” in the fabric of used memory. When the number of holes becomes large, and/or the holes are all relatively small, we say the memory has become fragmented. This is illustrated in Figure 6.4.
The problem with memory fragmentation is that allocations may fail even when there are enough free bytes to satisfy the request. The crux of the problem is that allocated memory blocks must always be contiguous. For example, in order to satisfy a request of 128 KiB, there must exist a free “hole” that is 128 KiB or larger. If there are two holes, each of which is 64 KiB in size, then enough bytes are available but the allocation fails because they are not contiguous bytes.
Memory fragmentation is not as much of a problem on operating systems that support virtual memory. A virtual memory system maps discontiguous blocks of physical memory known as pages into a virtual address space, in which the pages appear to the application to be contiguous. Stale pages can be swapped to the hard disk when physical memory is in short supply and reloaded from disk when they are needed. For a detailed discussion of how virtual memory works, see http://en.wikipedia.org/wiki/Virtual_memory. Most embedded systems cannot afford to implement a virtual memory system. While some modern consoles do technically support it, most console game engines still do not make use of virtual memory due to the inherent performance overhead.
The detrimental effects of memory fragmentation can be avoided by using stack and/or pool allocators.
When differently sized objects are being allocated and freed in a random order, neither a stack-based allocator nor a pool-based allocator can be used. In such cases, fragmentation can be avoided by periodically defragmenting the heap. Defragmentation involves coalescing all of the free “holes” in the heap by shifting allocated blocks from higher memory addresses down to lower addresses (thereby shifting the holes up to higher addresses). One simple algorithm is to search for the first “hole” and then take the allocated block immediately above the hole and shift it down to the start of the hole. This has the effect of “bubbling up” the hole to a higher memory address. If this process is repeated, eventually all the allocated blocks will occupy a contiguous region of memory at the low end of the heap’s address space, and all the holes will have bubbled up into one big hole at the high end of the heap. This is illustrated in Figure 6.7.
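The core of one such defragmentation step can be sketched with a toy block-records model. All of the names here are hypothetical, adjacent free blocks are assumed to have been coalesced already, and the pointer/handle patching discussed next is omitted.
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>
struct Block
{
    std::size_t offset;  // offset of the block within the managed heap
    std::size_t size;    // size of the block in bytes
    bool        inUse;   // false means this block is a free "hole"
};
// Shift the first allocated block that sits immediately above a hole down
// into that hole, thereby "bubbling" the hole up toward higher addresses.
// Returns false when no such pair exists (the heap is fully defragmented).
bool defragmentOneStep(std::uint8_t* pHeap, std::vector<Block>& blocks)
{
    for (std::size_t i = 0; i + 1 < blocks.size(); ++i)
    {
        Block& hole = blocks[i];
        Block& used = blocks[i + 1];
        if (!hole.inUse && used.inUse)
        {
            // Move the allocated block down into the hole.
            std::memmove(pHeap + hole.offset, pHeap + used.offset, used.size);
            // The block and the hole swap places in the layout.
            const std::size_t holeSize = hole.size;
            hole = Block{ hole.offset, used.size, true };
            used = Block{ hole.offset + hole.size, holeSize, false };
            return true;
        }
    }
    return false;  // nothing left to move
}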
The shifting of memory blocks described above is not particularly tricky to implement. What is tricky is accounting for the fact that we’re moving allocated blocks of memory around. If anyone has a pointer into one of these allocated blocks, then moving the block will invalidate the pointer.
The solution to this problem is to patch any and all pointers into a shifted memory block so that they point to the correct new address after the shift. This procedure is known as pointer relocation. Unfortunately, there is no general-purpose way to find all the pointers that point into a particular region of memory. So if we are going to support memory defragmentation in our game engine, programmers must either carefully keep track of all the pointers manually so they can be relocated, or pointers must be abandoned in favor of something inherently more amenable to relocation, such as smart pointers or handles.
A smart pointer is a small class that contains a pointer and acts like a pointer for most intents and purposes. But because a smart pointer is a class, it can be coded to handle memory relocation properly. One approach is to arrange for all smart pointers to add themselves to a global linked list. Whenever a block of memory is shifted within the heap, the linked list of all smart pointers can be scanned, and each pointer that points into the shifted block of memory can be adjusted appropriately.
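One possible shape for such a relocation-aware smart pointer is sketched below. It is illustrative only: it keeps one global list per pointed-to type rather than a single list for all pointers, copying is disabled to keep the sketch short, and a production version would need to pay far more attention to threading and performance.
#include <cstddef>
#include <cstdint>
template <typename T>
class RelocatablePtr
{
public:
    explicit RelocatablePtr(T* p = nullptr)
        : m_ptr(p), m_next(s_head)
    {
        s_head = this;  // add ourselves to the global list of live pointers
    }
    ~RelocatablePtr()
    {
        // Unlink ourselves (a linear search keeps the sketch simple).
        RelocatablePtr** pp = &s_head;
        while (*pp != this)
            pp = &(*pp)->m_next;
        *pp = m_next;
    }
    RelocatablePtr(const RelocatablePtr&) = delete;             // omitted for brevity
    RelocatablePtr& operator=(const RelocatablePtr&) = delete;  // omitted for brevity
    T& operator*() const  { return *m_ptr; }
    T* operator->() const { return m_ptr; }
    // Called by the defragmenter after it moves the memory range
    // [oldBase, oldBase + size) to newBase.
    static void relocateAll(std::uintptr_t oldBase, std::size_t size,
                            std::uintptr_t newBase)
    {
        for (RelocatablePtr* p = s_head; p != nullptr; p = p->m_next)
        {
            const std::uintptr_t addr
                = reinterpret_cast<std::uintptr_t>(p->m_ptr);
            if (addr >= oldBase && addr < oldBase + size)
                p->m_ptr = reinterpret_cast<T*>(newBase + (addr - oldBase));
        }
    }
private:
    T*              m_ptr;   // the raw pointer being wrapped
    RelocatablePtr* m_next;  // intrusive link in the global list
    static RelocatablePtr* s_head;
};
template <typename T>
RelocatablePtr<T>* RelocatablePtr<T>::s_head = nullptr;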
A handle is usually implemented as an index into a non-relocatable table, which itself contains the pointers. When an allocated block is shifted in memory, the handle table can be scanned and all relevant pointers found and updated automatically. Because the handles are just indices into the pointer table, their values never change no matter how the memory blocks are shifted, so the objects that use the handles are never affected by memory relocation.
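A highly simplified sketch of such a handle table follows. The names are invented for this example, and a real implementation would typically also store a generation count in each handle so that stale handles can be detected.
#include <cassert>
#include <cstdint>
class HandleTable
{
public:
    typedef std::uint32_t Handle;
    // Register an object and hand back an index that never changes.
    Handle add(void* pObject)
    {
        assert(m_count < kMaxEntries);
        m_entries[m_count] = pObject;
        return m_count++;
    }
    // Resolve a handle into the object's current address.
    void* get(Handle h) const
    {
        assert(h < m_count);
        return m_entries[h];
    }
    // Called by the defragmenter after it relocates an object.
    void onRelocate(Handle h, void* pNewAddress)
    {
        assert(h < m_count);
        m_entries[h] = pNewAddress;
    }
private:
    static const std::uint32_t kMaxEntries = 4096;
    void*         m_entries[kMaxEntries];
    std::uint32_t m_count = 0;
};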
Another problem with relocation arises when certain memory blocks cannot be relocated. For example, if you are using a third-party library that does not use smart pointers or handles, it’s possible that any pointers into its data structures will not be relocatable. The best way around this problem is usually to arrange for the library in question to allocate its memory from a special buffer outside of the relocatable memory area. The other option is to simply accept that some blocks will not be relocatable. If the number and size of the non-relocatable blocks are both small, a relocation system will still perform quite well.
It is interesting to note that all of Naughty Dog’s engines have supported defragmentation. Handles are used wherever possible to avoid the need to relocate pointers. However, in some cases raw pointers cannot be avoided. These pointers are carefully tracked and relocated manually whenever a memory block is shifted due to defragmentation. A few of Naughty Dog’s game object classes are not relocatable for various reasons. However, as mentioned above, this doesn’t pose any practical problems, because the number of such objects is always very small, and their sizes are tiny when compared to the overall size of the relocatable memory area.
Defragmentation can be a slow operation because it involves copying memory blocks. However, we needn’t fully defragment the heap all at once. Instead, the cost can be amortized over many frames. We can allow up to N allocated blocks to be shifted each frame, for some small value of N like 8 or 16. If our game is running at 30 frames per second, then each frame lasts 1/30 of a second (33 ms). So, the heap can usually be completely defragmented in less than one second without having any noticeable effect on the game’s frame rate. As long as allocations and deallocations aren’t happening at a faster rate than the defragmentation shifts, the heap will remain mostly defragmented at all times.
This approach is only valid when the size of each block is relatively small, so that the time required to move a single block does not exceed the time allotted to relocation each frame. If very large blocks need to be relocated, we can often break them up into two or more subblocks, each of which can be relocated independently. This hasn’t proved to be a problem in Naughty Dog’s engine, because relocation is only used for dynamic game objects, and they are never larger than a few kibibytes—and usually much smaller.
Game programmers employ a wide variety of collection-oriented data structures, also known as containers or collections. The job of a container is always the same—to house and manage zero or more data elements; however, the details of how they do this vary greatly, and each type of container has its pros and cons. Common container data types include, but are certainly not limited to, the following.
- Array: an ordered, contiguous collection of elements accessed by index (e.g., int a[5]).
- Dynamic array: an array whose length can change at runtime (e.g., std::vector).
- Linked list: an ordered collection of elements that need not be contiguous in memory (e.g., std::list).
- Stack: a container supporting last-in-first-out (LIFO) insertion and removal (e.g., std::stack).
- Queue: a container supporting first-in-first-out (FIFO) insertion and removal (e.g., std::queue).
- Deque: a double-ended queue, supporting efficient insertion and removal at both ends (e.g., std::deque).
- Priority queue: a container that permits elements to be added in any order and then removed in order of priority, typically implemented as a binary heap (e.g., std::priority_queue), but other implementations are possible. A priority queue is a bit like a list that stays sorted at all times, except that a priority queue only supports retrieval of the highest-priority element, and it is rarely implemented as a list under the hood.
- Dictionary: a table of key-value pairs, also known as a map or hash table (e.g., std::map, std::hash_map).
Game engines that make use of container classes inevitably make use of various commonplace algorithms as well, such as iteration, searching and sorting.
An iterator is a little class that “knows” how to efficiently visit the elements in a particular kind of container. It acts like an array index or pointer—it refers to one element in the container at a time, it can be advanced to the next element, and it provides some sort of mechanism for testing whether or not all elements in the container have been visited. As an example, the first of the following two code snippets iterates over a C-style array using a pointer, while the second iterates over a linked list using almost identical syntax.
void processArray(int container[], int numElements)
{
int* pBegin = &container[0];
int* pEnd = &container[numElements];
for (int* p = pBegin; p != pEnd; p++)
{
int element = *p;
// process element…
}
}
void processList(std::list<int>& container)
{
std::list<int>::iterator pBegin = container.begin();
std::list<int>::iterator pEnd = container.end();
for (auto p = pBegin; p != pEnd; ++p)
{
int element = *p;
// process element…
}
}
The key benefits to using an iterator over attempting to access the container’s elements directly are as follows:
Notice in the processArray() example that we are using C++’s postincrement operator, p++, rather than the preincrement operator, ++p. This is a subtle but sometimes important optimization. The preincrement operator increments the contents of the variable before its (now modified) value is used in the expression. The postincrement operator increments the contents of the variable after it has been used. This means that writing ++p introduces a data dependency into your code—the CPU must wait for the increment operation to be completed before its value can be used in the expression. On a deeply pipelined CPU, this introduces a stall. On the other hand, with p++ there is no data dependency. The value of the variable can be used immediately, and the increment operation can happen later or in parallel with its use. Either way, no stall is introduced into the pipeline.
Of course, within the “update” expression of a for loop, there should be no difference between pre- and postincrement. This is because any good compiler will recognize that the value of the variable isn’t used in update_expr. But in cases where the value is used, postincrement is preferable because it doesn’t introduce a stall in the CPU’s pipeline.
It can be wise to make an exception to this little rule of thumb for classes with overloaded increment operators, as is common practice in iterator classes. By definition, the postincrement operator must return an unmodified copy of the object on which it is called. Depending on the size and complexity of the data members of the class, the added cost of copying the iterator can tip the scales toward a preference for preincrement when using such classes in performance-critical loops. (Preincrement isn’t necessarily better than postincrement in such a simple example as the function processList() shown above, but I’ve implemented it with preincrement to highlight the difference.)
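To see where the extra cost comes from, consider a sketch of the canonical pair of increment overloads for a hypothetical iterator-like class:
struct IntIterator
{
    const int* m_pCurrent;
    IntIterator& operator++()      // preincrement: no copy is made
    {
        ++m_pCurrent;
        return *this;
    }
    IntIterator operator++(int)    // postincrement: must return a copy
    {
        IntIterator old = *this;   // copied even if the caller discards it
        ++m_pCurrent;
        return old;
    }
};
The copy made by the postincrement form is pure overhead whenever the returned value is ignored, which is why preincrement is usually preferred for nontrivial iterator types in performance-critical loops.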
The choice of which container type to use for a given application depends upon the performance and memory characteristics of the container being considered. For each container type, we can determine the theoretical performance of common operations such as insertion, removal, find and sort.
We usually indicate the amount of time T that an operation is expected to take as a function of the number of elements n in the container:
T = f(n)
Rather than try to find the exact function f, we concern ourselves only with finding the overall order of the function. For example, if the actual theoretical function were any of the following,
f(n) = 5n² + 7,
f(n) = 12n² + 3n,
f(n) = n²/2 + 20n + 3,
we would, in all cases, simplify the expression down to its most relevant term—in this case n². To indicate that we are only stating the order of the function, not its exact equation, we use “big O” notation and write
T = O(n²)
The order of an algorithm can usually be determined via an inspection of the pseudocode. If the algorithm’s execution time is not dependent upon the number of elements in the container at all, we say it is O(1) (i.e., it completes in constant time). If the algorithm performs a loop over the elements in the container and visits each element once, such as in a linear search of an unsorted list, we say the algorithm is O(n). If two loops are nested, each of which potentially visits each node once, then we say the algorithm is O(n²). If a divide-and-conquer approach is used, as in a binary search (where half of the list is eliminated at each step), then we would expect that only ⌊log₂(n) + 1⌋ elements will actually be visited by the algorithm in the worst case, and hence we refer to it as an O(log n) operation. If an algorithm executes a subalgorithm n times, and the subalgorithm is O(log n), then the resulting algorithm would be O(n log n).
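As a quick worked example with an illustrative element count: a linear search over an unsorted array of n = 65,536 elements may have to visit all 65,536 of them, an O(n) cost, whereas a binary search over the same data, once it has been sorted, visits at most ⌊log₂(65,536) + 1⌋ = 17 elements, an O(log n) cost.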
To select an appropriate container class, we should look at the operations that we expect to be most common, then select the container whose performance characteristics for those operations are most favorable. The most common orders you’ll encounter are listed here from fastest to slowest: O(1), O(log n), O(n), O(n log n), O(n²), O(nᵏ) for k > 2.
We should also take the memory layout and usage characteristics of our containers into account. For example, an array (e.g., int a[5] or std::vector) stores its elements contiguously in memory and requires no overhead storage for anything other than the elements themselves. (Note that a dynamic array does require a small fixed overhead.) On the other hand, a linked list (e.g., std::list) wraps each element in a “link” data structure that contains a pointer to the next element and possibly also a pointer to the previous element, for a total of up to 16 bytes of overhead per element on a 64-bit machine. Also, the elements in a linked list need not be contiguous in memory and often aren’t. A contiguous block of memory is usually much more cache-friendly than a set of disparate memory blocks. Hence, for highspeed algorithms, arrays are usually better than linked lists in terms of cache performance (unless the nodes of the linked list are themselves allocated from a small, contiguous block of memory). But a linked list is better for situations in which speed of inserting and removing elements is of prime importance.
Many game engines provide their own custom implementations of the common container data structures. This practice is especially prevalent in console game engines and games targeted at mobile phone and PDA platforms. The reasons for building these classes yourself include:
We cannot cover all possible data structures here, but let’s look at a few common ways in which game engine programmers tend to tackle containers.
We will not discuss the details of how to implement all of these data types and algorithms here—a plethora of books and online resources are available for that purpose. However, we will concern ourselves with the question of where to obtain implementations of the types and algorithms that you need. As game engine designers, we have basically three choices:
The C++ standard library and third-party libraries like Boost are attractive options, because they provide a rich and powerful set of container classes covering pretty much every type of data structure imaginable. In addition, these packages provide a powerful suite of template-based generic algorithms—implementations of common algorithms, such as finding an element in a container, which can be applied to virtually any type of data object. However, these implementations may not be appropriate for some kinds of game engines. Let’s take a moment to investigate some of the pros and cons of each approach.
The benefits of the C++ standard library’s STL-style container classes include:
However, these container classes also have some drawbacks, including:
The Medal of Honor: Pacific Assault engine for the PC made heavy use of what was known at the time as the standard template library (STL). And while MOHPA did have its share of frame rate problems, the team was able to work around the performance problems caused by STL (primarily by carefully limiting and controlling its use). OGRE, the popular object-oriented rendering library that we use for some of the examples in this book, also makes heavy use of STL-style containers. However, at Naughty Dog we prohibit the use of STL containers in runtime game code (although we do permit their use in offline tools code). Your mileage may vary: Using the STL-style containers provided by the C++ standard library on a game engine project is certainly feasible, but it should be used with care.
The Boost project was started by members of the C++ Standards Committee Library Working Group, but it is now an open source project with many contributors from across the globe. The aim of the project is to produce libraries that extend and work together with the standard C++ library, for both commercial and non-commercial use. Many of the Boost libraries have already been included in the C++ standard library as of C++11, and more components are included in the Standards Committee’s Library Technical Report (TR2), which is a step toward becoming part of a future C++ standard. Here is a brief summary of what Boost brings to the table:
If you are already using the C++ standard library, then Boost can serve as an excellent extension and/or alternative to many of its features. However, be aware of the following caveats:
Folly is an open source library developed by Andrei Alexandrescu and the engineers at Facebook. Its goal is to extend the standard C++ library and the Boost library (rather than to compete with these libraries), with an emphasis on ease of use and the development of high-performance software. You can read about it by searching online for the article entitled “Folly: The Facebook Open Source Library” which is hosted at https://www.facebook.com/. The library itself is available on GitHub here: https://github.com/facebook/folly.
There is a rather esoteric branch of C++ programming known as template metaprogramming. The core idea is to use the compiler to do a lot of the work that would otherwise have to be done at runtime by exploiting the template feature of C++ and in effect “tricking” the compiler into doing things it wasn’t originally designed to do. This can lead to some startlingly powerful and useful programming tools.
By far the most well-known and probably most powerful template metaprogramming library for C++ is Loki, a library designed and written by Andrei Alexandrescu (whose home page is at http://www.erdani.org). The library can be obtained from SourceForge at http://loki-lib.sourceforge.net.
Loki is extremely powerful; it is a fascinating body of code to study and learn from. However, its two big weaknesses are of a practical nature: (a) its code can be daunting to read and use, much less truly understand, and (b) some of its components are dependent upon exploiting “side-effect” behaviors of the compiler that require careful customization in order to be made to work on new compilers. So Loki can be somewhat tough to use, and it is not as portable as some of its “less-extreme” counterparts. Loki is not for the faint of heart. That said, some of Loki’s concepts such as policy-based programming can be applied to any C++ project, even if you don’t use the Loki library per se. I highly recommend that all software engineers read Andrei’s ground-breaking book, Modern C++ Design [3], from which the Loki library was born.
Fixed size C-style arrays are used quite a lot in game programming, because they require no memory allocation, are contiguous and hence cache-friendly, and support many common operations such as appending data and searching very efficiently.
When the size of an array cannot be determined a priori, programmers tend to turn either to linked lists or dynamic arrays. If we wish to maintain the performance and memory characteristics of fixed-length arrays, then the dynamic array is often the data structure of choice.
The easiest way to implement a dynamic array is to allocate an n-element buffer initially and then grow the list only if an attempt is made to add more than n elements to it. This gives us the favorable characteristics of a fixed size array but with no upper bound. Growing is implemented by allocating a new larger buffer, copying the data from the original buffer into the new buffer, and then freeing the original buffer. The size of the buffer is increased in some orderly manner, such as adding n to it on each grow, or doubling it on each grow. Most of the implementations I’ve encountered never shrink the array, only grow it (with the notable exception of clearing the array to zero size, which might or might not free the buffer). Hence the size of the array becomes a sort of “high water mark.” The std::vector class works in this manner.
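The following is a minimal sketch of such a grow-only dynamic array, using a hypothetical IntArray class that doubles its capacity whenever it runs out of space. Allocation failure handling, copying and move semantics are deliberately omitted, and a real container would of course be templated on its element type.

#include <cassert>
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::memcpy

// Hypothetical grow-only dynamic array of ints, illustrating the
// allocate/copy/free growth strategy described above.
class IntArray
{
public:
    explicit IntArray(std::size_t initialCapacity = 16)
        : m_data(static_cast<int*>(std::malloc(initialCapacity * sizeof(int))))
        , m_size(0)
        , m_capacity(initialCapacity)
    { }

    ~IntArray() { std::free(m_data); }

    // Copying is omitted from this sketch.
    IntArray(const IntArray&) = delete;
    IntArray& operator=(const IntArray&) = delete;

    void pushBack(int value)
    {
        if (m_size == m_capacity)
            grow();                       // grow only; never shrink
        m_data[m_size++] = value;
    }

    std::size_t size() const { return m_size; }

    int operator[](std::size_t i) const
    {
        assert(i < m_size);
        return m_data[i];
    }

private:
    void grow()
    {
        // Doubling policy: the capacity acts as a "high water mark."
        const std::size_t newCapacity = (m_capacity > 0) ? m_capacity * 2 : 16;

        int* newData = static_cast<int*>(std::malloc(newCapacity * sizeof(int)));
        std::memcpy(newData, m_data, m_size * sizeof(int)); // copy old contents
        std::free(m_data);                                  // free the old buffer

        m_data = newData;
        m_capacity = newCapacity;
    }

    int*        m_data;
    std::size_t m_size;
    std::size_t m_capacity;
};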
Of course, if you can establish a high water mark for your data, then you’re probably better off just allocating a single buffer of that size when the engine starts up. Growing a dynamic array can be incredibly costly due to reallocation and data copying costs. The impact of these things depends on the sizes of the buffers involved. Growing can also lead to fragmentation when discarded buffers are freed. So, as with all data structures that allocate memory, caution must be exercised when working with dynamic arrays. Dynamic arrays are probably best used during development, when you are as yet unsure of the buffer sizes you’ll require. They can always be converted into fixed size arrays once suitable memory budgets have been established.
A dictionary is a table of key-value pairs. A value in the dictionary can be looked up quickly, given its key. The keys and values can be of any data type. This kind of data structure is usually implemented either as a binary search tree or as a hash table.
In a binary tree implementation, the key-value pairs are stored in the nodes of the binary tree, and the tree is maintained in key-sorted order. Looking up a value by key involves performing an O(log n) binary search.
In a hash table implementation, the values are stored in a fixed size table, where each slot in the table represents one or more keys. To insert a key-value pair into a hash table, the key is first converted into integer form via a process known as hashing (if it is not already an integer). Then an index into the hash table is calculated by taking the hashed key modulo the size of the table. Finally, the key-value pair is stored in the slot corresponding to that index. Recall that the modulo operator (% in C/C++) finds the remainder of dividing the integer key by the table size. So if the hash table has five slots, then a key of 3 would be stored at index 3 (3 % 5 == 3), while a key of 6 would be stored at index 1 (6 % 5 == 1). Finding a key-value pair is an O(1) operation in the absence of collisions.
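As a tiny illustration (the function name here is ours, not a standard API), the slot index for an integer key is just a single modulo operation:

// Slot index for an integer key in a table with numSlots slots, using
// the identity hash. With five slots, a key of 3 lands in slot
// 3 % 5 == 3 and a key of 6 lands in slot 6 % 5 == 1, as above.
inline unsigned slotIndex(unsigned key, unsigned numSlots)
{
    return key % numSlots;
}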
Sometimes two or more keys end up occupying the same slot in the hash table. This is known as a collision. There are two basic ways to resolve a collision, giving rise to two different kinds of hash tables: in an open hash table, the colliding key-value pairs are stored in a linked list attached to the slot in question; in a closed hash table, the colliding pairs are stored in other, otherwise-unoccupied slots of the table itself.
Confusingly, closed hash tables are sometimes said to use open addressing, while open hash tables are said to use an addressing method known as chaining, so named due to the linked lists at each slot in the table.
Hashing is the process of turning a key of some arbitrary data type into an integer, which can be used modulo the table size as an index into the table. Mathematically, given a key k, we want to generate an integer hash value h using the hash function H and then find the index i into the table as follows:
h = H(k)
i = h mod N
where N is the number of slots in the table, and the symbol mod represents the modulo operation, i.e., finding the remainder of the quotient h/N.
If the keys are unique integers, the hash function can be the identity function, H(k) = k. If the keys are unique 32-bit floating-point numbers, a hash function might simply reinterpret the bit pattern of the 32-bit float as if it were a 32-bit integer.
U32 hashFloat(float f)
{
    // Reinterpret the bit pattern of the 32-bit float as a 32-bit
    // unsigned integer by writing one member of the union and reading
    // the other.
    union
    {
        float m_asFloat;
        U32   m_asU32;
    } u;

    u.m_asFloat = f;
    return u.m_asU32;
}
If the key is a string, we can employ a string hashing function, which combines the ASCII or UTF codes of all the characters in the string into a single 32-bit integer value.
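As one concrete illustration (not the specific function used by any particular engine), the well-known FNV-1a algorithm folds each byte of the string into a 32-bit value with an XOR and a multiply:

#include <cstdint>

// FNV-1a: a simple, widely used 32-bit string hash, shown purely as an
// illustration of folding character codes into a single integer value.
std::uint32_t hashStringFnv1a(const char* str)
{
    std::uint32_t hash = 2166136261u;              // FNV offset basis
    while (*str != '\0')
    {
        hash ^= static_cast<std::uint8_t>(*str++); // mix in one byte
        hash *= 16777619u;                         // FNV prime
    }
    return hash;
}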
The quality of the hashing function H(k) is crucial to the efficiency of the hash table. A “good” hashing function is one that distributes the set of all valid keys evenly across the table, thereby minimizing the likelihood of collisions. A hash function must also be reasonably quick to calculate, and deterministic in the sense that it must produce the exact same output every time it is called with an identical input.
Strings are probably the most prevalent type of key you’ll encounter, so it’s particularly helpful to know a “good” string hashing function. Table 6.1 lists a number of well-known hashing algorithms, their throughput ratings (based on benchmark measurements and then converted into a rating of Low, Medium or High) and their score on the SMHasher test (https://github.com/aappleby/smhasher). Please note that the relative throughputs listed in the table are for rough comparison purposes only. Many factors contribute to the throughput of a hash function, including the hardware on which it is run and the properties of the input data. Cryptographic hashes are deliberately slow, as their focus is on producing a hash that is extremely unlikely to collide with the hashes of other input strings, and for which the task of determining a string that would produce a given hash value is extremely computationally difficult.
| Name | Throughput | Score | Cryptographic? |
|---|---|---|---|
| xxHash | High | 10 | No |
| MurmurHash 3a | High | 10 | No |
| SBox | Medium | 9 | No‡ |
| Lookup3 | Medium | 9 | No |
| CityHash64 | Medium | 10 | No |
| CRC32 | Low | 9 | No |
| MD5-32 | Low | 10 | Yes |
| SHA1-32 | Low | 10 | Yes |
Table 6.1. Comparison of well-known hashing algorithms in terms of their relative throughput and their scores on the SMHasher test. ‡Note that SBox is not itself a cryptographic hash, but it is one component of the symmetric key algorithms used in cryptography.
For more information on hash functions, see the excellent article by Paul Hsieh available at http://www.azillionmonkeys.com/qed/hash.html.
In a closed hash table, the key-value pairs are stored directly in the table, rather than in a linked list at each table entry. This approach allows the programmer to define a priori the exact amount of memory that will be used by the hash table. A problem arises when we encounter a collision—two keys that end up wanting to be stored in the same slot in the table. To address this, we use a process known as probing.
The simplest approach is linear probing. Imagine that our hashing function has yielded a table index of i, but that slot is already occupied; we simply try slots (i + 1), (i + 2) and so on until an empty slot is found (wrapping around to the start of the table when the probe index reaches N). Another variation on linear probing is to alternate searching forwards and backwards, (i + 1), (i − 1), (i + 2), (i − 2) and so on, making sure to modulo the resulting indices into the valid range of the table.
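The sketch below illustrates a linear-probing lookup in a hypothetical fixed-size closed hash table; insertion, deletion and resizing are omitted, and all of the names are illustrative.

#include <cstddef>
#include <cstdint>

// One slot of a hypothetical closed hash table: a key-value pair plus
// an "occupied" flag.
struct Slot
{
    std::uint32_t key;
    int           value;
    bool          occupied;
};

const std::size_t kTableSize = 97;   // N: a prime table size
Slot gTable[kTableSize];             // global, so zero-initialized: all slots start free

// Linear probing: start at (key mod N) and walk forward, wrapping
// around to the start of the table, until the key or a free slot is found.
Slot* findSlot(std::uint32_t key)
{
    const std::size_t start = key % kTableSize;
    for (std::size_t probe = 0; probe < kTableSize; ++probe)
    {
        Slot& s = gTable[(start + probe) % kTableSize];
        if (!s.occupied)
            return nullptr;          // reached a free slot: key not present
        if (s.key == key)
            return &s;               // found it
    }
    return nullptr;                  // table full and key not present
}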
Linear probing tends to cause key-value pairs to “clump up.” To avoid these clusters, we can use an algorithm known as quadratic probing. We start at the occupied table index i and use the sequence of probes iⱼ = (i ± j²) for j = 1, 2, 3, …. In other words, we try (i + 1²), (i − 1²), (i + 2²), (i − 2²) and so on, remembering to always modulo the resulting index into the valid range of the table.
When using closed hashing, it is a good idea to make your table size a prime number. Using a prime table size in conjunction with quadratic probing tends to yield the best coverage of the available table slots with minimal clustering. See http://stackoverflow.com/questions/1145217/why-should-hash-functions-use-a-prime-number-modulus for a good discussion of why prime hash table sizes are preferable.
Robin Hood hashing is another probing method for closed hash tables that has gained popularity recently. This probing scheme improves the performance of a closed hash table, even when the table is nearly full. For a good discussion of how Robin Hood hashing works, see https://www.sebastiansylvan.com/post/robin-hood-hashing-should-be-your-default-hash-table-implementation/.
Strings are ubiquitous in almost every software project, and game engines are no exception. On the surface, the string may seem like a simple, fundamental data type. But, when you start using strings in your projects, you will quickly discover a wide range of design issues and constraints, all of which must be carefully accounted for.
The most fundamental question about strings is how they should be stored and managed in your program. In C and C++, strings aren’t even an atomic type—they are implemented as arrays of characters. The variable length of strings means we either have to hard-code limitations on the sizes of our strings, or we need to dynamically allocate our string buffers.
Another big string-related problem is that of localization—the process of adapting your software for release in other languages. This is also known as internationalization, or I18N for short. Any string that you display to the user in English must be translated into whatever languages you plan to support. (Strings that are used internally to the program but are never displayed to the user are exempt from localization, of course.) This not only involves making sure that you can represent all the character glyphs of all the languages you plan to support (via an appropriate set of fonts), but it also means ensuring that your game can handle different text orientations. For example, traditional Chinese text is oriented vertically instead of horizontally (although modern Chinese and Japanese are commonly written horizontally and left-to-right), and some languages like Hebrew read right-to-left. Your game also needs to gracefully deal with the possibility that a translated string will be either much longer or much shorter than its English counterpart.
Finally, it’s important to realize that strings are used internally within a game engine for things like resource file names and object ids. For example, when a game designer lays out a level, it’s highly convenient to permit him or her to identify the objects in the level using meaningful names, like “Player-Camera,” “enemy-tank-01” or “explosionTrigger.”
How our engine deals with these internal strings often has pervasive ramifications on the performance of the game. This is because strings are inherently expensive to work with at runtime. Comparing or copying ints or floats can be accomplished via simple machine language instructions. On the other hand, comparing strings requires an O(n) scan of the character arrays using a function like strcmp() (where n is the length of the string). Copying a string requires an O(n) memory copy, not to mention the possibility of having to dynamically allocate the memory for the copy. During one project I worked on, we profiled our game’s performance only to discover that strcmp() and strcpy() were the top two most expensive functions! By eliminating unnecessary string operations and using some of the techniques outlined in this section, we were able to all but eliminate these functions from our profile and increase the game’s frame rate significantly. (I’ve heard similar stories from developers at a number of different studios.)
Many C++ programmers prefer to use a string class, such as the C++ standard library’s std::string, rather than deal directly with character arrays. Such classes can make working with strings much more convenient for the programmer. However, a string class can have hidden costs that are difficult to see until the game is profiled. For example, passing a string to a function using a C-style character array is fast because the address of the first character is typically passed in a hardware register. On the other hand, passing a string object might incur the overhead of one or more copy constructors, if the function is not declared or used properly. Copying strings might involve dynamic memory allocation, causing what looks like an innocuous function call to end up costing literally thousands of machine cycles.
Because of the myriad issues with string classes, I generally prefer to avoid them in runtime game code. However, if you feel a strong urge to use a string class, make sure you pick or implement one that has acceptable runtime performance characteristics—and be sure all programmers that use it are aware of its costs. Know your string class: Does it treat all string buffers as read-only? Does it utilize the copy on write optimization? (See http://en.wikipedia.org/wiki/Copy-on-write.) In C++11, does it provide a move constructor? Does it own the memory associated with the string, or can it reference memory that it does not own? (See http://www.boost.org/doc/libs/1_57_0/libs/utility/doc/html/string_ref.html for more on the issue of memory ownership in string classes.) As a rule of thumb, always pass string objects by reference, never by value (as the latter often incurs string-copying costs). Profile your code early and often to ensure that your string class isn’t becoming a major source of lost frame rate!
One situation in which a specialized string class does seem justifiable to me is when storing and managing file system paths. Here, a hypothetical Path class could add significant value over a raw C-style character array. For example, it might provide functions for extracting the filename, file extension or directory from the path. It might hide operating system differences by automatically converting Windows-style backslashes to UNIX-style forward slashes or some other operating system’s path separator. Writing a Path class that provides this kind of functionality in a cross-platform way could be highly valuable within a game engine context. (See Section 7.1.1.4 for more details on this topic.)
The objects in any virtual game world need to be uniquely identified in some way. For example, in Pac Man we might encounter game objects named “pac_man,” “blinky,” “pinky,” “inky” and “clyde.” Unique object identifiers allow game designers to keep track of the myriad objects that make up their game worlds and also permit those objects to be found and operated on at runtime by the engine. In addition, the assets from which our game objects are constructed—meshes, materials, textures, audio clips, animations and so on—all need unique identifiers as well.
Strings seem like a natural choice for such identifiers. Assets are often stored in individual files on disk, so they can usually be identified uniquely by their file paths, which of course are strings. And game objects are created by game designers, so it is natural for them to assign their objects understandable string names, rather than have to remember integer object indices, or 64- or 128-bit globally unique identifiers (GUIDs). However, the speed with which comparisons between unique identifiers can be made is of paramount importance in a game, so strcmp() simply doesn’t cut it. We need a way to have our cake and eat it too—a way to get all the descriptiveness and flexibility of a string, but with the speed of an integer.
One good solution is to hash our strings. As we’ve seen, a hash function maps a string onto a semi-unique integer. String hash codes can be compared just like any other integers, so comparisons are fast. If we store the actual strings in a hash table, then the original string can always be recovered from the hash code. This is useful for debugging purposes and to permit hashed strings to be displayed on-screen or in log files. Game programmers sometimes use the term string id to refer to such a hashed string. The Unreal engine uses the term name instead (implemented by class FName).
As with any hashing system, collisions are a possibility (i.e., two different strings might end up with the same hash code). However, with a suitable hash function, we can all but guarantee that collisions will not occur for all reasonable input strings we might use in our game. After all, a 32-bit hash code represents more than four billion possible values. So, if our hash function does a good job of distributing strings evenly throughout this very large range, we are unlikely to collide. At Naughty Dog, we started out using a variant of the CRC-32 algorithm to hash our strings, and we encountered only a handful of collisions during many years of development on Uncharted and The Last of Us. And when a collision did occur, fixing it was a simple matter of slightly altering one of the strings (e.g., append a “2” or a “b” to one of the strings, or use a totally different but synonymous string). That being said, Naughty Dog has moved to a 64-bit hashing function for The Last of Us Part II and all of our future game titles; this should essentially eliminate the possibility of hash collisions, given the quantity and typical lengths of the strings we use in any one game.
Conceptually, it’s easy enough to run a hash function on your strings in order to generate string ids. Practically speaking, however, it’s important to consider when the hash will be calculated. Most game engines that use string ids do the hashing at runtime. At Naughty Dog, we permit runtime hashing of strings, but we also use C++11’s user-defined literals feature to transform the syntax “any_string”_sid directly into a hashed integer value at compile time. This permits string ids to be used anywhere that an integer manifest constant can be used, including the constant case labels of a switch statement. (The result of a function call that generates a string id at runtime is not a constant, so it cannot be used as a case label.)
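A minimal sketch of how such a compile-time hashed literal might be built is shown below. It assumes C++14's relaxed constexpr rules and uses an FNV-1a hash purely for illustration; the _sid suffix and the StringId typedef mirror the convention described above but are not the implementation used by any particular engine.

#include <cstddef>
#include <cstdint>

typedef std::uint32_t StringId;

// constexpr FNV-1a so the compiler can hash a string literal at
// compile time (requires C++14's relaxed constexpr rules).
constexpr StringId hashFnv1a(const char* str, std::size_t len)
{
    StringId hash = 2166136261u;
    for (std::size_t i = 0; i < len; ++i)
    {
        hash ^= static_cast<std::uint8_t>(str[i]);
        hash *= 16777619u;
    }
    return hash;
}

// User-defined literal: "any_string"_sid evaluates to a compile-time constant.
constexpr StringId operator""_sid(const char* str, std::size_t len)
{
    return hashFnv1a(str, len);
}

// Because each id is a constant expression, it can be used as a case label.
void handleEvent(StringId id)
{
    switch (id)
    {
    case "explosion"_sid:
        // react to an explosion event...
        break;
    case "checkpoint"_sid:
        // react to a checkpoint event...
        break;
    default:
        break;
    }
}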
The process of generating a string id from a string is sometimes called interning the string, because in addition to hashing it, the string is typically also added to a global string table. This allows the original string to be recovered from the hash code later. You may also want your tools to be capable of hashing strings into string ids. That way, when the tool generates data for consumption by your engine, the strings will already have been hashed.
The main problem with interning a string is that it is a slow operation. The hashing function must be run on the string, which can be an expensive proposition, especially when a large number of strings are being interned. In addition, memory must be allocated for the string, and it must be copied into the lookup table. As a result (if you are not generating string ids at compile time), it is usually best to intern each string only once and save off the result for later use. For example, it would be preferable to write code like the first of the two snippets below, because the second implementation causes the strings to be unnecessarily re-interned every time the function f() is called.
static StringId sid_foo = internString("foo");
static StringId sid_bar = internString("bar");

// ...

void f(StringId id)
{
    if (id == sid_foo)
    {
        // handle case of id == "foo"
    }
    else if (id == sid_bar)
    {
        // handle case of id == "bar"
    }
}
The following approach is less efficient:
void f(StringId id)
{
    if (id == internString("foo"))
    {
        // handle case of id == "foo"
    }
    else if (id == internString("bar"))
    {
        // handle case of id == "bar"
    }
}
Here’s one possible implementation of internString().
typedef U32 StringId;

extern StringId internString(const char* str);

static HashTable<StringId, const char*> gStringIdTable;

StringId internString(const char* str)
{
    StringId sid = hashCrc32(str);

    HashTable<StringId, const char*>::iterator it
        = gStringIdTable.find(sid);

    if (it == gStringIdTable.end())
    {
        // This string has not yet been added to the
        // table. Add it, being sure to copy it in case
        // the original was dynamically allocated and
        // might later be freed.
        gStringIdTable[sid] = strdup(str);
    }

    return sid;
}
Another idea employed by the Unreal Engine is to wrap the string id and a pointer to the corresponding C-style character array in a tiny class. In the Unreal Engine, this class is called FName. At Naughty Dog we do the same and wrap our string ids in a StringId class. We define a macro so that SID(“any_string”) produces an instance of this class, with its hashed value produced by our user-defined string literal syntax “any_string”_sid.
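One possible shape for such a wrapper is sketched below. The class and member names are illustrative, and the constructor leans on the internString() and gStringIdTable examples given earlier; a production version would manage the lookup and storage of the debug string far more carefully.

// Hypothetical wrapper that pairs a hashed string id with a pointer to
// the interned, human-readable string (kept only for debugging/display).
class StringIdRef
{
public:
    explicit StringIdRef(const char* str)
        : m_id(internString(str))        // hash and intern the string
        , m_str(gStringIdTable[m_id])    // pointer to the interned copy
    { }

    StringId    getId() const { return m_id; }
    const char* c_str() const { return m_str; }

    // Comparisons only ever touch the integer id.
    bool operator==(const StringIdRef& other) const { return m_id == other.m_id; }
    bool operator!=(const StringIdRef& other) const { return m_id != other.m_id; }

private:
    StringId    m_id;   // cheap to copy and compare
    const char* m_str;  // for human consumption only
};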
When using string ids, the strings themselves are only kept around for human consumption. When you ship your game, you almost certainly won’t need the strings—the game itself should only ever use the ids. As such, it’s a good idea to store your string table in a region of memory that won’t exist in the retail game. For example, a PS3 development kit has 256 MiB of retail memory, plus an additional 256 MiB of “debug” memory that is not available on a retail unit. If we store our strings in debug memory, we needn’t worry about their impact on the memory footprint of the final shipping game. (We just need to be careful never to write production code that depends on the strings being available!)
Localization of a game (or any software project) is a big undertaking. It is a task best handled by planning for it from day one and accounting for it at every step of development. However, this is not done as often as we all would like. Here are some tips that should help you plan your game engine project for localization. For an in-depth treatment of software localization, see [34].
The problem for most English-speaking software developers is that they are trained from birth (or thereabouts!) to think of strings as arrays of eight-bit ASCII character codes (i.e., characters following the ANSI standard). ANSI strings work great for a language with a simple alphabet, like English. But, they just don’t cut it for languages with complex alphabets containing a great many more characters, sometimes totally different glyphs than English’s 26 letters. To address the limitations of the ANSI standard, the Unicode character set system was devised.
The basic idea behind Unicode is to assign every character or glyph from every language in common use around the globe to a unique hexadecimal code known as a code point. When storing a string of characters in memory, we select a particular encoding—a specific means of representing the Unicode code points for each character—and following those rules, we lay down a sequence of bits in memory that represent the string. UTF-8 and UTF-16 are two common encodings. You should select the specific encoding standard that best suits your needs.
Please set down this book right now and read the article entitled, “The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)” by Joel Spolsky. You can find it here: http://www.joelonsoftware.com/articles/Unicode.html. (Once you’ve done that, please pick up the book again!)
The simplest Unicode encoding is UTF-32. In this encoding, each Unicode code point is encoded into a 32-bit (4-byte) value. This encoding wastes a lot of space, for two reasons: First, most strings in Western European languages do not use any of the highest-valued code points, so an average of at least 16 bits (2 bytes) is usually wasted per character. Second, the highest Unicode code point is 0x10FFFF, so even if we wanted to create a string that uses every possible Unicode glyph, we’d still only need 21 bits per character, not 32.
That said, UTF-32 does have simplicity in its favor. It is a fixed-length encoding, meaning that every character occupies the same number of bits in memory (32 bits to be precise). As such, we can determine the length of any UTF-32 string by taking its length in bytes and dividing by four.
In the UTF-8 encoding scheme, the code points for each character in a string are stored using eight-bit (one-byte) granularity, but some code points occupy more than one byte. Hence the number of bytes occupied by a UTF-8 character string is not necessarily the length of the string in characters. This is known as a variable-length encoding, or a multibyte character set (MBCS), because each character in a string may take one or more bytes of storage.
One of the big benefits of the UTF-8 encoding is that it is backwards-compatible with the ANSI encoding. This works because the first 128 Unicode code points (0 through 127) correspond numerically to the old ANSI character codes. This means that every ANSI character will be represented by exactly one byte in UTF-8, and a string of ANSI characters can be interpreted as a UTF-8 string without modification.
To represent higher-valued code points, the UTF-8 standard uses multibyte characters. Each multibyte character starts with a byte whose most-significant bit is 1 (i.e., its value lies in the range 128–255, inclusive). Such high-valued bytes will never appear in an ANSI string, so there is no ambiguity when distinguishing between single-byte characters and multibyte characters.
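For example, the number of code points (as opposed to bytes) in a UTF-8 string can be counted by skipping continuation bytes, which always have the bit pattern 10xxxxxx. A minimal sketch:

#include <cstddef>
#include <cstdint>

// Count the code points (not bytes) in a null-terminated UTF-8 string.
// Continuation bytes have the form 10xxxxxx and are skipped; every
// other byte begins a new single- or multibyte character.
std::size_t utf8CodePointCount(const char* str)
{
    std::size_t count = 0;
    for (const char* p = str; *p != '\0'; ++p)
    {
        const std::uint8_t byte = static_cast<std::uint8_t>(*p);
        if ((byte & 0xC0) != 0x80)   // not a continuation byte
            ++count;
    }
    return count;
}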
The UTF-16 encoding employs a somewhat simpler, albeit more expensive approach. Each character in a UTF-16 string is represented by either one or two 16-bit values. The UTF-16 encoding is known as a wide character set (WCS) because each character is at least 16 bits wide, instead of the eight bits used by “regular” ANSI chars and their UTF-8 counterparts.
In UTF-16, the set of all possible Unicode code points is divided into 17 planes containing 2¹⁶ code points each. The first plane is known as the basic multilingual plane (BMP). It contains the most commonly used code points across a wide range of languages. As such, many UTF-16 strings can be represented entirely by code points within the first plane, meaning that each character in such a string is represented by only one 16-bit value. However, if a character from one of the other planes (known as supplementary planes) is required in a string, it is represented by two consecutive 16-bit values.
The UCS-2 (2-byte universal character set) encoding is a limited subset of the UTF-16 encoding, utilizing only the basic multilingual plane. As such, it cannot represent characters whose Unicode code points are numerically higher than 0xFFFF. This simplifies the format, because every character is guaranteed to occupy exactly 16 bits (two bytes). In other words, UCS-2 is a fixed-length character encoding, while in general UTF-8 and UTF-16 are variable-length encodings.
If we know a priori that a UTF-16 string only utilizes code points from the BMP (or if we are dealing with a UCS-2 encoded string), we can determine the number of characters in the string by simply dividing the number of bytes by two. Of course, if supplemental planes are used in a UTF-16 string, this simple “trick” no longer works.
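If supplementary-plane characters may be present, counting the code points in a UTF-16 string means treating each surrogate pair (a high surrogate in the range 0xD800–0xDBFF followed by a low surrogate) as a single character. Here is a minimal sketch, assuming a well-formed, null-terminated string of char16_t units:

#include <cstddef>

// Count the code points in a null-terminated UTF-16 string. Characters
// in the BMP occupy one 16-bit unit; supplementary-plane characters
// occupy a surrogate pair (high surrogate 0xD800-0xDBFF followed by a
// low surrogate 0xDC00-0xDFFF).
std::size_t utf16CodePointCount(const char16_t* str)
{
    std::size_t count = 0;
    while (*str != u'\0')
    {
        if (*str >= 0xD800 && *str <= 0xDBFF)
            str += 2;    // skip the entire surrogate pair
        else
            str += 1;    // a single BMP character
        ++count;
    }
    return count;
}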
Note that a UTF-16 encoding can be little-endian or big-endian (see Section 3.3.2.1), depending on the native endianness of your target CPU. When storing UTF-16 text on-disc, it’s common to precede the text data with a byte order mark (BOM) indicating whether the individual 16-bit characters are stored in little- or big-endian format. (This is true of UTF-32 encoded string data as well, of course.)
char versus wchar_t
The standard C/C++ library defines two data types for dealing with character strings—char and wchar_t. The char type is intended for use with legacy ANSI strings and with multibyte character sets (MBCS), including (but not limited to) UTF-8. The wchar_t type is a “wide” character type, intended to be capable of representing any valid code point in a single integer. As such, its size is compiler- and system-specific. It could be eight bits on a system that does not support Unicode at all. It could be 16 bits if the UCS-2 encoding is assumed for all wide characters, or if a multi-word encoding like UTF-16 is being employed. Or it could be 32 bits if UTF-32 is the “wide” character encoding of choice.
Because of this inherent ambiguity in the definition of wchar_t, if you need to write truly portable string-handling code, you’ll need to define your own character data type(s) and provide a library of functions for dealing with whatever Unicode encoding(s) you need to support. However, if you are targeting a specific platform and compiler, you can write your code within the limits of that particular implementation, at the loss of some portability.
The following article does a good job of outlining the pros and cons of using the wchar_t data type: http://icu-project.org/docs/papers/unicode_wchar_t.html.
Under Windows, the wchar_t data type is used exclusively for UTF-16 encoded Unicode strings, and the char type is used for ANSI strings and legacy Windows code page string encodings. When reading the Windows API docs, the term “Unicode” is therefore always synonymous with “wide character set” (WCS) and UTF-16 encoding. This is a bit confusing, because of course Unicode strings can in general be encoded in the “non-wide” multibyte UTF-8 format.
The Windows API defines three sets of character/string manipulation functions: one set for single-byte character set ANSI strings (SBCS), one set for multibyte character set (MBCS) strings, and one set for wide character set strings. The ANSI functions are essentially the old-school “C-style” string functions we all grew up with. The MBCS string functions handle a variety of multibyte encodings and are primarily designed for dealing with legacy Windows code pages encodings. The WCS functions handle Unicode UTF-16 strings.
Throughout the Windows API, a prefix or suffix of “w,” “wcs” or “W” indicates a wide character set (UTF-16) encoding; a prefix or suffix of “mb” indicates a multibyte encoding; and a prefix or suffix of “a” or “A,” or the lack of any prefix or suffix, indicates an ANSI or Windows code pages encoding. The C++ standard library uses a similar convention—for example, std::string is its ANSI string class, while std::wstring is its wide character equivalent. Unfortunately, the names of the functions aren’t always 100% consistent. This all leads to some confusion among programmers who aren’t in the know. (But you aren’t one of those programmers!) Table 6.2 lists some examples.
Windows also provides functions for translating between ANSI character strings, multibyte strings and wide UTF-16 strings. For example, wcstombs() converts a wide UTF-16 string into a multibyte string according to the currently active locale setting.
The Windows API uses a little preprocessor trick to allow you to write code that is at least superficially portable between wide (Unicode) and non-wide (ANSI/MBCS) string encodings. The generic character data type TCHAR is defined to be a typedef to char when building your application in “ANSI mode,” and it’s defined to be a typedef to wchar_t when building your application in “Unicode mode.” The macro _T() is used to convert an eight-bit string literal (e.g., char* s = "this is a string";) into a wide string literal (e.g., wchar_t* s = L"this is a string";) when compiling in “Unicode mode.” Likewise, a suite of “fake” API functions is provided; these “automagically” morph into the appropriate 8-bit or 16-bit variant, depending on whether you are building in “Unicode mode” or not. These magic character-set-independent functions are either named with no prefix or suffix, or with a “t,” “tcs” or “T” prefix or suffix.
| ANSI | WCS | MBCS |
|---|---|---|
| strcmp() | wcscmp() | _mbscmp() |
| strcpy() | wcscpy() | _mbscpy() |
| strlen() | wcslen() | _mbstrlen() |
Table 6.2. Variants of some common C standard library string functions for use with ANSI, wide and multibyte character sets.
Complete documentation for all of these functions can be found on Microsoft’s MSDN website. Here’s a link to the documentation for strcmp() and its ilk, from which you can quite easily navigate to the other related string-manipulation functions using the tree view on the left-hand side of the page, or via the search bar: http://msdn2.microsoft.com/en-us/library/kk6xf663(VS.80).aspx.
The Xbox 360 software development kit (XDK) uses WCS strings pretty much exclusively, for all strings—even for internal strings like file paths. This is certainly one valid approach to the localization problem, and it makes for very consistent string handling throughout the XDK. However, the UTF-16 encoding is a bit wasteful on memory, so different game engines may employ different conventions. At Naughty Dog, we use eight-bit char strings throughout our engine, and we handle foreign languages via a UTF-8 encoding. The choice of encoding is not particularly important, as long as you select one as early in the project as possible and stick with it consistently.
Even once you have adapted your software to use Unicode characters, there is still a host of other localization problems to contend with. For one thing, strings aren’t the only place where localization issues arise. Audio clips including recorded voices must be translated. Textures may have English words painted into them that require translation. Many symbols have different meanings in different cultures. Even something as innocuous as a no-smoking sign might be misinterpreted in another culture. In addition, some markets draw the boundaries between the various game-rating levels differently. For example, in Japan a Teen-rated game is not permitted to show blood of any kind, whereas in North America small red blood spatters are considered acceptable.
| Id | English | French |
|---|---|---|
| p1score | “Player 1 Score” | “Joueur 1 Score” |
| p2score | “Player 2 Score” | “Joueur 2 Score” |
| p1wins | “Player one wins!” | “Joueur un gagne!” |
| p2wins | “Player two wins!” | “Joueur deux gagne!” |
Table 6.3. Example of a string database used for localization.
For strings, there are other details to worry about as well. You will need to manage a database of all human-readable strings in your game, so that they can all be reliably translated. The software must display the proper language given the user’s installation settings. The formatting of the strings may be totally different in different languages—for example, Chinese is sometimes written vertically, and Hebrew reads right-to-left. The lengths of the strings will vary greatly from language to language. You’ll also need to decide whether to ship a single DVD or Blu-ray disc that contains all languages or ship different discs for particular territories.
The most crucial components in your localization system will be the central database of human-readable strings and an in-game system for looking up those strings by id. For example, let’s say you want a heads-up display that lists the score of each player with “Player 1 Score:” and “Player 2 Score:” labels and that also displays the text “Player 1 Wins” or “Player 2 Wins” at the end of a round. These four strings would be stored in the localization database under unique ids that are understandable to you, the developer of the game. So our database might use the ids “p1score,” “p2score,” “p1wins” and “p2wins,” respectively. Once our game’s strings have been translated into French, our database would look something like the simple example shown in Table 6.3. Additional columns can be added for each new language your game supports.
The exact format of this database is up to you. It might be as simple as a Microsoft Excel worksheet that can be saved as a comma-separated values (CSV) file and parsed by the game engine or as complex as a full-fledged Oracle database. The specifics of the string database are largely unimportant to the game engine, as long as it can read in the string ids and the corresponding Unicode strings for whatever language(s) your game supports. (However, the specifics of the database may be very important from a practical point of view, depending upon the organizational structure of your game studio. A small studio with in-house translators can probably get away with an Excel spreadsheet located on a network drive. But a large studio with branch offices in Britain, Europe, South America and Japan would probably find some kind of distributed database a great deal more amenable.)
At runtime, you’ll need to provide a simple function that returns the Unicode string in the “current” language, given the unique id of that string. The function might be declared like this:
wchar_t* getLocalizedString(const char* id);
and it might be used like this:
void drawScoreHud(const Vector3& score1Pos,
                  const Vector3& score2Pos)
{
    renderer.displayTextOrtho(getLocalizedString("p1score"),
                              score1Pos);
    renderer.displayTextOrtho(getLocalizedString("p2score"),
                              score2Pos);
    // ...
}
Of course, you’ll need some way to set the “current” language globally. This might be done via a configuration setting, which is fixed during the installation of the game. Or you might allow users to change the current language on the fly via an in-game menu. Either way, the setting is not difficult to implement; it can be as simple as a global integer variable specifying the index of the column in the string table from which to read (e.g., column one might be English, column two French, column three Spanish and so on).
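A bare-bones sketch of this machinery might look like the following, with the string table held in an in-memory map keyed by id and the current language stored as a global index. All of the names are illustrative, the table contents are taken from Table 6.3, and the function is declared here to return a const pointer for simplicity; a real implementation would load the table from the localization database rather than hard-coding it.

#include <map>
#include <string>
#include <vector>

// Illustrative language indices: one column per supported language.
enum Language { LANG_ENGLISH = 0, LANG_FRENCH = 1, LANG_COUNT };

// The global "current language" setting, chosen during installation or
// from an in-game options menu.
static Language gCurrentLanguage = LANG_ENGLISH;

// Maps each string id to its translations, one entry per language.
// In a real engine this table would be loaded from the localization
// database, not hard-coded.
static std::map<std::string, std::vector<std::wstring>> gLocalizedStrings =
{
    { "p1score", { L"Player 1 Score",   L"Joueur 1 Score" } },
    { "p2score", { L"Player 2 Score",   L"Joueur 2 Score" } },
    { "p1wins",  { L"Player one wins!", L"Joueur un gagne!" } },
    { "p2wins",  { L"Player two wins!", L"Joueur deux gagne!" } },
};

void setCurrentLanguage(Language lang)
{
    gCurrentLanguage = lang;
}

const wchar_t* getLocalizedString(const char* id)
{
    std::map<std::string, std::vector<std::wstring>>::const_iterator it
        = gLocalizedStrings.find(id);
    if (it == gLocalizedStrings.end())
        return L"<MISSING STRING>";   // fall back rather than crash
    return it->second[gCurrentLanguage].c_str();
}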
Once you have this infrastructure in place, your programmers must remember to never display a raw string to the user. They must always use the id of a string in the database and call the look-up function in order to retrieve the string in question.
At Naughty Dog, we use a localization database that we developed in-house. The localization tool’s back end consists of a MySQL database located on a server that is accessible both to the developers within Naughty Dog and also to the various external companies with which we work to translate our text and speech audio clips into the various languages our games support. The front end is a web interface that “speaks” to the database, allowing users to view all of the text and audio assets, edit their contents, provide translations for each asset, search for assets by id or by content and so on.
In Naughty Dog’s localization tool, each asset is either a string (for use in the menus or HUD) or a speech audio clip with optional subtitle text (for use as in-game dialog or within cinematics). Each asset has a unique identifier, which is represented as a hashed string id (see Section 6.4.3.1). If a string is required for use in the menus or HUD, we look it up by its id and get back a Unicode (UTF-8) string suitable for display on-screen. If a line of dialog must be played, we likewise look up the audio clip by its id and use the data in-engine to look up its corresponding subtitle (if any). The subtitle is treated just like a menu or HUD string, in that it is returned by the localization tool’s API as a UTF-8 string suitable for display.
Figure 6.10 shows the main interface of the localization tool, in this case displayed in the Chrome web browser. In this image, you can see that the user has typed in the id MENU_NEWGAME in order to look up the string “NEW GAME” (used on the game’s main menu for launching a new game). Figure 6.11 shows the detailed view of the MENU_NEWGAME asset. If the user hits the “Text Translations” button in the upper-left corner of the asset details window, the screen shown in Figure 6.12 comes up, allowing the user to enter or edit the various translations of the string. Figure 6.13 shows another tab on the localization tool’s main page, this time listing audio speech assets. Finally, Figure 6.14 depicts the detailed asset view for the speech asset BADA_GAM_MIL_ESCAPE_OVERPASS_001 (“We missed all the action”), showing translations of this line of dialog into some of the supported languages.
Game engines are complex beasts, and they invariably end up having a large number of configurable options. Some of these options are exposed to the player via one or more options menus in-game. For example, a game might expose options related to graphics quality, the volume of music and sound effects, or controller configuration. Other options are created for the benefit of the game development team only and are either hidden or stripped out of the game completely before it ships. For example, the player character’s maximum walk speed might be exposed as an option so that it can be fine-tuned during development, but it might be changed to a hard-coded value prior to ship.
A configurable option can be implemented trivially as a global variable or a member variable of a singleton class. However, configurable options are not particularly useful unless their values can be configured, stored on a hard disk, memory card or other storage medium, and later retrieved by the game. There are a number of simple ways to load and save configuration options, including text configuration files, compressed binary files, the Windows registry, command line options, environment variables and online user profiles.
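To make the first of these concrete, here is a minimal sketch of a class that loads and saves options as key=value lines in a plain text file. The class name and file format are assumptions for illustration, not taken from any particular engine; a real system would typically also support sections, comments and typed values.

// A minimal sketch of loading and saving options as "key=value" lines.
#include <fstream>
#include <map>
#include <string>

class ConfigFile
{
public:
    bool load(const std::string& path)
    {
        std::ifstream in(path);
        if (!in)
            return false;

        std::string line;
        while (std::getline(in, line))
        {
            const std::size_t eq = line.find('=');
            if (eq != std::string::npos)
                m_values[line.substr(0, eq)] = line.substr(eq + 1);
        }
        return true;
    }

    bool save(const std::string& path) const
    {
        std::ofstream out(path);
        if (!out)
            return false;

        for (const auto& kv : m_values)
            out << kv.first << '=' << kv.second << '\n';
        return true;
    }

    std::string get(const std::string& key, const std::string& def) const
    {
        auto it = m_values.find(key);
        return (it != m_values.end()) ? it->second : def;
    }

    void set(const std::string& key, const std::string& value)
    {
        m_values[key] = value;
    }

private:
    std::map<std::string, std::string> m_values;
};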
Most game engines differentiate between global options and per-user options. This is necessary because most games allow each player to configure the game to his or her liking. It is also a useful concept during development of the game, because it allows each programmer, artist and designer to customize his or her work environment without affecting other team members.
Obviously care must be taken to store per-user options in such a way that each player “sees” only his or her options and not the options of other players on the same computer or console. In a console game, the user is typically allowed to save his or her progress, along with per-user options such as controller preferences, in “slots” on a memory card or hard disk. These slots are usually implemented as files on the media in question.
On a Windows machine, each user has a folder under C:\Users containing information such as the user’s desktop, his or her My Documents folder, his or her Internet browsing history and temporary files and so on. A hidden subfolder named AppData is used to store per-user information on a per-application basis; each application creates a folder under AppData and can use it to store whatever per-user information it requires.
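For example, a Windows game might locate the current user's AppData folder via the SHGetKnownFolderPath() API and store its per-user options in a subfolder beneath it. The following is a minimal sketch; the "MyGame" subfolder name is just a placeholder.

// A minimal sketch of finding the per-user AppData folder on Windows.
#include <windows.h>
#include <Shlobj.h>     // SHGetKnownFolderPath, FOLDERID_RoamingAppData
#include <objbase.h>    // CoTaskMemFree
#include <string>

std::wstring getPerUserConfigFolder()
{
    PWSTR path = nullptr;
    std::wstring result;

    if (SUCCEEDED(SHGetKnownFolderPath(FOLDERID_RoamingAppData,
                                       0, nullptr, &path)))
    {
        result = std::wstring(path) + L"\\MyGame";
    }

    CoTaskMemFree(path);    // safe to call even if the lookup failed
    return result;
}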
Windows games sometimes store per-user configuration data in the registry. The registry is arranged as a tree, and one of the top-level children of the root node, called HKEY_CURRENT_USER, stores settings for whichever user happens to be logged on. Every user has his or her own subtree in the registry (stored under the top-level subtree HKEY_USERS), and HKEY_CURRENT_USER is really just an alias to the current user’s subtree. So games and other applications can manage per-user configuration options by simply reading and writing them to keys under the HKEY_CURRENT_USER subtree.
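A minimal sketch of reading and writing one per-user value under HKEY_CURRENT_USER is shown below; the "Software\MyGame" subkey and the "MusicVolume" value name are placeholders for illustration.

// A minimal sketch of storing a per-user option in the Windows registry.
#include <windows.h>

void saveMusicVolume(DWORD volume)
{
    // Creates the value if it doesn't exist yet; overwrites it otherwise.
    RegSetKeyValueA(HKEY_CURRENT_USER, "Software\\MyGame",
                    "MusicVolume", REG_DWORD,
                    &volume, sizeof(volume));
}

DWORD loadMusicVolume(DWORD defaultVolume)
{
    DWORD volume = defaultVolume;
    DWORD size = sizeof(volume);

    RegGetValueA(HKEY_CURRENT_USER, "Software\\MyGame", "MusicVolume",
                 RRF_RT_REG_DWORD, nullptr, &volume, &size);

    return volume;  // falls back to the default if the value wasn't found
}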
In this section, we’ll take a brief look at how some real game engines manage their configuration options.
The Quake family of engines uses a configuration management system known as console variables, or cvars for short. A cvar is just a floating-point or string global variable whose value can be inspected and modified from within Quake’s in-game console. The values of some cvars can be saved to disk and later reloaded by the engine.
At runtime, cvars are stored in a global linked list. Each cvar is a dynamically allocated instance of struct cvar_t, which contains the variable’s name, its value as a string or float, a set of flag bits, and a pointer to the next cvar in the linked list of all cvars. Cvars are accessed by calling Cvar_Get(), which creates the variable if it doesn’t already exist, and modified by calling Cvar_Set(). One of the bit flags, CVAR_ARCHIVE, controls whether or not the cvar will be saved into a configuration file called config.cfg. If this flag is set, the value of the cvar will persist across multiple runs of the game.
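A heavily simplified sketch of such a system is shown below. It is illustrative only; it is not the actual Quake source code, and it stores just a float value rather than the string-plus-float representation described above (it also omits console integration and config.cfg serialization).

// A heavily simplified sketch of a console-variable ("cvar") system.
#include <cstring>

#define CVAR_ARCHIVE 1      // persist this cvar to the config file

struct cvar_t
{
    const char* name;       // e.g., "sv_gravity"
    float       value;      // current value
    int         flags;      // CVAR_ARCHIVE etc.
    cvar_t*     next;       // next cvar in the global linked list
};

static cvar_t* g_cvarList = nullptr;    // head of the global list

// Returns the named cvar, creating it with the given default if necessary.
cvar_t* Cvar_Get(const char* name, float defaultValue, int flags)
{
    for (cvar_t* v = g_cvarList; v != nullptr; v = v->next)
    {
        if (std::strcmp(v->name, name) == 0)
            return v;
    }

    cvar_t* v = new cvar_t { name, defaultValue, flags, g_cvarList };
    g_cvarList = v;
    return v;
}

// Changes the value of a cvar, creating it first if it doesn't exist.
void Cvar_Set(const char* name, float value)
{
    Cvar_Get(name, value, 0)->value = value;
}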
The OGRE rendering engine uses a collection of text files in Windows INI format for its configuration options. By default, the options are stored in three files (plugins.cfg, resources.cfg and ogre.cfg), each of which is located in the same folder as the executable program.
Out of the box, OGRE provides no mechanism for storing per-user configuration options. However, the OGRE source code is freely available, so it would be quite easy to change it to search for its configuration files in the user’s home directory instead of in the folder containing the executable. The Ogre::ConfigFile class makes it easy to write code that reads and writes brand new configuration files as well.
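For instance, reading a setting back out of one of these files with Ogre::ConfigFile might look something like the snippet below. The file name, section and key are placeholders, and the exact API varies somewhat between OGRE versions, so treat this as a sketch rather than a definitive usage.

// A small example of reading an OGRE-style INI setting via Ogre::ConfigFile.
#include <OgreConfigFile.h>
#include <OgreString.h>

void readRendererSetting()
{
    Ogre::ConfigFile cf;
    cf.load("ogre.cfg");

    // Look up a key, falling back to a default if it isn't present.
    Ogre::String renderSystem =
        cf.getSetting("Render System", "", "Direct3D11 Rendering Subsystem");

    // ... hand the value off to the engine's start-up code ...
}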
Naughty Dog’s engine makes use of a number of configuration mechanisms.
The Naughty Dog engine supports a powerful in-game menu system, allowing developers to control global configuration options and invoke commands. The data types of the configurable options must be relatively simple (primarily Boolean, integer and floating-point variables), but this limitation has not prevented the developers at Naughty Dog from creating literally hundreds of useful menu-driven options.
Each configuration option is implemented as a global variable, or a member of a singleton struct or class. When the menu item that controls an option is created, the address of the variable is provided, and the menu item directly controls its value. As an example, the following function creates a submenu containing some options for Naughty Dog’s rail vehicles (simple vehicles that ride on splines, which have been used in pretty much every Naughty Dog game, from the “Out of the Frying Pan” jeep chase level in Uncharted: Drake’s Fortune to the truck convoy / jeep chase sequence in Uncharted 4). It defines menu items controlling three global variables: two Booleans and one floating-point value. The items are collected onto a menu, and a special item is returned that will bring up the menu when selected. Presumably the code calling this function adds this item to the parent menu that it is building.
DMENU::ItemSubmenu* CreateRailVehicleMenu()
{
    extern bool g_railVehicleDebugDraw2D;
    extern bool g_railVehicleDebugDrawCameraGoals;
    extern float g_railVehicleFlameProbability;

    DMENU::Menu* pMenu = new DMENU::Menu("RailVehicle");

    pMenu->PushBackItem(
        new DMENU::ItemBool("Draw 2D Spring Graphs",
                            DMENU::ToggleBool,
                            &g_railVehicleDebugDraw2D));
    pMenu->PushBackItem(
        new DMENU::ItemBool("Draw Goals (Untracked)",
                            DMENU::ToggleBool,
                            &g_railVehicleDebugDrawCameraGoals));

    DMENU::ItemFloat* pItemFloat;
    pItemFloat = new DMENU::ItemFloat(
        "FlameProbability",
        DMENU::EditFloat, 5, "%5.2f",
        &g_railVehicleFlameProbability);
    pItemFloat->SetRangeAndStep(0.0f, 1.0f, 0.1f, 0.01f);
    pMenu->PushBackItem(pItemFloat);

    DMENU::ItemSubmenu* pSubmenuItem;
    pSubmenuItem = new DMENU::ItemSubmenu("RailVehicle...", pMenu);

    return pSubmenuItem;
}
The value of any option can be saved by simply marking it with the circle button on the Dualshock joypad when the corresponding menu item is selected. The menu settings are saved in an INI-style text file, allowing the saved global variables to retain their values across multiple runs of the game. The ability to control which options are saved on a per-menu-item basis is highly useful, because any option that is not saved will take on its programmer-specified default value. If a programmer changes a default, all users will “see” the new value, unless of course a user has saved a custom value for that particular option.
The Naughty Dog engine scans the command line for a predefined set of special options. The name of the level to load can be specified, along with a number of other commonly used arguments.
The vast majority of engine and game configuration information in the Naughty Dog engine (used to produce the Uncharted and The Last of Us series) is specified using a Lisp-like language called Scheme. Using a proprietary data compiler, data structures defined in the Scheme language are transformed into binary files that can be loaded by the engine. The data compiler also spits out header files containing C struct declarations for every data type defined in Scheme. These header files allow the engine to properly interpret the data contained in the loaded binary files. The binary files can even be recompiled and reloaded on the fly, allowing developers to alter the data in Scheme and see the effects of their changes immediately (as long as data members are not added or removed, as that would require a recompile of the engine).
The following example illustrates the creation of a data structure specifying the properties of an animation. It then exports three unique animations to the game. You may have never read Scheme code before, but for this relatively simple example it should be pretty self-explanatory. One oddity you’ll notice is that hyphens are permitted within Scheme symbols, so simple-animation is a single symbol (unlike in C/C++ where simple-animation would be the subtraction of two variables, simple and animation).
simple-animation.scm
;; Define a new data type called simple-animation.
(deftype simple-animation ()
  (
    (name             string)
    (speed            float :default 1.0)
    (fade-in-seconds  float :default 0.25)
    (fade-out-seconds float :default 0.25)
  )
)

;; Now define three instances of this data structure...
(define-export anim-walk
  (new simple-animation
    :name "walk"
    :speed 1.0
  )
)
(define-export anim-walk-fast
  (new simple-animation
    :name "walk"
    :speed 2.0
  )
)
(define-export anim-jump
  (new simple-animation
    :name "jump"
    :fade-in-seconds 0.1
    :fade-out-seconds 0.1
  )
)
This Scheme code would generate the following C/C++ header file:
simple-animation.h
// WARNING: This file was automatically generated from
// Scheme. Do not hand-edit.
struct SimpleAnimation
{
    const char* m_name;
    float       m_speed;
    float       m_fadeInSeconds;
    float       m_fadeOutSeconds;
};
In-game, the data can be read by calling the LookupSymbol() function, which is templated on the data type returned, as follows:
#include "simple-animation.h"

void someFunction()
{
    SimpleAnimation* pWalkAnim
        = LookupSymbol<SimpleAnimation*>(
            SID("anim-walk"));

    SimpleAnimation* pFastWalkAnim
        = LookupSymbol<SimpleAnimation*>(
            SID("anim-walk-fast"));

    SimpleAnimation* pJumpAnim
        = LookupSymbol<SimpleAnimation*>(
            SID("anim-jump"));

    // use the data here...
}
This system gives the programmers a great deal of flexibility in defining all sorts of configuration data—from simple Boolean, floating-point and string options all the way to complex, nested, interconnected data structures. It is used to specify detailed animation trees, physics parameters, player mechanics and so on.