2021 Edition
EVEN THOUGH .NET was announced in the year 2000, it is not becoming a grandfather technology. Instead, .NET keeps increasing developer traction since it has become open source and is available not only on Windows but also on Linux platforms. .NET can also run within the browser on the client—without the need to install a plugin—by using the WebAssembly standard.
New enhancements for C# and .NET focus not only on performance gains but also on ease of use. .NET is increasingly a choice for new developers.
C# is also attractive for long-term developers. Every year, Stack Overflow asks developers about the most loved, dreaded, and wanted programming languages and frameworks. For several years, C# has been within the top 10 of the most loved programming languages. ASP.NET Core now holds the top position as the most loved web framework. .NET Core is number one in the most loved other frameworks/libraries/tools category. See https://insights.stackoverflow.com/survey/2020 for details.
When you use C# and ASP.NET Core, you can create web applications and services (including microservices) that run on Windows, Linux, and Mac. You can use the Windows Runtime to create native Windows apps using C#, XAML, and .NET. You can create libraries that you share between ASP.NET Core, Windows apps, and .NET MAUI. You can also create traditional Windows Forms and WPF applications.
Most of the samples of this book are built to run on a Windows or Linux system. Exceptions are the Windows app samples that run only on the Windows platform. You can use Visual Studio, Visual Studio Code, or Visual Studio for the Mac as the developer environment; only the Windows app samples require Visual Studio.
.NET has a long history; the first version was released in the year 2002. The new .NET generation with a complete rewrite of .NET (.NET Core 1.0 in the year 2016) is very young. Recently, many features from the old .NET version have been brought to .NET Core to ease the migration experience.
When creating new applications, there is no reason not to move to the new .NET versions. Whether old applications should stay with the old version of .NET or be migrated to the new one depends on the features used, how difficult the migration is, and what advantages you gain after the application is migrated. The best options here need to be considered with an application-by-application analysis.
The new .NET provides easy ways to create Windows and web applications and services. You can create microservices running in Docker containers in a Kubernetes cluster; create web applications; use the new OpenTelemetry standard to analyze distributed traces in a vendor-independent manner; create web applications returning HTML, JavaScript, and CSS; and create web applications returning HTML, JavaScript, and .NET binaries that run in the client's browser in a safe and standard way using WebAssembly. You can create Windows applications in traditional ways using WPF and Windows Forms and make use of modern XAML features and controls that support the fluent design with WinUI and mobile applications with .NET MAUI.
.NET uses modern patterns. Dependency injection is built into core services, such as ASP.NET Core and EF Core, which not only makes unit testing easier but also allows developers to easily enhance and change features from these technologies.
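As a minimal sketch of the built-in dependency injection mentioned above, the following console program registers a service with the Microsoft DI container and resolves it. It assumes the Microsoft.Extensions.DependencyInjection NuGet package is referenced; the names IGreeter, Greeter, and Welcome are hypothetical and exist only for this illustration.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Register the hypothetical contract together with its implementation.
var services = new ServiceCollection();
services.AddTransient<IGreeter, Greeter>();
services.AddTransient<Welcome>();

// Build the container; it creates Welcome and injects IGreeter into the constructor.
using ServiceProvider provider = services.BuildServiceProvider();
Welcome welcome = provider.GetRequiredService<Welcome>();
Console.WriteLine(welcome.Message("Stephanie"));

// Hypothetical contract (illustration only).
public interface IGreeter
{
    string Greet(string name);
}

// Hypothetical implementation (illustration only).
public class Greeter : IGreeter
{
    public string Greet(string name) => $"Hello, {name}";
}

// Depends on the contract only, never on the concrete Greeter class.
public class Welcome
{
    private readonly IGreeter _greeter;
    public Welcome(IGreeter greeter) => _greeter = greeter;
    public string Message(string name) => _greeter.Greet(name);
}
```

Because Welcome depends only on the interface, a unit test could pass a fake IGreeter to the constructor without using the container at all.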
.NET runs on multiple platforms. Besides Windows and macOS, many Linux environments are supported, such as Alpine, CentOS, Debian, Fedora, openSUSE, Red Hat, SLES, and Ubuntu.
.NET is open source (https://github.com/dotnet) and freely available. You can find meeting notes for the C# compiler (https://github.com/dotnet/csharplang), the source code for the C# compiler (https://github.com/dotnet/Roslyn), the .NET runtime and libraries (https://github.com/dotnet/runtime), and ASP.NET Core (https://github.com/dotnet/aspnetcore) with Razor Pages, Blazor, and SignalR.
Here's a summary of some of the features of the new .NET:
When C# was released in the year 2002, it was a language developed for the .NET Framework. C# was designed with ideas from C++, Java, and Pascal. Anders Hejlsberg had come to Microsoft from Borland and brought experience from the language development of Delphi. At Microsoft, Hejlsberg worked on Microsoft's version of Java, named J++, before creating C#.
C# started not only as an object-oriented general-purpose programming language but also as a component-based programming language that supported properties, events, attributes (annotations), and building assemblies (binaries including metadata).
Over time, C# was enhanced with generics, Language Integrated Query (LINQ), lambda expressions, dynamic features, and easier asynchronous programming. C# is not an easy programming language because of the many features it offers, but it's continuously evolving with features that are practical to use. With this, C# is more than an object-oriented or component-based language; it also includes ideas of functional programming—things that are of practical use for a general-purpose language developing all kinds of applications.
Nowadays, a new version of C# is released every year. C# 8 added nullable reference types, and C# 9 added records and more. C# 10 is releasing with .NET 6 in 2021 and C# 11 will be released with .NET 7 in 2022. Because of the frequency of changes nowadays, check the GitHub repository for the book (read more in the section “Source Code”) for continuous updates.
Every year, a new version of C# is released, with many new features available in each version. The latest versions include features such as nullable reference types to reduce exceptions of type NullReferenceException and instead let the compiler help more; features to increase productivity such as indices and ranges; switch expressions that make the switch statement look old; features for using declarations; and enhancements with pattern matching. Top-level statements reduce the number of source code lines in small applications, and records are classes where the compiler creates boilerplate code for equality comparison, deconstruction, and with expressions. Code generators allow creating code automatically while the compiler runs. All these new features are covered in this book.
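To make some of these language features concrete, here is a small sketch that should compile with C# 9 or later. The Book record, its values, and the Describe function are hypothetical and serve only as an illustration.

```csharp
#nullable enable
using System;

// Top-level statements: no Program class or Main method is required.
Book book = new("Professional C#", 2021);
Book updated = book with { Year = 2022 };    // with expression: copy with one change

// Records compare by value; the two instances differ in Year.
Console.WriteLine(book == updated);

// Deconstruction generated by the compiler for positional records.
var (title, year) = updated;
Console.WriteLine($"{title}: {Describe(year)}");

// Indices and ranges.
int[] numbers = { 1, 2, 3, 4, 5 };
int last = numbers[^1];          // last element
int[] middle = numbers[1..4];    // elements at index 1, 2, and 3
Console.WriteLine($"{last}, {middle.Length}");

// Nullable reference types: the compiler warns if subtitle is dereferenced unchecked.
string? subtitle = null;
int length = subtitle?.Length ?? 0;
Console.WriteLine(length);

// Switch expression with relational and constant patterns.
string Describe(int y) => y switch
{
    < 2002 => "before the first C# release",
    2021 => "the year of this edition",
    _ => "another year"
};

// Positional record: the compiler creates equality, deconstruction, and with support.
record Book(string Title, int Year);
```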
ASP.NET Core now contains new technology for creating web applications: Blazor Server and Blazor WebAssembly. With Blazor, you have a full-stack option to write C# code both for the client and for the server. With Blazor Server, the Razor components you create containing HTML and C# code run on the server. With Blazor WebAssembly, Razor components written with C# and HTML run on the client using the WebAssembly standard, which allows you to run binary code in the browser and is supported by all modern web browsers.
For creating services, you can now use gRPC with ASP.NET Core for binary communication between services. This is a great option for service-to-service communication to reduce the bandwidth needed, as well as CPU and memory usage if a lot of data transfer is needed.
For developing applications for Windows, a new technology combines the features of the Universal Windows Platform and desktop applications: WinUI 3. WinUI is the native UI platform for Windows 10 applications. With WinUI 3, you can use modern XAML code that includes compiled binding to create desktop applications. New controls with Microsoft's fluent design system are available. These controls are not delivered with the Windows Runtime as was previously the case with the Universal Windows Platform (UWP). Instead, they are developed independently of the Windows 10 version, which allows you to use the newest controls with Windows 10 versions 1809 and above. As the roadmap available with WinUI shows, these new controls will be usable from WPF applications as well.
.NET runs on Windows, Linux, and Mac operating systems. You can create and build your programs on any of these operating systems using Visual Studio Code (https://code.visualstudio.com). You can build and run most of the samples on Windows or Linux and use the .NET development tools of your choice. Only the WinUI applications require you to use the Windows platform, and here, Visual Studio is the best option to use. The minimum Visual Studio version required to build and run the WinUI applications is 16.10.
The command line plays an important part when using the .NET CLI and the Azure CLI; you can use the new Windows Terminal. With the newest Windows 10 versions, this terminal is delivered as part of Windows. With older versions, you can download it from the Microsoft Store.
Most .NET developers use the Windows platform as their development machine. When using the Windows Subsystem for Linux (WSL 2), you can build and run your .NET applications in a Linux environment, and you can install different Linux distributions from your Windows environment and access the same files. Visual Studio even allows debugging your .NET applications while they run in a Linux environment on WSL 2.
With some samples of the book, Microsoft Azure is shown as an optional hosting environment to run your web applications, use Azure Functions, and use Entity Framework Core to access SQL Server and Azure Cosmos DB. For this, you can use a free trial offering from Microsoft Azure; visit https://azure.microsoft.com/free to register.
This book covers these four major parts:
Let's get into the different parts and all the chapters in more detail.
The first part of this book covers all the aspects of the C# programming language. You learn the syntax options and see how the C# syntax integrates with classes and interfaces from .NET. This part gives good grounding in the C# language. This section doesn't presume knowledge of any particular programming language, but it's assumed you are an experienced programmer. You start looking at C#'s basic syntax and data types before getting into advanced C# features.
... switch expressions.
... Span type to access arrays, and use the new index and range operators to access arrays.
... async and await in action, not only with the task-based async pattern but also with async streams, which is a new feature since C# 8.
... IDisposable interface with the using statement and the new using declaration but also demonstrates using the Span type with managed and unmanaged memory. You can read about using Platform Invoke both with Windows and with Linux environments.
Part II starts with creating custom libraries and NuGet packages, but the major topics covered with Part II are for using .NET libraries that are important for all application types.
... Host class is used to configure a dependency injection container and the built-in options to retrieve configuration information from a .NET application with different configuration providers, including Azure App Configuration and user secrets.
... Host class to configure logging options. You also learn about reading metric information that's offered from some .NET providers, using Visual Studio App Center, and extending logging for distributed tracing with OpenTelemetry.
... Task class. In Chapter 17, more of the Task class is shown, such as forming task hierarchies and using value tasks. The chapter goes into issues of parallel programming such as race conditions and deadlocks, and for synchronization, you learn about different features available with the lock keyword, the Monitor, SpinLock, Mutex, and Semaphore classes, and more.
... Span type but also covers the new .NET JSON serializer with classes in the System.Text.Json namespace.
... Socket class and how to create applications using TCP and UDP. You also use the HttpClient factory pattern to create HttpClient objects with automatic retries if transient errors occur.
... Microsoft.Identity platform for user authentication, and provides information on web security and what you need to be aware of with encoding issues as well as cross-site request forgery attacks.
Part III of this book is dedicated to ASP.NET Core technologies for creating web applications and services, no matter whether you run these applications and services in your on-premises environment or in the cloud making use of Azure App Services, Azure Static Web Apps, or Azure Functions.
Part IV of this book is dedicated to XAML code and creating Windows applications with the native UI platform for Windows 10: WinUI. Much of the information you get here can also be applied to WPF applications and to .NET MAUI and developing XAML-based applications for mobile platforms.
To help you get the most from the text and keep track of what's happening, I use some conventions throughout the book.
As for styles in the text:
persistence.properties.
We present code in two different ways:
We use a monofont type with no highlighting for most code examples.
We use bold to emphasize code that's particularly important in the present context or to show changes from a previous code snippet.
As you work through the examples in this book, you may choose either to type all the code manually or to use the source code files that accompany the book. All the source code used in this book is available for download at www.wiley.com. When at the site, simply locate the book's title (either by using the Search box or by using one of the title lists) and click the Download Code link on the book's detail page to obtain all the source code for the book.
After you download the code, just decompress it with your favorite compression tool.
The source code is also available on GitHub at https://www.github.com/ProfessionalCSharp/ProfessionalCSharp2021. With GitHub, you can also open each source code file with a web browser. When you use the website, you can download the complete source code in a zip file. You can also clone the source code to a local directory on your system. Just install the Git tools, which you can do with Visual Studio or by downloading the Git tools from https://git-scm.com/downloads for Windows, Linux, and Mac. To clone the source code to a local directory, use git clone:
> git clone https://www.github.com/ProfessionalCSharp/ProfessionalCSharp2021
With this command, the complete source code is copied to the subdirectory ProfessionalCSharp2021. From there, you can start working with the source files.
As updates of .NET become available (until the next edition of the book is released), the source code will be updated on GitHub. Check the readme.md file in the GitHub repo for updates. If the source code changes after you cloned it, you can pull the latest changes after changing your current directory to the directory of the source code:
> git pull
In case you've made some changes on the source code, git pull might result in an error. If this happens, you can stash away your changes and pull again:
> git stash
> git pull
The complete list of git commands is available at https://git-scm.com/docs.
In case you have questions on the source code, use discussions with the GitHub repository. If you find an error with the source code, create an issue. Open https://github.com/ProfessionalCSharp/ProfessionalCSharp2021 in the browser, click the Issues tab, and click the New Issue button. This opens an editor. Be as descriptive as possible when you describe your issue.
For reporting issues, you need a GitHub account. If you have a GitHub account, you can also fork the source code repository to your account. For more information on using GitHub, check https://guides.github.com/activities/hello-world.
We make every effort to ensure that there are no errors in the text or in the code. However, no one is perfect, and mistakes do occur. If you find an error in one of our books, like a spelling mistake or faulty piece of code, we would be grateful for your feedback. By sending in errata, you may save another reader hours of frustration, and at the same time you can help provide even higher-quality information.
To find the errata page for this book, go to www.wiley.com and locate the title using the Search box or one of the title lists. Then, on the book details page, click the Book Errata link. On this page, you can view all errata that have been submitted for this book and posted by the book's editors.
If you don't spot “your” error on the Book Errata page, go to https://support.wiley.com/s/article/reporting-a-wiley-book-error for information about how to send us the error you have found. We'll check the information and, if appropriate, post a message to the book's errata page and fix the problem in subsequent editions of the book.
The first version of .NET was released in 2002. Since the first version, many things have changed. The first era of .NET was the .NET Framework that offered Windows Forms for Windows desktop development and Web Forms to create web applications. This version of .NET was available only for Microsoft Windows. At that time, Microsoft also initiated a standard for C# at ECMA (https://www.ecma-international.org/publications/standards/Ecma-334.htm).
Later, Silverlight used a subset of this technology with a limited library and runtime running in browsers using a browser add-in. At that time, the company Ximian developed the Mono runtime. This runtime was available for Linux and Android and offered a subset of Microsoft .NET’s functionality. Later, Novell bought Ximian, and Novell was later bought by The Attachmate Group. As the new organization lost interest in .NET, Miguel de Icaza (the founder of Ximian) started Xamarin and took the interesting .NET parts into his new organization to start .NET for Android and iOS. Nowadays, Xamarin belongs to Microsoft, and the Mono runtime is part of the dotnet runtime repo (https://github.com/dotnet/runtime).
Silverlight started .NET development for other devices with different form factors, which have different needs for .NET. Silverlight was not successful in the long term because HTML5 offered features that previously only were available by using browser add-ins. However, Silverlight started moving .NET in other directions that resulted in .NET Core.
.NET Core was the biggest change to .NET since its inception. .NET code became open-source, you could create apps for other platforms, and the new code base of .NET is using modern design patterns. The next step is a logical move: the version of .NET after .NET Core 3.1 is .NET 5. The Core name is removed, and version 4 was skipped to send a message to .NET Framework developers that there's a higher version than .NET Framework 4.8, and it's time to move to .NET 5 for creating new applications.
For developers using .NET Core, the move is an easy one. With existing applications, usually all that needs to be changed is the version number of the target framework. Moving applications from the .NET Framework is not that easy and might require bigger changes. Depending on the application type, more or less change is needed. .NET Core 3.x supports WPF and Windows Forms applications. With these application types, the change can be easy. However, existing .NET Framework WPF applications may have features that cannot be moved that easily to the new .NET. For example, application domains are not supported with .NET Core and .NET 5. Moving Windows Communication Foundation (WCF) services to .NET 5 is not at all easy. The server part of WCF is not supported in the new .NET era. The WCF part of the application needs to be rewritten to ASP.NET Core Web API, gRPC, or another communication technology that fulfills the needs.
With existing applications, it can be useful to stay with the .NET Framework instead of changing to the new .NET because the old framework will still be maintained for many years to come. The .NET Framework is installed with Windows 10, and its support has a long horizon that is bound to the support of the Windows 10 versions.
The new .NET and NuGet packages allow Microsoft to provide faster update cycles for delivering new features. It's not easy to decide what technology should be used for creating applications. This chapter helps you with that decision. It gives you information about the different technologies available for creating Windows and web apps and services, offers guidance on what to choose for database access, and helps with moving from old technologies to new ones. You'll also read about the .NET tooling that you can use with the code samples through all the chapters of this book.
Before digging deeper, you should understand concepts and some important .NET terms, such as what's in the .NET SDK and what the .NET runtime is. You also should get a better understanding of the .NET Framework and .NET, when to use the .NET Standard, and the NuGet packages and .NET namespaces.
For developing .NET applications, you need to install the .NET SDK. The SDK contains the .NET command-line interface (CLI), tools, libraries, and the runtime. With the .NET CLI, you can create new applications based on templates, restore packages, build and test the application, and create deployment packages. Later in this chapter in the section “.NET CLI,” you will see how to create and build applications.
If you use Visual Studio 2019, the .NET SDK is installed as part of Visual Studio. If you don't have Visual Studio, you can install the SDK from https://dot.net. Here, you can find instructions on how to install the SDK on Windows, Mac, and Linux systems.
You can install multiple versions of the .NET SDK in parallel. The command
> dotnet --list-sdks
shows all the different SDK versions that are installed on the system. By default, the latest version is used.
You can create a global.json file if you do not want to use the latest version of the SDK. The command
> dotnet new globaljson
creates the file global.json in the current directory. This file contains the version element with the version number currently used. You can change the version number to one of the other SDK versions that is installed:
{
  "sdk": {
    "version": "5.0.202"
  }
}
In the directory of global.json and its subdirectories, the specified SDK version is used. You can verify this with
> dotnet --version
On the target system, the .NET SDK is not required. Here you just need to install the .NET runtime. The runtime includes all the core libraries and the dotnet driver.
The dotnet driver is used to run the application—for example, the Hello, World application with
> dotnet hello-world.dll
At https://dot.net, you can find not only instructions to download and install the SDK on different platforms but also the runtime.
Instead of installing the runtime on the target system, you also can deliver the runtime as part of the application (which is known as self-contained deployment). This technique is very different from older .NET Framework applications and is covered later in the chapter in the “Using the .NET CLI” section.
To see which runtimes are installed, you can use
> dotnet --list-runtimes
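A running application can also report which runtime it was started with. The following minimal sketch of a Hello, World program prints the runtime description and the operating system; the exact strings depend on the installed runtime and platform, so no output is shown here.

```csharp
using System;
using System.Runtime.InteropServices;

Console.WriteLine("Hello, World!");

// Name and version of the runtime that executes this application.
Console.WriteLine($"Runtime: {RuntimeInformation.FrameworkDescription}");

// Version of the runtime as a System.Version value.
Console.WriteLine($"Version: {Environment.Version}");

// Operating system the runtime is hosted on.
Console.WriteLine($"OS: {RuntimeInformation.OSDescription}");
```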
The C# compiler compiles C# code to Microsoft Intermediate Language (IL) code. This code is a little bit like assembly code, but it has more object-oriented features. The IL code is run by the Common Language Runtime (CLR). What's done by a CLR?
The IL code is compiled to native code by the CLR. The IL code available in .NET assemblies is compiled by a Just-In-Time (JIT) compiler. This compiler creates platform-specific native code. The runtime includes a JIT compiler named RyuJIT. This compiler is not only faster than the previous one, but it also has better support for using Edit & Continue while you're debugging the application with Visual Studio.
The runtime also includes a type system with a type loader that is responsible for loading types from assemblies. Security infrastructure with the type system verifies whether certain type system structures are permitted—for example, with inheritance.
After instances of types are created, they also need to be destroyed, and memory needs to be recycled. Another feature of the runtime is the garbage collector. The garbage collector cleans up memory from objects that are no longer referenced in the managed heap.
The runtime is also responsible for threading. When you are creating a managed thread from C#, it is not necessarily a thread from the underlying operating system. Threads are virtualized and managed by the runtime.
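The garbage collector and managed threads can be observed from code. The following is only a sketch: the byte counts it prints vary between runs and platforms, and explicitly calling GC.Collect is done here for demonstration rather than as a recommended practice. The helper AllocateShortLivedObjects is a hypothetical name.

```csharp
using System;

long before = GC.GetTotalMemory(forceFullCollection: true);

AllocateShortLivedObjects();
long afterAllocation = GC.GetTotalMemory(forceFullCollection: false);

// Ask the garbage collector to reclaim objects that are no longer referenced.
GC.Collect();
GC.WaitForPendingFinalizers();
long afterCollection = GC.GetTotalMemory(forceFullCollection: true);

Console.WriteLine($"Before allocation: {before} bytes");
Console.WriteLine($"After allocation:  {afterAllocation} bytes");
Console.WriteLine($"After collection:  {afterCollection} bytes");

// The identifier of the managed thread, which the runtime assigns.
Console.WriteLine($"Managed thread ID: {Environment.CurrentManagedThreadId}");

// Creates arrays that become unreachable as soon as each loop iteration ends.
static void AllocateShortLivedObjects()
{
    for (int i = 0; i < 10_000; i++)
    {
        _ = new byte[1024];
    }
}
```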
The C# compiler that's installed as part of the SDK belongs to the .NET Compiler Platform, which is also known by the code name Roslyn. Roslyn allows you to interact with the compilation process, work with syntax trees, and access the semantic model that is defined by language rules. You can use Roslyn to write code analyzers and refactoring features. You also can use Roslyn with a new feature of C# 9, code generators, which are discussed in Chapter 12, “Reflection, Metadata, and Source Generators.”
The .NET Framework is the name of the old .NET. The last version available is .NET Framework 4.8. It's not that useful to create new applications with this framework, but of course you can maintain existing applications because this technology will still be supported for many years to come. If existing applications don't get any advantages by moving to new technologies and there's not a lot of maintenance going on, there's no need to switch in the short term.
Depending on the technologies used with existing applications, the switch to .NET can be easy. WPF and Windows Forms have been available with the new .NET since .NET Core 3. However, WPF and Windows Forms applications could have used features where the application architecture might need a change.
Examples of technologies that are no longer offered with new versions of .NET are ASP.NET Web Forms, Windows Communication Foundation (WCF), and Windows Workflow Foundation (WF). Instead of ASP.NET Web Forms, you can rewrite applications using ASP.NET Blazor. Instead of WCF, you can use ASP.NET Core Web API or gRPC. Instead of WF, moving to Azure Logic Apps might be useful.
.NET Core is the new .NET that is used by all new technologies and is a main focus of this book (with the new name .NET). This framework is open source, and you can find it at http://www.github.com/dotnet. The runtime is the CoreCLR repository; the framework containing collection classes, file system access, console, XML, and a lot more is in the CoreFX repository.
Unlike the .NET Framework, where the specific version you needed for the application had to be installed on the system, with .NET Core 1.0, the framework, including the runtime, is delivered with the application. Previously, there were times when you might have had problems deploying an ASP.NET web application to a shared server because the provider had older versions of .NET installed; those times are gone. Now you can deliver the runtime with the application, and you are not dependent on the version installed on the server.
.NET Core is designed in a modular approach. The framework is split into a large list of NuGet packages. So that you don't have to deal with all the packages, metapackages reference the smaller packages that work together. This improved further with .NET Core 2.0 and ASP.NET Core 2.0. With ASP.NET Core 2.0, you just need to reference Microsoft.AspNetCore.All to get all the packages you typically need with ASP.NET Core web applications.
.NET Core can be updated at a fast pace. Even updating the runtime doesn't influence existing applications because the runtime can be installed with the applications. Now Microsoft can improve .NET Core, including the runtime, with faster release cycles.
Starting with .NET 5, .NET Core has a new name: .NET. Removing “Core” from the name should tell developers who still use the .NET Framework that there's not a new version of the .NET Framework from now on. The .NET Framework is no longer receiving new features. For new applications, you should use .NET.
.NET Standard is an important specification when creating and using libraries. .NET Standard offers a contract rather than an implementation. With this contract, available APIs are listed. With every new version of .NET Standard, new APIs are added. APIs are never removed. For example, .NET Standard 2.1 lists more APIs than .NET Standard 1.6.
When you're creating a library, you probably want to use as many APIs as possible, so I suggest you choose the most recent .NET Standard version. However, the highest standard version also means the lowest number of platforms that support this standard, so you may need to take that into consideration.
A table at https://docs.microsoft.com/dotnet/standard/net-standard gives you the details on what platform supports which version of the standard. For example, .NET Framework 4.6.1 and later support up to .NET Standard 2.0. In addition, .NET Core 3.0 and later (which includes .NET 5 and later) support .NET Standard 2.1. The Universal Windows Platform build 10.0.16299 supports .NET Standard 2.0. Xamarin.Android 10.0 supports .NET Standard 2.1.
As of .NET 5, the .NET Standard becomes irrelevant. If you're creating libraries with .NET 5, you can use them from applications written with .NET 5, .NET 6, and later. Similarly, when you're creating libraries with .NET 7, you can use them from applications written with .NET 7 and later.
However, we can't expect that the .NET Framework, Mono, and other older technologies will just fade away, so .NET Standard will still be needed for many years to come. If you need to support older technologies with your libraries, you'll still need .NET Standard.
In the early days, assemblies were reusable units with applications. That use is still possible (and necessary with some assemblies) when you're adding a reference to an assembly for using the public types and methods from your own code. However, using libraries can mean a lot more than just adding a reference and using it. Using libraries can also mean making some configuration changes or using scripts to take advantage of some features. The target framework determines which binaries you can use. These are reasons to package assemblies within NuGet packages, which are zip files that contain the assembly (or multiple assemblies) as well as configuration information and PowerShell scripts.
Another reason for using NuGet packages is that they can be found easily; they're available not only from Microsoft but also from third parties. NuGet packages are easily accessible on the NuGet server at https://www.nuget.org.
You can add NuGet packages to applications with the .NET CLI:
> dotnet add package <package-name>
From the references within a Visual Studio project, you can open the NuGet Package Manager (see Figure 1-1). There you can search for packages and add them to the application. This tool enables you to search for packages that are not yet released (including prerelease options) and define the NuGet server that should be searched for packages. One place to search for packages can be your own shared directory where you've placed your internal packages that you've used.
The classes available with .NET are organized in namespaces. Most of these namespaces start with the name System or Microsoft. The following table describes a few of the namespaces to give you an idea about the hierarchy:
NAMESPACE | DESCRIPTION |
---|---|
System.Collections | This is the root namespace for collections. Collections are also found within subnamespaces such as System.Collections.Concurrent and System.Collections.Generic. |
System.Diagnostics | This is the root namespace for diagnostics information, such as event logging and tracing (in the namespace System.Diagnostics.Tracing). |
System.Globalization | This is the namespace that contains classes for globalization and localization of applications. |
System.IO | This is the namespace for file input/output (I/O), which includes classes that access files and directories. Readers, writers, and streams are here. |
System.Net | This is the namespace for core networking, such as accessing DNS servers and creating sockets with System.Net.Sockets. |
System.Threading | This is the root namespace for threads and tasks. Tasks are defined within System.Threading.Tasks. |
Microsoft.Data | This is the namespace for accessing databases. Microsoft.Data.SqlClient contains classes that access the SQL Server. The previous classes from System.Data have been repackaged into Microsoft.Data. |
Microsoft.Extensions.DependencyInjection | This is the namespace for the Microsoft DI container that is part of .NET. |
Microsoft.EntityFrameworkCore | To access relational and NoSQL databases, Entity Framework Core can be used. Types are defined in this namespace. |
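To make the hierarchy a bit more concrete, the following minimal sketch uses one type from each of a few namespaces in the table. The class name NamespaceTour and the sample values are made up for illustration only.

```csharp
using System;
using System.Collections.Generic;   // List<T>
using System.Globalization;         // CultureInfo
using System.IO;                    // Path

// Hypothetical helper class; the name is an assumption for this sketch.
public static class NamespaceTour
{
    public static string Describe()
    {
        // System.Collections.Generic: a strongly typed list
        var names = new List<string> { "System.IO", "System.Net" };

        // System.IO: extract the file name from a path
        string file = Path.GetFileName("HelloWorld/Program.cs");

        // System.Globalization: format a number independent of the current culture
        string number = 1234.5.ToString("F1", CultureInfo.InvariantCulture);

        return $"{names.Count} namespaces, file {file}, number {number}";
    }

    public static void Main() => Console.WriteLine(Describe());
}
```

Each using directive opens one namespace, so the types can be used without their full names.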
When you're working in the new era of .NET, you should know about versions with different support cycles. .NET releases differ based on a Current or Long-Term Support (LTS) moniker. LTS versions are supported for at least three years, or for one year after the next LTS version is available, whichever is longer. If, for example, the next LTS version is available 2.5 years after the previous one was released, the previous one has a support length of 3.5 years. Current versions are supported for only three months after the next version is available. At the time of this writing, .NET Core 2.2 and 3.0 are Current versions that are already no longer supported with security and hot fixes, whereas .NET Core 2.1 and 3.1 are LTS versions that still have support. The following table lists the .NET Core and .NET versions with their release dates, support level, and end-of-life dates:
.NET CORE/.NET VERSION | RELEASE DATE | SUPPORT LEVEL | END OF LIFE |
---|---|---|---|
1.0 | June 27, 2016 | LTS | June 27, 2019 |
1.1 | Nov. 16, 2016 | LTS* | June 27, 2019 |
2.0 | Aug. 14, 2017 | Current | Oct. 1, 2018 |
2.1 | May 30, 2018 | LTS | Aug. 21, 2021 |
2.2 | Dec. 4, 2018 | Current | Dec. 23, 2019 |
3.0 | Sep. 23, 2019 | Current | Mar. 3, 2020 |
3.1 | Dec. 3, 2019 | LTS | Dec. 3, 2022 |
5.0 | Nov. 10, 2020 | Current | around Feb. 2022 |
6.0 | Nov. 2021 | LTS | Nov. 2024 |
7.0 | Nov. 2022 | Current | Feb. 2024 or earlier in case minor versions are released |
8.0 | Nov. 2023 | LTS | Nov. 2026 |
Starting with .NET 5, the versions become more predictable. Every year in November, a new major release is available. Every second year, the release is an LTS version.
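The support rules described before the table boil down to simple date arithmetic. The following sketch assumes the rules exactly as stated in the text (three years or one year after the next LTS for LTS releases, three months after the next release for Current releases). The class and method names are hypothetical, and the real end-of-life dates announced by Microsoft can deviate from what this calculation returns.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; it only encodes the rules of thumb from the text.
public static class SupportPolicy
{
    // LTS: the later of (release + 3 years) and (next LTS release + 1 year)
    public static DateTime LtsEndOfSupport(DateTime released, DateTime nextLtsReleased)
    {
        DateTime threeYears = released.AddYears(3);
        DateTime oneYearAfterNext = nextLtsReleased.AddYears(1);
        return threeYears > oneYearAfterNext ? threeYears : oneYearAfterNext;
    }

    // Current: three months after the next release
    public static DateTime CurrentEndOfSupport(DateTime nextReleased) =>
        nextReleased.AddMonths(3);

    public static void Main()
    {
        var released = new DateTime(2020, 1, 1);      // made-up LTS release date
        DateTime nextLts = released.AddMonths(30);    // next LTS arrives 2.5 years later
        DateTime end = LtsEndOfSupport(released, nextLts);

        // Prints 2023-07-01, which is 3.5 years after the release
        Console.WriteLine(end.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
    }
}
```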
Depending on the environment you're working in, you might decide to use LTS or Current versions. With current versions, you get new features faster, but you need to upgrade to newer versions more often. While the application is in its active development stage, you might decide to use the current version. As your application is becoming more stable, you can switch to the next LTS version.
If you already started development with continuous integration/continuous delivery (CI/CD), it can be an easy task to use only current versions and receive new features faster.
You can use C# to create console applications; with most code samples in the first chapters of this book, you'll do that exact thing. For many programs, console applications are not used that often. You can use C# to create applications that use many of the technologies associated with .NET. This section gives you an overview of the different types of applications that you can write in C#.
Before taking a look at the application types themselves, let's look at technologies that are used by all application types for access to data.
Files and directories can be accessed by using simple API calls; however, the simple API calls are not flexible enough for some scenarios. With the Stream API, you have a lot of flexibility, and the streams offer many more features, such as encryption or compression. Readers and writers make using streams easier. All of the different options available here are covered in Chapter 18, “Files and Streams.”
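As a rough sketch of the difference between the simple file API and the Stream API, the following code writes and reads a text file in both ways. The second variant wraps the file stream in a GZipStream for compression and uses a writer and a reader on top of it. The class name, method names, and file names are assumptions made for this illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical sample class; names are chosen for illustration only.
public static class FileAccessSketch
{
    // Simple API: one call writes the text, one call reads it back.
    public static string RoundTripSimple(string path, string text)
    {
        File.WriteAllText(path, text);
        return File.ReadAllText(path);
    }

    // Stream API: the FileStream is wrapped in a GZipStream (compression),
    // and StreamWriter/StreamReader make writing and reading text easier.
    public static string RoundTripCompressed(string path, string text)
    {
        using (FileStream file = File.Create(path))
        using (var gzip = new GZipStream(file, CompressionMode.Compress))
        using (var writer = new StreamWriter(gzip))
        {
            writer.Write(text);
        }

        using (FileStream file = File.OpenRead(path))
        using (var gzip = new GZipStream(file, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        string dir = Path.GetTempPath();
        Console.WriteLine(RoundTripSimple(Path.Combine(dir, "sample.txt"), "Hello, files!"));
        Console.WriteLine(RoundTripCompressed(Path.Combine(dir, "sample.txt.gz"), "Hello, streams!"));
    }
}
```

The compressed variant needs more code, but the same layering works with other stream types as well.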
To read and write to databases, you can use an abstraction layer, Entity Framework Core (Chapter 21, “Entity Framework Core”). Entity Framework Core offers a mapping of object hierarchies to the relations of a database. EF Core not only offers using different relational databases but also has support for NoSQL databases, such as Azure Cosmos DB.
For creating Windows apps, you can use the new UI control WinUI 3.0 to create either Universal Windows Platform (UWP) or Windows desktop applications. UWP applications make use of a sandboxed environment where the application needs to request permissions from the user depending on the APIs used. The desktop application version can be compared to a WPF and Windows Forms application where nearly all .NET 5 APIs can be used. WPF and Windows Forms applications can also be updated to use new modern WinUI controls.
Creating WinUI applications with XAML code using the MVVM pattern is covered in Chapter 30, “Patterns with XAML Apps,” and the chapters that follow it.
For creating web applications with .NET, several options are available. A technology that implements the Model-View-Controller (MVC) pattern with the application structure is ASP.NET Core MVC. If you have an existing .NET Framework ASP.NET MVC application, the move to ASP.NET Core MVC shouldn't be too hard.
ASP.NET Core Razor Pages provide an easier option compared to the MVC pattern. Razor Pages can use code-behind or mix the C# code with the HTML page. This solution is easier to start with, and it also can be used with MVC. The dependency injection features of Razor Pages make it easy to create reusable code.
ASP.NET Core Blazor is a new technology that is used to get rid of JavaScript code. With a server-side variant, user interface events are handled on the server. The client and server are continuously connected using SignalR behind the scenes. Another variant of Blazor is using WebAssembly on the client. With this, you can use C#, HTML, and CSS to write code that runs as binary code in the client. Because WebAssembly is a W3C standard that is supported by all modern browsers, Blazor runs there without the need for an add-in.
The original introduction of ASP.NET fundamentally changed the web programming model. ASP.NET Core changed it again. ASP.NET Core allows the use of .NET Core for high performance and scalability, and it runs not only on Windows but also on Linux systems.
With ASP.NET Core, ASP.NET Web Forms is no longer covered. (ASP.NET Web Forms can still be used and is updated with .NET 4.7.)
ASP.NET Core MVC is based on the well-known MVC pattern for easier unit testing. It also allows a clear separation for writing user interface code with HTML, CSS, and JavaScript, and it uses C# on the back end.
SOAP and WCF fulfilled their duties in the past. Modern apps make use of Representational State Transfer (REST) and the Web API. Using ASP.NET Core to create a Web API is an option that is a lot easier for communication and fulfills more than 90 percent of requirements by distributed applications. This technology is based on REST, which defines guidelines and best practices for stateless and scalable web services.
The client can receive JSON or XML data. JSON and XML can also be formatted in a way to make use of the Open Data (OData) specification.
The features of this new API make it easy to consume from web clients using JavaScript, .NET, and other technologies.
Creating a Web API is a good approach for creating microservices. The approach to build microservices defines smaller services that can run and be deployed independently and have their own control of a data store.
To describe the services, a new standard has been developed, the OpenAPI (https://www.openapis.org), which has its roots with Swagger (https://swagger.io/).
For remote procedure call (RPC)-like communication, you can use gRPC, which offers binary communication based on HTTP/2 that can be used across different platforms.
For real-time web functionality and bidirectional communication between the client and the server, SignalR is an ASP.NET Core technology. SignalR allows pushing information to connected clients as soon as information is available. SignalR makes use of the WebSocket technology to push information.
Nowadays, you can't ignore the cloud when considering the development picture. Although this book doesn't include a dedicated chapter on cloud technologies, Microsoft Azure is referenced in several chapters in this book.
Microsoft Azure offers software as a service (SaaS), infrastructure as a service (IaaS), platform as a service (PaaS), and functions as a service (FaaS), and sometimes offerings are in between these categories. Let's take a look at some Microsoft Azure offerings.
SaaS offers complete software; you don't have to deal with management of servers, updates, and so on. Office 365 is one of the SaaS offerings for using email and other services via a cloud offering. A SaaS offering that's relevant for developers is Azure DevOps Services. Azure DevOps Services is the cloud version of Azure DevOps Server (previously known as Team Foundation Server) that can be used for private and public code repositories, for tracking bugs and work items, and for building and testing services. Another offering from Microsoft in this category is GitHub, which has recently been enhanced with many features from Azure DevOps.
Another service offering is IaaS. Virtual machines are included in this service offering. You are responsible for managing the operating system and maintaining updates. When you create virtual machines, you can decide between different hardware offerings starting with shared cores up to 416 cores (at the time of this writing, but things change quickly). The M-Series of machines include 416 cores, 11.4TB RAM, and 8TB local SSD.
With preinstalled operating systems, you can decide between Windows, Windows Server, Linux, and operating systems that come preinstalled with SQL Server, BizTalk Server, SharePoint, Oracle, and many other products.
I use virtual machines often for environments that I need only for several hours a week because the virtual machines are paid on an hourly basis. If you want to try compiling and running .NET Core programs on Linux but don't have a Linux machine, installing such an environment on Microsoft Azure is an easy task.
For developers, the most relevant part of Microsoft Azure is platform as a service (PaaS). You can access services for storing and reading data, use computing and networking capabilities of app services, and integrate developer services within the application.
For storing data in the cloud, you can use the relational data store SQL Database. SQL Database is nearly the same as the on-premises version of SQL Server. There are also some NoSQL solutions, such as Cosmos DB, with different store options such as JSON data, relationships, or table storage, and Azure Storage that stores blobs (for example, for images or videos).
App Services can be used to host your web apps and API apps that you create with ASP.NET Core.
Along with the previously introduced Azure DevOps Services, another part of the Developer Services in Microsoft Azure is Application Insights. With faster release cycles, it's becoming more and more important to get information about how the user uses the app. What menus are never used because the users probably can't find them? What paths in the app does the user take to accomplish tasks? With Application Insights, you can get good anonymous user information to find out the issues users have with the application, and, with DevOps in place, you can do quick fixes.
You can also use Cognitive Services that offer functionality to process images, use Bing Search APIs, understand what users say with Language services, and more.
FaaS, also known by the category name Azure serverless, is a new concept for cloud services. Of course, behind the scenes there's always a server. You just don't pay for reserved CPU and memory, as you do with the App Services that are used by web apps. Instead, you pay based on consumption: the number of calls made, with some limitations on the memory and time needed for the activity. Azure Functions is one technology that can be deployed using FaaS.
For development, you need an SDK to build your applications and test them, and you need a code editor. Some other tools can help, such as a Linux environment on your Windows system and an environment to run Docker images. Let's get into some practical tools.
For development, you need the .NET SDK. If you're using Visual Studio for development, the .NET SDK is installed with Visual Studio. If you're using a different environment or you want to install different versions that are not part of the Visual Studio installation, you can get downloads for the SDK from https://dot.net. Here you can download and install distributions of the SDK for different platforms.
Part of the SDK is the .NET CLI—the command-line interface to develop .NET applications. You can use the .NET CLI to create new applications, compile applications, run unit tests, create NuGet packages, and create the files you need for publishing. Other than that, you can use any editor such as Notepad to write the code. Of course, if you have access to other tools that offer IntelliSense, using them makes it easier to run and debug your applications.
A tour of the .NET CLI is given later in this chapter in the section “Using the .NET CLI.”
Visual Studio Code is a lightweight editor available not only on Windows but also on Linux and macOS. The community created a huge number of extensions that make Visual Studio Code the preferred environment for many technologies.
With many chapters of this book, you can use Visual Studio Code as your development editor. What you currently can't do is create WinUI and Xamarin applications. You can use Visual Studio Code for .NET Core console applications and ASP.NET Core web applications.
You can download Visual Studio Code from http://code.visualstudio.com.
Visual Studio Community is a free edition with features that the Professional edition previously had, but there's a license restriction for when it can be used. It's free for open-source projects and training, and for academic and small professional teams. Unlike the Express editions of Visual Studio that previously were the free editions, this product allows using extensions with Visual Studio.
Visual Studio Professional includes more features than the Community edition, such as the CodeLens and Team Foundation Server for source code management and team collaboration. With this edition, you also get a subscription that includes several server products from Microsoft for development and testing, as well as a free amount that you can use with Microsoft Azure for development and testing.
Unlike the Professional edition, Visual Studio Enterprise contains a lot of tools for testing, such as Live Unit Testing, Microsoft Fakes (unit test isolation), and IntelliTest (unit testing is part of all Visual Studio editions). With Code Clone you can find similar code in your solution. Visual Studio Enterprise also contains architecture and modeling tools to analyze and validate the solution architecture.
Visual Studio for Mac originated in Xamarin Studio, but now it offers a lot more than the earlier product. The current version of Visual Studio for Mac uses the same source code for the editor that is available with the Windows version of Visual Studio. With Visual Studio for Mac, you can create not only Xamarin apps but also ASP.NET Core apps that run on Windows, Linux, and Mac. With many chapters of this book, you can use Visual Studio for Mac. Exceptions are the chapters that cover WinUI (Chapters 29 through 31), which require Windows to run and develop the app.
After so many years without changes to the Windows command prompt, now there's a completely new one: Windows Terminal. The source code is public at https://github.com/Microsoft/terminal, and the terminal offers many features that are useful for development. It offers multiple tabs and different shells, such as the Windows PowerShell, a command prompt, the Azure Cloud Shell, and WSL 2 environments. You can have the terminal full screen, open different tabs to keep different folders easily accessible, and also split panes to have different folders open in a single screen for easy comparison. New features are added on a monthly basis, and you can install the terminal from the Microsoft Store.
WSL 2 is the second generation of the Windows Subsystem for Linux. With this, the subsystem to run Linux is not only faster, but it also offers practically all Linux APIs.
Using WSL 2, you can install different Linux distributions from the Microsoft Store. If you use the Windows Terminal, different tabs can be opened for every Linux distribution installed.
WSL 2 gives you an easy way to build and run .NET applications on a Linux environment from your Windows system. You can even use Visual Studio to debug your .NET applications while they run in the Linux environment. You just need to install the extension .NET Core Debugging with WSL 2. When you run a debug session from Visual Studio, the .NET SDK gets automatically installed in your WSL 2 environment.
Docker Desktop for Windows (which you can install from https://hub.docker.com/editions/community/docker-ce-desktop-windows) allows running Docker containers for Linux or Windows. Using Docker allows creating images that include your application code based on images containing the .NET runtime. The .NET runtime itself is based on Linux or Windows images.
You can use Docker to create a solution using many .NET services running in multiple Docker containers. Docker containers are running instances of Docker images that you can build with support from Visual Studio or dotnet tools such as tye (https://github.com/dotnet/tye).
With many chapters in this book, you don't need Visual Studio. Instead, you can use any editor and a command line, such as the .NET CLI. Let's take a look at how to set up your system and how you can use this tool. This works the same on all platforms.
Nowadays, having a focus on the command line is also due to CI/CD. You can create a pipeline in which compiling, testing, and deployment happen automatically in the background.
If you install .NET CLI tools, you have what you need as an entry point to start all these tools. Use the command
> dotnet --help
to see all the different options of the dotnet tools available. Many of the options have a shorthand notation. For help, you can also type
> dotnet -h
The dotnet tools offer an easy way to create a “Hello World!” application. Just enter this command to create a console application:
> dotnet new console --output HelloWorld
This command creates a new HelloWorld directory and adds the source code file Program.cs and the project file HelloWorld.csproj. The command dotnet new also includes the functionality of dotnet restore where all needed NuGet packages are downloaded. To see a list of dependencies and versions of libraries used by the application, you can check the file project.assets.json in the obj subdirectory. Without using the option --output (or -o as shorthand), the files would be generated in the current directory.
The generated source code looks like the following code snippet:
using System;
namespace HelloWorld
{
  class Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("Hello World!");
    }
  }
}
Let's get into the syntax of this program. The Main method is the entry point for a .NET application. The CLR invokes a static Main method on startup. The Main method needs to be put into a class. Here, the class is named Program, but you could call it by any name.
Console.WriteLine invokes the WriteLine method of the Console class. The Console class can be found in the System namespace. To avoid writing System.Console.WriteLine to invoke this method, the System namespace is opened with the using directive on top of the source file.
After writing the source code, you need to compile the code to run it. How you can do this is explained soon in the section “Building the Application.”
The created project configuration file is named HelloWorld.csproj. This file contains the project configuration, such as the target framework, and the type of binary to create. An important piece of information in this file is the reference to the SDK (project file HelloWorld/HelloWorld.csproj):
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net5.0</TargetFramework>
  </PropertyGroup>
</Project>
C# 9 allows you to simplify the code for the “Hello World!” application. With top-level statements, the namespace, class, and Main method declarations can be removed to write only top-level statements. The application can look like the “Hello World!” application code shown here (code file HelloWorld/Program.cs):
using System;
Console.WriteLine("Hello World!");
If you prefix the invocation of the WriteLine method with the namespace, you can write the program in a single code line:
System.Console.WriteLine("Hello World!");
Instead of building a binary for just one framework version, you can replace the TargetFramework element with TargetFrameworks, and you can specify multiple frameworks as shown with .NET 5 and .NET Framework 4.8. The LangVersion element is added because the sample application uses C# 9 code (top-level statements). Without this element, the C# version is defined by the framework version. .NET 5 by default is using C# 9, and .NET Framework 4.8 is using C# 7.3 (project file HelloWorld/HelloWorld.csproj):
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFrameworks>net5.0;net48</TargetFrameworks>
    <LangVersion>9.0</LangVersion>
  </PropertyGroup>
</Project>
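When a project targets multiple frameworks, the same source file is compiled once per target. As a minimal sketch of how code can react to that, the following example uses the preprocessor symbols that the SDK defines for each target framework (NET5_0 for net5.0, NET48 for net48). The class name TargetInfo is hypothetical and not part of the book's sample.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sample class for illustration.
public static class TargetInfo
{
    // The SDK defines one preprocessor symbol per target framework,
    // so each compiled binary contains only its matching branch.
    public static string CompiledFor()
    {
#if NET5_0
        return ".NET 5";
#elif NET48
        return ".NET Framework 4.8";
#else
        return "another target framework";
#endif
    }

    public static void Main()
    {
        Console.WriteLine($"Compiled for: {CompiledFor()}");
        // FrameworkDescription reports the runtime that actually executes the code.
        Console.WriteLine($"Running on: {RuntimeInformation.FrameworkDescription}");
    }
}
```

Conditional compilation like this is typically reserved for the few places where an API is available in only one of the targets.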
The Sdk attribute specifies the SDK that is used by the project. Microsoft ships different SDKs: Microsoft.NET.Sdk for console applications, Microsoft.NET.Sdk.Web for ASP.NET Core web applications, and Microsoft.NET.Sdk.BlazorWebAssembly for web applications with Blazor and WebAssembly.
You don't need to add source files to the project. Files with the .cs extension in the same directory and subdirectories are automatically added for compilation. Resource files with the .resx extension are automatically added for embedding resources. You can change the default behavior and exclude/include files explicitly.
You also don't need to add the .NET Core package. When you specify the target framework net5.0, the metapackage Microsoft.NETCore.App that references many other packages is automatically included.
To build the application, you need to change the current directory to the directory of the application and start dotnet build. You can see output like the following, which is compiled for .NET 5.0 and .NET Framework 4.8:
> dotnet build
Microsoft (R) Build Engine version 16.8.0 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.
Determining projects to restore…
Restored C:\procsharp\Intro\HelloWorld\HelloWorld.csproj (in 308 ms).
HelloWorld -> C:\procsharp\Intro\HelloWorld\bin\Debug\net48\HelloWorld.exe
HelloWorld -> C:\procsharp\Intro\HelloWorld\bin\Debug\net5.0\HelloWorld.dll
Build succeeded.
0 Warning(s)
0 Error(s)
Time Elapsed 00:00:02.82
As a result of the compilation process, you find the assembly containing the IL code of the Program class within the bin/debug/[net5.0|net48] folders. If you compare the build of .NET Core with .NET 4.8, you will find a DLL containing the IL code with .NET Core and an EXE containing the IL code with .NET 4.8. The assembly generated for .NET Core has a dependency on the System.Console assembly, whereas the .NET 4.8 assembly uses the Console class from the mscorlib assembly.
To build release code, you need to specify the option --configuration Release (shorthand -c Release):
> dotnet build --configuration Release
To run the application, you can use the following dotnet command:
> dotnet run
If the project file targets multiple frameworks, you need to tell the dotnet run command which framework to use to run the app by adding the option --framework. This framework must be configured in the csproj file. With the sample application, you should get the following output of the application after the restore information:
> dotnet run --framework net5.0
Hello World!
On a production system, you don't use dotnet run to run the application; instead, you just use dotnet with the name of the library:
> dotnet bin/debug/net5.0/HelloWorld.dll
The compiler also creates an executable, which does nothing more than load and start the library. You can start the executable as well. How executables are built for publishing is shown in the next steps.
Similarly to creating a console application, you can also use the .NET CLI to create a web application. As you enter dotnet new, you can see a list of templates available.
The command
> dotnet new webapp -o WebApp
creates a new ASP.NET Core web application with Razor Pages.
The created project file now contains a reference to the Microsoft.NET.Sdk.Web SDK. This SDK contains tools and extensions for the project file that are needed to create web applications and services:
<Project Sdk="Microsoft.NET.Sdk.Web">
  <PropertyGroup>
    <TargetFramework>net5.0</TargetFramework>
  </PropertyGroup>
</Project>
Now using
> dotnet build
> dotnet run
starts the Kestrel server of ASP.NET Core to listen on ports 5000 and 5001. You can open a browser to access the pages returned from this server, as shown in Figure 1-2.
If you start this for the first time, you're given a security warning to trust the developer certificate. Once you trust the certificate, the warnings no longer occur.
To stop the application, just press Ctrl+C to send the cancel command.
With the dotnet tool, you also can create a NuGet package and publish the application for deployment. Let's first create a framework-dependent deployment of the application. This reduces the number of files you need for publishing.
Using the previously created console application, you just need the following command to create the files for publishing. The framework is selected by using -f, and the release configuration by using -c:
> dotnet publish -f net5.0 -c Release
This puts the files needed for publishing into the bin/Release/net5.0/publish directory.
When you use these files for publishing on the target system, you need the runtime as well. You can find the runtime downloads and installation instructions at https://www.microsoft.com/net/download/.
Instead of needing to have the runtime installed on the target system, the application can deliver the runtime with it. This is known as self-contained deployment.
Depending on the platform where the application should be installed, the runtime differs. Thus, with self-contained deployment, you need to specify the platforms supported by specifying RuntimeIdentifiers in the project file as shown in the following project file. Here, the runtime identifiers for Windows 10, macOS, and Ubuntu Linux are specified (project file SelfContainedHelloWorld/SelfContainedHelloWorld.csproj):
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net5.0</TargetFramework>
  </PropertyGroup>
  <PropertyGroup>
    <RuntimeIdentifiers>
      win10-x64;ubuntu-x64;osx.10.11-x64;
    </RuntimeIdentifiers>
  </PropertyGroup>
</Project>
Now you can create publish files for all the different platforms:
> dotnet publish -c Release -r win10-x64
> dotnet publish -c Release -r osx.10.11-x64
> dotnet publish -c Release -r ubuntu-x64
After running these commands, you can find the files needed for publishing in the Release/[win10-x64|osx.10.11-x64|ubuntu-x64]/publish directories. As the .NET 5.0 runtime is now included, the size of the files needed for publishing has grown. In these directories, you can also find platform-specific executables that can be started directly without using the .NET CLI command.
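To check which platform a published application actually runs on, the runtime offers some information via the RuntimeInformation class. The following lines are a small sketch that assumes .NET 5 or later (the RuntimeIdentifier property is not available with earlier versions). The class name PlatformInfo is made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sample class for illustration.
public static class PlatformInfo
{
    // Collects the runtime identifier, the operating system description,
    // and the processor architecture of the running process.
    public static string Describe() =>
        $"RID: {RuntimeInformation.RuntimeIdentifier}, " +
        $"OS: {RuntimeInformation.OSDescription}, " +
        $"architecture: {RuntimeInformation.ProcessArchitecture}";

    public static void Main() => Console.WriteLine(Describe());
}
```

The values depend on the machine and on the runtime the application was published for, so the output differs between the published variants.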
Instead of publishing a large list of files, you can create a single executable. Adding the option -p:PublishSingleFile=true adds the complete runtime to one binary, which then can be used for deployment. With the following command, a single file is created in the output directory singlefile. This directory also contains a file with the pdb extension. This file can be deployed to get symbol information for analysis in case the application crashes.
> dotnet publish -r win10-x64 -p:PublishSingleFile=true --self-contained -o singlefile
To speed up the startup performance of the application, some parts of the application can be precompiled to native code. This way, the just-in-time (JIT) compiler has less work to do when running the application. This option can be used with or without PublishSingleFile.
> dotnet publish -r win10-x64 -p:PublishReadyToRun=true --self-contained -o readytorun
Instead of passing this configuration with the command line, the <PublishReadyToRun> element can also be specified in the project file.
Of course, a single executable for publishing that includes the complete runtime is large. However, there's a way around that. You can trim all the classes and methods that are not needed for the application to make the binary smaller.
You can specify trimming with the PublishTrimmed element in the project file. The TrimMode specifies how aggressively trimming should be performed. The value link (used in this example) is used to trim based on members and to remove members that are not used. When you set the value to copyused, complete assemblies are kept if any of their members are used by the application:
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net5.0</TargetFramework>
    <RuntimeIdentifiers>
      win10-x64;ubuntu-x64;osx.10.11-x64;
    </RuntimeIdentifiers>
    <PublishTrimmed>true</PublishTrimmed>
    <TrimMode>link</TrimMode>
  </PropertyGroup>
</Project>
You use the following command and the previous project configuration to create a single file executable that is trimmed. At the time of this writing, the size of the binary for “Hello, World!” is reduced from 54MB to 2.8MB. That's quite impressive. As the feature is improved continuously, more savings can be expected in the future.
> dotnet publish -o publishtrimmed -p:PublishSingleFile=true --self-contained -r win10-x64
There is a risk with trimming. For example, if the application makes use of reflection, the trimmer is not aware that the reflected members are needed during runtime. To deal with such issues, you can specify what assemblies, types, and type members should not be trimmed. To configure such options, read the detailed documentation at https://docs.microsoft.com/dotnet/core/deploying/trimming-options.
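To illustrate the reflection issue, the following sketch invokes a method whose name is known only as a string at run time. A trimmer that analyzes the code statically cannot see that this method is needed. One possible way to give the trimmer a hint is the DynamicDependency attribute from the System.Diagnostics.CodeAnalysis namespace, which is available since .NET 5. The attribute is not mentioned in the text above and is shown here only as one option; the types Greeter and TrimmingSketch are hypothetical.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;
using System.Reflection;

// Hypothetical type whose method is only called via reflection.
public class Greeter
{
    public string Greet() => "Hello from reflection";
}

public static class TrimmingSketch
{
    // Without a hint, the trimmer could remove Greeter.Greet because no code
    // calls it directly. DynamicDependency asks the trimmer to keep the
    // public methods of Greeter whenever InvokeByName is kept.
    [DynamicDependency(DynamicallyAccessedMemberTypes.PublicMethods, typeof(Greeter))]
    public static string InvokeByName(string methodName)
    {
        object target = new Greeter();
        MethodInfo method = target.GetType().GetMethod(methodName);
        if (method == null)
        {
            return "method not found (possibly trimmed)";
        }
        return (string)method.Invoke(target, null);
    }

    public static void Main() => Console.WriteLine(InvokeByName("Greet"));
}
```

The linked documentation describes further options, such as configuring in the project file which assemblies should not be trimmed.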
This chapter covered a lot of ground to review important technologies and changes with .NET. With new applications you should use .NET Core (now renamed to just .NET) for future development. With existing applications, it depends on the state of the application if you prefer to stay with older technologies or migrate to new ones. For moving to .NET, you now know about frameworks that you can use to replace older frameworks.
You read about tools you can use for development and dived into the .NET CLI to create, build, and publish applications.
You looked at technologies for accessing the database and creating Windows apps, and you read about different ways to create web applications.
Whereas this chapter laid the foundation with a “Hello World!” example, Chapter 2 dives fast into the syntax of C#. It covers variables, how to implement program flows, how to organize your code into namespaces, and more.
Now that you understand more about what C# can do, you need to know how to use it. This chapter gives you a good start in that direction by providing a basic understanding of the fundamentals of C# programming, which subsequent chapters build on. By the end of this chapter, you will know enough C# to write simple programs (though without using inheritance or other object-oriented features, which are covered in later chapters).
The previous chapter explained how to create a “Hello, World!” application using the .NET CLI tools. This chapter focuses on C# syntax. First, here's some general information on the syntax:
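- Statements end with a semicolon (;) and can continue over multiple lines without needing a continuation character.
- Statements can be joined into blocks using curly braces ({}).
- Single-line comments begin with two forward slashes (//).
- Multiline comments begin with a slash and an asterisk (/*) and end with the same combination reversed (*/).
- C# is case-sensitive: myVar and MyVar are two different variables.
A new feature of C# 9 is top-level statements. You can create simple applications without defining a namespace, declaring a class, and defining a Main method. A one-line “Hello, World!” application can look like this: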
System.Console.WriteLine("Hello World!");
Let's enhance this one-line application to open the namespace where the Console class is defined first. With the using directive to import the System namespace, you can use class Console without prefixing it with the namespace:
using System;
Console.WriteLine("Hello World!");
Because WriteLine is a static method of the Console class, it's even possible to open the Console class with the using static directive:
using static System.Console;
WriteLine("Hello World!");
Behind the scenes, with top-level statements, the compiler creates a class with a Main method and adds the top-level statements to the Main method:
using System;
class Program
{
static void Main()
{
Console.WriteLine("Hello, World!");
}
}
C# offers different ways to declare and initialize variables. A variable has a type and a value that can change over time. In the next code snippet, the variable s1 is of type string as defined with the type declaration at the left of the variable name, and it is initialized to a new string object where the string literal "Hello, World!" is passed to the constructor. Because the string type is commonly used, instead of creating a new string object, the string "Hello, World!" can be directly assigned to the variable (shown with the variable s2).
C# 3 introduced the var keyword with type inference, which can be used to declare a variable as well. Here, the type is required on the right side, and the left side infers the type from it. As the compiler creates a string object from the string literal "Hello, World!", s3 is a type-safe, strongly typed string in the same way as s1 and s2.
C# 9 provides another new syntax to declare and initialize a variable with the target-typed new expression. Instead of writing the expression new string("Hello, World!"), if the type is known at the left side, using just the expression new("Hello, World!") is sufficient; you don't have to specify the type on the right side (code file TopLevelStatements/Program.cs):
using System;
string s1 = new string("Hello, World!");
string s2 = "Hello, World!";
var s3 = "Hello, World!";
string s4 = new("Hello, World!");
Console.WriteLine(s1);
Console.WriteLine(s2);
Console.WriteLine(s3);
Console.WriteLine(s4);
//…
When you're passing values to the application when starting the program, the variable args is automatically declared with top-level statements. In the following code snippet, with the foreach statement, the variable args is accessed to iterate through all the command-line arguments and display the values on the console (code file CommandLineArgs/Program.cs):
using System;
foreach (var arg in args)
{
Console.WriteLine(arg);
}
Using the .NET CLI to run the application, you can use dotnet run followed by -- and then pass the arguments to the program. The -- needs to be added so as not to confuse the arguments of the .NET CLI with the arguments of the application:
> dotnet run -- one two three
When you run this, you see the strings one, two, and three on the console, each on its own line.
When you create a custom Main method, the method needs to be declared to receive a string array. You can choose a name for the variable, but the variable named args is commonly used, which is the reason this name was selected for the automatically generated variable with top-level statements:
using System;
class Program
{
static void Main(string[] args)
{
foreach (var arg in args)
{
Console.WriteLine(arg);
}
}
}
The scope of a variable is the region of code from which the variable can be accessed. In general, the scope is determined by the following rules:
- A local variable that is declared in a for, while, or similar statement is in scope in the body of that loop.
It's common in a large program to use the same variable name for different variables in different parts of the program. This is fine as long as the variables are scoped to completely different parts of the program so that there is no possibility for ambiguity. However, bear in mind that local variables with the same name can't be declared twice in the same scope. For example, you can't do this:
int x = 20;
// some more code
int x = 30;
Consider the following code sample (code file VariableScopeSample/Program.cs):
using System;
for (int i = 0; i < 10; i++)
{
    Console.WriteLine(i);
} // i goes out of scope here
// We can declare a variable named i again, because
// there's no other variable with that name in scope
for (int i = 9; i >= 0; i--)
{
    Console.WriteLine(i);
} // i goes out of scope here.
This code simply prints out the numbers from 0 to 9, and then from 9 to 0, using two for loops. The important thing to note is that you declare the variable i twice in this code, within the same method. You can do this because i is declared in two separate loops, so each i variable is local to its own loop.
Here's another example (code file VariableScopeSample2/Program.cs):
int j = 20;
for (int i = 0; i < 10; i++)
{
int j = 30; // Can't do this — j is still in scope
Console.WriteLine(j + i);
}
If you try to compile this, you get an error like the following:
error CS0136: A local or parameter named 'j' cannot be declared in this scope because that name is used in an enclosing local scope to define a local or parameter
This occurs because the variable j, which is defined before the start of the for loop, is still in scope within the for loop and won't go out of scope until the Main method (which is created by the compiler) has finished executing. The compiler has no way to distinguish between these two variables, so it won't allow the second one to be declared.
It doesn't even help to move the declaration of the outer variable j to after the end of the for loop. The scope of a local variable covers the entire block in which it is declared, no matter where in the block the declaration appears.
For values that never change, you can define a constant. A constant is specified with the const keyword before the type:
const int a = 100; // This value cannot be changed.
The compiler replaces every occurrence of the constant with the value specified in its declaration. This behavior is important in terms of versioning. If you declare a constant in a library and use the constant from an application, the application needs to be recompiled to get the new value; otherwise, the library could have a different value from the application. Because of this, it's best to use const only with values that never change, even in future versions.
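As a small illustration of the compile-time nature of constants, the following sketch initializes one constant from another. The names SecondsPerMinute and SecondsPerHour are invented for this example and do not come from the book's sample code.

```csharp
using System;

// A constant must be initialized when it is declared,
// and the value must be computable at compile time.
const int SecondsPerMinute = 60;

// Other constants may be used in the initializer,
// because the compiler can evaluate the expression itself.
const int SecondsPerHour = SecondsPerMinute * 60;

// Initializing a constant from a variable does not compile:
// int minutes = 5;
// const int Wrong = minutes * 60;

Console.WriteLine(SecondsPerHour);
```

Because the value is substituted wherever the constant is used, a name like SecondsPerHour is a reasonable candidate for const, whereas a value that might change in a later library version is not.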
Constants have the following characteristics:
- Constants are implicitly static, so you do not include the static modifier in the constant declaration.
The following are the advantages of using constants in your programs:
You can also add methods and types to the same file with top-level statements. In the following code snippet, the method named Method is defined and invoked after the method declaration and implementation (code file TopLevelStatements/Program.cs):
//…
void Method()
{
Console.WriteLine("this is a method");
}
Method();
//…
The method can be declared before or after it is used. Types can be added to the same file, but these need to be specified following the top-level statements. With the following code snippet, the class Book is specified to contain a Title property and the ToString method. Before the declaration of the type, a new instance is created and assigned to the variable b1, the value of the Title property is set, and the instance is written to the console. When the object is passed as an argument to the WriteLine method, in turn the ToString method of the Book class is invoked:
Book b1 = new();
b1.Title = "Professional C#";
Console.WriteLine(b1);
class Book
{
public string Title { get; set; }
public override string ToString() => Title;
}
With the first version of C#, a value type couldn't have a null value, but it was always possible to assign null to a reference type. The first change happened with C# 2 and the introduction of the nullable value type. C# 8 brought a change with reference types because most exceptions occurring with .NET are of type NullReferenceException. These exceptions occur when a member is invoked on a reference that has null assigned. To reduce these issues and get compiler warnings instead, nullable reference types were introduced with C# 8.
This section covers both nullable value types and nullable reference types. The syntax looks similar, but it's very different behind the scenes.
With a value type such as int, you cannot assign null to it. This can lead to difficulties when mapping to databases or other data sources, such as XML or JSON. Using a reference type instead results in additional overhead: an object is stored in the heap, and the garbage collector needs to clean it up when it's not used anymore. Instead, the ? can be used with the type definition, which allows assigning null:
int? x1 = null;
The compiler changes this to use the Nullable<T> type:
Nullable<int> x1 = null;
Nullable<T> doesn't add the overhead of a reference type. This is still a struct (a value type) but adds a Boolean flag to specify if the value is null.
The following code snippet demonstrates using nullable value types and assigning non-nullable values. The variable n1 is a nullable int that has been assigned the value null. A nullable value type defines the property HasValue, which can be used to check whether the variable has a value assigned. With the Value property, you can access its value. This can be used to assign the value to a non-nullable value type. A non-nullable value can always be assigned to a nullable value type; this always succeeds (code file NullableValueTypes/Program.cs):
int? n1 = null;
if (n1.HasValue)
{
int n2 = n1.Value;
}
int n3 = 42;
int? n4 = n3;
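Besides checking HasValue, a fallback value is often all that is needed when converting a nullable value type to its non-nullable counterpart. The following is a minimal sketch using the ?? operator and the GetValueOrDefault method of Nullable<T>; the variable names are only illustrative.

```csharp
using System;

int? missing = null;
int? present = 42;

// The ?? operator supplies a fallback when the nullable has no value.
int a = missing ?? -1;
int b = present ?? -1;

// GetValueOrDefault returns default(int), which is 0, when no value is set.
int c = missing.GetValueOrDefault();

Console.WriteLine($"{a} {b} {c}"); // -1 42 0
```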
Nullable reference types have the goal of reducing exceptions of type NullReferenceException, which is the most common exception that occurs with .NET applications. There always has been a guideline that an application should not throw such exceptions and should always check for null, but without the help of the compiler, such issues can be missed too easily.
To get help from the compiler, you need to turn on nullable reference types. Because this feature has breaking changes with existing code, you need to turn it on explicitly. You specify the Nullable element and set the enable value in the project file (project file NullableReferenceTypes.csproj):
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net5.0</TargetFramework>
<Nullable>enable</Nullable>
</PropertyGroup>
</Project>
Now, null cannot be assigned to reference types. When you write this code with nullable enabled,
string s1 = null; // compiler warning
you get the compiler warning “CS8600: Converting a null literal or a possible null value to non-nullable type.”
To assign null to the string, the type needs to be declared with a question mark—like nullable value types:
string? s1 = null;
When you're using the nullable s1 variable, you need to verify that it is not null before invoking methods or assigning it to non-nullable strings; otherwise, compiler warnings are generated:
string s2 = s1.ToUpper(); // compiler warning
Instead, you can check for null before invoking the method with the null-conditional operator ?., which invokes the method only if the object is not null. The result cannot be written to a non-nullable string. The result of the right expression can be null if s1 is null:
string? s2 = s1?.ToUpper();
You can use the coalescing operator ?? to define a different return value in the case of null. With the following code snippet, an empty string is returned in case the expression to the left of ?? returns null. The complete result of the right expression is now written to the variable s3, which can never be null. It's either the uppercase version of the s1 string if s1 is not null, or an empty string if s1 is null:
string s3 = s1?.ToUpper() ?? string.Empty;
Instead of using these operators, you can also use the if statement to verify whether a variable is not null. With the if statement in the following code snippet, the C# is not pattern is used to verify that s1 is not null. The block covered by the if statement is invoked only when s1 is not null. Here it is not necessary to use the null-conditional operator to invoke the method ToUpper:
if (s1 is not null)
{
string s4 = s1.ToUpper();
}
Of course, it's also possible to use the not equals operator !=:
if (s1 != null)
{
string s5 = s1.ToUpper();
}
Using nullable reference types is also important with members of types, as shown in the Book class with the Title and Publisher properties in the following code snippet. The Title is declared with a non-nullable string type; thus, it needs to be initialized when creating a new object of the Book class. It's initialized with the constructor of the Book class. The Publisher property is allowed to be null, so it doesn't need initialization (code file NullableReferenceTypes/Program.cs):
class Book
{
public Book(string title) => Title = title;
public string Title { get; set; }
public string? Publisher { get; set; }
}
When you're declaring a variable of the Book class, the variable can be declared as nullable (b1), or it needs a Book object with the declaration using the constructor (b2). The Title property can be assigned to a non-nullable string type. With the Publisher property, you can assign it to a nullable string or use the operators as shown earlier:
Book? b1 = null;
Book b2 = new Book("Professional C#");
string title = b2.Title;
string? publisher = b2.Publisher;
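To show how the null-conditional and coalescing operators might be combined when consuming a nullable member such as Publisher, here is a small sketch. The helper DescribePublisher and the sample publisher name are hypothetical and are not part of the book's code files.

```csharp
#nullable enable
using System;

// Hypothetical helper: formats an optional publisher name.
// ?. calls ToUpper only when publisher is not null;
// ?? supplies the fallback text otherwise.
string DescribePublisher(string? publisher) =>
    publisher?.ToUpper() ?? "(no publisher)";

Console.WriteLine(DescribePublisher("Sample Press"));
Console.WriteLine(DescribePublisher(null));
```

Because the fallback guarantees a value, the return type can be the non-nullable string, and callers do not need their own null checks.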
With nullable value types, the type Nullable<T> is used behind the scenes. This is not the case with nullable reference types. Instead, the compiler adds annotations to the types. Nullable reference types have Nullable attributes associated. With this, nullable reference types can be used with libraries to annotate parameters and members with nullability. When the library is used with new applications, IntelliSense can give information regarding whether a method or property can be null, and the compiler acts accordingly with compiler warnings. Using an older version of the compiler (earlier than C# 8), the library can still be used in the same way nonannotated libraries are used. The compiler just ignores the attributes it doesn't know.
Now that you have seen how to declare variables and constants and know about an extremely important enhancement with nullability, let's take a closer look at the data types available in C#.
The C# keywords for data types, such as int, short, and string, are mapped by the compiler to .NET data types. For example, when you declare an int in C#, you are actually declaring an instance of a .NET struct: System.Int32. All the primitive data types offer methods that can be invoked. For example, to convert int i to a string, you can write the following:
string s = i.ToString();
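The mapping between the keyword and the .NET type can be checked directly. The following short sketch (variable names are illustrative) compares the two type objects and calls a few members that the int keyword exposes through System.Int32.

```csharp
using System;

int i = 42;
Int32 j = i; // int and System.Int32 are the same type, so no conversion happens

bool sameType = typeof(int) == typeof(Int32);
string typeName = i.GetType().FullName ?? string.Empty;
int parsed = int.Parse("123") + 1; // static members are available via the keyword

Console.WriteLine(sameType);     // True
Console.WriteLine(typeName);     // System.Int32
Console.WriteLine(j.ToString()); // 42
Console.WriteLine(parsed);       // 124
```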
I should emphasize that behind this syntactical convenience, the types really are stored as primitive types, so absolutely no performance cost is associated with the idea that the primitive types are represented by .NET structs.
The following sections review the types that are recognized as built-in types in C#. Each type is listed along with its definition and the name of the corresponding .NET type. I also show you a few exceptions—some important data types that are available only with their .NET type and don't have a specific C# keyword.
Let's start with predefined value types that represent primitives, such as integers, floating-point numbers, characters, and Booleans.
C# supports integer types with various numbers of bits and distinguishes between types that support only non-negative values and types with a range of negative and positive values. Eight bits are used by the byte and sbyte types. The byte type allows values from 0 to 255 (no negative values), whereas the s in sbyte means to use a sign; that type supports values from –128 to 127, which is what's possible with 8 bits.
The short and ushort types make use of 16 bits. The short type covers the range from –32,768 to 32,767. With the ushort type, the u is for unsigned, and it covers 0 to 65,535. Similarly, the int type is a signed 32-bit integer, and the uint type is an unsigned 32-bit integer. long and ulong have 64 bits available. Behind the scenes, the C# keywords sbyte, short, int, and long map to System.SByte, System.Int16, System.Int32, and System.Int64. The unsigned versions map to System.Byte, System.UInt16, System.UInt32, and System.UInt64. The underlying .NET types clearly list the number of bits used in the name of the type.
To check for the maximum and minimum values of a type, you can use the MaxValue and MinValue properties.
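The ranges described above can be printed with MinValue and MaxValue, as in this brief sketch:

```csharp
using System;

// Each integer type exposes its limits through MinValue and MaxValue.
Console.WriteLine($"sbyte:  {sbyte.MinValue} to {sbyte.MaxValue}");
Console.WriteLine($"byte:   {byte.MinValue} to {byte.MaxValue}");
Console.WriteLine($"short:  {short.MinValue} to {short.MaxValue}");
Console.WriteLine($"ushort: {ushort.MinValue} to {ushort.MaxValue}");
Console.WriteLine($"int:    {int.MinValue} to {int.MaxValue}");
Console.WriteLine($"uint:   {uint.MinValue} to {uint.MaxValue}");
Console.WriteLine($"long:   {long.MinValue} to {long.MaxValue}");
Console.WriteLine($"ulong:  {ulong.MinValue} to {ulong.MaxValue}");
```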
In case you need a number representation that has a bigger value than the 64 bits available in the long type, you can use the BigInteger type. This struct doesn't have a limit on the number of bits and can grow until there's not enough memory available. There's not a specific C# keyword for this type, and you need to use BigInteger. Because this type can grow endlessly, MinValue and MaxValue properties are not available. This type offers built-in methods for calculation such as Add, Subtract, Divide, Multiply, Log, Log10, Pow, and others.
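As a quick sketch of BigInteger going past the 64-bit limit, the following example uses the type from the System.Numerics namespace. The variable names are illustrative only.

```csharp
using System;
using System.Numerics;

// One more than the largest ulong no longer fits into 64 bits.
BigInteger beyondULong = new BigInteger(ulong.MaxValue) + 1;

// The same value computed as 2 to the power of 64.
BigInteger twoTo64 = BigInteger.Pow(2, 64);

// Static calculation methods such as Multiply are available as well.
BigInteger twoTo128 = BigInteger.Multiply(twoTo64, twoTo64);

Console.WriteLine(beyondULong == twoTo64); // True
Console.WriteLine(twoTo64);                // 18446744073709551616
Console.WriteLine(twoTo128);               // 340282366920938463463374607431768211456
```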
With int, short, and long, the number of bits and the available range are independent of whether the application is a 32- or 64-bit application. This is different from the integer definitions in C++. C# 9 has new keywords for platform-specific values: nint and nuint (native integer and native unsigned integer, respectively). In a 64-bit application, these integer types make use of 64 bits, whereas in a 32-bit application just 32 bits are used. These types are important with direct memory access, which is covered in Chapter 13, “Managed and Unmanaged Memory.”
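A minimal sketch of the native integer keywords follows. It relies on IntPtr.Size and Environment.Is64BitProcess to report the size at runtime; the variable names are made up for the example.

```csharp
using System;

nint offset = 42;
nuint size = 1024;

// The size of a native integer depends on the process: 4 bytes or 8 bytes.
int nativeBits = IntPtr.Size * 8;

// Mixing nint and nuint requires an explicit cast.
nint sum = offset + (nint)size;

Console.WriteLine($"64-bit process: {Environment.Is64BitProcess}");
Console.WriteLine($"Native integer size: {nativeBits} bits");
Console.WriteLine(sum);
```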
For better readability of numbers, you can use digit separators. You can add underscores to numbers, as shown in the following code snippet. This code snippet also uses the 0x prefix to specify a hexadecimal value (code file DataTypes/Program.cs):
long l1 = 0x_123_4567_89ab_cedf;
The underscores used as separators are just ignored by the compiler. These separators help with readability and don't add any functionality. With the preceding sample, reading from the right, every 16 bits (or 4 hexadecimal characters) a digit separator is added. This is a lot more readable compared to this:
long l2 = 0x123456789abcedf;
Of course, because the compiler ignores the underscores, you are responsible for readability yourself. You can put the underscores at any position, which may not really help with readability:
long l3 = 0x_12345_6789_abc_ed_f;
It's useful that any position can be used, which allows for different use cases such as to work with hexadecimal or octal values or to separate different bits needed for a protocol, as shown in the next section.
Besides offering digit separators, C# also makes it easy to assign binary values to integer types. With the 0b prefix, only the digits 0 and 1 are allowed, such as in the following (code file DataTypes/Program.cs):
uint binary1 = 0b_1111_1110_1101_1100_1011_1010_1001_1000;
The preceding code snippet uses an unsigned int with 32 bits available. Digit separators help with readability for using binary values. This snippet makes a separation every 4 bits. Remember, you can write this in the hex notation as well:
uint hex1 = 0xfedcba98;
Using the separator every 3 bits helps in working with the octal notation, where characters are used between 0 (000 binary) and 7 (111 binary).
uint binary2 = 0b_111_110_101_100_011_010_001_000;
If you need to define a binary protocol—for example, where 2 bits define the rightmost part followed by 6 bits in the next section, and two times 4 bits to complete 16 bits—you can put separators per this protocol:
ushort binary3 = 0b1111_0000_101010_11;
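To make the protocol layout concrete, the following sketch reads the four fields back out of the same 16-bit value with shifts and masks. The field names (flags, payload, lowNibble, highNibble) are invented for this illustration; the book does not name them.

```csharp
using System;

ushort packet = 0b1111_0000_101010_11;

// Rightmost 2 bits.
int flags = packet & 0b11;
// Next 6 bits: shift the 2 flag bits away, then mask 6 bits.
int payload = (packet >> 2) & 0b11_1111;
// Two groups of 4 bits complete the 16 bits.
int lowNibble = (packet >> 8) & 0b1111;
int highNibble = (packet >> 12) & 0b1111;

Console.WriteLine($"{flags} {payload} {lowNibble} {highNibble}"); // 3 42 0 15
```

Writing the masks as binary literals keeps them visually aligned with the separators used in the value itself.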
C# also specifies floating-point types with different numbers of bits based on the IEEE 754 standard. The Half type (new as of .NET 5) uses 16 bits, float (Single with .NET) uses 32 bits, and double (Double) uses 64 bits. With all of these data types, 1 bit is used for the sign. Depending on the type, 10 through 52 bits are used for the significand, and 5 through 11 bits for the exponent. The following table shows the details:
C# KEYWORD | .NET TYPE | DESCRIPTION | SIGNIFICAND BITS | EXPONENT BITS |
---|---|---|---|---|
(none) | System.Half | 16-bit, half-precision floating point | 10 | 5 |
float | System.Single | 32-bit, single-precision floating point | 23 | 8 |
double | System.Double | 64-bit, double-precision floating point | 52 | 11 |
When you assign a value, if you hard-code a noninteger number (such as 12.3), the compiler assumes that's a double. To specify that the value is a float, append the character F (or f):
float f = 12.3F;
With the decimal type (.NET struct Decimal), .NET has a high-precision floating-point type that uses 128 bits and can be used for financial calculations. With the 128 bits, 1 bit is used for the sign and 96 bits for the integer number. The remaining bits specify a scaling factor. To specify that your number is a decimal type rather than a double, a float, or an integer, you can append the M (or m) character to the value:
decimal d = 12.30M;
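One reason decimal is preferred for financial calculations is that decimal fractions such as 0.1 are represented exactly, which is not the case with the binary double type. The following sketch compares the two; the variable names are illustrative.

```csharp
using System;

double doubleSum = 0.1 + 0.2;
decimal decimalSum = 0.1M + 0.2M;

// 0.1 and 0.2 have no exact binary representation, so the double sum
// is not exactly 0.3. The decimal sum is.
bool doubleIsExact = doubleSum == 0.3;
bool decimalIsExact = decimalSum == 0.3M;

Console.WriteLine(doubleIsExact);  // False
Console.WriteLine(decimalIsExact); // True
```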
You use the C# bool type to contain Boolean values of either true or false.
You cannot implicitly convert bool values to and from integer values. If a variable (or a function return type) is declared as a bool, you can only use values of true and false. You get an error if you try to use zero for false and a nonzero value for true.
The .NET string consists of two-byte characters. The C# keyword char maps to the .NET type Char. Using single quotation marks, for example, 'A', creates a char. With double quotation marks, a string is created.
As well as representing chars as character literals, you can represent them with four-digit hex Unicode values (for example, '\u0041'), as integer values with a cast (for example, (char)65), or as hexadecimal values (for example, '\x0041'). You can also represent them with an escape sequence, as shown in the following table:
ESCAPE SEQUENCE | CHARACTER |
---|---|
\' | Single quotation mark |
\" | Double quotation mark |
\\ | Backslash |
\0 | Null |
\a | Alert |
\b | Backspace |
\f | Form feed |
\n | Newline |
\r | Carriage return |
\t | Tab character |
\v | Vertical tab |
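The different ways of writing a character described above all produce the same char value, as this small sketch shows. The sample text in the last statement is only for illustration.

```csharp
using System;

char literal = 'A';
char unicode = '\u0041'; // four-digit hex Unicode value
char fromInt = (char)65; // integer value with a cast
char hex = '\x0041';     // hexadecimal value

bool allEqual = literal == unicode && unicode == fromInt && fromInt == hex;
Console.WriteLine(allEqual); // True

// Escape sequences inside a string: tab, newline, and double quotation marks.
string text = "Title\t\"Professional C#\"\nDone";
Console.WriteLine(text);
```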
In the preceding sections, literals have been shown for numeric values. Let's summarize them here in the following table:
LITERAL | POSITION | DESCRIPTION |
---|---|---|
U | Postfix | unsigned int |
L | Postfix | long |
UL | Postfix | unsigned long |
F | Postfix | float |
M | Postfix | decimal (money) |
0x | Prefix | Hexadecimal number; values from 0 to F are allowed |
0b | Prefix | Binary number; only 0 and 1 are allowed |
true | NA | Boolean value |
false | NA | Boolean value |
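The effect of the postfixes and prefixes in the table can be observed by letting var infer the type and then asking for it at runtime. This is a minimal sketch with illustrative variable names.

```csharp
using System;

var u = 42U;        // unsigned int
var l = 42L;        // long
var ul = 42UL;      // unsigned long
var f = 4.2F;       // float
var m = 4.2M;       // decimal
var d = 4.2;        // no postfix: double
var hex = 0xFF;     // hexadecimal, int
var binary = 0b1010; // binary, int

Console.WriteLine(u.GetType().Name);  // UInt32
Console.WriteLine(l.GetType().Name);  // Int64
Console.WriteLine(ul.GetType().Name); // UInt64
Console.WriteLine(f.GetType().Name);  // Single
Console.WriteLine(m.GetType().Name);  // Decimal
Console.WriteLine(d.GetType().Name);  // Double
Console.WriteLine(hex);               // 255
Console.WriteLine(binary);            // 10
```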
Besides value types, with C# keywords, two reference types are defined: the object keyword that maps to the Object class and the string keyword that maps to the String class. The string type is discussed later in this chapter in the section “Working with Strings.” The Object class is the ultimate base class of all reference types and can be used for two purposes:
- You can use an object reference to bind to an object of any particular subtype. For example, in Chapter 5, you'll see how you can use the object type to box a value object on the stack to move it to the heap; object references are also useful in reflection, when code must manipulate objects whose specific types are unknown.
- The object type implements a number of basic, general-purpose methods, which include Equals, GetHashCode, GetType, and ToString. User-defined classes might need to provide replacement implementations of some of these methods using an object-oriented technique known as overriding, which is discussed in Chapter 4. When you override ToString, for example, you equip your class with a method for intelligently providing a string representation of itself. If you don't provide your own implementations for these methods in your classes, the implementations of the object type are used; the default ToString implementation returns the name of the class.
This section looks at the real nuts and bolts of the language: the statements that allow you to control the flow of your program rather than execute every line of code in the order it appears in the program. With conditional statements like the if and switch statements, you can branch your code depending on whether certain conditions are met. You can repeat statements in loops with for, while, and foreach statements.
With the if statement, you can specify an expression within parentheses. If the expression returns true, the block that's specified with curly braces is invoked. In case the condition is not true, you can check for another condition to be true using else if. The else if can be repeated to check for more conditions. If neither the if expression nor any of the else if expressions evaluates to true, the block specified with else is invoked.
With the following code snippet, a string is read from the console. If an empty string is entered, the code block following the if statement is invoked. The string method IsNullOrEmpty returns true if the string is either null or empty. The block specified with the else if statement is invoked when the length of the input is smaller than five characters. In all other cases (for example, with an input length of five or more characters) the else block is invoked (code file ProgramFlow/Program.cs):
Console.WriteLine("Type in a string");
string? input = Console.ReadLine();
if (string.IsNullOrEmpty(input))
{
Console.WriteLine("You typed in an empty string.");
}
else if (input?.Length < 5)
{
Console.WriteLine("The string had less than 5 characters.");
}
else
{
Console.WriteLine("Read any other string");
}
Console.WriteLine("The string was " + input);
With the if statement, else if and else are optional. If you just need to invoke a code block based on a condition and don't need to invoke a code block if this condition is not met, you can use the if without else.
One of the C# features is pattern matching, which you can use with the if statement and the is operator. The earlier section “Nullable Reference Types” included an example that used an if statement and the pattern is not null.
The following code snippet compares the argument received that is of type object with null, using a const pattern to compare the argument with null and throw the ArgumentNullException. With the expression used in else if, the type pattern is used to check whether the variable o is of type Book. If this is the case, the variable o is assigned to the variable b. Because variable b is of type Book, with b the Title property that is specified by the Book type can be accessed (code file ProgramFlow/Program.cs):
void PatternMatching(object o)
{
if (o is null) throw new ArgumentNullException(nameof(o));
else if (o is Book b)
{
Console.WriteLine($"received a book: {b.Title}");
}
}
A few more samples for const and type patterns are shown in the following code snippet:
if (o is 42) // const pattern
if (o is "42") // const pattern
if (o is int i) // type pattern
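The three patterns above can be put together in one method to see which one matches a given argument. The method name Describe and the returned texts are hypothetical and serve only to illustrate the matching order.

```csharp
using System;

// Hypothetical helper that reports which pattern matched.
string Describe(object o)
{
    if (o is null) return "null";               // const pattern
    if (o is 42) return "the number 42";        // const pattern with an int
    if (o is "42") return "the string 42";      // const pattern with a string
    if (o is int i) return $"another int: {i}"; // type pattern
    return $"something else: {o.GetType().Name}";
}

Console.WriteLine(Describe(42));
Console.WriteLine(Describe("42"));
Console.WriteLine(Describe(7));
Console.WriteLine(Describe(4.2));
```

The order of the checks matters here: the const pattern 42 has to come before the type pattern int i, which would otherwise match the value 42 as well.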
The switch/case statement is good for selecting one branch of execution from a set of mutually exclusive ones. It takes the form of a switch argument followed by a series of case clauses. When the expression in the switch argument evaluates to one of the values specified by a case clause, the code immediately following the case clause executes. This is one example for which you don't need to use curly braces to join statements into blocks; instead, you mark the end of the code for each case using the break statement. You can also include a default case in the switch statement, which executes if the expression doesn't evaluate to any of the other cases. The following switch statement tests the value of the x variable (code file SwitchStatement/Program.cs):
void SwitchSample(int x)
{
    switch (x)
    {
        case 1:
            Console.WriteLine("x = 1");
            break;
        case 2:
            Console.WriteLine("x = 2");
            break;
        case 3:
            Console.WriteLine("x = 3");
            break;
        default:
            Console.WriteLine("x is not 1, 2, or 3");
            break;
    }
}
Note that the case values must be constant expressions; variables are not permitted.
With the switch statement, you cannot remove the break from the different cases. Contrary to the C++ and Java programming languages, with C# automatic fall-through from one case implementation to continue with another case is not done. Instead of an automatic fall-through, you can use the goto keyword for an explicit fall-through and select another case. Here's an example:
goto case 3;
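For context, the following sketch shows goto case inside a complete switch statement. The method name Score and the values it adds are invented for this illustration.

```csharp
using System;

// Hypothetical example: lower levels explicitly continue with the next case.
int Score(int level)
{
    int score = 0;
    switch (level)
    {
        case 1:
            score += 1;
            goto case 2; // explicit fall-through to case 2
        case 2:
            score += 10;
            goto case 3; // explicit fall-through to case 3
        case 3:
            score += 100;
            break;
        default:
            break;
    }
    return score;
}

Console.WriteLine(Score(1)); // 111
Console.WriteLine(Score(2)); // 110
Console.WriteLine(Score(3)); // 100
```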
If the implementation is completely the same with multiple cases, you can specify multiple cases before specifying an implementation:
switch(country)
{
case "au":
case "uk":
case "us":
language = "English";
break;
case "at":
case "de":
language = "German";
break;
}
Pattern matching can also be used with the switch statement. The following code snippet shows different case options with const, type, and relational patterns. The method SwitchWithPatternMatching receives a parameter of type object. case null is a const pattern that compares o for null. The next three cases specify a type pattern. case int i uses a type pattern that creates the variable i if o is an int, but only in combination with the when clause. The when clause uses a relational pattern to check if it is larger than 42. The next case matches every remaining int type. Here, no variable is specified where object o should be assigned. Specifying a variable is not necessary if you don't need this variable and just need to know it's of this type. With the match for a Book type, the variable b is used. The variable declared here is of type Book (code file SwitchStatement/Program.cs):
void SwitchWithPatternMatching(object o)
{
    switch (o)
    {
        case null:
            Console.WriteLine("const pattern with null");
            break;
        case int i when i > 42:
            Console.WriteLine("type pattern with when and a relational pattern");
            break;
        case int:
            Console.WriteLine("type pattern with an int");
            break;
        case Book b:
            Console.WriteLine($"type pattern with a Book {b.Title}");
            break;
        default:
            break;
    }
}
The next example shows a switch based on an enum type. The enum type is based on an integer but gives names to the different values. The type TrafficLight defines the different values for the colors of a traffic light (code file SwitchExpression/Program.cs):
enum TrafficLight
{
Red,
Amber,
Green
}
With the switch statement so far, you've only seen invoking some actions in every case. When you use the return statement to return from a method, you can also directly return a value from the case without continuing with the following cases. The method NextLightClassic receives a TrafficLight with its parameter and returns a TrafficLight. If the passed traffic light has the value TrafficLight.Green, the method returns TrafficLight.Amber. When the current light value is TrafficLight.Amber, TrafficLight.Red is returned:
TrafficLight NextLightClassic(TrafficLight light)
{
switch (light)
{
case TrafficLight.Green:
return TrafficLight.Amber;
case TrafficLight.Amber:
return TrafficLight.Red;
case TrafficLight.Red:
return TrafficLight.Green;
default:
throw new InvalidOperationException();
}
}
In such a scenario, if you need to return a value based on different options, you can use the switch expression that is new as of C# 8. The method NextLight receives and returns a TrafficLight value similar to the previously shown method. The implementation is now done with an expression-bodied member because the implementation is done in a single statement. Curly braces and the return statement are unnecessary in this case. When you use a switch expression instead of the switch statement, the variable and switch keyword are reversed. With the switch statement, the value to switch on follows in parentheses after the switch keyword. With the switch expression, the variable is followed by the switch keyword. A block with curly braces defines the different cases. Instead of using the case keyword, the => token is used to define what's returned. The functionality is the same as before, but you need fewer lines of code:
TrafficLight NextLight(TrafficLight light) =>
light switch
{
TrafficLight.Green => TrafficLight.Amber,
TrafficLight.Amber => TrafficLight.Red,
TrafficLight.Red => TrafficLight.Green,
_ => throw new InvalidOperationException()
};
If the enum
type TrafficLight
is imported with the using static
directive, you can simplify the implementation even more by just using the enum
value definitions without the type name:
using static TrafficLight;
TrafficLight NextLight(TrafficLight light) =>
light switch
{
Green => Amber,
Amber => Red,
Red => Green,
_ => throw new InvalidOperationException()
};
With the next example, a pattern combinator is used to combine multiple patterns. First, input is retrieved from the console. If the string two or three is entered, the same match applies, using the or pattern combinator (code file SwitchExpression/Program.cs):
string? input = Console.ReadLine();
string result = input switch
{
"one" => "the input has the value one",
"two" or "three" => "the input has the value two or three",
_ => "any other value"
};
With pattern combinators, you can combine patterns using the and
, or
, and not
keywords.
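As a minimal sketch of how the combinators read in practice, the following switch expression mixes relational patterns with and, or, and not. The Describe function and its result strings are made up for this illustration and are not part of the book's sample code:

```csharp
using System;

Console.WriteLine(Describe(-5));   // negative
Console.WriteLine(Describe(7));    // single digit
Console.WriteLine(Describe(100));  // power of ten
Console.WriteLine(Describe(42));   // the answer

// Hypothetical helper: classifies an int with combined patterns.
string Describe(int n) => n switch
{
  < 0 => "negative",
  0 => "zero",
  > 0 and <= 9 => "single digit",       // both relational patterns must match
  10 or 100 or 1000 => "power of ten",  // any one of the constants matches
  not 42 => "some other number",        // everything except 42
  _ => "the answer"
};
```

The arms are evaluated from top to bottom, so the final discard arm is reached only by the value 42.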
C# provides four different loops (for
, while
, do
-
while
, and foreach
) that enable you to execute a block of code repeatedly until a certain condition is met. With the for
keyword, you iterate through a loop whereby you test whether a particular condition holds true before you perform another iteration:
for (int i = 0; i < 100; i++)
{
Console.WriteLine(i);
}
The first expression of the for
statement is the initializer. It is evaluated once, before the first iteration is executed. Usually you use this to initialize a local variable as a loop counter.
The second expression is the condition. This is checked before every iteration of the for
block. If this expression evaluates to true
, the block is executed. If it evaluates to false
, the for
statement ends, and the program continues with the next statement after the closing curly brace of the for
body.
After the body is executed, the third expression, the iterator, is evaluated. Usually, you increment the loop counter. With i++
, a value of 1 is added to the variable i
. After the third expression, the condition expression is evaluated again to check whether another iteration with the for
block should be done.
The for
loop is a so-called pretest loop because the loop condition is evaluated before the loop statements are executed; therefore, the contents of the loop won't be executed at all if the loop condition is false
.
It's not unusual to nest for
loops so that an inner loop executes once completely for each iteration of an outer loop. This approach is typically employed to loop through every element in a rectangular multidimensional array. The outermost loop loops through every row, and the inner loop loops through every column in a particular row. The following code displays rows of numbers. It also uses another Console
method, Console.Write
, which does the same thing as Console.WriteLine
but doesn't append a line terminator to the output (code file ForLoop/Program.cs
):
// This loop iterates through rows
for (int i = 0; i < 100; i += 10)
{
  // This loop iterates through columns
  for (int j = i; j < i + 10; j++)
  {
    Console.Write($" {j}");
  }
  Console.WriteLine();
}
This sample results in this output:
0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29
30 31 32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59
60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99
Like the for
loop, while
is a pretest loop. The syntax is similar, but while
loops take only one expression:
while(condition)
statement(s);
Unlike the for loop, the while loop is most often used to repeat a statement or a block of statements for a number of times that is not known before the loop begins. Usually, a statement inside the while loop's body changes a Boolean flag on a certain iteration, triggering the end of the loop, as in the following example:
bool condition = false;
while (!condition)
{
// This loop spins until the condition is true.
DoSomeWork();
condition = CheckCondition(); // assume CheckCondition() returns a bool
}
The do
-
while
loop is the post-test version of the while
loop. This means that the loop's test condition is evaluated after the body of the loop has been executed. Consequently, do
-
while
loops are useful for situations in which a block of statements must be executed at least one time, as in this example:
bool condition;
do
{
// This loop will at least execute once, even if the condition is false.
MustBeCalledAtLeastOnce();
condition = CheckCondition();
} while (condition);
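A small, self-contained case where the post-test behavior matters is counting the decimal digits of a number: even 0 has one digit, so the body has to run at least once. The CountDigits function is a hypothetical helper written for this sketch:

```csharp
using System;

Console.WriteLine(CountDigits(0));      // 1
Console.WriteLine(CountDigits(12345));  // 5

// Hypothetical helper: counts the decimal digits of a number.
int CountDigits(int number)
{
  int digits = 0;
  do
  {
    digits++;        // the body runs before the condition is checked
    number /= 10;    // drop the last digit
  } while (number != 0);
  return digits;
}
```

With a while loop instead, the value 0 would skip the body and incorrectly report zero digits.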
The foreach
loop enables you to iterate through each item in a collection. For now, don't worry about exactly what a collection is (it is explained fully in Chapter 6, “Arrays”); just understand that it is an object that represents a list of objects. Technically, for an object to count as a collection, it must support an interface called IEnumerable
. Examples of collections include C# arrays, the collection classes in the System.Collections
namespaces, and user-defined collection classes. You can get an idea of the syntax of foreach
from the following code, if you assume that arrayOfInts
is (unsurprisingly) an array of int
s:
foreach (int temp in arrayOfInts)
{
Console.WriteLine(temp);
}
Here, foreach
steps through the array one element at a time. With each element, it places the value of the element in the int
variable called temp
and then performs an iteration of the loop.
Here is another situation where you can use type inference. The foreach
loop would become the following:
foreach (var temp in arrayOfInts)
{
// …
}
The type of temp is inferred to be int because that is the type of the items in the collection.
An important point to note with foreach
is that you can't change the value of the item in the collection (temp
in the preceding code), so code such as the following will not compile:
foreach (int temp in arrayOfInts)
{
temp++;
Console.WriteLine(temp);
}
If you need to iterate through the items in a collection and change their values, you must use a for
loop instead.
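A minimal sketch of that alternative, assuming arrayOfInts is an int array as before: the for loop accesses each element through its index, so the element itself can be assigned.

```csharp
using System;

int[] arrayOfInts = { 1, 2, 3 };

// The index gives write access to each element of the array.
for (int i = 0; i < arrayOfInts.Length; i++)
{
  arrayOfInts[i]++;
}

Console.WriteLine(string.Join(", ", arrayOfInts)); // 2, 3, 4
```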
Within a loop, you can stop the iterations with the break
statement or end the current iteration and continue with the next iteration with the continue
statement. With the return
statement, you can exit the current method and thus also exit a loop.
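The following sketch puts the three statements side by side. OddNumbersBelow is a hypothetical function invented for this illustration; a real implementation would normally put the limit check into the loop condition instead of using break.

```csharp
using System;
using System.Collections.Generic;

Console.WriteLine(string.Join(", ", OddNumbersBelow(10))); // 1, 3, 5, 7, 9

// Hypothetical helper: collects the odd numbers below a limit.
List<int> OddNumbersBelow(int limit)
{
  List<int> result = new();
  for (int i = 0; ; i++)          // no condition: the loop ends via break
  {
    if (i >= limit) break;        // stop iterating entirely
    if (i % 2 == 0) continue;     // skip even numbers, go on with the next i
    result.Add(i);
  }
  return result;                  // leaves the method
}
```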
With small sample applications, you don't need to specify a namespace. When you create libraries whose classes are used in applications, you should specify namespaces to avoid ambiguities. The Console class used earlier is defined in the System namespace. To use the class Console, you either have to prefix it with the namespace or import the namespace of this class.
Namespaces can be defined in a hierarchical way. For example, the ServiceCollection
class is specified in the namespace Microsoft.Extensions.DependencyInjection
. To define the class Sample
in the namespace Wrox.ProCSharp.CoreCSharp
, you can specify this namespace hierarchy with the namespace
keyword:
namespace Wrox
{
  namespace ProCSharp
  {
    namespace CoreCSharp
    {
      public class Sample
      {
      }
    }
  }
}
You can also use the dotted notation to specify the namespace:
namespace Wrox.ProCSharp.CoreCSharp
{
public class Sample
{
}
}
A namespace is a logical construct for grouping different types together and is completely independent of physical files or components. One assembly can contain multiple namespaces, and a single namespace can be spread across multiple assemblies.
Each namespace name is composed of the names of the namespaces it resides within, separated with periods, starting with the outermost namespace and ending with its own short name. Therefore, the full name for the ProCSharp
namespace is Wrox.ProCSharp
, and the full name of the Sample
class is Wrox.ProCSharp.CoreCSharp.Sample
.
Obviously, namespaces can grow rather long and tiresome to type, and the capability to indicate a particular class with such specificity may not always be necessary. Fortunately, as noted earlier in this chapter, C# allows you to abbreviate a class's full name. To do this, list the class's namespace at the top of the file, prefixed with the using
keyword. Throughout the rest of the file, you can refer to the types in the namespace by their type names.
If two namespaces referenced by using directives contain a type of the same name, you need to use the full (or at least a longer) form of the name to ensure that the compiler knows which type to access. For example, suppose classes called Test exist in both the ProCSharp.CoreCSharp and ProCSharp.OOP namespaces. If you then reference Test by its simple name while both namespaces are imported, the compiler reacts with an ambiguity compilation error. In this case, you need to specify the namespace name for the type.
Instead of specifying the complete namespace name for the class to resolve ambiguity issues, you can specify an alias with the using
directive, as shown with different Timer
classes from two namespaces:
using TimersTimer = System.Timers.Timer;
using Webtimer = System.Web.UI.Timer;
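The sketch below shows an alias resolving such an ambiguity in a runnable form. As an assumption for this example, it uses System.Threading as the second namespace that contains a Timer class, because the System.Web.UI namespace belongs to ASP.NET Web Forms on the .NET Framework and may not be available in your project:

```csharp
using System;
using System.Threading;   // also contains a class named Timer
using System.Timers;
using TimersTimer = System.Timers.Timer;

// With both namespaces imported, the simple name Timer would be ambiguous.
// The alias selects System.Timers.Timer without spelling out the namespace.
TimersTimer timer = new(1000);   // interval in milliseconds
Console.WriteLine(typeof(TimersTimer).FullName); // System.Timers.Timer
timer.Dispose();
```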
The code in this chapter has already used the string type several times. string is an important reference type that offers many features. Although it's a reference type, it's immutable: it can't be changed. None of the methods this type offers change the content of the string; instead, they return a new string. For example, to concatenate strings, the + operator is overloaded. Conceptually, the expression s1 + " " + s2 first creates a new string combining s1 and the string containing the space character, and then creates another new string by combining that result with s2. Finally, the result string is referenced from the variable s3:
string s1 = "Hello";
string s2 = "World";
string s3 = s1 + " " + s2;
With many strings created, be aware that objects that are no longer needed have to be cleaned up by the garbage collector. The garbage collector frees memory in the managed heap from objects that are no longer needed. This doesn't happen immediately when a reference is no longer used; it's triggered based on certain memory limits. Read Chapter 13 for more information on the garbage collector. It's best to avoid unnecessary object allocations, which you can do when dynamically working with strings by using the StringBuilder class.
The StringBuilder
allows a program to dynamically work with strings using Append
, Insert
, Remove
, and Replace
methods without creating new objects. Instead, the StringBuilder
uses a memory buffer and modifies this buffer as the need arises. When you're creating a StringBuilder
, the default capacity is 16 characters. If strings are appended as shown in the following code snippet and more memory is needed, the capacity is doubled to 32 characters (code file StringSample/Program.cs
):
using System.Text; // StringBuilder is defined in the System.Text namespace

void UsingStringBuilder()
{
  StringBuilder sb = new("the quick");
  sb.Append(' ');
  sb.Append("brown fox jumped over ");
  sb.Append("the lazy dogs 1234567890 times");
  string s = sb.ToString();
  Console.WriteLine(s);
}
If the capacity is too small, the buffer size typically doubles (for example, from 16 to 32 to 64 to 128 characters), although the exact growth is an implementation detail of the runtime. The length of the string can be accessed with the Length
property. The capacity of the StringBuilder
is returned from the Capacity
property. After creating the necessary string, you can use the ToString
method, which allocates a new string containing the content of the StringBuilder
.
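To observe the two properties yourself, you can print them before and after appending, as in the following sketch. The capacity values are deliberately not shown as expected output because they depend on the runtime's growth strategy; only the length is determined by the content:

```csharp
using System;
using System.Text;

StringBuilder sb = new("the quick");
Console.WriteLine($"length: {sb.Length}, capacity: {sb.Capacity}");

// Appending more characters than the buffer can hold makes it grow.
sb.Append(" brown fox jumped over the lazy dogs");
Console.WriteLine($"length: {sb.Length}, capacity: {sb.Capacity}");

// ToString allocates one new string with the final content.
string s = sb.ToString();
Console.WriteLine(s);
```

The capacity is always at least as large as the length, so the second line reports a larger capacity than the first.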
Code snippets in this chapter have already included strings with the $
prefix. This prefix allows evaluating expressions within the string and is known as string interpolation. For example, with string s2
, the content of string s1
is embedded within s2
to have the final result of Hello, World!
:
string s1 = "World";
string s2 = $"Hello, {s1}!";
You can write code expressions within the curly braces to get the expression evaluated and the result added into the string. In the following code snippet, a string is specified with three placeholders where the value of x
, the value of y
, and the result of the addition of x
and y
are put into the string:
int x = 3, y = 4;
string s3 = $"The result of {x} and {y} is {x + y}";
Console.WriteLine(s3);
The resulting string is The result of 3 and 4 is 7
.
The compiler translates the interpolated string into a call to the Format method of the string type, passing a format string with numbered placeholders followed by the values to insert. The Format method substitutes the arguments for the placeholders based on their numbers: the first argument following the format string goes to placeholder 0, the second argument to placeholder 1, and so on:
string s3 = string.Format("The result of {0} and {1} is {2}", x, y, x + y);
What the interpolated string gets translated to can easily be seen by assigning the interpolated string to a FormattableString variable. An interpolated string can be assigned directly to this type because the compiler supports this conversion. This type defines the Format property that returns the resulting format string, an ArgumentCount property, and the method GetArgument that returns the argument values (code file StringSample/Program.cs):
void UsingFormattableString()
{
  int x = 3, y = 4;
  FormattableString s = $"The result of {x} + {y} is {x + y}";
  Console.WriteLine($"format: {s.Format}");
  for (int i = 0; i < s.ArgumentCount; i++)
  {
    Console.WriteLine($"argument {i}: {s.GetArgument(i)}");
  }
  Console.WriteLine();
}
Running this code snippet results in this output:
format: The result of {0} + {1} is {2}
argument 0: 3
argument 1: 4
argument 2: 7
With an interpolated string, you can add a string format to the expression. .NET defines default formats for numbers, dates, and time based on the computer's locale. The following code snippet shows a date, an int
value, and a double
with different format representations. D
is used to display the date in the long date format, d
in the short date format. The number is shown with integral and decimal digits (n
), using an exponential notation (e
), a conversion to hexadecimal (x
), and a currency (c
). With the double value, the first result is rounded to three digits after the decimal point (###.###); with the second version, three digits before the decimal point are shown as well, padded with leading zeros (000.000):
void UseStringFormat()
{
DateTime day = new(2025, 2, 14);
Console.WriteLine($"{day:D}");
Console.WriteLine($"{day:d}");
int i = 2477;
Console.WriteLine($"{i:n} {i:e} {i:x} {i:c}");
double d = 3.1415;
Console.WriteLine($"{d:###.###}");
Console.WriteLine($"{d:000.000}");
Console.WriteLine();
}
When you run the application, this is shown:
Friday, February 14, 2025
2/14/2025
2,477.00 2.477000e+003 9ad $2,477.00
3.142
003.142
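Because the default formats depend on the computer's locale, the output on your machine can differ from the one shown. If you need output that is independent of the locale, one option is to pass a culture explicitly. The following sketch uses CultureInfo.InvariantCulture with the ToString overload that accepts a format and a format provider; the variable names are chosen for this example:

```csharp
using System;
using System.Globalization;

int i = 2477;
double d = 3.14159;
CultureInfo inv = CultureInfo.InvariantCulture;

// The same format strings as before, but with a fixed culture.
Console.WriteLine(i.ToString("n", inv));        // 2,477.00
Console.WriteLine(i.ToString("e", inv));        // 2.477000e+003
Console.WriteLine(i.ToString("x", inv));        // 9ad
Console.WriteLine(d.ToString("###.###", inv));  // 3.142
Console.WriteLine(d.ToString("000.000", inv));  // 003.142
```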
Code snippets in the section “The Character Type” earlier in this chapter included special characters such as \t
for a tab or \r\n
for carriage return newline. You can use these characters in a complete string to get the specific meaning. If you need a backslash in the output of the string, you can escape this with a double backslash \\
. This can be annoying if backslashes are needed multiple times because they can make the code unreadable. For such scenarios, such as when using regular expressions, you can use verbatim strings. A verbatim string is prefixed with the @
character:
string s = @"a tab: \t, a carriage return: \r, a newline: \n";
Console.WriteLine(s);
Running the preceding code results in this output:
a tab: \t, a carriage return: \r, a newline: \n
The String
type offers a Substring
method to retrieve a part of a string. Instead of using the Substring
method, as of C# 8 you can use the hat and the range operators. The range operator uses the ..
notation to specify a range. With the string, you can use the indexer to access one character or use it with the range operator to access a substring. The numbers left and right of the ..
operator specify the range. The left number is the zero-based index of the first character, which is included; the right number is the zero-based index where the range ends, which is excluded. The range 0..3 would span the string The. To start from the first character in the string, the 0 can be omitted, as shown in the following code snippet. The range 4..9 starts with the fifth character and goes up to the ninth character. To count from the end, you can use the hat operator ^ (code file StringSample/Program.cs):
void RangesWithStrings()
{
string s = "The quick brown fox jumped over the lazy dogs down " +
"1234567890 times";
string the = s[..3];
string quick = s[4..9];
string times = s[^5..^0];
Console.WriteLine(the);
Console.WriteLine(quick);
Console.WriteLine(times);
Console.WriteLine();
}
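For comparison, here is a short sketch that places the hat and range operators next to the equivalent Substring call. The shorter sample string is an assumption made to keep the results easy to verify by hand:

```csharp
using System;

string s = "The quick brown fox";

char last = s[^1];                // index from the end: 'x'
string fox = s[^3..];             // last three characters: "fox"
string quick = s[4..9];           // indices 4 to 8: "quick"
string same = s.Substring(4, 5);  // start index 4, length 5: "quick"

Console.WriteLine($"{last} {fox} {quick} {same}"); // x fox quick quick
```

Note that Substring takes a start index and a length, whereas the range takes a start index and an exclusive end index.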
The next topic—adding comments to your code—looks simple on the surface, but it can be complex. Comments can be beneficial to other developers who may look at your code. Also, as you will see, you can use comments to generate documentation for your code that other developers can use.
C# uses the traditional C-type single-line (//..
) and multiline (/* .. */
) comments:
// This is a single-line comment
/* This comment
spans multiple lines. */
Everything in a single-line comment, from the //
to the end of the line, is ignored by the compiler, and everything from an opening /*
to the next */
in a multiline comment combination is ignored. It is possible to put multiline comments within a line of code:
Console.WriteLine(/* Here's a comment! */ "This will compile.");
Inline comments can be useful when debugging if, for example, you temporarily want to try running the code with a different value somewhere, as in the following code snippet. However, inline comments can make code hard to read, so use them with care.
DoSomething(Width, /*Height*/ 100);
In addition to the C-type comments illustrated in the preceding section, C# has a very neat feature: the capability to produce documentation in XML format automatically from special comments. These comments are single-line comments, but they begin with three slashes (///
) instead of two. Within these comments, you can place XML tags containing documentation of the types and type members in your code.
The tags in the following table are recognized by the compiler:
TAG | DESCRIPTION |
---|---|
<c> | Marks up text within a line as code, for example, <c>int i = 10;</c>. |
<code> | Marks multiple lines as code. |
<example> | Marks up a code example. |
<exception> | Documents an exception that a member can throw. (Syntax is verified by the compiler.) |
<include> | Includes comments from another documentation file. (Syntax is verified by the compiler.) |
<list> | Inserts a list into the documentation. |
<para> | Gives structure to text. |
<param> | Marks up a method parameter. (Syntax is verified by the compiler.) |
<paramref> | Indicates that a word is a method parameter. (Syntax is verified by the compiler.) |
<permission> | Documents access to a member. (Syntax is verified by the compiler.) |
<remarks> | Adds a description for a member. |
<returns> | Documents the return value for a method. |
<see> | Provides a cross-reference to another type or member. (Syntax is verified by the compiler.) |
<seealso> | Provides a “see also” section in a description. (Syntax is verified by the compiler.) |
<summary> | Provides a short summary of a type or member. |
<typeparam> | Describes a type parameter in the comment of a generic type. |
<typeparamref> | Provides the name of the type parameter. |
<value> | Describes a property. |
The following code snippet shows the Calculator
class with documentation specified for the class, and documentation for the Add
method (code file Math/Calculator.cs
):
namespace ProCSharp.MathLib
{
///<summary>
/// ProCSharp.MathLib.Calculator class.
/// Provides a method to add two doubles.
///</summary>
public static class Calculator
{
///<summary>
/// The Add method allows us to add two doubles.
///</summary>
///<returns>Result of the addition (double)</returns>
///<param name="x">First number to add</param>
///<param name="y">Second number to add</param>
public static double Add(double x, double y) => x + y;
}
}
To generate the XML documentation, you can add the GenerateDocumentationFile element to the project file (project configuration file Math/Math.csproj):
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>exe</OutputType>
<TargetFramework>net5.0</TargetFramework>
<Nullable>enable</Nullable>
<GenerateDocumentationFile>true</GenerateDocumentationFile>
</PropertyGroup>
</Project>
With this setting, the documentation file is created in the same directory where the program binary will show up as you compile the application. You can also specify the DocumentationFile
element to define a name that's different from the project file, and you can also specify an absolute directory where the documentation should be generated.
Using tools like Visual Studio, IntelliSense will show tooltips with the information from the documentation as the classes and members are used.
Besides the C# keywords, most of which you have now encountered, C# includes a number of commands that are known as preprocessor directives. These commands are never translated to any commands in your executable code, but they affect aspects of the compilation process. For example, you can use preprocessor directives to prevent the compiler from compiling certain portions of your code. You might do this if you target different frameworks and need to deal with the differences between them. In another scenario, you might want to turn nullable reference types on or off for parts of the code because an existing codebase cannot be fully updated in the short term.
The preprocessor directives are all distinguished by beginning with the #
symbol.
The following sections briefly cover the purposes of the preprocessor directives.
#define
is used like this:
#define DEBUG
This tells the compiler that a symbol with the given name (in this case DEBUG
) exists. It is a little bit like declaring a variable, except that this variable doesn't really have a value; it just exists. Also, this symbol isn't part of your actual code; it exists only for the benefit of the compiler while the compiler is compiling the code, and it has no meaning within the C# code itself.
#undef
does the opposite and removes the definition of a symbol:
#undef DEBUG
If the symbol doesn't exist in the first place, then #undef
has no effect. Similarly, #define
has no effect if a symbol already exists.
You need to place any #define
and #undef
directives at the beginning of the C# source file, before any code that declares any objects to be compiled.
#define
isn't of much use on its own, but when combined with other preprocessor directives, especially #if
, it becomes powerful.
By default, with a Debug build, the DEBUG
symbol is defined, and with a Release build, the RELEASE
symbol is defined. To define different code paths on debug and release builds, you don't need to define these symbols; all you have to do is to use the preprocessor directives shown in the next section to define the code paths the compiler should take.
These directives inform the compiler whether to compile a block of code. Consider this method:
void DoSomeWork(double x)
{
  // do something
#if DEBUG
  Console.WriteLine($"x is {x}");
#endif
}
This code compiles as normal except for the Console.WriteLine
method call contained inside the #if
clause. This line is compiled, and therefore executed, only if the symbol DEBUG
has been defined. As previously mentioned, it's defined with a Debug build, or you can define it yourself with a preceding #define
directive. When the compiler finds the #if
directive, it checks to see whether the symbol concerned exists and compiles the code inside the #if
clause only if the symbol does exist. Otherwise, the compiler simply ignores all the code until it reaches the matching #endif
directive. Typical practice is to define the symbol DEBUG
while you are debugging and have various bits of debugging-related code inside #if
clauses. Then, when you are close to shipping, you simply comment out the #define
directive, and all the debugging code miraculously disappears, the size of the executable file gets smaller, and your end users don't get confused by seeing debugging information. (Obviously, you would do more testing to ensure that your code still works without DEBUG
defined.) This technique is common in C and C++ programming and is known as conditional compilation.
The #elif (= else if) and #else directives can be used in #if blocks and have intuitively obvious meanings. It is also possible to nest #if blocks:
#define ENTERPRISE
#define W10
// further on in the file
#if ENTERPRISE
// do something
#if W10
// some code that is only relevant to enterprise
// edition running on W10
#endif
#elif PROFESSIONAL
// do something else
#else
// code for the leaner version
#endif
#if
and #elif
support a limited range of logical operators, too, using the operators !
, ==
, !=
, &&, and ||
. A symbol is considered to be true
if it exists and false
if it doesn't. Here's an example:
#if W10 && !ENTERPRISE // if W10 is defined but ENTERPRISE isn't
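The following sketch combines #define, the logical operators, and the #elif and #else branches in one small program. The symbols VERBOSE and QUIET are invented for this example. Note that the #define directive has to appear before any other code in the file, including using directives:

```csharp
#define VERBOSE
using System;

string mode;
#if VERBOSE && !QUIET   // VERBOSE is defined and QUIET isn't
mode = "verbose";
#elif QUIET
mode = "quiet";
#else
mode = "default";
#endif

Console.WriteLine(mode); // verbose
```

Only the branch whose condition is true is compiled, so the variable mode is assigned exactly once.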
Two other useful preprocessor directives, #warning
and #error
, cause a warning or an error, respectively, to be raised when the compiler encounters them. If the compiler sees a #warning
directive, it displays whatever text appears after the #warning
to the user, after which compilation continues. If it encounters an #error
directive, it displays the subsequent text to the user as if it is a compilation error message and then immediately abandons the compilation, so no IL code is generated.
You can use these directives as checks that you haven't done anything silly with your #define
statements; you can also use the #warning
statements to remind yourself to do something:
#if DEBUG && RELEASE
#error "You've defined DEBUG and RELEASE simultaneously!"
#endif
#warning "Don't forget to remove this line before the boss tests the code!"
Console.WriteLine("*I love this job.*");
The #region
and #endregion
directives are used to indicate that a certain block of code is to be treated as a single block with a given name, like this:
#region Member Field Declarations
int x;
double d;
decimal balance;
#endregion
The region directives are ignored by the compiler and used by tools such as the Visual Studio code editor. The editor allows you to collapse region sections, so only the text associated with the region shows. This makes it easier to scroll through the source code. However, you should prefer to write shorter code files instead.
You can use the #line
directive to alter the filename and line number information that is output by the compiler in warnings and error messages. You probably won't want to use this directive often. It's most useful when you are coding in conjunction with another package that alters the code you are typing before sending it to the compiler. In this situation, line numbers, or perhaps the filenames reported by the compiler, don't match up to the line numbers in the files or the filenames you are editing. The #line
directive can be used to restore the match. You can also use the syntax #line default
to restore the line to the default line numbering:
#line 164 "Core.cs" // We happen to know this is line 164 in the file
// Core.cs, before the intermediate
// package mangles it.
// later on
#line default // restores default line numbering
The #pragma
directive can either suppress or restore specific compiler warnings. Unlike command-line options, the #pragma
directive can be implemented on the class or method level, enabling fine-grained control over what warnings are suppressed and when. The following example disables the “field not used” warning and then restores it after the MyClass
class compiles:
#pragma warning disable 169
public class MyClass
{
int neverUsedField;
}
#pragma warning restore 169
With the #nullable
directive, you can turn on or off nullable reference types within a code file. #nullable enable
turns nullable reference types on, no matter what the setting in the project file. #nullable disable
turns it off. #nullable restore
switches the settings back to the settings of the project file.
How do you use this? If nullable reference types are enabled with the project file, you can temporarily turn them off in code sections where you have issues with this compiler behavior and restore it to the project file settings after the code with nullability issues.
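A minimal sketch of this pattern follows. The variable names are assumptions for this example, and the section between #nullable disable and #nullable restore stands in for legacy code that hasn't been annotated yet:

```csharp
#nullable enable
using System;

string? maybe = null;              // the ? annotation marks the reference as nullable
int length = maybe?.Length ?? 0;   // null-safe access, no warning

#nullable disable
string legacy = null;              // no nullable warning in the disabled context
#nullable restore                  // back to the project file setting

Console.WriteLine(length);         // 0
Console.WriteLine(legacy is null); // True
```

The directives change only which warnings the compiler reports; the behavior of the code at run time is the same in either context.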
This final section of the chapter supplies the guidelines you need to bear in mind when writing C# programs. These are guidelines that most C# developers use. When you use these guidelines, other developers will feel comfortable working with your code.
This section examines the rules governing what names you can use for variables, classes, methods, and so on. Note that the rules presented in this section are not merely guidelines: they are enforced by the C# compiler.
Identifiers are the names you give to variables, user-defined types such as classes and structs, and members of these types. Identifiers are case sensitive, so, for example, variables named interestRate
and InterestRate
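would be recognized as different variables. The following are a few rules determining what identifiers you can use in C#: they must begin with a letter or an underscore, although they can contain numeric characters, and you can't use C# keywords as identifiers. See the list of C# reserved keywords at https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/.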
If you need to use one of these words as an identifier (for example, if you are accessing a class written in a different language), you can prefix the identifier with the @
symbol to indicate to the compiler that what follows should be treated as an identifier, not as a C# keyword (so abstract
is not a valid identifier, but @abstract
is).
Finally, identifiers can also contain Unicode characters, specified using the syntax \uXXXX
, where XXXX
is the four-digit hex code for the Unicode character. The following are some examples of valid identifiers:
Name
Überfluß
_Identifier
\u005fIdentifier
The last two items in this list are identical and interchangeable (because 005f
is the Unicode code for the underscore character), so, obviously, both these identifiers couldn't be declared in the same scope.
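The rules for the @ prefix and the Unicode escape can be tried out with a few lines. The variable names in this sketch are chosen only to mirror the examples above:

```csharp
using System;

int @class = 1;        // @ lets the keyword class serve as an identifier
int _Identifier = 2;
\u005fIdentifier++;    // \u005f is the underscore, so this is the same variable

Console.WriteLine(@class + _Identifier); // 4
```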
In any development language, certain traditional programming styles usually arise. The styles are not part of the language itself but rather are conventions—for example, how variables are named or how certain classes, methods, or functions are used. If most developers using that language follow the same conventions, it's easier for different developers to understand each other's code—which in turn generally helps program maintainability. Conventions do, however, depend on the language and the environment. For example, C++ developers programming on the Windows platform have traditionally used the prefixes psz
or lpsz to indicate strings (char *pszResult; char *lpszMessage;), but on Unix machines it's more common not to use any such prefixes: char *Result; char *Message;.
Whereas many languages’ usage conventions simply evolved as the language was used, for C# and the whole of the .NET Framework, Microsoft has written comprehensive usage guidelines that are detailed in the .NET/C# documentation. This means that, right from the start, .NET programs have a high degree of interoperability in terms of developers being able to understand code. The guidelines have also been developed with the benefit of some 20 years’ hindsight in object-oriented programming. Judging by the relevant newsgroups, the guidelines have been carefully thought out and are well received in the developer community. Hence, the guidelines are well worth following.
Note, however, that the guidelines are not the same as language specifications. You should try to follow the guidelines when you can. Nevertheless, you won't run into problems if you have a good reason for not doing so—for example, you won't get a compilation error because you don't follow these guidelines. The general rule is that if you don't follow the usage guidelines, you must have a convincing reason. When you depart from the guidelines, you should be making a conscious decision rather than simply not bothering. Also, if you compare the guidelines with the samples in the remainder of this book, you'll notice that in numerous examples I have chosen not to follow the conventions. That's usually because the conventions are designed for much larger programs than the samples; although the guidelines are great if you are writing a complete software package, they're not really suitable for small 20-line stand-alone programs. In many cases, following the conventions would have made the samples harder, rather than easier, to follow.
The full guidelines for good programming style are quite extensive. This section is confined to describing some of the more important guidelines, as well as those most likely to surprise you. To be absolutely certain that your code follows the usage guidelines completely, you need to refer to the Microsoft documentation.
One important aspect of making your programs understandable is how you choose to name your items—and that includes naming variables, methods, classes, enumerations, and namespaces.
It is intuitively obvious that your names should reflect the purpose of the item and should not clash with other names. The general philosophy in the .NET Framework is also that the name of a variable should reflect the purpose of that variable instance and not the data type. For example, height
is a good name for a variable, whereas integerValue
isn't. However, you are likely to find that principle is an ideal that is hard to achieve. Particularly when you are dealing with controls, in most cases you'll probably be happier sticking with variable names such as confirmationDialog
and chooseEmployeeListBox
, which do indicate the data type in the name.
The following sections look at some of the things you need to think about when choosing names.
In many cases, you should use Pascal casing for names. With Pascal casing, the first letter of each word in a name is capitalized: EmployeeSalary
, ConfirmationDialog
, PlainTextEncoding
. Notice that nearly all the names of namespaces, classes, and members in the base classes follow Pascal casing. In particular, the convention of joining words using the underscore character is discouraged. Therefore, try not to use names such as employee_salary
. It has also been common in other languages to use all capitals for names of constants. This is not advised in C# because such names are harder to read—the convention is to use Pascal casing throughout:
const int MaximumLength = 100; // a constant must be initialized; the value here is only illustrative
The only other casing convention that you are advised to use is camel casing. Camel casing is similar to Pascal casing, except that the first letter of the first word in the name is not capitalized: employeeSalary
, confirmationDialog
, plainTextEncoding
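. Camel casing is advised in three situations: for names of private member fields, for names of parameters passed to methods, and to distinguish items that would otherwise have the same name, as when a property wraps a field: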
private string employeeName;
public string EmployeeName
{
get
{
return employeeName;
}
}
If you are wrapping a property around a field, you should always use camel casing for the private member and Pascal casing for the public or protected member so that other classes that use your code see only names in Pascal case (except for parameter names).
You should also be wary about case sensitivity. C# is case sensitive, so it is syntactically correct for names in C# to differ only by the case, as in the previous examples. However, bear in mind that your assemblies might at some point be called from Visual Basic applications—and Visual Basic is not case sensitive. Hence, if you do use names that differ only by case, it is important to do so only in situations in which both names will never be seen outside your assembly. (The previous example qualifies as okay because camel case is used with the name that is attached to a private
variable.) Otherwise, you may prevent other code written in Visual Basic from being able to use your assembly correctly.
Be consistent about your style of names. For example, if one of the methods in a class is called ShowConfirmationDialog
, then you should not give another method a name such as ShowDialogWarning
or WarningDialogShow
. The other method should be called ShowWarningDialog
.
It is particularly important to choose namespace names carefully to avoid the risk of ending up with the same name for one of your namespaces as someone else uses. Remember, namespace names are the only way that .NET distinguishes names of objects in shared assemblies. Therefore, if you use the same namespace name for your software package as another package and both packages are used by the same program, problems will occur. Because of this, it's almost always a good idea to create a top-level namespace with the name of your company and then nest successive namespaces that narrow down the technology, group, or department you are working in or the name of the package for which your classes are intended. Microsoft recommends namespace names that begin with <CompanyName>.<TechnologyName>
.
It is important that the names do not clash with any keywords. In fact, if you attempt to name an item in your code with a word that happens to be a C# keyword, you'll almost certainly get a syntax error because the compiler will assume that the name refers to a statement. However, because of the possibility that your classes will be accessed by code written in other languages, it is also important that you don't use names that are keywords in other .NET languages. Generally speaking, C++ keywords are similar to C# keywords, so confusion with C++ is unlikely, and those commonly encountered keywords that are unique to Visual C++ tend to start with two underscore characters. As with C#, C++ keywords are spelled in lowercase, so if you hold to the convention of naming your public classes and members with Pascal-style names, they will always have at least one uppercase letter in their names, and there will be no risk of clashes with C++ keywords. However, you are more likely to have problems with Visual Basic, which has many more keywords than C# does, and being non-case-sensitive means that you cannot rely on Pascal-style names for your classes and methods.
Check the Microsoft documentation at docs.microsoft.com/dotnet/csharp/language-reference/keywords
. Here, you find a long list of C# keywords that you shouldn't use with classes and members.
One area that can cause confusion regarding a class is whether a particular quantity should be represented by a property or a method. The rules are not hard and fast, but in general you should use a property if something should look and behave like a variable. (If you're not sure what a property is, see Chapter 3.) This means, among other things, the following:
- Client code should be able to read the value. Write-only properties are not recommended; for example, use a SetPassword method, not a write-only Password property.
- It should be possible to set properties in any order. For example, if a class exposes ConnectionString, UserName, and Password, then the author of the class should ensure that the class is implemented such that users can set them in any order.
- Successive reads of a property should give the same result. Speed, in a class that monitors the motion of an automobile, is not a good candidate for a property. Use a GetSpeed method here; but Weight and EngineSize are good candidates for properties because they will not change for a given object.
If the item you are coding satisfies all the preceding criteria, it is probably a good candidate for a property. Otherwise, you should use a method.
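The Speed and Weight guidance can be sketched in code. This is a minimal illustration, not a class from the sample files: Car, Accelerate, and the chosen values are assumptions made for the example.

```csharp
using System;

Car car = new(weight: 1200);
car.Accelerate(50);
Console.WriteLine($"Weight: {car.Weight}");    // Weight: 1200
Console.WriteLine($"Speed: {car.GetSpeed()}"); // Speed: 50

// Car and its members are hypothetical names chosen for this illustration.
public class Car
{
    private int _speed;

    public Car(int weight) => Weight = weight;

    // Fixed for a given object, so successive reads return the same value:
    // a property is a good fit.
    public int Weight { get; }

    // Changes over time, so it is exposed as a method instead.
    public int GetSpeed() => _speed;

    public void Accelerate(int delta) => _speed += delta;
}
```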
The guidelines are pretty simple here. Fields should almost always be private, although in some cases it may be acceptable for constant or read-only fields to be public. Making a field public may hinder your ability to extend or modify the class in the future.
The previous guidelines should give you a foundation of good practices, and you should use them in conjunction with a good object-oriented programming style.
A final helpful note to keep in mind is that Microsoft has been relatively careful about being consistent and has followed its own guidelines when writing the .NET base classes, so a good way to get an intuitive feel for the conventions to follow when writing .NET code is to simply look at the base classes—see how classes, members, and namespaces are named, and how the class hierarchy works. Consistency between the base classes and your classes will facilitate readability and maintainability.
This chapter examined the basic syntax of C#, covering the areas needed to write simple C# programs. Much of the syntax is instantly recognizable to developers who are familiar with any C-style language (or even JavaScript). C# has its roots in C++, Java, and Pascal (Anders Hejlsberg, the original lead architect of C#, was the original author of Turbo Pascal and also created J++, Microsoft's version of Java).
Over time, new features have been invented for C# that are now also available in other programming languages, and C# has in turn gained enhancements that other languages already offered. The next chapter dives into creating different types; the differences between classes, structs, and the new records; and the members of types, such as properties, along with more about methods.
So far, you've been introduced to some of the building blocks of the C# language, including variables, data types, and program flow statements, and you have seen a few short but complete programs that contain little more than top-level statements and a few methods. What you haven't seen yet is how to put all these elements together to form a longer program. The key to this lies in working with the types of .NET—classes, records, structs, and tuples, which are the subject of this chapter.
The types available with .NET can be categorized as pass by reference or pass by value.
Pass by value means that if you assign a variable to another variable, the value is copied. If you change the new value, the original value does not change. The content of the variable is copied on assignment. With the following code sample, a struct is created that contains a public field A. x1 and x2 are variables of this type. After creating x1, x1 is assigned to x2. Because a struct is a value type, the data from x1 is copied to x2. Changing the value of the public field with x2 doesn't influence x1 at all. The x1 variable still holds the original value; the value was copied (code file TypesSample/Program.cs):
AStruct x1 = new() { A = 1 };
AStruct x2 = x1;
x2.A = 2;
Console.WriteLine($"original didn't change with a struct: {x1.A}");
//…
public struct AStruct
{
public int A;
}
This behavior is very different with classes. If you change the public member A using the y2 variable, the new value can be read using the reference y1. Pass by reference means that the variables y1 and y2 after assignment reference the same object (code file TypesSample/Program.cs):
AClass y1 = new() { A = 1 };
AClass y2 = y1;
y2.A = 2;
Console.WriteLine($"original changed with a class: {y1.A}");
//…
public class AClass
{
public int A;
}
Another difference between the types that's worth mentioning is where the data is stored. With a reference type like the class, the memory where the data is stored is the managed heap. The variable itself is on the stack and references the content on the heap. A value type like the struct is usually stored on the stack. This is important with regard to garbage collection. The garbage collector needs to clean up objects in the heap if they are no longer used. Memory on the stack is automatically released at the end of the method, when the variable is outside of its scope.
Let's take a look at the record type that's new with C# 9. Using the record
keyword, a record is created. Similar to our previous example when the class keyword was used to create a reference type, with the record keyword a reference type is created as well. A C# 9 record is a class. This C# keyword is just “syntax sugar”: the compiler creates a class behind the scenes. There's no functionality needed from the runtime; you could create the same generated code without using this keyword, you just would need a lot more code lines (code file TypesSample/Program.cs
):
ARecord z1 = new() { A = 1 };
ARecord z2 = z1;
z2.A = 2;
Console.WriteLine($"original changed with a record: {z1.A}");
//…
public record ARecord
{
public int A;
}
What about tuples? With tuples, you combine multiple types into one type without needing to create a class, struct, or record. How does this type behave?
In the following code snippet, t1
is a tuple that combines a number and a string. The tuple t1
is then assigned to the variable t2
. If you change the value of t2
, t1
is not changed. The reason is that behind the scenes, using the C# syntax for tuples, the compiler makes use of the ValueTuple
type—which is a struct—and copies values (code file TypesSample/Program.cs
):
var t1 = (Number: 1, String: "a");
var t2 = t1;
t2.Number = 2;
t2.String = "b";
Console.WriteLine($"original didn't change with a tuple: {t1.Number} {t1.String}");
Now that you've been introduced to the main differences between classes, structs, records, and tuples, let's dive deeper into the classes, including the members of classes. Most of the members of classes you learn about also apply to records and structs. I discuss the differences between records and structs after I introduce the members of classes.
A class contains members, which can be static or instance. A static member belongs to the class; an instance member belongs to the object. With static fields, the value of the field is the same for every object. With instance fields, every object can have a different value. Static members have the static
modifier attached.
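A small sketch may make the difference between static and instance fields concrete. The Counter class and its member names are hypothetical and exist only for this illustration.

```csharp
using System;

Counter a = new();
Counter b = new();
a.Increment();
a.Increment();
b.Increment();
Console.WriteLine($"{a.Count} {b.Count} {Counter.Total}"); // 2 1 3

// Counter is a hypothetical class used only for this illustration.
public class Counter
{
    private static int s_total; // one value shared by the class as a whole
    private int _count;         // a separate value for every object

    public static int Total => s_total;
    public int Count => _count;

    public void Increment()
    {
        _count++;
        s_total++;
    }
}
```

The static member is accessed through the class name (Counter.Total), whereas the instance member is accessed through an object (a.Count).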
The kinds of members are explained in the following table:
MEMBER | DESCRIPTION |
---|---|
Fields | A field is a data member of a class. It is a variable of a type that is a member of a class. |
Constants | Constants are associated with the class (although they do not have the static modifier). The compiler replaces constants everywhere they are used with the real value. |
Methods | Methods are functions associated with a particular class. |
Properties | Properties are sets of functions that can be accessed from the client in a similar way to the public fields of the class. C# provides a specific syntax for implementing read and write properties on your classes, so you don't have to use method names that are prefixed with the words Get or Set . Because there's a dedicated syntax for properties that is distinct from that for normal functions, the illusion of objects as actual things is strengthened for client code. |
Constructors | Constructors are special functions that are called automatically when an object is instantiated. They must have the same name as the class to which they belong and cannot have a return type. Constructors are useful for initialization. |
Indexers | Indexers allow your object to be accessed the same way as arrays. Indexers are explained in Chapter 5, “Operators and Casts.” |
Operators | Operators, at their simplest, are actions such as + or –. When you add two integers, you are, strictly speaking, using the + operator for integers. C# also allows you to specify how existing operators will work with your own classes (operator overloading). Chapter 5 looks at operators in detail. |
Events | Events are class members that allow an object to notify a subscriber whenever something noteworthy happens, such as a field or property of the class changing, or some form of user interaction occurring. The client can have code, known as an event handler, that reacts to the event. Chapter 7, “Delegates, Lambdas, and Events,” looks at events in detail. |
Destructors | The syntax of destructors or finalizers is similar to the syntax for constructors, but they are called when the CLR detects that an object is no longer needed. They have the same name as the class, preceded by a tilde (~). It is impossible to predict precisely when a finalizer will be called. Finalizers are discussed in Chapter 13. |
Deconstructors | Deconstructors allow you to deconstruct the object into a tuple or different variables. Deconstruction is explained later in the section “Deconstruction.” |
Types | Classes can contain inner classes. This is interesting if the inner type is used only in conjunction with the outer type. |
Let's get into the details of class members.
Fields are any variables associated with the class. In the class Person
, the fields _firstName
and _lastName
of type string
are defined. It's a good practice to declare fields with the private
access modifier, which only allows accessing fields from within the class (code file ClassesSample/Person.cs
):
public class Person
{
//…
private string _firstName;
private string _lastName;
//…
}
In the class PeopleFactory
, the field s_peopleCount
is of type int
and has the static
modifier applied. With the static
modifier, the field is shared by all instances of the class. Instance fields (without the static
modifier) have different values for every instance of the class. Because this class only has static members, the class itself can have the static
modifier applied. The compiler then makes sure that instance members are not added (code file ClassesSample/PeopleFactory.cs
):
public static class PeopleFactory
{
//…
private static int s_peopleCount;
//…
}
To guarantee that fields of an object cannot be changed, fields can be declared with the readonly
modifier. Fields with the readonly
modifier can be assigned values only in the field declaration or in constructors. This is different from the const
modifier shown in Chapter 2, “Core C#.” With the const
modifier, the compiler replaces the variable by its value everywhere it is used. The compiler already knows the value of the constant. Read-only fields are assigned during runtime from a constructor. The following Person
class specifies a constructor where values for both firstName
and lastName
need to be passed.
Contrary to const fields, read-only fields can be instance members. With the following code snippet, the _firstName
and _lastName
fields are changed to add the readonly
modifier. The compiler complains with errors if these fields are changed after they are initialized in the constructor (code file ClassesSample/Person.cs
):
public class Person
{
//…
public Person(string firstName, string lastName)
{
_firstName = firstName;
_lastName = lastName;
}
private readonly string _firstName;
private readonly string _lastName;
//…
}
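To contrast the two modifiers side by side, here is a minimal sketch. The Document class, its members, and the value 80 are assumptions made for the illustration and are not part of the sample code.

```csharp
using System;

Document doc = new("Notes");
Console.WriteLine(Document.MaxTitleLength); // 80
Console.WriteLine(doc.Title);               // Notes
// doc.Title = "Other"; // would not compile: a readonly field cannot be assigned here

// Document is a hypothetical class used only for this illustration.
public class Document
{
    // The value is known at compile time; a const is accessed through the class name.
    public const int MaxTitleLength = 80;

    // The value is supplied at run time and can differ for every instance.
    public readonly string Title;

    public Document(string title) => Title = title;
}
```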
Instead of having a method pair to set and get the values of a field, C# defines the syntax of a property. From outside the class, a property looks like a field, typically with a name that starts with an uppercase letter. Within the class, you can write a custom implementation that does more than set and get the value of a field; for example, you can add programming logic to validate the value before assigning it to a variable. You can also define a purely computed property without any variable that is accessed by the property.
The class Person
as shown in the following code snippet defines a property with the name Age
accessing the private field _age
. With the get
accessor, the value of the field is returned. With the set
accessor, the variable value
, which contains the value passed when setting the property, is automatically created. In the code snippet, the value variable is used to assign the value to the _age
field (code file ClassesSample/Person.cs
):
public class Person
{
//…
private int _age;
public int Age
{
get => _age;
set => _age = value;
}
}
In case more than one statement is needed with the implementation of the property accessor, you can use curly brackets as shown in the following code snippet:
private int _age;
public int Age
{
get
{
return _age;
}
set
{
_age = value;
}
}
To use the property, you can access the property from an object instance. Setting a value to the property invokes the set
accessor. Reading the value invokes the get
accessor:
person.Age = 4; // setting a property value with the set accessor
int age = person.Age; // accessing the property with the get accessor
If there isn't going to be any logic in the property accessors set
and get
, then auto-implemented properties can be used. Auto-implemented properties implement the backing member variable automatically. The code for the earlier Age
example would look like this:
public int Age { get; set; }
The declaration of a private field is not needed. The compiler creates this automatically. With auto-implemented properties, you cannot access the field directly because you don't know the name the compiler generates. If all you need to do with a property is read and write a field, the syntax for the property using auto-implemented properties is shorter than using expression-bodied property accessors.
By using auto-implemented properties, validation of the property cannot be done at the property set. Therefore, with the Age
property, you could not have checked to see whether an invalid age is set.
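When validation is needed, a property with an explicit backing field can be used instead. The following is a minimal sketch: the rule that an age must not be negative and the choice of ArgumentOutOfRangeException are assumptions for the example, and the Person class shown is a simplified stand-in rather than the one from the sample files.

```csharp
using System;

Person person = new();
person.Age = 30;
try
{
    person.Age = -1; // rejected by the set accessor
}
catch (ArgumentOutOfRangeException)
{
    Console.WriteLine("rejected");
}
Console.WriteLine(person.Age); // 30

// A simplified, hypothetical Person class used only for this illustration.
public class Person
{
    private int _age;

    public int Age
    {
        get => _age;
        set
        {
            // Validate before the value reaches the backing field.
            if (value < 0)
            {
                throw new ArgumentOutOfRangeException(nameof(value), "Age must not be negative.");
            }
            _age = value;
        }
    }
}
```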
Auto-implemented properties can be initialized using a property initializer. The compiler moves this initialization to the created constructor, and the initialization is done before the constructor body.
public int Age { get; set; } = 42;
C# allows the set
and get
accessors to have differing access modifiers. This would allow a property to have a public get
and a private or protected set
. This can help control how or when a property can be set. In the following code example, notice that the set
has a private access modifier, but the get
does not. In this case, the get
takes the access level of the property. One of the accessors must follow the access level of the property. A compile error is generated if the get
accessor has the protected
access level associated with it because that would make both accessors have a different access level from the property.
private string _name;
public string Name
{
get => _name;
private set => _name = value;
}
Different access levels can also be set with auto-implemented properties:
public int Age { get; private set; }
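A short sketch shows how a private set accessor is typically used: the class changes the value through its own methods, while callers can only read it. Account, Balance, and Deposit are hypothetical names chosen for the example.

```csharp
using System;

Account account = new();
account.Deposit(100);
Console.WriteLine(account.Balance); // 100
// account.Balance = 500; // would not compile: the set accessor is private

// Account is a hypothetical class used only for this illustration.
public class Account
{
    // Readable from anywhere, writable only from within the class.
    public int Balance { get; private set; }

    public void Deposit(int amount) => Balance += amount;
}
```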
It is possible to create a read-only property by simply omitting the set
accessor from the property definition. Thus, to make FirstName
a read-only property, you can do this by just defining the get
accessor:
private readonly string _firstName;
public string FirstName
{
get => _firstName;
}
Declaring the field with the readonly
modifier only allows initializing the value of the property in the constructor.
With properties that only implement a get
accessor, you can use a simplified syntax with the => token and assign an expression-bodied member. There's no need to write the get
accessor to return a value. Behind the scenes, the compiler creates an implementation with a get
accessor.
In the following code snippet, a FirstName
property is defined that returns the field _firstName
using an expression-bodied property. The FullName
property combines the values from the FirstName
and LastName
properties to return the full name (code file ClassesSample/Person.cs
):
private readonly string _firstName;
public string FirstName => _firstName;
private readonly string _lastName;
public string LastName => _lastName;
public string FullName => $"{FirstName} {LastName}";
C# offers a simple syntax with auto-implemented properties to create read-only properties that access read-only fields. These properties can be initialized using property initializers:
public string Id { get; } = Guid.NewGuid().ToString();
Behind the scenes, the compiler creates a read-only field and a property with a get
accessor to this field. The code from the initializer moves to the implementation of the constructor and is invoked before the constructor body is called.
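The order of initialization can be observed with a small experiment. In this sketch, the Ticket class and its string values are invented for the illustration; the constructor reads the property before overwriting it, which shows that the initializer has already run.

```csharp
using System;

Ticket ticket = new();
Console.WriteLine(ticket.SeenInConstructor); // from initializer
Console.WriteLine(ticket.Id);                // assigned in constructor

// Ticket is a hypothetical class used only for this illustration.
public class Ticket
{
    public string Id { get; } = "from initializer";
    public string SeenInConstructor { get; }

    public Ticket()
    {
        SeenInConstructor = Id;         // the property initializer has already run
        Id = "assigned in constructor"; // a get-only auto property can still be assigned here
    }
}
```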
Read-only properties can also explicitly be initialized from the constructor, as shown with this code snippet:
public class Book
{
public Book(string title) => Title = title;
public string Title { get; }
}
C# 9 allows you to define properties with get
and init
accessors by using the init
keyword instead of the set
keyword. This way the property value can be set only in the constructor or with an object initializer (code file ClassesSample/Book.cs
):
public class Book
{
public Book(string title)
{
Title = title;
}
public string Title { get; init; }
public string? Publisher { get; init; }
}
C# 9 offers a new option with properties that should only be set with constructors and object initializers. A new Book
object can now be created by invoking the constructor and using an object initializer to set the properties as shown in the following code snippet (code file ClassesSample/Program.cs):
Book theBook = new("Professional C#")
{
Publisher = "Wrox Press"
};
You can use object initializers to initialize properties on creation of the object. The constructor defines the required parameters that the class needs for initialization. With the object initializer, you can assign all properties with a set
and an init
accessor. The object initializer can be used only when creating the object, not afterward.
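The restriction can be seen in a self-contained sketch that repeats a shortened Book class. The commented-out line is an assumption about what a caller might try; with an init accessor, the compiler rejects it.

```csharp
#nullable enable
using System;

Book book = new("Professional C#") { Publisher = "Wrox Press" };
Console.WriteLine($"{book.Title}, {book.Publisher}"); // Professional C#, Wrox Press
// book.Publisher = "Other"; // would not compile: init-only properties cannot be set after creation

// A shortened copy of the Book class, repeated so that the sketch is self-contained.
public class Book
{
    public Book(string title) => Title = title;

    public string Title { get; init; }
    public string? Publisher { get; init; }
}
```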
With the C# terminology, there's a distinction between functions and methods. The term function member includes not only methods, but also other nondata members such as indexers, operators, constructors, destructors, and properties—all members that contain executable code.
In C#, the definition of a method consists of any method modifiers (such as the method's accessibility), followed by the type of the return value, followed by the name of the method, followed by a list of parameters enclosed in parentheses, followed by the body of the method enclosed in curly brackets.
Each parameter consists of the name of the type of the parameter and the name by which it can be referenced in the body of the method. Also, if the method returns a value, a return statement must be used with the return value to indicate each exit point, as shown in this example:
public bool IsSquare(Rectangle rect)
{
return (rect.Height == rect.Width);
}
If the method doesn't return anything, specify a return type of void
because you can't omit the return type altogether. If the method takes no parameters, you need to include an empty set of parentheses after the method name. With a void return, using a return statement in the implementation is optional—the method returns automatically when the closing curly brace is reached.
If the implementation of a method consists just of one statement, C# gives a simplified syntax to method definitions: expression-bodied methods. You don't need to write curly brackets and the return
keyword with this syntax. The =>
token separates the declaration on its left side from the implementation on its right side.
The following example is the same method as before, IsSquare
, implemented using the expression-bodied method syntax. The right side of the =>
token defines the implementation of the method. Curly brackets and a return statement are not needed. What's returned is the result of the expression, and the result needs to be of the same type as the return type declared on the left side, which is a bool
in this code snippet:
public bool IsSquare(Rectangle rect) => rect.Height == rect.Width;
The following example illustrates the syntax for definition and instantiation of classes and for definition and invocation of methods. The class Math
defines instance and static members (code file MathSample/Math.cs
):
public class Math
{
public int Value { get; set; }
public int GetSquare() => Value * Value;
public static int GetSquareOf(int x) => x * x;
}
The top-level statements in the Program.cs
file use the Math
class, call static methods, and instantiate an object to invoke instance members (code file MathSample/Program.cs
):
using System;
// Call static members
int x = Math.GetSquareOf(5);
Console.WriteLine($"Square of 5 is {x}");
// Instantiate a Math object
Math math = new();
// Call instance members
math.Value = 30;
Console.WriteLine($"Value field of math variable contains {math.Value}");
Console.WriteLine($"Square of 30 is {math.GetSquare()}");
Running the MathSample
example produces the following results:
Square of 5 is 25
Value field of math variable contains 30
Square of 30 is 900
As you can see from the code, the Math
class contains a property that contains a number, as well as a method to find the square of this number. It also contains one static method to find the square of the number passed in as a parameter.
C# supports method overloading: several versions of a method that have different signatures (that is, the same name but a different number of parameters and/or different parameter data types). To overload methods, simply declare the methods with the same name but different numbers or types of parameters:
class ResultDisplayer
{
public void DisplayResult(string result)
{
// implementation
}
public void DisplayResult(int result)
{
// implementation
}
}
It's not just the parameter types that can differ; the number of parameters can differ too, as shown in the next example. One overloaded method can invoke another:
class MyClass
{
public int DoSomething(int x) => DoSomething(x, 10);
public int DoSomething(int x, int y)
{
// implementation
}
}
When invoking methods, the parameter name need not be added to the invocation. However, if you have a method signature like the following to move a rectangle:
public void MoveAndResize(int x, int y, int width, int height)
and you invoke it with the following code snippet, it's not clear from the invocation what numbers are used for what:
r.MoveAndResize(30, 40, 20, 40);
You can change the invocation to make it immediately clear what the numbers mean:
r.MoveAndResize(x: 30, y: 40, width: 20, height: 40);
Any method can be invoked using named arguments. You just need to write the name of the parameter followed by a colon and the value passed. The compiler gets rid of the name and creates an invocation of the method just as if the name were not there, so there's no difference within the compiled code.
You can also change the order of the arguments this way, and the compiler rearranges them into the correct order. The real advantage to this is shown in the next section with optional arguments.
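The reordering can be checked with a short sketch. The Layout class and its Describe method are hypothetical; the method only formats its parameters so that the two calls can be compared.

```csharp
using System;

// Positional arguments in declaration order.
Console.WriteLine(Layout.Describe(30, 40, 20, 40));
// Named arguments in a different order produce the same call.
Console.WriteLine(Layout.Describe(height: 40, width: 20, y: 40, x: 30));
// Both lines print: x=30, y=40, width=20, height=40

// Layout is a hypothetical class used only for this illustration.
public static class Layout
{
    public static string Describe(int x, int y, int width, int height) =>
        $"x={x}, y={y}, width={width}, height={height}";
}
```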
Parameters can also be optional. You must supply a default value for optional parameters, which must be the last ones defined:
public void TestMethod(int notOptionalNumber, int optionalNumber = 42)
{
Console.WriteLine(optionalNumber + notOptionalNumber);
}
This method can now be invoked using one or two parameters. When you pass one parameter, the compiler changes the method call to pass 42
with the second parameter:
TestMethod(11);
TestMethod(11, 42);
You can define multiple optional parameters, as shown here:
public void TestMethod(int n, int opt1 = 11, int opt2 = 22, int opt3 = 33)
{
Console.WriteLine(n + opt1 + opt2 + opt3);
}
This way, the method can be called using one, two, three, or four parameters. The first line of the following code leaves the optional parameters with the values 11
, 22
, and 33
. The second line passes the first three parameters, and the last one has a value of 33
:
TestMethod(1);
TestMethod(1, 2, 3);
With multiple optional parameters, the feature of named arguments shines. When you use named arguments, you can pass any of the optional parameters. For example, this example passes just the last one:
TestMethod(1, opt3: 4);
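To make the effect of the defaults visible, the following sketch uses a variant of the method that returns the sum instead of writing it to the console. The class name OptionalDemo and the returning variant are assumptions made for the illustration.

```csharp
using System;

Console.WriteLine(OptionalDemo.TestMethod(1));          // 1 + 11 + 22 + 33 = 67
Console.WriteLine(OptionalDemo.TestMethod(1, 2, 3));    // 1 + 2 + 3 + 33 = 39
Console.WriteLine(OptionalDemo.TestMethod(1, opt3: 4)); // 1 + 11 + 22 + 4 = 38

// OptionalDemo is a hypothetical class used only for this illustration.
public static class OptionalDemo
{
    public static int TestMethod(int n, int opt1 = 11, int opt2 = 22, int opt3 = 33) =>
        n + opt1 + opt2 + opt3;
}
```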
With optional parameters, a method can be called with a varying number of arguments. However, there's also a different syntax that allows passing a variable number of arguments, and this syntax doesn't have versioning issues (the default values of optional parameters are compiled into the calling code, so changing a default requires recompiling the callers).
When you declare a parameter of an array type (the sample code uses an int array) and add the params keyword, the method can be invoked using any number of int arguments:
public void AnyNumberOfArguments(params int[] data)
{
foreach (var x in data)
{
Console.WriteLine(x);
}
}
Because the parameter of the method AnyNumberOfArguments is of type int[], you can pass an int array, or because of the params keyword, you can pass zero or more int values:
AnyNumberOfArguments(1);
AnyNumberOfArguments(1, 3, 5, 7, 11, 13);
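A minimal sketch of the three calling styles follows. The local function Count is a hypothetical helper that returns the number of values it receives:

```csharp
using System;

// Hypothetical helper (not part of the book samples): returns how many
// values arrived in the params array.
static int Count(params int[] data) => data.Length;

int none = Count();                      // the method receives an empty array
int three = Count(1, 3, 5);              // the compiler creates the array
int fromArray = Count(new[] { 7, 11 });  // an existing array is passed as is

Console.WriteLine(none);       // 0
Console.WriteLine(three);      // 3
Console.WriteLine(fromArray);  // 2
```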
If arguments of different types should be passed to methods, you can use an object array:
public void AnyNumberOfArguments(params object[] data)
{
// …
}
Now it is possible to use any type for the arguments when calling this method:
AnyNumberOfArguments("text", 42);
If the params keyword is used with multiple parameters that are defined with the method signature, params can be used only once, and it must be the last parameter:
Console.WriteLine(string format, params object[] arg);
Now that you've looked at the many aspects of methods, let's get into constructors, which are a special kind of method.
The syntax for declaring basic constructors is a method that has the same name as the containing class and that does not have any return type:
public class MyClass
{
public MyClass()
{
}
//…
}
It's not necessary to provide a constructor for your class. If you don't supply any constructor, the compiler generates a default one behind the scenes. This constructor initializes all the member fields to their default values, which are 0 for numbers, false for bool, and null for reference types. When you're using nullable reference types and don't declare your reference types to allow null, you'll get a compiler warning if these fields are not initialized.
Constructors follow the same rules for overloading as other methods—that is, you can provide as many overloads to the constructor as you want, provided they are clearly different in signature:
public MyClass() // parameterless constructor
{
// construction code
}
public MyClass(int number) // constructor overload with an int parameter
{
// construction code
}
If you supply any constructors, the compiler does not automatically supply a default one. The default constructor is created only if other constructors are not defined.
Note that it is possible to define constructors as private or protected so that they are invisible to code in unrelated classes, too:
public class MyNumber
{
private int _number;
private MyNumber(int number) => _number = number;
//…
}
An example in which this is useful is to create a singleton where an instance can be created only from a static factory method.
If the implementation of a constructor just consists of a single expression, the constructor can be implemented with an expression-bodied implementation:
public class Singleton
{
private static Singleton? s_instance;
private int _state;
private Singleton(int state) => _state = state;
public static Singleton Instance => s_instance ??= new Singleton(42);
}
You can also initialize multiple properties with a single expression. You can do this using the tuple syntax as shown in the following code snippet. With the Book constructor, two parameters are required. Putting these two variables in parentheses creates a tuple. This tuple is then deconstructed and put into the properties specified with the left side of the assignment operator. Behind the scenes, the compiler detects that tuples are not needed for the initialization and creates the same code whether you initialize the properties within curly brackets or with the tuple syntax shown:
public class Book
{
public Book(string title, string publisher) =>
(Title, Publisher) = (title, publisher);
public string Title { get; }
public string Publisher { get; }
}
When you're creating multiple constructors in a class, you shouldn't duplicate the implementation. Instead, one constructor can invoke another one from a constructor initializer.
When several constructors initialize the same fields, it is tidier to place all the code in one location. C# has a special syntax known as a constructor initializer to enable this:
class Car
{
private string _description;
private uint _nWheels;
public Car(string description, uint nWheels)
{
_description = description;
_nWheels = nWheels;
}
public Car(string description): this(description, 4)
{
}
}
In this context, the this keyword simply causes the constructor with the matching parameters to be called. Note that any constructor initializer is executed before the body of the constructor.
Static members of a class can be used before any instance of this class is created (if any instance is created at all). To initialize static members, you can create a static constructor. The static constructor has the same name as the class (similar to an instance constructor), but the static modifier is applied. This constructor cannot have an access modifier applied because it isn't invoked from the code using the class. This constructor is automatically invoked before any other member of this class is called or any instance is created:
class MyClass
{
static MyClass()
{
// initialization code
}
//…
}
The .NET runtime makes no guarantees about when a static constructor will be executed, so you should not place any code in it that relies on it being executed at a particular time (for example, when an assembly is loaded). Nor is it possible to predict in what order static constructors of different classes will execute. However, what is guaranteed is that the static constructor will run at most once, and it will be invoked before your code makes any reference to the class. In C#, the static constructor is usually executed immediately before the first call to any member of the class.
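The "at most once" guarantee can be observed with a small sketch. The Settings class and its members are hypothetical names chosen for this illustration; the counter exists only to show how often the static constructor ran:

```csharp
using System;

// Accessing a static member triggers the static constructor (if it has not run yet).
string first = Settings.Name;
string second = Settings.Name;

Console.WriteLine(first);               // sample
Console.WriteLine(second);              // sample
Console.WriteLine(Settings.InitCount);  // 1

// Hypothetical class for illustration only.
static class Settings
{
    public static readonly string Name;
    public static int InitCount;

    static Settings()
    {
        InitCount++;      // incremented a single time, no matter how often members are used
        Name = "sample";
    }
}
```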
Methods with a public access modifier can be invoked from outside of the class. Methods with a private access modifier can be invoked from anywhere within the class (from other methods, property accessors, constructors, and so on). To restrict this further, a local function can be invoked only from within the method where the local function is declared. The local function has the scope of the method and cannot be invoked from somewhere else.
Within the method IntroLocalFunctions, the local function Add is defined. Parameters and return types are implemented in the same way as a normal method. Similarly to a normal method, a local function can be implemented by using curly brackets or with an expression-bodied implementation as shown in the following code. Since C# 8, the local function can have the static modifier associated if the implementation doesn't access instance members defined with the class or local variables of the method. With the static modifier, the compiler makes sure this does not happen and can optimize the generated code. The local function is invoked in the method itself; it cannot be invoked anywhere else in the class. Whether the local function is declared before or after its use is just a matter of taste (code file MethodSample/LocalFunctionsSample.cs):
public static void IntroLocalFunctions()
{
static int Add(int x, int y) => x + y;
int result = Add(3, 7);
Console.WriteLine($"called the local function with this result: {result}");
}
With the next code snippet, the local function Add is declared without the static modifier. In the implementation, this function not only uses the variables specified with the arguments of the function but also variable z, which is specified in the outer scope of the local function, within the scope of the method. When accessing the variable outside of its scope (known as closure), the compiler creates a class where the data used within this function is passed in a constructor. Here, the local function needs to be declared after the variables used within the local function. That's why the local function is put at the end of the method LocalFunctionWithClosure:
public static void LocalFunctionWithClosure()
{
int z = 3;
int result = Add(1, 2);
Console.WriteLine($"called the local function with this result: {result}");
int Add(int x, int y) => x + y + z;
}
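A closure captures the variable itself, not a snapshot of its value. The following sketch, which uses top-level statements rather than the book's LocalFunctionWithClosure method, shows that a later change to z is visible to the local function:

```csharp
using System;

int z = 3;

int first = Add(1, 2);   // uses z == 3
z = 10;
int second = Add(1, 2);  // the same local function now sees z == 10

Console.WriteLine(first);   // 6
Console.WriteLine(second);  // 13

// Not static: the function reads the captured variable z.
int Add(int x, int y) => x + y + z;
```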
If you need implementations of methods that support multiple types, you can implement generic methods. The method Swap<T> defines T as a generic type that is used for two arguments and a local variable temp (code file MethodSample/GenericMethods.cs):
class GenericMethods
{
public static void Swap<T>(ref T x, ref T y)
{
T temp;
temp = x;
x = y;
y = temp;
}
}
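Calling the generic method could look like the following sketch. The variable names are chosen for illustration; the type argument can be inferred from the arguments or specified explicitly:

```csharp
using System;

int a = 1, b = 2;
GenericMethods.Swap(ref a, ref b);            // T is inferred as int

string s1 = "first", s2 = "second";
GenericMethods.Swap<string>(ref s1, ref s2);  // T is specified explicitly

Console.WriteLine($"{a} {b}");    // 2 1
Console.WriteLine($"{s1} {s2}");  // second first

class GenericMethods
{
    public static void Swap<T>(ref T x, ref T y)
    {
        T temp = x;
        x = y;
        y = temp;
    }
}
```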
With extension methods, you can create methods that extend other types.
The following code snippet defines the method GetWordCount that is used to extend the string type. An extension method is not defined by the name of the class but instead by using the this modifier with the parameter.
GetWordCount extends the string type because the parameter with the this modifier (which needs to be the first parameter) is of type string. Extension methods need to be static and declared in a static class (code file ExtensionMethods/StringExtensions.cs):
public static class StringExtensions
{
public static int GetWordCount(this string s) => s.Split().Length;
}
To use this extension method, the namespace of the extension class needs to be imported; then the method can be called in the same way as an instance method (code file ExtensionMethods/Program.cs):
string fox = "the quick brown fox jumped over the lazy dogs";
int wordCount = fox.GetWordCount();
Console.WriteLine($"{wordCount} words");
Console.ReadLine();
It might look like extension methods break object-oriented rules in regard to inheritance and encapsulation because methods can be added to an existing type without inheriting from it and without changing the type. However, you can only access public members. Extension methods are really just “syntax sugar” because the compiler changes the invocation of the method to call a static method that's passing the instance as the parameter, as shown here:
int wordCount = StringExtensions.GetWordCount(fox);
Why would you create extension methods instead of calling static methods? The code can become a lot easier to read. Just check into the extension methods implemented for LINQ (see Chapter 9) or the extension methods used to configure configuration and logging providers (see Chapter 15, “Dependency Injection and Configuration”).
Chapter 2 discusses the var keyword in reference to implicitly typed variables. When used with the new keyword, you can create anonymous types. An anonymous type is simply a nameless class that inherits from object. The definition of the class is inferred from the initializer, just as with implicitly typed variables.
For example, if you need an object that contains a person's first, middle, and last name, the declaration would look like this:
var captain = new
{
FirstName = "James",
MiddleName = "Tiberius",
LastName = "Kirk"
};
This would produce an object with FirstName, MiddleName, and LastName read-only properties. If you were to create another object that looked like this:
var doctor = new
{
FirstName = "Leonard",
MiddleName = string.Empty,
LastName = "McCoy"
};
then the types of captain and doctor are the same. You could set captain = doctor, for example. This is possible only if all the property names and types match and appear in the same order.
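The following sketch illustrates the shared type with a shortened, two-property variant of the sample. It also relies on the Equals and ToString overrides that the compiler generates for anonymous types:

```csharp
using System;

var captain = new { FirstName = "James", LastName = "Kirk" };
var doctor = new { FirstName = "Leonard", LastName = "McCoy" };
var sameValues = new { FirstName = "James", LastName = "Kirk" };

// Equals compares the property values, not the references.
bool equalValues = captain.Equals(sameValues);
bool sameReference = object.ReferenceEquals(captain, sameValues);

Console.WriteLine(equalValues);    // True
Console.WriteLine(sameReference);  // False
Console.WriteLine(captain);        // { FirstName = James, LastName = Kirk }

// The assignment compiles because both variables have the same anonymous type.
captain = doctor;
Console.WriteLine(captain.LastName);  // McCoy
```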
The names for the members of anonymous types can be inferred if the values that are being set come from another object. This way, the initializer can be abbreviated. If you already have a class that contains the properties FirstName, MiddleName, and LastName and you have an instance of that class with the instance name person, then the captain object could be initialized like this:
var captain = new
{
person.FirstName,
person.MiddleName,
person.LastName
};
The property names from the person object are inferred in the new object named captain, so the object named captain has FirstName, MiddleName, and LastName properties.
The actual type name of anonymous types is unknown, which is where the name comes from. The compiler “makes up” a name for the type, but only the compiler is ever able to make use of it. Therefore, you can't and shouldn't plan on using any type reflection on the new objects because you will not get consistent results.
So far in this chapter, you've seen that records are reference types that support value semantics. This type allows reducing the code you need to write because the compiler automatically implements comparing records by value and gives some more features, which are explained in this section.
A main use case for records is to create immutable types (although you can also create mutable types with records). An immutable type just contains members where the state of the type cannot be changed. You can initialize such a type in a constructor or with an object initializer, but you can't change any values afterward.
Immutable types are useful with multithreading. When you're using multiple threads to access an immutable object, you don't need to worry about synchronization because the values cannot change.
An example of an immutable type is the String class. This class does not define any member that is allowed to change its content. Methods such as ToUpper (which changes the string to uppercase) always return a new string, and the original string remains unchanged.
Records come in two kinds: nominal and positional records. A nominal record looks like a class that uses the record keyword instead of the class keyword, as shown with the type Book1. Here, init-only set accessors are used to forbid state changes after an instance has been created (code file RecordsSample/Program.cs):
public record Book1
{
public string Title { get; init; } = string.Empty;
public string Publisher { get; init; } = string.Empty;
}
You can add constructors and all the other members you learned about in this chapter. The compiler just creates a class with the record syntax. What's different from classes is that the compiler creates some more functionality inside this class. The compiler overrides the GetHashCode and ToString methods of the base class object, creates methods and operator overloads to compare different values for equality, and creates methods to clone existing objects and create new ones where object initializers can be used to change some property values.
The second way to implement a record is to use the positional record syntax. With this syntax, parentheses are used after the name of the record to specify the members. This syntax has the name primary constructor. The compiler creates a class from this code as well, with properties that have init-only set accessors for the parameters of the primary constructor and a constructor with the same parameters to initialize the properties (code file RecordsSample/Program.cs):
public record Book2(string Title, string Publisher);
You can use curly brackets to add what you need to the already existing implementation—for example, by adding overloaded constructors, methods, or any other members you've seen earlier in this chapter:
public record Book2(string Title, string Publisher)
{
// add your members, overloads
}
As the compiler creates a constructor with parameters, you can instantiate an object as you're used to, by passing the values to the constructor (code file RecordsSample/Program.cs):
Book2 b2 = new("Professional C#", "Wrox Press");
Console.WriteLine(b2);
Because the compiler creates a ToString method that is implicitly invoked by passing the variable to the WriteLine method, this is what's shown on the console: the name of the class followed by the property names with their values in curly brackets:
Book2 { Title = Professional C#, Publisher = Wrox Press }
With positional records, the compiler creates the same members as with nominal records and adds methods for deconstruction. Deconstruction is explained later in this chapter in the section “Deconstruction.”
With classes, the default implementation for equality is to compare the reference. Two new objects of the same type that are initialized to the same values are different because they reference different objects in the heap. This is different with records. With the equality implementation of records, two records are equal if their property values are the same.
In the following code snippet, two records that contain the same values are created. The object.ReferenceEquals method returns false because these are two different references. Using the equality operator == returns true because this operator is implemented with the record type (code file RecordsSample/Program.cs):
Book1 book1a = new() { Title = "Professional C#", Publisher = "Wrox Press" };
Book1 book1b = new() { Title = "Professional C#", Publisher = "Wrox Press" };
if (!object.ReferenceEquals(book1a, book1b))
Console.WriteLine("Two different references for equal records");
if (book1a == book1b)
Console.WriteLine("Both records have the same values");
The record type implements the IEquatable<T> interface with the Equals method, as well as the equality == and the inequality != operators.
Records make it easy to create immutable types, and they also come with a new feature for easily creating new record instances from existing ones. The .NET Compiler Platform (also known by the name Roslyn) is built with immutable objects and many With methods to create new objects from existing ones. The C# 9 enhancement of with expressions allows a lot of simplification for such code. The code created with the record syntax includes a copy constructor and a Clone method with a hidden name where all the values of the existing object are copied to a new instance that's returned from this method. The with expression makes use of this Clone method, and with the init-only set accessors, you can use object initialization to set the values that should be different:
var aNewBook = book1a with { Title = "Professional C# and .NET - 2024" };
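The following sketch shows that a with expression leaves the original record untouched and that the copy takes part in value-based equality. It uses the positional record Book2 from earlier; the variable names are chosen for illustration:

```csharp
using System;

Book2 original = new("Professional C#", "Wrox Press");
Book2 changed = original with { Title = "Professional C# and .NET" };

// The original keeps its values; only the copy has the new title.
Console.WriteLine(original.Title);     // Professional C#
Console.WriteLine(changed.Publisher);  // Wrox Press

// Records compare by value.
bool stillEqual = original == changed;
bool equalToNew = changed == new Book2("Professional C# and .NET", "Wrox Press");
Console.WriteLine(stillEqual);  // False
Console.WriteLine(equalToNew);  // True

public record Book2(string Title, string Publisher);
```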
So far, you have seen how classes and records offer a great way to encapsulate objects in your program. You have also seen how they are stored on the heap in a way that gives you much more flexibility in data lifetime but with a slight cost in performance. Objects stored in the heap require work from the garbage collector to remove the memory of the objects that are no longer needed. To reduce the work needed by the garbage collector, you can use the stack for smaller objects.
Chapter 2 discusses predefined value types such as int and double, which are represented as a struct type. You can create such structs on your own.
Just by using the struct keyword instead of the class keyword, the type is by default stored in the stack instead of the heap.
The following code snippet defines a struct called Dimensions, which simply stores the length and width of an item. Suppose you're writing a furniture-arranging program that enables users to experiment with rearranging their furniture on the computer, and you want to store the dimensions of each item of furniture. All you have is two numbers, which you'll find convenient to treat as a pair rather than individually. There is no need for a lot of methods, or for you to be able to inherit from the class, and you certainly don't want to have the .NET runtime go to the trouble of bringing in the heap, with all the performance implications, just to store two double values (code file StructsSample/Dimensions.cs):
public readonly struct Dimensions
{
public Dimensions(double length, double width)
{
Length = length;
Width = width;
}
public double Length { get; }
public double Width { get; }
//…
}
Defining members for structs is done in the same way as defining them for classes and records. You've already seen a constructor with the Dimensions struct. The following code demonstrates adding the property Diagonal invoking the Sqrt method of the Math class (code file StructsSample/Dimensions.cs):
public readonly struct Dimensions
{
//…
public double Diagonal => Math.Sqrt(Length * Length + Width * Width);
}
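Putting both parts of the Dimensions struct together, a minimal usage sketch could look like this. The variable names are chosen for illustration:

```csharp
using System;

var size = new Dimensions(3, 4);
Dimensions copy = size;      // assignment copies the two double values
Dimensions empty = default;  // every struct has a default value with zeroed fields

Console.WriteLine(copy.Diagonal);      // 5
Console.WriteLine(size.Equals(copy));  // True
Console.WriteLine(empty.Length);       // 0

public readonly struct Dimensions
{
    public Dimensions(double length, double width)
    {
        Length = length;
        Width = width;
    }

    public double Length { get; }
    public double Width { get; }
    public double Diagonal => Math.Sqrt(Length * Length + Width * Width);
}
```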
Structs make use of the previously discussed pass by value semantics, where values are copied. This is not the only difference from classes and records; for example, structs do not support inheritance, although they can implement interfaces.
An enumeration is a value type that contains a list of named constants, such as the Color type shown here. The enumeration type is defined by using the enum keyword:
public enum Color
{
Red,
Green,
Blue
}
You can declare variables of enum types, such as the variable c1, and assign a value from the enumeration by setting one of the named constants prefixed with the name of the enum type (code file EnumSample/Program.cs):
void ColorSamples()
{
Color c1 = Color.Red;
Console.WriteLine(c1);
//…
}
When you run the program, the console output shows Red, which is the constant value of the enumeration.
By default, the type behind the enum type is an int. You can change the underlying type to other integral types (byte, short, int, long with signed and unsigned variants). The values of the named constants are incremental values starting with 0, but you can change them to other values (code file EnumSample/Color.cs):
public enum Color : short
{
Red = 1,
Green = 2,
Blue = 3
}
You can change a number to an enumeration value and back using casts.
Color c2 = (Color)2;
short number = (short)c2;
You can also use an enum type to assign multiple options to a variable and not just one of the enum constants. To make the values combinable without overlapping, the number assigned to each constant should set a single, different bit.
The enum type DaysOfWeek defines different values for every day. Setting different bits can be done easily using hexadecimal values that are assigned using the 0x prefix. The Flags attribute indicates that the values can be combined as bit flags, and it changes the string representation of the values; for example, setting the value 3 to a variable of DaysOfWeek results in Monday, Tuesday when you use the Flags attribute (code file EnumSample/DaysOfWeek.cs):
[Flags]
public enum DaysOfWeek
{
Monday = 0x1,
Tuesday = 0x2,
Wednesday = 0x4,
Thursday = 0x8,
Friday = 0x10,
Saturday = 0x20,
Sunday = 0x40
}
With such an enum declaration, you can assign a variable multiple values using the logical OR operator (code file EnumSample/Program.cs):
DaysOfWeek mondayAndWednesday = DaysOfWeek.Monday | DaysOfWeek.Wednesday;
Console.WriteLine(mondayAndWednesday);
When you run the program, the output is a string representation of the days:
Monday, Wednesday
When you set different bits, you also can combine single bits to cover multiple values, such as Weekend with a value of 0x60. The value 0x60 is created by combining Saturday and Sunday with the logical OR operator. Workday is set to 0x1f to combine all days from Monday to Friday, and AllWeek to combine Workday and Weekend with the logical OR operator (code file EnumSample/DaysOfWeek.cs):
[Flags]
public enum DaysOfWeek
{
Monday = 0x1,
Tuesday = 0x2,
Wednesday = 0x4,
Thursday = 0x8,
Friday = 0x10,
Saturday = 0x20,
Sunday = 0x40,
Weekend = Saturday | Sunday,
Workday = 0x1f,
AllWeek = Workday | Weekend
}
With this in place, you can assign DaysOfWeek.Weekend directly to a variable, but assigning the separate values DaysOfWeek.Saturday and DaysOfWeek.Sunday combined with the logical OR operator gives the same result. The output shown is the string representation of Weekend:
DaysOfWeek weekend = DaysOfWeek.Saturday | DaysOfWeek.Sunday;
Console.WriteLine(weekend);
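Besides combining flags, you often need to test or remove a single flag. The following sketch repeats the DaysOfWeek declaration and uses the Enum.HasFlag method together with the & and ~ operators; the variable names are chosen for illustration:

```csharp
using System;

DaysOfWeek weekend = DaysOfWeek.Saturday | DaysOfWeek.Sunday;
Console.WriteLine(weekend);  // Weekend

// Test whether a flag is set, either with HasFlag or with the & operator.
bool includesSunday = weekend.HasFlag(DaysOfWeek.Sunday);
bool includesMonday = (weekend & DaysOfWeek.Monday) != 0;

// Clear a flag by combining & with the complement operator ~.
DaysOfWeek withoutSunday = weekend & ~DaysOfWeek.Sunday;

Console.WriteLine(includesSunday);  // True
Console.WriteLine(includesMonday);  // False
Console.WriteLine(withoutSunday);   // Saturday

[Flags]
public enum DaysOfWeek
{
    Monday = 0x1,
    Tuesday = 0x2,
    Wednesday = 0x4,
    Thursday = 0x8,
    Friday = 0x10,
    Saturday = 0x20,
    Sunday = 0x40,
    Weekend = Saturday | Sunday,
    Workday = 0x1f,
    AllWeek = Workday | Weekend
}
```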
When you're working with enumerations, the class Enum is sometimes a big help for dynamically getting some information about enum types. Enum offers methods to parse strings to get the corresponding enumeration constant and to get all the names and values of an enum type.
The following code snippet uses a string to get the corresponding Color value using Enum.TryParse (code file EnumSample/Program.cs):
if (Enum.TryParse<Color>("Red", out Color red))
{
Console.WriteLine($"successfully parsed {red}");
}
The Enum.GetNames method returns a string array of all the names of the enumeration:
foreach (var color in Enum.GetNames(typeof(Color)))
{
Console.WriteLine(color);
}
When you run the application, this is the output:
Red
Green
Blue
To get all the values of the enumeration, you can use the method Enum.GetValues. To get the integral value, it needs to be cast to the underlying type of the enumeration, which is done by the foreach statement:
foreach (short color in Enum.GetValues(typeof(Color)))
{
Console.WriteLine(color);
}
A value type is passed by value; thus, the value of a variable is copied when it is assigned to another variable or passed to a method. There's a way around that. If you use the ref keyword, a value type is passed by reference. In this section, you learn about the parameter and return type modifiers ref, in, and out.
The following code snippet defines the method ChangeAValueType, where an int is passed by reference. Remember, the int is declared as struct, so this behavior is valid with custom structs as well. By default, the int would be passed by value. Because of the ref modifier, the int is passed by reference (using an address of the int variable). Within the implementation, now the variable named x references the same data on the stack as the variable a does. Changing the value of x also changes the value of a, so after the invocation, the variable a contains the value 2 (code file RefInOutSample/Program.cs):
int a = 1;
ChangeAValueType(ref a);
Console.WriteLine($"the value of a changed to {a}");
void ChangeAValueType(ref int x)
{
x = 2;
}
Passing a value type by reference requires the ref keyword with the method declaration and when calling the method. This is important information for the caller: it signals that the method receiving this value type can change the content.
Now you might wonder if it could be useful to pass a reference type by using the ref keyword. Passing a reference allows the method to change the content anyway. Indeed, it can be useful, as the following code snippet demonstrates. The method ChangingAReferenceByRef specifies the ref modifier with the parameter of type SomeData, which is a class. In the implementation, first the value of the Value property is changed to 2. After this, a new instance is created, which references an object with a Value of 3. If you remove the ref keyword from the method declaration as well as from the invocation of this method, data1.Value has the value 2 after the invocation. Without the ref keyword, the data1 variable and, at the beginning of the method, the data variable both reference the same object on the heap. After creating a new object, the data variable references a new object on the heap, which then contains the value 3, while data1 still references the original object. With the ref keyword used as in the sample, the data variable references the data1 variable; it's a pointer to a pointer. This way, a new instance can be created within the ChangingAReferenceByRef method, and the variable data1 references this new object instead of the old one:
SomeData data1 = new() { Value = 1 };
ChangingAReferenceByRef(ref data1);
Console.WriteLine($"the new value of data1.Value is: {data1.Value}");
void ChangingAReferenceByRef(ref SomeData data)
{
data.Value = 2;
data = new SomeData { Value = 3 };
}
class SomeData
{
public int Value { get; set; }
}
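To make the difference visible, the following sketch places both variants side by side. The method names ChangeWithoutRef and ChangeWithRef are hypothetical and chosen for this comparison only:

```csharp
using System;

SomeData data1 = new() { Value = 1 };

ChangeWithoutRef(data1);
int afterPlainCall = data1.Value;  // 2: the new object created in the method is lost

ChangeWithRef(ref data1);
int afterRefCall = data1.Value;    // 3: data1 now references the new object

Console.WriteLine(afterPlainCall);  // 2
Console.WriteLine(afterRefCall);    // 3

// Receives a copy of the reference: reassigning data affects only the local copy.
void ChangeWithoutRef(SomeData data)
{
    data.Value = 2;
    data = new SomeData { Value = 3 };
}

// Receives a reference to the caller's variable: reassigning data changes data1.
void ChangeWithRef(ref SomeData data)
{
    data.Value = 2;
    data = new SomeData { Value = 3 };
}

class SomeData
{
    public int Value { get; set; }
}
```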
If you want to avoid the overhead of copying a value type when passing it to a method but don't want to change the value within the method, you can use the in modifier.
For the next sample code, the SomeValue struct, which contains four int values, is defined (code file RefInOutSample/Program.cs):
struct SomeValue
{
public SomeValue(int value1, int value2, int value3, int value4)
{
Value1 = value1;
Value2 = value2;
Value3 = value3;
Value4 = value4;
}
public int Value1 { get; set; }
public int Value2 { get; set; }
public int Value3 { get; set; }
public int Value4 { get; set; }
}
If you declare a method where the SomeValue struct is passed as an argument, the four int values need to be copied on method invocation. When you use the ref keyword, you don't need a copy, and you can pass a reference. However, with the ref keyword, the caller might not want the called method to make any change. To guarantee that changes are not happening, you use the in modifier. With this modifier, a pass by reference is happening, but the compiler does not allow change to any value when the data variable is used. The data parameter is now a read-only variable:
void PassValueByReferenceReadonly(in SomeValue data)
{
// data.Value1 = 4; - you cannot change a value, it's a read-only variable!
}
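A method with an in parameter can still read the data. The following sketch uses a hypothetical Sum function and shows that the in keyword is optional at the call site:

```csharp
using System;

SomeValue value = new(1, 2, 3, 4);

int explicitIn = Sum(in value);  // the in keyword documents the pass by reference
int implicitIn = Sum(value);     // in is optional when calling the method

Console.WriteLine(explicitIn);  // 10
Console.WriteLine(implicitIn);  // 10

// Hypothetical helper: reads the values but cannot change them.
static int Sum(in SomeValue data) =>
    data.Value1 + data.Value2 + data.Value3 + data.Value4;

struct SomeValue
{
    public SomeValue(int value1, int value2, int value3, int value4)
    {
        Value1 = value1;
        Value2 = value2;
        Value3 = value3;
        Value4 = value4;
    }

    public int Value1 { get; set; }
    public int Value2 { get; set; }
    public int Value3 { get; set; }
    public int Value4 { get; set; }
}
```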
To avoid copying the value on return of a method, you can declare the return type with the ref keyword and use return ref. The Max method receives two SomeValue structs with the parameters and returns the larger of these two. Because the parameters use the ref modifier, the values are not copied when they are passed, as shown here:
ref SomeValue Max(ref SomeValue x, ref SomeValue y)
{
int sumx = x.Value1 + x.Value2 + x.Value3 + x.Value4;
int sumy = y.Value1 + y.Value2 + y.Value3 + y.Value4;
if (sumx > sumy)
{
return ref x;
}
else
{
return ref y;
}
}
Within the implementation of the Max method, you can replace the if/else statement with a conditional ref expression. With this, the ref keyword needs to be used with the expression to compare sumx and sumy. Based on the result, a ref to x or to y is written to a ref local, which is then returned:
ref SomeValue Max(ref SomeValue x, ref SomeValue y)
{
int sumx = x.Value1 + x.Value2 + x.Value3 + x.Value4;
int sumy = y.Value1 + y.Value2 + y.Value3 + y.Value4;
ref SomeValue result = ref (sumx > sumy) ? ref x : ref y;
return ref result;
}
Whether the returned value should be copied or a reference should be used is a decision from the caller. In the following code snippet, with the first invocation of the Max method, the result is copied to the bigger1 variable, although the method is declared to return a ref. There's not a compiler error with the first version (contrary to the ref parameters). You will not have any issues when the value is copied, other than the performance hit. With the second invocation, the ref keyword is used to invoke the method to get a ref return. With this invocation, the result needs to be written to a ref local. The third invocation writes the result into a ref readonly local. With the Max method, there's no change needed. The readonly used here is only to specify that the bigger3 variable will not be changed, and the compiler complains if properties are set to change its values:
SomeValue one = new SomeValue(1, 2, 3, 4);
SomeValue two = new SomeValue(5, 6, 7, 8);
SomeValue bigger1 = Max(ref one, ref two);
ref SomeValue bigger2 = ref Max(ref one, ref two);
ref readonly SomeValue bigger3 = ref Max(ref one, ref two);
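The difference between copying the result and keeping a ref local becomes visible as soon as the result is changed. In the following sketch, the variable names copy and alias are chosen for illustration, and Max is written as a local function with the same logic as above:

```csharp
using System;

SomeValue one = new(1, 2, 3, 4);
SomeValue two = new(5, 6, 7, 8);

SomeValue copy = Max(ref one, ref two);  // the returned value is copied
copy.Value1 = 100;
int afterCopyChange = two.Value1;        // still 5: only the copy changed

ref SomeValue alias = ref Max(ref one, ref two);  // alias refers to the variable two
alias.Value1 = 100;
int afterAliasChange = two.Value1;       // 100: changed through the ref local

Console.WriteLine(afterCopyChange);   // 5
Console.WriteLine(afterAliasChange);  // 100

static ref SomeValue Max(ref SomeValue x, ref SomeValue y)
{
    int sumx = x.Value1 + x.Value2 + x.Value3 + x.Value4;
    int sumy = y.Value1 + y.Value2 + y.Value3 + y.Value4;
    if (sumx > sumy)
    {
        return ref x;
    }
    return ref y;
}

struct SomeValue
{
    public SomeValue(int value1, int value2, int value3, int value4)
    {
        Value1 = value1;
        Value2 = value2;
        Value3 = value3;
        Value4 = value4;
    }

    public int Value1 { get; set; }
    public int Value2 { get; set; }
    public int Value3 { get; set; }
    public int Value4 { get; set; }
}
```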
The Max method doesn't change any of its inputs. This allows using the in keyword with the parameters as shown with the MaxReadonly method. However, here the declaration of the return must be changed to ref readonly. If this change were not required, the caller of this method would be allowed to change one of the inputs of the MaxReadonly method after receiving the result:
ref readonly SomeValue MaxReadonly(in SomeValue x, in SomeValue y)
{
int sumx = x.Value1 + x.Value2 + x.Value3 + x.Value4;
int sumy = y.Value1 + y.Value2 + y.Value3 + y.Value4;
return ref (sumx > sumy) ? ref x : ref y;
}
Now the caller is required to write the result to a ref readonly or to copy the result into a new local. With bigger5, readonly is not required because the original value received is copied:
ref readonly SomeValue bigger4 = ref MaxReadonly(in one, in two);
SomeValue bigger5 = MaxReadonly(in one, in two);
If a method should return multiple values, there are different options. One option is to create a custom type. Another option is to use the ref
keyword with parameters. Using the ref
keyword, the parameter needs to be initialized before invoking the method. With the ref
keyword, data is passed into and returned from the method. If the method should just return data, you can use the out
keyword.
The int.Parse
method expects a string
to be passed and returns an int
—if the parsing succeeds. If the string
cannot be parsed to an int
, an exception is thrown. To avoid such exceptions, you can instead use the
int.TryParse
method. This method returns a Boolean that indicates whether the parsing succeeded. The result of the parse operation is returned with an out
parameter.
This is the declaration of the TryParse
method with the int
type:
bool TryParse(string? s, out int result);
To invoke the TryParse
method, an int
is passed with the out
modifier. Using the out
modifier, the variable doesn't need to be declared before invoking the method and doesn't need to be initialized:
Console.Write("Please enter a number: ");
string? input = Console.ReadLine();
if (int.TryParse(input, out int x))
{
Console.WriteLine();
Console.WriteLine($"read an int: {x}");
}
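Because TryParse reports failure through its return value, it fits well into loops that have to cope with bad input. The following sketch uses a hypothetical, hard-coded list of inputs instead of Console.ReadLine so that it runs without user interaction; it sums the values that parse and counts the ones that don't (text, null, and a number too large for an int).

```csharp
using System;

string?[] inputs = { "42", "abc", null, "7", "2147483648" };
int sum = 0;
int rejected = 0;

foreach (string? input in inputs)
{
    // number is declared inline; it is assigned by TryParse in every case
    if (int.TryParse(input, out int number))
    {
        sum += number;
    }
    else
    {
        rejected++;   // no exception is thrown for invalid input
    }
}

Console.WriteLine($"sum: {sum}, rejected: {rejected}");
```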
With arrays, you can combine multiple objects of the same type into one object. When you're using classes, structs, and records, you can combine multiple objects into one object and add properties, methods, events, and all the different members of types. Tuples enable you to combine multiple objects of different types into one without the complexity of creating custom types.
To better understand some advantages of tuples, let's take a look at what a method can return. To return a result from a method that returns multiple results, you need to either create a custom type where you can combine the different result types or use the ref
or out
keywords with parameters. Using ref
and out
has an important restriction: you cannot use this with asynchronous methods. Creating custom types has its advantages, but in some cases, this is not needed. You have a simpler path with tuples and can return a tuple from a method. As of C# 7, tuples are integrated with the C# syntax.
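The restriction with asynchronous methods is a common reason to reach for tuples. The following sketch assumes a hypothetical DivideAsync method; Task.Delay only stands in for real asynchronous work.

```csharp
using System;
using System.Threading.Tasks;

(int quotient, int remainder) = await DivideAsync(17, 5);
Console.WriteLine($"quotient: {quotient}, remainder: {remainder}");

// An async method cannot declare ref or out parameters,
// so both results are returned together in one tuple.
static async Task<(int Quotient, int Remainder)> DivideAsync(int dividend, int divisor)
{
    await Task.Delay(10);   // placeholder for real asynchronous work
    return (dividend / divisor, dividend % divisor);
}
```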
Declaring and Initializing Tuples
A tuple can be declared using parentheses and initialized using a tuple literal that is created with parentheses as well. In the following code snippet, on the left side, a tuple variable tuple1
that contains a string
, an int
, and a Book
is declared. On the right side, a tuple literal is used to create a tuple with the string magic
, the number 42
, and a Book
object initialized using the primary constructor of the Book
record. The tuple can be accessed using the variable tuple1
with the members declared in the parentheses (AString
, Number
, and Book
in this example; code file TuplesSample/Program.cs
):
void IntroTuples()
{
(string AString, int Number, Book Book) tuple1 =
("magic", 42, new Book("Professional C#", "Wrox Press"));
Console.WriteLine($"a string: {tuple1.AString}, " +
$"number: {tuple1.Number}, " +
$"book: {tuple1.Book}");
//…
}
public record Book(string Title, string Publisher);
When you run the application (the top-level statements invoke IntroTuples
), the output shows the values of the tuple:
a string: magic, number: 42, book: Book { Title = Professional C#, Publisher = Wrox Press }
The tuple literal also can be assigned to a tuple variable without declaring its members. This way the members of the tuple are accessed using the member names of the ValueTuple
struct: Item1
, Item2
, and Item3
:
var tuple2 = ("magic", 42, new Book("Professional C#", "Wrox Press"));
Console.WriteLine($"a string: {tuple2.Item1}, number: {tuple2.Item2}, " +
$"book: {tuple2.Item3}");
You can assign names to the tuple fields in the tuple literal by writing the name followed by a colon, which is the same syntax as with named arguments:
var tuple3 = (AString: "magic", Number: 42,
Book: new Book("Professional C#", "Wrox Press"));
With all this, names are just a convenience. You can assign one tuple to another one when the types match; the names do not matter:
(string S, int N, Book B) tuple4 = tuple3;
The name of the tuple members can also be inferred from the source. With the variable tuple5
, the second member is a string with the title of the book. A name for this member is not assigned, but because the property has the name Title
, Title
is automatically taken for the tuple member name:
Book book = new("Professional C#", "Wrox Press");
var tuple5 = (ANumber: 42, book.Title);
Console.WriteLine(tuple5.Title);
Tuple Deconstruction
Tuples can be deconstructed into variables. To do this, you just need to remove the tuple variable from the previous code sample and define variable names in parentheses. The variables that contain the values of the tuple parts can then be directly accessed. In case some variables are not needed, you can use discards. Discards are C# placeholder variables with the name _. Discards are meant to just ignore the results, as shown with the second deconstruction in the following code snippet (code file TuplesSample/Program.cs
):
void TuplesDeconstruction()
{
var tuple1 = (AString: "magic",
Number: 42, Book: new Book("Professional C#", "Wrox Press"));
(string aString, int number, Book book) = tuple1;
Console.WriteLine($"a string: {aString}, number: {number}, book: {book}");
(_, _, var book1) = tuple1;
Console.WriteLine(book1.Title);
}
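Deconstruction can also assign to variables that already exist, which allows swapping two values without a temporary variable. This is a small sketch with hypothetical variable names; it also shows var placed in front of the parentheses, which applies to every part of the deconstruction.

```csharp
using System;

int left = 1;
int right = 2;
// A tuple is created on the right side and deconstructed into the existing variables.
(left, right) = (right, left);
Console.WriteLine($"left: {left}, right: {right}");

var point = (X: 3, Y: 4, Label: "sample");
var (x, y, _) = point;   // var applies to x and y; the discard ignores Label
Console.WriteLine($"x: {x}, y: {y}");
```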
Returning Tuples
Let's get into a more useful example: a method returning a tuple. The method Divide
from the following code snippet receives two parameters and returns a tuple consisting of two int
values. Tuple results are created by putting the values to return within parentheses (code file Tuples/Program.cs
):
static (int result, int remainder) Divide(int dividend, int divisor)
{
int result = dividend / divisor;
int remainder = dividend % divisor;
return (result, remainder);
}
The result is deconstructed into the result
and remainder
variables:
private static void ReturningTuples()
{
(int result, int remainder) = Divide(7, 2);
Console.WriteLine($"7 / 2 - result: {result}, remainder: {remainder}");
}
When you're using the C# tuple syntax, the C# compiler creates ValueTuple
structures behind the scenes. .NET defines seven generic ValueTuple
structures for one to seven generic parameters and another one where the eighth parameter can be another tuple. Using a tuple literal results in the creation of a ValueTuple
. The tuple structure defines public fields named Item1
, Item2
, Item3
, and so on to access all the items.
For the names of the elements, the compiler uses the attribute TupleElementNames
to store the custom names of the tuple members. This information is read by the compiler to access the correct members.
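You can verify the relationship between the tuple syntax and the ValueTuple struct with a few lines of code. This is a minimal sketch with hypothetical variable names; it shows that a named tuple can be assigned to a ValueTuple variable, and that the names are only aliases for the Item1 and Item2 fields.

```csharp
using System;

(string Name, int Count) named = ("magic", 42);
ValueTuple<string, int> plain = named;   // same underlying type; names exist only at compile time

Console.WriteLine(named.Name == named.Item1);   // the name is an alias for the field Item1
Console.WriteLine(plain.Item2);
Console.WriteLine(named.GetType() == typeof(ValueTuple<string, int>));
Console.WriteLine(named == plain);              // tuple equality compares element by element
```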
You've already seen deconstruction with tuples—writing tuples into simple variables. You also can do deconstruction with any custom type: deconstructing a class or struct into its parts.
For example, you can deconstruct the previously shown Person
class into first name, last name, and age. In the sample code, the age returned from the deconstruction is ignored using a discard (code file Classes/Program.cs
):
//…
(var first, var last, _) = katharina;
Console.WriteLine($"{first} {last}");
All you need to do is create a Deconstruct
method (also known by the name deconstructor) that fills the separate parts into parameters with the out
modifier (code file Classes/Person.cs
):
public class Person
{
//…
public void Deconstruct(out string firstName, out string lastName,
out int age)
{
firstName = FirstName;
lastName = LastName;
age = Age;
}
}
Deconstruction is implemented with the method name Deconstruct
. This method is always of type void
and returns the parts with multiple out
parameters. Instead of creating a member of a class, for deconstruction you can also create an extension method as shown here:
public static class PersonExtensions
{
public static void Deconstruct(this Person person, out string firstName,
out string lastName, out int age)
{
firstName = person.FirstName;
lastName = person.LastName;
age = person.Age;
}
}
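To try deconstruction end to end, the following self-contained sketch uses a simplified Person class. The class shape, the last name, and the age are assumptions made for this illustration and are not taken from the book's sample. The sketch also declares a second Deconstruct overload with two parameters; the compiler selects the overload by the number of variables on the left side.

```csharp
using System;

Person katharina = new("Katharina", "Nagel", 29);

(var first, var last, _) = katharina;   // uses the Deconstruct method with three out parameters
Console.WriteLine($"{first} {last}");

var (firstOnly, lastOnly) = katharina;  // uses the overload with two out parameters
Console.WriteLine($"{firstOnly} {lastOnly}");

// Hypothetical minimal Person type for this sketch.
public class Person
{
    public Person(string firstName, string lastName, int age) =>
        (FirstName, LastName, Age) = (firstName, lastName, age);

    public string FirstName { get; }
    public string LastName { get; }
    public int Age { get; }

    public void Deconstruct(out string firstName, out string lastName, out int age)
    {
        firstName = FirstName;
        lastName = LastName;
        age = Age;
    }

    public void Deconstruct(out string firstName, out string lastName)
    {
        firstName = FirstName;
        lastName = LastName;
    }
}
```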
Chapter 2 covers basic functionality with pattern matching using the is
operator and the switch
statement. This can now be extended with some more features on pattern matching, such as using tuples and property patterns.
The previous chapter included a sample of simple pattern matching with traffic lights. Now let's extend this sample with not just a simple flow from red to green to amber to red… but to change to different states after amber depending on what the previous light was. Pattern matching can be based on tuple values.
The method NextLightUsingTuples
receives enum values for the current and previous traffic light in two parameters. The two parameters are combined to a tuple with (current, previous)
to define the switch expression based on this tuple. With the switch expression, tuple patterns are used. The first case matches when the current light has the value Red
. The value of the previous light is ignored using a discard. The NextLightUsingTuples
method is declared to return a tuple with Current
and Previous
members. In the first match, a tuple that matches this return type is returned with (Amber, current)
to specify the new value Amber
for the current light. In all the cases, the previous light is set from the current light that was received. When the current light is Amber
, now the tuple pattern results in different outcomes depending on the previous light. If the previous light was Red
, the new light returned is Green
, and vice versa (code file PatternMatchingSample/Program.cs
):
(TrafficLight Current, TrafficLight Previous)
NextLightUsingTuples(TrafficLight current, TrafficLight previous) =>
(current, previous) switch
{
(Red, _) => (Amber, current),
(Amber, Red) => (Green, current),
(Green, _) => (Amber, current),
(Amber, Green) => (Red, current),
_ => throw new InvalidOperationException()
};
With the following code snippet, the method NextLightUsingTuples
is invoked in a for
loop. The return value is deconstructed into currentLight
and previousLight
variables to write the current light information to the console and to invoke the NextLightUsingTuples
method in the next iteration:
var previousLight = Red;
var currentLight = Red;
for (int i = 0; i < 10; i++)
{
(currentLight, previousLight) = NextLightUsingTuples(currentLight,
previousLight);
Console.Write($"{currentLight} - ");
await Task.Delay(1000);
}
Console.WriteLine();
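Tuple patterns are not limited to enum values. As a second, unrelated illustration, the following sketch builds a tuple from two remainders and matches it with constant patterns and discards. The Classify function is a hypothetical helper; the arms are evaluated from top to bottom, so the most specific pattern comes first.

```csharp
using System;

for (int n = 1; n <= 15; n++)
{
    Console.WriteLine(Classify(n));
}

// Hypothetical helper: matches the tuple (n % 3, n % 5) arm by arm.
static string Classify(int n) =>
    (n % 3, n % 5) switch
    {
        (0, 0) => "FizzBuzz",   // divisible by 3 and by 5
        (0, _) => "Fizz",       // divisible by 3; the second value is ignored
        (_, 0) => "Buzz",       // divisible by 5; the first value is ignored
        _ => n.ToString()       // everything else
    };
```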
Let's extend the traffic light sample again. When you're using tuples, additional values and types can be added to extend the functionality. However, at some point this doesn't help with readability, and using classes or records is helpful.
One extension to the traffic light is having different timings for the different light phases. Another extension is used in some countries: before the light changes from the green to the amber light, another phase is introduced: the green light blinks three times. To keep up with the different states, the record TrafficLightState
is introduced (code file PatternMatchingSample/Program.cs
):
public record TrafficLightState(TrafficLight CurrentLight,
TrafficLight PreviousLight, int Milliseconds, int BlinkCount = 0);
The enum type TrafficLight
is extended to include GreenBlink
and AmberBlink
:
public enum TrafficLight
{
Red,
Amber,
Green,
GreenBlink,
AmberBlink
}
The new method NextLightUsingRecords
receives a parameter of type TrafficLightState
with the current light state and returns a TrafficLightState
with the new state. In the implementation, a switch
expression is used again. This time, the cases are selected using the property pattern. If the property CurrentLight
of the variable trafficLightState
has the value AmberBlink
, a new TrafficLightState
with the current red light is returned. When the CurrentLight
is set to Amber
, the PreviousLight
property is verified as well. Depending on the PreviousLight
value, different records are returned. Another pattern is used in this scenario—the relational pattern that is new with C# 9. BlinkCount: < 3
references the BlinkCount
property and verifies whether the value is smaller than 3
. If this is the case, the returned TrafficLightState
is cloned from the previous state using the with
expression, and the BlinkCount
is incremented by 1
:
TrafficLightState NextLightUsingRecords(TrafficLightState trafficLightState)
=> trafficLightState switch
{
{ CurrentLight: AmberBlink } =>
new TrafficLightState(Red, trafficLightState.PreviousLight, 3000),
{ CurrentLight: Red } =>
new TrafficLightState(Amber, trafficLightState.CurrentLight, 200),
{ CurrentLight: Amber, PreviousLight: Red} =>
new TrafficLightState(Green, trafficLightState.CurrentLight, 2000),
{ CurrentLight: Green } =>
new TrafficLightState(GreenBlink, trafficLightState.CurrentLight,
100, 1),
{ CurrentLight: GreenBlink, BlinkCount: < 3 } =>
trafficLightState with
{ BlinkCount = trafficLightState.BlinkCount + 1 },
{ CurrentLight: GreenBlink } =>
new TrafficLightState(Amber, trafficLightState.CurrentLight, 200),
{ CurrentLight: Amber, PreviousLight: GreenBlink } =>
new TrafficLightState(Red, trafficLightState.CurrentLight, 3000),
_ => throw new InvalidOperationException()
};
The method NextLightUsingRecords
is invoked in a for
loop similar to the sample before. Now, an instance of TrafficLightState
is passed as an argument to the method NextLightUsingRecords
. The new value is received from this method, and the current state is shown on the console:
TrafficLightState currentLightState = new(AmberBlink, AmberBlink, 2000);
for (int i = 0; i < 20; i++)
{
currentLightState = NextLightUsingRecords(currentLightState);
Console.WriteLine($"{currentLightState.CurrentLight}, " +
$"{currentLightState.Milliseconds}");
await Task.Delay(currentLightState.Milliseconds);
}
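The combination of property patterns, relational patterns, and the with expression can be tried in a smaller setting as well. The following sketch uses a hypothetical Parcel record and made-up shipping costs; as with the traffic light, the order of the arms matters because the first matching arm wins.

```csharp
using System;

Parcel small = new("AT", 500, Express: false);
// with creates a copy of the record and changes only the listed properties
Parcel heavy = small with { WeightInGrams = 5000, Express = true };

Console.WriteLine(ShippingCost(small));
Console.WriteLine(ShippingCost(heavy));

// Hypothetical pricing rules, checked from top to bottom.
static int ShippingCost(Parcel parcel) => parcel switch
{
    { Express: true } => 25,                           // property pattern
    { Country: "AT", WeightInGrams: < 1000 } => 3,     // property and relational pattern
    { Country: "AT" } => 6,
    { WeightInGrams: < 1000 } => 9,
    _ => 15
};

// Hypothetical record used only for this illustration.
public record Parcel(string Country, int WeightInGrams, bool Express);
```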
The partial
keyword allows a type to span multiple files. Typically, a code generator produces part of a class, and splitting the class across multiple files keeps the generated code separate from yours. Let's assume you want to make some additions to a class that is automatically generated by a tool. If you edit the generated file and the tool reruns, your changes are lost. The partial
keyword is helpful for splitting the class into two files and making your changes to the file that is not defined by the code generator.
To use the partial
keyword, simply place partial
before class
, struct
, or interface
. In the following example, the class SampleClass
resides in two separate source files: SampleClassAutogenerated.cs
and SampleClass.cs
:
//SampleClassAutogenerated.cs
partial class SampleClass
{
public void MethodOne() { }
}
//SampleClass.cs
partial class SampleClass
{
public void MethodTwo() { }
}
When the project that contains the two source files is compiled, a single type called SampleClass
will be created with two methods: MethodOne
and MethodTwo
.
Nested partials are allowed as long as the partial
keyword precedes the class
keyword in the nested type. Attributes, XML comments, interfaces, generic-type parameter attributes, and members are combined when the partial types are compiled into the type. Given these two source files:
// SampleClassAutogenerated.cs
[CustomAttribute]
partial class SampleClass: SampleBaseClass, ISampleClass
{
public void MethodOne() { }
}
// SampleClass.cs
[AnotherAttribute]
partial class SampleClass: IOtherSampleClass
{
public void MethodTwo() { }
}
the equivalent source file would be as follows after the compile:
[CustomAttribute]
[AnotherAttribute]
partial class SampleClass: SampleBaseClass, ISampleClass, IOtherSampleClass
{
public void MethodOne() { }
public void MethodTwo() { }
}
Partial classes can contain partial methods. This is extremely useful if generated code should invoke methods that might not exist at all. The programmer extending the partial class can decide to create a custom implementation of the partial method or do nothing. The following code snippet contains a partial class with the method MethodOne
that invokes the method APartialMethod
. The method APartialMethod
is declared with the partial
keyword without an access modifier and with the return type void; thus, it does not need any implementation. If there's not an implementation, the compiler removes the invocation of this method:
//SampleClassAutogenerated.cs
partial class SampleClass
{
public void MethodOne()
{
APartialMethod();
}
partial void APartialMethod();
}
An implementation of the partial method can be done within any other part of the partial class, as shown in the following code snippet. With this method in place, the compiler creates code within MethodOne
to invoke this APartialMethod
declared here:
// SampleClass.cs
partial class SampleClass: IOtherSampleClass
{
partial void APartialMethod()
{
// implementation of APartialMethod
}
}
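The effect of a missing implementation can be observed in a single file, because the parts of a partial class don't need to be in separate files. The following sketch uses a hypothetical Generated class with two partial methods: OnMethodOne is implemented in the second part, and OnNeverImplemented is not, so the compiler removes its invocation.

```csharp
using System;
using System.Collections.Generic;

Generated g = new();
g.MethodOne();
Console.WriteLine(string.Join(", ", g.Log));

// Stands in for the part created by a code generator.
public partial class Generated
{
    public List<string> Log { get; } = new();

    public void MethodOne()
    {
        Log.Add("MethodOne");
        OnMethodOne();          // invoked: an implementation exists in the other part
        OnNeverImplemented();   // removed by the compiler: no implementation exists
    }

    // No access modifier and return type void: the implementation is optional.
    partial void OnMethodOne();
    partial void OnNeverImplemented();
}

// Stands in for the handwritten part.
public partial class Generated
{
    partial void OnMethodOne() => Log.Add("OnMethodOne");
}
```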
This chapter examined C# syntax for creating custom types with classes, records, structs, and tuples. You've seen how to declare static and instance fields, properties, methods, and constructors, both with curly brackets and with expression-bodied members.
In a continuation of Chapter 2, you've also seen more features with pattern matching, such as tuple, property, and relational patterns.
The next chapter extends the types with inheritance, adding interfaces, and using inheritance with classes, records, and interfaces.
C# is not a pure object-oriented programming language because it offers multiple programming paradigms. However, object orientation is an important concept with C#; it's a core principle of all the libraries offered by .NET.
The three most important concepts of object orientation are inheritance, encapsulation, and polymorphism. Chapter 3, “Classes, Records, Structs, and Tuples,” talks about creating individual types to arrange properties, methods, and fields. When members of a type are declared private
, they cannot be accessed from the outside. They are encapsulated within the type. This chapter covers inheritance and polymorphism and extends encapsulation features with inheritance.
The previous chapter explained all the members of a type. This chapter explains how to use inheritance to enhance base types, how to create a hierarchy of classes, and how polymorphism works with C#. It also describes all the C# keywords related to inheritance, shows how to use interfaces as contracts for dependency injection, and covers default interface methods that allow implementations with interfaces.
If you want to declare that a class derives from another class, use the following syntax:
class MyDerivedClass: MyBaseClass
{
// members
}
Let's get into an example to define a base class Shape
. Something that's common with shapes, no matter whether they are rectangles or ellipses, is that they have position and size. For position and size, corresponding classes are defined, and instances of them are contained within the Shape
class. The Shape
class defines read-only properties Position
and Size
that are initialized using auto properties with property initializers (code file VirtualMethods/Shape.cs
):
public class Position
{
public int X { get; set; }
public int Y { get; set; }
}
public class Size
{
public int Width { get; set; }
public int Height { get; set; }
}
public class Shape
{
public Position Position { get; } = new Position();
public Size Size { get; } = new Size();
}
By declaring a base class method as virtual
, you allow the method to be overridden in any derived classes.
The following code snippet shows the DisplayShape
method that is declared with the virtual
modifier. This method is invoked by the Draw
method of the Shape
. Virtual methods cannot be private; typically, they are declared public
or protected
. The access modifier cannot be changed when overriding this method in a derived class. Because the Draw
method has a public
access modifier, this method can be used from the outside when using the Shape
or when using any class deriving from Shape
. The Draw
method cannot be overridden as it doesn't have the virtual modifier applied (code file VirtualMethods/Shape.cs
):
public class Shape
{
public void Draw() => DisplayShape();
protected virtual void DisplayShape()
{
Console.WriteLine($"Shape with {Position} and {Size}");
}
}
You also may declare a property as virtual
. For a virtual or overridden property, the syntax is the same as for a nonvirtual property, with the exception of the keyword virtual
, which is added to the definition:
public virtual Size Size { get; set; }
For simplicity, the following discussion focuses mainly on methods, but it applies equally well to properties.
Methods that are declared virtual can be overridden in a derived class. To declare a method that overrides a method from a base class, use the override
keyword (code file VirtualMethods/ConcreteShapes.cs
):
public class Rectangle : Shape
{
protected override void DisplayShape()
{
Console.WriteLine($"Rectangle at position {Position} with size {Size}");
}
}
Virtual functions offer a core feature of OOP: polymorphism. With virtual functions, the decision of which method to invoke is delayed until runtime. The compiler creates a virtual method table (vtable) that lists the methods that can be invoked during runtime, and it invokes the method based on the type at runtime.
For performance reasons, in C#, functions are not virtual by default. For nonvirtual functions, the vtable is not needed, and the compiler directly addresses the method that's invoked.
The Size
and Position
types override the ToString
method. This method is declared as virtual
in the base class Object
(code file VirtualMethods/ConcreteShapes.cs
):
public class Position
{
public int X { get; set; }
public int Y { get; set; }
public override string ToString() => $"X: {X}, Y: {Y}";
}
public class Size
{
public int Width { get; set; }
public int Height { get; set; }
public override string ToString() => $"Width: {Width}, Height: {Height}";
}
Before C# 9, there was the rule that, when overriding methods of the base class, the signature (all parameter types and the method name) and the return type must match exactly. If you want different parameters, you need to create a new member that does not override the base member.
With C# 9, there's a small change to this rule: when overriding methods, the return type might differ, but only to return a type that derives from the return type of the base class. One example where this can be used is to create a type-safe Clone
method. The Shape
class defines a virtual Clone
method that returns a Shape
(code file VirtualMethods/Shape.cs
):
public virtual Shape Clone() => throw new NotImplementedException();
The Rectangle
class overrides this method to return a Rectangle
type instead of the base class Shape
by creating a new instance and copying all the values from the existing instance to the newly created one:
public override Rectangle Clone()
{
Rectangle r = new();
r.Position.X = Position.X;
r.Position.Y = Position.Y;
r.Size.Width = Size.Width;
r.Size.Height = Size.Height;
return r;
}
In the top-level statements of the Program.cs
file, a rectangle and an ellipse are instantiated, properties are set, and the rectangle is cloned by invoking the virtual Clone
method. Finally, the DisplayShapes
method is invoked passing all the different created shapes. The Draw
method of the Shape
class is invoked to, in turn, invoke the overridden methods of the derived types. In this code snippet, you also see the Ellipse
class used; this is similar to the Rectangle
type, deriving from Shape
(code file VirtualMethods/Program.cs
):
Rectangle r1 = new();
r1.Position.X = 33;
r1.Position.Y = 22;
r1.Size.Width = 200;
r1.Size.Height = 100;
Rectangle r2 = r1.Clone();
r2.Position.X = 300;
Ellipse e1 = new();
e1.Position.X = 122;
e1.Position.Y = 200;
e1.Size.Width = 40;
e1.Size.Height = 20;
DisplayShapes(r1, r2, e1);
void DisplayShapes(params Shape[] shapes)
{
foreach (var shape in shapes)
{
shape.Draw();
}
}
Run the program to see the output of the Draw
method coming from the implementation of the overridden Rectangle
and Ellipse DisplayShape
methods:
Rectangle at position X: 33, Y: 22 with size Width: 200, Height: 100
Rectangle at position X: 300, Y: 22 with size Width: 200, Height: 100
Ellipse at position X: 122, Y: 200 with size Width: 40, Height: 20
If a method with the same signature is declared in both base and derived classes, but the methods are not declared with the modifiers virtual
and override
, respectively, then the derived class version is said to hide the base class version.
For hiding methods, you can use the new
keyword as a modifier with the method declaration. In most cases, you would want to override methods rather than hide them. By hiding them, you risk calling the wrong method for a given class instance. However, as shown in the following example, C# syntax is designed to ensure that the developer is warned at compile time about this potential problem, thus making it safer to hide methods if that is your intention. This also has versioning benefits for developers of class libraries.
Suppose that you have a class called Shape
in a class library:
public class Shape
{
// various members
}
At some point in the future, you write a derived class Ellipse
that adds some functionality to the Shape
base class. In particular, you add a method called MoveBy
, which is not present in the base class:
public class Ellipse: Shape
{
public void MoveBy(int x, int y)
{
Position.X += x;
Position.Y += y;
}
}
At some later time, the developer of the base class decides to extend the functionality of the base class and, by coincidence, adds a method that is also called MoveBy
and that has the same signature as yours; however, it probably doesn't do the same thing. This new method might be declared virtual
or not.
If you recompile the derived class, you get a compiler warning because of a potential method clash. The application is still working, and where you've written code to invoke the MoveBy
method using the Ellipse
class, the method you've written is invoked. Hiding a method is the default behavior to avoid breaking changes when adding methods to a base class.
To get rid of the compiler warning, you need to add the new
modifier to the MoveBy
method. The code the compiler is creating with or without the new
modifier is the same; you just get rid of the compiler warning and flag this as a new method—a different one from the base class:
public class Ellipse: Shape
{
new public void MoveBy(int x, int y)
{
Position.X += x;
Position.Y += y;
}
//…
}
Instead of using the new
keyword, you can also rename the method or override the method of the base class if it is declared virtual and serves the same purpose. However, if other methods already invoke this method, a simple rename can lead to breaking other code.
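The risk of calling the wrong method becomes clear when the same object is accessed through a variable of the base type. The following sketch uses two hypothetical classes, BaseShape and DerivedShape, that are not part of the book's samples. A hidden method is selected by the type of the variable, whereas an overridden method is selected by the type of the object.

```csharp
using System;

DerivedShape derived = new();
BaseShape viaBase = derived;   // the same object, seen through the base type

Console.WriteLine(derived.Hidden());      // DerivedShape.Hidden
Console.WriteLine(viaBase.Hidden());      // BaseShape.Hidden: bound to the variable type
Console.WriteLine(viaBase.Overridden());  // DerivedShape.Overridden: bound to the object type

public class BaseShape
{
    public string Hidden() => "BaseShape.Hidden";
    public virtual string Overridden() => "BaseShape.Overridden";
}

public class DerivedShape : BaseShape
{
    // new hides the base class method; it does not take part in virtual dispatch
    public new string Hidden() => "DerivedShape.Hidden";
    public override string Overridden() => "DerivedShape.Overridden";
}
```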
If a derived class overrides or hides a method in its base class, then it can invoke the base class version of the method by using the base
keyword. For example, in the base class Shape
, the virtual Move
method is declared to change the actual position and write some information to the console. This method should be called from the derived class Rectangle
to use the implementation from the base class (code file VirtualMethods/Shape.cs
):
public class Shape
{
public virtual void Move(Position newPosition)
{
Position.X = newPosition.X;
Position.Y = newPosition.Y;
Console.WriteLine($"moves to {Position}");
}
//…
}
The Move
method is overridden in the Rectangle
class to add the term Rectangle
to the console. After this text is written, the method of the base class is invoked using the base
keyword (code file VirtualMethods/ConcreteShapes.cs
):
public class Rectangle: Shape
{
public override void Move(Position newPosition)
{
Console.Write("Rectangle ");
base.Move(newPosition);
}
//…
}
Now move the rectangle to a new position (code file VirtualMethods/Program.cs
):
r1.Move(new Position { X = 120, Y = 40 });
Run the application to see output that is a result of the Move
method in the Rectangle
and the Shape
classes:
Rectangle moves to X: 120, Y: 40
C# allows both classes and methods to be declared as abstract. An abstract class cannot be instantiated, whereas an abstract method does not have an implementation and must be overridden in any nonabstract derived class. Obviously, an abstract method is automatically virtual. If any class contains any abstract methods, that class is also abstract and must be declared as such.
Let's change the Shape
class to be abstract
. Instead of throwing a NotImplementedException
, the Clone
method is now declared abstract, and thus it can't have any implementation in the Shape
class (code file AbstractClasses/Shape.cs
):
public abstract class Shape
{
public abstract Shape Clone(); // abstract method
}
A class that derives from the abstract base class and is not itself abstract is a concrete type. A concrete class must implement all abstract members. Otherwise, the compiler complains (code file AbstractClasses/ConcreteShapes.cs
):
public class Rectangle : Shape
{
//…
public override Rectangle Clone()
{
Rectangle r = new();
r.Position.X = Position.X;
r.Position.Y = Position.Y;
r.Size.Width = Size.Width;
r.Size.Height = Size.Height;
return r;
}
}
Using the abstract Shape
class and the derived Ellipse
class, you can declare a variable of a Shape
. You cannot instantiate it, but you can instantiate an Ellipse
and assign it to the Shape
variable (code file AbstractClasses/Program.cs
):
Shape s1 = new Ellipse();
s1.Draw();
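The interplay of abstract and concrete members can be shown in a compact, self-contained form. The following sketch uses hypothetical types named Figure, Circle, and Square instead of the book's Shape hierarchy. The abstract class contains a nonabstract member (Name) that relies on the abstract method being implemented by every concrete type.

```csharp
using System;

// new Figure() would not compile: an abstract class cannot be instantiated
Figure[] figures = { new Circle(1.0), new Square(2.0) };
foreach (Figure figure in figures)
{
    Console.WriteLine($"{figure.Name}: {figure.Area()}");
}

public abstract class Figure
{
    public abstract double Area();         // no body; concrete types must override it
    public string Name => GetType().Name;  // regular members are allowed in abstract classes
}

public class Circle : Figure
{
    private readonly double _radius;
    public Circle(double radius) => _radius = radius;
    public override double Area() => Math.PI * _radius * _radius;
}

public class Square : Figure
{
    private readonly double _side;
    public Square(double side) => _side = side;
    public override double Area() => _side * _side;
}
```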
If you don't want to allow other classes to derive from your class, your class should be sealed. Adding the sealed
modifier to a class doesn't allow you to create a subclass of it. Sealing a method means it's not possible to override this method.
sealed class FinalClass
{
//…
}
class DerivedClass: FinalClass // wrong. Cannot derive from sealed class.
{
//…
}
The most likely situation in which you'll mark a class or method as sealed
is if the class or method is internal to the operation of the library, class, or other classes that you are writing. Overriding methods could lead to instability of the code. When you seal the class, you make sure that overriding is not possible.
There's another reason to seal classes. With a sealed class, the compiler knows that derived classes are not possible, and thus the virtual table used for virtual methods can be reduced or eliminated, which can increase performance. The string
class is sealed. I haven't seen a single application that doesn't use strings, so it's best to have this type as performant as possible. Making the class sealed is a good hint for the compiler.
Declaring a method as sealed
serves a purpose similar to that for a class. The method can be an overridden method from a base class, but in the following example, the compiler knows another class cannot extend the virtual table for this method; it ends here.
class MyClass: MyBaseClass
{
public sealed override void FinalMethod()
{
// implementation
}
}
class DerivedClass: MyClass
{
public override void FinalMethod() // wrong. Will give compilation error
{
}
}
To use the sealed
keyword on a method or property, the member must have first been overridden from a base class. If you do not want a method or property in a base class overridden, then don't mark it as virtual.
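The following sketch shows a sealed override that compiles and runs. The Formatter hierarchy is hypothetical and only serves as an illustration: PrefixFormatter overrides and seals Format, so QuietFormatter can still derive from it and add members, but it cannot override Format again.

```csharp
using System;

Formatter formatter = new QuietFormatter();
Console.WriteLine(formatter.Format("started"));   // [log] started

public class Formatter
{
    public virtual string Format(string message) => message;
}

public class PrefixFormatter : Formatter
{
    // sealed override: the last override of Format in this hierarchy
    public sealed override string Format(string message) => $"[log] {message}";
}

public class QuietFormatter : PrefixFormatter
{
    // public override string Format(string message) => message;   // would not compile
    public string Whisper(string message) => Format(message).ToLowerInvariant();
}
```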
Chapter 3 discusses how constructors can be applied to individual classes. An interesting question arises as to what happens when you start defining your own constructors for classes that are part of a hierarchy, inherited from other classes that may also have custom constructors.
In the sample application that uses shapes, so far, custom constructors have not been specified. The compiler creates a default constructor automatically to initialize all members to null
or 0
(depending on whether the types are reference or value types) or uses the code from specified property initializers to add these to the default constructor. Now, let's change the implementation to create immutable types and define custom constructors to initialize their values. The Position
, Size
, and Shape
classes are changed to specify read-only properties, and the constructors are changed to initialize the properties. The Shape
class is still abstract, which doesn't allow creating instances of this type (code file InheritanceWithConstructors/Shape.cs
):
public class Position
{
public Position(int x, int y) => (X, Y) = (x, y);
public int X { get; }
public int Y { get; }
public override string ToString() => $"X: {X}, Y: {Y}";
}
public class Size
{
public Size(int width, int height) => (Width, Height) = (width, height);
public int Width { get; }
public int Height { get; }
public override string ToString() => $"Width: {Width}, Height: {Height}";
}
public abstract class Shape
{
public Shape(int x, int y, int width, int height)
{
Position = new Position(x, y);
Size = new Size(width, height);
}
public Position Position { get; }
public virtual Size Size { get; }
public void Draw() => DisplayShape();
protected virtual void DisplayShape()
{
Console.WriteLine($"Shape with {Position} and {Size}");
}
public abstract Shape Clone();
}
Now the Rectangle and Ellipse types need to be changed as well. Because the Shape class doesn't have a parameterless constructor, the compiler complains because it cannot automatically invoke the constructor of the base class. A custom constructor is required here as well.
With the new implementation of the Ellipse class, a constructor is defined to supply the position and size for the shape. To invoke the constructor of the base class, you use the base keyword, just as you do when invoking methods of the base class. However, you can't use the base keyword for this purpose in the block of the constructor body. Instead, you need to use the base keyword in the constructor initializer and pass the required arguments. The Clone method can now be simplified to invoke the constructor to create a new Ellipse object by forwarding the values from the existing object (code file InheritanceWithConstructors/ConcreteShapes.cs):
public class Ellipse : Shape
{
public Ellipse(int x, int y, int width, int height)
: base(x, y, width, height) { }
protected override void DisplayShape()
{
Console.WriteLine($"Ellipse at position {Position} with size {Size}");
}
public override Ellipse Clone() =>
new(Position.X, Position.Y, Size.Width, Size.Height);
}
You have already encountered quite a number of so-called modifiers: keywords that can be applied to a type or a member. Modifiers can indicate the visibility of a method, such as public or private, or the nature of an item, such as whether a method is virtual or abstract. C# has a number of modifiers, and at this point it's worth taking a minute to provide the complete list.
Access modifiers indicate which other code items can access an item.
You can use all the access modifiers with members of a type. The public and internal access modifiers can also be applied to the type itself. With nested types (types that are specified within types), you can apply all access modifiers. In regard to access modifiers, nested types are members of the outer type, such as those shown in the following code snippet where the OuterType is declared with the public access modifier, and the type InnerType has the protected access modifier applied. With the protected access modifier, the InnerType can be accessed from the members of the OuterType, and all types that derive from the OuterType:
public class OuterType
{
protected class InnerType
{
// members of the inner type
}
// more members of the outer type
}
The public access modifier is the most open one; everyone has access to a class or a member that has the public access modifier applied. The private access modifier is the most restrictive one. Members with this access modifier can be used only within the class where the modifier is used. The protected access modifier is in between these access restrictions. In addition to the access that the private access modifier grants, it allows access from all types that derive from the type where the protected access modifier is used.
The internal access modifier is different. This access modifier has the scope of the assembly. All the types defined within the same assembly have access to members and types where the internal access modifier is used.
If you do not supply an access modifier with a type, by default internal access is specified. You can use this type only within the same assembly.
The protected internal access modifier is a combination of protected and internal, combining these access modifiers with OR. protected internal members can be used from any type in the same assembly or from types from another assembly if an inheritance relationship is used. With the intermediate language (IL) code, this is known as famorassem (family or assembly): family for the protected C# keyword and assembly for the internal keyword. famandassem is also available with the IL code. Because of the demand for an AND combination, the C# team had some issues finding a good name for this, and finally it was decided to use private protected to restrict access to types within the same assembly that have an inheritance relationship, excluding types from any other assembly.
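The access rules described above can be sketched in a single file. The types Account, SavingsAccount, and AccessDemo are hypothetical and exist only for this illustration. Because everything lives in one assembly, the sketch can show only the in-assembly behavior; the difference between protected internal and private protected becomes visible once a derived type is placed in another assembly.

```csharp
// Hypothetical types for illustration; they are not part of the book's samples.
public class Account
{
    private readonly int _pin = 1234;                  // visible only inside Account
    protected int Balance = 100;                       // Account and all derived types
    internal string Branch = "Vienna";                 // any code in this assembly
    protected internal string Owner = "Sample owner";  // this assembly OR derived types elsewhere
    private protected string AuditTag = "A1";          // derived types in this assembly only

    public bool CheckPin(int pin) => pin == _pin;
}

public class SavingsAccount : Account
{
    // A derived type in the same assembly can read every member except _pin.
    public string Describe() => $"{Owner}: {Balance} ({AuditTag}) at {Branch}";
}

public static class AccessDemo
{
    public static string Run()
    {
        SavingsAccount account = new();
        // account.Balance and account.AuditTag do not compile here because
        // AccessDemo does not derive from Account.
        return $"{account.Owner} | {account.Branch} | {account.Describe()}";
    }
}
```

Calling AccessDemo.Run() returns the string "Sample owner | Vienna | Sample owner: 100 (A1) at Vienna".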
The following table lists all the access modifiers and their uses:
MODIFIER | APPLIES TO | DESCRIPTION |
---|---|---|
public | Any types or members | The item is visible to any other code. |
protected | Any member of a type and any nested type | The item is visible only to the type and any derived type. |
internal | Any types or members | The item is visible only within its containing assembly. |
private | Any member of a type and any nested type | The item is visible only inside the type to which it belongs. |
protected internal | Any member of a type and any nested type | The item is visible to any code within its containing assembly and to any code inside a derived type. |
private protected | Any member of a type and any nested type | The item is visible to the type and any derived type that is specified within the containing assembly. |
The modifiers in the following table can be applied to members of types and have various uses. A few of these modifiers also make sense when applied to types:
MODIFIER | APPLIES TO | DESCRIPTION |
---|---|---|
new | Function members | The member hides an inherited member with the same signature. |
static | All members | The member does not operate on a specific instance of the class. This is also known as class member instead of instance member. |
virtual | Function members only | The member can be overridden by a derived class. |
abstract | Function members only | A virtual member that defines the signature of the member but doesn't provide an implementation. |
override | Function members only | The member overrides an inherited virtual or abstract member. |
sealed | Classes, methods, and properties | For classes, the class cannot be inherited from. For properties and methods, the member overrides an inherited virtual member but cannot be overridden by any members in any derived classes. This must be used in conjunction with override. |
extern | Static [DllImport] methods only | The member is implemented externally, in a different language. The use of this keyword is explained in Chapter 13, “Managed and Unmanaged Memory.” |
Chapter 3 discusses a new feature with C# 9: records. Behind the scenes, records are classes. However, you cannot derive a record from a class (other than the object type), and a class cannot derive from a record. However, records can derive from other records.
Let's change the shapes sample to use positional records. With the following code snippet, Position and Size are records that contain X, Y, Width, and Height properties with init-only set accessors as specified by the primary constructor. Shape is an abstract record with Position and Size properties, a Draw method, and a virtual DisplayShape method. As with classes, you can use modifiers with records, such as abstract and virtual. The previously specified Clone method is not needed with records because this is created automatically using the record keyword (code file RecordsInheritance/Shape.cs):
public record Position(int X, int Y);
public record Size(int Width, int Height);
public abstract record Shape(Position Position, Size Size)
{
public void Draw() => DisplayShape();
protected virtual void DisplayShape()
{
Console.WriteLine($"Shape with {Position} and {Size}");
}
}
The Rectangle record derives from the Shape record. With the primary constructor syntax used with the Rectangle type, derivation from Shape passes the same values to the primary constructor of the Shape. Similar to the Rectangle class created earlier, in the Rectangle record, the DisplayShape method is overridden (code file RecordsInheritance/ConcreteShapes.cs):
public record Rectangle(Position Position, Size Size) : Shape(Position, Size)
{
protected override void DisplayShape()
{
Console.WriteLine($"Rectangle at position {Position} with size {Size}");
}
}
With the top-level statements in the Program.cs file, a Rectangle and an Ellipse are created using primary constructors. The implementation of the Ellipse record is similar to the Rectangle record. The first rectangle created is cloned by using the built-in functionality, and with the new Rectangle, the Position property is set to a new value using the with expression. The with expression makes use of the init-only set accessors created from the primary constructor (code file RecordsInheritance/Program.cs):
Rectangle r1 = new(new Position(33, 22), new Size(200, 100));
Rectangle r2 = r1 with { Position = new Position(100, 22) };
Ellipse e1 = new(new Position(122, 200), new Size(40, 20));
DisplayShapes(r1, r2, e1);
void DisplayShapes(params Shape[] shapes)
{
foreach (var shape in shapes)
{
shape.Draw();
}
}
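To illustrate the built-in cloning, the following sketch applies the with expression through a variable of the abstract base record type. The record declarations are reduced to their primary constructors, and RecordCloneDemo is a hypothetical helper that is not part of the book's samples. The sketch assumes the documented behavior that the result of a with expression has the same runtime type as its operand and that the copy is shallow.

```csharp
public record Position(int X, int Y);
public record Size(int Width, int Height);
public abstract record Shape(Position Position, Size Size);
public record Rectangle(Position Position, Size Size) : Shape(Position, Size);

// Hypothetical helper for illustration only.
public static class RecordCloneDemo
{
    public static (bool KeepsType, bool SharesSize, bool AreEqual) Run()
    {
        Shape original = new Rectangle(new Position(33, 22), new Size(200, 100));

        // The copy is created via the compiler-generated clone member,
        // so the runtime type stays Rectangle.
        Shape moved = original with { Position = new Position(100, 22) };

        return (
            moved is Rectangle,                                  // runtime type is preserved
            object.ReferenceEquals(original.Size, moved.Size),   // shallow copy: same Size instance
            original == moved);                                  // value equality: positions differ
    }
}
```

RecordCloneDemo.Run() returns (true, true, false): the clone is still a Rectangle, it shares the Size object with the original, and the two records compare as unequal because the Position values differ.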
A class can derive from one class, and a record can derive from one record; you cannot use multiple inheritance with classes and records. You can use interfaces to bring multiple inheritance into C#. Both classes and records can implement multiple interfaces. Also, one interface can inherit from multiple interfaces.
Before C# 8, an interface never had any implementation. In the versions since C# 8, you can create an implementation with interfaces, but this is very different from the implementation with classes and records; interfaces cannot keep state, so fields or automatic properties are not possible. Because method implementation is only an additional feature of interfaces, let's keep this discussion for later in this chapter and first focus on the contract aspect of interfaces.
Let's take a look at some predefined interfaces and how they are used with .NET. Some C# keywords are even designed to work with particular predefined interfaces. The using statement and the using declaration (covered in detail in Chapter 13) use the IDisposable interface. This interface defines the method Dispose without any arguments and without a return value. A class implementing this interface needs to implement this Dispose method:
public interface IDisposable
{
void Dispose();
}
The using statement uses this interface. You can use this statement with any class (here, the Resource class) implementing this interface:
using (Resource resource = new())
{
// use the resource
}
The compiler converts the using statement to this code to invoke the Dispose method in the finally block of the try/finally statement:
Resource resource = new();
try
{
// use the resource
}
finally
{
resource.Dispose();
}
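The using declaration mentioned earlier behaves like the using statement, except that the resource is disposed when the enclosing scope ends. The following is a minimal sketch; the Resource class shown here and the UsingDemo helper are assumptions made for this illustration, and the method returns the resource only so that a caller can observe that Dispose has run.

```csharp
using System;

// Hypothetical implementation of the Resource class used in the text.
public sealed class Resource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
        Console.WriteLine("Resource disposed");
    }
}

public static class UsingDemo
{
    public static Resource UseWithDeclaration()
    {
        // No braces needed: Dispose is invoked when the method scope is left.
        using Resource resource = new();
        Console.WriteLine("using the resource");

        // Returned only to make the disposed state observable by the caller.
        return resource;
    }
}
```

After UsingDemo.UseWithDeclaration() returns, the IsDisposed property of the returned object is true.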
Another example where an interface is used with a language keyword is the foreach statement, which uses the IEnumerator and IEnumerable interfaces. This code snippet
string[] names = { "James", "Jack", "Jochen" };
foreach (var name in names)
{
Console.WriteLine(name);
}
is converted to access the GetEnumerator method of the IEnumerable interface and uses a while loop to access the MoveNext method and the Current property of the IEnumerator interface:
string[] names = { "James", "Jack", "Jochen" };
var enumerator = names.GetEnumerator();
while (enumerator.MoveNext())
{
var name = enumerator.Current;
Console.WriteLine(name);
}
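With the generic IEnumerable<T> interface, the pattern is the same, but the enumerator of type IEnumerator<T> also implements IDisposable, so an equivalent hand-written loop should dispose of it. The following sketch spells this out; ForeachDemo and its Expand method are hypothetical names used only for this illustration.

```csharp
using System.Collections.Generic;

// Hypothetical helper that iterates a sequence by hand instead of using foreach.
public static class ForeachDemo
{
    public static List<string> Expand(IEnumerable<string> names)
    {
        List<string> result = new();

        // IEnumerator<T> derives from IDisposable; the using declaration
        // disposes the enumerator when the method scope ends.
        using IEnumerator<string> enumerator = names.GetEnumerator();
        while (enumerator.MoveNext())
        {
            string name = enumerator.Current;
            result.Add(name);
        }
        return result;
    }
}
```

Passing the names array from the previous snippet to ForeachDemo.Expand returns a list with the same three names in the same order.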
Let's look at an example where an interface is used from a .NET class, and you can easily implement this interface. The interface IComparable<T> defines the CompareTo method to sort objects of the type you need to specify with the generic parameter T. This interface is used by various classes in .NET to order objects of any type:
public interface IComparable<in T>
{
int CompareTo(T? other);
}
With the following code snippet, the record Person implements this interface specifying Person as a generic parameter. Person specifies the properties FirstName and LastName. The CompareTo method is defined to return 0 if both values (this and other) are the same, a value lower than 0 if this object should come before the other object, and a value greater than 0 if other should be first. Because the string type also implements IComparable, this implementation is used to compare the LastName properties. If the comparison on the last name returns 0, a comparison is done on the FirstName property as well (code file UsingInterfaces/Person.cs):
public record Person(string FirstName, string LastName) : IComparable<Person>
{
public int CompareTo(Person? other)
{
int compare = LastName.CompareTo(other?.LastName);
if (compare is 0)
{
return FirstName.CompareTo(other?.FirstName);
}
return compare;
}
}
With the top-level statements in Program.cs, three Person records are created within an array, and the array's Sort method is used to sort the elements in the array (code file UsingInterfaces/Program.cs):
Person p1 = new("Jackie", "Stewart");
Person p2 = new("Graham", "Hill");
Person p3 = new("Damon", "Hill");
Person[] people = { p1, p2, p3 };
Array.Sort(people);
foreach (var p in people)
{
Console.WriteLine(p);
}
Running the application shows the ToString output of the record type in a sorted order:
Person { FirstName = Damon, LastName = Hill }
Person { FirstName = Graham, LastName = Hill }
Person { FirstName = Jackie, LastName = Stewart }
Interfaces can act as a contract. The record Person implements the IComparable contract that is used by the Sort method of the Array class. The Array class just needs to know the contract definition (the members of the interface) to know what it can use.
Let's create a custom interface. With the shapes sample, the Shape and Rectangle types used the Console.WriteLine method to write a message to the console:
protected virtual void DisplayShape()
{
Console.WriteLine($"Shape with {Position} and {Size}");
}
This way, the method DisplayShape has a strong dependency on the Console class. To make this implementation independent of the Console class and to write to either the console or a file, you can define a contract such as the ILogger interface in the following code snippet. This interface specifies the Log method where a string can be passed as an argument (code file UsingInterfaces/ILogger.cs):
public interface ILogger
{
void Log(string message);
}
A new version of the Shape class uses constructor injection where the interface is injected into an object of this class. In the constructor, the object passed with the parameter is assigned to the read-only property Logger. With the implementation of the DisplayShape method, the property of type ILogger is used to write a message (code file UsingInterfaces/Shape.cs):
public abstract class Shape
{
public Shape(ILogger logger)
{
Logger = logger;
}
protected ILogger Logger { get; }
public Position? Position { get; init; }
public Size? Size { get; init; }
public void Draw() => DisplayShape();
protected virtual void DisplayShape()
{
Logger.Log($"Shape with {Position} and {Size}");
}
}
With a concrete implementation of the abstract Shape class, in the constructor, the ILogger interface is forwarded to the constructor of the base class. With the DisplayShape method, the protected property Logger is used from the base class (code file UsingInterfaces/ConcreteShapes.cs):
public class Ellipse : Shape
{
public Ellipse(ILogger logger) : base(logger) { }
protected override void DisplayShape()
{
Logger.Log($"Ellipse at position {Position} with size {Size}");
}
}
Next, a concrete implementation of the ILogger interface is required. One way you can implement writing a message to the console is with the ConsoleLogger class. This class implements the ILogger interface to write a message to the console (code file UsingInterfaces/ConsoleLogger.cs):
public class ConsoleLogger : ILogger
{
public void Log(string message) => Console.WriteLine(message);
}
To create an Ellipse, a ConsoleLogger instance is created and passed to the constructor to supply the ILogger implementation (code file UsingInterfaces/Program.cs):
Ellipse e1 = new(new ConsoleLogger())
{
Position = new(20, 30),
Size = new(100, 120)
};
e1.Draw();
Interfaces can be explicitly or implicitly implemented. With the example so far, you've seen implicitly implemented interfaces, such as with the ConsoleLogger class:
public class ConsoleLogger : ILogger
{
public void Log(string message) => Console.WriteLine(message);
}
With an explicit interface implementation, the member implemented doesn't have an access modifier and has the interface prefixed to the method name:
public class ConsoleLogger : ILogger
{
void ILogger.Log(string message) => Console.WriteLine(message);
}
With an explicit interface implementation, the interface member is not accessible when you use a variable of type ConsoleLogger (it's not public). If you use a variable of the interface type (ILogger), you can invoke the Log method; the contract of the interface is fulfilled. You can also cast the ConsoleLogger variable to the interface ILogger to invoke this method.
Why would you want to do this? One reason is to resolve a conflict. If your class needs to implement several interfaces that define the same method signature, and the implementations need to differ, you can use explicit interface implementation.
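A minimal sketch of such a conflict follows. The interfaces IPrinter and IScanner and the class MultiFunctionDevice are hypothetical names chosen for this illustration. Both interfaces declare a Describe method with the same signature, and the class supplies a separate explicit implementation for each.

```csharp
// Hypothetical interfaces that declare the same method signature.
public interface IPrinter
{
    string Describe();
}

public interface IScanner
{
    string Describe();
}

public class MultiFunctionDevice : IPrinter, IScanner
{
    // Each explicit implementation is bound to exactly one interface.
    string IPrinter.Describe() => "prints pages";
    string IScanner.Describe() => "scans pages";
}

public static class ConflictDemo
{
    public static (string Printer, string Scanner) Run()
    {
        MultiFunctionDevice device = new();
        // device.Describe() does not compile: the class has no public Describe method.
        IPrinter printer = device;
        IScanner scanner = device;
        return (printer.Describe(), scanner.Describe());
    }
}
```

ConflictDemo.Run() returns ("prints pages", "scans pages"); which implementation runs depends on the interface type of the variable used for the call.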
Another reason to use explicit interface implementation is to hide the interface method from code outside of the class but still fulfill the contract from the interface. An example is the StringCollection class from the System.Collections.Specialized namespace and the IList interface. One of the members that's defined by the IList interface is the Add method:
int Add(object? value);
The StringCollection class is optimized for strings and thus prefers to use the string type with the Add method:
public int Add(string? value);
The version to pass an object is hidden from the StringCollection class because the StringCollection class has an explicit interface implementation with this method. To use this type directly, you just pass a string parameter. If a method uses IList as a parameter, then you can use any object that implements IList for that parameter. In particular, you can use a StringCollection for the parameter because that class still implements that interface.
Now that you've seen the foundations of interfaces, let's compare interfaces, classes, records, and structs with regard to object orientation:
Before C# 8, changing an interface was always a breaking change. Even just adding a member to an interface was a breaking change because every type implementing the interface needs to implement the new member. Because of this, many .NET libraries are built with abstract base classes. When you add a new member to an abstract base class, if it's not an abstract member, it is not a breaking change. With Microsoft's Component Object Model (COM), which is based on interfaces, a new interface was always defined when a breaking change was introduced, for example, IViewObject, IViewObjectEx, IViewObject2, IViewObject3.
As of C# 8, interfaces can have implementations. However, you need to be aware where you can use this feature. C# 8 is supported by .NET Core 3.x. With older technologies, you can change the compiler version at your own risk. To support default interface members, a runtime change is required. This runtime change is available only with .NET Core 3.x+ and .NET Standard 2.1+. You cannot use default interface members with .NET Framework applications or UWP applications without .NET 5 support.
Let's get into the main feature of default interface members to avoid breaking changes. In a previous code sample, the ILogger interface has been specified:
public interface ILogger
{
void Log(string message);
}
If you add any member without implementation, the ConsoleLogger class needs to be updated. To avoid a breaking change, an implementation for the new Log method with the Exception parameter is added. With the implementation, the previous Log method is invoked by passing a string (code file DefaultInterfaceMethods/ILogger.cs):
public interface ILogger
{
void Log(string message);
public void Log(Exception ex) => Log(ex.Message);
}
The application can be built without changing the implementation of the ConsoleLogger class. If a variable of the interface type is used, both Log methods can be invoked: the Log method with the string parameter and the Log method with the Exception parameter (code file DefaultInterfaceMethods/Program.cs):
ILogger logger = new ConsoleLogger();
logger.Log("message");
logger.Log(new Exception("sample exception"));
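The restriction to a variable of the interface type matters: a default interface member does not become a member of the implementing class. The following sketch shows this with the ILogger interface from the text and a hypothetical ListLogger class that collects messages in a list instead of writing to the console, so that the result can be inspected.

```csharp
using System;
using System.Collections.Generic;

public interface ILogger
{
    void Log(string message);
    void Log(Exception ex) => Log(ex.Message);   // default interface member
}

// Hypothetical logger that stores messages; it implements only Log(string).
public class ListLogger : ILogger
{
    public List<string> Messages { get; } = new();
    public void Log(string message) => Messages.Add(message);
}

public static class DefaultMemberDemo
{
    public static List<string> Run()
    {
        ListLogger logger = new();
        logger.Log("message");

        // logger.Log(new Exception("x")); // does not compile: ListLogger
        // itself has no Log(Exception) member.

        ILogger viaInterface = logger;
        viaInterface.Log(new Exception("sample exception"));

        // A cast to the interface works as well.
        ((ILogger)logger).Log(new Exception("via cast"));

        return logger.Messages;
    }
}
```

DefaultMemberDemo.Run() returns the three entries "message", "sample exception", and "via cast", in that order.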
With a new implementation of the ConsoleLogger class, a different implementation of the new Log method defined with the ILogger interface can be created. In this case, using the ILogger interface invokes the method implemented with the ConsoleLogger class. The method is implemented with explicit interface implementation but could be implemented with implicit interface implementation as well (code file DefaultInterfaceMethods/ConsoleLogger.cs):
public class ConsoleLogger : ILogger
{
public void Log(string message) => Console.WriteLine(message);
void ILogger.Log(Exception ex)
{
Console.WriteLine(
$"exception type: {ex.GetType().Name}, message: {ex.Message}");
}
}
Default interface members can be used to implement traits with C#. Traits allow you to define methods for a group of types. One way to implement traits is with extension methods; the other option is using default interface methods.
With Language Integrated Query (LINQ), many LINQ operators have been implemented with extension methods. With this new feature, it would be possible to implement these methods with default interface members instead.
To demonstrate this, the IEnumerableEx<T> interface is defined that derives from the interface IEnumerable<T>. Deriving from this interface, IEnumerableEx<T> specifies the same contract as the base interface, but the Where method is added. This method receives a delegate parameter to pass a predicate method that returns a Boolean value, iterates through all the items, and invokes the method referenced by the predicate. If the predicate returns true, the Where method returns the item with yield return.
using System;
using System.Collections.Generic;
public interface IEnumerableEx<T> : IEnumerable<T>
{
public IEnumerable<T> Where(Func<T, bool> pred)
{
foreach (T item in this)
{
if (pred(item))
{
yield return item;
}
}
}
}
Now you need a collection to implement the interface IEnumerableEx<T>. You can do this easily by creating a new collection type, MyCollection, that derives from the Collection<T> base class defined in the System.Collections.ObjectModel namespace. Because the Collection<T> class already implements the interface IEnumerable<T>, no additional implementation is needed to support IEnumerableEx<T> (code file DefaultInterfaceMethods/MyCollection.cs):
class MyCollection<T> : Collection<T>, IEnumerableEx<T>
{
}
With this in place, a collection of type MyCollection<string> is created that's filled with names. A lambda expression that returns a Boolean value and receives a string is passed to the Where method that's defined with the interface. The foreach statement iterates through the result and only displays the names starting with J (code file DefaultInterfaceMethods/Program.cs):
IEnumerableEx<string> names = new MyCollection<string>
{ "James", "Jack", "Jochen", "Sebastian", "Lewis", "Juan" };
var jNames = names.Where(n => n.StartsWith("J"));
foreach (var name in jNames)
{
Console.WriteLine(name);
}
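For comparison, the other way to implement a trait mentioned earlier, an extension method, can be sketched as follows. The names EnumerableTraits, WhereEx, and ExtensionDemo are hypothetical; the method is called WhereEx rather than Where only to avoid a clash with the LINQ operator of the same name. Unlike the default interface member, the extension method works with every IEnumerable<T>, including a plain array, without requiring a special collection type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical extension-method variant of the Where trait.
public static class EnumerableTraits
{
    public static IEnumerable<T> WhereEx<T>(
        this IEnumerable<T> source, Func<T, bool> pred)
    {
        foreach (T item in source)
        {
            if (pred(item))
            {
                yield return item;
            }
        }
    }
}

public static class ExtensionDemo
{
    public static List<string> Run()
    {
        string[] names = { "James", "Jack", "Jochen", "Sebastian", "Lewis", "Juan" };

        // The array needs no additional interface to use the trait.
        return new List<string>(names.WhereEx(n => n.StartsWith("J")));
    }
}
```

ExtensionDemo.Run() returns the four names James, Jack, Jochen, and Juan. A design difference worth noting: an extension method is bound at compile time and cannot be overridden by the collection type, whereas a default interface member can be replaced by an implementing type.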
One way to reduce the code you need to write is by using inheritance and adding functionality to base classes. Another way is to create generics where a type parameter is used, which allows specifying the type when instantiating the generic (which can also be combined with inheritance).
Let's get into an example to create a linked list of objects where every item references the next and previous items. The first generic type created is a record. The generic type parameter is specified using angle brackets. T is the placeholder type parameter name. With the primary constructor, a property with an init-only set accessor is created. The record has two additional properties, Next and Prev, to reference the next and previous items. With these additional properties, the internal access modifier is used to allow calling the set accessor only from within the same assembly (code file GenericTypes/LinkedListNode.cs):
public record LinkedListNode<T>(T Value)
{
public LinkedListNode<T>? Next { get; internal set; }
public LinkedListNode<T>? Prev { get; internal set; }
public override string? ToString() => Value?.ToString();
}
The generic class LinkedList contains the properties First and Last to access the first and last elements of the list, the method AddLast to add a new node at the end of the list, and an implementation of the IEnumerable<T> interface, which allows iterating through all elements (code file GenericTypes/LinkedList.cs):
using System.Collections;
using System.Collections.Generic;
public class LinkedList<T> : IEnumerable<T>
{
public LinkedListNode<T>? First { get; private set; }
public LinkedListNode<T>? Last { get; private set; }
public LinkedListNode<T> AddLast(T node)
{
LinkedListNode<T> newNode = new(node);
if (First is null || Last is null)
{
First = newNode;
Last = First;
}
else
{
newNode.Prev = Last;
Last.Next = newNode;
Last = newNode;
}
return newNode;
}
public IEnumerator<T> GetEnumerator()
{
LinkedListNode<T>? current = First;
while (current is not null)
{
yield return current.Value;
current = current.Next;
}
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
In the generated Main method, the LinkedList is instantiated using the int type, the string type, a tuple, and a record. LinkedList works with any type (code file GenericTypes/Program.cs):
LinkedList<int> list1 = new();
list1.AddLast(1);
list1.AddLast(3);
list1.AddLast(2);
foreach (var item in list1)
{
Console.WriteLine(item);
}
Console.WriteLine();
LinkedList<string> list2 = new();
list2.AddLast("two");
list2.AddLast("four");
list2.AddLast("six");
Console.WriteLine(list2.Last);
LinkedList<(int, int)> list3 = new();
list3.AddLast((1, 2));
list3.AddLast((3, 4));
foreach (var item in list3)
{
Console.WriteLine(item);
}
Console.WriteLine();
LinkedList<Person> list4 = new();
list4.AddLast(new Person("Stephanie", "Nagel"));
list4.AddLast(new Person("Matthias", "Nagel"));
list4.AddLast(new Person("Katharina", "Nagel"));
// show the first
Console.WriteLine(list4.First);
public record Person(string FirstName, string LastName);
With the previous implementation of the LinkedListNode<T> and LinkedList<T> types, there was no special requirement on the generic type; any type can be used. This prevents you from using any members other than those of the object type within the implementation. The compiler doesn't accept invoking any other property or method on the generic type T.
Adding the DisplayAllTitles method to the LinkedList<T> class results in a compiler error. T does not contain a definition for Title, and no accessible extension method Title accepting a first argument of type T could be found (code file GenericTypesWithConstraints/LinkedList.cs):
public void DisplayAllTitles()
{
foreach (T item in this)
{
Console.WriteLine(item.Title);
}
}
To resolve this, the interface ITitle is specified. It defines a Title property that every type implementing this interface needs to provide:
public interface ITitle
{
string Title { get; }
}
With the definition of the generic LinkedList<T>, a constraint can now be specified that requires the generic type T to implement the interface ITitle. Constraints are specified with the where keyword followed by the requirement on the type:
public class LinkedList<T> : IEnumerable<T>
where T : ITitle
{
//…
}
With this change in place, the DisplayAllTitles method compiles. This method uses the members specified by the ITitle interface, and this is a requirement on the generic type. You can no longer use int and string for the generic type parameter, but the Person record can be changed to implement this constraint (code file GenericTypesWithConstraints/Program.cs):
public record Person(string FirstName, string LastName, string Title)
: ITitle { }
The following table lists the constraints you can specify with a generic:
CONSTRAINT | DESCRIPTION |
---|---|
where T : struct | With a struct constraint, T must be a value type. |
where T : class | With a class constraint, T must be a reference type. |
where T : class? | T must be a nullable or a non-nullable reference type. |
where T : notnull | T must be a non-nullable type. This can be a value or a reference type. |
where T : unmanaged | T must be a non-nullable unmanaged type. |
where T : IFoo | This specifies that the type T is required to implement interface IFoo. |
where T : Foo | This specifies that the type T is required to derive from base class Foo. |
where T : new() | A constructor constraint; this specifies that T must have a parameterless constructor. You cannot specify a constraint for constructors with parameters. |
where T1 : T2 | With constraints, it is also possible to specify that type T1 derives from a generic type T2. |
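Several constraints from the table can be combined on one type parameter. The following sketch is an illustration under assumptions: the Document class and the TitleFactory<T> class are hypothetical, while ITitle is the interface defined earlier. When constraints are combined, class (or struct) comes first, interfaces follow, and new() must be last.

```csharp
public interface ITitle
{
    string Title { get; }
}

// Hypothetical type with a parameterless constructor that implements ITitle.
public class Document : ITitle
{
    public string Title { get; set; } = "Untitled";
}

// T must be a reference type, implement ITitle, and offer a parameterless constructor.
public class TitleFactory<T>
    where T : class, ITitle, new()
{
    // Allowed because of the new() constraint.
    public T CreateDefault() => new T();

    // Allowed because of the ITitle constraint.
    public string Describe(T item) => item.Title;
}
```

With these definitions, new TitleFactory<Document>() compiles, and calling Describe on the result of CreateDefault returns "Untitled". Using TitleFactory<int> or TitleFactory<string> results in a compiler error because neither type implements ITitle.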
This chapter described how to code inheritance in C#. You saw the rich support for both implementing multiple interfaces and single inheritance with classes and records. You saw how C# provides a number of useful syntactical constructs designed to assist in making code more robust, which includes different access modifiers, and the concept of nonvirtual and virtual methods. You also saw the new feature for interfaces, which allows adding code implementation. Generics have been covered as another concept to reuse code.
The next chapter continues with all the C# operators and casts.
C# supports the operators and expressions listed in the following table. In the table, the operators start with the highest precedence and go down to the lowest.
CATEGORY | OPERATOR |
---|---|
Primary | x.y x?.y f(x) a[x] x++ x-- x! x->y new typeof default checked unchecked delegate nameof sizeof stackalloc |
Unary | +x -x !x ~x ++x --x ^x (T)x await &x *x true false |
Range | x..y |
Multiplicative | x*y x/y x%y |
Additive | x+y x-y |
Shift | x<<y x>>y |
Relational | x<y x>y x<=y x>=y |
Type testing | is as |
Equality | x==y x!=y |
Logical | x&y x^y x|y |
Conditional logical | x&&y x||y |
Null coalescing | x??y |
Conditional operator | c?t:f |
Assignment | x=y x+=y x-=y x*=y x/=y x%=y x&=y x|=y x^=y x<<=y x>>=y x??=y |
Lambda expression | => |
Compound assignment operators are a shortcut to using the assignment operator with another operator. Instead of writing x = x + 2, you can use the compound assignment x += 2. Incrementing by 1 is required even more often, so there's another shortcut, x++:
int x = 1;
x += 2; // shortcut for x = x + 2;
x++; // shortcut for x = x + 1;
Shortcuts can be used with all the other compound assignment operators. A new compound assignment operator has been available since C# 8: the null-coalescing compound assignment operator. This operator is discussed later in this chapter.
You may be wondering why there are two examples for the ++ increment operator. Placing the operator before the expression is known as a prefix; placing the operator after the expression is known as a postfix. Note that there is a difference in the way they behave.
The increment and decrement operators can act both as entire expressions and within expressions. When used by themselves, the effect of both the prefix and postfix versions is identical and corresponds to the statement x = x + 1. When used within larger expressions, the prefix operator increments the value of x before the expression is evaluated; in other words, x is incremented, and the new value is used as the result of the expression. Conversely, the postfix operator increments the value of x after the expression is evaluated. The result of the expression returns the original value of x. The following example uses the increment operator (++) as an example to demonstrate the difference between the prefix and postfix behavior (code file OperatorsSample/Program.cs):
void PrefixAndPostfix()
{
int x = 5;
if (++x == 6) // true – x is incremented to 6 before the evaluation
{
Console.WriteLine("This will execute");
}
if (x++ == 6) // true – x is incremented to 7 after the evaluation
{
Console.WriteLine($"The value of x is: {x}"); // x has the value 7
}
}
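To make the difference visible without an if statement, the following minimal sketch captures both the value produced by the expression and the value of the variable afterward. The class and method names (IncrementDemo, Prefix, Postfix) are hypothetical and not part of the book's sample code.

```csharp
using System;

// Hypothetical helper type, used only for illustration.
public static class IncrementDemo
{
    // Prefix: x is incremented first, and the expression yields the new value.
    public static (int Result, int After) Prefix(int x)
    {
        int result = ++x;
        return (result, x);
    }

    // Postfix: the expression yields the old value, and x is incremented afterward.
    public static (int Result, int After) Postfix(int x)
    {
        int result = x++;
        return (result, x);
    }
}
```

Calling IncrementDemo.Prefix(5) returns (6, 6), whereas IncrementDemo.Postfix(5) returns (5, 6): the variable ends up with the same value in both cases, and only the result of the expression differs.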
The following sections look at some of the commonly used and new operators that you will frequently use within your C# code.
The conditional-expression operator (?:), also known as the ternary operator, is a shorthand form of the if…else construction. It gets its name from the fact that it involves three operands. It allows you to evaluate a condition, returning one value if that condition is true or another value if it is false. The syntax is as follows:
condition ? true_value : false_value
Here, condition is the Boolean expression to be evaluated, true_value is the value that is returned if condition is true, and false_value is the value that is returned otherwise.
When used sparingly, the conditional-expression operator can add a dash of terseness to your programs. It is especially handy for providing one of a couple of arguments to a function that is being invoked. You can use it to quickly convert a Boolean value to a string value of true
or false
. It is also handy for displaying the correct singular or plural form of a word (code file OperatorsSample/Program.cs
):
int x = 1;
string s = x + " ";
s += (x == 1 ? "man": "men");
Console.WriteLine(s);
This code displays 1 man
if x
is equal to one but displays the correct plural form for any other number. Note, however, that if your output needs to be localized to different languages, you have to write more sophisticated routines to take into account the different grammatical rules of different languages. Read Chapter 22, “Localization,” for globalizing and localizing .NET applications.
Consider the following code:
byte b = byte.MaxValue;
b++;
Console.WriteLine(b);
The byte
data type can hold values only in the range 0 to 255. Assigning byte.MaxValue
to a byte results in 255. With 255, all bits of the 8 available bits in the byte are set: 11111111. Incrementing this value by one causes an overflow and results in 0.
To get exceptions in such cases, C# provides the checked
and unchecked
operators. If you mark a block of code as checked
, the CLR enforces overflow checking, throwing an OverflowException
if an overflow occurs. The following changes the preceding code to include the checked
operator (code file OperatorsSample/Program.cs
):
byte b = 255;
checked
{
b++;
}
Console.WriteLine(b);
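Instead of writing a checked block, you also can use the checked keyword in an expression. Because adding an int to a byte produces an int, the result needs a cast back to byte, and the checked cast detects the overflow:
b = checked((byte)(b + 3));
When you try to run this code, the OverflowException is thrown.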
You can enforce overflow checking for all unmarked code by adding the CheckForOverflowUnderflow
setting in the csproj
file:
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net5.0</TargetFramework>
<Nullable>enable</Nullable>
<CheckForOverflowUnderflow>true</CheckForOverflowUnderflow>
</PropertyGroup>
With the project configured for overflow checking, you can mark code that should not be checked using the unchecked operator.
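The following is a minimal sketch of how the two contexts behave at runtime; it assumes nothing beyond the OverflowException type mentioned earlier. The class and method names (OverflowDemo, IncrementOverflows, IncrementWrapping) are hypothetical.

```csharp
using System;

// Hypothetical helper type, used only for illustration.
public static class OverflowDemo
{
    // Returns true if incrementing the value overflows a byte in a checked context.
    public static bool IncrementOverflows(byte value)
    {
        try
        {
            checked { value++; }
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    // Wraps around silently, even if the project enables CheckForOverflowUnderflow.
    public static byte IncrementWrapping(byte value)
    {
        unchecked { value++; }
        return value;
    }
}
```

OverflowDemo.IncrementOverflows(255) returns true, while OverflowDemo.IncrementWrapping(255) returns 0. Marking the intent explicitly with checked or unchecked keeps the behavior independent of the project setting.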
You can use the is
and as
operators to determine whether an object is compatible with a specific type. This is useful with class hierarchies.
Let's assume a simple class hierarchy. The class DerivedClass
derives from the class BaseClass
. You can assign a variable of type DerivedClass
to a variable of type BaseClass
; all the members of the BaseClass
are available with the DerivedClass
. In the following example, an implicit conversion is taking place:
BaseClass baseClass = new();
DerivedClass derivedClass = new();
baseClass = derivedClass;
If you have a parameter of type BaseClass and want to assign it to a variable of type DerivedClass, implicit conversion is not possible. An instance of BaseClass or of any type that derives from it can be passed to the SomeAction method, so a conversion to DerivedClass will not necessarily succeed. Here, you can use the as operator. The as operator either returns a DerivedClass instance (if the object is of this type) or returns null:
public void SomeAction(BaseClass baseClass)
{
DerivedClass? derivedClass = baseClass as DerivedClass;
if (derivedClass != null)
{
// use the derivedClass variable
}
}
Instead of using the as
operator, you can use the is
operator. The is
operator returns true
if the conversion succeeds; otherwise, it returns false
. With the is
operator, a variable can be specified that is assigned if the is
operator returns true:
public void SomeAction(BaseClass baseClass)
{
if (baseClass is DerivedClass derivedClass)
{
// use the derivedClass variable
}
}
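To compare the two approaches side by side, here is a minimal, self-contained sketch. BaseClass and DerivedClass stand in for the hierarchy assumed in the text; the Name property and the TypeTestDemo class are hypothetical additions for illustration.

```csharp
#nullable enable
using System;

public class BaseClass { }

public class DerivedClass : BaseClass
{
    // Hypothetical member that only the derived type offers.
    public string Name => "derived";
}

public static class TypeTestDemo
{
    // as: yields null when the object is not a DerivedClass.
    public static string DescribeWithAs(BaseClass baseClass)
    {
        DerivedClass? derived = baseClass as DerivedClass;
        return derived != null ? derived.Name : "base only";
    }

    // is with a declaration: tests the type and assigns the variable in one step.
    public static string DescribeWithIs(BaseClass baseClass) =>
        baseClass is DerivedClass derived ? derived.Name : "base only";
}
```

Both methods return "derived" for a DerivedClass instance and "base only" for a plain BaseClass instance. The is form avoids the separate null check.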
You can determine the size (in bytes) required on the stack by a value type using the sizeof
operator (code file OperatorsSample/Program.cs
):
Console.WriteLine(sizeof(int));
This displays the number 4
because an int
is 4 bytes long.
You can also use the sizeof operator with structs if the struct contains only value types, for example, the Point struct shown here (code file OperatorsSample/Point.cs):
public readonly struct Point
{
public Point(int x, int y) => (X, Y) = (x, y);
public int X { get; }
public int Y { get; }
}
When you use sizeof
with custom types, you need to write the code within an unsafe code block (code file OperatorsSample/Program.cs
):
unsafe
{
Console.WriteLine(sizeof(Point));
}
The typeof
operator returns a System.Type
object representing a specified type. For example, typeof(string)
returns a Type
object representing the System.String
type. This is useful when you want to use reflection to find information about an object dynamically. For more information, see Chapter 12, “Reflection, Metadata, and Source Generators.”
The nameof
operator is of practical use when strings are needed as parameters that are already known at compile time. This operator accepts a symbol, property, or method and returns the name.
One use example is when the name of a variable is needed, as in checking a parameter for null, as shown here:
public void Method(object o)
{
if (o == null) throw new ArgumentNullException(nameof(o));
}
Of course, you could throw the exception by passing a string instead of using the nameof operator. However, using nameof prevents you from misspelling the parameter name when you pass it to the exception's constructor. Also, when you change the name of the parameter, you can easily miss changing the string passed to the ArgumentNullException constructor, whereas refactoring features update all occurrences where nameof is used. For comparison, this is the string-based version:
if (o == null) throw new ArgumentNullException("o");
Using the nameof
operator for the name of a variable is just one use case. You can also use it to get the name of a property—for example, for firing a change event (using the interface INotifyPropertyChanged
) in a property set
accessor and passing the name of a property.
public string FirstName
{
get => _firstName;
set
{
_firstName = value;
OnPropertyChanged(nameof(FirstName));
}
}
The nameof
operator can also be used to get the name of a method. This also works if the method is overloaded because all overloads result in the same value: the name of the method.
public void Method()
{
Log($"{nameof(Method)} called");
}
You use the indexer (brackets) for accessing arrays in Chapter 6. In the following code snippet, the indexer is used to access the third element of the array named arr1
by passing the number 2:
int[] arr1 = {1, 2, 3, 4};
int x = arr1[2]; // x == 3
Similarly to accessing elements of an array, the indexer is implemented with collection classes (discussed in Chapter 8, “Collections”).
The indexer doesn't require an integer within the brackets. Indexers can be defined with any type. The following code snippet creates a generic dictionary where the key is a string
and the value an int
. With dictionaries, the key can be used with the indexer. In the following sample, the string first
is passed to the indexer to set this element in the dictionary, and then the same string is passed to the indexer to retrieve this element:
Dictionary<string, int> dict = new();
dict["first"] = 1;
int x = dict["first"];
The null-coalescing operator (??) provides a shorthand mechanism to cater to the possibility of null values when working with nullable and reference types. The operator is placed between two operands: the first operand must be a nullable type or reference type, and the second operand must be of the same type as the first or of a type that is implicitly convertible to the type of the first operand. The null-coalescing operator evaluates as follows:
If the first operand is not null, then the overall expression has the value of the first operand.
If the first operand is null, then the overall expression has the value of the second operand.
Here's an example:
int? a = null;
int b;
b = a ?? 10; // b has the value 10
a = 3;
b = a ?? 10; // b has the value 3
If the second operand cannot be implicitly converted to the type of the first operand, a compile-time error is generated.
The null-coalescing operator is not only important with nullable types but also with reference types. In the following code snippet, the property Val
returns the value of the _val
variable only if it is not null. In case it is null, a new instance of MyClass
is created, assigned to the _val
variable, and finally returned from the property. This second part of the expression within the get
accessor only happens when the variable _val
is null:
private MyClass _val;
public MyClass Val
{
get => _val ?? (_val = new MyClass());
}
Using the null-coalescing assignment operator, the preceding code can now be simplified to create a new MyClass
and assign it to _val
if _val
is null
:
private MyClass _val;
public MyClass Val
{
get => _val ??= new MyClass();
}
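A minimal sketch can confirm that the right side of ??= runs only once. The Holder class, its Creations counter, and the Create method are hypothetical and exist only to make the behavior observable; MyClass is an empty placeholder type.

```csharp
#nullable enable
using System;

public class MyClass { }

// Hypothetical container that counts how often a MyClass instance is created.
public class Holder
{
    private MyClass? _val;

    public int Creations { get; private set; }

    // Create() is invoked only while _val is still null.
    public MyClass Val => _val ??= Create();

    private MyClass Create()
    {
        Creations++;
        return new MyClass();
    }
}
```

Reading the Val property twice on the same Holder returns the same instance, and Creations stays at 1. Note that this pattern is not thread-safe by itself.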
The null-conditional operator is a feature of C# that reduces the number of code lines. A great number of code lines in production code verify null conditions. Before accessing members of a variable that is passed as a method parameter, the variable needs to be checked to determine whether it has a value of null. Otherwise, a NullReferenceException would be thrown. A .NET design guideline specifies that code should never throw exceptions of these types and should always check for null conditions. However, such checks could be missed easily. This code snippet verifies whether the passed parameter p is not null. In case it is null, the method just returns without continuing:
public void ShowPerson(Person? p)
{
if (p is null) return;
string firstName = p.FirstName;
//…
}
When you use the null-conditional operator to access the FirstName property (p?.FirstName) and p is null, only null is returned without continuing to the right side of the expression (code file OperatorsSample/Program.cs):
public void ShowPerson(Person? p)
{
string firstName = p?.FirstName;
//…
}
When a property of an int
type is accessed using the null-conditional operator, the result cannot be directly assigned to an int
type because the result can be null
. One option to resolve this is to assign the result to a nullable int
:
int? age = p?.Age;
Of course, you can also solve this issue by using the null-coalescing operator and defining another result (for example, 0
) in case the result of the left side is null
:
int age1 = p?.Age ?? 0;
You also can combine multiple null-conditional operators. In the following example, the HomeAddress property of a Person object is accessed, and this property in turn defines a City property. Null checks need to be done for the Person object and, if it is not null, also for the result of the HomeAddress property:
Person p = GetPerson();
string city = null;
if (p != null && p.HomeAddress != null)
{
city = p.HomeAddress.City;
}
When you use the null-conditional operator, the code becomes much simpler:
string city = p?.HomeAddress?.City;
You can also use the null-conditional operator with arrays. With the following code snippet, a NullReferenceException
is thrown using the index operator to access an element of an array variable that is null
:
int[] arr = null;
int x1 = arr[0];
Of course, traditional null checks could be done to avoid this exceptional condition. A simpler version uses ?[0]
to access the first element of the array. In case the result is null
, the null-coalescing operator returns the value 0 for the x1
variable:
int x1 = arr?[0] ?? 0;
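The null-conditional and null-coalescing operators are often combined. The following sketch collects the patterns from this section in one place; the Person and Address classes are simplified assumptions (the book's own types may differ), and NullConditionalDemo is a hypothetical name.

```csharp
#nullable enable
using System;

// Simplified stand-ins for the types used in the text.
public class Address
{
    public string? City { get; set; }
}

public class Person
{
    public int Age { get; set; }
    public Address? HomeAddress { get; set; }
}

public static class NullConditionalDemo
{
    // Any null link in the chain short-circuits to null, which ?? replaces.
    public static string CityOrUnknown(Person? p) => p?.HomeAddress?.City ?? "unknown";

    // p?.Age has the type int?, so ?? supplies a non-nullable fallback.
    public static int AgeOrZero(Person? p) => p?.Age ?? 0;

    // ?[] guards against a null array, not against an index that is out of range.
    public static int FirstOrZero(int[]? arr) => arr?[0] ?? 0;
}
```

NullConditionalDemo.CityOrUnknown(null) returns "unknown", and NullConditionalDemo.FirstOrZero(null) returns 0. Passing an empty array to FirstOrZero still throws an IndexOutOfRangeException, because the array itself is not null.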
Working with binary values historically has been an important concept to understand when learning programming because the computer works with 0s and 1s. Many people who are newer to programming may have missed learning this because they start to learn programming with Blocks, Scratch, Python, and possibly JavaScript. If you are already fluent with 0s and 1s, this section might still help you as a refresher.
First, let's start with simple calculations using binary operators. The method SimpleCalculations first declares and initializes the variables binary1 and binary2 with binary values, using the binary literal and digit separators. Using the & operator, the two values are combined with the binary AND operator and written to the variable binaryAnd. In the following code, the | operator is used to create the binaryOR variable, the ^ operator for the binaryXOR variable, and the ~ operator for the reverse1 variable (code file BinaryCalculations/Program.cs):
void SimpleCalculations()
{
Console.WriteLine(nameof(SimpleCalculations));
uint binary1 = 0b1111_0000_1100_0011_1110_0001_0001_1000;
uint binary2 = 0b0000_1111_1100_0011_0101_1010_1110_0111;
uint binaryAnd = binary1 & binary2;
DisplayBits("AND", binaryAnd, binary1, binary2);
uint binaryOR = binary1 | binary2;
DisplayBits("OR", binaryOR, binary1, binary2);
uint binaryXOR = binary1 ^ binary2;
DisplayBits("XOR", binaryXOR, binary1, binary2);
uint reverse1 = ~binary1;
DisplayBits("NOT", reverse1, binary1);
Console.WriteLine();
}
To display uint and int variables in a binary form, the extension method ToBinaryString is created. Convert.ToString offers an overload with two int parameters, where the second int value is the toBase parameter. Using this, you can format the output string by passing the value 2 (for binary), 8 (for octal), 10 (for decimal), or 16 (for hexadecimal). By default, if a binary value starts with 0 values, these values are ignored and not printed. The PadLeft method fills up these 0 values in the string. The number of string characters needed is calculated by the sizeof operator and a left shift of three bits. The sizeof operator returns the number of bytes for the specified type, as discussed earlier in this chapter. For displaying the bits, the number of bytes needs to be multiplied by 8, which is the same as shifting three bits to the left. Another extension method is AddSeparators, which adds _ separators after every four digits using LINQ methods (code file BinaryCalculations/BinaryExtensions.cs):
public static class BinaryExtensions
{
public static string ToBinaryString(this uint number) =>
Convert.ToString(number, toBase: 2).PadLeft(sizeof(uint) << 3, '0');
public static string ToBinaryString(this int number) =>
Convert.ToString(number, toBase: 2).PadLeft(sizeof(int) << 3, '0');
public static string AddSeparators(this string number) =>
string.Join('_',
Enumerable.Range(0, number.Length / 4)
.Select(i => number.Substring(i * 4, 4)).ToArray());
}
The method DisplayBits
, which is invoked from the previously shown SimpleCalculations
method, makes use of the ToBinaryString
and AddSeparators
extension methods. Here, the operands used for the operation are displayed, as well as the result (code file BinaryCalculations/Program.cs
):
void DisplayBits(string title, uint result, uint left,
uint? right = null)
{
Console.WriteLine(title);
Console.WriteLine(left.ToBinaryString().AddSeparators());
if (right.HasValue)
{
Console.WriteLine(right.Value.ToBinaryString().AddSeparators());
}
Console.WriteLine(result.ToBinaryString().AddSeparators());
Console.WriteLine();
}
When you run the application, you can see the following output using the binary &
operator. With this operator, the resulting bits are only 1 when both input values are also 1:
AND
1111_0000_1100_0011_1110_0001_0001_1000
0000_1111_1100_0011_0101_1010_1110_0111
0000_0000_1100_0011_0100_0000_0000_0000
When you apply the binary | operator, the result bit is set (1) if at least one of the input bits is set:
OR
1111_0000_1100_0011_1110_0001_0001_1000
0000_1111_1100_0011_0101_1010_1110_0111
1111_1111_1100_0011_1111_1011_1111_1111
With the ^
operator, the result is set if just one of the original bits is set, but not both:
XOR
1111_0000_1100_0011_1110_0001_0001_1000
0000_1111_1100_0011_0101_1010_1110_0111
1111_1111_0000_0000_1011_1011_1111_1111
And finally, with the ~
operator, the result is the negation of the original:
NOT
1111_0000_1100_0011_1110_0001_0001_1000
0000_1111_0011_1100_0001_1110_1110_0111
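With 32-bit values, the individual bits can be hard to follow by eye. The following sketch applies the same four operators to two four-bit patterns so that each result can be checked by hand. The BitwiseDemo class and its members are hypothetical names.

```csharp
using System;

// Hypothetical helper type, used only for illustration.
public static class BitwiseDemo
{
    public const uint A = 0b1100;
    public const uint B = 0b1010;

    public static uint And() => A & B;            // 0b1000: bits set in both operands
    public static uint Or() => A | B;             // 0b1110: bits set in at least one operand
    public static uint Xor() => A ^ B;            // 0b0110: bits set in exactly one operand
    public static uint NotLow4() => ~A & 0b1111;  // 0b0011: inverted, then masked to four bits
}
```

The mask in NotLow4 is needed because ~ inverts all 32 bits of the uint, including the 28 leading zeros.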
As you've already seen in the previous sample, shifting three bits to the left is a multiplication by 8. A shift by one bit is a multiplication by 2. A shift can be cheaper than invoking the multiply operator when you need to multiply by 2, 4, 8, 16, 32, and so on, although optimizing compilers often make this substitution themselves.
The following code snippet sets one bit in the variable s1
, and in the for
loop the bit always shifts by one bit (code file BinaryCalculations/Program.cs
):
void ShiftingBits()
{
Console.WriteLine(nameof(ShiftingBits));
ushort s1 = 0b01;
Console.WriteLine($"{"Binary",16} {"Decimal",8} {"Hex",6}");
for (int i = 0; i < 16; i++)
{
Console.WriteLine($"{s1.ToBinaryString(),16} {s1,8} {s1,6:X}");
s1 = (ushort)(s1 << 1);
}
Console.WriteLine();
}
In the program output, you can see binary, decimal, and hexadecimal values with the loop:
Binary Decimal Hex
0000000000000001 1 1
0000000000000010 2 2
0000000000000100 4 4
0000000000001000 8 8
0000000000010000 16 10
0000000000100000 32 20
0000000001000000 64 40
0000000010000000 128 80
0000000100000000 256 100
0000001000000000 512 200
0000010000000000 1024 400
0000100000000000 2048 800
0001000000000000 4096 1000
0010000000000000 8192 2000
0100000000000000 16384 4000
1000000000000000 32768 8000
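The relationship between shifting and multiplying can be sketched in a few lines. The ShiftDemo class and its method names are hypothetical; the second method mirrors the cast used in the loop, made explicit with unchecked so that it behaves the same regardless of project settings.

```csharp
using System;

// Hypothetical helper type, used only for illustration.
public static class ShiftDemo
{
    // Shifting left by n bits multiplies by 2 to the power of n (as long as no bits are lost).
    public static int TimesPowerOfTwo(int value, int power) => value << power;

    // A ushort is shifted using int arithmetic; the cast back keeps only the lower 16 bits.
    public static ushort ShiftLeftOnce(ushort value) => unchecked((ushort)(value << 1));
}
```

ShiftDemo.TimesPowerOfTwo(5, 3) returns 40, the same as 5 * 8. ShiftDemo.ShiftLeftOnce(0x8000) returns 0, because the only set bit is shifted out of the 16-bit range.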
One important thing to remember when working with binary numbers is that when using signed types, such as int
, long
, and short
, the leftmost bit is used to represent the sign. When you use an int
, the highest number available is 2147483647
—the positive number of 31 bits or 0x7FFF_FFFF
. With a uint
, the highest number available is 4294967295
or 0xFFFF_FFFF
. This represents the positive number of 32 bits. With the int
, the other half of the number range is used for negative numbers.
To understand how negative numbers are represented, the following code snippet initializes the maxNumber
variable to the highest positive number that fits into 15 bits using short.MaxValue
. Then, in a for
loop, the variable is incremented three times. In the results, binary, decimal, and hexadecimal values are shown (code file BinaryCalculations/Program.cs
):
void SignedNumbers()
{
Console.WriteLine(nameof(SignedNumbers));
void DisplayNumber(string title, short x) =>
Console.WriteLine($"{title,-11} " +
$"bin: {x.ToBinaryString().AddSeparators()}, " +
$"dec: {x,6}, hex: {x,4:X}");
short maxNumber = short.MaxValue;
DisplayNumber("max short", maxNumber);
for (int i = 0; i < 3; i++)
{
maxNumber++;
DisplayNumber($"added {i + 1}", maxNumber);
}
Console.WriteLine();
//…
}
With the output of the application, you can see all the bits except the sign bit are set to achieve the maximum short value. The output shows the same value in different formats: binary, decimal, and hexadecimal. Adding 1 to the first output results in an overflow of the short type, which sets the sign bit while all other bits are 0. This is the lowest (most negative) value for the short type. After this result, two more increments are done:
max short bin: 0111_1111_1111_1111, dec: 32767, hex: 7FFF
added 1 bin: 1000_0000_0000_0000, dec: -32768, hex: 8000
added 2 bin: 1000_0000_0000_0001, dec: -32767, hex: 8001
added 3 bin: 1000_0000_0000_0010, dec: -32766, hex: 8002
With the next code snippet, the variable zero
is initialized to 0
. In the for
loop, this variable is decremented three times:
short zero = 0;
DisplayNumber("zero", zero);
for (int i = 0; i < 3; i++)
{
zero--;
DisplayNumber($"subtracted {i + 1}", zero);
}
Console.WriteLine();
With the output, you can see 0 is represented with all the bits not set. Doing a decrement results in decimal -1
, which is all the bits set, including the sign bit:
zero bin: 0000_0000_0000_0000, dec: 0, hex: 0
subtracted 1 bin: 1111_1111_1111_1111, dec: -1, hex: FFFF
subtracted 2 bin: 1111_1111_1111_1110, dec: -2, hex: FFFE
subtracted 3 bin: 1111_1111_1111_1101, dec: -3, hex: FFFD
Next, start with the lowest (most negative) number for a short. The number is incremented three times:
short minNumber = short.MinValue;
DisplayNumber("min number", minNumber);
for (int i = 0; i < 3; i++)
{
minNumber++;
DisplayNumber($"added {i + 1}", minNumber);
}
Console.WriteLine();
The most negative number was already shown earlier when the highest positive number overflowed; it is the value of short.MinValue. This number is then incremented three times:
min number bin: 1000_0000_0000_0000, dec: -32768, hex: 8000
added 1 bin: 1000_0000_0000_0001, dec: -32767, hex: 8001
added 2 bin: 1000_0000_0000_0010, dec: -32766, hex: 8002
added 3 bin: 1000_0000_0000_0011, dec: -32765, hex: 8003
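The outputs in this section follow the two's complement representation: a number is negated by inverting all its bits and adding 1. The following minimal sketch shows that rule for short values. The TwosComplementDemo class is hypothetical, and Bits is a simplified stand-in for the ToBinaryString extension method, written here with Convert.ToString and PadLeft.

```csharp
using System;

// Hypothetical helper type, used only for illustration.
public static class TwosComplementDemo
{
    // Negates a value by inverting all bits and adding one.
    public static short Negate(short x) => unchecked((short)(~x + 1));

    // Formats the 16 bits of a short, padded with leading zeros.
    public static string Bits(short x) => Convert.ToString(x, 2).PadLeft(16, '0');
}
```

TwosComplementDemo.Negate(1) returns -1, whose bit pattern is 1111111111111111. Negating short.MinValue yields short.MinValue again, because the positive counterpart 32768 does not fit into a short.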
The Intermediate Language (IL) enforces strong type safety upon its code. Strong typing enables many of the services provided by .NET, including security and language interoperability. As you would expect from a language compiled into IL, C# is also strongly typed. Among other things, this means that data types are not always seamlessly interchangeable. This section looks at conversions between primitive types.
Often, you need to convert data from one type to another. Consider the following code:
byte value1 = 10;
byte value2 = 23;
byte total = value1 + value2;
Console.WriteLine(total);
When you attempt to compile these lines, you get the following error message:
Cannot implicitly convert type 'int' to 'byte'
The problem here is that when you add 2 bytes together, the result is returned as an int
, not another byte
. This is because a byte
can contain only 8 bits of data, so adding 2 bytes together could easily result in a value that cannot be stored in a single byte
. If you want to store this result in a byte
variable, you have to convert it back to a byte
. The following sections discuss two conversion mechanisms supported by C#—implicit and explicit.
Conversion between types can normally be achieved automatically (implicitly) only if you can guarantee that the value is not changed in any way. This is why the previous code failed; by attempting a conversion from an int
to a byte
, you were potentially losing 3 bytes of data. The compiler won't let you do that unless you explicitly specify that's what you want to do. If you store the result in a long
instead of a byte
, however, you will have no problems:
byte value1 = 10;
byte value2 = 23;
long total = value1 + value2; // this will compile fine
Console.WriteLine(total);
Your program has compiled with no errors at this point because a long holds more bytes of data than a byte, so there is no risk of data being lost. In these circumstances, the compiler is happy to make the conversion for you without you needing to ask for it explicitly. As you would expect, you can perform implicit conversions only from a smaller integer type to a larger one, not from larger to smaller. You can also convert between integers and floating-point values; however, the rules are slightly different here. Though you can convert between types of the same size, such as int/uint to float and long/ulong to double, you can also convert from long/ulong to float. You might lose 4 bytes of data doing this, but it only means that the value of the float you receive will be less precise than if you had used a double; the compiler regards this as an acceptable possible error because the magnitude of the value is not affected. You can also assign an unsigned variable to a signed variable as long as the value limits of the unsigned type fit between the limits of the signed variable.
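The loss of precision when converting an integer to a float can be observed with a round trip. This is a minimal sketch with hypothetical names (ImplicitConversionDemo, ToFloat, ToDouble, RoundTrips); it relies on the fact that a float stores 24 significant bits.

```csharp
using System;

// Hypothetical helper type, used only for illustration.
public static class ImplicitConversionDemo
{
    // Implicit: every long fits the range of a float, but not necessarily its precision.
    public static float ToFloat(long value) => value;

    // Implicit: a double represents every int exactly.
    public static double ToDouble(int value) => value;

    // True if the value survives the round trip through float unchanged.
    public static bool RoundTrips(long value) => (long)ToFloat(value) == value;
}
```

ImplicitConversionDemo.RoundTrips(16_777_216) is true, but RoundTrips(16_777_217) is false: the second value needs 25 significant bits, so the float holds the nearest representable number instead. The magnitude is preserved, which is why the compiler accepts the conversion implicitly.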
Nullable value types introduce additional considerations when you're implicitly converting value types:
int? implicitly converts to long?, float?, double?, and decimal?.
int implicitly converts to long?, float?, double?, and decimal?.
A nullable type does not implicitly convert to a non-nullable type, because the nullable type may have the value null, which cannot be represented by a non-nullable type.
Many conversions cannot be implicitly made between types, and the compiler returns an error if any are attempted. The following are some of the conversions that cannot be made implicitly:
int to short: Data loss is possible.
int to uint: Data loss is possible.
uint to int: Data loss is possible.
float to int: Everything is lost after the decimal point.
Any numeric type to char: Data loss is possible.
decimal to any other numeric type: The decimal type is internally structured differently from both integers and floating-point numbers.
int? to int: The nullable type may have the value null.
However, you can explicitly carry out such conversions using casts. When you cast one type to another, you deliberately force the compiler to make the conversion. A cast looks like this:
long val = 30000;
int i = (int)val; // A valid cast. The maximum int is 2147483647
You indicate the type to which you are casting by placing its name in parentheses before the value to be converted.
Casting can be a dangerous operation to undertake. Even a simple cast from a long
to an int
can cause problems if the value of the original long
is greater than the maximum value of an int
:
long val = 3000000000;
int i = (int)val; // An invalid cast. The maximum int is 2147483647
In this case, you get neither an error nor the result you expect. If you run this code and output the value stored in i
, this is what you get:
-1294967296
It is good practice to assume that an explicit cast does not return the results you expect. As shown earlier, C# provides a checked
operator that you can use to test whether an operation causes an arithmetic overflow. You can use the checked
operator to confirm that a cast is safe and to force the runtime to throw an overflow exception if it is not:
long val = 3000000000;
int i = checked((int)val);
Bearing in mind that all explicit casts are potentially unsafe, make sure you include code in your application to deal with possible failures of the casts. Chapter 10, “Errors and Exceptions,” introduces structured exception handling using the try
and catch
statements.
Using casts, you can convert most primitive data types from one type to another; for example, in the following code, the value 0.5
is added to price
, and the total is cast to an int
:
double price = 25.30;
int approximatePrice = (int)(price + 0.5);
This gives the price rounded to the nearest dollar. However, in this conversion, data is lost—namely, everything after the decimal point. Therefore, such a conversion should never be used if you want to continue to do more calculations using this modified price value. However, it is useful if you want to output the approximate value of a completed or partially completed calculation—if you don't want to bother the user with a lot of figures after the decimal point.
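Adding 0.5 and truncating rounds correctly only for non-negative values, because the cast truncates toward zero. The following sketch contrasts it with Math.Round and an explicit midpoint rule. The RoundingDemo class and its method names are hypothetical.

```csharp
using System;

// Hypothetical helper type, used only for illustration.
public static class RoundingDemo
{
    // The technique from the text: add 0.5, then truncate toward zero.
    public static int AddHalfAndTruncate(double price) => (int)(price + 0.5);

    // Math.Round with an explicit midpoint rule also handles negative values.
    public static int RoundAwayFromZero(double price) =>
        (int)Math.Round(price, MidpointRounding.AwayFromZero);
}
```

For 25.30, both methods return 25. For -25.70, AddHalfAndTruncate returns -25 (the sum -25.2 is truncated toward zero), whereas RoundAwayFromZero returns -26.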
This example shows what happens if you convert an unsigned integer into a char
:
ushort c = 43;
char symbol = (char)c;
Console.WriteLine(symbol);
The output is the character that has an ASCII number of 43, which is the + sign. This will work for any kind of conversion you want between the numeric types (including char
), such as converting a decimal
into a char
, or vice versa.
Converting between value types is not restricted to isolated variables, as you have seen. You can convert an array element of type double
to a struct member variable of type int
:
struct ItemDetails
{
public string Description;
public int ApproxPrice;
}
//…
double[] Prices = { 25.30, 26.20, 27.40, 30.00 };
ItemDetails id;
id.Description = "Hello there.";
id.ApproxPrice = (int)(Prices[0] + 0.5);
To convert a nullable type to a non-nullable type or another nullable type where data loss may occur, you must use an explicit cast. This is true even when converting between elements with the same basic underlying type—for example, int?
to int
or float?
to float
. This is because the nullable type may have the value null
, which cannot be represented by the non-nullable type. As long as an explicit cast between two equivalent non-nullable types is possible, so is the explicit cast between nullable types. However, when casting from a nullable type to a non-nullable type and the variable has the value null
, an InvalidOperationException
is thrown. Here is an example:
int? a = null;
int b = (int)a; // Will throw exception
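If a null value is a legitimate input, you can avoid the exception by supplying a fallback instead of casting. The following minimal sketch shows the failing cast next to two alternatives; the NullableCastDemo class and its method names are hypothetical.

```csharp
using System;

// Hypothetical helper type, used only for illustration.
public static class NullableCastDemo
{
    // Returns true if the explicit cast from int? to int throws.
    public static bool CastThrows(int? a)
    {
        try
        {
            _ = (int)a; // throws InvalidOperationException when a is null
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // The null-coalescing operator supplies a fallback of your choice.
    public static int WithFallback(int? a) => a ?? -1;

    // GetValueOrDefault returns default(int), which is 0, for null.
    public static int WithDefault(int? a) => a.GetValueOrDefault();
}
```

NullableCastDemo.CastThrows(null) returns true, while WithFallback(null) returns -1 and WithDefault(null) returns 0 without throwing.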
By using explicit casts and a bit of care and attention, you can convert any instance of a simple value type to almost any other. However, there are limitations on what you can do with explicit type conversions—as far as value types are concerned, you can only convert to and from the numeric and char
types and enum
types. You cannot directly cast Booleans to any other type or vice versa.
If you need to convert between numeric and string, you can use methods provided in the .NET class library. The Object
class implements a ToString
method, which has been overridden in all the .NET predefined types and which returns a string representation of the object:
int i = 10;
string s = i.ToString();
Similarly, if you need to parse a string to retrieve a numeric or Boolean value, you can use the Parse
method supported by all the predefined value types:
string s = "100";
int i = int.Parse(s);
Console.WriteLine(i + 50); // Add 50 to prove it is really an int
Note that Parse
registers an error by throwing an exception if it is unable to convert the string (for example, if you try to convert the string Hello
to an integer). Again, exceptions are covered in Chapter 10. Instead of using the Parse
method, you can also use TryParse
, which doesn't throw an exception in case of an error, but returns true
if it succeeds.
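The difference between the two methods can be sketched as follows. The ParseDemo class and its method names are hypothetical; passing CultureInfo.InvariantCulture is a choice made here to keep the result independent of the machine's regional settings.

```csharp
using System;
using System.Globalization;

// Hypothetical helper type, used only for illustration.
public static class ParseDemo
{
    // TryParse reports failure through its return value instead of an exception.
    public static int ParseOrDefault(string text, int fallback) =>
        int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out int value)
            ? value
            : fallback;

    // Parse throws a FormatException when the text is not a number.
    public static bool ParseThrows(string text)
    {
        try
        {
            int.Parse(text, CultureInfo.InvariantCulture);
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }
}
```

ParseDemo.ParseOrDefault("100", 0) returns 100, and ParseDemo.ParseOrDefault("Hello", -1) returns the fallback -1. TryParse is usually the better fit for user input, where invalid text is expected rather than exceptional.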
Chapter 2 explains that all types—both the simple predefined types, such as int
and char
, and the complex types, such as classes and structs—derive from the object
type. This means you can treat even literal values as though they are objects:
string s = 10.ToString();
However, you also saw that C# data types are divided into value types, which are allocated on the stack, and reference types, which are allocated on the managed heap. How does this work with the capability to call methods on an int
, if the int
is nothing more than a 4-byte value on the stack?
C# achieves this through a bit of magic called boxing. Boxing and its counterpart, unboxing, enable you to convert value types to reference types and then back to value types. I include this topic in the section on casting because this is essentially what you are doing—you are casting your value to the object
type. Boxing is the term used to describe the transformation of a value type to a reference type. Basically, the runtime creates a temporary reference-type box for the object on the heap.
This conversion can occur implicitly, as in the preceding example, but you can also perform it explicitly:
int myIntNumber = 20;
object myObject = myIntNumber;
Unboxing is the term used to describe the reverse process, whereby the value of a previously boxed value type is cast back to a value type. Here, I use the term cast because this has to be done explicitly. The syntax is similar to explicit type conversions already described:
int myIntNumber = 20;
object myObject = myIntNumber; // Box the int
int mySecondNumber = (int)myObject; // Unbox it back into an int
A variable can be unboxed only if it has been boxed. If you execute the last line when myObject
is not a boxed int
, you get a runtime exception.
One word of warning: When unboxing, you have to be careful that the receiving value is of the same type as the value that was boxed. Even if the resulting type has enough room to store all the bytes in the value being unboxed, an InvalidCastException
is thrown. You can avoid this by first unboxing to the original type and then casting to the new type, as shown here:
int myIntNumber = 42;
object myObject = (object)myIntNumber;
long myLongNumber = (long)(int)myObject;
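The following sketch shows the failure case next to a safe alternative. It assumes nothing beyond the base class library; the variable names mirror the snippet above.

```csharp
using System;

int myIntNumber = 42;
object myObject = myIntNumber;      // box the int

try
{
    long wrong = (long)myObject;    // unboxing to a different type fails at runtime
    Console.WriteLine(wrong);
}
catch (InvalidCastException)
{
    Console.WriteLine("A boxed int cannot be unboxed directly to a long");
}

// Pattern matching checks the boxed type before unboxing
if (myObject is int unboxed)
{
    long myLongNumber = unboxed;    // implicit int-to-long conversion after unboxing
    Console.WriteLine(myLongNumber);
}
```

The pattern-matching form is one option when the boxed type is not known for certain; when it is known, the double cast shown earlier does the same job.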
Instead of invoking methods, the code can become more readable using operators. Just compare these two code lines to add two vectors:
vect3 = vect1 + vect2;
vect3 = vect1.Add(vect2);
With predefined number types, you can use +
, -
, /
, *
, and %
operators, and you can also concatenate strings with the + operator. Using such operators is not only possible with predefined types, but also with custom types as long as they make sense with the types. What would a +
operator used with two Person
objects do?
You can overload the following operators:
OPERATORS | DESCRIPTION |
---|---|
+x , -x , !x , ~x , ++ , -- , true , false | These are unary operators that can be overloaded. |
x + y , x - y , x * y , x / y , x % y , x & y , x \| y , x ^ y , x << y , x >> y , x == y , x != y , x < y , x > y , x <= y , x >= y | These are binary operators that can be overloaded. |
a[i] , a?[i] | Element access cannot be overloaded with an operator overload, but you can create an indexer, which is shown later in this chapter. |
(T)x | Instead of using an operator overload, you can use the cast to create a user-defined conversion, which is shown later in this chapter as well. |
To understand how to overload operators, it's useful to think about what happens when the compiler encounters an operator. Using the addition operator (+
) as an example, suppose that the compiler processes the following lines of code:
int x = 1;
int y = 2;
long z = x + y;
The compiler identifies that it needs to add two integers and assign the result to a long
. The expression x + y
is just an intuitive and convenient syntax for calling a method that adds two numbers. The method takes two parameters, x
and y
, and returns their sum. Therefore, the compiler does the same thing it does for any method call: it looks for the best matching overload of the addition operator based on the parameter types—in this case, one that takes two integers. As with normal overloaded methods, the desired return type does not influence the compiler's choice as to which version of a method it calls. As it happens, the overload called in the example takes two int
parameters and returns an int
; this return value is subsequently converted to a long
. This can result in an overflow if the sum of the two int values doesn't fit into an int, even though the result is assigned to a long.
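A short sketch of this pitfall, assuming the default compiler settings (arithmetic is unchecked unless the project enables overflow checking):

```csharp
using System;

int x = int.MaxValue;
int y = 1;

// The int overload of + is chosen; in the default unchecked context the sum wraps around
long wrapped = x + y;

// Converting one operand first selects the long overload of +
long correct = (long)x + y;

Console.WriteLine(wrapped);  // -2147483648
Console.WriteLine(correct);  // 2147483648
```

Casting a single operand is enough, because the compiler then implicitly converts the other operand to long to match the overload.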
The next lines cause the compiler to use a different overload of the addition operator:
double d1 = 4.0;
double d2 = d1 + x;
In this instance, the parameters are a double
and an int
, but there is no overload of the addition operator that takes this combination of parameters. Instead, the compiler identifies the best matching overload of the addition operator as being the version that takes two double
s as its parameters, and it implicitly casts the int
to a double
. Adding two double
s requires a different process from adding two integers. Floating-point numbers are stored as a mantissa and an exponent. Adding them involves bit-shifting the mantissa of one of the double
s so that the two exponents have the same value, adding the mantissas, and then shifting the mantissa of the result and adjusting its exponent to maintain the highest possible accuracy in the answer.
Now you are in a position to see what happens if the compiler finds something like this:
Vector vect1, vect2, vect3;
// initialize vect1 and vect2
vect3 = vect1 + vect2;
vect1 = vect1 * 2;
Here, Vector
is the struct, which is defined in the following section. The compiler sees that it needs to add two Vector
instances, vect1
and vect2
, together. It looks for an overload of the addition operator, which takes two Vector
instances as its parameters.
If the compiler finds an appropriate overload, it calls up the implementation of that operator. If it cannot find one, it checks whether there is any other overload for +
that it can use as a best match—perhaps something with two parameters of other data types that can be implicitly converted to Vector
instances. If the compiler cannot find a suitable overload, it raises a compilation error, just as it would if it could not find an appropriate overload for any other method call.
This section demonstrates operator overloading through developing a struct named Vector
that represents a three-dimensional vector. The 3D vector is just a set of three numbers (doubles) that tell you how far something is moving. The variables representing the numbers are called X
, Y
, and Z
: the X
tells you how far something moves east, Y
tells you how far it moves north, and Z
tells you how far it moves upward. Combine the three numbers and you get the total movement.
You can add or multiply vectors by other vectors or by numbers. Incidentally, in this context, we use the term scalar, which is math-speak for a simple number—in C# terms that is just a double
. The significance of addition should be clear. If you move first by the vector (3.0, 3.0, 1.0)
and then move by the vector (2.0, -4.0, -4.0)
, the total amount you have moved can be determined by adding the two vectors. Adding vectors means adding each component individually, so you get (5.0, -1.0, -3.0)
. In this context, mathematicians write c=a+b
, where a
and b
are the vectors and c
is the resulting vector. You want to be able to use the Vector
struct the same way.
The following is the definition for Vector
—containing the read-only public fields, constructors, and a ToString
override so you can easily view the contents of a Vector
. Operator overloads are added next (code file OperatorOverloadingSample/Vector.cs
):
readonly struct Vector
{
public Vector(double x, double y, double z) => (X, Y, Z) = (x, y, z);
public Vector(Vector v) => (X, Y, Z) = (v.X, v.Y, v.Z);
public readonly double X;
public readonly double Y;
public readonly double Z;
public override string ToString() => $"( {X}, {Y}, {Z} )";
}
This example has two constructors that require specifying the initial value of the vector, either by passing in the values of each component or by supplying another Vector
whose value can be copied. Constructors like the second one, which takes a single Vector
argument, are often termed copy constructors because they effectively enable you to initialize a class or struct instance by copying another instance.
Here is the interesting part of the Vector
struct—the operator overload that provides support for the addition operator:
public static Vector operator +(Vector left, Vector right) =>
new Vector(left.X + right.X, left.Y + right.Y, left.Z + right.Z);
The operator overload is declared in much the same way as a static method, except that the operator
keyword tells the compiler it is actually an operator overload you are defining. The operator
keyword is followed by the actual symbol for the relevant operator, in this case the addition operator (+
). The return type is whatever type you get when you use this operator. Adding two vectors results in a vector; therefore, the return type is also a Vector
. For this particular overload of the addition operator, the return type is the same as the containing struct, but that is not necessarily the case, as you see later in this example. The two parameters are the things you are operating on. For binary operators (those that take two parameters), such as the addition and subtraction operators, the first parameter is the value on the left of the operator, and the second parameter is the value on the right.
The implementation of this operator returns a new Vector
that is initialized using the X
, Y
, and Z
fields from the left
and right
variables.
C# requires that all operator overloads be declared as public
and static
, which means they are associated with their class or struct, not with a particular instance. Because of this, the body of the operator overload has no access to nonstatic class members or the this
identifier. This is fine because the parameters provide all the input data the operator needs to know to perform its task.
Now all you need to do is write some simple code to test the Vector
struct (code file OperatorOverloadingSample/Program.cs
):
Vector vect1, vect2, vect3;
vect1 = new(3.0, 3.0, 1.0);
vect2 = new(2.0, -4.0, -4.0);
vect3 = vect1 + vect2;
Console.WriteLine($"vect1 = {vect1}");
Console.WriteLine($"vect2 = {vect2}");
Console.WriteLine($"vect3 = {vect3}");
Compiling and running this code returns the following result:
vect1 = ( 3, 3, 1 )
vect2 = ( 2, -4, -4 )
vect3 = ( 5, -1, -3 )
Just by implementing the + operator, you can use the compound assignment operator +=
. Let's add vect2
to the existing value of vect3
:
vect3 += vect2;
Console.WriteLine($"vect3 = {vect3}");
This compiles and runs, resulting in the following:
vect3 = ( 7, -5, -7 )
In addition to adding vectors, you can multiply and subtract them and compare their values. These operators can be implemented in the same way as the +
operator. What might be more interesting is multiplying a vector by a double. With the following three operator overloads, a vector is multiplied by a vector, a vector is multiplied by a double, and a double is multiplied by a vector. You need to implement separate operators depending on what's on the left and right sides, but you can reuse implementations. The operator overload where the vector is on the left and the double on the right just reuses the operator overload where the arguments are swapped (code file OperatorOverloadingSample/Vector.cs
):
public static Vector operator *(Vector left, Vector right) =>
new Vector(left.X * right.X, left.Y * right.Y, left.Z * right.Z);
public static Vector operator *(double left, Vector right) =>
new Vector(left * right.X, left * right.Y, left * right.Z);
public static Vector operator *(Vector left, double right) =>
right * left;
The operators are used in the following code snippet. The int
number used is converted to a double
because this is the best match for the overload:
Console.WriteLine($"2 * vect3 = {2 * vect3}");
Console.WriteLine($"vect3 += vect2 gives {vect3 += vect2}");
Console.WriteLine($"vect3 = vect1 * 2 gives {vect3 = vect1 * 2}");
Console.WriteLine($"vect1 * vect3 = {vect1 * vect3}");
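Subtraction and unary negation are mentioned above but not listed. The following self-contained sketch shows how they might look when written in the same style as the + overload. These two operators are an assumption added for illustration and are not taken from the book's sample file; the struct is repeated here in reduced form only so that the snippet compiles on its own.

```csharp
using System;

Vector a = new(3.0, 3.0, 1.0);
Vector b = new(2.0, -4.0, -4.0);

Console.WriteLine(a - b);   // ( 1, 7, 5 )
Console.WriteLine(-a);      // ( -3, -3, -1 )

readonly struct Vector
{
    public Vector(double x, double y, double z) => (X, Y, Z) = (x, y, z);

    public readonly double X;
    public readonly double Y;
    public readonly double Z;

    // Unary negation: a single parameter, the operand itself
    public static Vector operator -(Vector v) => new Vector(-v.X, -v.Y, -v.Z);

    // Binary subtraction, written in the same style as the + overload
    public static Vector operator -(Vector left, Vector right) =>
        new Vector(left.X - right.X, left.Y - right.Y, left.Z - right.Z);

    public override string ToString() => $"( {X}, {Y}, {Z} )";
}
```

The compiler distinguishes the unary and binary forms of - by the number of parameters, just as it does for ordinary method overloads.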
Comparing objects for equality has become easier with C# 9 and records. Records already have built-in functionality to compare the values of the type. Let's look at what's implemented with records (what you can override) and what you need to do with classes and structs.
To compare references, the object
class defines the static method ReferenceEquals
. This is not a comparison by value; instead, it checks only whether the two variables reference the same object in the heap. The functionality is the same for classes and records. Comparing two variables referencing the same object in the heap returns true
. If the two variables reference different objects in the heap, the method returns false
, even if the content of the two objects is the same. When you use this method to compare two struct variables, each value is boxed into a new object, so the method always returns false
. The compiler warns on comparing structs this way.
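The three cases can be seen in a few lines. This sketch uses arrays rather than strings for the reference-type cases, because identical string literals may share one instance. The pragma only silences the analyzer warning mentioned above so that the value-type case can be demonstrated.

```csharp
using System;

string[] first = { "a" };
string[] sameReference = first;
string[] sameContent = { "a" };

Console.WriteLine(object.ReferenceEquals(first, sameReference)); // True
Console.WriteLine(object.ReferenceEquals(first, sameContent));   // False

int number = 42;
#pragma warning disable CA2013 // deliberately comparing value types by reference
// Each argument is boxed into its own object, so the references differ
Console.WriteLine(object.ReferenceEquals(number, number));       // False
#pragma warning restore CA2013
```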
The default implementation of the object
class Equals
method just invokes object.ReferenceEquals
. In case you need to compare the values for equality, you can use the built-in functionality of the record type or create a custom implementation with the class. To compare the values of two reference types, you need to consider what's automatically implemented by a record and what you can implement when comparing classes for equality:
- The object type defines the virtual method bool Equals(object?) that can be overridden.
- The interface IEquatable<T> defines the generic method bool Equals(T? other) that can be implemented.
- The operators == and != can be overridden.
- The record type also includes an EqualityContract, which is used with the comparison to not only compare the values, but also if the comparison is done with the same contract.

To compare values, the Book
class implements the IEquatable<Book>
interface with the bool Equals(Book? other)
method. This method compares the Title
and Publisher
properties. Similar to the record type, the Book
class specifies the EqualityContract
property to also compare the type of the class. This way, comparing the Title
and Publisher
properties with an object of another type always returns false. In this sample, the comparison itself is implemented in one place, the == operator; the Equals(Book? other) method, the overridden Equals method from the base class, and the != operator all delegate to it. Implementing equality also requires overriding the GetHashCode
method from the base class (code file EqualitySample/Book.cs
):
class Book : IEquatable<Book>
{
public Book(string title, string publisher)
{
Title = title;
Publisher = publisher;
}
public string Title { get; }
public string Publisher { get; }
protected virtual Type EqualityContract { get; } = typeof(Book);
public override string ToString() => Title;
public override bool Equals(object? obj) =>
this == obj as Book;
public override int GetHashCode() =>
Title.GetHashCode() ^ Publisher.GetHashCode();
public virtual bool Equals(Book? other) =>
this == other;
public static bool operator ==(Book? left, Book? right) =>
left?.Title == right?.Title && left?.Publisher == right?.Publisher &&
left?.EqualityContract == right?.EqualityContract;
public static bool operator !=(Book? left, Book? right) =>
!(left == right);
}
In the Program.cs
file, two Book
objects are created that have the same content. Because there are two different objects in the heap, object.ReferenceEquals
returns false
. Next, the Equals
method from the IEquatable<Book>
interface, the overridden object Equals
, and the operator ==
are used, and they all return true
because of the implemented value comparison (code file EqualitySample/Program.cs
):
Book book1 = new("Professional C#", "Wrox Press");
Book book2 = new("Professional C#", "Wrox Press");
if (!object.ReferenceEquals(book1, book2))
{
Console.WriteLine("Not the same reference");
}
if (book1.Equals(book2))
{
Console.WriteLine("The same object using the generic Equals method");
}
object book3 = book2;
if (book1.Equals(book3))
{
Console.WriteLine("The same object using the overridden Equals method");
}
if (book1 == book2)
{
Console.WriteLine("The same book using the == operator");
}
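For comparison, here is a sketch of the same test with a record, which generates value equality without any hand-written members. The name BookRecord is hypothetical and used only for this illustration.

```csharp
using System;

BookRecord book1 = new("Professional C#", "Wrox Press");
BookRecord book2 = new("Professional C#", "Wrox Press");

Console.WriteLine(object.ReferenceEquals(book1, book2));        // False
Console.WriteLine(book1.Equals(book2));                         // True
Console.WriteLine(book1 == book2);                              // True
Console.WriteLine(book1.GetHashCode() == book2.GetHashCode());  // True

// BookRecord is a hypothetical name used only for this sketch
public record BookRecord(string Title, string Publisher);
```

The compiler-generated members cover the same ground as the Book class above: Equals(object?), IEquatable<BookRecord>, the == and != operators, GetHashCode, and the EqualityContract property.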
Custom indexers cannot be implemented using the operator overloading syntax, but they can be implemented with a syntax that looks similar to properties.
With the following code snippet, an array is created, and the indexer is used to access array elements. The second code line uses the indexer to access the second element and pass 42
to it. The third line uses the indexer to access the third element and pass the value of the element to the variable x
.
int[] arr1 = {1, 2, 3};
arr1[1] = 42;
int x = arr1[2];
To create a custom indexer, first create a Person
record with the properties FirstName
, LastName
, and Birthday
(code file CustomIndexerSample/Person.cs
):
public record Person(string FirstName, string LastName, DateTime Birthday)
{
public override string ToString() => $"{FirstName} {LastName}";
}
The class PersonCollection
defines a private array field that contains Person
elements and a constructor where a number of Person
objects can be passed (code file CustomIndexerSample/PersonCollection.cs
):
public class PersonCollection
{
private Person[] _people;
public PersonCollection(params Person[] people) =>
_people = people.ToArray();
}
To allow indexer syntax to be used to access the PersonCollection
and return Person
objects, you can create an indexer. The indexer looks very similar to a property because it also contains get
and set
accessors. What's different is the name. Specifying an indexer makes use of the this
keyword. The brackets that follow the this
keyword specify the type that is used with the index. An array offers indexers with the int
type, so int
types are used here to pass the information directly to the contained array _people
. The use of the set
and get
accessors is similar to properties. The get
accessor is invoked when a value is retrieved; the set
accessor is invoked when a Person
object is passed on the right side.
public Person this[int index]
{
get => _people[index];
set => _people[index] = value;
}
With indexers, any type can be used as the indexing type. With the sample application, the DateTime
struct is used. This indexer is used to return every person with a specified birthday. Because multiple people can have the same birthday, not a single Person
object is returned; instead, a list of people is returned with the interface IEnumerable<Person>
. With the implementation of the indexer, the Where
method is used. A lambda expression is passed with the argument. The Where
method is defined in the namespace System.Linq
:
public IEnumerable<Person> this[DateTime birthDay]
{
get => _people.Where(p => p.Birthday == birthDay);
}
The indexer using the DateTime
type lets you retrieve Person
objects but doesn't allow you to set Person
objects because there's a get
accessor but no set
accessor. A shorthand notation exists to create the same code with an expression-bodied member (the same syntax available with properties):
public IEnumerable<Person> this[DateTime birthDay] =>
_people.Where(p => p.Birthday == birthDay);
With the top-level statements of the sample application, a PersonCollection
object with four Person
objects is created. With the first WriteLine
method, the third element is accessed using the get
accessor of the indexer with the int
parameter. Within the foreach
loop, the indexer with the DateTime
parameter is used to pass a specified date (code file CustomIndexerSample/Program.cs
):
Person p1 = new("Ayrton", "Senna", new DateTime(1960, 3, 21));
Person p2 = new("Ronnie", "Peterson", new DateTime(1944, 2, 14));
Person p3 = new("Jochen", "Rindt", new DateTime(1942, 4, 18));
Person p4 = new("Francois", "Cevert", new DateTime(1944, 2, 25));
PersonCollection coll = new(p1, p2, p3, p4);
Console.WriteLine(coll[2]);
foreach (var r in coll[new DateTime(1960, 3, 21)])
{
Console.WriteLine(r);
}
Console.ReadLine();
When you run the program, the first WriteLine
method writes Jochen Rindt
to the console; the result of the foreach
loop is Ayrton Senna
because that person's birthday matches the date passed to the second indexer.
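As a further sketch of a non-integer index type, the following class exposes an indexer keyed by string with both get and set accessors. StockLevels and its members are hypothetical names invented for this illustration and are not part of the book's sample.

```csharp
using System;
using System.Collections.Generic;

StockLevels stock = new();
stock["apples"] = 12;                 // invokes the set accessor
stock["apples"] += 3;                 // get, add, then set
Console.WriteLine(stock["apples"]);   // 15
Console.WriteLine(stock["pears"]);    // 0

// StockLevels is a hypothetical class used only for this sketch
public class StockLevels
{
    private readonly Dictionary<string, int> _counts = new();

    public int this[string name]
    {
        // Unknown names report a count of zero instead of throwing
        get => _counts.TryGetValue(name, out int count) ? count : 0;
        set => _counts[name] = value;
    }
}
```

Returning zero for unknown keys is a design choice for this sketch; a dictionary's own indexer would throw KeyNotFoundException instead.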
Earlier in this chapter (see the “Explicit Conversions” section), you learned that you can convert values between predefined data types through a process of casting. You also saw that C# allows two different types of casts: implicit and explicit. This section looks at these types of casts.
For an explicit cast, you explicitly mark the cast in your code by including the destination data type inside parentheses:
int i = 3;
long l = i; // implicit
short s = (short)i; // explicit
For the predefined data types, explicit casts are required where there is a risk that the cast might fail or some data might be lost. The following are some examples:
- When converting from an int to a short, the short might not be large enough to hold the value of the int.
- When converting from a nullable type to a non-nullable type, a value of null causes an exception.

By making the cast explicit in your code, C# forces you to affirm that you understand there is a risk of data loss, and therefore presumably you have written your code to take this into account.
Because C# allows you to define your own data types (structs and classes), it follows that you need the facility to support casts to and from those data types. The mechanism is to define a cast as a member operator of one of the relevant classes. Your cast operator must be marked as either implicit
or explicit
to indicate how you intend it to be used. The expectation is that you follow the same guidelines as for the predefined casts: if you know that the cast is always safe regardless of the value held by the source variable, then you define it as implicit
. Conversely, if you know there is a risk of something going wrong for certain values—perhaps some loss of data or an exception being thrown—then you should define the cast as explicit
.
The syntax for defining a cast is similar to that for overloading operators discussed earlier in this chapter. This is not a coincidence—a cast is regarded as an operator whose effect is to convert from the source type to the destination type. To illustrate the syntax, the following is taken from an example struct
named Currency
, which is introduced in the next section, “Implementing User-Defined Casts”:
public static implicit operator float (Currency value)
{
// processing
}
The return type of the operator defines the target type of the cast operation, and the single parameter is the source object for the conversion. The cast defined here allows you to implicitly convert the value of a Currency
into a float
. Note that if a conversion has been declared as implicit
, the compiler permits its use either implicitly or explicitly. If it has been declared as explicit
, the compiler only permits it to be used explicitly. Similar to other operator overloads, casts must be declared as both public
and static
.
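The rules above can be sketched with a small type that declares one conversion of each kind. Celsius is a hypothetical struct invented for this illustration; it is unrelated to the Currency sample that follows.

```csharp
using System;

Celsius boiling = new(100.0);
double raw = boiling;               // implicit conversion, no cast syntax needed
double alsoRaw = (double)boiling;   // an implicit conversion may also be written explicitly
Celsius room = (Celsius)21.5;       // explicit conversion requires the cast

Console.WriteLine(raw);
Console.WriteLine(alsoRaw);
Console.WriteLine(room.Degrees);

// Celsius is a hypothetical type used only for this sketch
public readonly struct Celsius
{
    public Celsius(double degrees) => Degrees = degrees;
    public double Degrees { get; }

    // Always safe: every Celsius value has a double representation
    public static implicit operator double(Celsius value) => value.Degrees;

    // Can fail for some inputs, so the caller must ask for it explicitly
    public static explicit operator Celsius(double value) =>
        value < -273.15
            ? throw new ArgumentOutOfRangeException(nameof(value))
            : new Celsius(value);
}
```

Writing `Celsius room = 21.5;` without the cast would not compile, because the conversion in that direction is declared as explicit.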
This section illustrates the use of implicit and explicit user-defined casts in an example called CastingSample
. In this example, you define a struct, Currency
, which holds a positive USD ($) monetary value. C# provides the decimal
type for this purpose, but you might still want to write your own struct or class to represent monetary values if you need to perform sophisticated financial processing and therefore want to implement specific methods on such a type.
Initially, the definition of the Currency
struct is as follows (code file CastingSample/Currency.cs
):
public readonly struct Currency
{
public readonly uint Dollars;
public readonly ushort Cents;
public Currency(uint dollars, ushort cents) => (Dollars, Cents) = (dollars, cents);
public override string ToString() => $"${Dollars}.{Cents,-2:00}";
}
The use of unsigned data types for the Dollars
and Cents
fields ensures that a Currency
instance can hold only positive values. It is restricted this way to illustrate some points about explicit casts later. You might want to use a type like this to hold, for example, salary information for company employees (people's salaries tend not to be negative!).
Start by assuming that you want to be able to convert Currency
instances to float
values, where the integer part of the float
represents the dollars. In other words, you want to be able to write code like this:
Currency balance = new(10, 50);
float f = balance; // We want f to be set to 10.5
To be able to do this, you need to define a cast. Hence, you add the following to your Currency
definition:
public static implicit operator float (Currency value) =>
value.Dollars + (value.Cents/100.0f);
The preceding cast is implicit. It is a sensible choice in this case because, as it should be clear from the definition of Currency
, any value that can be stored in the Currency
can also be stored in a float
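. Nothing in this cast can throw an exception or produce an out-of-range value, although a float carries only about seven significant decimal digits, so very large amounts lose precision.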
However, if you have a float
that you would like to be converted to a Currency
, the conversion is not guaranteed to work. A float
can store negative values, whereas Currency
instances can't, and a float
can store numbers of a far higher magnitude than can be stored in the (uint
) Dollars
field of Currency
. Therefore, if a float
contains an inappropriate value, converting it to a Currency
could give unpredictable results. Because of this risk, the conversion from float
to Currency
should be defined as explicit. Here is the first attempt, which does not return quite the correct results, but it is instructive to examine why:
public static explicit operator Currency (float value)
{
uint dollars = (uint)value;
ushort cents = (ushort)((value-dollars)*100);
return new Currency(dollars, cents);
}
The following code now successfully compiles:
float amount = 45.63f;
Currency amount2 = (Currency)amount;
However, the following code, if you tried it, would generate a compilation error because it attempts to use an explicit cast implicitly:
float amount = 45.63f;
Currency amount2 = amount; // wrong
By making the cast explicit, you warn the developer to be careful because data loss might occur. However, as you will soon see, this is not how you want your Currency
struct to behave. Try writing a test harness and running the sample. Here is the Main
method, which instantiates a Currency
struct and attempts a few conversions. At the start of this code, you write out the value of balance
in two different ways—this is needed to illustrate something later in the example (code file CastingSample/Program.cs
):
try
{
Currency balance = new(50,35);
Console.WriteLine(balance);
Console.WriteLine($"balance is {balance}"); // implicitly invokes ToString
float balance2 = balance;
Console.WriteLine($"After converting to float, = {balance2}");
balance = (Currency) balance2;
Console.WriteLine($"After converting back to Currency, = {balance}");
Console.WriteLine("Now attempt to convert out of range value of " +
"-$50.50 to a Currency:");
checked
{
balance = (Currency) (-50.50);
Console.WriteLine($"Result is {balance}");
}
}
catch(Exception e)
{
Console.WriteLine($"Exception occurred: {e.Message}");
}
Notice that the entire code is placed in a try
block to catch any exceptions that occur during your casts. In addition, the lines that test converting an out-of-range value to Currency
are placed in a checked
block in an attempt to trap negative values. Running this code produces the following output:
$50.35
balance is $50.35
After converting to float, = 50.35
After converting back to Currency, = $50.34
Now attempt to convert out of range value of -$50.50 to a Currency:
Result is $4294967246.00
This output shows that the code did not quite work as expected. First, converting back from float
to Currency
gave a wrong result of $50.34
instead of $50.35
. Second, no exception was generated when you tried to convert an obviously out-of-range value.
The first problem is caused by rounding errors. If a cast is used to convert from a float
to a uint
, the computer truncates the number rather than rounds it. The computer stores numbers in binary rather than decimal, and the fraction 0.35 cannot be exactly represented as a binary fraction (just as 1∕3 cannot be represented exactly as a decimal fraction; it comes out as 0.3333 recurring). The computer ends up storing a value very slightly lower than 0.35 that can be represented exactly in binary format. Multiply by 100, and you get a number fractionally less than 35, which is truncated to 34 cents. Clearly, in this situation, such errors caused by truncation are serious, and the way to avoid them is to ensure that some intelligent rounding is performed in numerical conversions.
Luckily, Microsoft has written a class that does this: System.Convert
. The System.Convert
class contains a large number of static methods to perform various numerical conversions, and the one that we want is Convert.ToUInt16
. Note that the extra care taken by the System.Convert
methods comes at a performance cost. You should use them only when necessary.
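The difference between truncating and rounding can be reproduced in isolation. This sketch repeats the arithmetic of the cast operator with the value 50.35 from the sample; the variable names are chosen for illustration.

```csharp
using System;

float value = 50.35f;
uint dollars = (uint)value;                        // truncation is intended here: 50
float centsAsFloat = (value - dollars) * 100;      // slightly less than 35

ushort truncated = (ushort)centsAsFloat;           // the cast truncates toward zero
ushort rounded = Convert.ToUInt16(centsAsFloat);   // Convert rounds to the nearest integer

Console.WriteLine(truncated);  // 34
Console.WriteLine(rounded);    // 35
```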
Let's examine the second problem—why the expected overflow exception wasn't thrown. The issue here is that the place where the overflow really occurs isn't actually in the Main
routine at all—it is inside the code for the cast operator, which is called from the Main
method. The code in this method was not marked as checked
.
The solution is to ensure that the cast itself is computed in a checked
context, too. With both this change and the fix for the first problem, the revised code for the conversion looks like the following:
public static explicit operator Currency (float value)
{
checked
{
uint dollars = (uint)value;
ushort cents = Convert.ToUInt16((value-dollars)*100);
return new Currency(dollars, cents);
}
}
Note that you use Convert.ToUInt16
to calculate the cents, as described earlier, but you do not use it for calculating the dollar part of the amount. System.Convert
is not needed when calculating the dollar amount because truncating the float
value is what you want there.
You won't look at the new results with this new checked
cast just yet because you have some more modifications to make to the CastingSample
example later in this section.
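The point that a checked block does not extend into the methods it calls can be sketched on its own. Narrow and its two methods are hypothetical helpers invented for this illustration, and the sketch assumes default compiler settings, where code outside a checked block is unchecked.

```csharp
using System;

checked
{
    // The overflow happens inside the called method, which this block does not affect
    Console.WriteLine(Narrow.Unchecked(70000));   // 4464 with default compiler settings
}

try
{
    Console.WriteLine(Narrow.Checked(70000));
}
catch (OverflowException)
{
    Console.WriteLine("OverflowException from the checked conversion");
}

// Narrow is a hypothetical helper class used only for this sketch
static class Narrow
{
    public static ushort Unchecked(int value) => (ushort)value;

    public static ushort Checked(int value)
    {
        checked
        {
            return (ushort)value;
        }
    }
}
```

The checked and unchecked keywords affect only the code that appears lexically inside them, which is why the conversion operator itself has to contain the checked block.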
The Currency
example involves only classes that convert to or from float
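, one of the predefined data types. However, it is not necessary to involve any of the simple data types. It is perfectly legitimate to define casts to convert between instances of different structs or classes that you have defined. You need to be aware of a couple of restrictions, however:
- You cannot define a cast if one of the classes is derived from the other.
- The cast must be defined inside the definition of either the source or the destination data type.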
To illustrate these requirements, suppose that you have the class hierarchy shown in Figure 5-1.
In other words, classes C
and D
are indirectly derived from A
. In this case, the only legitimate user-defined cast between A
, B
, C
, or D
would be to convert between classes C
and D
, because these classes are not derived from each other. The code for this might look like the following (assuming you want the casts to be explicit, which is usually the case when defining casts between user-defined classes):
public static explicit operator D(C value)
{
//…
}
public static explicit operator C(D value)
{
//…
}
For each of these casts, you can choose where you place the definitions—inside the class definition of C
or inside the class definition of D
, but not anywhere else. C# requires you to put the definition of a cast inside either the source class (or struct) or the destination class (or struct). A side effect of this is that you cannot define a cast between two classes unless you have access to edit the source code for at least one of them. This is sensible because it prevents third parties from introducing casts into your classes.
After you have defined a cast inside one of the classes, you cannot also define the same cast inside the other class. Obviously, there should be only one cast for each conversion; otherwise, the compiler would not know which one to use.
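A sketch of two unrelated classes with casts in both directions follows. Meters and Feet are hypothetical classes invented for this illustration; both casts are placed in Feet, which is the destination type for one of them and the source type for the other.

```csharp
using System;

Meters m = new(100.0);
Feet f = (Feet)m;           // defined inside Feet (the destination type)
Meters back = (Meters)f;    // also defined inside Feet (here the source type)

Console.WriteLine(f.Value);
Console.WriteLine(back.Value);

// Meters and Feet are hypothetical classes used only for this sketch
public class Meters
{
    public Meters(double value) => Value = value;
    public double Value { get; }
}

public class Feet
{
    private const double FeetPerMeter = 3.28084;

    public Feet(double value) => Value = value;
    public double Value { get; }

    // Both casts live in Feet; declaring either one again in Meters
    // would leave the compiler with two candidates for the same conversion
    public static explicit operator Feet(Meters source) =>
        new Feet(source.Value * FeetPerMeter);

    public static explicit operator Meters(Feet source) =>
        new Meters(source.Value / FeetPerMeter);
}
```

Keeping both directions in one class is only one possible arrangement; the requirement is that each conversion is defined exactly once, in either its source or its destination type.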
To see how these casts work, start by considering the case in which both the source and the destination are reference types and consider two classes, MyBase
and MyDerived
, where MyDerived
is derived directly or indirectly from MyBase
.
First, from MyDerived
to MyBase
, it is always possible (assuming the constructors are available) to write this:
MyDerived derivedObject = new MyDerived();
MyBase baseCopy = derivedObject;
Here, you are casting implicitly from MyDerived
to MyBase
. This works because of the rule that any reference to a type MyBase
is allowed to refer to objects of class MyBase
or anything derived from MyBase
. In object-oriented programming, instances of a derived class are, in a real sense, instances of the base class, plus something extra. All the functions and fields defined on the base class are defined in the derived class, too.
Alternatively, you can write this:
MyBase derivedObject = new MyDerived();
MyBase baseObject = new MyBase();
MyDerived derivedCopy1 = (MyDerived) derivedObject; // OK
MyDerived derivedCopy2 = (MyDerived) baseObject; // Throws exception
This code is perfectly legal C# (in a syntactic sense, that is) and illustrates casting from a base class to a derived class. However, the final statement throws an exception when executed. When you perform the cast, the object being referred to is examined. Because a base class reference can, in principle, refer to a derived class instance, it is possible that this object is actually an instance of the derived class that you are attempting to cast to. If that is the case, the cast succeeds, and the derived reference is set to refer to the object. If, however, the object in question is not an instance of the derived class (or of any class derived from it), the cast fails, and an exception is thrown.
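When it is not certain that a base reference points to a derived instance, the is pattern and the as operator perform the same runtime check without throwing. The following sketch declares empty MyBase and MyDerived classes only so that it compiles on its own.

```csharp
using System;

MyBase derivedObject = new MyDerived();
MyBase baseObject = new MyBase();

// Pattern matching performs the type check and the cast in one step
if (derivedObject is MyDerived d1)
{
    Console.WriteLine($"derivedObject refers to a {d1.GetType().Name} instance");
}

// The as operator yields null instead of throwing when the cast is not possible
var d2 = baseObject as MyDerived;
Console.WriteLine(d2 is null ? "baseObject is not a MyDerived" : "cast succeeded");

try
{
    _ = (MyDerived)baseObject;
}
catch (InvalidCastException)
{
    Console.WriteLine("The explicit cast throws InvalidCastException");
}

class MyBase { }
class MyDerived : MyBase { }
```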
Notice that the casts that the compiler has supplied, which convert between base and derived class, do not actually do any data conversion on the object in question. All they do is set the new reference to refer to the object if it is legal for that conversion to occur. To that extent, these casts are very different in nature from the ones that you normally define yourself. For example, in the CastingSample example earlier, you defined casts that convert between a Currency struct and a float. In the float-to-Currency cast, you actually instantiated a new Currency struct and initialized it with the required values. The predefined casts between base and derived classes do not do this. If you want to convert a MyBase instance into a real MyDerived object with values based on the contents of the MyBase instance, you cannot use the cast syntax to do this. The most sensible option is usually to define a derived class constructor that takes a base class instance as a parameter and have this constructor perform the relevant initializations:
class DerivedClass: BaseClass
{
  // "base" is a reserved keyword, so the parameter needs a different name
  public DerivedClass(BaseClass baseObject)
  {
    // initialize object from the base instance
  }
  // …
}
The previous discussion focused on casting between base and derived classes where both participants were reference types. Similar principles apply when casting value types, although in this case it is not possible to simply copy references—some copying of data must occur.
It is not, of course, possible to derive from structs or primitive value types. Casting between base and derived structs invariably means casting between a primitive type or a struct and System.Object
. (Theoretically, it is possible to cast between a struct and System.ValueType
, though it is hard to see why you would want to do this.)
The cast from any struct (or primitive type) to object
is always available as an implicit cast—because it is a cast from a derived type to a base type—and is just the familiar process of boxing. Here's an example using the Currency
struct:
Currency balance = new(40,0);
object baseCopy = balance;
When this implicit cast is executed, the contents of balance
are copied onto the heap into a boxed object, and the baseCopy
object reference is set to this object. What actually happens behind the scenes is this: when you originally defined the Currency
struct, .NET implicitly supplied another (hidden) class, a boxed Currency
class, which contains all the same fields as the Currency
struct but is a reference type, stored on the heap. This happens whenever you define a value type, whether it is a struct
or an enum
, and similar boxed reference types exist corresponding to all the primitive value types of int
, double
, uint
, and so on. It is not possible, or necessary, to gain direct programmatic access to any of these boxed classes in source code, but they are the objects that are working behind the scenes whenever a value type is cast to object
. When you implicitly cast Currency
to object
, a boxed Currency
instance is instantiated and initialized with all the data from the Currency
struct. In the preceding code, it is this boxed Currency
instance to which baseCopy
refers. By these means, it is possible for casting from derived to base type to work syntactically in the same way for value types as for reference types.
Casting the other way is known as unboxing. Like casting between a base reference type and a derived reference type, it is an explicit cast because an exception is thrown if the object being cast is not of the correct type:
object derivedObject = new Currency(40,0);
object baseObject = new object();
Currency derivedCopy1 = (Currency)derivedObject; // OK
Currency derivedCopy2 = (Currency)baseObject; // Exception thrown
This code works in a way similar to the code presented earlier for reference types. Casting derivedObject
to Currency
works fine because derivedObject
actually refers to a boxed Currency
instance—the cast is performed by copying the fields out of the boxed Currency
object into a new Currency
struct. The second cast fails because baseObject
does not refer to a boxed Currency
object.
When using boxing and unboxing, it is important to understand that both processes actually copy the data into the new boxed or unboxed object. Hence, manipulations on the boxed object, for example, do not affect the contents of the original value type.
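The copy semantics of boxing and unboxing can be observed directly. This is a small sketch, assuming a hypothetical mutable struct named Counter (not part of the chapter's samples); it shows that the original value, the boxed object, and the unboxed value are three independent copies:

```csharp
using System;

Console.WriteLine(BoxingDemo.CopiesAreIndependent());

// Hypothetical mutable struct, used only to make the copies visible.
public struct Counter
{
    public int Value;
}

public static class BoxingDemo
{
    public static bool CopiesAreIndependent()
    {
        Counter original = new() { Value = 1 };
        object boxed = original;            // boxing copies the data to the heap
        original.Value = 2;                 // changes only the original struct

        Counter unboxed = (Counter)boxed;   // unboxing copies the data out again
        unboxed.Value = 3;                  // changes only the unboxed copy

        // The boxed object still holds the value it was created with.
        return ((Counter)boxed).Value == 1
            && original.Value == 2
            && unboxed.Value == 3;
    }
}
```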
One thing you have to watch for when you are defining casts is that if the C# compiler is presented with a situation in which no direct cast is available to perform a requested conversion, it attempts to find a way of combining casts to do the conversion. For example, with the Currency
struct, suppose the compiler encounters a few lines of code like this:
Currency balance = new(10,50);
long amount = (long)balance;
double amountD = balance;
You first initialize a Currency
instance, and then you attempt to convert it to a long
. The trouble is that you haven't defined the cast to do that. However, this code still compiles successfully. Here's what happens: the compiler realizes that you have defined an implicit cast to get from Currency
to float
, and the compiler already knows how to explicitly cast a float
to a long
. Hence, it compiles that line of code into IL code that converts balance
first to a float
and then converts that result to a long
. The same thing happens in the final line of the code, when you convert balance
to a double
. However, because the cast from Currency
to float
and the predefined cast from float
to double
are both implicit, you can write this conversion in your code as an implicit cast. If you prefer, you could also specify the casting route explicitly:
Currency balance = new(10,50);
long amount = (long)(float)balance;
double amountD = (double)(float)balance;
However, in most cases, this would be seen as needlessly complicating your code. The following code, by contrast, produces a compilation error:
Currency balance = new(10,50);
long amount = balance;
The reason is that the best match for the conversion that the compiler can find is still to convert first to float
and then to long
. The conversion from float
to long
needs to be specified explicitly, though.
None of this by itself should give you too much trouble. The rules are, after all, fairly intuitive and designed to prevent any data loss from occurring without the developer knowing about it. However, the problem is that if you are not careful when you define your casts, it is possible for the compiler to select a path that leads to unexpected results. For example, suppose that it occurs to someone else in the group writing the Currency
struct that it would be useful to be able to convert a uint
containing the total number of cents in an amount into a Currency
(cents, not dollars, because the idea is not to lose the fractions of a dollar). Therefore, this cast might be written to try to achieve this:
// Do not do this!
public static implicit operator Currency(uint value) =>
new Currency(value/100u, (ushort)(value%100));
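Note that the u suffix after the first 100 in this code marks the literal as a uint, which makes it explicit that value/100u is a uint division. Because 100 is a constant that fits into a uint, value/100 would also be evaluated as a uint division here; with a non-constant int operand, however, both operands would be promoted to long.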
The comment Do not do this!
is clearly noted in this code, and here is why: the following code snippet merely converts a uint
containing 350
into a Currency
and back again; but what do you think bal2
will contain after executing this?
uint bal = 350;
Currency balance = bal;
uint bal2 = (uint)balance;
The answer is not 350 but 3! Moreover, it all follows logically. You convert 350 implicitly to a Currency, giving the result balance.Dollars = 3, balance.Cents = 50. Then the compiler does its usual figuring out of the best path for the conversion back. The balance variable ends up being implicitly converted to a float (value 3.5), and this is converted explicitly to a uint with value 3. One way to fix this would be to create a user-defined cast to uint.
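The round trip can be reproduced in isolation. The following sketch uses a cut-down stand-in for the Currency struct; the property names Dollars and Cents and the implicit cast to float are assumed to match the earlier CastingSample code, and the RoundTripDemo class is hypothetical:

```csharp
using System;

Console.WriteLine(RoundTripDemo.RoundTrip(350)); // prints 3, not 350

// Minimal stand-in for the chapter's Currency struct (an assumption, not the full sample).
public readonly struct Currency
{
    public uint Dollars { get; }
    public ushort Cents { get; }

    public Currency(uint dollars, ushort cents) => (Dollars, Cents) = (dollars, cents);

    // Interprets 1 as one dollar.
    public static implicit operator float(Currency value) =>
        value.Dollars + value.Cents / 100.0f;

    // Do not do this: interprets 1 as one cent, which conflicts with the float cast.
    public static implicit operator Currency(uint value) =>
        new Currency(value / 100u, (ushort)(value % 100));
}

public static class RoundTripDemo
{
    public static uint RoundTrip(uint amount)
    {
        Currency balance = amount;  // uint to Currency, read as cents
        return (uint)balance;       // Currency to float to uint, read as dollars
    }
}
```

The two conversions disagree about the unit of the integer, which is why the value shrinks by a factor of 100 on the way back.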
Of course, other instances exist in which converting to another data type and back again causes data loss. For example, converting a float
containing 5.8
to an int
and back to a float
again loses the fractional part, giving you a result of 5
, but there is a slight difference in principle between losing the fractional part of a number and dividing an integer by more than 100. Currency
has suddenly become a rather dangerous struct that does strange things to integers!
The problem is that there is a conflict between how your casts interpret integers. The casts between Currency and float interpret an integer value of 1 as corresponding to one dollar, but the latest uint-to-Currency cast interprets this value as one cent. This is an example of poor design. If you want your classes to be easy to use, you should ensure that all your casts behave in ways that are mutually compatible, in the sense that they intuitively give the same results. In this case, the solution is obviously to rewrite the uint-to-Currency cast so that it interprets an integer value of 1 as one dollar:
public static implicit operator Currency (uint value) =>
new Currency(value, 0);
Incidentally, you might wonder whether this new cast is necessary at all. The answer is that it could be useful. Without this cast, the only way for the compiler to carry out a uint
-to-
Currency
conversion would be via a float
. Converting directly is a lot more efficient in this case, so having this extra cast provides performance benefits, though you need to ensure that it provides the same result as via a float
, which you have now done. In other situations, you may also find that separately defining casts for different predefined data types enables more conversions to be implicit rather than explicit, though that is not the case here.
A good test of whether your casts are compatible is to ask whether a conversion will give the same results (other than perhaps a loss of accuracy as in float
-to-
int
conversions) regardless of which path it takes. The Currency
class provides a good example of this. Consider this code:
Currency balance = new(50, 35);
ulong bal = (ulong) balance;
At present, there is only one way that the compiler can achieve this conversion: by converting the Currency
to a float
implicitly and then to a ulong
explicitly. The float
-to-
ulong
conversion requires an explicit conversion, but that is fine because you have specified one here.
Suppose, however, that you then added another cast to convert implicitly from a Currency
to a uint
. You actually do this by modifying the Currency
struct by adding the casts both to and from uint
(code file CastingSample/Currency.cs
):
public static implicit operator Currency(uint value) =>
new Currency(value, 0);
public static implicit operator uint(Currency value) => value.Dollars;
Now the compiler has another possible route to convert from Currency to ulong: to convert from Currency to uint implicitly and then to ulong implicitly. Which of these two routes will it take? C# has some precise rules about the best route for the compiler when there are several possibilities. (The rules are not covered in this book, but if you are interested in the details, see the C# language specification.) The best answer is that you should design your casts so that all routes give the same answer (other than possible loss of precision), in which case it doesn't really matter which one the compiler picks. (As it happens in this case, the compiler picks the Currency-to-uint-to-ulong route in preference to Currency-to-float-to-ulong.)
To test casting the Currency
to uint
, add this test code to the Main
method (code file UserDefinedConversion/Program.cs
):
try
{
Currency balance = new(50,35);
Console.WriteLine(balance);
Console.WriteLine($"balance is {balance}");
uint balance3 = (uint) balance;
Console.WriteLine($"Converting to uint gives {balance3}");
}
catch (Exception ex)
{
Console.WriteLine($"Exception occurred: {ex.Message}");
}
Running the sample now gives you these results:
50
balance is $50.35
Converting to uint gives 50
The output shows that the conversion to uint
has been successful, though, as expected, you have lost the cents part of the Currency
in making this conversion.
However, the output also demonstrates one last potential problem that you need to be aware of when working with casts. The first line of output does not display the balance correctly, displaying 50
instead of 50.35
.
So, what is going on? The problem here is that when you combine casts with method overloads, you get another source of unpredictability.
The WriteLine
statement using the format string implicitly calls the Currency.ToString
method, ensuring that the Currency
is displayed as a string.
The first code line with WriteLine
, however, simply passes a raw Currency
struct to the WriteLine
method. Now, WriteLine
has many overloads, but none of them takes a Currency
struct. Therefore, the compiler starts fishing around to see what it can cast the Currency
to in order to make it match up with one of the overloads of WriteLine
. As it happens, one of the WriteLine
overloads is designed to display uint
s quickly and efficiently, and it takes a uint
as a parameter—you have now supplied a cast that converts Currency
implicitly to uint
.
In fact, WriteLine
has another overload that takes a float
as a parameter and displays the value of that float
. If you look closely at the output running the example previously where the cast to uint
did not exist, you see that the first line of output displayed Currency
as a float
, using this overload. In that example, there wasn't a direct cast from Currency
to uint
, so the compiler picked Currency
-to-
float
as its preferred way of matching up the available casts to the available WriteLine
overloads. However, now that there is a direct cast to uint
available in Currency
, the compiler has opted for that route.
The upshot of this is that if you call a method that has several overloads and you pass it an argument whose data type doesn't match any of the overloads exactly, then you are forcing the compiler to decide not only what casts to use to perform the data conversion, but also which overload, and hence which data conversion, to pick. The compiler always works logically and according to strict rules, but the results may not be what you expect. If there is any doubt, you are better off specifying which cast to use explicitly.
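To make the advice concrete, here is a small sketch with two overloads and a type that converts implicitly to both parameter types. The Money struct and the OverloadDemo class are hypothetical names chosen for this illustration; they only mimic the situation with Currency and WriteLine:

```csharp
using System;

Money balance = new(50, 35);

// Without a cast, the compiler chooses the overload according to its own rules.
Console.WriteLine(OverloadDemo.Describe(balance));

// With an explicit cast, the choice is stated in the code.
Console.WriteLine(OverloadDemo.Describe((float)balance));
Console.WriteLine(OverloadDemo.Describe((uint)balance));

// Hypothetical struct with implicit casts to both float and uint.
public readonly struct Money
{
    public uint Dollars { get; }
    public ushort Cents { get; }

    public Money(uint dollars, ushort cents) => (Dollars, Cents) = (dollars, cents);

    public static implicit operator float(Money value) =>
        value.Dollars + value.Cents / 100.0f;

    public static implicit operator uint(Money value) => value.Dollars;
}

public static class OverloadDemo
{
    public static string Describe(float value) => "float overload";
    public static string Describe(uint value) => "uint overload";
}
```

The casts in the last two calls cost a few extra characters but remove any need for the reader to know the overload resolution rules.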
This chapter looked at the standard operators provided by C#, described the mechanics of object equality, and examined how the compiler converts the standard data types from one to another. It also demonstrated how you can implement custom operator support on your data types using operator overloads. Finally, you looked at a special type of operator overload, the cast operator, which enables you to specify how instances of your types are converted to other data types.
The next chapter dives into arrays where the index operator has an important role.
If you need to work with multiple objects of the same type, you can use collections (see Chapter 8, “Collections”) and arrays. C# has a special notation to declare, initialize, and use arrays. Behind the scenes, the Array
class comes into play, which offers several methods to sort and filter the elements inside the array. Using an enumerator, you can iterate through all the elements of the array.
If you need to use multiple objects of the same type, you can use an array. An array is a data structure that contains a number of elements of the same type.
An array is declared by defining the type of elements inside the array, followed by empty brackets and a variable name. For example, an array containing integer elements is declared like this:
int[] myArray;
After declaring an array, memory must be allocated to hold all the elements of the array. An array is a reference type, so memory on the heap must be allocated. You do this by initializing the variable of the array using the new
operator, with the type and the number of elements inside the array. Here, you specify the size of the array:
myArray = new int[4];
With this declaration and initialization, the variable myArray
references four integer values that are allocated on the managed heap (see Figure 6-1).
Instead of using a separate line to declare and initialize an array, you can use a single line:
int[] myArray = new int[4];
You can also assign values to every array element using an array initializer. The following three declarations all create an array with the same content, each with less code for you to write. The compiler can count the number of elements in the array by itself, which is why the array size is left out in the second line. The compiler can also map the values defined in the initializer list to the type used on the left side, so in the third line you can also remove the new operator in front of the initializer. The code generated by the compiler is always the same:
int[] myArray1 = new int[4] {4, 7, 11, 2};
int[] myArray2 = new int[] {4, 7, 11, 2};
int[] myArray3 = {4, 7, 11, 2};
After an array is declared and initialized, you can access the array elements using an indexer. Arrays support only indexers that have parameters of type int
.
With the indexer, you pass the element number to access the array. The indexer always starts with a value of 0 for the first element. Therefore, the highest number you can pass to the indexer is the number of elements minus one because the index starts at zero. In the following example, the array myArray
is declared and initialized with four integer values. The elements can be accessed with indexer values 0, 1, 2, and 3.
int[] myArray = new int[] {4, 7, 11, 2};
int v1 = myArray[0]; // read first element
int v2 = myArray[1]; // read second element
myArray[3] = 44; // change fourth element
If you don't know the number of elements in the array, you can use the Length
property, as shown in this for
statement:
for (int i = 0; i < myArray.Length; i++)
{
Console.WriteLine(myArray[i]);
}
Instead of using a for
statement to iterate through all the elements of the array, you can also use the foreach
statement:
foreach (var val in myArray)
{
Console.WriteLine(val);
}
In addition to being able to declare arrays of predefined types, you also can declare arrays of custom types. Let's start with the following Person
record using positional record syntax to declare the init-only setter properties FirstName
and LastName
(code file SimpleArrays/Person.cs
):
public record Person(string FirstName, string LastName);
Declaring an array of two Person
elements is similar to declaring an array of int
:
Person[] myPersons = new Person[2];
However, be aware that if the elements in the array are reference types, memory must be allocated for every array element. If you use an item in the array for which no memory was allocated, a NullReferenceException
is thrown.
You can allocate every element of the array by using an indexer starting from 0. When you create the second object, you use the C# 9 target-typed new expression, which lets you omit the type name because the compiler infers it from the element type of the array (code file SimpleArrays/Program.cs):
myPersons[0] = new Person("Ayrton", "Senna");
myPersons[1] = new("Michael", "Schumacher");
Figure 6-2 shows the objects in the managed heap with the Person
array. myPersons
is a variable that is stored on the stack. This variable references an array of Person
elements that is stored on the managed heap. This array has enough space for two references. Every item in the array references a Person
object that is also stored in the managed heap.
As with the int
type, you can use an array initializer with custom types:
Person[] myPersons2 =
{
new("Ayrton", "Senna"),
new("Michael", "Schumacher")
};
Ordinary arrays (also known as one-dimensional arrays) are indexed by a single integer. A multidimensional array is indexed by two or more integers.
Figure 6-3 shows the mathematical notation for a two-dimensional array that has three rows and three columns. The first row has the values 1, 2, and 3, and the third row has the values 7, 8, and 9.
To declare this two-dimensional array with C#, you put a comma inside the brackets. The array is initialized by specifying the size of every dimension (the number of dimensions is known as the rank). Then the array elements can be accessed by using two integers with the indexer (code file SimpleArrays/Program.cs):
int[,] twodim = new int[3, 3];
twodim[0, 0] = 1;
twodim[0, 1] = 2;
twodim[0, 2] = 3;
twodim[1, 0] = 4;
twodim[1, 1] = 5;
twodim[1, 2] = 6;
twodim[2, 0] = 7;
twodim[2, 1] = 8;
twodim[2, 2] = 9;
You can also initialize the two-dimensional array by using an array initializer if you know the values for the elements in advance. To initialize the array, one outer pair of curly brackets is used, and every row is initialized by using curly brackets inside the outer curly brackets:
int[,] twodim = {
{1, 2, 3},
{4, 5, 6},
{7, 8, 9}
};
By using two commas inside the brackets, you can declare a three-dimensional array. You initialize it by placing initializers for two-dimensional arrays inside curly brackets, separated by commas:
int[,,] threedim = {
{ { 1, 2 }, { 3, 4 } },
{ { 5, 6 }, { 7, 8 } },
{ { 9, 10 }, { 11, 12 } }
};
Console.WriteLine(threedim[0, 1, 1]);
Using a foreach
loop, you can iterate through all the items of a multidimensional array.
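As a brief sketch of both iteration styles, the following code walks a 3 × 3 array once with foreach, which visits the elements row by row, and once with a for loop that uses the GetLength method of the Array class to get the size of a single dimension. The GridDemo class and its method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;

int[,] twodim =
{
    { 1, 2, 3 },
    { 4, 5, 6 },
    { 7, 8, 9 }
};

Console.WriteLine(string.Join(", ", GridDemo.Flatten(twodim))); // 1, 2, 3, 4, 5, 6, 7, 8, 9
Console.WriteLine(GridDemo.RowSum(twodim, 1));                  // 15

public static class GridDemo
{
    // foreach visits every element; the last dimension changes fastest.
    public static List<int> Flatten(int[,] values)
    {
        var result = new List<int>();
        foreach (int value in values)
        {
            result.Add(value);
        }
        return result;
    }

    // GetLength(1) returns the number of columns (the size of the second dimension).
    public static int RowSum(int[,] values, int row)
    {
        int sum = 0;
        for (int column = 0; column < values.GetLength(1); column++)
        {
            sum += values[row, column];
        }
        return sum;
    }
}
```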
A two-dimensional array has a rectangular size (for example, 3 × 3 elements). A jagged array provides more flexibility in sizing the array. With a jagged array, every row can have a different size.
Figure 6-4 contrasts a two-dimensional array that has 3 × 3 elements with a jagged array. The jagged array shown contains three rows: the first row contains two elements, the second row contains six elements, and the third row contains three elements.
A jagged array is declared by placing one pair of opening and closing brackets after another. To initialize the jagged array, in the following code snippet an array initializer is used. The first array is initialized by items of arrays. Each of these items again is initialized with its own array initializer (code file SimpleArrays/Program.cs
):
int[][] jagged =
{
new[] { 1, 2 },
new[] { 3, 4, 5, 6, 7, 8 },
new[] { 9, 10, 11 }
};
You can iterate through all the elements of a jagged array with nested for
loops. In the outer for
loop, every row is iterated, and the inner for
loop iterates through every element inside a row:
for (int row = 0; row < jagged.Length; row++)
{
for (int element = 0; element < jagged[row].Length; element++)
{
Console.WriteLine($"row: {row}, element: {element}, " +
$"value: {jagged[row][element]}");
}
}
The output of the iteration displays the rows and every element within the rows:
row: 0, element: 0, value: 1
row: 0, element: 1, value: 2
row: 1, element: 0, value: 3
row: 1, element: 1, value: 4
row: 1, element: 2, value: 5
row: 1, element: 3, value: 6
row: 1, element: 4, value: 7
row: 1, element: 5, value: 8
row: 2, element: 0, value: 9
row: 2, element: 1, value: 10
row: 2, element: 2, value: 11
Declaring an array with brackets is a C# notation using the Array
class. Using the C# syntax behind the scenes creates a new class that derives from the abstract base class Array
. This makes it possible to use methods and properties that are defined with the Array
class with every C# array. For example, you've already used the Length
property or iterated through the array by using the foreach
statement. By doing this, you are using the GetEnumerator
method of the Array
class.
Other properties implemented by the Array
class are LongLength
, for arrays in which the number of items doesn't fit within an integer, and Rank
, to get the number of dimensions.
Let's take a look at other members of the Array
class by getting into various features.
The Array
class is abstract, so you cannot create an array by using a constructor. However, instead of using the C# syntax to create array instances, it is also possible to create arrays by using the static CreateInstance
method. This is extremely useful if you don't know the type of elements in advance because the type can be passed to the CreateInstance
method as a Type
object.
The following example shows how to create an array of type int
with a size of 5
. The first argument of the CreateInstance
method requires the type of the elements, and the second argument defines the size. You can set values with the SetValue
method and read values with the GetValue
method (code file SimpleArrays/Program.cs
):
Array intArray1 = Array.CreateInstance(typeof(int), 5);
for (int i = 0; i < 5; i++)
{
intArray1.SetValue(3 * i, i);
}
for (int i = 0; i < 5; i++)
{
Console.WriteLine(intArray1.GetValue(i));
}
You can also cast the created array to an array declared as int[]
:
int[] intArray2 = (int[])intArray1;
The CreateInstance
method has many overloads to create multidimensional arrays and to create arrays that are not 0 based. The following example creates a two-dimensional array with 2 × 3 elements. The first dimension is 1 based; the second dimension is 10 based:
int[] lengths = { 2, 3 };
int[] lowerBounds = { 1, 10 };
Array racers = Array.CreateInstance(typeof(Person), lengths, lowerBounds);
Setting the elements of the array, the SetValue
method accepts indices for every dimension:
racers.SetValue(new Person("Alain", "Prost"), 1, 10);
racers.SetValue(new Person("Emerson", "Fittipaldi"), 1, 11);
racers.SetValue(new Person("Ayrton", "Senna"), 1, 12);
racers.SetValue(new Person("Michael", "Schumacher"), 2, 10);
racers.SetValue(new Person("Fernando", "Alonso"), 2, 11);
racers.SetValue(new Person("Jenson", "Button"), 2, 12);
Although the array is not 0 based, you can assign it to a variable with the normal C# notation. You just have to take care not to cross the array boundaries:
Person[,] racers2 = (Person[,])racers;
Person first = racers2[1, 10];
Person last = racers2[2, 12];
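One way to avoid crossing the boundaries of such an array is to ask the array for them. The following sketch uses the GetLowerBound and GetUpperBound methods and the Rank property of the Array class; the string element type and the BoundsDemo class are assumptions made to keep the example self-contained:

```csharp
using System;
using System.Collections.Generic;

// Two-dimensional array with 2 x 3 elements; lower bounds are 1 and 10.
Array seats = Array.CreateInstance(typeof(string), new[] { 2, 3 }, new[] { 1, 10 });
seats.SetValue("first", 1, 10);
seats.SetValue("last", 2, 12);

foreach (string line in BoundsDemo.DescribeBounds(seats))
{
    Console.WriteLine(line);
}
// dimension 0: 1..2
// dimension 1: 10..12

public static class BoundsDemo
{
    // Reports the valid index range of every dimension.
    public static List<string> DescribeBounds(Array array)
    {
        var lines = new List<string>();
        for (int dimension = 0; dimension < array.Rank; dimension++)
        {
            lines.Add($"dimension {dimension}: " +
                $"{array.GetLowerBound(dimension)}..{array.GetUpperBound(dimension)}");
        }
        return lines;
    }
}
```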
Because arrays are reference types, assigning an array variable to another variable just gives you two variables referencing the same array. For copying arrays, the array implements the interface ICloneable
. The Clone
method that is defined with this interface creates a shallow copy of the array.
If the elements of the array are value types, as in the following code segment, all values are copied (see Figure 6-5):
int[] intArray1 = {1, 2};
int[] intArray2 = (int[])intArray1.Clone();
If the array contains reference types, only the references are copied, not the elements. Figure 6-6 shows the variables beatles
and beatlesClone
, where beatlesClone
is created by calling the Clone
method from beatles
. The Person
objects that are referenced are the same for beatles
and beatlesClone
. If you change a property of an element of beatlesClone
, you change the same object of beatles
(code file SimpleArrays/Program.cs
):
Person[] beatles = {
new("John", "Lennon"),
new("Paul", "McCartney")
};
Person[] beatlesClone = (Person[])beatles.Clone();
Instead of using the Clone
method, you can use the Array.Copy
method, which also creates a shallow copy. However, there's one important difference between Clone
and Copy
: Clone
creates a new array; with Copy
you have to pass an existing array with the same rank and enough elements.
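The difference between the two methods, and the effect of a shallow copy, can be sketched as follows. The Musician class is a hypothetical mutable type introduced only so that a change made through one array is visible through the other; the CopyDemo class is likewise an assumption:

```csharp
using System;

Console.WriteLine(CopyDemo.CloneSharesElements());     // True
Console.WriteLine(CopyDemo.CopyFillsExistingArray());  // True

// Hypothetical mutable class, used only for this illustration.
public class Musician
{
    public string Name { get; set; } = string.Empty;
}

public static class CopyDemo
{
    public static bool CloneSharesElements()
    {
        Musician[] band = { new() { Name = "John" }, new() { Name = "Paul" } };
        Musician[] bandClone = (Musician[])band.Clone();   // new array, same objects

        bandClone[0].Name = "Ringo";   // changes the object both arrays reference

        return !ReferenceEquals(band, bandClone) && band[0].Name == "Ringo";
    }

    public static bool CopyFillsExistingArray()
    {
        int[] source = { 1, 2, 3 };
        int[] target = new int[5];     // must already exist and be large enough

        Array.Copy(source, target, source.Length);

        // The first three elements are copied; the rest keep their default value.
        return target[0] == 1 && target[2] == 3 && target[3] == 0;
    }
}
```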
The Array class offers the Sort method to sort the elements in the array. Current .NET versions implement it with an introspective sort, a hybrid of Quicksort, Heapsort, and insertion sort. The Sort method requires the interface IComparable to be implemented by the elements in the array. Simple types such as System.String and System.Int32 implement IComparable, so you can sort arrays containing elements of these types.
With the sample program, the array names
contains elements of type string
, and this array can be sorted (code file SortingSample/Program.cs
):
string[] names = {
"Lady Gaga",
"Shakira",
"Beyonce",
"Ava Max"
};
Array.Sort(names);
foreach (var name in names)
{
Console.WriteLine(name);
}
The output of the application shows the sorted result of the array:
Ava Max
Beyonce
Lady Gaga
Shakira
If you are using custom classes with the array, you must implement the interface IComparable
. This interface defines just one method, CompareTo
, which must return 0 if the objects to compare are equal; a value smaller than 0 if the instance should go before the object from the parameter; and a value larger than 0 if the instance should go after the object from the parameter.
Change the Person
record to implement the interface IComparable<Person>
. The comparison is first done on the value of the LastName
by using the Compare
method of the String
class. If the LastName
has the same value, the FirstName
is compared (code file SortingSample/Person.cs
):
public record Person(string FirstName, string LastName) : IComparable<Person>
{
public int CompareTo(Person? other)
{
if (other == null) return 1;
int result = string.Compare(this.LastName, other.LastName);
if (result == 0)
{
result = string.Compare(this.FirstName, other.FirstName);
}
return result;
}
//…
Now it is possible to sort an array of Person
objects by the last name (code file SortingSample/Program.cs
):
Person[] persons = {
new("Damon", "Hill"),
new("Niki", "Lauda"),
new("Ayrton", "Senna"),
new("Graham", "Hill")
};
Array.Sort(persons);
foreach (var p in persons)
{
Console.WriteLine(p);
}
Using Array.Sort
with Person
objects, the output returns the names sorted by last name:
Damon Hill
Graham Hill
Niki Lauda
Ayrton Senna
If the Person
object should be sorted differently than the implementation within the Person
class, a comparer type can implement the interface IComparer<T>
. This interface specifies the method Compare
, which defines two arguments that should be compared. The return value is similar to the result of the CompareTo
method that's defined with the IComparable
interface.
With the sample code, the class PersonComparer
implements the IComparer<Person>
interface to sort Person
objects either by FirstName
or by LastName
. The enumeration PersonCompareType
defines the different sorting options that are available with PersonComparer
: FirstName
and LastName
. How the compare should be done is defined with the constructor of the class PersonComparer
, where a PersonCompareType
value is set. The Compare
method is implemented with a switch
statement to compare either by LastName
or by FirstName
(code file SortingSample/PersonComparer.cs
):
public enum PersonCompareType
{
FirstName,
LastName
}
public class PersonComparer : IComparer<Person>
{
private PersonCompareType _compareType;
public PersonComparer(PersonCompareType compareType) =>
_compareType = compareType;
public int Compare(Person? x, Person? y)
{
if (x is null && y is null) return 0;
if (x is null) return 1;
if (y is null) return -1;
return _compareType switch
{
PersonCompareType.FirstName => x.FirstName.CompareTo(y.FirstName),
PersonCompareType.LastName => x.LastName.CompareTo(y.LastName),
_ => throw new ArgumentException("unexpected compare type")
};
}
}
Now you can pass a PersonComparer
object to the second argument of the Array.Sort
method. Here, the people are sorted by first name (code file SortingSample/Program.cs
):
Array.Sort(persons, new PersonComparer(PersonCompareType.FirstName));
foreach (var p in persons)
{
Console.WriteLine(p);
}
The persons
array is now sorted by first name:
Ayrton Senna
Damon Hill
Graham Hill
Niki Lauda
Arrays can be passed as parameters to methods and returned from methods. To return an array, you just have to declare the array as the return type, as shown with the following method GetPersons
:
static Person[] GetPersons() =>
new Person[] {
new Person("Damon", "Hill"),
new Person("Niki", "Lauda"),
new Person("Ayrton", "Senna"),
new Person("Graham", "Hill")
};
When passing arrays to a method, the array is declared with the parameter, as shown with the method DisplayPersons
:
static void DisplayPersons(Person[] persons)
{
//…
}
By using the foreach
statement, you can iterate elements of a collection (see Chapter 8) without needing to know the number of elements inside the collection. The foreach
statement uses an enumerator. Figure 6-7 shows the relationship between the client invoking the foreach
statement and the collection. The array or collection implements the IEnumerable
interface with the GetEnumerator
method. The GetEnumerator
method returns an enumerator implementing the IEnumerator
interface. The interface IEnumerator
is then used by the foreach
statement to iterate through the collection.
The foreach
statement uses the methods and properties of the IEnumerator
interface to iterate all elements in a collection. For this, IEnumerator
defines the property Current
to return the element where the cursor is positioned and defines the method MoveNext
to move to the next element of the collection. MoveNext
returns true
if there's an element and false
if no more elements are available.
The generic version of the interface IEnumerator<T>
derives from the interface IDisposable
and thus defines a Dispose
method to clean up resources allocated by the enumerator.
The C# foreach
statement is not resolved to a foreach
statement in the IL code. Instead, the C# compiler converts the foreach
statement to methods and properties of the IEnumerator
interface. Here's a simple foreach
statement to iterate all elements in the persons
array and display them person by person:
foreach (var p in persons)
{
Console.WriteLine(p);
}
The foreach
statement is resolved to the following code fragment. First, the GetEnumerator
method is invoked to get an enumerator for the array. Inside a while
loop, as long as MoveNext
returns true
, the elements of the array are accessed using the Current
property:
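// The array's own GetEnumerator returns the nongeneric IEnumerator, so cast to IEnumerable<Person> first
IEnumerator<Person> enumerator = ((IEnumerable<Person>)persons).GetEnumerator();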
while (enumerator.MoveNext())
{
Person p = enumerator.Current;
Console.WriteLine(p);
}
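Because IEnumerator<T> derives from IDisposable, a fuller picture of the expansion also disposes the enumerator. The following is a rough sketch of that pattern for a generic collection, not the exact code the compiler emits (for arrays, the compiler typically generates a simpler index-based loop). The list content is made up for illustration:

```csharp
using System;
using System.Collections.Generic;

List<string> names = new() { "Ayrton", "Damon", "Graham", "Niki" };

// Manual equivalent of: foreach (string name in names) { Console.WriteLine(name); }
IEnumerator<string> enumerator = ((IEnumerable<string>)names).GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        string name = enumerator.Current;
        Console.WriteLine(name);
    }
}
finally
{
    // Release resources held by the enumerator, even if the loop body throws.
    enumerator.Dispose();
}
```

The try/finally block is the reason an enumerator is cleaned up when you leave a foreach loop early with break or an exception.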
Using the foreach
statement, it's easy to use the IEnumerable
and IEnumerator
interfaces—the compiler converts the code to use the members of these interfaces. To create classes implementing these interfaces, the compiler offers the yield
statement. When you use yield return
and yield break
, the compiler generates a state machine to iterate through a collection implementing the members of these interfaces. yield return
returns one element of a collection and moves the position to the next element; yield break
stops the iteration. The iteration also ends when the method is completed, so a yield break
is only needed to stop earlier.
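To illustrate stopping early, here is a small sketch in which yield break ends the iteration before the source is exhausted. The method name TakeUntilNegative and the sample values are assumptions made for this example:

```csharp
using System;
using System.Collections.Generic;

foreach (int n in TakeUntilNegative(new[] { 3, 8, 5, -1, 7 }))
{
    Console.WriteLine(n); // writes 3, 8, and 5
}

// Yields values until the first negative number is found, then stops.
static IEnumerable<int> TakeUntilNegative(IEnumerable<int> source)
{
    foreach (int value in source)
    {
        if (value < 0)
        {
            yield break; // ends the iteration; the remaining values are never read
        }
        yield return value;
    }
}
```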
The next example shows the implementation of a simple collection using the yield return
statement. The class HelloCollection
contains the method GetEnumerator
. The implementation of the GetEnumerator
method contains two yield return
statements where the strings Hello
and World
are returned (code file YieldSample/Program.cs
):
class HelloCollection
{
public IEnumerator<string> GetEnumerator()
{
yield return "Hello";
yield return "World";
}
}
Now it is possible to iterate through the collection using a foreach
statement:
public void HelloWorld()
{
HelloCollection helloCollection = new();
foreach (string s in helloCollection)
{
Console.WriteLine(s);
}
}
In a slightly larger and more realistic way than the Hello World example, you can use the yield return
statement to iterate through a collection in different ways. The class MusicTitles
enables iterating the titles in a default way with the GetEnumerator
method, in reverse order with the Reverse
method, and through a subset with the Subset
method (code file YieldSample/MusicTitles.cs
):
public class MusicTitles
{
string[] names = {"Tubular Bells", "Hergest Ridge", "Ommadawn", "Platinum"};
public IEnumerator<string> GetEnumerator()
{
for (int i = 0; i < 4; i++)
{
yield return names[i];
}
}
public IEnumerable<string> Reverse()
{
for (int i = 3; i >= 0; i--)
{
yield return names[i];
}
}
public IEnumerable<string> Subset(int index, int length)
{
for (int i = index; i < index + length; i++)
{
yield return names[i];
}
}
}
The client code to iterate through the string array first uses the GetEnumerator
method, which you don't have to write in your code because it is used by default with the implementation of the foreach
statement. Then the titles are iterated in reverse, and finally a subset is iterated by passing the index and number of items to iterate to the Subset
method (code file YieldSample/Program.cs
):
MusicTitles titles = new();
foreach (var title in titles)
{
Console.WriteLine(title);
}
Console.WriteLine();
Console.WriteLine("reverse");
foreach (var title in titles.Reverse())
{
Console.WriteLine(title);
}
Console.WriteLine();
Console.WriteLine("subset");
foreach (var title in titles.Subset(2, 2))
{
Console.WriteLine(title);
}
For a fast way to access managed or unmanaged contiguous memory, you can use the Span<T>
struct. One example where Span<T>
can be used is an array; the Span<T>
struct references contiguous memory behind the scenes. Another example of a use for Span<T>
is a long string.
Using Span<T>
, you can directly access array elements. The elements of the array are not copied, but they can be used directly, which is faster than a copy.
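The string case can be sketched as follows. Because strings are immutable, the AsSpan extension method returns a ReadOnlySpan<char> rather than a Span<char>. The date text is an arbitrary value chosen for illustration:

```csharp
using System;

string text = "2021-11-09";

// AsSpan creates a ReadOnlySpan<char> over the string; no characters are copied.
ReadOnlySpan<char> span = text.AsSpan();

// Each slice is a view into the same memory instead of a new substring.
int year = int.Parse(span.Slice(0, 4));
int month = int.Parse(span.Slice(5, 2));
int day = int.Parse(span.Slice(8, 2));

Console.WriteLine($"{year}/{month}/{day}");
```

Compared with calling Substring three times, this version allocates no intermediate strings.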
In the following code snippet, first a simple int
array is created and initialized. A Span<int>
object is created, invoking the constructor and passing the array to the Span<int>
. The Span<T>
type offers an indexer, and thus the elements of the Span<T>
can be accessed using this indexer. Here, the second element is changed to the value 11
. Because the array arr1
is referenced from the span, the second element of the array is changed by changing the Span<T>
element. Finally, the span is returned from this method because it is used within top-level statements to pass it on to the next methods that follow (code file SpanSample/Program.cs
):
Span<int> IntroSpans()
{
int[] arr1 = { 2, 4, 6, 8, 10, 12 };
Span<int> span1 = new(arr1);
span1[1] = 11;
Console.WriteLine($"arr1[1] is changed via span1[1]: {arr1[1]}");
return span1;
}
A powerful feature of Span<T>
is that you can use it to access parts, or slices, of an array. By using the slices, the array elements are not copied; they're directly accessed from the span.
The following code snippet shows two ways to create slices. With the first one, a constructor overload is used to pass the start and length of the array that should be used. With the variable span3
that references this newly created Span<int>
, it's only possible to access three elements of the array arr2
, starting with the fourth element. Another overload of the constructor takes just the start of the slice; with this overload, the slice extends to the end of the array. You can also create a slice from a Span<T>
object, invoking the Slice
method. Similar overloads exist here. With the variable span4
, the previously created span1
is used to create a slice with four elements starting with the third element of span1
(code file SpanSample/Program.cs
):
private static Span<int> CreateSlices(Span<int> span1)
{
Console.WriteLine(nameof(CreateSlices));
int[] arr2 = { 3, 5, 7, 9, 11, 13, 15 };
Span<int> span2 = new(arr2);
Span<int> span3 = new(arr2, start: 3, length: 3);
Span<int> span4 = span1.Slice(start: 2, length: 4);
DisplaySpan("content of span3", span3);
DisplaySpan("content of span4", span4);
Console.WriteLine();
return span2;
}
You use the DisplaySpan
method to display the contents of a span. The following code snippet makes use of the ReadOnlySpan
. You can use this span type if you don't need to change the content that the span references, which is the case in the DisplaySpan
method. ReadOnlySpan<T>
is discussed later in this chapter in more detail:
private static void DisplaySpan(string title, ReadOnlySpan<int> span)
{
Console.WriteLine(title);
for (int i = 0; i < span.Length; i++)
{
Console.Write($"{span[i]}.");
}
Console.WriteLine();
}
When you run the application, the content of span3
and span4
is shown—a subset of the arr2
and arr1
:
content of span3
9.11.13.
content of span4
6.8.10.12.
You've seen how to directly change elements of the array that are referenced by the span using the indexer of the Span<T>
type. There are more options as shown in the following code snippet.
You can invoke the Clear
method, which fills a span containing int
types with 0
; you can invoke the Fill
method to fill the span with the value passed to the Fill
method; and you can copy a Span<T>
to another Span<T>
. With the CopyTo
method, if the destination span is not large enough, an exception of type ArgumentException
is thrown. You can avoid this outcome by using the TryCopyTo
method. This method doesn't throw an exception if the destination span is not large enough; instead, it returns false
as being not successful with the copy (code file SpanSample/Program.cs
):
private static void ChangeValues(Span<int> span1, Span<int> span2)
{
Console.WriteLine(nameof(ChangeValues));
Span<int> span4 = span1.Slice(start: 4);
span4.Clear();
DisplaySpan("content of span1", span1);
Span<int> span5 = span2.Slice(start: 3, length: 3);
span5.Fill(42);
DisplaySpan("content of span2", span2);
span5.CopyTo(span1);
DisplaySpan("content of span1", span1);
if (!span1.TryCopyTo(span4))
{
Console.WriteLine("Couldn't copy span1 to span4 because span4 is " +
"too small");
Console.WriteLine($"length of span4: {span4.Length}, length of " +
$"span1: {span1.Length}");
}
Console.WriteLine();
}
When you run the application, you can see the content of span1
where the last two numbers have been cleared using span4
, the content of span2
where span5
was used to fill the value 42
with three elements, and again the content of span1
where the first three numbers have been copied over from span5
. Copying span1
to span4
was not successful because span4
has a length of just 2, whereas span1
has a length of 6:
content of span1
2.11.6.8.0.0.
content of span2
3.5.7.42.42.42.15.
content of span1
42.42.42.8.0.0.
Couldn't copy span1 to span4 because span4 is too small
length of span4: 2, length of span1: 6
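The difference between the two copy methods can be seen in isolation with the following sketch. The arrays and the variable names copied and threw are assumptions made for this example:

```csharp
using System;

int[] source = { 1, 2, 3, 4 };
int[] small = new int[2];

// TryCopyTo reports the problem through its return value.
bool copied = source.AsSpan().TryCopyTo(small);
Console.WriteLine($"TryCopyTo succeeded: {copied}");

// CopyTo signals the same condition with an ArgumentException.
bool threw = false;
try
{
    source.AsSpan().CopyTo(small);
}
catch (ArgumentException)
{
    threw = true;
}
Console.WriteLine($"CopyTo threw: {threw}");
```

In both cases, the destination is left untouched when it is too small.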
If you need only read-access to an array segment, you can use ReadOnlySpan<T>
as was already shown in the DisplaySpan
method. With ReadOnlySpan<T>
, the indexer is read-only, and this type doesn't offer Clear
and Fill
methods. You can, however, invoke the CopyTo
method to copy the content of the ReadOnlySpan<T>
to a Span<T>
.
The following code snippet creates readOnlySpan1
from an array with the constructor of ReadOnlySpan<T>
. readOnlySpan2
and readOnlySpan3
are created by direct assignments from Span<int>
and int[]
. Implicit cast operators are available with ReadOnlySpan<T>
(code file SpanSample/Program.cs
):
void ReadonlySpan(Span<int> span1)
{
Console.WriteLine(nameof(ReadonlySpan));
int[] arr = span1.ToArray();
ReadOnlySpan<int> readOnlySpan1 = new(arr);
DisplaySpan("readOnlySpan1", readOnlySpan1);
ReadOnlySpan<int> readOnlySpan2 = span1;
DisplaySpan("readOnlySpan2", readOnlySpan2);
ReadOnlySpan<int> readOnlySpan3 = arr;
DisplaySpan("readOnlySpan3", readOnlySpan3);
Console.WriteLine();
}
Starting with C# 8, indices and ranges based on the Index
and Range
types were included, along with the range and hat operators. Using the hat operator, you can access elements counting from the end.
Let's start with the following array, which consists of nine integer values (code file IndicesAndRanges/Program.cs
):
int[] data = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
The traditional way to access the first and last elements of this array is to use the indexer implemented with the Array
class, pass an integer value for the nth element starting with 0 for the first element, and use the length minus 1 for the last element:
int first1 = data[0];
int last1 = data[data.Length - 1];
Console.WriteLine($"first: {first1}, last: {last1}");
With the hat operator (^)
, you can use ^1
to access the last element, and the calculation based on the length is no longer necessary:
int last2 = data[^1];
Console.WriteLine(last2);
Behind the scenes, the Index
struct type is used. An implicit cast from int
to Index
is implemented, so you can assign int
values to the Index
type. When you use the hat operator, the compiler creates an Index
that initializes the IsFromEnd
property to true
. When you pass the Index
to an indexer, the compiler converts the value to an int
. If the Index
starts from the end, calculation is done with either a Length
or a Count
property (depending on what property is available):
Index firstIndex = 0;
Index lastIndex = ^1;
int first3 = data[firstIndex];
int last3 = data[lastIndex];
Console.WriteLine($"first: {first3}, last: {last3}");
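The conversion from an Index to a position can also be done explicitly with the GetOffset method of the Index struct. The following sketch reuses the nine-element array from above and shows what the hat operator stores and how it is resolved against a length:

```csharp
using System;

int[] data = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

Index last = ^1;
Console.WriteLine(last.IsFromEnd);   // True
Console.WriteLine(last.Value);       // 1

// GetOffset translates the index into a zero-based position for a given length.
int offset = last.GetOffset(data.Length);
Console.WriteLine(offset);           // 8
Console.WriteLine(data[offset]);     // 9
```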
To access a range of the array, the range operator (..
) can be used with the underlying Range
type. In the sample code, the ShowRange
method is implemented to display the values of an array with a string output (code file IndicesAndRanges/Program.cs
):
void ShowRange(string title, int[] data)
{
Console.WriteLine(title);
Console.WriteLine(string.Join(" ", data));
Console.WriteLine();
}
By invoking this method with different values passed using the range operator, you can see the various forms of ranges. A range is defined with ..
embedded with an Index
on the left and an Index
on the right. Starting with ..
and omitting the Index
from the left side just starts from the beginning. Omitting the Index
from the right side, the range goes up to the end. Using ..
with the array just returns the complete array.
The Index
on the left side specifies an inclusive value, whereas the Index
on the right side is exclusive. With the end of the range, you need to specify the element following the last element you want to access. When the Index
type was used before, you've seen that ^1
references the last value of the collection. When using the Index
on the right side of a range, you must specify ^0
to address the element after the last element (remember the right side of the range is exclusive).
With the code sample, a full range is used (..
), the first three elements are passed with 0..3
; the fourth to the sixth elements are passed with 3..6
; and counting from the end, the last three elements are passed with ^3..^0
:
ShowRange("full range", data[..]);
ShowRange("first three", data[0..3]);
ShowRange("fourth to sixth", data[3..6]);
ShowRange("last three", data[^3..^0]);
Behind the scenes, the Range
struct type is used, and you can assign ranges to variables:
Range fullRange = ..;
Range firstThree = 0..3;
Range fourthToSixth = 3..6;
Range lastThree = ^3..^0;
The Range
type specifies a constructor that passes two Index
values for the start and the end, End
and Start
properties that return an Index
, and a GetOffsetAndLength
method that returns a tuple consisting of the offset and length of a range.
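As a short sketch of these members, the following code resolves a range against the length of the nine-element array and builds an equal range with the constructor. The variable names are chosen for illustration:

```csharp
using System;

int[] data = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

Range lastThree = ^3..^0;

// Resolve the range against a concrete length.
(int offset, int length) = lastThree.GetOffsetAndLength(data.Length);
Console.WriteLine($"offset: {offset}, length: {length}"); // offset: 6, length: 3

// The same range created with the constructor and two Index values.
Range sameRange = new(Index.FromEnd(3), Index.FromEnd(0));
Console.WriteLine(sameRange.Equals(lastThree)); // True
```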
When you use a range with an array, the selected elements are copied into a new array. Changing values in that copy does not change the values of the original array. However, as described in the section “Using Span with Arrays,” a Span
allows accessing a slice of an array directly. The Span
type also supports indices and ranges, and you can change the content of an array by accessing a range of the Span
type.
The following code snippet demonstrates accessing a slice of an array and changing the first element of the slice; the original value of the array didn't change because a copy was done. In the code lines that follow, a Span
is created to access the array using the AsSpan
method. With this Span
, the range operator is used, which in turn invokes the Slice
method of the Span
. Changing values from this slice, the array is directly accessed and changed using an indexer on the slice (code file IndicesAndRanges/Program.cs
):
var slice1 = data[3..5];
slice1[0] = 42;
Console.WriteLine($"value in array didn't change: {data[3]}, " +
$"value from slice: {slice1[0]}");
var sliceToSpan = data.AsSpan()[3..5];
sliceToSpan[0] = 42;
Console.WriteLine($"value in array: {data[3]}, value from slice: {sliceToSpan[0]}");
To support indices and ranges with custom collections, not a lot of work is required. To support the hat operator, the MyCollection
class implements an indexer and the Length
property. To support ranges, you can either create a method that receives a Range
type or—a simpler way—create a method with the name Slice
that has two int
parameters and can have the return type you need. The compiler converts the range to calculate the start and length (code file IndicesAndRanges/MyCollection.cs
):
using System;
using System.Linq;
public class MyCollection
{
private int[] _array = Enumerable.Range(1, 100).ToArray();
public int Length => _array.Length;
public int this[int index]
{
get => _array[index];
set => _array[index] = value;
}
public int[] Slice(int start, int length)
{
var slice = new int[length];
Array.Copy(_array, start, slice, 0, length);
return slice;
}
}
The collection is initialized. With just the few lines that have been implemented, the hat operator can be used with the indexer, and with the range operator, the compiler converts this to invoke the Slice
method (code file IndicesAndRanges/Program.cs
):
MyCollection coll = new();
int n = coll[^20];
Console.WriteLine($"Item from the collection: {n}");
ShowRange("Using custom collection", coll[45..^40]);
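The other option mentioned above, receiving the Range directly, could look like the following sketch. The class RangeCollection is a hypothetical variant of MyCollection written for this example; it uses an indexer with a Range parameter and resolves the range with GetOffsetAndLength:

```csharp
using System;
using System.Linq;

RangeCollection coll = new();
int[] part = coll[45..^40];
Console.WriteLine($"{part.Length} items, from {part[0]} to {part[^1]}");
// 15 items, from 46 to 60

// Hypothetical variant of MyCollection that takes the Range itself.
public class RangeCollection
{
    private readonly int[] _array = Enumerable.Range(1, 100).ToArray();

    public int Length => _array.Length;

    public int[] this[Range range]
    {
        get
        {
            // Resolve the range against the current length of the collection.
            (int offset, int length) = range.GetOffsetAndLength(_array.Length);
            int[] slice = new int[length];
            Array.Copy(_array, offset, slice, 0, length);
            return slice;
        }
    }
}
```

The Slice approach needs less code; an explicit Range indexer is mainly useful if you want to inspect the range yourself.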
If you have an application where a lot of arrays are created and destroyed, the garbage collector has some work to do. To reduce the work of the garbage collector, you can use array pools with the ArrayPool
class (from the namespace System.Buffers
). ArrayPool
manages a pool of arrays. Arrays can be rented from and returned to the pool. Memory is managed from the ArrayPool
itself.
You can create an ArrayPool<T>
by invoking the static Create
method. For efficiency, the array pool manages memory in multiple buckets for arrays of similar sizes. With the Create
method, you can define the maximum array length and the number of arrays within a bucket before another bucket is required:
ArrayPool<int> customPool = ArrayPool<int>.Create(
maxArrayLength: 40000, maxArraysPerBucket: 10);
The default for the maxArrayLength
is 1024 × 1024 elements, and the default for maxArraysPerBucket
is 50. The array pool uses multiple buckets for faster access to arrays when many arrays are used. Arrays of similar sizes are kept in the same bucket as long as the maximum number of arrays is not reached.
You can also use a predefined shared pool by accessing the Shared
property of the ArrayPool<T>
class:
ArrayPool<int> sharedPool = ArrayPool<int>.Shared;
Requesting memory from the pool happens by invoking the Rent
method. The Rent
method accepts the minimum array length that should be requested. If memory is already available in the pool, it is returned. If it is not available, memory is allocated for the pool and returned afterward. In the following code snippet, an array of 1024, 2048, 3072, and so on elements is requested in a for
loop (code file ArrayPoolSample/Program.cs
):
private static void UseSharedPool()
{
for (int i = 0; i < 10; i++)
{
int arrayLength = (i + 1) << 10;
int[] arr = ArrayPool<int>.Shared.Rent(arrayLength);
Console.WriteLine($"requested an array of {arrayLength} " +
$"and received {arr.Length}");
//…
}
}
The Rent
method returns an array with at least the requested number of elements. The array returned could have more memory available. The shared pool keeps arrays with at least 16 elements. The element count of the managed arrays always doubles—for example, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192 elements, and so on.
When you run the application, you can see that larger arrays are returned if the requested array size doesn't fit the arrays managed by the pool:
requested an array of 1024 and received 1024
requested an array of 2048 and received 2048
requested an array of 3072 and received 4096
requested an array of 4096 and received 4096
requested an array of 5120 and received 8192
requested an array of 6144 and received 8192
requested an array of 7168 and received 8192
requested an array of 8192 and received 8192
requested an array of 9216 and received 16384
requested an array of 10240 and received 16384
After you no longer need the array, you can return it to the pool. After the array is returned, you can later reuse it by renting it again.
You return the array to the pool by invoking the Return
method of the array pool and passing the array to the Return
method. With an optional parameter, you can specify whether the array should be cleared before it is returned to the pool. Without clearing it, the next caller that rents an array from the pool could read the data. By clearing the data, you avoid this, but you need more CPU time (code file ArrayPoolSample/Program.cs
):
ArrayPool<int>.Shared.Return(arr, clearArray: true);
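A common way to combine Rent and Return is a try/finally block, so that the array goes back to the pool even if an exception occurs. The following is a minimal sketch under that assumption; the method SumOfSquares is a hypothetical example, and it works on a span of the requested length because the rented array can be longer:

```csharp
using System;
using System.Buffers;

int sum = SumOfSquares(100);
Console.WriteLine(sum);

// Rents a buffer, uses only the requested part of it, and always returns it.
static int SumOfSquares(int count)
{
    int[] buffer = ArrayPool<int>.Shared.Rent(count);
    try
    {
        // The rented array can be longer than requested, so work on a slice.
        Span<int> values = buffer.AsSpan(0, count);
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = i * i;
        }

        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }
    finally
    {
        ArrayPool<int>.Shared.Return(buffer, clearArray: true);
    }
}
```

After the Return call, the method no longer uses the buffer, because the pool can hand the same array to another caller.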
If you need to work with an array of bits, you can use the BitArray
type (from the namespace System.Collections
). BitArray
is a reference type that contains an array of int
s, where for every 32 bits a new integer is used. BitArray
defines Count
and Length
properties, an indexer, a SetAll
method to set all the bits according to the parameters passed, a Not
method to invert the bits, as well as And
, Or
, and Xor
methods for binary AND, binary OR, and exclusive OR.
With the code sample, the extension method GetBitsFormat
iterates through a BitArray
and writes 1
or 0
to a StringBuilder
, depending on whether the bit is set. For better readability, a separator character is added every four bits (code file BitArraySample/BitArrayExtensions.cs
):
public static class BitArrayExtensions
{
public static string GetBitsFormat(this BitArray bits)
{
StringBuilder sb = new();
for (int i = bits.Length - 1; i >= 0; i--)
{
sb.Append(bits[i] ? 1 : 0);
if (i != 0 && i % 4 == 0)
{
sb.Append("_");
}
}
return sb.ToString();
}
}
The following example demonstrates the BitArray
class creating a bit array with nine bits, indexed from 0 to 8. The SetAll
method sets all nine bits to true
. Then the Set
method changes bit 1
to false
. Instead of the Set
method, you can also use an indexer, as shown with index 5
and 7
(code file BitArraySample/Program.cs
):
BitArray bits1 = new(9);
bits1.SetAll(true);
bits1.Set(1, false);
bits1[5] = false;
bits1[7] = false;
Console.Write("initialized: ");
Console.WriteLine(bits1.GetBitsFormat());
Console.WriteLine();
This is the displayed result of the initialized bits:
initialized: 1_0101_1101
The Not
method generates the inverse of the bits of the BitArray
:
Console.WriteLine($"NOT {bits1.GetBitsFormat()}");
bits1.Not();
Console.WriteLine($" = {bits1.GetBitsFormat()}");
Console.WriteLine();
The result of Not
is that all bits are inverted. If a bit was true
, it is now false
; if it was false
, it is now true
:
NOT 1_0101_1101
= 0_1010_0010
In the following example, a new BitArray
is created. With the constructor, the variable bits1
is used to initialize the array, so the new array has the same values. Then the values for bits 0, 1, and 4 are set to different values. Before the Or
method is used, the bit arrays bits1
and bits2
are displayed. The Or
method changes the values of bits1
:
BitArray bits2 = new(bits1);
bits2[0] = true;
bits2[1] = false;
bits2[4] = true;
Console.WriteLine($" {bits1.GetBitsFormat()}");
Console.WriteLine($"OR {bits2.GetBitsFormat()}");
bits1.Or(bits2);
Console.WriteLine($"= {bits1.GetBitsFormat()}");
Console.WriteLine();
With the Or
method, the set bits are taken from both input arrays. In the result, the bit is set if it was set with either the first or the second array:
0_1010_0010
OR 0_1011_0001
= 0_1011_0011
Next, the And
method is used to operate on bits2
and bits1
:
Console.WriteLine($" {bits2.GetBitsFormat()}");
Console.WriteLine($"AND {bits1.GetBitsFormat()}");
bits2.And(bits1);
Console.WriteLine($"= {bits2.GetBitsFormat()}");
Console.WriteLine();
The result of the And
method only sets the bits where the bit was set in both input arrays:
0_1011_0001
AND 0_1011_0011
= 0_1011_0001
Finally, the Xor
method is used for an exclusive OR
:
Console.WriteLine($" {bits1.GetBitsFormat()} ");
Console.WriteLine($"XOR {bits2.GetBitsFormat()}");
bits1.Xor(bits2);
Console.WriteLine($"= {bits1.GetBitsFormat()}");
Console.ReadLine();
With the Xor
method, the resultant bit is set only if the bit was set either in the first or second input, but not both:
0_1011_0011
XOR 0_1011_0001
= 0_0000_0010
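Because And, Or, Xor, and Not change the BitArray they are called on, you need a copy if the original bits should be kept. The following sketch shows one way to do this with the copy constructor. The helper Format is an assumption of this example; unlike the chapter's extension method, it writes the bits from index 0 upward and adds no separators:

```csharp
using System;
using System.Collections;

BitArray a = new(new[] { true, false, true, false });
BitArray b = new(new[] { true, true, false, false });

// Or changes the instance it is called on, so copy a first to keep it intact.
BitArray union = new BitArray(a).Or(b);

Console.WriteLine(Format(a));     // 1010
Console.WriteLine(Format(union)); // 1110

// Writes the bits from index 0 upward, without separators.
static string Format(BitArray bits)
{
    char[] chars = new char[bits.Length];
    for (int i = 0; i < bits.Length; i++)
    {
        chars[i] = bits[i] ? '1' : '0';
    }
    return new string(chars);
}
```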
This chapter covered how to use the C# notation to create and use simple, multidimensional, and jagged arrays. The Array
class is used behind the scenes of C# arrays, enabling you to invoke properties and methods of this class with array variables.
You saw how to sort elements in the array by using the IComparable
and IComparer
interfaces; and you learned how to create and use enumerators, the interfaces IEnumerable
and IEnumerator
, and the yield
statement.
With the Span<T>
type, you saw efficient ways to access a slice of the array. You also saw range and index enhancements with C#.
The last sections of this chapter showed you how to efficiently use arrays with the ArrayPool
, as well as how to use the BitArray type to deal with an array of bits.
The next chapter gets into details of more important features of C#: delegates, lambdas, and events.
Delegates are the .NET variant of addresses to methods. A delegate is an object-oriented type-safe pointer to one or multiple methods. Lambda expressions are directly related to delegates. When the parameter is a delegate type, you can use a lambda expression to implement a method that's referenced from the delegate.
This chapter explains the basics of delegates and lambda expressions, and it shows you how to implement methods called by delegates with lambda expressions. It also demonstrates how .NET uses delegates as the means of implementing events.
In Chapter 4, “Object-Oriented Programming in C#,” you read about using interfaces as contracts. If a method parameter has an interface type, the implementation of the method can use any member of the interface without depending on a specific implementation of that interface. Indeed, the implementation of the interface can be done independently of the method implementation. Similarly, a method can be declared to receive a parameter of a delegate type. The method receiving the delegate parameter can invoke the method that's referenced from the delegate. Similar to interfaces, the implementation of the method that's referenced by the delegate can be done independently of the method that's invoking the delegate.
The concept of passing delegates to methods can become clearer with an example. For instance, you can invoke the Run
method of a Task
and pass the address of a method via a delegate to invoke this method from the task. Tasks are explained in Chapter 11, “Tasks and Asynchronous Programming.”
When you want to use a class in C#, you do so in two stages. First, you need to define the class; that is, you need to tell the compiler what fields and methods make up the class. Then (unless you are using only static methods), you instantiate an object of that class. With delegates, it is the same process. You start by declaring the delegates you want to use. Declaring delegates means telling the compiler what kind of method a delegate of that type will represent. Then, you have to create one or more instances of that delegate. Behind the scenes, a delegate type is a class, but there's specific syntax for delegates that hides details.
The syntax for declaring delegates looks like this:
delegate void IntMethodInvoker(int x);
This declares a delegate called IntMethodInvoker
and indicates that each instance of this delegate can hold a reference to a method that takes one int
parameter and returns void
. The crucial point to understand about delegates is that they are type-safe. When you define the delegate, you have to provide full details about the signature and the return type of the method that it represents.
Suppose that you want to define a delegate called TwoLongsOp
that represents a method that takes two long
s as its parameters and returns a double
. You could do so like this:
delegate double TwoLongsOp(long first, long second);
Or, to define a delegate that represents a method that takes no parameters and returns a string
, you might write this (code file GetAStringDemo/Program.cs
):
//…
delegate string GetAString();
The syntax is similar to that for a method definition, except there is no method body and the definition is prefixed with the keyword delegate
. Because what you are doing here is basically defining a new class, you can define a delegate in any of the same places that you would define a class—that is to say, either inside another class, outside of any class, or in a namespace as a top-level object. Depending on how visible you want your definition to be and the scope of the delegate, you can apply any of the access modifiers that also apply to classes to define its visibility:
public delegate string GetAString();
After you have defined a delegate, you can create an instance of it so that you can use it to store details about a particular method.
The following code snippet demonstrates the use of a delegate. It is a rather long-winded way of calling the ToString
method on an int
(code file GetAStringDemo/Program.cs
):
int x = 40;
GetAString firstStringMethod = new GetAString(x.ToString);
Console.WriteLine($"String is {firstStringMethod()}");
//…
This code instantiates a delegate of type GetAString
and initializes it so it refers to the ToString
method of the integer variable x
. The constructor of a delegate always takes one parameter: the address of a method. This method must match the signature and return type with which the delegate was defined. Because ToString
is an instance method (as opposed to a static method), the instance needs to be supplied with the parameter.
The next line invokes the delegate to display the string. In any code, supplying the name of a delegate instance, followed by parentheses containing any parameters, has exactly the same effect as calling the method wrapped by the delegate.
In fact, supplying parentheses to the delegate instance is the same as calling the Invoke
method of the delegate class. Because firstStringMethod
is a variable of a delegate type, the C# compiler replaces firstStringMethod
with firstStringMethod.Invoke
:
firstStringMethod();
firstStringMethod.Invoke();
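One practical reason to call Invoke explicitly is that it combines with the null-conditional operator. The following sketch uses the TwoLongsOp delegate type declared earlier; the method Average and the variable names are assumptions made for this example:

```csharp
using System;

TwoLongsOp? op = new TwoLongsOp(Average);

// Both lines call Average through the delegate.
Console.WriteLine(op(3, 4));
Console.WriteLine(op.Invoke(3, 4));

// With Invoke, the null-conditional operator skips the call
// when no method has been assigned to the delegate variable.
op = null;
double? skipped = op?.Invoke(3, 4);
Console.WriteLine(skipped.HasValue); // False

static double Average(long first, long second) => (first + second) / 2.0;

// Delegate type as declared earlier in the text.
delegate double TwoLongsOp(long first, long second);
```

Calling op(3, 4) directly while op is null would throw a NullReferenceException instead.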
To reduce typing, wherever a delegate instance is needed, you can just pass the name of the method. This is known as delegate inference. This C# feature works as long as the compiler can resolve the delegate instance to a specific type. The example initialized the variable firstStringMethod
of type GetAString
with a new instance of the delegate GetAString
:
GetAString firstStringMethod = new GetAString(x.ToString);
You can write the same by assigning the method name, qualified with the variable x
, to the variable firstStringMethod
:
GetAString firstStringMethod = x.ToString;
The code that is created by the C# compiler is the same. The compiler detects that a delegate type is required with firstStringMethod
, so it creates an instance of the delegate type GetAString
and passes the address of the method with the object x
to the constructor.
Delegate inference can be used anywhere a delegate instance is required. Delegate inference can also be used with events because events are based on delegates (as you'll see later in this chapter).
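Delegate inference applies to method arguments as well as to assignments. In the following sketch, the delegate type DoubleOp and the helper PrintResult are hypothetical names introduced for illustration; both calls pass the same kind of delegate instance:

```csharp
using System;

// Explicit creation of the delegate instance.
PrintResult(new DoubleOp(Square), 3);

// Delegate inference: the compiler creates the DoubleOp instance from the method name.
PrintResult(Square, 3);

static double Square(double value) => value * value;

// Hypothetical helper that receives a delegate as a parameter.
static void PrintResult(DoubleOp operation, double input) =>
    Console.WriteLine($"{input} squared is {operation(input)}");

// Hypothetical delegate type for this sketch.
delegate double DoubleOp(double value);
```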
One feature of delegates is that they are type-safe to the extent that they ensure that the signature of the method being called is correct. However, interestingly, they don't care what type of object the method is being called against or even whether the method is a static method or an instance method.
To demonstrate this, the following example expands the previous code snippet so that it uses the firstStringMethod
delegate to call a couple of other methods on another object—an instance method and a static method. For this, the Currency
struct is defined. This type has its own override of ToString
and a static method with the same signature, GetCurrencyUnit
. This way, the same delegate variable can be used to invoke these methods (code file GetAStringDemo/Currency.cs
):
struct Currency
{
public uint Dollars;
public ushort Cents;
public Currency(uint dollars, ushort cents)
{
Dollars = dollars;
Cents = cents;
}
public override string ToString() => $"${Dollars}.{Cents,2:00}";
public static string GetCurrencyUnit() => "Dollar";
public static explicit operator Currency (float value)
{
checked
{
uint dollars = (uint)value;
ushort cents = (ushort)((value - dollars) * 100);
return new Currency(dollars, cents);
}
}
public static implicit operator float (Currency value) =>
value.Dollars + (value.Cents / 100.0f);
public static implicit operator Currency (uint value) =>
new Currency(value, 0);
public static implicit operator uint (Currency value) =>
value.Dollars;
}
Now you can use the GetAString
instance as follows (code file GetAStringDemo/Program.cs
):
private delegate string GetAString();
//…
var balance = new Currency(34, 50);
// firstStringMethod references an instance method
firstStringMethod = balance.ToString;
Console.WriteLine($"String is {firstStringMethod()}");
// firstStringMethod references a static method
firstStringMethod = new GetAString(Currency.GetCurrencyUnit);
Console.WriteLine($"String is {firstStringMethod()}");
This code shows how you can call a method via a delegate and subsequently reassign the delegate to refer to different methods on different instances of classes, even static methods or methods against instances of different types of class, provided that the signature of each method matches the delegate definition.
When you run the application, you get the output from the different methods that are referenced by the delegate:
String is 40
String is $34.50
String is Dollar
Now that you've been introduced to the foundations of delegates, it's time to move on to something more useful and practical: passing delegates to methods.
This example defines a MathOperations
class that uses a couple of static methods to perform two operations on doubles. Then you use delegates to invoke these methods. The MathOperations
class looks like this (code file SimpleDelegates/MathOperations
):
public static class MathOperations
{
public static double MultiplyByTwo(double value) => value * 2;
public static double Square(double value) => value * value;
}
You invoke these methods as follows (code file SimpleDelegates/Program.cs
):
using System;
DoubleOp[] operations =
{
MathOperations.MultiplyByTwo,
MathOperations.Square
};
for (int i=0; i < operations.Length; i++)
{
Console.WriteLine($"Using operations[{i}]:");
ProcessAndDisplayNumber(operations[i], 2.0);
ProcessAndDisplayNumber(operations[i], 7.94);
ProcessAndDisplayNumber(operations[i], 1.414);
Console.WriteLine();
}
void ProcessAndDisplayNumber(DoubleOp action, double value)
{
double result = action(value);
Console.WriteLine($"Value is {value}, result of operation is {result}");
}
delegate double DoubleOp(double x);
In this code, you instantiate an array of DoubleOp
delegates (remember that after you have defined a delegate class, you can basically instantiate instances just as you can with normal classes, so putting some into an array is no problem). Each element of the array is initialized to refer to a different operation implemented by the MathOperations
class. Then, you loop through the array, applying each operation to three different values. This illustrates one way of using delegates—to group methods together into an array so that you can call several methods in a loop.
The key lines in this code are the ones in which you actually pass each delegate to the ProcessAndDisplayNumber
method, such as this:
ProcessAndDisplayNumber(operations[i], 2.0);
This passes in the name of a delegate but without any parameters. Given that operations[i] is a delegate, syntactically the following is true:
operations[i] means the delegate (that is, the method represented by the delegate).
operations[i](2.0) means actually calling this method, passing in the value in parentheses.
The ProcessAndDisplayNumber method is defined to take a delegate as its first parameter:
void ProcessAndDisplayNumber(DoubleOp action, double value)
Then, within the implementation of this method, you call this:
double result = action(value);
This actually causes the method that is wrapped up by the action delegate instance to be called, and its return value is stored in result. Running this example gives you the following:
Using operations[0]:
Value is 2, result of operation is 4
Value is 7.94, result of operation is 15.88
Value is 1.414, result of operation is 2.828
Using operations[1]:
Value is 2, result of operation is 4
Value is 7.94, result of operation is 63.043600000000005
Value is 1.414, result of operation is 1.9993959999999997
Instead of defining a new delegate type with every parameter and return type, you can use the Action<T>
and Func<T>
delegates. The generic Action<T>
delegate is meant to reference a method with void
return. This delegate class exists in different variants so that you can pass up to 16 different parameter types. The Action
class without the generic parameter is for calling methods without parameters. Action<in T>
is for calling a method with one parameter; Action<in T1, in T2>
is for a method with two parameters; and Action<in T1, in T2, in T3, in T4, in T5, in T6, in T7, in T8>
is for a method with eight parameters.
The Func<T>
delegates can be used in a similar manner. Func<T>
allows you to invoke methods with a return type. Similar to Action<T>
, Func<T>
is defined in different variants to pass up to 16 parameter types and a return type. Func<out TResult>
is the delegate type to invoke a method with a return type and without parameters. Func<in T, out TResult>
is for a method with one parameter, and Func<in T1, in T2, in T3, in T4, out TResult>
is for a method with four parameters.
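As a brief illustration of these variants, the following sketch assigns lambda expressions and a method name to a few Action and Func types. The variable names are made up for this example and are not part of the book's sample code:

```csharp
using System;

// Action: no parameters, no return value
Action greet = () => Console.WriteLine("Hello");

// Action<in T>: one parameter, no return value
Action<string> print = Console.WriteLine;

// Func<out TResult>: no parameters, returns a value
Func<int> answer = () => 42;

// Func<in T1, in T2, out TResult>: two parameters;
// the last type argument is always the return type
Func<double, double, double> multiply = (x, y) => x * y;

greet();                                // Hello
print("Printed via Action<string>");    // Printed via Action<string>
Console.WriteLine(answer());            // 42
Console.WriteLine(multiply(3, 2));      // 6
```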
The example in the preceding section declared a delegate with a double
parameter and a double
return type:
delegate double DoubleOp(double x);
Instead of declaring the custom delegate DoubleOp
, you can use the Func<in T, out TResult>
delegate. You can declare a variable of the delegate type or, as shown here, an array of the delegate type:
Func<double, double>[] operations =
{
MathOperations.MultiplyByTwo,
MathOperations.Square
};
and use it with the ProcessAndDisplayNumber
method as a parameter:
static void ProcessAndDisplayNumber(Func<double, double> action,
double value)
{
double result = action(value);
Console.WriteLine($"Value is {value}, result of operation is {result}");
}
So far, each of the delegates you have used wraps just one method call. Calling the delegate amounts to calling that method. If you want to call more than one method, you need to make an explicit call through a delegate more than once. However, it is possible for a delegate to wrap more than one method. Such a delegate is known as a multicast delegate. When a multicast delegate is called, it successively calls each method in order. For this to work, the delegate signature should return void; otherwise, you would only get the result of the last method invoked by the delegate.
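To make the remark about return values concrete, here is a minimal sketch with a multicast delegate of type Func<int>. The variable names are hypothetical. Both methods run, but the caller sees only the result of the last one:

```csharp
using System;

// A multicast delegate with a non-void return type
Func<int> getNumber = () => 1;
getNumber += () => 2;

// Both lambdas are invoked, but only the last result is returned.
int result = getNumber();
Console.WriteLine(result);                                  // 2

// The invocation list still contains both methods.
Console.WriteLine(getNumber.GetInvocationList().Length);    // 2
```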
With a void
return type, you can use the Action<double>
delegate (code file MulticastDelegates/Program.cs
):
Action<double> operations = MathOperations.MultiplyByTwo;
operations += MathOperations.Square;
In the earlier example, you wanted to store references to two methods, so you instantiated an array of delegates. Here, you simply add both operations into the same multicast delegate. Multicast delegates recognize the operators + and += to add methods, and - and -= to remove them. Alternatively, you can expand the last two lines of the preceding code, as in this snippet:
Action<double> operation1 = MathOperations.MultiplyByTwo;
Action<double> operation2 = MathOperations.Square;
Action<double> operations = operation1 + operation2;
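Removing a method from a multicast delegate works the same way with -=. The following sketch uses local functions in place of the MathOperations class to stay self-contained; that substitution is an assumption made for this illustration. It also shows that removing the last method leaves the variable null:

```csharp
using System;

void MultiplyByTwo(double value) =>
    Console.WriteLine($"Multiplying by 2: {value} gives {value * 2}");
void Square(double value) =>
    Console.WriteLine($"Squaring: {value} gives {value * value}");

Action<double>? operations = MultiplyByTwo;
operations += Square;               // invocation list: MultiplyByTwo, Square
operations?.Invoke(3.0);            // both methods are called

operations -= MultiplyByTwo;        // invocation list: Square
operations?.Invoke(3.0);            // only Square is called

operations -= Square;               // nothing left, so the variable is null
Console.WriteLine(operations is null);  // True
operations?.Invoke(3.0);            // no method is called
```

Because a delegate variable can become null in this way, the null-conditional operator ?. is a convenient guard before invoking it.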
With the sample project MulticastDelegates
, the MathOperations
type from SimpleDelegates
has been changed to return void and to display the results on the console (code file MulticastDelegates/MathOperations.cs
):
public static class MathOperations
{
public static void MultiplyByTwo(double value) =>
Console.WriteLine($"Multiplying by 2: {value} gives {value * 2}");
public static void Square(double value) =>
Console.WriteLine($"Squaring: {value} gives {value * value}");
}
To accommodate this change, you also have to rewrite ProcessAndDisplayNumber
(code file MulticastDelegates/Program.cs
):
static void ProcessAndDisplayNumber(Action<double> action, double value)
{
Console.WriteLine($"ProcessAndDisplayNumber called with value = {value}");
action(value);
Console.WriteLine();
}
Now you can try your multicast delegate:
Action<double> operations = MathOperations.MultiplyByTwo;
operations += MathOperations.Square;
ProcessAndDisplayNumber(operations, 2.0);
ProcessAndDisplayNumber(operations, 7.94);
ProcessAndDisplayNumber(operations, 1.414);
Each time ProcessAndDisplayNumber
is called, it displays a message saying that it has been called. Then the following statement causes each of the method calls in the action
delegate instance to be called in succession:
action(value);
Running the preceding code produces this result:
ProcessAndDisplayNumber called with value = 2
Multiplying by 2: 2 gives 4
Squaring: 2 gives 4
ProcessAndDisplayNumber called with value = 7.94
Multiplying by 2: 7.94 gives 15.88
Squaring: 7.94 gives 63.043600000000005
ProcessAndDisplayNumber called with value = 1.414
Multiplying by 2: 1.414 gives 2.828
Squaring: 1.414 gives 1.9993959999999997
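If you are using multicast delegates, be aware that the methods chained to the same delegate are invoked synchronously, in the order in which they were added to the invocation list. Even so, avoid writing code that relies on the methods being called in a particular order: handlers are often attached by independent parts of an application, and the order in which they are added can easily change.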
Invoking multiple methods by one delegate might cause an even bigger problem. The multicast delegate contains a collection of delegates to invoke one after the other. If one of the methods invoked by a delegate throws an exception, the complete iteration stops. Consider the following MulticastIteration
example. Here, the simple delegate Action
is used. This delegate is meant to invoke the methods One
and Two
, which fulfill the parameter and return type requirements of the delegate. Be aware that method One
throws an exception (code file MulticastDelegatesUsingInvocationList/Program.cs
):
static void One()
{
Console.WriteLine("One");
throw new Exception("Error in One");
}
static void Two()
{
Console.WriteLine("Two");
}
With the top-level statements, delegate d1
is created to reference method One
; next, the address of method Two
is added to the same delegate. d1
is invoked to call both methods. The exception is caught in a try
/
catch
block:
Action d1 = One;
d1 += Two;
try
{
d1();
}
catch (Exception)
{
Console.WriteLine("Exception caught");
}
Only the first method is invoked by the delegate. Because the first method throws an exception, iterating the delegates stops here, and method Two is never invoked:
One
Exception caught
In such a scenario, you can avoid the problem by iterating the list on your own. The Delegate
class defines the method GetInvocationList
that returns an array of Delegate
objects. You can now use these delegates to invoke the methods associated with them directly, catch exceptions, and continue with the next iteration (code file MulticastDelegatesUsingInvocationList/Program.cs
):
Action d1 = One;
d1 += Two;
Delegate[] delegates = d1.GetInvocationList();
foreach (Action d in delegates)
{
try
{
d();
}
catch (Exception)
{
Console.WriteLine("Exception caught");
}
}
When you run the application with the code changes, you can see that the iteration continues with the next method after the exception is caught:
One
Exception caught
Two
Up to this point, a method must already exist for the delegate to work (that is, the delegate is defined with the same signature as the method(s) it will be used with). However, there is another way to use delegates—with anonymous methods. An anonymous method is a block of code that is used as the parameter for the delegate.
The syntax for defining a delegate with an anonymous method doesn't change. It's when the delegate is instantiated that things change. The following simple console application shows how using an anonymous method can work (code file AnonymousMethods/Program.cs
):
string mid = ", middle part,";
Func<string, string> anonDel = delegate(string param)
{
param += mid;
param += " and this was added to the string.";
return param;
};
Console.WriteLine(anonDel("Start of string"));
The delegate Func<string, string>
takes a single string parameter and returns a string. anonDel
is a variable of this delegate type. Instead of assigning the name of a method to this variable, a simple block of code is used, prefixed by the delegate
keyword and followed by a string parameter.
As you can see, the block of code uses a method-level string variable, mid
, which is defined outside of the anonymous method and adds it to the parameter that was passed in. The code then returns the string value. When the delegate is called, a string is passed in as the parameter, and the returned string is output to the console.
The benefit of using anonymous methods is that it reduces the amount of code you have to write. You don't need to define a method just to use it with a delegate. This becomes evident when you define the delegate for an event (events are discussed later in this chapter), and it helps reduce the complexity of the code, especially where several events are defined. With anonymous methods, the code does not perform faster. The compiler still defines a method; the method just has an automatically assigned name that you don't need to know.
You must follow a couple of rules when using anonymous methods. An anonymous method can't have a jump statement (break
, goto
, or continue
) that has a target outside of the anonymous method. The reverse is also true: a jump statement outside the anonymous method cannot have a target inside the anonymous method.
If you have to write the same functionality more than once, don't use anonymous methods. In this case, instead of duplicating the code, write a named method. You have to write it only once and reference it by its name.
One way lambda expressions are used is to assign code—using a lambda expression—to a parameter. You can use lambda expressions whenever you have a delegate parameter type. The previous example using anonymous methods is modified in the following snippet to use a lambda expression:
string mid = ", middle part,";
Func<string, string> lambda = param =>
{
param += mid;
param += " and this was added to the string.";
return param;
};
Console.WriteLine(lambda("Start of string"));
The left side of the lambda operator, =>
, lists the necessary parameters. The right side following the lambda operator defines the implementation of the method assigned to the variable lambda
.
With lambda expressions, there are several ways to define parameters. If there's only one parameter, just the name of the parameter is enough. The following lambda expression uses the parameter named s
. Because the delegate type defines a string
parameter, s
is of type string
. The implementation returns a formatted string that is finally written to the console when the delegate is invoked: change uppercase TEST
(code file LambdaExpressions/Program.cs
):
Func<string, string> oneParam = s => $"change uppercase {s.ToUpper()}";
Console.WriteLine(oneParam("test"));
If a delegate uses more than one parameter, you can combine the parameter names inside parentheses. Here, the parameters x and y are of type double as defined by the Func<double, double, double> delegate:
Func<double, double, double> twoParams = (x, y) => x * y;
Console.WriteLine(twoParams(3, 2));
For convenience, you can add the parameter types to the variable names inside the parentheses. If the compiler can't match an overloaded version, using parameter types can help resolve the matching delegate:
Func<double, double, double> twoParamsWithTypes =
(double x, double y) => x * y;
Console.WriteLine(twoParamsWithTypes(4, 2));
If the lambda expression consists of a single statement, a method block with curly brackets and a return statement are not needed. There's an implicit return
added by the compiler:
Func<double, double> square = x => x * x;
It's completely legal to add curly brackets, a return
statement, and semicolons. Usually it's just easier to read without them:
Func<double, double> square = x =>
{
return x * x;
};
However, if you need multiple statements in the implementation of the lambda expression, curly brackets and the return
statement are required:
Func<string, string> lambda = param =>
{
param += mid;
param += " and this was added to the string.";
return param;
};
With lambda expressions, you can access variables outside the block of the lambda expression. This is known as closure. Closures are a great feature, but they can also be dangerous if not used correctly.
In the following example, a lambda expression of type Func<int, int>
requires one int
parameter and returns an int
. The parameter for the lambda expression is defined with the variable x
. The implementation also accesses the variable someVal
, which is outside the lambda expression. As long as you do not assume that the lambda expression creates a new method that is used later when f
is invoked, this might not look confusing at all. Looking at this code block, the returned value calling f
should be the value from x
plus 5, but this might not be the case (code file LambdaExpressions/Program.cs
):
int someVal = 5;
Func<int, int> f = x => x + someVal;
Assuming the variable someVal
is later changed and then the lambda expression is invoked, the new value of someVal
is used. The result of invoking f(3)
is 10
:
someVal = 7;
Console.WriteLine(f(3));
Similarly, when you're changing the value of a closure variable within the lambda expression, you can access the changed value outside of the lambda expression.
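Both directions can be seen in a short sketch. The variable names are hypothetical; the point is only that the lambda works with the captured variable itself rather than with a copy of its value:

```csharp
using System;

int counter = 0;

// The lambda captures the variable counter, not its current value.
Action increment = () => counter++;

increment();
increment();
// Changes made inside the lambda are visible outside.
Console.WriteLine(counter);     // 2

Func<int> read = () => counter;
counter = 10;
// Changes made outside are visible inside the lambda.
Console.WriteLine(read());      // 10
```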
Now, you might wonder how it is possible at all to access variables outside of the lambda expression from within the lambda expression. To understand this, consider what the compiler does when you define a lambda expression. With the lambda expression x => x + someVal
, the compiler creates an anonymous class that has a constructor to pass the outer variable. The constructor depends on how many variables you access from the outside. With this simple example, the constructor accepts an int
. The anonymous class contains an anonymous method that has the implementation as defined by the lambda expression, with the parameters and return type:
public class AnonymousClass
{
private int _someVal;
public AnonymousClass(int someVal) => _someVal = someVal;
public int AnonymousMethod(int x) => x + _someVal;
}
In case a value outside of the scope of the lambda expression needs to be returned, a reference type is used.
Using the lambda expression and invoking the method creates an instance of the anonymous class and passes the value of the variable from the time when the call is made.
Events are based on delegates and offer a publish/subscribe mechanism to delegates. You can find events everywhere across the framework. In Windows applications, the Button
class offers the Click
event. This type of event is a delegate. A handler method that is invoked when the Click
event is fired needs to be defined and to include parameters as defined by the delegate type.
In the code example shown in this section, events are used to connect the CarDealer
and Consumer
classes. The CarDealer
class offers an event when a new car arrives. The Consumer
class subscribes to the event to be informed when a new car arrives.
You start with a CarDealer
class that offers a subscription based on events. CarDealer
defines the event named NewCarCreated
of type EventHandler<CarInfoEventArgs>
with the event
keyword. Inside the method CreateANewCar
, the event NewCarCreated
is fired by invoking the method RaiseNewCarCreated
. The implementation of this method verifies whether the delegate is not null and raises the event (code file EventsSample/CarDealer.cs
):
public class CarInfoEventArgs: EventArgs
{
public CarInfoEventArgs(string car) => Car = car;
public string Car { get; }
}
public class CarDealer
{
public event EventHandler<CarInfoEventArgs>? NewCarCreated;
public void CreateANewCar(string car)
{
Console.WriteLine($"CarDealer, new car {car}");
RaiseNewCarCreated(car);
}
private void RaiseNewCarCreated(string car) =>
NewCarCreated?.Invoke(this, new CarInfoEventArgs(car));
}
The class CarDealer
offers the event NewCarCreated
of type EventHandler<CarInfoEventArgs>
. As a convention, events typically use methods with two parameters; the first parameter is an object and contains the sender of the event, and the second parameter provides information about the event. The second parameter is different for various event types. You could create a specific delegate type such as
public delegate void NewCarCreatedHandler(object sender, CarInfoEventArgs e);
or use the generic type EventHandler<TEventArgs> as shown in the sample code. With EventHandler<TEventArgs>, the first parameter needs to be of type object, and the second parameter is of type TEventArgs. By convention, the type used for TEventArgs derives from the base class EventArgs, which is the case with CarInfoEventArgs. Older versions of .NET Framework enforced this with a generic constraint on TEventArgs; current .NET versions no longer declare the constraint.
public event EventHandler<CarInfoEventArgs>? NewCarCreated;
The delegate EventHandler<TEventArgs> is defined as follows:
public delegate void EventHandler<TEventArgs>(object? sender, TEventArgs e);
Defining the event in one line is a C# shorthand notation. The compiler creates a variable of the delegate type EventHandler<CarInfoEventArgs> and adds methods to subscribe to and unsubscribe from the delegate. The long form of the shorthand notation is shown next. This is similar to auto-properties and full properties. With events, the add and remove keywords are used to add and remove a handler to the delegate:
private EventHandler<CarInfoEventArgs>? _newCarCreated;
public event EventHandler<CarInfoEventArgs>? NewCarCreated
{
add => _newCarCreated += value;
remove => _newCarCreated -= value;
}
The class CarDealer fires the event by calling the Invoke method of the delegate. This invokes all the handlers that are subscribed to the event, one after the other. Remember, as previously shown with multicast delegates, an exception thrown by one handler stops the remaining handlers from being called, and you should not rely on the handlers running in a particular order. To have more control over calling the handler methods, you can use the Delegate class method GetInvocationList to access every item in the delegate list and invoke each on its own, as shown earlier.
NewCarCreated?.Invoke(this, new CarInfoEventArgs(car));
Firing the event requires only a one-liner. Prior to C# 6, firing the event was more complex—checking the delegate for null
(if no subscriber was registered) before invoking the method, which should have been done in a thread-safe manner. Now, checking for null is done using the ?.
operator.
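For comparison, the older pattern described here can be sketched as follows. The method name RaiseNewCarCreatedClassic is hypothetical and does not appear in the book's sample; the types are trimmed-down copies of CarDealer and CarInfoEventArgs so that the snippet runs on its own. Only the body of the raise method reflects the pre-C# 6 style; the surrounding code uses current syntax:

```csharp
using System;

CarDealer dealer = new();
dealer.CreateANewCar("Williams");   // no subscriber yet, so no handler runs
dealer.NewCarCreated += (sender, e) =>
    Console.WriteLine($"Handler: car {e.Car} is new");
dealer.CreateANewCar("Ferrari");    // the handler is invoked

public class CarInfoEventArgs : EventArgs
{
    public CarInfoEventArgs(string car) => Car = car;
    public string Car { get; }
}

public class CarDealer
{
    public event EventHandler<CarInfoEventArgs>? NewCarCreated;

    public void CreateANewCar(string car)
    {
        Console.WriteLine($"CarDealer, new car {car}");
        RaiseNewCarCreatedClassic(car);
    }

    // Hypothetical pre-C# 6 variant of RaiseNewCarCreated
    private void RaiseNewCarCreatedClassic(string car)
    {
        // Copy the delegate to a local variable first. If another thread
        // unsubscribes between the null check and the call, the local copy
        // is unaffected, so no NullReferenceException can occur.
        EventHandler<CarInfoEventArgs>? handler = NewCarCreated;
        if (handler != null)
        {
            handler(this, new CarInfoEventArgs(car));
        }
    }
}
```

The ?. operator evaluates the delegate only once before invoking it, which is why the one-liner gives the same protection as the local copy.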
The class Consumer
is used as the event listener. This class subscribes to the event of the CarDealer
and defines the method NewCarIsHere
that in turn fulfills the requirements of the EventHandler<CarInfoEventArgs>
delegate with parameters of type object
and CarInfoEventArgs
(code file EventsSample/Consumer.cs
):
public record Consumer(string Name)
{
public void NewCarIsHere(object? sender, CarInfoEventArgs e) =>
Console.WriteLine($"{Name}: car {e.Car} is new");
}
Now the event publisher and subscriber need to connect. You do this by using the NewCarCreated event of the CarDealer to create a subscription with +=. The consumer sebastian subscribes to the event, and after the car Williams is created, the consumer max subscribes. After the car Aston Martin is created, sebastian unsubscribes with -= (code file EventsSample/Program.cs):
CarDealer dealer = new();
Consumer sebastian = new("Sebastian");
dealer.NewCarCreated += sebastian.NewCarIsHere;
dealer.CreateANewCar("Williams");
Consumer max = new("Max");
dealer.NewCarCreated += max.NewCarIsHere;
dealer.CreateANewCar("Aston Martin");
dealer.NewCarCreated -= sebastian.NewCarIsHere;
dealer.CreateANewCar("Ferrari");
When you run the application, a Williams arrives, and Sebastian is informed. After that, Max registers for the subscription as well, and both Sebastian and Max are informed about the new Aston Martin. Then Sebastian unsubscribes, and only Max is informed about the Ferrari:
CarDealer, new car Williams
Sebastian: car Williams is new
CarDealer, new car Aston Martin
Sebastian: car Aston Martin is new
Max: car Aston Martin is new
CarDealer, new car Ferrari
Max: car Ferrari is new
This chapter provided the basics of delegates, lambda expressions, and events. You learned how to declare a delegate and add methods to the delegate list; you learned how to implement methods called by delegates with lambda expressions; and you learned the process of declaring event handlers to respond to an event, as well as how to create a custom event and use the patterns for raising the event.
Using delegates and events in the design of a large application can reduce dependencies and the coupling of layers. This enables you to develop components that have a higher reusability factor.
Lambda expressions are C# language features based on delegates. With these, you can reduce the amount of code you need to write.
The next chapter covers the use of different forms of collections.
Chapter 6, “Arrays,” covers arrays and the interfaces implemented by the Array
class. The size of arrays is fixed. If the number of elements is dynamic, you should use a collection class instead of an array.
List<T>
is a collection class that can be compared to arrays, but there are also other kinds of collections: queues, stacks, linked lists, dictionaries, and sets. The other collection classes have partly different APIs to access the elements in the collection and often a different internal structure for how the items are stored in memory. This chapter covers all of these collection classes and their differences, including performance differences.
Most collection classes are in the System.Collections
and System.Collections.Generic
namespaces. Generic collection classes are located in the System.Collections.Generic
namespace. Collection classes that are specialized for a specific type are located in the System.Collections.Specialized
namespace. Thread-safe collection classes are in the System.Collections.Concurrent
namespace. Immutable collection classes are in the System.Collections.Immutable
namespace.
Of course, there are also other ways to group collection classes. Collections can be grouped into lists, collections, and dictionaries based on the interfaces that are implemented by the collection class.
The following table describes the most important interfaces implemented by collections and lists:
INTERFACE | DESCRIPTION |
---|---|
IEnumerable<T> | The interface IEnumerable is required by the foreach statement. This interface defines the method GetEnumerator, which returns an enumerator that implements the IEnumerator interface. |
ICollection<T> | ICollection<T> is implemented by generic collection classes. With this you can get the number of items in the collection (Count property) and copy the collection to an array (CopyTo method). You can also add and remove items from the collection (Add, Remove, Clear). |
IList<T> | The IList<T> interface is for lists where elements can be accessed from their position. This interface defines an indexer, as well as ways to insert or remove items from specific positions (Insert, RemoveAt methods). IList<T> derives from ICollection<T>. |
ISet<T> | This interface is implemented by sets. Sets allow combining different sets into a union, getting the intersection of two sets, and checking whether two sets overlap. ISet<T> derives from ICollection<T>. |
IDictionary<TKey, TValue> | The interface IDictionary<TKey, TValue> is implemented by generic collection classes that have a key and a value. With this interface, all the keys and values can be accessed, items can be accessed with an indexer of type TKey, and items can be added or removed. |
ILookup<TKey, TElement> | Similar to the IDictionary<TKey, TValue> interface, lookups have keys and values. However, with lookups, the collection can contain multiple values with one key. |
IComparer<T> | The interface IComparer<T> is implemented by a comparer and used to sort elements inside a collection with the Compare method. |
IEqualityComparer<T> | IEqualityComparer<T> is implemented by a comparer that can be used for keys in a dictionary. With this interface, the objects can be compared for equality. |
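To give a feel for how some of these interfaces are used, here is a minimal sketch that accesses a List<T> and a HashSet<T> through IList<T>, ICollection<T>, and ISet<T>, and builds an ILookup with the LINQ method ToLookup. The sample data and variable names are assumptions made for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// IList<T>: positional access, Insert, and RemoveAt
IList<string> racers = new List<string> { "Senna", "Prost" };
racers.Insert(1, "Lauda");              // Senna, Lauda, Prost
racers.RemoveAt(0);                     // Lauda, Prost
Console.WriteLine(racers[0]);           // Lauda

// ICollection<T>: Count, Add, Remove, Clear, CopyTo
ICollection<string> collection = racers;
collection.Add("Hill");
collection.Add("Hakkinen");
Console.WriteLine(collection.Count);    // 4

// ISet<T>: set operations such as Overlaps and UnionWith
ISet<int> first = new HashSet<int> { 1, 2, 3 };
ISet<int> second = new HashSet<int> { 3, 4 };
Console.WriteLine(first.Overlaps(second));  // True
first.UnionWith(second);
Console.WriteLine(first.Count);             // 4

// ILookup: one key can map to several values
ILookup<char, string> byInitial = collection.ToLookup(name => name[0]);
Console.WriteLine(string.Join(", ", byInitial['H']));   // Hill, Hakkinen
```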
For resizable lists, .NET offers the generic class List<T>
. This class implements the IList
, ICollection
, IEnumerable
, IList<T>
, ICollection<T>
, and IEnumerable<T>
interfaces.
The following examples use instances of the record Racer as elements to be added to the collection; each instance represents a Formula 1 racer. This type has five properties: Id, FirstName, LastName, Country, and the number of Wins, as specified with the positional record constructor. An overloaded constructor allows you to specify only four values when initializing the object. The method ToString is overridden to return the name of the racer. The record Racer also implements the generic interface IComparable<T> for sorting racer elements and IFormattable to allow passing custom format strings (code file ListSamples/Racer.cs):
public record Racer(int ID, string FirstName, string LastName, string Country,
int Wins) : IComparable<Racer>, IFormattable
{
public Racer(int id, string firstName, string lastName, string country)
: this(id, firstName, lastName, country, Wins: 0)
{ }
public override string ToString() => $"{FirstName} {LastName}";
public string ToString(string? format, IFormatProvider? formatProvider) =>
format?.ToUpper() switch
{
null => ToString(),
"N" => ToString(),
"F" => FirstName,
"L" => LastName,