PHP 8 Distilled

syndicated from feeds.feedburner.com on December 8, 2020

By Matthew Turland

PHP 8 is a significant release for much more than just its version number: it’s absolutely packed with shiny new language features, potential performance improvements, and fixes to many unintuitive behaviors and inconsistencies in previous iterations of the language.

This article was published in the December 2020 issue of php[architect] magazine.

This article won’t provide a comprehensive review of every new addition or change to the language. For that, it’s best to review the relevant RFCs or the migration guide. However, it provides an overview of major new features and notable changes, as well as direction on how to get started with your upgrade.

New Features

Constructor Property Promotion

One long-held annoyance with the PHP object model involves the amount of boilerplate required to declare instance properties in a class, receive values for them via constructor parameters, and then assign those parameter values to those instance properties.

Constructor property promotion deals with the typical use case for this situation. It offers a syntax that makes the explicit assignment statements in the constructor body implicit and consolidates the instance property declarations into the constructor parameter declarations, which are then called “promoted parameters.” See Listing 1 for what this syntax looks like.

Listing 1.

<?php

/* Instead of this... */

class Point
{
    private float $x;
    private float $y;
    private float $z;

    public function __construct(
        float $x = 0.0,
        float $y = 0.0,
        float $z = 0.0,
    ) {
        $this->x = $x;
        $this->y = $y;
        $this->z = $z;
    }
}

/* ... you can now do this. */

class Point
{
    public function __construct(
        private float $x = 0.0,
        private float $y = 0.0,
        private float $z = 0.0,
    ) {}
}

Before any logic in the constructor body executes (assuming it’s not empty), assignments for promoted properties happen implicitly in these new constructors. These assignments require the constructor parameters to have the same names as their intended corresponding instance properties.
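As a rough illustration of that ordering, the sketch below validates a promoted property inside the constructor body. The Temperature class and its celsius() accessor are hypothetical names invented for this example.

```php
<?php

// Hypothetical class for illustration; it is not part of the article's listings.
final class Temperature
{
    public function __construct(
        private float $celsius = 0.0,
    ) {
        // The promoted property has already been assigned when the body runs,
        // so it can be read through $this here.
        if ($this->celsius < -273.15) {
            throw new InvalidArgumentException('Below absolute zero');
        }
    }

    public function celsius(): float
    {
        return $this->celsius;
    }
}

$temperature = new Temperature(21.5);
echo $temperature->celsius(), PHP_EOL;
```

Reading the value through $this rather than the parameter is only there to make the implicit assignment visible; inside the constructor both hold the same value.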

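However, this feature does not support some use cases. For example, promoted parameters cannot be variadic, cannot use the callable type, cannot duplicate an explicitly declared property, and are only allowed in non-abstract constructors.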

Named Arguments

If you’ve ever used Python, you may already be familiar with this feature through its implementation in that language, known as keyword arguments.

Named arguments have been a long-disputed addition going back years. The original proposal made in 2013 saw significant updates and eventual acceptance in 2020. Without parser support, many library authors fall back on using arrays to pass in parameters, which is problematic for documenting and enforcing expectations.

To use this feature, you specify values for arguments passed to functions or methods with the name of the corresponding parameter from the function or method signature, rather than passing those arguments in the same positional order as their corresponding parameters in that definition.

Doing so is useful when a function or method has many parameters. Another case is when it has a parameter before the end of the parameter list with a default value you don’t want to pass in explicitly. It can also increase the readability of code for function and method calls by making it easier to assess which argument value corresponds to which defined parameter visually.

See Listing 2 for an example of this feature in action. When using a named argument, specify the argument name without the leading $ included in the parameter name when defining the function or method. A colon (:) follows the name and is then followed by the value for that argument. A comma (,) delimits named arguments as it does traditional positional arguments.

Listing 2.

<?php

/* Instead of having to explicitly specify the default value of the $flags argument, named arguments allow you to skip it by specifying the name of the following $double_encode parameter. */

htmlspecialchars($string, double_encode: false);

/* The usefulness of this becomes more obvious in functions with a lot of parameters with default values. It also makes scalar arguments more self-documenting. */

setcookie('name', '', 0, '', '', false, true);
// versus
setcookie('name', httponly: true);

This feature does have some constraints to be aware of.

  • Specifying parameter names that are not part of a method or function signature produces an error. Concerning semantic versioning, this means that changing a parameter name in a method or function signature is now a backward-incompatible change.
  • Positional arguments must precede named arguments in calls that use both. Otherwise, they produce a compile-time error.
  • Both named and positional arguments must precede any unpacked arguments. String array keys, which call_user_func_array() previously ignored and argument unpacking previously rejected with an error, now map to named arguments.
  • Passing the same argument multiple times results in an error. This is the case whether the offending argument is passed by name each time or is specified using both positional and named arguments that correspond to the same parameter.
  • Variadic method and function definitions will collect unknown named arguments into the variadic parameter.
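The following sketch exercises a few of the constraints above. The makeTag() function is a hypothetical helper written for this example; only the calling rules it demonstrates come from the list.

```php
<?php

// Hypothetical helper, used only to illustrate the rules listed above.
function makeTag(string $name, string $content = '', string ...$attributes): string
{
    $html = '<' . $name;

    // Unknown named arguments are collected into the variadic parameter,
    // keyed by the name used at the call site.
    foreach ($attributes as $key => $value) {
        $html .= ' ' . $key . '="' . htmlspecialchars($value) . '"';
    }

    return $html . '>' . htmlspecialchars($content) . '</' . $name . '>';
}

// A positional argument first, followed by named arguments.
echo makeTag('a', content: 'PHP', href: 'https://www.php.net/'), PHP_EOL;

// String keys in an unpacked array map to named arguments.
echo makeTag(...['name' => 'em', 'content' => 'hello']), PHP_EOL;

// Passing the same parameter by position and by name throws an Error.
try {
    makeTag('a', name: 'b');
} catch (Error $e) {
    echo get_class($e), ': ', $e->getMessage(), PHP_EOL;
}
```

Note that placing a positional argument after a named one is not shown, because it fails at compile time and would prevent the whole file from running.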

Attributes

Of all the new features in PHP 8, this feature is perhaps the most controversial. Its purpose is to offer a natively supported way to add metadata to units of code (i.e., classes, methods, functions, etc.). It is already being adopted by projects like Psalm. There’s even been some discussion of trying to standardize attributes across projects with similar purposes to allow for cross-compatibility.

In the past, we’ve commonly added metadata to code units using docblock annotations, an old concept popularized by several projects.

There have even been some attempts to standardize tags for phpDoc-like projects via PHP-FIG proposals PSR-5 and PSR-19. The usefulness of annotations has dwindled somewhat in recent years as new language features, such as parameter and return types, have gradually taken their place. The advantage of attributes is that we can inspect them using the reflection API; no third-party tools or manual DocBlock parsing is required.

The first proposal for attributes came in 2016 but was ultimately declined. The idea went dormant for years before being proposed again in 2020. Even after its acceptance, it underwent some amendments. Then it received a shorter syntax. Then that shorter syntax was also amended. You can rest assured that the ideas and implementation behind attributes were thoroughly discussed and evaluated.

To show how attributes work, let’s look at an example that other language features haven’t supplanted yet: the @link DocBlock tag, which associates a link to an external resource with the code it annotates. This tag has two parameters: the URI of the resource and an optional description. One potential use for this is linking to bug reports affecting dependencies used by the annotated code.

First, we must define the attribute using a class like the one in Listing 3. This class is itself annotated with an attribute named Attribute defined by PHP core. #[ and ] demarcate the start and end, respectively, of the code for the attribute. This Attribute attribute informs PHP that the annotated class represents an attribute. Aside from this, the class looks and functions like any other class.

Listing 3.

<?php

namespace MyNamespace\Attributes;

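/* Import the core Attribute class; without this, #[Attribute] would
   resolve to MyNamespace\Attributes\Attribute. */
use Attribute;

#[Attribute]
class Link
{
    public function __construct(
        private string $uri,
        private ?string $description = null
    ) {}

    public function getUri(): string
    {
        return $this->uri;
    }

    public function getDescription(): ?string
    {
        return $this->description;
    }
}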

Next, we use the attribute in a separate class as in Listing 4. Since our attribute class definition exists in a different namespace, we import it with a use statement. We then use the attribute to annotate the class declaration, similarly to how we would invoke a function but within #[ and ]. We pass in two strings corresponding to the $uri and $description constructor parameters of the attribute class defined in Listing 3.

Listing 4.

<?php

namespace MyNamespace;

use MyNamespace\Attributes\Link;

#[Link('http://tools.ietf.org/html/rfc3986', '(the URI specification)')]
class Uri
{
    /* ... */
}

Lastly, we can programmatically locate and inspect instances of attributes within the codebase. You can find an example of this in Listing 5, which finds instances of our Link attribute used to annotate classes and outputs a list of them. This example is admittedly a bit contrived or incomplete, but its purpose is to provide a simple conceptual illustration of how code can analyze attributes through introspection.

Listing 5.

<?php

require_once __DIR__ . '/vendor/autoload.php';

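use MyNamespace\Attributes\Link;

/* Only classes that are already loaded appear in get_declared_classes(). */
foreach (get_declared_classes() as $class) {
    $reflector = new \ReflectionClass($class);
    $attributes = $reflector->getAttributes(Link::class);
    foreach ($attributes as $attribute) {
        /* getAttributes() returns ReflectionAttribute objects;
           newInstance() builds the underlying Link instance. */
        $link = $attribute->newInstance();
        echo $class, ' - ', $link->getUri(), ' - ', $link->getDescription(), PHP_EOL;
    }
}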
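Because the listings above are spread across several files, a condensed single-file sketch may be easier to experiment with. It declares simplified, hypothetical versions of Link and Uri in the global namespace with public promoted properties, and restricts the attribute to classes with Attribute::TARGET_CLASS; neither simplification comes from the original listings.

```php
<?php

// Hypothetical single-file variant of the Link attribute, for illustration only.
#[Attribute(Attribute::TARGET_CLASS)]
final class Link
{
    public function __construct(
        public string $uri,
        public ?string $description = null,
    ) {}
}

#[Link('http://tools.ietf.org/html/rfc3986', '(the URI specification)')]
final class Uri {}

$reflector = new ReflectionClass(Uri::class);

foreach ($reflector->getAttributes(Link::class) as $attribute) {
    // getArguments() exposes the raw arguments without instantiating Link.
    $arguments = $attribute->getArguments();

    // newInstance() constructs the Link object from those arguments.
    $link = $attribute->newInstance();

    echo $attribute->getName(), ': ', $link->uri, ' ', $link->description, PHP_EOL;
}
```

The attribute class is only instantiated when newInstance() is called, so tools that merely need the arguments can stop at getArguments().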

Attributes are a subject with enough complexity that they could probably have an entire article dedicated solely to them. If you want a deeper dive into attributes than this article can include, check out Brent Roose’s blog post on them.

Union Types

Even if you don’t realize it, you’ve probably already seen conceptual use of union types: before PHP 8, they existed within the Type parameter of @param DocBlock tags; see Listing 6 for an example of what this looks like.

Listing 6.

<?php

/**
 * @param array|\Traversable $list
 */
public function doThingWithList($list)
{
    /* ... */
}

The difference between these union types and those used in DocBlock tags is that PHP uses these for type checking. Rather than leaving parameters untyped and then manually checking their types using the instanceof operator in a method or function body, you can use a union type to accomplish the same thing much more concisely and readably.

Union types denote that a variable may have a type from a list of two or more possible types. The | character, commonly called the pipe, represents the bitwise inclusive OR operator in other contexts but also separates the individual types within a union type. For example, a variable that may hold an integer or a floating-point number could have the union type int|float.
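To make the type checking concrete, here is a minimal sketch built around a hypothetical half() function. It accepts either member of the int|float union and lets PHP reject anything else.

```php
<?php

// Hypothetical function: accepts either numeric type and rejects everything else.
function half(int|float $number): float
{
    return $number / 2;
}

echo half(3), PHP_EOL;
echo half(2.5), PHP_EOL;

try {
    half([3]); // an array is not part of the union, so PHP throws a TypeError
} catch (TypeError $e) {
    echo get_class($e), PHP_EOL;
}
```

No instanceof or is_int() checks are needed in the function body; the engine performs the check when the function is called.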

Union types were first proposed, and rejected, in 2015. Ironically, a little over a year after this rejection, there was a proposal for a single native union type that saw acceptance and implementation in PHP 7.1: the iterable pseudo-type. It solved the issue illustrated in Listing 6 of explicitly supporting both array and Traversable values without union types by adding a pseudo-type to represent both of them.

2019 saw a second proposal for union types, this time an accepted one. The implementation leaves a large amount of functionality to potential future scope, such as support for type aliasing. The proposal goes into more detail, but Listing 7 provides a summary of this feature’s restrictions in code.

Listing 7.

<?php

class Number
{
    /* Union types work for properties... */
    private int|float $number;

    /* ... parameters ... */
    public function setNumber(int|float $number): void
    {
        $this->number = $number;
    }

    /* ... and return types. */
    public function getNumber(): int|float
    {
        return $this->number;
    }
}

/* void cannot be part of a union type. */
public function doThing(): int|void; /* This doesn't work. */

/* These do the same thing. */
public function doThing(): Number|null;
public function doThing(): ?Number;

/* false functions as a subtype of boolean, true does not. */
public function doThing(): int|false; /* This works. */
public function doThing(): int|true; /* This doesn't work. */

/* Redundant types aren't allowed. None of these work. */
public function doThing(): int|int;
public function doThing(): bool|false;
use A as B;
public function doThing(): A|B;

This feature also impacts the reflection API. Specifically, it adds a new subclass of ReflectionType, appropriately named ReflectionUnionType. This class contains a getTypes() method that returns an array of ReflectionType instances representing the individual types constituting the relevant union type.
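A short sketch of that reflection API follows, using a hypothetical scale() function as the target. It makes no assumption about the order in which getTypes() returns the member types.

```php
<?php

// Hypothetical function used only as a reflection target.
function scale(int|float $number, int|float $factor): int|float
{
    return $number * $factor;
}

$type = (new ReflectionFunction('scale'))->getReturnType();

// Union types are reported through the new ReflectionUnionType subclass.
if ($type instanceof ReflectionUnionType) {
    foreach ($type->getTypes() as $member) {
        // Each member is a named type such as int or float.
        echo $member->getName(), PHP_EOL;
    }
}
```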

Nullsafe Operator

If you need to access an object property or method that’s nested within other objects in your hierarchy, it can require verbose checks to ensure that the property or method return value at each level of the call chain is not null.

The nullsafe operator ?-> effectively adds a null check against the current value in the chain. If that value is null, further evaluation of the entire expression stops and instead resolves to null. See Listing 8 for an example of what this looks like.

Listing 8.

<?php

/* Instead of this... */
$country = null;
if ($session !== null) {
    $user = $session->user;
    if ($user !== null) {
        $address = $user->getAddress();
        if ($address !== null) {
            $country = $address->country;
        }
    }
}

/* ... do this. */
$country = $session?->user?->getAddress()?->country;
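For a version that runs on its own, the sketch below uses hypothetical User and Address classes shaped like the ones in the listing. It also shows that short-circuiting skips the evaluation of method arguments when the chain has already hit null.

```php
<?php

// Hypothetical classes mirroring the shape of the listing above.
final class Address
{
    public function __construct(public string $country) {}
}

final class User
{
    public function __construct(private ?Address $address = null) {}

    public function getAddress(): ?Address
    {
        return $this->address;
    }
}

$withAddress = new User(new Address('NL'));
$withoutAddress = new User();
$nobody = null;

var_dump($withAddress?->getAddress()?->country);
var_dump($withoutAddress?->getAddress()?->country);
var_dump($nobody?->getAddress()?->country);

// The argument expression is never evaluated because $nobody is null.
$calls = 0;
$count = function () use (&$calls): int {
    return ++$calls;
};
$result = $nobody?->getAddress($count());
var_dump($calls);
```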

This feature does have some limitations. You can use it in read contexts, but not write contexts. You also cannot return the value of a nullsafe chain by reference. See Listing 9 for examples of each of these.

Listing 9.

<?php

/* Nullsafe chains aren't usable for assignments or unsetting variables. */
$foo?->bar->baz = 'baz';
foreach ([1, 2, 3] as $foo?->bar->baz) {}
unset($foo?->bar->baz);
[$foo?->bar->baz] = 'baz';

/* Nullsafe chains are usable in contexts that read an expression value. */
$foo = $a?->b();
if ($a?->b() !== null) {}
foreach ($a?->b() as $value) {}

/* Returning a nullsafe chain by reference is not supported. */
$x = &$foo?->bar;
function &return_by_ref($foo) {
    return $foo?->bar;
}

Throw Expressions

Before PHP 8, throwing an exception involved a statement using the throw keyword. A proposal changed this situation to make throw an expression instead.

This change is significant because it makes throw usable where it wasn’t before, such as inside arrow functions. See Listing 10 for some examples.

Listing 10.

<?php

$callable = fn() => throw new Exception;

$value = $nullableValue ?? throw new InvalidArgumentException;

$value = $falsableValue ?: throw new InvalidArgumentException;

$value = !empty($array) ? reset($array) : throw new InvalidArgumentException;

$condition && throw new Exception;

$condition || throw new Exception;
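As one possible use in practice, the sketch below wraps the null-coalescing form in a hypothetical requireKey() helper, so a missing configuration key fails loudly in a single expression.

```php
<?php

// Hypothetical helper: the throw expression replaces a separate if/throw block.
function requireKey(array $config, string $key): mixed
{
    return $config[$key] ?? throw new OutOfBoundsException("Missing key: $key");
}

$config = ['dsn' => 'sqlite::memory:'];

echo requireKey($config, 'dsn'), PHP_EOL;

try {
    requireKey($config, 'password');
} catch (OutOfBoundsException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```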

Match Expressions

The switch statement is useful but can be difficult to get right when dealing with many case statements or when some of them omit a break or return statement.

After being proposed, declined, proposed again, and accepted, match expressions now provide an alternative. In some ways, in their final form, match expressions became for switch statements what arrow functions are to anonymous functions. They add a syntax that replicates more verbose code using older language constructs.

One significant difference between match expressions and switch statements is that the former, being expressions, always resolve to a value. Thus, we can use them in assignment statements and anywhere else that an expression is allowed.

Another difference is that switch uses a loose comparison (i.e. ==) while match uses a strict comparison that takes the value type into account (i.e. ===).

Finally, once a match expression finds a match, it returns the corresponding value. This behavior contrasts with a case statement, which may allow execution to “fall through” if a break, return, or other terminating statement isn’t present.

See Listing 11 for an example of what match expressions look like and how they work compared to switch statements.

Listing 11.

<?php

/* Instead of this... */
switch ($x) {
    case 1:
    case 2:
        $result = 'foo';
        break;
    case 3:
    case 4:
        $result = 'bar';
        break;
    default:
        $result = 'baz';
        break;
}

/* ... do this. */
$result = match ($x) {
    1, 2 => 'foo',
    3, 4 => 'bar',
    default => 'baz',
};

The outer parts of a match expression are like those of switch statements: they begin with a keyword (match) followed by a parenthesized expression and then an open curly brace and end with a closing curly brace.

Inside the braces, instead of case statements, there are:

  1. comma-delimited lists of one or more expressions,
  2. a rocket or double-arrow (=>),
  3. the match expression value to return if the parenthesized value matches any values to the left of the arrow,
  4. and a trailing comma (,).

As with switch statements, the default keyword specifies a fallback case when the parenthesized expression doesn’t match any preceding values. If a match expression has no default case and does not match a value, PHP throws an UnhandledMatchError instance.

Due to the transition of throw from a statement to an expression, described in the previous section, a match expression can resolve to a throw expression.
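The sketch below pulls these behaviors together in a hypothetical describe() function: strict comparison, an arm that resolves to a throw expression, and the error raised when nothing matches and no default arm exists.

```php
<?php

// Hypothetical function illustrating strict comparison and the no-match error.
function describe(int|string $code): string
{
    return match ($code) {
        200, 204 => 'success',
        404 => 'not found',
        500 => throw new RuntimeException('Server error'),
    };
}

echo describe(200), PHP_EOL;

try {
    // The string '200' is not identical to the integer 200,
    // and there is no default arm to fall back on.
    describe('200');
} catch (UnhandledMatchError $e) {
    echo get_class($e), PHP_EOL;
}

try {
    describe(500);
} catch (RuntimeException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```

A switch statement with the same cases would have treated '200' and 200 as equal, which is the kind of surprise the strict comparison avoids.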

Static Return Type

The implementation of late static binding (commonly abbreviated as LSB) in PHP 5.3 came at a time when PHP lacked support for return type declarations, which came later in PHP 7.0. As a result, the static keyword was usable in most reasonable contexts in the original LSB implementation except in return types. A proposal accepted for PHP 8 fills this gap in the implementation.

See Listing 12 for an example of how this works. A superclass Foo declares a method get() with a static return type. A Bar subclass of Foo inherits and does not override this method. Later code calls that method on an instance of Bar, and the method’s return value type is an instance of Bar rather than Foo, which return type checking confirms.

Listing 12.

<?php

class Foo
{
    public function get(): static
    {
        return new static;
    }
}

class Bar extends Foo
{
}

$bar = new Bar;
$result = $bar->get(); /* $result is a type-checked instance of Bar */
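One place where a static return type tends to help is method chaining across an inheritance hierarchy. The sketch below assumes hypothetical Model and Post classes; withName() is declared once on the parent, yet the chain can continue with a method that only the subclass defines.

```php
<?php

// Hypothetical classes; withName() returns a modified clone of the called object.
class Model
{
    protected string $name = '';

    public function withName(string $name): static
    {
        $copy = clone $this;
        $copy->name = $name;

        return $copy;
    }
}

class Post extends Model
{
    public function publish(): string
    {
        return "Published {$this->name}";
    }
}

// withName() is inherited, but its result is still typed as Post.
echo (new Post())->withName('PHP 8 Distilled')->publish(), PHP_EOL;
```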

JIT

The JIT, or Just-In-Time Compiler, was initially proposed as an experimental addition to PHP 7.4 but was instead delayed until PHP 8. There’s quite a bit to say about this feature, but here’s what you need to know.

Several sources have conducted benchmarks and confirmed that many web applications should see minor, if any, performance improvement by enabling this feature. It is significantly more useful to CPU-bound applications: complex mathematical operations such as calculating Mandelbrot fractals, long-running processes such as applications running on ReactPHP and other similar asynchronous frameworks, etc.

The RFC details related configuration settings for the JIT; see this deep dive for explanations of commonly used configurations.
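Before benchmarking, it can help to confirm what the running PHP process is actually configured to do. The sketch below reads a handful of OPcache directives and then asks OPcache for its status. It assumes that opcache_get_status() reports JIT state under a 'jit' key on PHP 8 builds, and it falls back to "no" when OPcache is unavailable.

```php
<?php

// Directive names from the OPcache configuration; the values depend on your php.ini.
$directives = [
    'opcache.enable',
    'opcache.enable_cli',
    'opcache.jit',
    'opcache.jit_buffer_size',
];

foreach ($directives as $directive) {
    // ini_get() returns false when the directive is unknown, e.g. OPcache is not loaded.
    $value = ini_get($directive);
    echo $directive, ' = ', $value === false ? '(unavailable)' : var_export($value, true), PHP_EOL;
}

// Assumption: the status array exposes JIT details under the 'jit' key.
$status = function_exists('opcache_get_status') ? @opcache_get_status(false) : false;
$jitEnabled = is_array($status) && ($status['jit']['enabled'] ?? false);

echo 'JIT enabled: ', $jitEnabled ? 'yes' : 'no', PHP_EOL;
```

Run it through the same SAPI you intend to measure, since the CLI and web server processes can load different configuration.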

Migrating

Now that you’ve seen some of the shiny new features in PHP 8, let’s talk about how you can ensure a successful migration to it.

Tests

You’ve got automated tests, right?

If you do, try running them against PHP 8. Assuming they have decent code coverage, they should be your first line of defense in uncovering the specific changes in PHP 8 that will impact your codebase.

If you don’t, well, this may be a good time to start writing some.

Static Analysis

The PHPCompatibility project leverages the PHP_CodeSniffer static analyzer to detect compatibility issues between PHP versions. They are working on adding support for PHP 8. I’m sure they would love some help!

You can also lean on other static analyzers like Psalm and PHPStan. Bear in mind that these focus more on general code quality than on PHP 8 compatibility specifically, so they may not help as much in the latter regard.

Deprecations and Removals

Features deprecated in the 7.2, 7.3, and 7.4 branches of PHP may not exist in PHP 8 or may face removal in a future major version. These may clue you in to specific areas of your codebase to inspect. Here are some specific deprecated features that have been removed in PHP 8.

  • Curly braces for offset access have been removed.
  • image2wbmp() has been removed.
  • png2wbmp() and jpeg2wbmp() have been removed.
  • INTL_IDNA_VARIANT_2003 has been removed.
  • Normalizer::NONE has been removed.
  • ldap_sort(), ldap_control_paged_result(), and ldap_control_paged_result_response() have been removed.
  • Several aliases for mbstring extension functions related to regular expressions have been removed.
  • pg_connect() syntax using multiple parameters instead of a connection string is no longer supported.
  • pg_lo_import() and pg_lo_export() signatures that take the connection as the last argument are no longer supported.
  • AI_IDN_ALLOW_UNASSIGNED and AI_IDN_USE_STD3_ASCII_RULES flags for socket_addrinfo_lookup() have been removed.
  • DES fallback for crypt() has been removed; unknown salt formats will now cause crypt() to fail.
  • The $version parameter of curl_version() has been removed.

One specific change to be aware of is that the XML-RPC extension now lives in PECL rather than core; see the related RFC for details. If you use this extension, you’ll need to install it on your server using PECL as part of your upgrade process.

RFCs

The RFC process drives much of the feature development in PHP these days. As such, RFCs are a great source of information about new features and changes to the language.

Below is a list of some specific RFCs you may want to review involving backward-incompatible language changes. Tests and PHPCompatibility will probably automate a lot of the process of finding instances where these changes will affect your codebase. That said, it still helps to be aware of specific language changes yourself to recognize potential culprits when you encounter related issues.

Upgrade Notes

RFCs don’t cover all major changes to PHP. For a source that does, look no further than the migration guide. It is as comprehensive and detailed a reference as you’ll find, so combing through it can be a bit tedious. As such, you’ll generally want to rely on the methods mentioned in previous sections of this article before consulting this reference.

One example of a change that didn’t involve an RFC is the deprecation of the Zip extension procedural API. Another is that many core extensions now return objects where before they returned resources. This change should be transparent in most circumstances except those involving is_resource() checks on the affected functions’ return values. Below is a list of extensions affected by this change; the upgrade notes cover the affected methods’ specifics.
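To see why is_resource() checks are the sticking point, consider the sketch below. FakeHandle and openHandle() are hypothetical stand-ins for an extension function that used to return a resource and now returns an object on success, with false still signalling failure.

```php
<?php

// FakeHandle and openHandle() are hypothetical stand-ins for an extension
// function that returned a resource before PHP 8 and returns an object now.
final class FakeHandle {}

function openHandle(bool $succeed): FakeHandle|false
{
    return $succeed ? new FakeHandle() : false;
}

$handle = openHandle(true);

// Old-style check: an object is never a resource, so success looks like failure.
var_dump(is_resource($handle));

// Portable check: compare against the failure value instead.
var_dump($handle !== false);
```

Comparing against false behaves the same whether the function returns a resource or an object, which makes it a reasonable pattern for code that must run on both PHP 7 and PHP 8.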

Fin

This article informs you of the multitude of reasons to upgrade and available tools to use as you start on your migration journey, and it calls out potential pitfalls to watch out for. Go forth, happy upgrading, and enjoy PHP 8!

Matthew Turland has been working with PHP since 2002. He has been both an author and technical editor for php[architect] Magazine, spoken at multiple conferences, and contributed to numerous PHP projects. He is the author of php[architect]’s “Web Scraping with PHP, 2nd Edition” and co-author of SitePoint’s “PHP Master: Write Cutting-Edge Code.” In his spare time, he likes to bend PHP to his will to scrape web pages and run bots. @elazar

The post PHP 8 Distilled appeared first on php[architect].