A15 Bionic GPU: Understanding its Metal Enhancements


Apple A15 Bionic is a 6-core chipset that was announced on September 14, 2021, and is manufactured on a 5-nanometer process. It has 2 Avalanche cores at 3223 MHz and 4 Blizzard cores at 1820 MHz.

With the advent of iPhone 13 comes a new generation of Apple-designed system-on-a-chip: the A15 Bionic.
As has become expected, Apple has published a developer-oriented tech talk to explain the new and enhanced GPU features included in this latest cutting-edge offering.

Figure: An example of the difference in visual fidelity between lossless and lossy render targets, from the “Discover advances in Metal for A15 Bionic” tech talk (2021).

The tech talk is the best official source of nitty-gritty details on the new chip, but I wanted to take the opportunity to add some commentary and context.


The enhancements we will discuss fall into three categories:

Lossy texture compression
Sparse depth and stencil textures
SIMD shuffle-and-fill functions

Lossy Compression

A15 Bionic is not the first A-series chip to include on-the-fly texture compression. A12 Bionic introduced lossless compression in 2018’s flagship devices (e.g. iPhone XS), and A14 Bionic improved frame buffer compression by an additional 15% last year.

The new lossy compression in A15 Bionic provides 50% memory savings with relatively little loss in visual fidelity.
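
To put the 50% figure in perspective, here is a back-of-the-envelope sketch. The render target size and the 4-bytes-per-pixel format are my own illustrative assumptions, not numbers from the tech talk.

```python
def texture_bytes(width, height, bytes_per_pixel):
    """Uncompressed footprint of a single-mip 2D texture, in bytes."""
    return width * height * bytes_per_pixel

# Assumed for illustration: a 2532 x 1170 render target at 4 bytes per pixel.
WIDTH, HEIGHT, BYTES_PER_PIXEL = 2532, 1170, 4
LOSSY_SAVINGS = 0.50  # the 50% memory savings quoted above

full = texture_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL)
lossy = full * (1 - LOSSY_SAVINGS)

print(f"uncompressed: {full / 2**20:.2f} MiB")
print(f"lossy:        {lossy / 2**20:.2f} MiB")
```

A renderer that keeps several such targets alive per frame would see the saving multiplied accordingly.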

The API surface for enabling lossy compression is minimal. The new compressionType property on MTLTextureDescriptor holds a member of the MTLTextureCompressionType enum, and specifying MTLTextureCompressionTypeLossy (.lossy in Swift) enables lossy compression. For many use cases, this will be the only required change to take advantage of lossy compression.

Lossy compression is applicable to most texture types, but it is perhaps most useful for reducing the size of intermediate and final render targets, where the accumulation of compression artifacts is minimal. This will be increasingly important as mobile display resolutions and pixel densities continue to rise over the coming years.

Most pixel formats support lossy compression, including 10-bit extended range formats. However, packed formats are not supported.

In terms of operations, lossy textures can be render targets, can be the source or destination of blit operations, and can be sampled and read. However, they cannot be used with shader write operations, which precludes some compute use cases.

Finally, lossy textures must use the private storage mode; they cannot be in shared or managed storage. This implies that reading back texture data on the CPU will entail an additional blit operation (along with the usual latency-stall tradeoff).
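
The restrictions above can be summarised as a small checklist. The sketch below is not Metal API: the function name and the string values are hypothetical stand-ins I chose for the storage modes and usage flags, and it only encodes the three rules stated in this section.

```python
def lossy_compression_problems(storage_mode, usages, is_packed_format=False):
    """Return the reasons a texture could not use lossy compression.

    storage_mode: "private", "shared" or "managed" (hypothetical names)
    usages: set of strings such as "render_target", "shader_read",
            "shader_write" (hypothetical names)
    """
    problems = []
    if storage_mode != "private":
        problems.append("storage mode must be private")
    if "shader_write" in usages:
        problems.append("shader writes are not supported")
    if is_packed_format:
        problems.append("packed pixel formats are not supported")
    return problems

# A typical intermediate render target passes all three checks.
print(lossy_compression_problems("private", {"render_target", "shader_read"}))

# A shared texture written from a compute shader fails two of them.
print(lossy_compression_problems("shared", {"shader_read", "shader_write"}))
```

An empty list means none of the restrictions described here apply; it says nothing about other requirements the real API may impose.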

New Sparse Texture Support

Now we turn our attention to another enhanced feature that can also enable significant memory savings: sparse depth and stencil textures.

Introduced in A13 Bionic, sparse textures allow you to control which regions (tiles) of large textures you want to keep in memory. Tiles can be dynamically mapped and unmapped to respond to the needs of the application. For example, the base mip levels of high-resolution texture maps could be unmapped when the mesh to which they apply is distant from the camera, to free up texture memory for objects closer to the camera.

A15 Bionic introduces support for sparse depth and stencil textures, expanding the set of supported sparse texture pixel formats. In the A15 Bionic tech talk, this feature is demonstrated through an explanation of Sparse Tiled Shadow Mapping (STSM). This technique closely follows the outline of Cem Cebenoyan’s GDC 2014 talk on sparse shadow maps.
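
To get a feel for why sparse shadow maps save memory, here is a rough sketch of per-frame tile residency bookkeeping. The shadow map size, the 64 x 64 texel tile, the 4-byte depth texel, and the two "needed" regions are all assumptions for illustration; none of these names correspond to Metal API.

```python
def tiles_for_region(x0, y0, x1, y1, tile_w, tile_h):
    """Tile coordinates covering the texel rectangle [x0, x1) x [y0, y1)."""
    return {(tx, ty)
            for ty in range(y0 // tile_h, (y1 - 1) // tile_h + 1)
            for tx in range(x0 // tile_w, (x1 - 1) // tile_w + 1)}

def residency_update(resident, needed):
    """Tiles to map and tiles to unmap when moving from one set to another."""
    return needed - resident, resident - needed

# Assumed: 8192 x 8192 shadow map, 4-byte depth texels, 64 x 64 texel tiles.
MAP_SIZE, BYTES_PER_TEXEL, TILE = 8192, 4, 64
tile_bytes = TILE * TILE * BYTES_PER_TEXEL

# Hypothetical regions of the shadow map that visible receivers project into.
needed = (tiles_for_region(0, 0, 1024, 1024, TILE, TILE)
          | tiles_for_region(4000, 4000, 4500, 4200, TILE, TILE))

to_map, to_unmap = residency_update(set(), needed)
full_bytes = MAP_SIZE * MAP_SIZE * BYTES_PER_TEXEL
sparse_bytes = len(needed) * tile_bytes

print(f"tiles to map: {len(to_map)}, tiles to unmap: {len(to_unmap)}")
print(f"fully resident: {full_bytes / 2**20:.0f} MiB")
print(f"sparse:         {sparse_bytes / 2**20:.2f} MiB")
```

In a real renderer the needed set would change from frame to frame, and only the difference would be mapped or unmapped.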

Although not demonstrated in this year’s Apple video, certain uses of sparse textures can make use of texture access counters to determine which regions should be made resident from frame to frame.

SIMD Improvements

Metal’s SIMD-group instructions continue to get more powerful from year to year, and this release is no exception.

To complement the existing SIMD-group directional shuffle functions (simd_shuffle_down, simd_shuffle_rotate_down, simd_shuffle_rotate_up, and simd_shuffle_up), Metal on A15 Bionic includes the new simd_shuffle_and_fill_up and simd_shuffle_and_fill_down functions, which fill the lanes vacated by the shift with values from a second, ancillary data argument, rather than leaving those lanes holding the values from the original vector.
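
The difference is easiest to see with concrete values. The following is a CPU-side emulation in Python, not shader code: each list models one value per lane, the 8-lane width is chosen for readability, and the lane semantics reflect my reading of the Metal Shading Language specification rather than anything verified on hardware.

```python
def simd_shuffle_down(data, delta):
    """Each lane reads from lane + delta; the upper lanes keep their own value."""
    n = len(data)
    return [data[i + delta] if i + delta < n else data[i] for i in range(n)]

def simd_shuffle_and_fill_down(data, filling, delta):
    """As above, but the upper lanes take the lower lanes of `filling`."""
    n = len(data)
    return [data[i + delta] if i + delta < n else filling[i + delta - n]
            for i in range(n)]

def simd_shuffle_and_fill_up(data, filling, delta):
    """Each lane reads from lane - delta; the lower lanes take the upper lanes of `filling`."""
    n = len(data)
    return [data[i - delta] if i - delta >= 0 else filling[n + i - delta]
            for i in range(n)]

data = list(range(8))             # lane i holds the value i
filling = list(range(100, 108))   # lane i holds the value 100 + i

print(simd_shuffle_down(data, 2))
print(simd_shuffle_and_fill_down(data, filling, 2))
print(simd_shuffle_and_fill_up(data, filling, 2))
```

In effect, the fill variants treat the two arguments as one window of twice the SIMD width and slide across it.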

This small but significant addition allows SIMD-groups to further exploit shared data without resorting to threadgroup memory. The example given in the video is a convolution kernel that is able to drastically reduce the number of required texture samples by shuffling sampled texel values from adjacent lanes as the convolution window slides over the image region being convolved by the threadgroup.
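
A one-dimensional analogue shows where the savings come from. This is a Python emulation under my own assumptions (an 8-lane group, a 5-tap unweighted filter, and list reads standing in for texture samples); it is not the kernel from the video.

```python
def simd_shuffle_and_fill_down(data, filling, delta):
    """Emulated shuffle: lane i reads lane i + delta, spilling into `filling`."""
    n = len(data)
    return [data[i + delta] if i + delta < n else filling[i + delta - n]
            for i in range(n)]

def filter_direct(row, start, lanes, taps):
    """Every lane samples all of its taps itself: lanes * taps reads."""
    return [sum(row[start + i + t] for t in range(taps)) for i in range(lanes)]

def filter_shuffled(row, start, lanes, taps):
    """Every lane samples twice, then the taps arrive via shuffles."""
    data = [row[start + i] for i in range(lanes)]             # first read
    filling = [row[start + lanes + i] for i in range(lanes)]  # second read
    acc = [0] * lanes
    for t in range(taps):
        shifted = simd_shuffle_and_fill_down(data, filling, t)
        acc = [a + s for a, s in zip(acc, shifted)]
    return acc

LANES, TAPS = 8, 5
row = [(x * x) % 17 for x in range(32)]  # arbitrary test data

direct = filter_direct(row, 4, LANES, TAPS)
shuffled = filter_shuffled(row, 4, LANES, TAPS)

print(direct == shuffled)
print(f"reads: direct={LANES * TAPS}, shuffled={2 * LANES}")
```

The two versions agree, but the shuffled one performs a fixed two reads per lane regardless of the number of taps, as long as the window fits within the two loaded vectors.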

Similar functions have also been introduced for quadgroups: in addition to the quad_shuffle_up/quad_shuffle_down functions introduced alongside A13 Bionic, the new quad_shuffle_and_fill_up and quad_shuffle_and_fill_down serve the same purpose as the SIMD-group functions described above.

A15 Bionic GPU Faces Performance Throttling in New Benchmark Leak, but Easily Outpaces Exynos 2200 & A14 Bionic


The A14 Bionic GPU already sets an impressive performance bar, but Apple could be looking to clear it comfortably with the A15 Bionic, as the chipset is expected to debut alongside the iPhone 13. According to a new benchmark leak, the upcoming SoC beats all competitors but runs into a little performance throttling along the way.

A15 Bionic GPU Also Beats Exynos 2200’s AMD mRDNA GPU, Which Achieves the Same Graphics Score as the A14 Bionic

The A15 Bionic GPU may be a 6-core configuration like the A14 Bionic, but a Manhattan 3.1 benchmark run using GFXBench reveals that it overwhelms the competition by achieving an average framerate of 198FPS for the first round. Unfortunately, the GPU runs into some throttling: performance takes a nosedive during the second benchmark run, where the average framerate drops to 140-150FPS.

These results are to be expected, as phone manufacturers often limit silicon parts from reaching their full potential to keep temperatures under control. From the results, it looks like Apple is following the same approach. While it is disappointing to see the A15 Bionic GPU take a massive performance hit, it should be noted that those results are far ahead of what the A14 Bionic GPU can achieve.

Apple A15 GPU peak benchmark test
Manhattan 3.1: 198 FPS (July unit sample)
However, after second round of test, throttling kicks in and drops to 140~150FPS.

Running the same Manhattan 3.1 benchmark, the peak performance of the A14 Bionic and Exynos 2200 graphics processors allows both chipsets to achieve an average framerate of 170.7FPS. Compared to the A14 Bionic GPU, Apple’s upcoming chipset is about 16 percent faster at its peak in this particular test (198 versus 170.7FPS; put the other way, the A14 Bionic is about 13.8 percent slower). Remember that results will vary across different applications, but even with the GPU core-count the same as the A14 Bionic, the A15 Bionic secures higher performance.
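
The relative figures depend on which chip is taken as the baseline, so it is worth recomputing them from the quoted framerates. The sketch below uses only the numbers reported in the leak; the helper name is my own.

```python
def percent_change(new, old):
    """Change from `old` to `new`, as a percentage of `old`."""
    return (new - old) / old * 100

A15_PEAK = 198.0            # first Manhattan 3.1 run
A14_PEAK = 170.7            # also the figure quoted for the Exynos 2200
A15_THROTTLED = (140.0, 150.0)

faster = percent_change(A15_PEAK, A14_PEAK)    # A15 relative to A14
slower = -percent_change(A14_PEAK, A15_PEAK)   # A14 relative to A15
drops = [-percent_change(fps, A15_PEAK) for fps in A15_THROTTLED]

print(f"A15 peak is {faster:.1f}% faster than the A14")
print(f"A14 is {slower:.1f}% slower than the A15 peak")
print(f"throttling costs between {min(drops):.1f}% and {max(drops):.1f}%")
```

Note that the throttled range sits below the 170.7FPS peak figure, so the comparison only favours the A15 Bionic at its peak.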

We still have to take into account what this means for the power efficiency of the A15 Bionic, but considering that Apple is expected to use TSMC’s advanced N5P process, it should consume less power than a chip built on the standard 5nm node. Before the official iPhone 13 unveiling, we may come across more benchmark leaks like this, so as always, stay tuned.

Closing Thoughts

While the updates in A15 Bionic seem incremental, they have the potential to unlock richer content and higher resolutions on the newly-launched generation of Apple devices. With the memory savings and bandwidth reduction afforded by lossily compressed textures and sparse textures, Metal on iOS continues to bring techniques previously affordable only on the consoles and PCs of several years ago into the mobile space.

Apple Announces iPhone 13 Series: A15, New Cameras, New Screens

Today Apple held its fall 2021 iPhone launch event, and we’ve gotten 4 new iPhones from the new iPhone 13 series: the iPhone 13 mini, the iPhone 13, iPhone 13 Pro and iPhone 13 Pro Max. This year’s phones follow last year’s rather large generational upgrades – although this year Apple also has quite a few big features on the menu such as better cameras and new much improved 120Hz displays on the Pro models. Battery life also has seen a larger emphasis, with Apple claiming the new iPhones last longer than their predecessors, achieved through both component efficiency improvements as well as new larger batteries.

It’s also where we see Apple’s newest A15 chip. Apple’s SoCs have been in the focus of the industry for years, and the new one promises iterative improvements, with some of Apple’s claims being a little eyebrow-raising; more on this in a bit.

Starting off with the new internals, the new iPhone 13 series are powered by Apple’s newest A15 Bionic SoC. Like last year, the A15 is manufactured on a 5nm process node. Apple naturally doesn’t specify exactly which variant of node it is, but given the timing and the evolution of TSMC’s offering, we suspect it will be on the new N5P node, which is an iteration on last year’s N5 node.



The new design follows Apple’s 2+4 CPU configuration that we’ve seen being used for the last couple of generations. We’re looking at two new performance cores and four new efficiency cores. Apple this year didn’t disclose much information about the new CPU. Most notably, the company refrained from making generational comparisons to the predecessor A14 chip, instead opting to compare itself to the competition, which again is something that we don’t see Apple do much in their silicon talks.

Here, they’re claiming that the new A15 will be +50% better than the next-best competitor. The next-best competitor is Qualcomm’s Snapdragon 888 – if we look up our benchmark result set, we can see that the A14 is +41% more performant than the Snapdragon 888 in SPECint2017 – for the A15 to grow that gap to 50% it really would only need to be roughly 6% faster than the A14, which is indeed not a very large upgrade. Apple also didn’t comment on any new ISA features such as Armv9/SVE2, so it seems that the CPU doesn’t feature it?

Back in early 2019, Apple had lost their lead architect (Gerard Williams III) and a portion of their CPU design team when several of the team went on to found and work at Nuvia, which was acquired earlier this year by Qualcomm. While I’m not certain, the time gap here could certainly match the new CPU’s time to market, and this could be the first sign of that talent loss and team reshuffle. As a note, Apple went on to hire Arm’s lead architect Mike Filippo, likely working on a new CPU family.

Another theory is that Apple decided to focus more on reducing power and improving energy efficiency this generation, given their massive lead in CPU performance. This actually would be a much more welcome theory, but one that we won’t be able to confirm until we get our hands on devices.



What was even weirder was the fact that the A15 is the first SoC where Apple has decided to bin into two different performance models. The regular iPhone 13 mini and iPhone 13 are receiving an A15 with 4-core GPUs, while the Pro models are receiving a 5-core GPU configuration. In fact, if I’m not wrong, this would be the first time ever we see a functional-block-binned SoC in a mobile phone at all, as I don’t remember any company ever doing this before today (normally mobile SoCs are power binned).



For the lower performance 4-core GPU model, Apple again was weird with their performance predictions as they focused on the competition, and not the generational gains. The improvement here over the currently best performing competitor is said to be +30%. Taking GFXBench Aztec as a baseline, we see the A14 was around +18% faster than the Snapdragon 888. The slower A15 would need to be +10% faster than the A14 to get to that margin.

The faster 5-core A15 is advertised as being +50% faster than the competition. This would actually be a more sizeable +28% performance improvement over the A14 and would be more in line with Apple’s generational gains over the last few years.
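
The implied generational gains come from dividing the claimed lead over the competition by the A14’s measured lead over the same competitor. A small sketch, using only the rounded percentages quoted in this article (the function name is my own):

```python
def implied_gain(claimed_lead, baseline_lead):
    """Gain over the predecessor implied by two leads over a common competitor.

    Both arguments are fractions, e.g. 0.50 for "+50%".
    """
    return (1 + claimed_lead) / (1 + baseline_lead) - 1

cpu = implied_gain(0.50, 0.41)       # CPU: A15 claim vs. A14 SPECint2017 lead
gpu_4core = implied_gain(0.30, 0.18)  # 4-core GPU vs. A14 GFXBench Aztec lead
gpu_5core = implied_gain(0.50, 0.18)  # 5-core GPU vs. A14 GFXBench Aztec lead

for name, gain in [("CPU", cpu), ("4-core GPU", gpu_4core), ("5-core GPU", gpu_5core)]:
    print(f"{name}: +{gain * 100:.1f}% over the A14")
```

With the rounded +18% baseline, the 5-core result comes out slightly under the quoted +28%; a difference of this size is within what rounding of the input figures would explain.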

Of course, all these figures are just speculation for now, as we don’t know exactly what workloads Apple is referring to, and quite large variations can be measured. We’ll have to verify things on actual devices.

Apple noted a lot of other SoC-side improvements, such as making mention that they’ve doubled the system level cache (SLC) to what presumably would be 32MB. There’s also a new display engine, likely to deal with 120Hz, and new video decoders and encoders – I wonder what kind of format they support now; Apple did mention hardware ProRes support.

Finally, the Neural Engine boosts its performance to 15.8 TOPS, up from 11 TOPS, even though it still features the same 16-core count.

One thing Apple didn’t mention is whether the new SoC uses new memory or not. We’ve seen adoption of LPDDR5 in the market, and the A14 notably lacked this. If the A15 also didn’t have it, that would be quite weird.

Similar Design, But Refined

The new iPhones follow up on the industrial design introduced with the iPhone 12 series last year. This means similar flat edges and a metallic finish.



The only two distinguishing visual changes of the iPhone 13 mini and iPhone 13 are that the dual cameras are now oriented diagonally, rather than vertically, and that on the front side of the phone the new notch is 20% smaller.

The fact that Apple still continued with a notch rather than a more modern design is likely to be controversial and some will make fun of Apple, given the competition’s rapid design iterations and experience with hole-punch or under-display cameras, but Apple has never been very adventurous in the design department.

The normal and mini models have had their display characteristics boosted to match those of the iPhone 12 Pro models last year, and what this likely means is that those models have moved to a higher tier panel with the newer OLED emitters.



The iPhone 13 Pro and Pro Max are in turn moving to display panels a tier higher than last year’s Pro models. The new features here are focused around the new variable refresh rate 120Hz panel, which is able to dynamically switch between 10 and 120Hz depending on content. It’s likely the same type of LTPO technology that we’ve seen earlier this year with the Galaxy S21 Ultra, and should bring significant battery life increases and smoothness to the Pro models. Brightness also has gone up to 1000nits.



Apple stated that all new models feature a new internal component arrangement, allowing for better footprint usage within the phone, and allowing for more space for the batteries as well as the camera system.

Apple quoted figures such as +1.5h for the iPhone 13 mini, +2.5h for the iPhone 13, +1.5h for the iPhone 13 Pro, and +2.5h for the iPhone 13 Pro Max, all relative to their predecessors.

While Apple typically doesn’t state absolute battery capacities, we’re seeing the weight of the phones go up: 135g -> 141g, 164g -> 174g, 189g -> 204g, and 228g -> 240g. Particularly the Pro models are getting extremely heavy, so I do question Apple’s insistence on steel frames here.

New Cameras Everywhere

The camera setup on the new iPhone 13 line is all new, on all models. Although the types of cameras haven’t changed, still ultra-wide, wide and telephoto where applicable, we’re seeing new sensors and modules on all units.



The iPhone 13 and mini feature a dual-camera setup. The main module here has received a new sensor, still 12MP (valid for all modules on all models), which however increases in size and thus also in pixel pitch, from 1.4µm to 1.7µm. The optics are 26mm equivalent, with an f/1.6 aperture. In terms of stabilisation, Apple has moved from OIS to an IBIS system, or what they call sensor-shift image stabilisation, where instead of the optics, the sensor itself is stabilised.

The ultra-wide angle is similar, with a large 120° field of view at 13mm focal length equivalent and f/2.4 optics, but Apple does say it’s a new faster sensor, even though it’s not larger.



The Pro models have seen larger camera updates, in the literal sense. This year’s Pro models feature a much larger camera island, visibly larger camera lenses, and thicker camera bumps.



The main sensor is again upgraded to a larger size, this time with 1.9µm pitch pixels, which at 12MP translates to a 1/1.67” sensor – still not quite as large as what we see from some Android vendors. The aperture is a large f/1.5, and is also stabilised via IBIS.
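
The step from pixel pitch to sensor format can be reproduced with a little arithmetic. The sketch below assumes a 4032 x 3024 pixel layout for the 12MP sensor and the common rule of thumb that one "inch" of optical format corresponds to roughly 16mm of sensor diagonal; both are my assumptions rather than figures Apple has stated.

```python
import math

def sensor_dimensions_mm(h_pixels, v_pixels, pitch_um):
    """Active-area width and height in millimetres for a given pixel pitch."""
    return h_pixels * pitch_um / 1000, v_pixels * pitch_um / 1000

def optical_format_denominator(width_mm, height_mm, mm_per_inch=16.0):
    """The x in a 1/x-inch optical format, using the ~16mm-per-inch convention."""
    return mm_per_inch / math.hypot(width_mm, height_mm)

H_PIXELS, V_PIXELS = 4032, 3024   # assumed 12MP layout (4:3)

for label, pitch in [("iPhone 13 main, 1.7µm", 1.7), ("iPhone 13 Pro main, 1.9µm", 1.9)]:
    w, h = sensor_dimensions_mm(H_PIXELS, V_PIXELS, pitch)
    denom = optical_format_denominator(w, h)
    print(f"{label}: {w:.2f} x {h:.2f} mm, roughly 1/{denom:.2f} inch")
```

Under these assumptions the 1.9µm sensor lands on the 1/1.67” figure given above.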

The ultra-wide does not appear to have a different sensor on the Pro models, however Apple takes advantage of the thicker module size to enable a much wider f/1.8 aperture.

Finally, the telephoto module comes in at a 77mm equivalent focal length, or 2.96x magnification over the wide module. Apple didn’t comment on the pixel size so we can’t infer sensor size, but aperture is a bit smaller at f/2.8.

I just want to note that I’m happy Apple has retained the same camera setup on the regular Pro and the Pro Max models. This is something they didn’t do on last year’s models, and something many competitors also fail to do between their differently sized flagships. It means you’re not missing out on features just because you chose the smaller variant.



In terms of software photography prowess, the new phones feature Smart HDR 4, promising further improvements to tone-mapping and retention of shadow and highlight details, as well as now finally improving night mode to work on all modules (the telephoto previously didn’t have the feature).

There are also new features such as stylisation of pictures, a filter effect that works within the imaging pipeline rather than a post-processing effect. Video recording gets a new Cinematic Mode feature for cinematic looking focus pulls and shifting, enabled automatically in the app via ML and subject tracking, as well as ProRes video recording.

Static Pricing, Better Phones

On the positive side, all new models now start at the same pricing as their predecessors, and the regular iPhone 13 and mini now start at a 128GB baseline storage model, doubling up the options stack. The Pro models have the same pricing this year, but add a super-premium 1TB storage option at a higher price category.
