Intel Unveils Lunar Lake Architecture: New P and E cores, Xe2-LPG Graphics, New NPU 4 Brings More AI Performance
by Gavin Bonshor on June 3, 2024 11:00 PM EST
New Graphics: Intel Xe2, 2nd Gen Arc Xe Core For Mobile
Along with Lunar Lake, Intel has just unveiled its Xe2 graphics architecture for mobile, built around the 2nd Generation Arc Xe Core. On paper, it offers an extraordinary bump in performance and efficiency. We don't think the 4P+4E part would cut it for gaming, so we've opted to focus on the critical takeaways of Intel's graphics presentation, including the media engine within.
Intel's Xe2 architecture significantly improves computational capabilities, providing up to 67 TOPS and more ray tracing units than Xe-LPG on Meteor Lake. According to Intel, the 2nd gen Xe-cores offer 1.5x faster graphics performance than Meteor Lake, an uplift aided by the new XMX engines. Enhanced XeSS kernels deliver improved graphics and compute performance.
One thing Intel looks to have changed from Meteor Lake is display output, which is now more flexible and higher quality. Within the display engine, the streams in the dual-pixel pipeline can be combined for multi-stream transport. With this architecture, ports will be available in four locations, which gives flexibility in connectivity. Intel's configuration also provides an eDP port, which allows high resolutions and refresh rates on high-end, premium displays.
Intel's eDisplayPort 1.5 implementation includes the panel replay feature, integrated with adaptive sync and selective update mechanisms. This helps decrease power consumption by refreshing only the parts of the screen that change instead of the entire display. These innovations not only save energy but also improve the visual experience by reducing display lag and increasing sync precision.
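To make the selective update idea concrete, here is a minimal sketch in Python. It is an illustration only, not Intel's or VESA's actual mechanism: it assumes a simplified model in which a frame is a list of pixel rows and the update region is the single band of rows between the first and last changed row. The function names are hypothetical.

```python
def changed_row_span(prev_frame, next_frame):
    """Return (first, last) indices of the rows that differ, or None if nothing changed.

    Assumption: a frame is a list of rows, each row a list of pixel values.
    """
    if len(prev_frame) != len(next_frame):
        raise ValueError("frames must have the same height")
    changed = [
        y for y, (old_row, new_row) in enumerate(zip(prev_frame, next_frame))
        if old_row != new_row
    ]
    if not changed:
        return None
    return changed[0], changed[-1]


def refresh_fraction(prev_frame, next_frame):
    """Fraction of the frame's rows that would be resent under this simplified model."""
    span = changed_row_span(prev_frame, next_frame)
    if span is None:
        return 0.0  # static image: nothing needs to be resent
    first, last = span
    return (last - first + 1) / len(prev_frame)


# Example: a 10-row frame where only rows 2 and 3 change (say, a blinking cursor).
before = [[0] * 4 for _ in range(10)]
after = [row[:] for row in before]
after[2][1] = 255
after[3][1] = 255
print(changed_row_span(before, after))
print(refresh_fraction(before, after))
```

In this toy case only a fifth of the rows would need to be sent, which is the kind of saving a mostly static desktop could see.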
The pixel processing pipeline is the foundation of Intel's display engine, with six planes per pipeline for advanced color conversion and composition. In addition, it integrates hardware support for color enhancement, display scaling, pixel tuning, and HDR perceptual quantization, so that the graphics on screen are vibrant and accurate. The design is quite flexible, highly power-efficient, and engineered to support various input and output formats, at least on paper. Intel hasn't provided any quantifiable power metrics, TDPs, or other power figures so far.
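For readers unfamiliar with HDR perceptual quantization, the sketch below implements the standard PQ transfer function from SMPTE ST 2084 in Python. Intel has not described how its hardware implements this step, so treat this as a software reference for the curve itself; the function names are our own.

```python
# PQ (SMPTE ST 2084) constants, as defined in the standard.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0  # PQ is defined over 0 to 10,000 cd/m^2


def pq_encode(nits):
    """Map absolute luminance in cd/m^2 to a PQ signal value in [0, 1]."""
    y = max(nits, 0.0) / PEAK_NITS
    yp = y ** M1
    return ((C1 + C2 * yp) / (1 + C3 * yp)) ** M2


def pq_decode(signal):
    """Map a PQ signal value in [0, 1] back to luminance in cd/m^2."""
    sp = signal ** (1 / M2)
    return PEAK_NITS * (max(sp - C1, 0.0) / (C2 - C3 * sp)) ** (1 / M1)


# 100 cd/m^2 (typical SDR reference white) lands near the middle of the signal range.
print(round(pq_encode(100.0), 4))
```

The curve spends roughly half of its code values on luminance below 100 cd/m², which is why PQ can cover a very wide brightness range without visible banding in dark scenes.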
On compression and encoding, the Xe2 architecture supports visually lossless display stream compression at up to 3:1, along with transport encoding for the HDMI and DisplayPort protocols. These features reduce the data load on the link while maintaining high output resolution without visible loss of quality.
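As a rough illustration of what 3:1 compression means for link bandwidth, here is some back-of-the-envelope arithmetic in Python. The display mode (3840x2160 at 120 Hz with 30 bits per pixel) is our own example, not an Intel figure, and the calculation covers active pixel data only, ignoring blanking intervals and link encoding overhead.

```python
def pixel_data_rate_gbps(width, height, refresh_hz, bits_per_pixel, compression_ratio=1.0):
    """Active pixel data rate in Gbit/s, ignoring blanking and link encoding overhead."""
    bits_per_second = width * height * refresh_hz * bits_per_pixel
    return bits_per_second / compression_ratio / 1e9


# Hypothetical example mode: 4K at 120 Hz, 10 bits per color channel (30 bpp).
raw = pixel_data_rate_gbps(3840, 2160, 120, 30)
compressed = pixel_data_rate_gbps(3840, 2160, 120, 30, compression_ratio=3.0)
print(f"uncompressed: {raw:.2f} Gbit/s")       # about 29.86 Gbit/s
print(f"3:1 compressed: {compressed:.2f} Gbit/s")  # about 9.95 Gbit/s
```

Under these assumptions, a mode that needs close to 30 Gbit/s of raw pixel data fits in about 10 Gbit/s once compressed.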
Intel's adoption of the VVC codec is a big deal for video compression. According to Intel, the codec offers up to a 10% reduction in file size compared to AV1, along with support for adaptive resolution streaming and advanced content coding for 360-degree and panoramic videos. This should allow lower bitrates for streaming without losing quality, which is essential for modern multimedia applications.
The Windows GPU software stack is robust from top to bottom, with support for the D3D, Vulkan, and Intel VPL APIs and frameworks. Together, these provide comprehensive support for the varied runtimes and drivers on the market, which should improve efficiency and compatibility across different software environments.
Intel's Xe2 and second-generation Arc Xe Core improve performance, efficiency, and flexibility significantly. These innovations strengthen Intel's position in the competitive landscape of solutions for mobile graphics, with reinforced capabilities across display, media, and compute operations.
91 Comments
thestryker - Monday, June 3, 2024
I'm curious what the overall E-core performance is going to look like since the cluster won't have L3 cache access. Chips and Cheese did some analysis of the LP E-cores on MTL and found this specifically to be a big negative. I'm guessing this design is going to be limited to just LNL and is predominantly for the power savings.
ET - Tuesday, June 4, 2024
Interestingly, Intel is comparing Skymont to Raptor Cove. I agree that we have to wonder how the L3 (or lack thereof) affects this, but from the Chips and Cheese figures alongside Intel's performance improvement figures, it looks like Skymont without L3 cache will be faster than Crestmont with L3 cache.
kwohlt - Tuesday, June 4, 2024
There's 8MB of "SOC cache", separate from both the P and E cores, that should in practice function as the E cores' L3
thestryker - Tuesday, June 4, 2024
That's my assumption as well as I think the GPU would be the other part predominantly using it and they shouldn't really both be hitting it at the same time.
sharath.naik - Monday, June 10, 2024
Side cache is not the same as L3, or I think they would have called it that. Shared L3 is where the memory sync can happen across cores. If not, it needs to go all the way back to RAM. So side caches really cannot be considered as L3; more like an expanded L2 for the E-cores and an expanded L3 for the P-cores, is my guess. Yes, it means things that run on both E-cores and P-cores at the same time will take a hit on performance. I think they were targeting the majority use case, where most won't need more than 4 threads or threads won't be working on the same data.
powerarmour - Thursday, June 6, 2024
I can see this being an embarrassing launch if it gets slapped around by Qualcomm's SDx Elite
mode_13h - Friday, June 7, 2024
Well, they're on a better node than Qualcomm, so there's that.
sharath.naik - Monday, June 10, 2024
It absolutely will, because this is going to be slower than Meteor Lake in CPU. Elite is supposed to be 30% faster. Intel should have released an 8 P-core version to compete in performance. But I think they wanted to reserve that to be produced on their own fabs.
lmcd - Monday, June 17, 2024
Snap Elite is supposed to be 30% faster at essentially-undisclosed power. Lunar Lake will ironically undercut the Snapdragon Elite on power and cost while delivering good performance.
Drumsticks - Tuesday, June 4, 2024
I hate to ask this, but was this article fully written by Gavin and proof'ed by another editor? Was there a deadline push to get it out as soon as Intel released the information on Lunar Lake? It just reads so, so disjointed. It feels like there are so many issues in this paragraph alone on the P-core overview; it feels jarring to read.
"This Lion Cove architecture **also aligns with performance increases**, boasting a predicted double-digit bump in IPC over the older Redwood Cove generation. This uplift is noticed, especially **in the betterment of its hyper-threading, whereby improved IPC** by 30%, dynamic power efficiency improved by 20%, **and previous technologies, in balancing**, without increasing the core area, **in a commitment of Intel to better performance**, within existing physical constraints."
I've seen so much better work from Gavin, and Anandtech in general, that I almost hope that this page was heavily written by software. I know it's a press release, and there's not a whole lot of information, but the level of first party detail here feels similar to the Architecture Day 2021 presentations Intel did on Alder Lake, which got fantastic coverage from Andrei and Dr. Cutress, and here it feels like we are getting a poorly worded restating of the slides with hardly any analysis or greater than surface level understanding.
I've been reading Anandtech since I was 15, and the level of detail in the Sandy Bridge era articles honestly had a huge influence on my choice to pursue a career in CPU Design. I've mountains of respect for what Anandtech has published in the past, but this article feels rushed.