The driver stage uses the same TO3P transistors as the output stage. Running a large current through the driver lets it instantly compensate for the hfe fluctuations caused by current changes in the output stage, which produces powerful driving force. This is the biggest difference from the original, and a breakthrough.

I’d like to explain this in more detail, since the question of how it differs from the original is sure to be asked.
A non-feedback power amplifier needs one thing above all to achieve powerful driving force, and I’ll explain it here. I’m not sure I can explain it well, but…
First, a quick review of transistor basics. A transistor is a device that outputs a current that is a multiple of its input current. That isn’t an exact description, but think of it that way. So by how much does it multiply the current? That factor is the current gain (hfe), and it means exactly what it sounds like.
You might think that the higher the hfe and the higher the maximum current the transistor can pass, the greater its power to drive the speaker, but it’s not that simple. In fact, hfe is tricky. It’s not often discussed, but hfe changes with conditions such as output current. Even a power transistor with a spec of around hfe 100 can drop to an hfe of around 10 as the output current approaches its limit.
In other words, for the final-stage power transistor that directly drives the speaker to function properly, the current fed into its input must always be sufficient. From my experience with the 3 Series, the driver stage (the transistor that drives the final-stage transistor) needs to be designed on the assumption that the final-stage hfe may be 10 or less. This is especially true with modern speakers, whose impedance drops dramatically in the low-frequency range.
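To put rough numbers on this, here is a minimal sketch of the arithmetic, using the textbook relation that the input (base) current is the output current divided by hfe. The 10 A peak output current and the function name are assumptions chosen for illustration, not figures from the A-2.

```python
def required_base_current(output_current_a: float, hfe: float) -> float:
    """Input current the driver must supply: I_b = I_c / hfe."""
    if hfe <= 0:
        raise ValueError("hfe must be positive")
    return output_current_a / hfe


# Illustrative numbers only (assumed, not measurements from the amplifier).
peak_output_a = 10.0  # assumed peak current into a low-impedance speaker

for hfe in (100.0, 10.0):  # spec value vs. the near-limit value mentioned above
    i_b = required_base_current(peak_output_a, hfe)
    print(f"hfe = {hfe:.0f}: driver must supply {i_b:.2f} A")
```

Under these assumptions the driver has to deliver ten times more current when hfe falls from 100 to 10 (1 A instead of 0.1 A), which is why a driver sized for the spec-sheet hfe can run out of drive at high output.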
Generally, even in high-output power amplifiers with multiple final-stage transistors, the driver stage usually uses a single transistor one size smaller. Why is that okay? You might worry that the input current to the final-stage transistors will dry up when a large current is output, but there’s a trick: these amplifiers usually use feedback. When the driver stage can no longer drive the final-stage transistors, the output current dries up and amplitude is lost. Feedback then kicks in and boosts the driver stage’s amplitude to make up for the lack of drive. Strictly speaking there is a delay, but it doesn’t show up in measurements, and it is part of the unique character of feedback amplifiers. It’s like a turbo engine in a car: there is a certain amount of turbo lag. Of course, whether you find this unnatural is up to you.
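The compensation described above can be sketched with a static toy model of a feedback loop. Everything here is an assumption for illustration: the open-loop gain of 1000, the feedback factor of 0.05, and the idea of modelling a starved final stage as a gain that falls from 1.0 to 0.5. The model has no time axis, so it shows the boost but not the delay.

```python
def feedback_amp(v_in: float, open_loop_gain: float, beta: float,
                 output_stage_gain: float) -> tuple[float, float]:
    """Static toy model of a feedback amplifier.

    Returns (v_out, v_drive), where v_drive is the signal presented to
    the output stage and v_out = output_stage_gain * v_drive.
    """
    forward = open_loop_gain * output_stage_gain
    v_out = v_in * forward / (1.0 + forward * beta)  # closed-loop output
    v_error = v_in - beta * v_out                    # what the loop still sees as missing
    v_drive = open_loop_gain * v_error               # drive applied to the output stage
    return v_out, v_drive


# Hypothetical values: healthy output stage vs. one starved of drive current.
normal_out, normal_drive = feedback_amp(1.0, 1000.0, 0.05, output_stage_gain=1.0)
starved_out, starved_drive = feedback_amp(1.0, 1000.0, 0.05, output_stage_gain=0.5)

print(f"output: {normal_out:.2f} -> {starved_out:.2f}")
print(f"drive:  {normal_drive:.2f} -> {starved_drive:.2f}")
```

In this sketch, halving the output-stage gain costs only about 2% of the output amplitude, because the loop responds by nearly doubling the drive signal. That is the "boost" in steady state; the lag the turbo analogy refers to would only appear in a model that includes time.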
Now, back to non-feedback amplifiers. A non-feedback power amplifier cannot rely on a feedback boost, so the key is an “ultra-powerful driver stage.” To achieve this, the driver stage is given the ability to drive the speaker directly. In other words, the output-stage power transistors are driven by a power transistor of the same type. What’s more, the driver stage is pre-loaded with 10 times the current of the final stage. This enables linear, instantaneous current supply with no turbo lag.
Incidentally, the original A-2 had four power transistors in parallel, while the driver stage used transistors one class smaller. Even so, the driver stage was running current right up to the ASO (area of safe operation) limit, and even now, eight years later, it’s still a fantastic amplifier. Compared with Ver.2, however, the difference is overwhelming. The A-2 Ver.2 breaks new ground by combining the open sense of space characteristic of a non-feedback amplifier with a grip that makes you feel as if a wall of air is flying at you.
↓ Although it looks like three large TO3P packages in parallel, it is actually a paired two-parallel output plus a driver.
