The Active Parameters
A 35B parameter model that activates only 3B per token isn't a compromise. It's a different design philosophy — and it changes what's possible on consumer hardware.
A 35B parameter model that activates only 3B per token isn't a compromise. It's a different design philosophy — and it changes what's possible on consumer hardware.