IBM Granite 4.0 Tiny Preview
A preliminary version of the smallest model in the upcoming Granite 4.0 family, released May 2025. It utilizes a novel hybrid Mamba-2/Transformer, fine-grained mixture of experts (MoE) architecture (7B total parameters, 1B active at inference). This preview version is partially trained (2.5T tokens) but demonstrates significant memory efficiency and performance potential, validated for at least 128K context length without positional encoding.
Context 128K