Boxplot of FDT (y-axis) when the lowest weights of each component [(2×MLP + 4×Attention) × 40 layers] of Llama2-13B are pruned individually (x-axis).
It can be observed that the 75% quantile contains the most variance.