packed_simd is deprecated, but we only need three SIMD intrinsics:
* _mm256_set1_epi8 / vpbroadcastb (splat)
* _mm256_cmpgt_epi8 / vpcmpgtb (comparison)
* _mm256_movemask_epi8 / vpmovmskb (compress to bitmask)

This also simplifies the code, since we now vectorise only the bare
minimum: we compute a bitmask with SIMD and switch to scalar code as
soon as possible.
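As a rough illustration of the pattern (not the code in this change),
the three intrinsics above can be combined as sketched below. The names
gt_mask_avx2, gt_mask_scalar, gt_mask and first_gt are hypothetical,
and the 32-byte chunk, the signed threshold and the scalar fallback are
assumptions made for the example.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{
    __m256i, _mm256_cmpgt_epi8, _mm256_loadu_si256, _mm256_movemask_epi8, _mm256_set1_epi8,
};

/// Hypothetical helper: bit i of the result is set when chunk[i],
/// interpreted as a signed byte, is greater than `threshold`.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn gt_mask_avx2(chunk: &[u8; 32], threshold: i8) -> u32 {
    unsafe {
        // Unaligned load of the 32 input bytes.
        let data = _mm256_loadu_si256(chunk.as_ptr() as *const __m256i);
        // Splat the threshold into every lane (vpbroadcastb).
        let splat = _mm256_set1_epi8(threshold);
        // Signed per-byte comparison data > splat (vpcmpgtb).
        let gt = _mm256_cmpgt_epi8(data, splat);
        // Compress the top bit of each lane into a 32-bit mask (vpmovmskb).
        _mm256_movemask_epi8(gt) as u32
    }
}

/// Scalar reference with the same semantics, used as a fallback.
fn gt_mask_scalar(chunk: &[u8; 32], threshold: i8) -> u32 {
    let mut mask = 0u32;
    for (i, &b) in chunk.iter().enumerate() {
        if (b as i8) > threshold {
            mask |= 1u32 << i;
        }
    }
    mask
}

/// Picks the AVX2 path when the CPU supports it, otherwise the scalar one.
fn gt_mask(chunk: &[u8; 32], threshold: i8) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Safe to call: AVX2 support was just checked at runtime.
            return unsafe { gt_mask_avx2(chunk, threshold) };
        }
    }
    gt_mask_scalar(chunk, threshold)
}

/// From here on everything is scalar: find the first matching byte
/// by counting trailing zeros in the bitmask.
fn first_gt(chunk: &[u8; 32], threshold: i8) -> Option<usize> {
    let mask = gt_mask(chunk, threshold);
    if mask == 0 {
        None
    } else {
        Some(mask.trailing_zeros() as usize)
    }
}
```

Note that vpcmpgtb compares signed bytes, so in this sketch a byte such
as 200 is treated as -56 and does not exceed a threshold of 50.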

We no longer need nightly Rust, since we now use only stable
intrinsics.

Change-Id: Id410b5fef2549f3c97f48049f722f1e643e68553
Reviewed-on: https://cl.tvl.fyi/c/depot/+/7687
Reviewed-by: edef <edef@edef.eu>
Tested-by: BuildkiteCI