
> In contrast, GNU grep uses libc’s memchr, which is standard C code with no explicit use of SIMD instructions. However, that C code will be autovectorized to use xmm registers and SIMD instructions, which are half the size of ymm registers.

I don't think this is correct. glibc has architecture-specific, hand-rolled (or unrolled, if you will, lol) assembly for x64 memchr. See here: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86...
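
For context, here is a rough sketch of what "using libc's memchr" looks like from the caller's side. This is not GNU grep's actual code; `count_byte` is a hypothetical helper made up for illustration. The point is only that the caller writes ordinary C, and whether the scan runs as hand-written assembly, SSE2, or AVX2 is decided entirely inside libc's memchr.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count how many times `needle` occurs in
 * buf[0..len) by calling memchr repeatedly. Each call hands the
 * remaining bytes to whatever memchr implementation libc provides.
 */
static size_t count_byte(const void *buf, size_t len, unsigned char needle)
{
    const unsigned char *p = buf;
    const unsigned char *end = p + len;
    size_t count = 0;

    while (p < end) {
        /* Search only the unscanned tail of the buffer. */
        const unsigned char *hit = memchr(p, needle, (size_t)(end - p));
        if (hit == NULL)
            break;          /* no further occurrences */
        count++;
        p = hit + 1;        /* resume just past the match */
    }
    return count;
}
```

Calling `count_byte(text, text_len, '\n')` counts line terminators in a buffer, which is roughly the kind of work a line-oriented search tool hands to memchr.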



Drats, you're totally right. It's easy to mess up that kind of thing.

Thankfully, it looks like my analysis remains mostly unchanged. I don't see any AVX2 in there (and indeed, I didn't when I looked at the profile either, in contrast to Go's implementation).

I updated the blog, thanks again for the clarification.



