Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I was trying to figure out how this is so much faster than FNV https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo... Is it only because of the parallelism? Or are the operations really that much cheaper somehow?


Both.

The pseudocode for FNV looks like this:

   hash = FNV_offset_value
   for each byte_of_data to be hashed
   {
        hash = hash XOR byte_of_data
        hash = hash × FNV_prime
   }
   return hash
The pseudocode for seahash looks like this (with '×' as the wrapping multplier operator, and some simplification for padding if the data length in bytes is not a multiple of 8 bytes per word × 4 words in the hash state):

    hash = {offset_1, offset_2, offset_3, offset_4}

    for (int data_index = 0; 
             data_index < data.length_in_64_bit_words; 
             data_index = data_index + 4_words_in_hash)
    {        
        for (int hash_index = 0; hash_index < 4; hash_index++)
        {
            // Mix in data
            hash[hash_index] = hash[hash_index] XOR data[data_index + hash_index]
            // Diffuse
            hash[hash_index] = hash[hash_index] XOR (hash[hash_index] RSHIFT 32)
            hash[hash_index] = hash[hash_index] × seahash_prime
            hash[hash_index] = hash[hash_index] XOR (hash[hash_index] RSHIFT 32)
            hash[hash_index] = hash[hash_index] × seahash_prime
            hash[hash_index] = hash[hash_index] XOR (hash[hash_index] RSHIFT 32)
        }
    }

    result = hash[0] XOR hash[1] XOR hash[2] XOR hash[3] XOR data.length_in_bytes

    result = result XOR (result RSHIFT 32)
    result = result × seahash_prime
    result = result XOR (result RSHIFT 32)
    result = result × seahash_prime
    result = result XOR (result RSHIFT 32)

    return result
FNV is operating on bytes of data, while seahash is operating on 64-bit words. A modern processor will be able to handle 64 bits at once. True, it can probably handle 8 bits independently in one instruction without having to create a temporary value, but it still needs to do more operations.

FNV is completely sequential. Until the first byte is hashed, no work can be done on the second byte. In seahash, as you observed, parallelism can be exploited. The second, third, and fourth bytes are all completely independent of the first byte, as bytes 6, 7, and 8 are independent of byte 5, and so on. You can have four independent threads each do a quarter of the work, and then put the result back together at the end.


So what you're saying is, is that this could be heavily optimized using OpenCL?

I might have to bookmark this as a project to take on. A heavily parallel hashing algorithm that takes advantage of OpenCL would immensely increase the efficiency wouldn't it?

(I'm dipping my toes in parallel processing as of late and am genuinely curious in this topic and question)


Is FNV algorithm limited in the same way as a ripple carry adder is then, where every operation relies on the previous result, and can't be (or maybe just isn't but could be) done in parallel due to this implementation?


Hardware threads, just to clarify.

> SeaHash achieves the performance by heavily exploiting Instruction-Level Parallelism.

> This means that almost always the CPU will be able to run the instructions in parallel.


Execution units, to be precise. Separate threads of execution are not involved, hardware or software.


Presumably because SeaHash operates on 8-byte blocks while FNV acts on single bytes.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: