Thursday, September 1, 2011

Computing a SHA-1 Message Digest

This morning I was talking to some people about the IHE XDR profile.  One of the questions was about the hash metadata element.  One speaker was familiar with MD5 but not SHA-1.  It's pretty simple to compute a SHA-1 hash in Java, C#, or most other programming languages.  I promised to share the code, so here it is:


import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;


public class SHA1
{
    public static byte[] computeHash(InputStream is) 
        throws NoSuchAlgorithmException, IOException
    {
        // Use a 16 KB buffer to read bytes from the input stream
        // and feed them to the message digest.
        byte[] buffer = new byte[16384];
        int length;
        // Create a digester using the SHA-1 algorithm.
        MessageDigest digester =
            MessageDigest.getInstance("SHA-1");

        // read() returns -1 at end of stream, which ends the loop.
        while ((length = is.read(buffer)) >= 0)
            digester.update(buffer, 0, length);
        return digester.digest();
    }


    public static String toHex(byte[] data)
    {
        StringBuilder b = new StringBuilder(data.length * 2);
        String toHex = "0123456789abcdef";
        for (int i = 0; i < data.length; i++)
        {   // High nibble first (shift by 4 bits, not 8), then low nibble.
            b.append(toHex.charAt((data[i] >> 4) & 0x0F));
            b.append(toHex.charAt(data[i] & 0x0F));
        }
        return b.toString();
    }

}


You can use this to generate your hash for XDS, XDM, XDR or XCA since they all share the same metadata definitions.
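As a rough usage sketch, assuming the document you want to hash is a file on disk: the class name HashFileSketch and the method sha1Hex below are names I made up for illustration, not part of any IHE toolkit. It streams the file through MessageDigest the same way as the code above and returns the lowercase hex string.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashFileSketch
{
    private static final String HEX = "0123456789abcdef";

    // Hypothetical helper: returns the SHA-1 of a file as lowercase hex.
    public static String sha1Hex(Path file)
        throws NoSuchAlgorithmException, IOException
    {
        MessageDigest digester = MessageDigest.getInstance("SHA-1");
        byte[] buffer = new byte[16384];
        int length;
        // try-with-resources closes the stream even if read() fails.
        try (InputStream is = Files.newInputStream(file))
        {
            while ((length = is.read(buffer)) >= 0)
                digester.update(buffer, 0, length);
        }
        // A SHA-1 digest is 20 bytes, so the hex string is 40 characters.
        StringBuilder b = new StringBuilder(40);
        for (byte value : digester.digest())
        {
            b.append(HEX.charAt((value >> 4) & 0x0F));
            b.append(HEX.charAt(value & 0x0F));
        }
        return b.toString();
    }

    public static void main(String[] args) throws Exception
    {
        // Usage: java HashFileSketch path/to/document
        System.out.println(sha1Hex(Paths.get(args[0])));
    }
}
```

A quick sanity check is to hash a file containing just the three ASCII characters abc (no trailing newline); the well-known SHA-1 test vector for that input is a9993e364706816aba3e25717850c26c9cd0d89d.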

It's also pretty easy to change this code to use SHA-224 or higher as well (or MD5) for other purposes (XDS requires SHA-1).  Just change the string identifying the algorithm.  MessageDigest does all the heavy lifting so you don't have to.  As with all things security related, I leave implementation of the algorithm to the experts.
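To make the "just change the string" point concrete, here is a minimal sketch that takes the algorithm name as a parameter. The class DigestNameSketch and its hex method are illustrative names of my own, and it hashes an in-memory byte array rather than a stream to keep it short. "MD5", "SHA-1" and "SHA-256" are names every Java platform is required to support; other names, such as "SHA-224", depend on your Java version and installed providers, and an unsupported name raises NoSuchAlgorithmException.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestNameSketch
{
    // Hypothetical helper: hash a byte array with the named algorithm
    // and return the result as lowercase hex.
    public static String hex(byte[] data, String algorithm)
        throws NoSuchAlgorithmException
    {
        byte[] hash = MessageDigest.getInstance(algorithm).digest(data);
        StringBuilder b = new StringBuilder(hash.length * 2);
        for (byte value : hash)
            // Mask to 0..255 so negative bytes format as two hex digits.
            b.append(String.format("%02x", value & 0xFF));
        return b.toString();
    }

    public static void main(String[] args) throws Exception
    {
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        // Only the algorithm name changes between runs.
        for (String name : new String[] { "MD5", "SHA-1", "SHA-256" })
            System.out.println(name + ": " + hex(data, name));
    }
}
```

Remember that for the XDS family the hash metadata still has to be SHA-1; the other names are only useful outside that context.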

If someone sends me the C# code, I'll publish that here as well.

   Keith

P.S.  MD5 has vulnerabilities, so if you are thinking about using this to compute an MD5 hash, ask yourself if you can use any of the SHA-2 variants.

P.P.S.  Just fixed it to address Arien's comments below.  If you want the string, just call toHex() on the returned result.



3 comments:

  1. That doesn't give you what you want -- you need to stringify it with a hex binary representation. You also can't use MD5 or SHA-"anything but 1" and comply with 4.1.7

    I wrote a C# implementation here (line 59)

    http://code.google.com/p/nhin-d/source/browse/csharp/common/Metadata/DocumentMetadata.cs

  2. Arien:

    Thanks for the comments and the link to the C# code. I reordered a sentence above, and added a toHex() method to address your comments.

  3. Ruby code

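    Sha1.rb
    require 'digest/sha1'

    class Sha1
      # Returns the raw 20-byte SHA-1 digest of the string.
      def compute_hash(string)
        Digest::SHA1.digest(string)
      end

      # Returns the SHA-1 digest of the string as lowercase hex.
      def to_hex(string)
        Digest::SHA1.hexdigest(string)
      end
    end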
