Okay, not to beat the dead horse but...

I've been looking at md4/md5 lately and, as many have done in the past, attempting to rewrite this algorithm in assembly for use in a library. I've looked at the various versions posted on the board and on Hutch's site, but ultimately I've been trying to understand the original RFC, without a whole lot of luck. Of course, I realize this question is more related to C than assembly, but I was hoping someone could clarify exactly what the purpose of part of the following code is:



void MD5Update (context, input, inputLen)
MD5_CTX *context;                       /* context */
unsigned char *input;                   /* input block */
unsigned int inputLen;                  /* length of input block */
{
  unsigned int i, index, partLen;

  /* Compute number of bytes mod 64 */
  index = (unsigned int)((context->count[0] >> 3) & 0x3F);

  /* Update number of bits */
  if ((context->count[0] += ((UINT4)inputLen << 3))
      < ((UINT4)inputLen << 3))
    context->count[1]++;
  context->count[1] += ((UINT4)inputLen >> 29);

  partLen = 64 - index;

  /* Transform as many times as possible. */
  if (inputLen >= partLen) {
    MD5_memcpy
      ((POINTER)&context->buffer[index], (POINTER)input, partLen);
    MD5Transform (context->state, context->buffer);

    for (i = partLen; i + 63 < inputLen; i += 64)
      MD5Transform (context->state, &input[i]);

    index = 0;
  }
  else
    i = 0;

  /* Buffer remaining input */
  MD5_memcpy
    ((POINTER)&context->buffer[index], (POINTER)&input[i],
     inputLen-i);
}


Now, I understand the majority of the code, but the problem comes in at the area labeled "Update number of bits". The if checks whether count[0], after the number of bits has been added to it, is less than the number of bits that was added. This seems strange to me, but not a showstopper. However, the following:



context->count[1] += ((UINT4)inputLen >> 29);



has me completely baffled. What is the purpose of dividing inputLen by 2^29? That corresponds to something like 0.5 GB of input, but no matter what I throw at this statement, I can't see any particular purpose. (Maybe I'm just going braindead in my old age.)

Hopefully someone will be able to point me in the right direction, but if not, thanks for taking a look anyway ;)

-----
Domain
Posted on 2003-11-10 17:15:12 by Domain
The length of an MD5 data stream, in bits, is stored as a 64-bit value in two dwords and appended to the end of the stream.
While the data is being hashed, these two dwords are kept in context->count[0] (low dword) and context->count[1] (high dword).

The C code you have first shifts inputLen left by 3, effectively multiplying the value by 8 (8 bits per byte).
It then adds the result to the low dword of count[]. If the result of the addition is less than the value that was added, then an overflow occurred and the high dword, count[1], is incremented.

But when you shift inputLen left by 3, you lose its 3 most significant bits.
So the line "context->count[1] += ((UINT4)inputLen >> 29);" compensates for that by adding those high 3 bits to the high dword, which is where they belong in the 64-bit bit count.
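
If it helps to see it in isolation, here is a rough sketch of just the counter update, checked against a plain 64-bit addition. The names bit_counter, add_bytes, add_bytes_64, as_u64 and methods_agree are made up for this example and are not part of the RFC code; it also assumes a compiler that provides <stdint.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the count[] pair inside MD5_CTX. */
typedef struct {
    uint32_t count[2];   /* count[0] = low dword, count[1] = high dword */
} bit_counter;

/* Same arithmetic as the "Update number of bits" step in MD5Update. */
static void add_bytes(bit_counter *c, uint32_t inputLen)
{
    uint32_t lowBits = inputLen << 3;   /* bytes * 8; the top 3 bits fall off */

    c->count[0] += lowBits;
    if (c->count[0] < lowBits)          /* sum wrapped past 2^32, so carry */
        c->count[1]++;
    c->count[1] += inputLen >> 29;      /* the 3 bits the shift discarded */
}

/* Reference version: the same update done as one 64-bit addition. */
static uint64_t add_bytes_64(uint64_t bits, uint32_t inputLen)
{
    return bits + ((uint64_t)inputLen << 3);
}

/* Join the two dwords into a single 64-bit value for comparison. */
static uint64_t as_u64(const bit_counter *c)
{
    return ((uint64_t)c->count[1] << 32) | c->count[0];
}

/* Returns 1 if both methods give the same count for this start and length. */
static int methods_agree(uint32_t lo, uint32_t hi, uint32_t inputLen)
{
    bit_counter c = { { lo, hi } };
    uint64_t expected = add_bytes_64(as_u64(&c), inputLen);

    add_bytes(&c, inputLen);
    return as_u64(&c) == expected;
}
```

For example, methods_agree(0, 0, 0x20000000u) covers the 0.5 GB case from the question: the shifted value wraps to 0 in the low dword, and the >> 29 term is what puts the missing 1 into count[1]. In assembly the carry test and the increment would normally collapse into an add followed by an adc.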


I wrote a small ready-to-use MD5 library a while ago and posted it here. I wrote it with size in mind, so if you can live without speed (though it is probably still faster than the C implementation) then I suggest you try it. I forget which thread it's in, but search around and you'll find it.

Good luck :alright:
Posted on 2003-11-10 20:28:31 by iblis
iblis,

Thanks so much for your reply, it makes perfect sense now.

As for your implementation, I've looked at it (it looks very nice), but the primary purpose of this exercise is to do a little learning, so I wanted to roll my own on this one.

Thanks again,

-----
Domain
Posted on 2003-11-11 15:10:49 by Domain