When generating an assembly program from a C program with GCC, I have observed that after the stack frame is set up, the stack pointer is ANDed with -16 (effectively zeroing out the 4 least significant bits) before memory is allocated for local variables.

Is this to ensure proper memory alignment for 32 bits (32-bit machine)? If so, why -16 and not -4?

pushl   %ebp
movl    %esp,  %ebp
andl    $-16,  %esp
subl    $1024,  %esp
Posted on 2011-01-01 11:35:18 by Allasso
Some of the SSE instructions come in "aligned" and "unaligned" forms (MOVDQA vs MOVDQU, for example). The "aligned" forms require 16-byte alignment of their memory operand, and are faster. I think that's why gcc does that. There may be other advantages (cacheline is 16 bytes?). A 4-byte stack alignment is probably "good enough" for your purposes. You wouldn't want anything less than that, but it won't happen unless you do something "unusual"...
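
To make the masking arithmetic concrete, here is a small C sketch of what an AND with -16 (or -4) does to an address. This is only an illustration, not what GCC emits internally: `align_down` is a hypothetical helper name, and it assumes a 32-bit address and a power-of-two alignment.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Rounds addr down to a multiple of `alignment`, which is assumed to be
 * a power of two. For alignment 16 the mask is ~15 = 0xFFFFFFF0, the same
 * bit pattern as the 32-bit constant -16 in `andl $-16, %esp`.
 * For alignment 4 the mask is ~3 = 0xFFFFFFFC, i.e. -4. */
static uint32_t align_down(uint32_t addr, uint32_t alignment)
{
    return addr & ~(alignment - 1u);
}
```

With a made-up stack address of 0xBFFFF7AE, `align_down(0xBFFFF7AE, 16)` gives 0xBFFFF7A0 (low 4 bits cleared), while `align_down(0xBFFFF7AE, 4)` gives 0xBFFFF7AC (only the low 2 bits cleared). Masking always rounds down, which is safe here because the stack grows toward lower addresses.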


Posted on 2011-01-01 15:28:12 by fbkotler
Great, thanks.

Aren't most cachelines 64 bytes?

Posted on 2011-01-01 17:12:28 by Allasso
You're probably right. I really don't know. I imagine it depends on how much of a "valuable antique" your hardware is. :) I'm not very up-to-date!


Posted on 2011-01-01 17:40:00 by fbkotler