> So what, the optimized code goes only once through the loop, and then
> bails out? If so, what is the value of 'size' when the loop ends?

OK, so there was one more detail that I forgot to mention. It looks like
I also had "-funroll-loops". After removing it, "-O3" works fine too
(without "volatile").

I think this is still a toolchain bug, because the "-O3 -funroll-loops"
combination works fine in the x64 build and, more importantly, it
*should* work fine.

Anyway, maybe it's a good idea to prevent people from passing
"-funroll-loops", for example by filtering it out of "CFLAGS" in the
"Makefile"?
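
To make the filtering idea concrete, here is a rough sketch of what I
have in mind. It assumes GNU make and that users supply the flag through
"CFLAGS" (environment or command line); I have not checked how the
actual Makefile builds up its flags, so treat the placement as a guess.

```make
# Sketch only, assuming GNU make.
# Drop -funroll-loops from whatever CFLAGS the user supplied.
# "override" is needed so the assignment also takes effect when
# CFLAGS is given on the make command line (make CFLAGS="...").
override CFLAGS := $(filter-out -funroll-loops,$(CFLAGS))
```

Note that "filter-out" matches whole words, so this removes only the
exact "-funroll-loops" flag; a related flag such as
"-funroll-all-loops" would need its own pattern.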