Less-than optimization

Constraining values to small 0 <= N <= limit is something that many compilers now optimize for you to reduce the number of comparisons and to eliminate the branch implied by the “also”:

bool checkConstraint(/*signed*/ int i)
{
    return ( i >= 0 && i <= 10 ) ;
}

bool muchFasterCheckConstraint(/*signed*/ int i)
{
    return ( (unsigned int)i <= (unsigned int)10 ) ;
}

GCC 4.6.2 compiles the above to the following (I’ve omitted boilerplate code)

checkCapacity:
    cmpl    $0, 8(%ebp)
    js      .L2
    cmpl    $10, 8(%ebp)
    jg      .L2
    movl    $1, %eax
    jmp     .L3
.L2:
    movl    $0, %eax
.L3:
    popl    %ebp
    ret

muchFasterCheckCapacity:
    movl    8(%ebp), %eax
    cmpl    $10, %eax
    setbe   %al
    popl    %ebp
    ret

The optimization here is that converting “i” from signed to unsigned translates negative values into very large values.

This optimization has a couple of problems: Your constraints must be <= the largest positive value the signed type holds, and you’re introducing a cast which might break if the function argument is changed.

Lastly, it’s a bloody obtuse bit of code that needs a good comment or a nice macro to ensure maintainers know why the hell you were using it.

Many compilers now make this optimization for you, for example, GCC 4.6.2 with -O1 will compile both functions above as:

checkCapacity:
    cmpl    $10, 4(%esp)
    setbe   %al
    ret

muchFasterCheckCapacity:
    cmpl    $10, 4(%esp)
    setbe   %al
    ret

This works, because “cmpl” performs an unsigned comparison in the first place, so if “i” is -1 and the local int is 32 bits, then you will be comparing “4294967295” with “10”.

I don’t recommend using this optimization by hand, but I leave the reader with the exercise of understanding what gains this optimization performs other than just the reduction of a couple of extra instructions – it can potentially save hundreds of CPU cycles per call.

Leave a comment

Name and email address are required. Your email address will not be published.

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <pre> <q cite=""> <s> <strike> <strong>