There are many reasons leverage integers instead of more expensive data structures, like strings. On top of being much more memory efficient without readability sacrifice, compilers can apply numerous optimizations to improve use-cases in which they are used. Bit-testing is one of such optimizations, that uses masks to perform multiple checks with a single comparison, somewhat similar to SIMD instructions. For example, a series of conditional expressions that compare the same variable
int isHtmlWhitespace(int ch) {
return ch == 0x0009 || ch == 0x000A ||
ch == 0x000C || ch == 0x000D ||
ch == 0x0020;
}
can be transformed into a switch statement if each of them contains a comparison expression. This statement can be transformed into a switch statement and then expanded into a bit-test. Below is the assembly that GCC with enabled optimizations is able to produce
isHtmlWhitespace(int):
cmp edi, 32 ; compare ch with 0x0020
ja .L42 ; and return false if ch is larger, since 32 is the largest possible match
movabs rax, 4294981120
mov ecx, edi
shr rax, cl
and eax, 1
ret
.L42:
xor eax, eax
ret
Here is what happens:
cmp edi, 32
andja .L42
handle the case whench
is outside of the range of the values we are interested in.32
is the largest value that can be a match, so ifch
is greater, we can safely returnfalse
;- then we use a mask
0b100000000000000000011011000000000
where1
is set at positions that correspond to the values we are looking for shifted by one to the left. So0x0009
is responsible for 10th set bit,0x000A
for the 11th bit and so on; shr rax, cl
shifts the mask value to the right bych
bits, so for0x0009
0b100000000000000000011011000000000
becomes0b100000000000000000011011
and for0x000A
-0b10000000000000000001101
;- and finally
and eax, 1
checks if the rightmost bit after above shift is set, in which casech
is in the bitset and we can returntrue
, otherwise, say, forch
==0x000B
, the shifted mask becomes0b1000000000000000000110
with the rightmost bit set to0
and function returnsfalse
.
Clang generates different but still very compact and interesting assembly
isHtmlWhitespace(int): # @isHtmlWhitespace(int)
lea eax, [rdi - 9]
cmp eax, 5
jae .LBB2_3
mov ecx, 27
bt ecx, eax
jb .LBB2_2
.LBB2_3:
xor eax, eax
cmp edi, 32
sete al
ret
.LBB2_2:
mov eax, 1
ret
lea eax, [rdi - 9]
is doing something likenew_ch = ch - 9
to shift the range of possiblech
values from[9, 32]
to[0, 23]
;cmp eax, 5
andjae .LBB2_3
checks ifnew_ch
can include0x0009
,0x000A
,0x000C
,0x000D
in the new[0,23]
interval and if that's the case, we have to check the last potential match,32
usingcmp edi, 32
;mov ecx, 27
creates a bitset mask0b11011
that has1
at each position corresponding tonew_ch
values we are interested in - 0, 1, 3 and 4.bt ecx, eax
uses a dedicated bit test instructionbt
that performs the actual check and returnstrue
in such case.
Unfortunately not all compilers are using this clever optimization. Unfortunately Go's compiler is one of such compilers and
func is_html_whitespace(ch int) bool {
return ch == 0x0009 || ch == 0x000A ||
ch == 0x000C || ch == 0x000D ||
ch == 0x0020
}
is translated into a set of if
s
v14
00003 (+4) MOVQ "".ch(SP), AX
v30
00004 (4) CMPQ AX, $9
b1
00005 (4) JNE 9
v29
00006 (4) MOVL $1, AX
v28
00007 (+4) MOVB AX, "".~r1+8(SP)
b8
00008 (4) RET
v10
00009 (4) CMPQ AX, $10
b2
00010 (4) JEQ 6
v34
00011 (+5) CMPQ AX, $12
b4
00012 (5) JEQ 6
v31
00013 (5) CMPQ AX, $13
b6
00014 (5) JEQ 6
v7
00015 (+6) CMPQ AX, $32
v24
00016 (6) SETEQ AX
b9
00017 (6) JMP 7
which is very unfortunate and serves as a reminder that HPC is not Go's area of focus.
And what about Rust? Interestingly its assembly is different from the one generated by Clang compiler for C++ code and impressively contains no branches
is_html_whitespace: # @is_html_whitespace
# %bb.0:
# kill: def $edi killed $edi def $rdi
leal -9(%rdi), %eax
cmpl $2, %eax
setb %al
movl %edi, %ecx
andl $-2, %ecx
cmpl $12, %ecx
sete %cl
orb %al, %cl
cmpl $32, %edi
sete %al
orb %cl, %al
retq
where
leal -9(%rdi), %eax
is similar to Clang'snew_ch = ch - 9
;cmpl $2, %eax
andsetb %al
setsal
in casech
matches0x0009
or0x000A
;andl $-2, %ecx
andcmpl $12, %ecx
handle0x000C
and0x000D
cases.andl $-2, %ecx
unsets the rightmost bit so that12
and13
could be checked with a single comparisoncmpl $12, %ecx
;cmpl $32, %edi
handlesch == 0x0020
;- and finally
orb %cl, %al
setsal
totrue
in case we had at least one match.