First of all, I apologize if you’re subscribed to the feed and are seeing duplicates of the last two posts. In both cases I changed things after publication that resulted in a different feed entry ID, causing many feed readers to treat the changed entry as a new entry. I think I have things sorted out well enough now that it won’t be an issue going forward.
Last time we wrote constructors and accessors for a handful of basic data types, all of them dynamically allocated in the heap (as opposed to on the stack). This time we’re going to change the representation of integers and characters so that they’re immediate values and don’t require dynamic allocation. You can see the code as of this post on the sky-0.0.2 tag, and the log of changes since the first time here.
The technique we’ll use to do this is widely used in programming language
implementations, operating systems, and “low level” software more generally. It
starts from the observation that the memory returned by functions like malloc
must be suitably aligned for any (C) object.
As of C11, there’s a type max_align_t
(defined in stddef.h
)
whose alignment is at least as large as that of every scalar type, plus an
operator alignof
(defined in stdalign.h
) which will tell us the
alignment requirement of a given type. We can use them to write a small test
program that will tell us what alignment we can count on from malloc
:
#include <stdio.h>
#include <stddef.h>
#include <stdalign.h>
int main(void)
{
printf("alignof(max_align_t) = %zu\n", alignof(max_align_t));
return 0;
}
In practice, the answer on a 64-bit machine will be 16 (bytes). The upshot is
that memory addresses returned by functions like malloc
will always be
divisible by 16. Put another way, the 4 least significant bits of such memory
addresses will always be zero (because 16 is 2^4
). On a 32-bit machine, the
minimum alignment will be 8 bytes, making the 3 least significant bits zero. You
can also requested more highly-aligned memory using special
allocation functions.
The bottom line is that at least the three least significant bits of our struct
object *
pointers will always be zero1, which means we can use
those bits for something else. If we use all three bits we can store 8 possible
values there, but for now we’ll only use two bits, for four possible values.
We’ll use them to store a type tag, with the following scheme:
Binary | Decimal | Type
-------|---------|-----------
00 | 0 | Pointer
01 | 1 | Integer
10 | 2 | Character
11 | 3 | Unused
If the two least significant bits are 00
, the value represents a pointer
(specifically, a struct object *
). If they’re 01
the object is an integer,
if they’re 10
it’s a character, and we’ll leave 11
unused for now.
Taking a step back for a moment, while we could tag the struct object *
values
directly, things will be cleaner and clearer if we instead introduce a C type
that’s either a pointer or an immediate value. We introduce value_t
to
fill that roll:
typedef intptr_t value_t;
It’s a typedef
of intptr_t
, which is a signed integer large enough to hold a
pointer. The downside is a bunch of extra casts, but they’ll be well-contained
in the constructors and accesssors.
Okay, so how do we construct one of these tagged objects? We shift the value left by two bits and then add the tag.
#define TAG_BITS 2
#define PTR_BITS (sizeof(intptr_t) * CHAR_BIT)
#define VAL_BITS (PTR_BITS - TAG_BITS)
#define VAL_MASK (-(1 << TAG_BITS))
value_t make_integer(intptr_t value)
{
return ((uintptr_t)value << TAG_BITS) + TAG_INT;
}
And then to get the original value back we simply shift right by two bits.
intptr_t integer_data(value_t value)
{
return value >> TAG_BITS;
}
Of course we also need to teach get_type_tag
about these immediate values.
enum type_tag get_type_tag(value_t value)
{
if (value == NIL) return TAG_LIST;
enum type_tag tag = value & ~VAL_MASK;
if (!tag)
tag = ((struct object *)value)->tag;
return tag;
}
If your background is in higher-level dynamic programming languages like Python,
you may not be very familiar with these bitwise operations. I wasn’t,
anyway, until I started poking around in programming language implementations.
My advice is to read up on them a bit and then spend some time playing around in
GDB. Conveniently, you can use print /t
(which can be abbreviated to p/t
) to
print the binary representation of a value, which makes it easy to see the
effect of bitwise operations.
Here’s an example session experimenting with the bitwise operations we use to tag and untag an integer:
(gdb) set $i = 123
(gdb) p/t $i
$1 = 1111011
(gdb) p/t $i << 2
$2 = 111101100
(gdb) set $t = ($i << 2) + 1
(gdb) p/t $t
$3 = 111101101
(gdb) p/t $t >> 2
$4 = 1111011
(gdb) p $t >> 2
$5 = 123
Since we’re using two bits for the type tag, we only have either 62 or 30 bits left over for integer values. We can define preprocessor macros to identify the largest and smallest values we can represent.
#define MOST_POSITIVE_INT (INTPTR_MAX >> TAG_BITS)
#define MOST_NEGATIVE_INT (-1 - MOST_POSITIVE_INT)
We’re not going to do any error checking in make_integer
yet, but eventually
we’ll want to signal in error if an operation would result in an integer too
large for us to represent. (In a production language, you would hopefully
promote it to a “bignum” - an arbitrary precision integer - but we probably
won’t go there).
For objects that will still be dynamically allocated on the heap, the only
change is that we cast the struct object *
to value_t
when we return it, and
then in the accessors cast back to struct object *
so we can dereference it.
value_t make_string(const char *data, ptrdiff_t len)
{
// Construct the `struct object *` the same as before
return (value_t)obj;
}
const unsigned char *string_data(value_t value)
{
return ((struct object *)value)->u.string.data;
}
To sum things up, we now have the following arrangement: At the C level, a Sky
value is represented by a value_t
. Each value_t
is either an immediate value
(an integer or character), or a struct object *
. To access the immediate
value, we untag it; to access a struct object
, we perform the cast and
dereference the pointer.
You can take this technique a lot further than we have here. We could get rid of
the tag
member of struct object
entirely and tag all pointers rather than
just immediate values. However, my hope is that this more minimal version gets
us the low-hanging fruit and demonstrates the technique while keeping things as
clear as possible.
-
I am ignoring sub-32 bit machines, which I think is reasonable in this context. ↩︎