D Programming/Garbage collector/Thoughts about better GC implementations

Requirements
It should be possible to have different GC Implementations. E.g.
 * precise
 * concurrent
 * generational
 * compacting/moving

This leads to more concrete requirements

Which word is a pointer
A precise GC should only mark that memory, that is referenced from a pointer. It should not mark memory "referenced" from non-pointer types. A moving GC also needs to modify pointer values, if an object was relocated.

Which address belongs to which Object
If a pointer is scanned, the start address of an object should be know. So it should be possible to get the PLI (pointer location information), which is needed to go on.

Moving references
In concurrent/incremental GC implementation, the moving reference is a problem. Object refa = new Object; Object refb = null;


 * 1) start of gc collecting cycle
 * 2) gc scans refb, which is null
 * 3) gc pauses or is interrupted from another thread
 * 4) somewhere else: refb = refa; refa = null;
 * 5) gc continues ('thinking' refb is null)
 * 6) gc scans refa, which is now null also
 * GC: Oh, no ref to obj, I can free it!
 * 1) the appl tries to reference refb which is not null and crashes

This is the cause for the need of a write barrier. In a concurrent/incremental GC, every write to a reference needs to invalidate the scans of the old target reference value.

Conservative Stack Scanning
Since scanning the stack precisely could be a hard problem (especially when considerign C interfaces, i.e. callbacks from C code to D code), one could simply conservatively scan the stack and "pin" all objects that are possibly referenced from the stack. The new Mono GC does this.

See section "Conservative Scanning". The stack is treated in a similar way like the current mark & sweep algorithm does it.

Object hash codes
In order to support Object.toHash in presence of a moving GC, one needs to store the hash code in the object itsself. The hash code can be taken from the object's address, but since its address can be changed, it needs to be stored.

Weak Pointers
It should be considered to support weak pointers, which are a potentially useful and an often requested feature. A weak pointer is set to null, if there are no more non-weak pointers to an object. Java and C# also support weak pointers.

Pointer to native types and member variable
Pointer to native types should be possible by the language, but should not be GC relevant. This means, they should not prevent the GC from deleting an object, and they are not marked as "pointers".

So the GC does not need to find the object start. The programmer just needs to be sure, that he does not make references to member, that live longer than the complete object.

GC specific per object data
A specific GC implementation might be implemented more efficiently if it is possible to store a limited amount of GC data inside the object (i.e. flags). But note that most management data can be stored in the memory block which contains the object.

Object Layout
Brain storming what concrete object layouts (only wrt. to GC) would be possible:

Typeinfo pointer
The first 4 bytes would contain a typeinfo pointer, and the type info contains the layout information for the object. This layout information would exist as a bitmap or a region list (a list of (offset, length) pairs)).

Bidirectional memory layout
A very specialized idea to separate pointer and integer data, so no bitmap or similar complex data structures would be required for the GC to scan an object. Pointers and integers would just be completely separated:

+--+    + integers + .  ...    .     +--+     + reflen   + +--+  <-- object ptr .  ...    .     + pointers + +--+

A pointer to an object always points into the middle of the object. Above object ptr, all data is integer data, and below the object ptr, everything is pointer data. The "reflen" member can be used by the GC to find the size of the region it needs to scan.

Inherited classes just can attach their integer/pointer data above/below the data of the base address. The "reflen" member would need to be set correctly if the object is constructed.

Although this would be very fast (as opposed to bitmaps), it also maybe would be very hard to integrate into existing compilers/environments/toolchains. With interrior pointers, the class head would be harder to find. Arrays of structs also would be a (maybe solvable) problem.

Modified bidirectional memory layout
One also could organize pointer data in regions and link that pointer-data regions like a singly linked list (i.e. in presence of inheritance, where pointers can't be grouped into a single region). Every pointer-data region would start with a non-pointer word containing the size of the region, and either the first word of an object would contain a pointer to the first pointer-data region, or the object would start with such a region by itsself.

Read/Write Barriers
Some GC need to run some code each time a reference is overwritten and/or if a reference is read. A flexible way can be, to give the compiler an function, to use for read and write accesses. These functions should always be inlined and optimized. e.g.: ___ref_assign( void * trg, void * src ){ trg = src; } void* ___ref_read( void * src ){ return src; }

Management of pointer location info
The allocation function of the GC should not only receive the size of memory, which is required. It should also receive a bitfield with the information, which words in this memory are references.

Reference info for the stack
each stack frame begins with the bitfield with reference information. If the GC scans the stack, the frames have to be recognized.

Register values
The compiler should ensure, that reference values are always present in memory also. So the GC do not need to scan the registers, and it does not need to know which one contains pointer values.

enable/disable
Implemented for concurrent. Should be recursive usable. This means it should work with a counter. The counter is incremented with each call of disable and decremented with calls to enable.

pin/unpin
Implemented for moving/compacting GCs.

fullcollect
Forces a collection cycle. The call returns after the collection cycle finished. In a Stop-the-world implementation this means, that all threads will pause. In a concurrent implementation not. A reference counting solution will return immediately.