added INCRP macro - now does 32-bit increments of RP for speed
added ADDRP macro to return RP incremented by n (CGT)
changed globals to static (didn't help speed much - thought it might)
moved around some functions
changed shift instructions to create bitmask at runtime (faster)
manually inlined mathexception (but used inline keyword in later revs)