mirror of https://github.com/wfjm/w11.git synced 2026-01-12 00:43:01 +00:00

tbit trap overhaul - part 2

- rtl/w11a
  - pdp11_sequencer.vhd: tbit logic overhaul 2, now fully 11/70 compatible
- tools/tcode
  - cpu_details.mac: add A4.4 part 0,8,9,10
  - cpu_mmu.mac: use m*pd
wfjm 2022-12-26 12:26:51 +01:00
parent dc9005e98a
commit c9d447f2be
9 changed files with 256 additions and 49 deletions

View File

@ -1,4 +1,4 @@
# ECO-035: STKLIM, yellow and tbit trap fixes (2022-12-06)
# ECO-035: STKLIM, yellow and tbit trap fixes (2022-12-06,2022-12-26)
### Scope
- mostly in w11a since 2008
@ -59,10 +59,13 @@ as a result, some J11 behaviors crept into the w11.
was implemented by ignoring any break condition. This gives the expected
behavior in almost all cases but deviates in a few corner cases like
single stepping code.
- fix: implement the approach used by 11/70, but also J11, to set a request
- fix 1: implement the approach used by 11/70, but also J11, to set a request
flag at the beginning of instruction processing, in state `s_idecode`,
and take the tbit trap decision based on that flag at the end of
instruction execution.
- fix 2: use the 11/70 logic that the request flag _loads_ the `PSW` tbit
instead of _setting_ it when `PSW` tbit=1. This ensures that a traced
`RTI` or `RTT` does not tbit trap when a `PS` with tbit=0 is loaded.
- `RESET` wait time
- issues: on the w11 the `RESET` instruction caused a one-cycle `breset`
pulse and continued immediately. The clearing of pending interrupts takes

View File

@ -18,7 +18,7 @@ all PDP-11 models.
The w11 implements the 11/70 service order.
This is verified in a [tcode](../tools/tcode/README.md), the test is
skipped when executed on SimH
modified when executed on SimH
(see [cpu_details.mac](../tools/tcode/cpu_details.mac) test A4.4 part 3).
See also [traced `WAIT`](simh_diff_traced-wait.md).

View File

@ -20,6 +20,8 @@ ones are listed here:
- service order and trap handling
- [SimH: trap and interrupt service order has J11 behavior](simh_diff_service-order.md)
- [SimH: traced `WAIT` has J11 behavior](simh_diff_traced-wait.md)
- [SimH: traced `RTI`/`RTT` that clears tbit does trap](simh_diff_traced-rti-rtt.md)
- [SimH: vector flow that sets tbit does not trap](simh_diff_traced-vector.md)
- memory management behavior
- [SimH: `MMR1` recording has J11 behavior](simh_diff_mmr1.md)
- [SimH: MMU traps not suppressed when MMU register accessed](simh_diff_mmu_trap_suppression.md)

View File

@ -0,0 +1,18 @@
## Known differences between SimH, 11/70, and w11a
### SimH: traced `RTI`/`RTT` that clears tbit does trap
On an 11/70 and on a J11, a traced `RTI` or `RTT` loading a new `PS` with
tbit=0 does not cause a tbit trap. More precisely:
- an `RTT` will never end with a tbit trap
- an `RTI` ends with a tbit trap only when the new `PS` has tbit=1.
The Processor Handbook documentation is misleading and at one point simply wrong.
On SimH, a traced `RTI` or `RTT` does trap. Confirmed bug, will be fixed.
The w11 implements traced `RTI` or `RTT` correctly, the corresponding test
is skipped when executed on SimH
(see [cpu_details.mac](../tools/tcode/cpu_details.mac) test A4.4 part 8).
Tested with SimH V3.12-3.
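
The 11/70 behavior for a traced return can be illustrated with a small Python model. This is a hedged sketch, not code taken from the w11 or SimH sources: the function name `traps_after_return` is hypothetical, and the only hardware fact assumed is that the T bit is bit 4 (octal 20) of the PDP-11 `PS`. It follows the `s_rti_newpc` logic of this commit, where `RTI` copies the tbit of the new `PS` into the request flag and `RTT` clears the flag.

```python
T_BIT = 0o20  # trace (T) bit, bit 4 of the PDP-11 PS


def traps_after_return(op: str, new_ps: int) -> bool:
    """Model of the tbit trap decision at the end of a traced RTI or RTT.

    Hypothetical helper for illustration; mirrors the 11/70 rule only.
    """
    if op == "RTT":
        return False                  # RTT never ends with a tbit trap
    if op == "RTI":
        return bool(new_ps & T_BIT)   # RTI traps only if the new PS has tbit=1
    raise ValueError(f"unsupported instruction: {op}")
```

In this model, `traps_after_return("RTI", 0)` is `False`, which corresponds to the case checked by test A4.4 part 8.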

View File

@ -0,0 +1,13 @@
## Known differences between SimH, 11/70, and w11a
### SimH: vector flow that sets tbit does not trap
On an 11/70, a vector flow loading a new `PS` with tbit=1 ends with a tbit trap.
On SimH, a vector flow loading a new `PS` with tbit=1 does not tbit trap.
The w11 implements the 11/70 behavior, the corresponding tests
are skipped when executed on SimH
(see [cpu_details.mac](../tools/tcode/cpu_details.mac) test A4.4 part 9,10).
Tested with SimH V3.12-3.
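
The 11/70 behavior for vector flows can be sketched with a toy Python model of a tbit request flag. This is an illustration under stated assumptions, not the w11 implementation: the class `TbitRequestModel` and its method names are hypothetical, and the T bit is assumed to be bit 4 (octal 20) of the `PS`. The model reloads the request flag from the new `PS` at the end of a vector flow, as `pdp11_sequencer.vhd` does in this commit.

```python
T_BIT = 0o20  # trace (T) bit, bit 4 of the PDP-11 PS


class TbitRequestModel:
    """Toy model of a tbit request flag (names are illustrative only)."""

    def __init__(self, ps: int = 0) -> None:
        self.ps = ps
        self.treq_tbit = False

    def instruction_start(self) -> None:
        # The request flag is loaded from the PS tbit when decoding starts.
        self.treq_tbit = bool(self.ps & T_BIT)

    def vector_flow(self, new_ps: int) -> None:
        # A trap or interrupt loads a new PS from the vector; in the 11/70
        # scheme the request flag is reloaded from that new PS.
        self.ps = new_ps
        self.treq_tbit = bool(new_ps & T_BIT)


model = TbitRequestModel(ps=0)      # untraced code
model.instruction_start()
model.vector_flow(T_BIT)            # vector PS has tbit=1
trap_at_handler_entry = model.treq_tbit
```

Here `trap_at_handler_entry` ends up `True`, matching the first expected item of test A4.4 part 9 and part 10 (a tbit trap at the entry of the handler). A vector flow that loads a `PS` with tbit=0 leaves the flag clear, which is the case covered by part 0.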

View File

@ -2,7 +2,7 @@
### SimH: traced `WAIT` has J11 behavior
On an 11/70 (and an 11/45) a traced `WAIT` will wait until an interrupt happens
On an 11/70 (and an 11/45), a traced `WAIT` will wait until an interrupt happens
and finish without raising a trace trap because the interrupt has higher
service precedence. The trace trap related to the `WAIT` will happen when the
interrupt driver exits with an `RTI`.

View File

@ -1,4 +1,4 @@
-- $Id: pdp11_sequencer.vhd 1330 2022-12-16 17:52:40Z mueller $
-- $Id: pdp11_sequencer.vhd 1337 2022-12-26 11:14:21Z mueller $
-- SPDX-License-Identifier: GPL-3.0-or-later
-- Copyright 2006-2022 by Walter F.J. Mueller <W.F.J.Mueller@gsi.de>
--
@ -13,6 +13,7 @@
--
-- Revision History:
-- Date Rev Version Comment
-- 2022-12-26 1337 1.6.26 tbit logic overhaul 2, now fully 11/70 compatible
-- 2022-12-12 1330 1.6.25 implement MMR0,MMR2 instruction complete
-- 2022-12-10 1329 1.6.24 BUGFIX: get correct PS after vector push abort
-- 2022-12-05 1324 1.6.23 tbit logic overhaul; use treq_tbit; cleanups
@ -951,9 +952,8 @@ begin
nstatus.resetcnt := "111"; -- set RESET wait timer
if PSW.tflag='1' then -- if PSW tbit set
nstatus.treq_tbit := '1'; -- request tbit
else
nstatus.treq_tbit := PSW.tflag; -- copy PSW.tflag to treq_tbit
if PSW.tflag = '0' then -- if PSW tbit clear consider prefetch
-- The prefetch decision path can be critical (and was on s3). It uses
-- R_STATUS.intpend instead of int_pending, using the status latched
-- at the previous state is OK. It uses R_STATUS.treq_mmu because
@ -2219,7 +2219,6 @@ begin
when s_vec_getpc => -- -----------------------------------
nmmumoni.vstart := '1'; -- signal vstart
nstatus.in_vecflow := '1'; -- signal vector flow
nstatus.treq_tbit := '0'; -- cancel pending tbit request
nvmcntl.mode := c_psw_kmode; -- fetch PC from kernel D space
do_memread_srcinc(nstate, ndpcntl, nvmcntl, s_vec_getpc_w, nmmumoni);
@ -2326,6 +2325,7 @@ begin
nstate := s_vec_pushpc_w;
do_memcheck(nstate, nstatus, imemok);
if imemok then
nstatus.treq_tbit := PSW.tflag; -- copy PSW.tflag to treq_tbit
nstatus.in_vecflow := '0'; -- signal end vector flow
nstatus.in_vecser := '0'; -- signal end of ser flow
nstatus.in_vecysv := '0'; -- signal end of ysv flow
@ -2369,9 +2369,10 @@ begin
end if;
when s_rti_newpc => -- -----------------------------------
if R_IDSTAT.op_rti = '1' and -- if RTI instruction
PSW.tflag = '1' then -- and PSW tflag set now
nstatus.treq_tbit := '1'; -- request immediate tbit
if R_IDSTAT.op_rti = '1' then -- if RTI instruction
nstatus.treq_tbit := PSW.tflag; -- copy PSW.tflag to treq_tbit
else -- else RTT
nstatus.treq_tbit := '0'; -- no tbit trap
end if;
ndpcntl.ounit_asel := c_ounit_asel_ddst; -- OUNIT A=DDST
ndpcntl.ounit_bsel := c_ounit_bsel_const; -- OUNIT B=const (0)

View File

@ -1,10 +1,10 @@
; $Id: cpu_details.mac 1332 2022-12-21 11:56:32Z mueller $
; $Id: cpu_details.mac 1337 2022-12-26 11:14:21Z mueller $
; SPDX-License-Identifier: GPL-3.0-or-later
; Copyright 2022- by Walter F.J. Mueller <W.F.J.Mueller@gsi.de>
;
; Revision History:
; Date Rev Version Comment
; 2022-12-10 1329 1.0 Initial version
; 2022-12-25 1337 1.0 Initial version
; 2022-07-18 1259 0.1 First draft
;
; Test CPU details
@ -80,17 +80,24 @@
; part 2: from cm=s,rset=1 mode: set cm=0 and rset=0 (fail!)
; part 3: from cm=s,rset=0 mode: set cm=u and rset=1 (fine!)
; part 4: from cm=u,rset=1 mode: set cm=0 and rset=0 (fail!)
; part 5: from cm=k,pri=0: set pri=6 (fine!)
; part 6: from cm=s,pri=0: set pri=6 (fail!)
; part 7: from cm=u,pri=0: set pri=6 (fail!)
; A4.3 RTI/RTT tbit basics
; part 1: tbit after RTI
; part 2: tbit after RTT
; A4.4 Test A4.4 -- tbit trace tests
; part 1: simple instruction sequence
; part 2: tracing of trap instructions (EMT tested)
; part 3: tbit vs interrupt precedence (via PIRQ)
; part 4: traced WAIT and tbit
; part 5: WAIT and SPL in user mode
; part 6: tbit trap after continuation over s_idle
; part 7: no tbit trap after an abort
; part 0: traced TRAP that clears tbit
; part 1: simple instruction sequence
; part 2: tracing of trap instructions (EMT tested)
; part 3: tbit vs interrupt precedence (via PIRQ)
; part 4: traced WAIT and tbit
; part 5: WAIT and SPL in user mode
; part 6: tbit trap after continuation over s_idle
; part 7: no tbit trap after an abort
; part 8: traced RTI that clears tbit
; part 9: EMT that sets tbit
; part 10: PIRQ that sets tbit
;
; Test A1: PIRQ +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
; This sub-section verifies operation of PIRQ register
@ -1001,7 +1008,7 @@ ta0401:
9999$: iot ; end of test A4.1
;
; Test A4.2 -- PSW write/read via RTI/RTT ++++++++++++++++++++++++++++
; Verifies cm and rset privilege escalation protection
; Verifies cm, rset and priority privilege escalation protection
;
ta0402:
;
@ -1083,6 +1090,50 @@ ta0402:
bic #cpnzvc,4110$ ; discard NZVC
hcmpeq #cp.cmu!cp.ars,4110$ ; check expected 2nd PS (same !)
;
; part 5: from cm=k,pri=0: set pri=6 (fine!) -------------------------
;
rtijmp #cp.pr6,#5100$ ; new PS cm=k,pri=6
;
5100$: hcmpeq #cp.pr6,cp.psw ; check PS
clr cp.psw ; back to normal
mov #stack,sp ; restore stack
;
; part 6: from cm=s,pri=0: set pri=6 (fail!) -------------------------
;
rtijmp #cp.cms,#6200$ ; PS cm=s
;
.word 0,0 ; temporary stack
6100$: .word 0 ; saved PS
;
; now in supervisor mode, try sneak to cm=s and pri=6
6200$: mov #6100$,sp ; set up stack
rtijmp #cp.cms!cp.pr6,#6300$ ; new PS cm=s and pri=6
;
; lands here after rti from cm=s
6300$: mov cp.psw,6100$ ; save PS
clr cp.psw ; back to kernel
mov #stack,sp ; restore stack
bic #cpnzvc,6100$ ; discard NZVC
hcmpeq #cp.cms,6100$ ; check expected PS, pri stays 0 !
;
; part 7: from cm=u,pri=0: set pri=6 (fail!) -------------------------
;
rtijmp #cp.cmu,#7200$ ; PS cm=u
;
.word 0,0 ; temporary stack
7100$: .word 0 ; saved PS
;
; now in user mode, try sneak to cm=u and pri=6
7200$: mov #7100$,sp ; set up stack
rtijmp #cp.cmu!cp.pr6,#7300$ ; new PS cm=u and pri=6
;
; lands here after rti from cm=u
7300$: mov cp.psw,7100$ ; save PS
clr cp.psw ; back to kernel
mov #stack,sp ; restore stack
bic #cpnzvc,7100$ ; discard NZVC
hcmpeq #cp.cmu,7100$ ; check expected PS, pri stays 0 !
;
9999$: iot ; end of test A4.2
;
; Test A4.3 -- RTI/RTT tbit basics +++++++++++++++++++++++++++++++++++
@ -1129,7 +1180,26 @@ ta0404: mov #vhtbpt,v..bpt ; BPT handler
mov #vhttrp,v..trp ; TRAP handler
clr v..trp+2 ; run at PR0 (no PIRQ competition)
;
; part 1: simple instruction sequence --------------------------------
; part 0: traced TRAP that clears tbit ------------------------------
; Checks that a traced TRAP which loads a PS with tbit=0 does not trap.
; Tested separately because the trace zone exit of subsequent tests uses TRAP.
;
mov #200$,r5
mov #300$,vhtend
rtijmp #cp.t,#100$ ; RTI used to see bpt before 1st inst
;
100$: trap 100
110$:
;
200$: .word 0,0
.word 0,0
.word -1,-1
;
300$: htinit 200$,2. ; expect 2 items
htitem #014,#100$ ; bpt before trap (none after !)
htitem #036,#110$ ; final trap
;
; part 1: simple instruction sequence -------------------------------
; Checks that trace traps are taken for instructions which allow prefetch
; and that the destination PC is saved for flow control instructions.
;
@ -1175,7 +1245,7 @@ ta0404: mov #vhtbpt,v..bpt ; BPT handler
htitem #014,#1180$ ; bpt after jmp (PC is jmp target)
htitem #036,#1190$ ; final trap
;
; part 2: tracing of trap instructions (EMT tested) ------------------
; part 2: tracing of trap instructions (EMT tested) -----------------
;
2000$: mov #2200$,r5
mov #2300$,vhtend
@ -1201,12 +1271,19 @@ ta0404: mov #vhtbpt,v..bpt ; BPT handler
htitem #014,#2130$ ; bpt after nop
htitem #036,#2140$ ; final trap
;
; part 3: tbit vs interrupt precedence (via PIRQ) --------------------
; part 3: tbit vs interrupt precedence (via PIRQ) -------------------
; Checks that interrupt has precedence over tbit traps.
; Skipped on SimH which implements J11 precedence (tbit over interrupt).
; On an 11/70 and the w11 the PIRQ handler is sprung first, and sees as return
; address the PC after the movb that triggers the PIRQ. The tbit trap is
; sprung on return of the PIRQ handler, and also sees as return address
; the PC after the movb that triggers the PIRQ.
; On a J11 and on SimH the BPT handler is sprung first, and sees as return
; address the PC after the movb that triggers the PIRQ. After the signature
; is logged, the handler lowers priority to 0; at that point the PIRQ
; handler is sprung and sees as return address the PC after the clr cp.psw
; that enabled the PIRQ interrupt.
;
3000$: cmpb systyp,#sy.sih ; skip on SimH (different service order)
beq 4000$
3000$: mov #1,vhtbp0 ; ask BPT to lower priority
cmpb systyp,#sy.e11 ; skip on e11 (different service order)
beq 4000$
;
@ -1219,23 +1296,30 @@ ta0404: mov #vhtbpt,v..bpt ; BPT handler
3120$:
;
3200$: .word 0,0
.word 0,0
.word 0,0
.word 0,0
.word 0,0
.word -1,-1
;
3300$: htinit 3200$,3. ; expect 3 items
cmpb systyp,#sy.sih ; different checks for SimH service order
beq 3310$
; checks for w11
htitem #240,#3110$ ; pirq (with return address)
htitem #014,#3110$ ; bpt after movb
htitem #036,#3120$ ; final trap
br 4000$
; checks for SimH
3310$: htitem #014,#3110$ ; bpt after movb first
htitem #240,#vhtbpe ; pirq from bpt handler
htitem #036,#3120$ ; final trap
;
; part 4: traced WAIT and tbit ---------------------------------------
; part 4: traced WAIT and tbit --------------------------------------
; Checks that traced WAIT does not produce tbit trap.
; Checks that SPL does not produce a tbit trap.
; Skipped on SimH which implements J11 semantics for SPL, WAIT, precedence.
;
4000$: cmpb systyp,#sy.sih ; skip on SimH
4000$:
cmpb systyp,#sy.sih ; skip on SimH
beq 5000$
cmpb systyp,#sy.e11 ; skip on e11 (for precedence)
beq 5000$
@ -1262,7 +1346,7 @@ ta0404: mov #vhtbpt,v..bpt ; BPT handler
htitem #014,#4120$ ; bpt after wait
htitem #036,#4130$ ; final trap
;
; part 5: WAIT and SPL in user mode ----------------------------------
; part 5: WAIT and SPL in user mode ---------------------------------
; Checks that WAIT and SPL in user mode are traced (are nop)
;
5000$: mov #5200$,r5
@ -1284,7 +1368,7 @@ ta0404: mov #vhtbpt,v..bpt ; BPT handler
htitem #014,#5120$ ; bpt after spl
htitem #036,#5130$ ; final trap
;
; part 6: tbit trap after continuation over s_idle -------------------
; part 6: tbit trap after continuation over s_idle ------------------
; Checks instructions that complete via s_idle are properly traced
; Four instructions branch at completion to s_idle
; WAIT s_op_wait (after interrupt)
@ -1323,7 +1407,7 @@ ta0404: mov #vhtbpt,v..bpt ; BPT handler
clr @#0
clr @#2
;
; part 7: no tbit trap after an abort --------------------------------
; part 7: no tbit trap after an abort -------------------------------
; Checks that an aborted instruction does not tbit trap.
; Uses a bus timeout as abort reason (access to 160000).
;
@ -1344,7 +1428,79 @@ ta0404: mov #vhtbpt,v..bpt ; BPT handler
htitem #014,#7110$ ; bpt after 1st clr
;
mov #v..iit+2,v..iit ; restore
;
; part 8: traced RTI that clears tbit -------------------------------
; Checks that a traced RTI loading a PS with tbit=0 does not trap.
; Skipped on SimH, which does trap (confirmed bug, to be fixed).
;
8000$: cmpb systyp,#sy.sih ; skip on SimH (traced RTI traps)
beq 9000$
;
push2 #0,#8300$ ; frame used by RTI
mov #8200$,r5
rttjmp #cp.t,#8100$
;
8100$: nop ; will tbit trap
8110$: rti ; will not tbit trap (new PS tbit=0)
;
8200$: .word 0,0
.word -1,-1
;
8300$: htinit 8200$,1. ; expect 1 item
htitem #014,#8110$ ; bpt after nop
;
; part 9: EMT that sets tbit ----------------------------------------
; Checks that a vector flow loading a PS with tbit=1 does trap.
; Test case is the EMT instruction.
; Skipped on SimH which does not trap (tbd whether that is J11 behavior or bug).
;
9000$: cmpb systyp,#sy.sih ; skip on SimH (RTI traps, EMT not)
beq 10000$
;
mov #9100$,v..emt
mov #cp.t,v..emt+2
mov #9200$,r5
;
emt 100
br 9300$
;
9100$: nop ; will tbit trap
9110$: rti ; will not tbit trap (new PS tbit=0)
;
9200$: .word 0,0
.word 0,0
.word -1,-1
;
9300$: htinit 9200$,2. ; expect 2 items
htitem #014,#9100$ ; bpt at entry of EMT handler
htitem #014,#9110$ ; bpt after nop
;
; part 10: PIRQ that sets tbit ---------------------------------------
; Checks that a vector flow loading a PS with tbit=1 does trap.
; Test case is a PIRQ interrupt.
; Skipped on SimH which does not trap (tbd whether that is J11 behavior or bug).
;
10000$: cmpb systyp,#sy.sih ; skip on SimH (RTI traps, PIRQ not)
beq 9999$
;
mov #10100$,v..pir
mov #cp.pr7!cp.t,v..pir+2
mov #10200$,r5
;
movb #bit01,cp.pir+1 ; request PIRQ 1
br 10300$
;
10100$: clr cp.pir ; will tbit trap
10110$: rti ; will not tbit trap (new PS tbit=0)
;
10200$: .word 0,0
.word 0,0
.word -1,-1
;
10300$: htinit 10200$,2. ; expect 2 items
htitem #014,#10100$ ; bpt at entry of PIRQ handler
htitem #014,#10110$ ; bpt after clr
;
; restore ------------------------------------------------------------
;
mov #v..bpt+2,v..bpt ; restore v..bpt to catcher
@ -1601,11 +1757,17 @@ vhuhlt: halt
; vhtbpt - handler for BPT tracing +++++++++++++++++++++++++++++++++++++++++++
; Writes signature to data area (ptr in r5).
; Signature is vector address + return PC (PC to test proper context).
; If vhtbp0 is non-zero, the handler lowers priority to PRI=0 before RTT.
;
vhtbpt: htstge (r5) ; r5 at fence ?
mov #014,(r5)+ ; track BPT vector
mov (sp),(r5)+ ; track PC
rtt ; end with RTT (!)
tst vhtbp0 ; should PRI be lowered ?
beq vhtbpe
clr vhtbp0 ; and clear flag
clr cp.psw ; now PRI=0, immediate effect
vhtbpe: rtt ; end with RTT (!)
vhtbp0: .word 0
;
; vhtemt - handler for EMT tracing +++++++++++++++++++++++++++++++++++++++++++
; Writes signature to data area (ptr in r5).

View File

@ -1,10 +1,10 @@
; $Id: cpu_mmu.mac 1332 2022-12-21 11:56:32Z mueller $
; $Id: cpu_mmu.mac 1337 2022-12-26 11:14:21Z mueller $
; SPDX-License-Identifier: GPL-3.0-or-later
; Copyright 2022- by Walter F.J. Mueller <W.F.J.Mueller@gsi.de>
;
; Revision History:
; Date Rev Version Comment
; 2022-12-17 1331 1.0 Initial version
; 2022-12-25 1337 1.0 Initial version
; 2022-07-24 1262 0.1 First draft
;
; Test CPU MMU: all aspects of the MMU
@ -1768,7 +1768,7 @@ td0101:
mov mmr2,(sp) ; roll back PC
bic #<m0.anr!m0.ale!m0.ard>,mmr0 ; clear abort bits
rtt ; return and restart instruction
; MMU abort reruns must use rtt to avoid a
; MMU abort re-runs must use rtt to avoid a
; spurious tbit trap in case of a traced instruction
;
3000$: .word 0 ; save mmr0
@ -1877,6 +1877,12 @@ td0201: tstb systyp ; skip if not on w11
; It expects an abort with ico=1, an abort with ico=0 and a trap.
; In these three cases, first the expected environment is checked and
; after that the corrective action is taken.
; Notes:
; - the case of simultaneous MMU abort and trap (abort flags set and
; m0.trp newly set) is not handled in this simple demonstrator.
; - the code uses m*pd even though D-space is not enabled.
; - the code uses RTT to return for an instruction re-execution. That avoids
; a spurious tbit trap in case the aborted instruction was traced.
;
3000$: htstge (r5) ; r5 at fence ?
mov #250,(r5)+ ; trace
@ -1906,7 +1912,7 @@ td0201: tstb systyp ; skip if not on w11
;
; use MMR1 to correct SM SP
mov mmr1,r1 ; get mmr1
mfpi sp ; get SM sp
mfpd sp ; get SM sp
pop r2
cmp #^b0000000011110110,r1 ; mmr1: sp -2
bne 3100$
@ -1926,10 +1932,10 @@ td0201: tstb systyp ; skip if not on w11
; move stack frame from kernel to target mode (SM)
;
sub #4,r2 ; final SM SP
mtpi (r2) ; move SM frame PC
mtpi 2(r2) ; move SM frame PS
mtpd (r2) ; move SM frame PC
mtpd 2(r2) ; move SM frame PS
push r2
mtpi sp ; update SM SP
mtpd sp ; update SM SP
;
; use MMR2 to start target handler (PIRQ)
;
@ -1955,7 +1961,9 @@ td0201: tstb systyp ; skip if not on w11
bis #md.arw,sipdr2 ; code page read-writable
mov mmr2,(sp) ; point to failed instruction
bic #m0.anr!m0.ale!m0.ard,mmr0 ; clear abort flags
rtt ; re-run instruction
rtt ; re-run instruction; use rtt to avoid a
; spurious tbit trap in case of a traced
; instruction
;
; handle MMU trap --------------------------------
; Expect trap from cp.pir register access
@ -1982,13 +1990,13 @@ td0201: tstb systyp ; skip if not on w11
; move stack frame from handler to kernel stack
;
add #4,sp ; drop return frame
mfpi sp ; get SM SP
mfpd sp ; get SM SP
pop r2
mfpi 2(r2) ; move frame PS
mfpi (r2) ; move frame PC
mfpd 2(r2) ; move frame PS
mfpd (r2) ; move frame PC
add #4,r2 ; correct SM SP
push r2
mtpi sp ; update SM SP
mtpd sp ; update SM SP
rti ; finish rti from handler
;
; finally restore ------------------------------------------