Nah, I dislike counting clock cycles and comparing the speeds of register access. Also, the concepts of AI and assembler don't really go well together. I also found it a bit troublesome (back when I tried my hand at VGA under DOS) that if you want to do something that's not supported at the hardware level, you have to optimize it really hard. Scrolling was a pain, and I imagine I wouldn't have been able to get even some unfilled polygons to display in 3D there at decent speeds. And the last thing I'd want to do is optimize mathematical routines in assembler.
To each his own, I guess.

I hope your stuff will work in VBAdvance.