Somewhere in my OS I need to divide a number by 160 and retrieve the remainder. This might happen a lot, so I thought of an alternative to the DIV instruction and wrote the code below, in which EAX holds the numerator and, after the computations, contains the remainder. I was wondering if any of you could suggest a better and perhaps faster way. (The code takes 12 clock cycles to execute on my PIII 800 MHz.)

```
                                  ; Input -> EAX = Numerator
  MOV     EBX , EAX               ; Save the Numerator temporarily in EBX
  MOV     ECX , 0xCCCCCCCD        ; Reciprocal multiplier for 5 (2^34 / 5, rounded up)
  MUL     ECX                     ; EDX:EAX = EAX * ECX, so EDX SHR 2 = Numerator DIV 5
  MOV     EAX , EBX               ; EAX is the Numerator again
  SHR     EDX , 0x00000002        ; SHR EDX 0x07 & SHL EDX 0x05, Step 1
  AND     EDX , 0xFFFFFFE0        ; SHR EDX 0x07 & SHL EDX 0x05, Step 1
  LEA     EDX ,                   ; Multiply EDX by 160, Step 2
  SUB     EAX , EDX               ; EAX = Remainder of (EAX DIV 160)
                                  ; Output <- EAX = Remainder
```
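For checking the arithmetic outside the kernel, the same sequence can be modelled in C. This is only a sketch: `mod160` is a hypothetical name, and the final multiply by 5 is an assumption about what the LEA step computes, inferred from its "multiply by 160" comment.

```c
#include <stdint.h>

/* Hypothetical helper: remainder of n / 160 without a divide instruction. */
static uint32_t mod160(uint32_t n)
{
    /* High dword of the 64-bit product, i.e. what MUL leaves in EDX. */
    uint32_t q = (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 32);

    q >>= 2;            /* q = n / 5                                   */
    q &= 0xFFFFFFE0u;   /* clear the low 5 bits: q = (n / 160) * 32    */
    q += q * 4;         /* assumed LEA step, times 5: q = (n/160)*160  */
    return n - q;       /* remainder                                   */
}
```

Comparing `mod160(n)` against `n % 160u` over a range of inputs is a quick way to confirm the constant and the shift counts.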
Posted on 2007-03-18 03:54:55 by XCHG

What situation requires dividing by 160?
Posted on 2007-03-18 04:30:28 by SpooK
One of the procedures that uses this computation is the __CarriageReturn procedure in my video driver. I have defined a global variable called VideoCursor, which holds the byte offset from the start of the video segment to the current cursor position, so that when I write a character to the screen I know where it should go. When the user presses Enter, I divide this value by 160 (160 bytes per row in 80*25 text mode) and take the remainder. Subtracting the remainder from the cursor value gives the position of the first character of the current line.

The code that I have written for this procedure is:

```
  __CarriageReturn:
    ; void __CarriageReturn (void)
    PUSH    EAX                             ; Push the accumulator onto the stack
    PUSH    EBX                             ; Push the base index onto the stack
    PUSH    ECX                             ; Push the count register onto the stack
    PUSH    EDX                             ; Push the data register onto the stack
    MOV     EAX , DWORD PTR                 ; Move the value of the global variable into EAX
    TEST    EAX , EAX                       ; See if we are at the first row, first column
    JE      .EP                             ; If yes, there is no need to move to the beginning of the line
    MOV     EBX , EAX                       ; Save the original value in the base index temporarily
    MOV     ECX , 0xCCCCCCCD                ; The reciprocal multiplier for EAX
    NOP                                     ; Prevent dependence chain
    MUL     ECX                             ; EDX:EAX = EAX * ECX, so EDX SHR 2 = original EAX DIV 5
    SHR     EDX , 0x00000002                ; SHR EDX 0x07 then SHL EDX 0x05 = SHR EDX 0x02 plus the AND below
    AND     EDX , 0xFFFFFFE0                ; Clear bits 0 through 4 (inclusive): EDX = (EAX DIV 160) * 32
    MOV     EAX , EBX                       ; EAX holds the original number now
    LEA     EDX ,                           ; EDX = EDX * 160 (Step 2)
    SUB     EAX , EDX                       ; EAX = EAX Mod 160
    SUB     DWORD PTR  , EAX                ; Subtract the remainder from the video cursor, beginning of the line
    .EP:                                    ; End of the procedure
      POP     EDX                           ; Restore the data register
      POP     ECX                           ; Restore the count register
      POP     EBX                           ; Restore the base index
      POP     EAX                           ; Restore the accumulator
    RET                                     ; Return to the calling procedure, no parameters to remove
```
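Stripped of the register handling, the procedure reduces to one line of arithmetic. A minimal C sketch of that step follows; `BYTES_PER_ROW` and `line_start` are hypothetical names, and the cursor is passed in rather than read from the global.

```c
#include <stdint.h>

#define BYTES_PER_ROW 160u   /* 80 columns * 2 bytes (character + attribute) */

/* Hypothetical helper: byte offset of the first cell on the cursor's row. */
static uint32_t line_start(uint32_t cursor)
{
    /* With a constant divisor, optimizing compilers typically emit a
       reciprocal multiply here rather than a DIV. */
    return cursor - (cursor % BYTES_PER_ROW);
}
```

A cursor of 0 falls out naturally (0 % 160 is 0), so the early-exit test in the assembly is a shortcut rather than a correctness requirement.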
Posted on 2007-03-18 04:59:32 by XCHG
Sounds like you would be better off keeping track of X/Y instead of an absolute calculation. You can always compute the offset from X and Y when needed.

```
; ebx = x
; ecx = y
xor ebx, ebx
inc ecx
```

This comes in even more handy when you have to deal with "scrolling" the screen and whatnot ;)
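A rough C sketch of the X/Y approach, for comparison with the single-offset version. All names here (`struct cursor`, `carriage_return`, `line_feed`, `cursor_offset`) are hypothetical, and scrolling is deliberately left out.

```c
#include <stdint.h>

enum { COLS = 80, ROWS = 25, BYTES_PER_CELL = 2 };

struct cursor {
    uint32_t x;   /* column, 0..COLS-1 */
    uint32_t y;   /* row,    0..ROWS-1 */
};

/* CR: back to column 0, no division needed. */
static void carriage_return(struct cursor *c)
{
    c->x = 0;
}

/* LF: next row; stays on the last row here, where a real driver
   would scroll instead. */
static void line_feed(struct cursor *c)
{
    if (c->y + 1 < ROWS)
        c->y++;
}

/* Byte offset into video memory, computed only when a cell is accessed. */
static uint32_t cursor_offset(const struct cursor *c)
{
    return (c->y * COLS + c->x) * BYTES_PER_CELL;
}
```

The trade-off is visible in the code: CR and LF become trivial, while every write pays for one multiply and add in `cursor_offset`.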
Posted on 2007-03-18 05:38:35 by SpooK
Position calculation for your text-mode buffer is not going to be a speed-critical piece of code. You're wasting your time on micro-optimizations when that time could be better spent elsewhere.
Posted on 2007-03-18 10:59:57 by f0dder

Sounds like you would be better off keeping track of X/Y instead of an absolute calculation. You can always compute the offset from X and Y when needed.

I would agree with that - with a CR you would just set x=0, and with an LF, y=y+1.

You could also use a lookup table filled with line start addresses. (26*4=only 104 bytes for the table)
Posted on 2007-03-19 03:12:59 by sinsi
I don't know. With a relative cursor position like the one I am using, you only access one global variable. To print a character, for example, you add 0x00000002 to it for having moved forward one character (plus one attribute byte). With separate X and Y, you have to read both, do the calculation on both, and store both back to memory. This is also a problem when applying backspaces to the screen, because you have to calculate more than a relative address. I tried implementing this method today, ran into all these problems, and thought maybe I shouldn't do it.

Below is my yet-uncommented Write Character procedure.

```
  __WriteChar:
    ; void __WriteChar (DWORD TheChar) ; StdCall;
    PUSH    EAX
    PUSH    EBX
    PUSH    EBP
    MOV     EBP , ESP
    MOV     EAX , DWORD PTR
    MOV     AH , 0x0F
    MOV     EBX , VIDEOSEGMENT
    ADD     EBX , DWORD PTR
    CMP     EBX , VIDEOSEGMENT + VIDEOBYTECOUNT
    JB      .NoScrolls
    INVOKE  __ScrollUp
    SUB     EBX , 160
    .NoScrolls:
      MOV     WORD PTR  , AX
      ADD     DWORD PTR  , 02h
      POP     EBP
      POP     EBX
      POP     EAX
    RET     0x04
```

But if I had separate variables for X and Y, I would have to multiply Y by 160 (80 columns of 2 bytes each), multiply X by 2, add the segment to the result and all that stuff. I would really like to hear some opinions about this. I would implement the X and Y approach if I knew the advantages. My Scroll procedure is also easy to implement with the current approach. Thank you guys in advance.
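To make the comparison concrete, here is a C model of the single-offset write path described above. It is a sketch under assumptions: `screen` is an ordinary array standing in for video memory, `scroll_up` is a hypothetical stand-in for __ScrollUp, and the cursor is moved back one row inside `write_char`, which the original procedure may handle elsewhere.

```c
#include <stdint.h>
#include <string.h>

enum { COLS = 80, ROWS = 25, ROW_BYTES = COLS * 2, SCREEN_BYTES = ROW_BYTES * ROWS };

static uint8_t  screen[SCREEN_BYTES];   /* stand-in for video memory        */
static uint32_t video_cursor;           /* byte offset of the next cell     */

/* Hypothetical scroll: move rows 1..24 up by one and blank the last row. */
static void scroll_up(void)
{
    memmove(screen, screen + ROW_BYTES, SCREEN_BYTES - ROW_BYTES);
    for (uint32_t i = SCREEN_BYTES - ROW_BYTES; i < SCREEN_BYTES; i += 2) {
        screen[i]     = ' ';
        screen[i + 1] = 0x0F;
    }
}

/* Write one character with attribute 0x0F and advance by one cell. */
static void write_char(uint8_t ch)
{
    if (video_cursor >= SCREEN_BYTES) {   /* ran off the bottom of the screen */
        scroll_up();
        video_cursor -= ROW_BYTES;        /* assumption: cursor follows the scroll */
    }
    screen[video_cursor]     = ch;
    screen[video_cursor + 1] = 0x0F;
    video_cursor += 2;
}
```

In this form a backspace is just `video_cursor -= 2` with a bounds check, which is the simplicity the single-offset design is buying.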

Posted on 2007-03-19 04:13:12 by XCHG

But if I had separate variables for X and Y, I would have to multiply Y by 160 (80 columns of 2 bytes each), multiply X by 2, add the segment to the result and all that stuff.

Use a lookup table - there's your Y. Then shl X, add Y and there's your address. This can work for any text resolution - even 132x50 VESA modes.
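A sketch of the table idea in C, built at run time so the same code serves other text resolutions. `MAX_ROWS`, `row_start`, `build_row_table` and `cell_offset` are hypothetical names, and two bytes per cell is assumed throughout.

```c
#include <stdint.h>

enum { MAX_ROWS = 64 };                 /* hypothetical upper bound on rows */

static uint32_t row_start[MAX_ROWS];    /* byte offset of each row's first cell */

/* Fill the table once, e.g. when the video mode is set. */
static void build_row_table(uint32_t cols, uint32_t rows)
{
    uint32_t offset = 0;
    for (uint32_t y = 0; y < rows && y < MAX_ROWS; y++) {
        row_start[y] = offset;
        offset += cols * 2;             /* 2 bytes per cell */
    }
}

/* Table lookup for Y, shift for X, one add: no multiply on the hot path. */
static uint32_t cell_offset(uint32_t x, uint32_t y)
{
    return row_start[y] + (x << 1);
}
```

Calling `build_row_table(80, 25)` covers the standard mode, and `build_row_table(132, 50)` covers the wider mode mentioned above without touching `cell_offset`.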
Posted on 2007-03-19 04:41:28 by sinsi