Alright, I'm considering coding a scripting language - do I want to code a VM, or do I want to have built-in functions which do the same things, and what are the advantages/disadvantages?

Here is my design so far (before I get further into a hole with it):

R1 thru R15 (15 registers (32-bit), each will be on the stack for manipulation purposes)
IP - Instruction Pointer
SP - Stack Pointer (15 dword + = actual location in cpu stack)
RL1 thru RL8 (8 floating point registers for use in FPU (REAL4 size))
FL1 thru FL8 (8 floating point registers for use in FPU (REAL8 size))
eax thru edx (direct access registers (DARs), but using them within scripting could be inefficient with all the pushing/popping/stack fixing)

simple instructions:

1) What are the advantages or disadvantages of using a VM vs function calls that do the same thing? I guess if there is no organization the project could blow up into bloatware.

2) Why not use the cpu's registers and stack - is it to protect the cpu from myself?

3) Something such as:

:LoadBitmap 0,"test1.bmp";

It would call the Win32 API function LoadBitmap inside the handler and copy the return value from eax to R1, the VM's eax equivalent... it would probably do ten things within this one function. Is this the basic idea I want to have?

Posted on 2004-03-15 12:37:12 by drarem

The main advantage of function calls is speed. You avoid the cost of decoding every instruction. Threaded code can give you the flexibility of a VM, but not necessarily in the most compact form.
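
To make the threaded-code idea concrete, here is a minimal C sketch (not anything from the original design). The "program" is just a list of native function addresses, so the run loop fetches and calls with no opcode decoding. The names Script, Word, run_threaded and the three sample words are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical script state: one accumulator is enough for the sketch. */
typedef struct { int acc; } Script;

/* Each "instruction" is an ordinary native function. */
typedef void (*Word)(Script *);

static void load_two(Script *s)  { s->acc = 2; }
static void add_three(Script *s) { s->acc += 3; }
static void times_two(Script *s) { s->acc *= 2; }

/* Run a NULL-terminated list of function addresses:
   no opcode decoding, just fetch an address and call it. */
static int run_threaded(const Word *code)
{
    Script s = { 0 };
    while (*code != NULL)
        (*code++)(&s);
    return s.acc;
}

/* Example program: (2 + 3) * 2 */
static const Word demo_program[] = { load_two, add_three, times_two, NULL };
```

Calling run_threaded(demo_program) returns 10. Note that every entry costs a full pointer (4 or 8 bytes), which is the "not the most compact form" trade-off.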

The main advantage of a VM is portability and flexibility. You can make the instruction set as simple, complex, or compact as you want. The cost is that you end up building a new VM interpreter, or a new VM-to-machine-code translator, for every new processor you want to run your script on.
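
For comparison, a rough C sketch of the VM route: a byte-encoded program run by a decode loop. The opcode names and encoding are invented for the example, and bounds checking is left out to keep it short. Only this loop would need porting to a new processor; the bytecode stays the same.

```c
#include <stdint.h>

enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };  /* hypothetical opcodes */

/* Interpret a byte-encoded program on the VM's own stack (not the CPU stack). */
static int32_t run_bytecode(const uint8_t *code)
{
    int32_t stack[64];
    int sp = 0;                 /* VM stack pointer */
    const uint8_t *ip = code;   /* VM instruction pointer */

    for (;;) {
        switch (*ip++) {        /* the decode step that direct calls avoid */
        case OP_PUSH: stack[sp++] = (int8_t)*ip++; break;  /* 1-byte operand */
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        default:      return -1;                           /* unknown opcode */
        }
    }
}

/* Example program: (2 + 3) * 2, nine bytes in total */
static const uint8_t demo_bytecode[] = {
    OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 2, OP_MUL, OP_HALT
};
```

run_bytecode(demo_bytecode) returns 10. The program is compact, but every instruction pays for the switch.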

Ease of code generation is generally a highly desirable goal for VMs.


If portability is a concern, mixing the CPU and VM stack is not a good idea. The target processor might have a different stack model, or you may need to implement the VM with a high level language on the target platform.

Otherwise, it's a matter of how simple or complex you want your stack model to be.


For a scripting language, you probably want to emphasize ease of code generation, and the encoding of often-used complex tasks. For example, Perl has embedded within it a regular expression processor and an association table handler. PHP adds security features to that mix. VB hides the message loop and provides an event-oriented interface for windowing applications.
Posted on 2004-03-15 18:30:42 by tenkey
One more question.. I'm trying to initialize a structure, and I keep getting 'line too long' error - is there a way around this? I tried enclosing each line in '<>' but it didn't help. I could make one init function but this would be easier for me to maintain.

VPU struct
X1 dd 0
S1 db 8 dup(0)
V1 dd 0
VPU ends

VPU CHANKWARE <{ offset VMmov, 'mov', ID }>,
<{ offset VMlea, 'lea', ID }>,
<{ offset VMpush, 'push', ID }>,
<{ offset VMpop, 'pop', ID }>,
<{ offset VMpushaq, 'pushaq', ID }>,
<{ offset VMpopaq, 'popaq', ID }>,
<{ offset VMjmp, 'jmp', ID }>,
<{ offset VMjle, 'jle', ID }>,
<{ offset VMjge, 'jge', ID }>,
<{ offset VMjl, 'jl', ID }>,
<{ offset VMjg, 'jg', ID }>,
<{ offset VMje, 'je', ID }>,
<{ offset VMrol, 'rol', ID }>,
<{ offset VMror, 'ror', ID }>,
<{ offset VMshl, 'shl', ID }>,
<{ offset VMshr, 'shr', ID }>
Posted on 2004-03-17 08:10:23 by drarem
One of MASM's silly limitations... try this:

CHANKWARE VPU <offset VMmov, 'mov', ID>
VPU < offset VMlea, 'lea', ID >
VPU < offset VMpush, 'push', ID >
VPU < offset VMpop, 'pop', ID >
Posted on 2004-03-17 09:13:37 by f0dder
I would recommend that you study and understand the FORTH VM. It is a little odd, but it has tremendous power and has the source code (compared to Java).

The VM core is only 3-8 ASM instructions long (but it can take a long time until you understand how those 8 instructions work :D).

The VM only has 3-4 registers max: IP, SP, RP and sometimes an extra scratch register, but theoretically it can run with only IP and SP. I bet it can be implemented in days/hours on every new CPU.

Using the VM core and those 3 registers, the first 8-30 instructions have to be primitives (i.e. they need to be implemented in ASM or the native language of the target CPU); all the rest of FORTH is also written in FORTH (macro) ...
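
A rough C approximation of that structure, only as a sketch: a real Forth core does the "next" step in a few ASM instructions, which C cannot express directly. All names here (Word, VM, p_enter, run_word, the W_ words) are made up. The primitives are native functions, and the colon words are just lists of other words.

```c
#include <stddef.h>

typedef struct Word Word;
typedef struct VM VM;

struct Word {
    void (*code)(VM *, const Word *);  /* native handler */
    const Word *const *body;           /* colon words only: list of words */
};

struct VM {
    int data[32];               int sp;  /* data stack and SP */
    const Word *const *ret[32]; int rp;  /* return stack and RP */
    const Word *const *ip;               /* IP */
};

/* Primitives: the only part written in the native language. */
static void p_dup(VM *vm, const Word *w)  { (void)w; vm->data[vm->sp] = vm->data[vm->sp - 1]; vm->sp++; }
static void p_mul(VM *vm, const Word *w)  { (void)w; vm->sp--; vm->data[vm->sp - 1] *= vm->data[vm->sp]; }
/* Enter a colon word: save IP on the return stack, continue in its body. */
static void p_enter(VM *vm, const Word *w) { vm->ret[vm->rp++] = vm->ip; vm->ip = w->body; }
/* Leave a colon word: restore the saved IP. */
static void p_exit(VM *vm, const Word *w) { (void)w; vm->ip = vm->ret[--vm->rp]; }

static const Word W_DUP = { p_dup, NULL }, W_MUL = { p_mul, NULL }, W_EXIT = { p_exit, NULL };

/* : SQUARE DUP * ;   : FOURTH SQUARE SQUARE ;   (built from other words) */
static const Word *const square_body[] = { &W_DUP, &W_MUL, &W_EXIT };
static const Word W_SQUARE = { p_enter, square_body };
static const Word *const fourth_body[] = { &W_SQUARE, &W_SQUARE, &W_EXIT };
static const Word W_FOURTH = { p_enter, fourth_body };

/* Push one argument, execute a word, return the top of the data stack. */
static int run_word(const Word *w, int arg)
{
    VM vm = { {0}, 0, {0}, 0, NULL };
    vm.data[vm.sp++] = arg;
    w->code(&vm, w);                 /* a colon word saves the NULL IP here */
    while (vm.ip != NULL) {          /* the "next" loop: fetch word, run it */
        const Word *next = *vm.ip++;
        next->code(&vm, next);
    }
    return vm.data[vm.sp - 1];
}
```

run_word(&W_FOURTH, 3) returns 81. Only p_dup, p_mul, p_enter and p_exit are native; SQUARE and FOURTH are pure data.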

So even if I detect some Forth-ish style in your : (colon) definition... you obviously do not get it right, since you want to implement so many registers in your VM.

Simplify! The VM has depth of its own... no need to emulate a full CPU, or else you will create a VMware-like emulator and not a Virtual Machine.
Posted on 2004-03-17 13:03:11 by BogdanOntanu
Ask yourself what you are building and why. A VM for a scripting language - but for what purpose? This will affect the VM itself, but even more the functionality you implement as host code (a "shell utility" scripting language needs different stuff than a 3D game scripting language).

Keep it relatively simple. Your current stuff is starting to look a bit like an x86 spinoff... you might consider using DLLs with native code instead, it'll be even more flexible and run faster :) - keep the VM relatively simple, and have a look at some other architectures than x86.

Also, decide whether you need a "VM" or a "CPU". The border between the two is sorta vague, and even stuff that most people certainly see as "pure VM" like Java or .NET could be implemented as native CPUs.

Keeping things simple does not necessarily mean an instruction set with very simple instructions. If you want any kind of performance from a virtual machine, you'll want to have it relatively abstract/highlevel, with "longer" functions that can be implemented efficiently on the native machine. This also gives things like JITters a lot more playroom.

I.e., it would be more useful to have a strlen "instruction" in the VM and implement it with fast native code, than to implement it with simple VM instructions. This sounds like almost the complete opposite of Forth? Anyway, it will keep the VM programs less bloated & faster, at the expense of a larger runtime... but ideally (for a "real life" VM) the combined size of all programs should end up quite a bit larger than the runtime size, and if the programs are smaller and perform better - well, I can live with a somewhat larger runtime.
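
A small C sketch of the difference, with made-up opcodes: the same toy interpreter runs strlen once as a single native-backed instruction and once as a loop of simple VM instructions, counting how many instructions it had to decode.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical opcodes for this sketch only. */
enum { OP_STRLEN, OP_JZ_BYTE, OP_NEXT, OP_JMP, OP_HALT };

typedef struct {
    size_t length;      /* result */
    size_t dispatches;  /* how many VM instructions were decoded */
} Result;

static Result run_strlen_vm(const unsigned char *code, const char *s)
{
    Result res = { 0, 0 };
    const unsigned char *ip = code;

    for (;;) {
        res.dispatches++;
        switch (*ip++) {
        case OP_STRLEN:                      /* whole job in native code */
            res.length = strlen(s);
            break;
        case OP_JZ_BYTE:                     /* jump to operand if current byte is 0 */
            if (s[res.length] == '\0') ip = code + *ip; else ip++;
            break;
        case OP_NEXT:                        /* advance to the next character */
            res.length++;
            break;
        case OP_JMP:                         /* unconditional jump to operand */
            ip = code + *ip;
            break;
        case OP_HALT:
            return res;
        default:                             /* unknown opcode */
            res.length = (size_t)-1;
            return res;
        }
    }
}

/* One highlevel instruction... */
static const unsigned char native_prog[] = { OP_STRLEN, OP_HALT };
/* ...versus a loop of simple ones: 0: JZ_BYTE 5   2: NEXT   3: JMP 0   5: HALT */
static const unsigned char loop_prog[] = { OP_JZ_BYTE, 5, OP_NEXT, OP_JMP, 0, OP_HALT };
```

For the string "hello" both programs give length 5, but native_prog takes 2 dispatches and loop_prog takes 17 (three per character plus the final test and halt). The bytecode is also 2 bytes against 6.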

Ohyeah, you'll also have to consider how you want to program the VM. In VM-assembly, or rather through a highlevel/scripting language that gets compiled to VM bytecode? Would you be familiar with a purely stack-based VM, pure registers, or a mix? And what would be efficient for a compiler?

"has the source code (compared to Java)"

Aren't the full specs available for Java? At least enough to do a java->bytecode compiler, which should be quite some help when implementing it. Plus opensource java compilers, and what do I know.
Posted on 2004-03-17 18:28:28 by f0dder