Okay, here's a quick sketch of what I'm thinking of for the core architecture of the interpreter. It's not PDD'd yet, as I fully expect (hope, even) that the sillier parts of it will get ripped to shreds:

=head1 Stacks

The interpreter has multiple stacks, and they're all segmented. The push/pop opcodes handle allocating new stack segments when necessary. The segmentation's generally transparent, since very little besides the push/pop opcodes needs to know anything about the actual stack architecture.

The stacks are at least:

=over 4

=item Temp stack

For squirreling away the contents of individual registers

=item Register stack

For pushing the entire register file at once. There are four sets, one for each register type.

=item state stack

For the interpreter's internal state

=back

=head1 Registers

We have four sets. Each set has 64 members.

=over 4

=item PMC pointer

These registers point to PMCs

=item stringish pointer

These registers hold pointers to string structures and things like it. (bigint and bigfloat structs are the same, more or less)

=item integers

Integers. Mostly for temp work and the regex engine ops.

=item floats

Floats. (Duh! :) Mostly for temp math work. Potentially unused.

=back

=head1 Opcodes

Opcodes are all dispatched indirectly via an opcode function table. Each segment of bytecode (a segment roughly corresponding to a compilation unit--a precompiled module would be in its own segment, for example) has its own opcode function table.

Opcodes are all responsible for returning a pointer to the next opcode to execute. The interpreter can figure out the offset, but we won't--it's faster for the opcode functions to do the math. (No table lookups that way)

=head1 The opcode loop

This is a tight loop. All it does is call an opcode function, get back a pointer to the next opcode to execute, and check the event dispatch flag. Lather, rinse, repeat ad infinitum.
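
The dispatch scheme and loop described above might look roughly like the following in C. This is only a sketch under stated assumptions, not the actual design: the names (C<opcode_t>, C<op_func_t>, C<struct interp>, C<run_ops>, C<demo_add>), the three toy opcodes, the one-word-per-operand encoding, and the convention that an opcode returns NULL to stop the loop are all invented for illustration. The post itself does not say how the loop ends or how events are delivered, so the event handling here is a placeholder.

```c
#include <assert.h>
#include <stddef.h>

typedef long opcode_t;                  /* assumption: opcodes and operands are one word each */

struct interp;
/* Every opcode function returns a pointer to the next opcode to execute. */
typedef opcode_t *(*op_func_t)(opcode_t *pc, struct interp *in);

struct interp {
    long int_reg[64];                   /* the integer register set, 64 members */
    const op_func_t *op_table;          /* opcode function table of the current segment */
    int event_pending;                  /* event dispatch flag */
    int events_seen;                    /* placeholder for real event dispatch */
};

enum { OP_END, OP_SET_I, OP_ADD_I, OP_COUNT };   /* hypothetical opcode numbers */

/* end: returning NULL stops the loop (an assumption of this sketch) */
static opcode_t *op_end(opcode_t *pc, struct interp *in)
{
    (void)pc; (void)in;
    return NULL;
}

/* set_i Ix, constant: the opcode knows its own length, so it does the pointer math */
static opcode_t *op_set_i(opcode_t *pc, struct interp *in)
{
    in->int_reg[pc[1]] = pc[2];         /* no bounds checking in this sketch */
    return pc + 3;
}

/* add_i Ix, Iy, Iz */
static opcode_t *op_add_i(opcode_t *pc, struct interp *in)
{
    in->int_reg[pc[1]] = in->int_reg[pc[2]] + in->int_reg[pc[3]];
    return pc + 4;
}

static const op_func_t demo_table[OP_COUNT] = { op_end, op_set_i, op_add_i };

/* The opcode loop: call the function, take the returned pointer, check the flag. */
static void run_ops(struct interp *in, opcode_t *pc)
{
    assert(in != NULL && in->op_table != NULL);
    while (pc != NULL) {
        pc = in->op_table[*pc](pc, in); /* indirect dispatch through the table */
        if (in->event_pending) {
            in->event_pending = 0;
            in->events_seen++;
        }
    }
}

/* Assemble and run a tiny program: I0 = a; I1 = b; I2 = I0 + I1; end. */
long demo_add(long a, long b)
{
    opcode_t code[] = {
        OP_SET_I, 0, a,
        OP_SET_I, 1, b,
        OP_ADD_I, 2, 0, 1,
        OP_END
    };
    struct interp in = { {0}, demo_table, 0, 0 };

    run_ops(&in, code);
    return in.int_reg[2];
}
```

Calling C<demo_add(2, 3)> runs the four-opcode program and leaves 5 in integer register 2. Because each opcode function advances the pointer by its own fixed length, the loop never has to look up operand counts, which is the point made above about avoiding table lookups.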

=head1 Bytecode

Bytecode is both the on-disk representation of a perl program and the in-memory representation of a perl program.

The bytecode comes in three sections. The fixup and constants sections have absolute machine addresses in them (after the loader is finished with them) while the opcode section has none. This will allow us to mmap precompiled code and share at least some of it amongst multiple processes.

=over 4

=item fixup section

This section has pointers to various things that we need pointers to. On-disk the pointers are zeroed, and the loader will fix them up properly.

=item constant section

The constants section contains all the PMCs for the constants used in the code. The loader will patch up the various pointer bits as needed when the code is loaded in.

=item opcode section

This section contains the actual executable code (if stuff fed to an interpreter can be considered executable) for a perl program. It should be completely position independent, referring only to variables that are dynamically allocated, referred to by name, or in the constant section.

=back

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk