why a limited number of registers
- a large number of registers may increase the clock cycle time, since the delay of the addressing circuit (mux/demux) grows with the register count.
- registers communicate with each other only through operators (the ALU), and with main memory only through load/store instructions. the register file acts as a small cache, but without hardware hit/miss management.
- optimizing register use is key to both performance and energy efficiency.
- the ALU can also take an operand from the instruction itself (an immediate) for computation with a constant.
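The points above can be sketched as a toy load/store machine. This is only an illustration under assumptions: the instruction names (`lw`, `sw`, `add`, `addi`), the tuple encoding, and the 8-register file are hypothetical choices, not something these notes specify.

```python
# Toy load/store machine (hypothetical encoding, not a real ISA).
NUM_REGS = 8  # assumed size of the register file


def run_load_store(program, memory):
    """Execute instruction tuples (op, a, b, c) in order; return the registers."""
    regs = [0] * NUM_REGS
    for op, a, b, c in program:
        if op == "lw":      # load: regs[a] = memory[regs[b] + c]
            regs[a] = memory[regs[b] + c]
        elif op == "sw":    # store: memory[regs[b] + c] = regs[a]
            memory[regs[b] + c] = regs[a]
        elif op == "add":   # ALU on two registers: regs[a] = regs[b] + regs[c]
            regs[a] = regs[b] + regs[c]
        elif op == "addi":  # ALU with a constant taken from the instruction
            regs[a] = regs[b] + c
        else:
            raise ValueError(f"unknown op {op!r}")
    return regs


memory = [10, 20, 0]
program = [
    ("lw", 1, 0, 0),    # r1 = memory[0]
    ("lw", 2, 0, 1),    # r2 = memory[1]
    ("add", 3, 1, 2),   # r3 = r1 + r2 (the ALU only sees registers)
    ("addi", 3, 3, 5),  # r3 = r3 + 5 (the constant comes from the instruction)
    ("sw", 3, 0, 2),    # memory[2] = r3
]
regs = run_load_store(program, memory)
```

Note that the sketch has no instruction that adds two memory cells directly: every value passes through a register first, which is why careful register allocation matters.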
instruction format
- to use the instruction bits efficiently, the meaning of a later bit field is determined by an earlier field. normally the op field indicates how the remaining bits are divided into fields.
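As a rough illustration of the op field selecting the layout of the remaining bits, here is a minimal decoder sketch. It assumes a MIPS-like 32-bit encoding with R, I and J formats; the notes do not name an ISA, so the field widths and the `decode` helper are assumptions made for the example.

```python
def decode(word):
    """Split a 32-bit instruction word into fields chosen by its op field."""
    op = (word >> 26) & 0x3F  # top 6 bits: always the op field
    if op == 0:
        # R format (assumed): three register numbers, shift amount, function code
        return {
            "format": "R",
            "op": op,
            "rs": (word >> 21) & 0x1F,
            "rt": (word >> 16) & 0x1F,
            "rd": (word >> 11) & 0x1F,
            "shamt": (word >> 6) & 0x1F,
            "funct": word & 0x3F,
        }
    if op in (2, 3):
        # J format (assumed): the remaining 26 bits are one jump target field
        return {"format": "J", "op": op, "addr": word & 0x3FFFFFF}
    # I format (assumed): two registers plus a 16-bit immediate.
    # The immediate is left unsigned here; sign extension is omitted.
    return {
        "format": "I",
        "op": op,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,
    }


r_fields = decode(0x02324020)  # op = 0, so the low 26 bits split into 5 fields
i_fields = decode(0x22280005)  # op = 8, so the low 16 bits are one immediate
j_fields = decode(0x08000100)  # op = 2, so the low 26 bits are one address
```

The same 26 low bits are read three different ways, and only the op field tells the decoder which reading applies.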
conditional branches and jumps
- to implement the switch and loop structures of an algorithm, in other words to reuse hardware resources in a time-multiplexed fashion, we use jump instructions to reset the program counter.
- procedures and functions, a black-box model for structuring programs, also need jumps, but the program counter must return to the instruction after the call site when the procedure/function is done. the interface is implemented by sharing registers.
- procedures/functions can be run in a pipeline model (each core running one procedure/function) with shared memory between the cores (possibly sending a done message between cores for better efficiency). if there is explicit parallelism, all the procedures/functions can run concurrently and synchronize, again by done messages. note that when a core finishes early, instead of waiting for the done message from the previous core, it can switch to another thread to keep itself busy.
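The jump and procedure-call mechanics above can be sketched with a toy interpreter that keeps an explicit program counter. Everything here is an assumption for illustration: the instruction names (`beq`, `j`, `jal`, `jr`, `halt`), the tuple encoding, and the register roles (`A0` for the argument, `V0` for the result, `RA` for the return address) are hypothetical, and the multi-core pipeline case is not covered.

```python
# Hypothetical register roles for the sketch.
ZERO, A0, V0, RA = 0, 1, 2, 7


def run_with_pc(program):
    """Run instruction tuples (op, a, b, c) with an explicit program counter."""
    regs = [0] * 8
    pc = 0
    while pc < len(program):
        op, a, b, c = program[pc]
        pc += 1                      # default: fall through to the next instruction
        if op == "addi":
            regs[a] = regs[b] + c
        elif op == "add":
            regs[a] = regs[b] + regs[c]
        elif op == "beq":            # conditional branch: reset pc if equal
            if regs[a] == regs[b]:
                pc = c
        elif op == "j":              # unconditional jump
            pc = a
        elif op == "jal":            # call: save the return address, then jump
            regs[RA] = pc
            pc = a
        elif op == "jr":             # return: jump to the address in a register
            pc = regs[a]
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown op {op!r}")
    return regs


program = [
    ("addi", A0, ZERO, 4),  # 0: argument n = 4, passed in a shared register
    ("jal", 3, 0, 0),       # 1: call the procedure at address 3
    ("halt", 0, 0, 0),      # 2: execution resumes here after the return
    ("addi", V0, ZERO, 0),  # 3: procedure entry: total = 0
    ("beq", A0, ZERO, 8),   # 4: loop test: if n == 0, leave the loop
    ("add", V0, V0, A0),    # 5: total += n
    ("addi", A0, A0, -1),   # 6: n -= 1
    ("j", 4, 0, 0),         # 7: jump back, reusing instructions 4 to 6
    ("jr", RA, 0, 0),       # 8: return to the saved address
]
regs = run_with_pc(program)
```

The loop body is stored once and executed four times, which is the time-multiplexed reuse described above, and the caller and the procedure exchange the argument and the result only through the shared registers.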