| R. Lee, "Precision Architecture," IEEE Computer, Vol. 22(1), Jan. 1989, pp. 78--91. |
....[Figure labels: page node, page offset, virtual address] Subsequent work has verified the performance advantages of translating virtual addresses to physical addresses at the memory rather than at the processor ([Teller94], [Qui98], [Qui01]). A related idea is inverted page tables ([Houdek81], [Chang88], [Lee89]), which also feature a one-to-one correspondence between page table entries and physical pages. However, the intention of inverted page tables is simply to support large address spaces without devoting massive amounts of memory to traditional forward-mapped page tables. The page tables still ....
Ruby Lee, "Precision Architecture", IEEE Computer, January 1989, pp. 78-91.
....requirements on the granularity of translation (which should be large in order to maximise translation coverage) and protection (which should be small), it makes sense to consider separating the hardware mechanisms for protection and translation. One such approach is that used in the PA-RISC [33] and Itanium [34] processors. These tag TLB entries with a protection key, which is used to look up additional access information in a separate protection cache. On the Itanium this cache is a small (16 on the first-generation processor) fully associative set of protection key registers (PKRs) ....
Ruby B. Lee. Precision architecture. IEEE Comp., 22(1):78--91, Jan 1989.
....address during the load operation. A simple optimization, which is often missed by compilers, is possible as follows: 2. and Rc,Rc,0x3FC 3. load Rd,Rc(Rb) 3.2. Scaled indexed addressing This addressing mode is found in some existing Instruction Set Architectures (ISAs) such as the PA-RISC 2.0 [6]. The index scaling is also migrated into the load instruction, and this permits a single instruction saving per table lookup. 2. and Rc,Rc,0xFF 3. load.4 Rd,Rc(Rb) The 4 after the load indicates that a scaling for four bytes (that is, two bits) will be applied to the index. 3.3. ....
R.B. Lee, "Precision Architecture," IEEE Computer, Vol. 22, No. 1, pp. 78-91, January 1989.
....understood as clearly as subword arithmetic operations. They require moving several fields (subwords) in parallel. Conventional shift and rotate instructions move all the bits in a register by the same amount. Extract and deposit instructions, found in instruction set architectures like PA-RISC [17], move one field using one or two instructions. Early subword permutation instructions like mix and permute [4] in the PA-RISC MAX-2 multimedia instructions are a first attempt to find efficient and general-purpose subword permutation primitives. However, the sufficiency or efficiency of these ....
Ruby Lee, "Precision Architecture", IEEE Computer, Vol. 22, No. 1, Jan 1989, pp.78-91.
....and shifted to their destination locations using a series of logical AND, logical OR, and shift instructions [18]. For an arbitrary permutation of the bits in an n-bit word, this procedure requires as many as 4n instructions. If the architecture includes instructions such as extract and deposit [7], one can reduce the instruction count of this procedure to 2n, yet this method is still unacceptably slow. Lookup tables can also be employed to perform permutations with repetitions in software [18]. First, the n-bit source datum is divided into x groups of bits; each group is used to index a ....
R. Lee, "Precision Architecture," IEEE Computer, vol. 22, no. 1, pp. 78-91, January 1989.
....load operation. A simple optimization, which is often missed by compilers, is possible as follows. 1. shr Rc,Ra,22 2. and Rc,Rc,0x3FC 3. load Rd,Rc(Rb) 3.2. Scaled indexed addressing This addressing mode is found in some existing Instruction Set Architectures (ISAs) such as the PA-RISC 2.0 [6]. The index scaling is also migrated into the load instruction, and this permits a single instruction saving per table lookup. 1. shr Rc,Ra,24 2. and Rc,Rc,0xFF 3. load.4 Rd,Rc(Rb) The 4 after the load indicates that a scaling for four bytes (that is, two bits) will be applied to the index. ....
R.B. Lee, "Precision Architecture," IEEE Computer, Vol. 22, No. 1, pp. 78-91, January 1989.
....and main memory. This trend has also increased the penalty of cache misses, which occur when a requested block is not in cache memory. Another trend in cache memory is that the size of cache memory becomes increasingly large, so we now see computer systems with several megabytes of cache memory [17, 2, 11, 10]. In such a computer system, once a block is referenced, it is unlikely to be replaced from cache memory until a context switch occurs. However, when a context switch occurs, most of the cache state built by the old process is replaced by the new process. The above two trends indicate that the ....
R. B. Lee. Precision architecture. IEEE Computer, 22(1):78--91, Jan. 1989.
....for the sharing of data. This advantage was recognised by systems like IBM System/38 [HSH81], Monads [RA85], and Psyche [SLM90]. The first two systems used custom hardware to implement an address space that was bigger than the available 32 bits. It was not until the advent of the HP PA-RISC [Lee89], the MIPS R4000 [Hei91], and the DEC Alpha [Dig92] processors that off-the-shelf hardware became available that had a 64-bit addressing space. A 64-bit address space represents a vast increase in addressability over the previous 32 bits, making it possible to hold all the transient and ....
Ruby B. Lee. Precision architecture. IEEE Computer, 22(1):78--91, January 1989.
....page number, virtual-to-physical address translation is achieved by searching the page table to find a matching virtual page number. The physical frame number is deduced from the table index to the matching entry. The searching is usually performed with the aid of a hash table [IBM78, CM88, Lee89]. Such an IPT is illustrated in Figure 3.5. The IPT consists of two parts, the hash table (or hash anchor table) and the page directory (or frame table). The page directory has an entry for each physical frame, and is indexed by physical frame number. Each ....
Ruby B. Lee. Precision architecture. Computer, January 1989.
....compiler [74] which favors byte-addressable memory, and the performance measurements of Chapter 5 required five complete versions of the compiler. Second, despite any disadvantages, virtually all modern commercially produced microprocessor designs are byte addressed: [56], [45], [40], [39], [27], [52], [78], [28]. A notable exception is the DEC Alpha [14]. D16 full-word loads and stores allow base-displacement style addressing, while subword modes (byte and halfword) require the address to be in a register. The six-bit displacement field is signed. By requiring the offset to be word aligned, ....
Ruby B. Lee. Precision architecture. IEEE Computer, 22(1), January 1989.
....permh,0312 Ra,Ra ;Ra = a d b c (arbitrary permutation without repetition) [Figure 3: Permute Instruction Examples] 2.4 Other Useful PA-RISC features In addition to the MAX2 instructions described above, other existing features in the PA-RISC architecture are also very useful for media processing [12-14, 5]. Table 2 lists some of the more useful ones. The Shift Right Pair instruction allows two source registers to be concatenated and shifted together, with the resulting rightmost 64 bits placed in the destination register. This instruction facilitates use of arbitrarily aligned 64-bit quantities. ....
....in execution time is between 2.6 and 4.2 for our examples. The speedup would have been even greater, except that the code without MAX2 instructions was able to take advantage of PA-RISC Extract and Deposit instructions and the arithmetic nullify feature to shorten the instruction pathlength [12-14]. For most RISC architectures using shifts and ands rather than Extracts and Deposits, and without arithmetic nullification, the pathlength reduction for Block Match and Matrix Transpose would have been greater than 4, rather than just 2.6 (Figure 13). Proceedings of Multimedia Hardware ....
Lee R., "Precision Architecture", IEEE Computer, vol. 22 no. 1, Jan 1989, pp. 78-91.
....face. We had two reasons for this approach. First, programming a RISC machine in a true assembly language can be quite unpleasant. We expected this to be especially true for us, because the Titan's instruction set is reduced even further than most RISC machines that preceded or followed it [1,5,16,24,26,31,33], and moreover we expected the architecture to change somewhat from generation to generation. Second, we wanted to do very global optimization, including intermodule promotion of variables to registers. Traditional assemblers can interfere with this by allowing actions that the optimizer does not ....
Ruby B. Lee. Precision Architecture. IEEE Computer 22 (1), pp. 78-89, January 1989.
....resident pages. Two basic techniques have been used: inverted page tables and directly hashed page tables. 4.1.1 Inverted Page Tables Inverted page tables (IPT) are characteristic of large address space architectures such as the System/38 [IBM78], the 801 [CM88], and the HP Precision Architecture [Lee89]. These architectures used short-form addresses together with address space registers to generate long-form addresses of 40 to 64 bits in length. Translation of these long-form addresses was the motivation for using hashing techniques. An IPT consists of two parts (see Fig. 2). The frame table ....
Ruby B. Lee. Precision architecture. Computer, January 1989.
....step is the selection of a suitable architecture template. 2.2. VLIW Processor Architecture Typically, modern microprocessors are defined in terms of a general processor architecture with a number of specific implementations. Examples include: x86 [33], PowerPC [18], [19], [71], Alpha [97], PA [59], MIPS [55], SPARC [104], etc. The architecture provides a general framework which links hardware design and the software development. Each new processor implementation must comply with certain basic architectural features. These features permit an abstract view of the processor for software ....
R. Lee. Precision architecture. IEEE Computer, 22(1):78--91, Jan 1989.
.... associative registers used to cache tokens that permit access to specific regions of memory. In a DSVMS, such regions correspond to objects in the DSVM. Protection domains are currently implemented using hardware similar to the PLB in at least one commercial processor, the HP PA-RISC [27]. The increasing interest in thread-based programming suggests that other commercial architectures may soon adopt similar protection mechanisms. 2.3 Thread-Based Programming Until recently, concurrent software systems have been implemented primarily using multiple-process software. System ....
R.B. Lee. Precision Architecture. IEEE Computer, 22(1):78 -- 91, 1989.
....unreliable and difficult to debug. In order to avoid this problem, protection mechanisms based on protection domains [KCE92] may be used. These systems provide what are essentially hardware-supported capabilities and they are currently available in some commercial processors (e.g. the HP PA-RISC [Lee89] architecture). A fundamental limitation to supporting single-level stores on conventional machines is their limited (4 Gbyte) address space size. Since a large database system may easily exceed 4 GB, a virtual address space of that size cannot be used to address the entire database. This situation ....
R.B. Lee. Precision Architecture. IEEE Computer, 22(1):78 -- 91, 1989.
....but it is just one example of a possible processor architecture. In fact most modern microprocessors are defined in terms of a generic architecture with specific implementations. Examples include x86 [76], PowerPC [44], [45], [141], Alpha [177], PA [120], MIPS [109], SPARC [186], etc. These architectures can also be used as building blocks for the design flow. In principle, they are variations of Johnson's generic superscalar architecture [103]. They differ mainly in the names of the instructions and in the specific way of handling ....
R. Lee. Precision architecture. IEEE Computer, 22(1):78--91, Jan 1989.
....of software (both applications and operating systems), together with increasing VLSI densities, have brought us to the verge of a major architectural change: the move from a 32-bit to a 64-bit virtual address space. This trend can be seen, for example, in the architectures of the HP PA-RISC [Lee 89] and the IBM RS/6000 [Groves Oehler 90]. The MIPS R4000 [MIP 91] and Digital's recently announced Alpha family [Dobberpuhl et al. 92] are the first architectures with unsegmented 64-bit address spaces. Unlike the move from 16- to 32-bit addressing, a 64-bit address space could be revolutionary ....
....restrictions: (1) cross-segment pointers are not supported, (2) multiple pointer forms exist that must be treated differently by applications, and (3) application software must coordinate segment register usage to create an illusion of a single address space. The Hewlett-Packard Precision [Lee 89] differs from other segmented architectures in that it allows applications to specify long-form virtual addresses directly. Thus the Precision could support a uniform-address operating system. However, long-form pointer dereference requires a sequence of four instructions, so most software for ....
R. B. Lee. Precision architecture. IEEE Computer, pages 78--91, January 1989.
....integer units can be used in parallel with the floating-point unit to fetch operands for the floating-point calculations, resulting in vector processing of floating-point operations. Graphics support. None. Communication support. Basic multiprocessor protocol support and interrupts. HP PA7100 [9, 10]. Noteworthy instructions. The integer multiply and divide execution times are data dependent, since only minimal hardware support is provided. Lee states that most of these operations as they appear in applications are variables paired with constants; in this case the execution time can be ....
Lee, R.B., Precision Architecture. IEEE Computer, 1989. 22(January): p. 78-91.
....systems, persistent storage, protection, capability-based systems, object-oriented database systems, microkernel operating systems, wide-address architectures, 64-bit architectures. 1 Introduction The appearance of 64-bit address space architectures, such as the DEC Alpha [Dig 92], HP PA-RISC [Lee 89], and MIPS R4000 [MIP 91], signals a radical increase in the amount of address space available to operating systems and applications. This shift provides the opportunity to reexamine fundamental operating system structure; specifically, to change the way that operating systems use address ....
....Opal segments, including persistent segments, are simultaneously and directly visible (given proper protection) in virtual memory by all applications. Linked data structures in Opal can easily span segments, perhaps with different access controls. In this respect Opal is closer to the HP PA-RISC [Lee 89], which supports traditional segmented addressing, but also allows applications to use global virtual addresses directly. However, most software on the PA-RISC uses short-form addresses, because they are more compact and efficient, and because they permit backward compatibility with private ....
Lee, R. B. Precision architecture. IEEE Computer, pages 78--91, January 1989.
....[List of figures: 6 Operand Length Histogram; 7 Data Cache Miss Rate] 1 What Is RYO RYO (Roll Your Own) is a family of novel instrumentation tools for the PA-RISC family of processors [3]. These tools replace a specific set of PA-RISC assembly instructions with user-supplied subroutines. Because the user provides his own custom instrumentation routines, the use of the tool is virtually unlimited. Its uses could range from replacing a faulty hardware divide instruction with a ....
Ruby B. Lee. Precision Architecture. Computer, 22(1):78--91, January 1989.
....to some or all of the following restrictions: (1) cross-segment pointers are not supported, (2) multiple pointer forms must be treated differently by applications, and (3) software must coordinate segment register usage to create an illusion of a single address space. The Hewlett-Packard Precision [Lee 89] differs from other segmented architectures in that it allows applications to specify long-form virtual addresses directly. However, long-form pointer dereference is expensive, so most software uses segmented addressing. Most capability-based architectures [Organick 83, Levy 84] support uniform ....
Lee, R. B. Precision architecture. IEEE Computer, pages 78--91, January 1989.
....or parallel system. Another is the use of a shared memory programming model, even when the physical memory is distributed. Another is the growing size of physical memories (due to denser RAM chips) and of virtual memories (with the advent of 64-bit CPUs like the MIPS R4000 [5], the HP PA-RISC [7], and the DEC ALPHA [14]). Finally, main memory and secondary storage are increasingly unified through the use of virtual memory and memory-mapped files. These trends make it possible to reconsider some of the basic assumptions in operating system design. Most current operating systems provide a ....
R. B. Lee. Precision architecture. IEEE Computer, 22(1):78--91, January 1989.
....some problems, including the inability to implement optimizations such as copy-on-write, incompatibility with existing software, and garbage collection of the address space. Our work is not applicable to single address space machines. Third, in paged segmented systems, such as HP PA-RISC [30], PowerPC [31], and IBM RT PC [22], the virtual address specifies a (segment id, offset) tuple and using the segment table translates into an effective global virtual address. This again eliminates aliases but can efficiently support only a limited number of segments with size and ....
Ruby B. Lee, "Precision Architecture", IEEE Computer, January 1989, 78-91.
....software (both applications and operating systems), together with increasing VLSI densities, have brought us to the verge of a major architectural change: the move from a 32-bit to a 64-bit virtual address space. This trend can be seen, for example, in the segmented architectures of the HP PA-RISC [Lee 89], the IBM RS/6000 [Groves Oehler 90], and ESA/370 [Scalzi et al. 89], and the flat architectures of the MIPS R4000 [MIP 91], SPARC Version 9 [SPA 92], and Digital's Alpha family [Dig 92]. Unlike the move from 16- to 32-bit addressing, a 64-bit address space will be revolutionary instead of ....
R. Lee. Precision architecture. IEEE Computer, 22(1):78--91, Jan. 1989.
....the authors and should not be interpreted as representing the official policies, either expressed or implied, of DARPA, OSF, HP, the NSF, or the U.S. government. 1 Introduction Virtually indexed caches are becoming increasingly common as architects try to reduce processor cycle times [Kohn 89, Lee 89]. With a virtually indexed cache, the virtual address of a data item selects the cache line in which the item should reside. In contrast, with a physically indexed cache, the virtual address must first be translated into its corresponding physical address, and that address is used to select the ....
....Such aligned addresses do not create consistency problems and therefore do not require any consistency management. 1.1 Motivation and goals We have implemented the machine-dependent layer of the virtual memory system [Rashid et al. 87] in Mach 3.0 [Accetta et al. 86] for the HP 9000 Series 700 [Lee 89]. The HP 9000 Series 700 uses a high-performance RISC-based microprocessor (HP PA-RISC) with separate instruction and data caches that are direct mapped, virtually indexed, and physically tagged. The data cache is write back. There is no hardware support for consistency when a physical address is ....
Lee, R. B. Precision Architecture. IEEE Computer, pages 78--91, January 1989.
....RISC (PA-RISC) was designed to serve as a common foundation for HP's computer systems and to excel in all of HP's computer markets. This section outlines the salient features of PA-RISC insofar as they relate to operating system design. More complete details of the architecture can be found in [10, 12, 8, 11, 6]. The memory hierarchy of the PA-RISC consists of registers (32 general purpose, 25 control and 8 space registers), an architecturally visible cache (i.e. there are instructions for cache management), main memory, and an I/O system. The kernel and all processes share a global 64-bit virtual ....
Ruby B. Lee. Precision Architecture. IEEE Computer, 22(1):78--91, January 1989.
....Next we present the key data structures employed by the hardware-dependent and the hardware-independent components of the HP-UX VM subsystem. The hardware-independent component is based on UNIX System V Release 2 and Release 3 [1]. The PA-RISC architecture defines a global virtual address space [14]. A global virtual address is made up of two components: a space identifier (spaceID) and an offset. The offset in turn is partitioned into a virtual page number and a page offset. The PA-RISC 2.0 allows a 64-bit space ID and a 64-bit offset, which are combined to generate up to a 96-bit global ....
R. B. Lee. Precision Architecture. IEEE Computer, pages 78--91, January 1989.
....6, one can easily implement instruction annulling. Usually annulling is applied to branch delay slot instructions, but the mechanism can be used by any instruction. An example of a commercial machine that uses annulling with instructions other than branches is the HP Precision Architecture [Lee89]. All arithmetic instructions in the HP PA have the option of conditionally annulling the following instruction. This allows short if-then-else code to be compiled without branch instructions. The recompilation search supports annulling by providing the search alternative in which a PC state is ....
....memory operations. These do not show up in the final instruction set because they are not generally useful enough for the benchmark programs. This is not to say that combination of annulling with arithmetic and memory instructions is not useful. For example, the HP Precision Architecture (HP PA) [Lee89] incorporates annulling with all arithmetic instructions. This type of annulling is different from the annulling used here. The automatically generated instruction set is based on the restriction of the annulling decision to the register read stage of the pipeline (basing the annulling ....
Ruby B. Lee. Precision architecture. Computer, 22(1):78--91, January 1989.
....successful acquirer subsequently releases the lock by resetting its value to 0, and unsuccessful acquirers usually spin until they are successful. Hardware engineers have implemented numerous variants of this functionality (e.g. set more than one bit [9], swap 0 and 1 for a test-and-clear lock [12]), but the basic concept is that of an atomic operation that sets the lock to a known state and returns its old value. The existence of caches can cause an important change in the acquisition of test-and-set locks because it is important that the spinning of unsuccessful acquirers not generate ....
Ruby B. Lee, "Precision Architecture", COMPUTER, (January, 1989), pp. 78--91.
....of the mbuf takes tells us about the effectiveness of the cache. This effectiveness is indicative of the benefits an application could obtain in a copy-free implementation; the application would directly obtain this cache residency. The measurements were taken on a 50MHz HP Apollo 720 workstation [8, 7]. These workstations contain a 256KByte data cache and a 128KByte instruction cache. [Footnote 1: For the purpose of this paper, we do not distinguish between 128-byte mbufs and larger (page-sized) mbuf clusters.] The data cache is write back, direct mapped, virtually addressed and physically tagged with ....
R. B. Lee. Precision architecture. IEEE Computer, pages 78--91, January 1989.
....the additional lookaside buffer that Domain Page schemes require to operate efficiently. This is particularly significant for machines that need to support multiple cache accesses per cycle, as the PLB would have to be replicated or multi-ported. HP PA-RISC: The HP PA-RISC protection architecture [18] performs access control at the page level. Each TLB entry contains a physical page number, permission information, and a page group identifier. If a TLB hit occurs but the page group identifier does not match that provided by the process, an access violation fault occurs. Four special registers ....
LEE, R. B. Precision architecture. IEEE Computer 22, 1 (January 1989), 78--91.
No context found.
R. Lee, "Precision Architecture," IEEE Computer, Vol. 22(1), Jan. 1989, pp. 78--91.
No context found.
Ruby B. Lee. Precision architecture. IEEE Computer, 22(1):78--91, January 1989.
No context found.
Ruby B. Lee. Precision architecture. IEEE Comp., 22(1):78--91, Jan 1989.
No context found.
Ruby B. Lee. Precision architecture. IEEE Comp., 22(1):78--91, Jan 1989.
No context found.
Ruby B. Lee. Precision architecture. IEEE Computer, 22(1):78--91, January 1989.
No context found.
R. B. Lee, "Precision Architecture," IEEE Computer, vol. 22, no. 1, pp. 78-91, January 1989.
No context found.
R. B. Lee, "Precision Architecture," IEEE Computer, 22(1), pp. 78-91, Jan. 1989.
No context found.
R. B. Lee, "Precision Architecture," IEEE Computer, Vol.22, No.1, January 1989, pp.78--91.
No context found.
R. B. Lee. Precision architecture. Computer, 22(1):78--91, Jan. 1989.
No context found.
Ruby B. Lee. Precision architecture. IEEE Computer, 22(1):78--91, January 1989.
No context found.
R. Lee, "Precision Architecture," Computer, Vol. 22, No. 1, Jan. 1989, pp. 78-91.
No context found.
R. Lee, "Precision Architecture," IEEE Computer, Vol. 22, no. 1, January 1989, pp 78-91.
No context found.
R.B. Lee, "Precision Architecture," IEEE Computer, Vol. 22, no. 1, January 1989, pp.79-91.
No context found.
Lee, R., "Precision Architecture," IEEE Computer. Vol. 22, No. 1, January 1989.
No context found.
Ruby B. Lee. Precision Architecture. IEEE Computer, 22(1):78--91, January 1989.
No context found.
Lee 1989; Lee, R.B., "Precision Architecture." Computer, January 1989, pp 78-91.
No context found.
R. Lee. Precision architecture. IEEE Computer, 22:78--91, January 1989.
No context found.
Ruby B. Lee. Precision Architecture. IEEE Computer 22(1):78--91, January 1989.
CiteSeer.IST at NUS - Copyright Penn State and NEC. Hosted by the School of Computing, National University of Singapore.