Last update: 6/11/08
| IBM (PowerPC-440) | IBM (POWER5) | IBM (POWER5+) | IBM (POWER6) | AMD (Opteron 248) |
|---|---|---|---|---|
| frost | bluevista | blueice | bluefire | lightning |
| Chip speed (clock) | ||||
| 700 MHz | 1,900 MHz | 1,900 MHz | 4,700 MHz | 2,200 MHz |
| Cycle time | ||||
| 1.43 ns | 0.5263 ns | 0.5263 ns | 0.2128 ns | 0.45 ns |
| Die type | ||||
| Dual core | Single core | Dual core | Dual core | Single core |
| Cache size | ||||
| L1 -- 32-KB data cache, 32-KB instruction cache, 32-B cache line, 64-way set-associative. L1 not coherent, so only 1 CPU used. L2 -- 256 KB, shared, 128-B line, fully associative. L3 -- 4 MB, 8-way set-associative, 128-B cache line. | L1 -- 96 KB per core (32-KB data cache, 64-KB instruction cache), 128-B cache line, 4-way associative. L2 -- 1.92 MB, 128-B line, 10-way set-associative. L3 -- 36 MB, 12-way set-associative, 128-B cache lines. | L1 -- 96 KB per core (32-KB data cache, 64-KB instruction cache), 128-B cache line, 4-way associative. L2 -- 1.92 MB per processor, 128-B line, 10-way set-associative. L3 -- 36 MB, 12-way set-associative, 128-B cache lines. | L1 -- 128 KB (64-KB data + 64-KB instruction) per processor. L2 -- 4 MB per processor. L3 -- (off chip) 32 MB per chip, shared by the two processors. | L1 -- 64 KB, 2-way set-associative. L2 -- 1024 KB, 16-way associative. 64-B cache lines on both. 8-cycle latency (to transfer a whole cache line). |
| Cache latencies | ||||
| L1 miss: 3 cycles. L2 miss: 11 cycles. L3 miss: ~35 cycles; external DRAM: 75 cycles. 2 instructions/cycle possible. | L1 miss: 4 cycles. L2 miss: 14 cycles. L3 miss: L3 is on-chip, so operates at half CPU speed. 7 instructions/cycle possible. | L1 miss: 4 cycles. L2 miss: 14 cycles. L3 miss: L3 is on-chip, so operates at half CPU speed. 7 instructions/cycle possible. | L1 miss: (not listed). L2 miss: (not listed). 4 floating-point operations/cycle per processor possible. | L1 miss: 2 cycles. L2 miss (to local memory): 19 cycles. 3 instructions/cycle possible. |
| Translation Lookaside Buffer (TLB) | ||||
| CPU memory management unit is a 64-entry, fully associative, unified TLB supporting variable page sizes | TLB holds 1024 entries, 4-way set-associative; pages can be 4 KB or 16 MB. Also has 2 ERATs with 128 entries each | TLB holds 1024 entries, 4-way set-associative; pages can be 4 KB or 64 KB (settable by user). 16-MB pages possible (requires system reset). Also has 2 ERATs with 128 entries each | | 2-level TLB. L1 TLB holds 32 entries for 4-KB pages, fully associative. L2 TLB holds 512 entries, 4-way associative |
| TLB latencies | ||||
| Very low latency if in L2 | Very low latency if in L2 | Very low latency if in L2 | Very low latency if in L2 | Similar to L2 cache miss |
| Registers | ||||
| Double FPU has 32 primary f.p. registers, 32 secondary f.p. registers | 120 GPRs, 120 FPRs | 120 GPRs, 120 FPRs | | 16 general-purpose (X86 integer) registers, 64 f.p. (128-bit media, 64-bit media, and X87 f.p.) registers |
| Functional units | ||||
| 3 32-bit integer pipelines; 7-stage pipeline. No support for f.p. in the processor core. Floating-point pipeline: 5 cycles; floating-point load-to-use latency: 4 cycles. Can do 2 instructions/cycle. | 7 functional units: 2 floating-point even units. Can fetch 2 groups of 5 instructions/cycle and complete 10 instructions/cycle. | 7 functional units: 2 floating-point even units. Can fetch 2 groups of 5 instructions/cycle and complete 10 instructions/cycle. | 7 functional units: 2 floating-point even units. | 7 pipelined functional units. Pipeline depths: 12 (int), 17 (f.p.). Integer: (not listed). Floating-point: (not listed). Can do 3 instructions/cycle. Max of 72 instructions in flight at once. |
| Prefetching? | ||||
| Yes-configurable | Yes-automatic | Yes-automatic | Yes | Yes |
| Simultaneous Multi-Threading (SMT)? | ||||
| No | Yes. SMT appears to the OS as multiple CPUs. Threaded applications may take advantage of this (e.g., "ptile=16"). | Yes. SMT appears to the OS as multiple CPUs. Threaded applications may take advantage of this (e.g., "ptile=32"). | Yes. SMT appears to the OS as multiple CPUs. Threaded applications may take advantage of this (e.g., "ptile=64"). | No |
| Peak rates (SPECfp2000) | ||||
| 946 | 2702 | 2702 | SPECfp2006 - 22.3 | 1691 |
| Page sizes (settable by user) | ||||
| 8 possible page sizes | 4 KB | 4 KB or 64 KB | 4 KB or 64 KB | 4 KB |
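
Two of the rows above are simple derived quantities: cycle time is the reciprocal of the clock rate, and the memory a TLB can map at once (its "reach") is the entry count times the page size. The following is a minimal sketch of that arithmetic, not part of the original table. The function names are hypothetical, and it assumes binary units (1 KB = 1024 bytes) and a TLB whose entries all map pages of the same size.

```python
# Illustrative helpers; the names below are hypothetical and not from the table.

KB = 1024          # assumption: binary kilobytes
MB = 1024 * KB     # assumption: binary megabytes


def cycle_time_ns(clock_mhz: float) -> float:
    """Return the cycle time in nanoseconds for a clock rate given in MHz."""
    # 1 MHz = 1e6 cycles/s, so one cycle lasts 1000 / MHz nanoseconds.
    return 1000.0 / clock_mhz


def tlb_reach_bytes(entries: int, page_size_bytes: int) -> int:
    """Return the memory covered when every TLB entry maps one page of this size."""
    return entries * page_size_bytes


# Clock rates copied from the "Chip speed (clock)" row.
clocks_mhz = {
    "frost": 700,
    "bluevista": 1900,
    "blueice": 1900,
    "bluefire": 4700,
    "lightning": 2200,
}

# Cycle times rounded to four decimal places.
cycle_times = {name: round(cycle_time_ns(mhz), 4) for name, mhz in clocks_mhz.items()}

# Reach of a 1024-entry TLB with the two user-settable page sizes listed for blueice.
reach_4k = tlb_reach_bytes(1024, 4 * KB)
reach_64k = tlb_reach_bytes(1024, 64 * KB)

if __name__ == "__main__":
    for name, ns in cycle_times.items():
        print(f"{name}: {ns} ns per cycle")
    print(f"1024 entries x 4 KB pages:  {reach_4k // MB} MB")
    print(f"1024 entries x 64 KB pages: {reach_64k // MB} MB")
```

Under these assumptions a 700 MHz clock gives a cycle of about 1.4286 ns, and moving a 1024-entry TLB from 4 KB to 64 KB pages raises its reach from 4 MB to 64 MB. That larger reach is the usual reason to choose the bigger page size for codes with large working sets.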