

Neil H. E. Weste, David Money Harris

## Lecture 19: Packaging, Power, & Clock

#### Outline

- Packaging
- Power Distribution
- Clock Distribution

## Packages

- Package functions
  - Electrical connection of signals and power from chip to board
  - Little delay or distortion
  - Mechanical connection of chip to board
  - Removes heat produced on chip
  - Protects chip from mechanical damage
  - Compatible with thermal expansion
  - Inexpensive to manufacture and test

#### **Package Types**

- Through-hole vs. surface mount

Figure: example packages (14-pin DIP, 40-pin DIP, 44-pin PLCC, 84-pin PLCC, 84-pin PGA, 86-pin TSOP, 280-pin QFP, 296-pin PGA, 387-pin PGA, 560-pin BGA, multichip module).

21: Package, Power, and Clock CMOS VLSI Design <sup>4th Ed.</sup>

# **Chip-to-Package Bonding**

- Traditionally, chip is surrounded by a *pad frame*
  - Metal pads on 100-200 $\mu m$ pitch
  - Gold bond wires attach pads to package
  - Lead frame distributes signals in package
  - Metal heat spreader helps with cooling




## **Advanced Packages**

- Bond wires contribute parasitic inductance
- Fancy packages have many signal, power layers
  - Like tiny printed circuit boards
- Flip-chip places connections across surface of die rather than around periphery
  - Top level metal pads covered with solder balls
  - Chip flips upside down
  - Carefully aligned to package (done blind!)
  - Heated to melt balls
  - Also called C4 (Controlled Collapse Chip Connection)

### **Package Parasitics**

- Use many V<sub>DD</sub>, GND in parallel
  - Inductance, I<sub>DD</sub>



# **Heat Dissipation**

- 60 W light bulb has surface area of 120 cm<sup>2</sup>
- Itanium 2 die dissipates 130 W over 4 cm<sup>2</sup>
  - Chips have enormous power densities
  - Cooling is a serious challenge
- Package spreads heat to larger surface area
  - Heat sinks may increase surface area further
  - Fans increase airflow rate over surface area
  - Liquid cooling used in extreme cases (\$\$\$)
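
The comparison above can be made concrete by dividing each power by its surface area. This is a minimal sketch using only the figures quoted on this slide; the function name `power_density` is illustrative.

```python
def power_density(power_w: float, area_cm2: float) -> float:
    """Return power density in W/cm^2."""
    return power_w / area_cm2

bulb = power_density(60, 120)   # 60 W light bulb over 120 cm^2
chip = power_density(130, 4)    # Itanium 2 die: 130 W over 4 cm^2

print(f"light bulb: {bulb:.1f} W/cm^2")
print(f"Itanium 2:  {chip:.1f} W/cm^2")
print(f"ratio:      {chip / bulb:.0f}x")
```

With these numbers the die runs at roughly 65 times the power density of the bulb, which is why the package has to spread heat over a larger area.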

## **Thermal Resistance**

- $\Delta T = \Theta_{ja} P$
  - $\Delta T$: temperature rise on chip
  - $\Theta_{ja}$: thermal resistance of chip junction to ambient
  - $P$: power dissipation on chip
- Thermal resistances combine like resistors
  - Series and parallel

Series combination:

$$\Theta_{ja} = \Theta_{jp} + \Theta_{pa}$$

#### Example

- Your chip has a heat sink with a thermal resistance to the package of 4.0 °C/W.
- The resistance from chip to package is 1 °C/W.
- The system box ambient temperature may reach 55 °C.
- The chip temperature must not exceed 100 °C.
- What is the maximum chip power dissipation?

```
(100 °C - 55 °C) / (4 °C/W + 1 °C/W) = 9 W
```
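
The same calculation can be sketched in a few lines of Python, which makes it easy to try other heat sinks or ambient temperatures. The function name `max_power` is illustrative; it assumes all thermal resistances are in series, as in this example.

```python
def max_power(t_max_c, t_ambient_c, *thetas_c_per_w):
    """Maximum power (W) that keeps the chip at or below t_max_c.

    Series thermal resistances add, so theta_ja is the sum of the stages.
    """
    theta_ja = sum(thetas_c_per_w)
    return (t_max_c - t_ambient_c) / theta_ja

# Values from the example: 100 C limit, 55 C ambient, 4.0 and 1 C/W in series
p = max_power(100, 55, 4.0, 1.0)
print(f"max power = {p:.1f} W")
```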

#### **Temperature Sensor**

- Monitor die temperature and throttle performance if it gets too hot
- Use a pair of pnp bipolar transistors
  - Vertical pnp available in CMOS







- Voltage difference is proportional to absolute temperature
  - Measure with on-chip A/D converter
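
The proportionality can be illustrated with the standard bipolar relation $\Delta V_{BE} = (kT/q)\ln N$, where $N$ is the ratio of emitter current densities in the two transistors. The slide does not give $N$; the value 8 below is an assumption chosen only for illustration, and the function names are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C

def delta_vbe(temp_k, density_ratio):
    """Base-emitter voltage difference (V) of two bipolar transistors
    whose emitter current densities differ by density_ratio."""
    return (K_BOLTZMANN * temp_k / Q_ELECTRON) * math.log(density_ratio)

def temp_from_delta_vbe(dv, density_ratio):
    """Invert delta_vbe: recover absolute temperature (K) from the voltage."""
    return dv * Q_ELECTRON / (K_BOLTZMANN * math.log(density_ratio))

N = 8  # assumed current density ratio, not from the slide
for t_c in (25, 55, 100):
    dv = delta_vbe(t_c + 273.15, N)
    print(f"{t_c:3d} C -> delta V_BE = {dv * 1e3:.2f} mV")
```

Because the voltage is linear in absolute temperature, the A/D converter reading can be mapped back to temperature with a single scale factor.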

## **Power Distribution**

- Power distribution network functions
  - Carry current from pads to transistors on chip
  - Maintain stable voltage with low noise
  - Provide average and peak power demands
  - Provide current return paths for signals
  - Avoid electromigration & self-heating wearout
  - Consume little chip area and wire
  - Easy to lay out

### **Power Requirements**

- V<sub>DD</sub> = V<sub>DDnominal</sub> − V<sub>droop</sub>
- Want V<sub>droop</sub> within ±10% of V<sub>DD</sub>
- Sources of V<sub>droop</sub>
  - IR drops
  - L di/dt noise
- I<sub>DD</sub> changes on many time scales



#### **IR Drop**

- A chip draws 24 W from a 1.2 V supply. The power supply impedance is 5 m$\Omega$. What is the IR drop?
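
One way to work this out: find the supply current from $I_{DD} = P / V_{DD}$, then multiply by the supply impedance. The sketch below uses only the numbers in the question; the function name `ir_drop` is illustrative.

```python
def ir_drop(power_w, vdd_v, r_supply_ohm):
    """IR drop (V) for a chip drawing power_w from a vdd_v supply."""
    i_dd = power_w / vdd_v          # supply current in amperes
    return i_dd * r_supply_ohm

drop = ir_drop(24, 1.2, 5e-3)       # 24 W, 1.2 V, 5 milliohm
print(f"I_DD = {24 / 1.2:.0f} A, IR drop = {drop * 1e3:.0f} mV")
```

The current is 20 A, so the drop is 100 mV, a little over 8% of the 1.2 V supply.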

#### L di/dt Noise

- A 1.2 V chip switches from an idle mode consuming 5 W to a full-power mode consuming 53 W. The transition takes 10 clock cycles at 1 GHz. The supply inductance is 0.1 nH. What is the L di/dt droop?
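
A sketch of the calculation, assuming the current ramps linearly over the transition (the question does not state the ramp shape). The function name `ldidt_droop` is illustrative.

```python
def ldidt_droop(p_idle_w, p_full_w, vdd_v, cycles, f_clk_hz, l_supply_h):
    """L di/dt droop (V), assuming a linear current ramp."""
    delta_i = (p_full_w - p_idle_w) / vdd_v   # change in supply current (A)
    delta_t = cycles / f_clk_hz               # transition time (s)
    return l_supply_h * delta_i / delta_t

# 5 W -> 53 W at 1.2 V, over 10 cycles at 1 GHz, with 0.1 nH
droop = ldidt_droop(5, 53, 1.2, 10, 1e9, 0.1e-9)
print(f"L di/dt droop = {droop:.2f} V")
```

The current changes by 40 A in 10 ns, giving a droop of 0.4 V, which is a third of the supply voltage.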


# **Bypass Capacitors**

- Need low supply impedance at all frequencies
- Ideal capacitors have impedance decreasing with $\omega$
- Real capacitors have parasitic R and L
  - Leads to resonant frequency of capacitor
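
A minimal sketch of the behavior described above, modeling a real capacitor as C in series with a parasitic resistance and inductance: $|Z| = \sqrt{R^2 + (\omega L - 1/\omega C)^2}$, with self-resonance at $f_0 = 1/(2\pi\sqrt{LC})$. The component values are hypothetical, not taken from the lecture.

```python
import math

def cap_impedance(f_hz, c_f, esr_ohm, esl_h):
    """|Z| of a capacitor with series resistance and inductance."""
    w = 2 * math.pi * f_hz
    reactance = w * esl_h - 1.0 / (w * c_f)
    return math.hypot(esr_ohm, reactance)

def self_resonant_freq(c_f, esl_h):
    """Frequency where the inductive and capacitive reactances cancel."""
    return 1.0 / (2 * math.pi * math.sqrt(esl_h * c_f))

# Hypothetical part: 1 uF, 30 milliohm series R, 0.25 nH series L
C, ESR, ESL = 1e-6, 0.03, 0.25e-9
f0 = self_resonant_freq(C, ESL)

for f in (f0 / 100, f0 / 10, f0, f0 * 10, f0 * 100):
    print(f"{f:12.3e} Hz  |Z| = {cap_impedance(f, C, ESR, ESL):.4f} ohm")
```

Below $f_0$ the part looks capacitive and its impedance falls with frequency; above $f_0$ it looks inductive and the impedance rises again. At $f_0$ only the series resistance remains.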




# **Power System Model**

- Power comes from regulator on system board
- Board and package add parasitic R and L
- Bypass capacitors help stabilize supply voltage
- But capacitors also have parasitic R and L
- Simulate system for time and frequency responses



## **Frequency Response**

- Multiple capacitors in parallel
  - Large capacitor near regulator has low impedance at low frequencies
  - But also has a low self-resonant frequency
  - Small capacitors near chip and on chip have low impedance at high frequencies
  - Choose caps to get low impedance at all frequencies
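
The trade-off between large and small capacitors can be sketched by combining series R-L-C models in parallel. All component values below are hypothetical and chosen only to show the trend; the helper names `cap_z` and `parallel_z` are illustrative.

```python
import math

def cap_z(f_hz, c_f, esr_ohm, esl_h):
    """Complex impedance of a capacitor with series R and L."""
    w = 2 * math.pi * f_hz
    return complex(esr_ohm, w * esl_h - 1.0 / (w * c_f))

def parallel_z(f_hz, caps):
    """|Z| of several (C, ESR, ESL) capacitors connected in parallel."""
    return abs(1.0 / sum(1.0 / cap_z(f_hz, *cap) for cap in caps))

# Hypothetical (C, ESR, ESL) values
bulk = (100e-6, 0.01, 5e-9)     # large capacitor near the regulator
small = (10e-9, 0.05, 0.1e-9)   # small capacitor near the chip

for f in (1e4, 1e6, 1e8, 1e9):
    print(f"{f:8.0e} Hz  bulk={abs(cap_z(f, *bulk)):9.4f}  "
          f"small={abs(cap_z(f, *small)):9.4f}  "
          f"both={parallel_z(f, [bulk, small]):9.4f} ohm")
```

With these values the bulk capacitor dominates at low frequency and the small one at high frequency, so the parallel pair covers a wider band than either alone.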




#### **Example: Pentium 4**

- Power supply impedance for Pentium 4
  - Spike near 100 MHz caused by package L
- Step response to sudden supply current change
  - 1<sup>st</sup> droop: on-chip bypass caps
  - 2<sup>nd</sup> droop: package capacitance





#### **Charge Pumps**

- Sometimes a different supply voltage is needed but little current is required
  - 20 V for Flash memory programming
  - Negative body bias for leakage control during sleep
- Generate the voltage on-chip with a charge pump



## **Energy Scavenging**

- Ultra-low power systems can scavenge their energy from the environment rather than needing batteries
  - Solar calculator (solar cells)
  - RFID tags (antenna)
  - Tire pressure monitors powered by vibrational energy of tires (piezoelectric generator)
- Thin film microbatteries deposited on the chip can store energy for times of peak demand

# **Clock Distribution**

- On a small chip, the clock distribution network is just a wire
  - And possibly an inverter for clkb
- On practical chips, the RC delay of the wire resistance and gate load is very long
  - Variations in this delay cause clock to get to different elements at different times
  - This is called *clock skew*
- Most chips use repeaters to buffer the clock and equalize the delay
  - Reduces but doesn't eliminate skew

#### Example

- Skew comes from differences in gate and wire delay
  - With right buffer sizing, clk<sub>1</sub> and clk<sub>2</sub> could ideally arrive at the same time.
  - But power supply noise changes buffer delays
  - clk<sub>2</sub> and clk<sub>3</sub> will always see RC skew



## **Review: Skew Impact**

- Ideally full cycle is available for work
- Skew adds sequencing overhead
- Increases hold time too

$$t_{pd} \leq T_c - \underbrace{\left(t_{pcq} + t_{\text{setup}} + t_{\text{skew}}\right)}_{\text{sequencing overhead}}$$

$$t_{cd} \ge t_{\text{hold}} - t_{ccq} + t_{\text{skew}}$$
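
The two constraints can be evaluated directly to see how much skew costs. This is a sketch with hypothetical timing values in picoseconds; none of the numbers come from the lecture, and the function names are illustrative.

```python
def max_logic_delay(t_c, t_pcq, t_setup, t_skew):
    """Largest allowed propagation delay t_pd (setup constraint)."""
    return t_c - (t_pcq + t_setup + t_skew)

def min_contamination_delay(t_hold, t_ccq, t_skew):
    """Smallest allowed contamination delay t_cd (hold constraint)."""
    return t_hold - t_ccq + t_skew

# Hypothetical values, all in picoseconds
T_C, T_PCQ, T_SETUP = 1000, 50, 40
T_HOLD, T_CCQ = 30, 35

for skew in (0, 60):
    t_pd = max_logic_delay(T_C, T_PCQ, T_SETUP, skew)
    t_cd = min_contamination_delay(T_HOLD, T_CCQ, skew)
    print(f"skew={skew:3d} ps: t_pd <= {t_pd} ps, t_cd >= {t_cd} ps")
```

With no skew the hold constraint is already met by the flop itself (the bound is negative). Adding 60 ps of skew removes 60 ps from the time available for logic and forces at least 55 ps of contamination delay between flops.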




#### Solutions

- Reduce clock skew
  - Careful clock distribution network design
  - Plenty of metal wiring resources
- Analyze clock skew
  - Only budget actual, not worst case skews
  - Local vs. global skew budgets
- Tolerate clock skew
  - Choose circuit structures insensitive to skew

## **Clock Dist. Networks**

- Ad hoc
- Grids
- H-tree
- Hybrid

### **Clock Grids**

- Use grid on two or more levels to carry clock
- Make wires wide to reduce RC delay
- Ensures low skew between nearby points
- But possibly large skew across die

## **Alpha Clock Grids**



#### **H-Trees**

- Fractal structure
  - Gets clock arbitrarily close to any point
  - Matched delay along all paths
- Delay variations cause skew
- A and B might see big skew
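
The fractal structure can be sketched as a short recursion that places the leaves of an idealized H-tree and records the wire length from the root to each one. Equal wire length stands in for matched delay here, which assumes uniform wires and identical buffers; the function name `h_tree_leaves` and the die dimensions are illustrative.

```python
def h_tree_leaves(x, y, half, levels, path_len=0.0):
    """Return (x, y, wire_length) for every leaf of an idealized H-tree.

    Each level draws one "H" centered on (x, y): a horizontal bar reaching
    `half` to each side, with vertical bars reaching `half` up and down at
    its ends. The four tips become the centers of the next, half-size level.
    """
    if levels == 0:
        return [(x, y, path_len)]
    leaves = []
    for dx in (-half, half):
        for dy in (-half, half):
            # Wire from center to a tip: `half` across plus `half` up or down
            leaves += h_tree_leaves(x + dx, y + dy, half / 2, levels - 1,
                                    path_len + 2 * half)
    return leaves

# Three levels on a hypothetical 16 x 16 die, root at the center
leaves = h_tree_leaves(8.0, 8.0, 4.0, 3)
lengths = {length for _, _, length in leaves}
print(f"{len(leaves)} leaves, distinct wire lengths: {sorted(lengths)}")
```

Every leaf sees the same nominal wire length, so any skew comes from delay variations along the paths. Two physically adjacent leaves can sit on opposite sides of the top-level split and share almost none of their path, which is how neighboring points such as A and B can end up with large skew.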




## **Itanium 2 H-Tree**

- Four levels of buffering:
  - Primary driver
  - Repeater
  - Second-level clock buffer
  - Gater
- Route around obstructions



# **Hybrid Networks**

- Use H-tree to distribute clock to many points
- Tie these points together with a grid
- Ex: IBM Power4, PowerPC
  - H-tree drives 16-64 sector buffers
  - Buffers drive total of 1024 points
  - All points shorted together with grid