Report

Routability-Driven BlockageAware Macro Placement Yi-Fang Chen, Chau-Chin Huang, Chien-Hsiung Chiou, Yao-Wen Chang, Chang-Jen Wang outline • INTRODUCTION • THE PROPOSED ALGORITHM • EXPERIMENTAL RESULTS • CONCLUSIONS INTRODUCTION • A modern system-on-a-chip (SoC) usually hundreds of intellectual property (IP) macros and millions of standard cells. • Some of the macros, namely pre-placed macros, need to be placed at specified locations for various concerns , such as power/thermal considerations or the connections with IO pads. INTRODUCTION • In macro placement, the pre-placed macros are treated as blockages and not allowed to overlap with other macros or standard cells. • For complex designs with many pre-placed macros, therefore, it would make the designs much more difficult. INTRODUCTION • Three-stage mixed-size placement: handles the mixed-size placement in three stages, placement prototyping,macro placement,and standard-cell placement • For the three-stage approach, the MP-tree macro placer has been shown to be packing efficient and easy for implementation for commercial applications. INTRODUCTION INTRODUCTION • Although the MP-tree can pack macros efficiently and robustly, there may still exist vertical overlaps (especially for big macros) because it packs macros along the top and bottom contours. • Further, the MP-tree does not model any blockage or pre-placed macro and adopts a shifting method to remove an overlap with a blockage or a pre-placed macro. INTRODUCTION INTRODUCTION • We propose a new circularly-packing-tree (CP-tree for short) representation based on MP-tree, which is easy for implementation, fast for operations, and flexible for handling various constraints. FLOW CP-tree THE PROPOSED ALGORITHM • We further model pre-placed boundary blocks and boundary branches in the CP-tree representation. • A pre-placed boundary block represents a macro which is pre-placed along the boundary. PRE-PLACED BOUNDARY BLOCK THE PROPOSED ALGORITHM • To place blocks along the chip boundaries circularly, we introduce four boundary branches for a packing subtree to represent the chip boundaries. • A boundary branch consists of a set of packing anchors , which are two pseudo terminal nodes and a set of pre-placed boundary nodes, for initialization. BOUNDARY BRANCH BOUNDARY BRANCH • For a packing anchor p, it contains the following information: • (1) anchor distance : • the distance from p to the next packing anchor. • (2) the number of branch children : • the number of children on the boundary branch between p and the next packing anchor BOUNDARY BRANCH • (3) interval space : • the rest space after all branch children are placed • The value ip of packing anchor p is computed as follows: • dp is the anchor distance • wc is the width of a branch child c of p BOUNDARY BRANCH • (4) anchor width constraint : • check whether the interval space is less than 0 or not. • (5) switching point : • the starting point changing the packing direction for p. BOUNDARY BRANCH BOUNDARY BRANCH CP-tree Floorplan Representation • Because any two boundary branches might overlap with each other , we allow the branch blocks with the back-packing direction to shift with the distance Ds. CP-tree Floorplan Representation R-wirelength model • To propagate the information of placement prototyping to macro placement, we propose a new routability-aware wirelength model, namely the R-wirelength model. • We construct two available layer maps H and V for the respective horizontal and vertical routing directions to record the layer blockages and reflect the routing resource utilization in our R-wirelength model. R-wirelength model • The available layer map functions H(b) and V (b) of a bin b can be initialized for the pre-placed macros as follows: R-wirelength model • Ab is the area of the bin b. • LH and LV are the respective numbers of available metal layers for • Ox(b,m) and Oy(b,m) are the overlap functions of a bin b and a preplaced macro m. • βH(m) and βV (m) are the layer blockages which define the number of metal layers occupied by the macro m. R-wirelength model R-wirelength model • Instead of decomposing a net e as twopin nets which is timeconsuming, we find a netbox R(e) for the net e to model the halfperimeter wirelength (HPWL) of e. • Let Rb(e),Rt(e), Rl(e), Rr(e) be the bottom, top, left, and right of this netbox. • Ae be the area of R(e). R-wirelength model • we can compute the R-wirelength Wr(e) for the net e as follows: R-wirelength model R-wirelength model • A smaller R-wirelength implies shorter wirelength and better routability for the placement with macros and standard cells. Simulated Annealing Optimization • For the CP-tree representation, we propose a simulated annealing (SA) flow to optimize the macro placement. • we use the following five operations to perturb a CP-tree: • Op1: rotate a node. • Op2: move a node to another place. • Op3: swap two nodes. • Op4: change the packing direction of a packing anchor. • Op5: change the switching point of a packing anchor. Simulated Annealing Optimization • With the placement solution, we can evaluate the cost of this solution by a feasibility cost evaluation defined as follows: • A is the weighted macro placement area, • D is the total macro displacement • W is the R-wirelength • U is the total number of overlaps between any two macros • N is the narrow channel effect Simulated Annealing Optimization • The weighted macro placement area is the summation of the area under four contours times a weighted thickness function Θ(c). • The weighted thickness function is defined as follows: • Tc is thickness of the contour c. • Ch and Cw are the height and width of the chip. Simulated Annealing Optimization • Here, the thickness of a contour is the maximum height of the contour. Simulated Annealing Optimization • A narrow channel, which is formed between two or more macros, usually incurs serious routing congestion. • If so, the narrow channel effect can be modeled by • Pci/(wci・ min(hci−1, hci+1)), EXPERIMENTAL RESULTS • We conducted experiments based on five real industry benchmarks to evaluate our proposed algorithm. • We implemented our algorithm in the C++ programming language on a Linux workstation with 24 Intel Xeon 2.0 GHz CPUs and 72 GB memory. • All placement prototypes (global placement results)and standard cell placements were obtained by NTUplace3 [6], and then the final placement results were routed by Cadence SOC Encounter. EXPERIMENTAL RESULTS CONCLUSIONS • We have presented a new CP-tree based macro placement algorithm. • We have also presented the R-wirelength model to consider routability. • Experimental results have shown that our placer can obatin placement solutions with desired routability and wirelength.