The best way to learn floorplanning? Just do it
I can't say that I've seen any literature on the art of floorplanning. 
Unfortunately, really good floorplanning is more of an art than a science. That 
should not discourage the neophyte, however, as even basic floorplanning can have 
dramatic results. The goal is, of course, to place the logic in ways that make the 
routing easier, less congested, and shorter. As a starting point, you might let 
the tool do the place and route. After it is finished, read the placed design 
into the floorplan tool and start looking for ways you can improve the layout. 
The first thing you will probably notice is how awful the automatic placers 
really are. 
For Xilinx designs you will have to get hold of XACT6. There is a beta 
floorplanner for M1.4, but it is really not ready for prime time yet. In my 
opinion, there is not enough info supplied with the beta floorplanner for 
someone who is not already familiar with the XACT6 floorplanner to use it. 
Anyway, get a copy of XACT6 and look at the on-line documentation for the 
floorplanner. Play with the tool using a simple design (a design with some 
Relatively Placed Macros and loose logic is probably the best) with an eye 
toward minimizing the complexity of the interconnect. The Altera MAX PLUS 
tools also have a floorplanner, which I think is a little harder to use than the 
Xilinx one. Fortunately, the routing structure on the Altera device makes it 
less sensitive to having a good floorplan (and less capable for really high 
performance stuff). 
Beyond playing with a few designs, I can't really offer any quick advice. 
Floorplanning is rather like putting together a jigsaw puzzle, except that there 
are a large number of solutions. Because of that, it takes a little bit of an 
artist's eye to do it well. When you get down to it, I find that some people 
have a knack for it, while others just don't. To be honest, most of us who 
advocate floorplanning have been doing it long before the floorplan tools were 
available (we used up a lot of graph paper and pencils). Play with the tools on 
as many designs as you can. As you gain familiarity with the architecture and 
the tool, you will start to recognize what works and what doesn't. As with the 
arts, there is no substitute for natural talent. Fortunately, engineers tend to 
be good puzzle solvers, so there is hope. I'm sorry I couldn't offer more help 
than this. 
Fliptronics, an FPGA consulting firm, has a decent introduction to floorplanning on the web. 
 
  
FPGA Floorplanning (1 of 1)
(Updated 08/18/2002) 
 
Floorplanning is the process of 
identifying structures that should be placed close together, and allocating 
space for them in such a manner as to meet the sometimes conflicting goals of 
available space (cost of the chip), required performance, and the desire to have 
everything close to everything else. 
Within the Xilinx chips it is often 
the case that the smallest area design is also the highest performance design. 
This flies in the face of many design methodologies, where area and speed are 
considered to be things that should be traded off against each other. 
The reason is probably that routing resources are limited, and the more routing 
resources a design uses, the slower it will operate. Optimizing for minimum area 
allows the design to use fewer resources, and also allows the sections of the 
design to be closer together. This leads to shorter interconnect distances, fewer 
routing resources used, faster end-to-end signal paths, and even faster and more 
consistent place and route times. Done correctly, there are no negatives 
to Floorplanning. 
What negatives could there be? 
Well, if the Floorplanning is done with no regard for the architecture of the 
chip, then it is possible to actually do a worse job than the Xilinx placer 
section of the place and route software. It is also possible that there are 
constraints that are not well understood until placement is complete, and 
routing commences. So the question is what constitutes "done correctly". 
As a general rule, data-path 
sections benefit most from Floorplanning, and random logic, state machines, and 
other non-structured logic can safely be left to the placer section of the place 
and route software. 
Data paths are typically the areas 
of your design where multiple bits are processed in parallel with each bit being 
modified the same way with maybe some influence from adjacent bits. Example 
structures that make up data paths are Adders, Subtractors, Counters, Registers, 
and Muxes. 
How to Floorplan a design
Although there are no hard and fast rules to Floorplanning, this 
section outlines the basic structure for a Floorplanned design, and highlights 
the issues you need to consider when Floorplanning a design. As described above, 
Floorplanning has its greatest return when applied to data path elements. The 
Xilinx XC4000 devices, and all of the derivative families  (the A, D, E, 
EX, H, L, XL, Spartan, and SpartanXL families) all have the following basic 
structure: 
  - A rectangular array of Configurable Logic Blocks (CLBs). These logic blocks 
    contain two main function generators, and two flip-flops. The function 
    generators can represent any number of gates that as a group has no more 
    than 4 inputs, one output, and no internal loops (that would implement latch 
    like behavior). The flip-flops are either rising or falling edge triggered, 
    include a clock-enable function that is implemented with a re-circulation 
    multiplexer from the Q output to the D input, and can have either an active 
    high asynchronous reset or set function. Associated with each CLB are two 
    tri-stateable buffers.
  - Segmented interconnect including short interconnect for local signals, and 
    long-lines for spanning the width or height of the chip. In many of the 
    devices, the horizontal long-lines can be split into a left and a right 
    half, allowing up to twice as many lines, that span half the width of the 
    chip.
  - The two tri-stateable buffers associated with each CLB are pre-connected to 
    two of the horizontal long-lines.
  - Input and Output pins on all 4 sides of the array.
  - Pre-built Carry logic that is pre-connected vertically in each column of 
    CLBs.
To suit these characteristics, consistently implement all data 
path elements with a bit pitch of two bits per row, and always build data path 
elements as vertical structures of one or more columns. 
The Xilinx FPGAs are biased to have data flow along horizontal 
interconnect, and to have arithmetic functions operate in vertical columns. The 
bias comes from the horizontal long lines with tri-stateable buffers, and the 
vertical pre-built and routed carry logic. 
The carry logic is also used to build fast counters, so although 
you may not initially think of a counter as an arithmetic function, it falls 
into the same pattern as adders, subtractors, and arithmetic comparisons, 
because of its use of the carry chain. This view can be clarified by thinking of 
a counter as an incrementor, followed by a holding register. 
The bit pitch of two bits per row is driven primarily by the 
structure of the carry logic, but is also the bit pitch that the tri-stateable 
buffers implement. What this means is that the natural structure of arithmetic 
functions in these devices implements 2 bits of a function (a two bit slice) in 
one row of CLBs, and for simple functions, in one column. A simple function such 
as a ten bit synchronous up-counter will therefore take 5 rows and 1 column, a 
total of 5 CLBs. 
Although the XC4000 devices and the A, D, E, H, and L derivatives 
allow the carry signal between CLBs to interconnect in both an up and down 
direction within a column, the more recent XC4000EX, XC4000XL, Spartan and 
SpartanXL devices only support the carry signals being routed up a column. For 
all devices, within a CLB, the carry routing is up, with regard to the two 
function generators. It is expected that this up only bias will exist in future 
products from Xilinx. To be compatible with all these products, you should 
use only the up direction for carry; this bias then affects 
all other functions that are generated. For the example 10 bit counter 
described in the previous paragraph, the Floorplan will have bit 0 and 1 in the 
CLB at the bottom of the column of 5 CLBs, and the top CLB will have bits 8 and 
9. 
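The bit-to-CLB mapping described above can be sketched in a few lines of Python. This is only an illustration, not tool output. It assumes two bits per CLB, carry running up the column, the F/X pair holding the even bit and the G/Y pair the odd bit, and rows labeled in RLOC style with row 0 at the top of the column. The function name `counter_bit_placement` is made up for this sketch.

```python
def counter_bit_placement(width, column=0):
    """Map each counter bit to (RLOC label, function generator, flip-flop).

    Assumptions: two bits per CLB, carry propagating up the column, and
    rows numbered from the top (R0) downward, so bit 0 lands in the
    bottom CLB of the column.
    """
    if width <= 0 or width % 2:
        raise ValueError("width must be a positive, even number of bits")
    rows = width // 2
    placement = {}
    for bit in range(width):
        row = rows - 1 - bit // 2  # bottom CLB has the highest row number
        lut, ff = ("F", "X") if bit % 2 == 0 else ("G", "Y")
        placement[bit] = (f"R{row}C{column}", lut, ff)
    return placement


if __name__ == "__main__":
    for bit, (rloc, lut, ff) in counter_bit_placement(10).items():
        print(f"bit {bit}: {rloc} uses {lut} and {ff}")
```

For the 10 bit counter this gives five CLBs, with bits 0 and 1 in the bottom CLB and bits 8 and 9 in the top one.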
  
  
Following Xilinx's standard, the two main function generators are shown on the 
left of diagrams and are labeled F and G, and the two flip-flops are shown on 
the right and are labeled X and Y. 

For the example counter, in the CLB at the bottom of the five CLB group (the one 
with the RLOC=R4C0 attribute), the F function generator implements the logic 
that feeds the D pin of the X flip-flop, the output of which is the least 
significant bit of the counter, Q0. The G and Y sections of the same CLB 
implement bit 1 of the counter. The next CLB above (the one with the RLOC=R3C0 
attribute) implements bits 2 and 3. This continues up the column, through to the 
top CLB, which implements bits 8 and 9. 

When two or more functions of your design are Floorplanned in this way and 
placed side by side, with the signals that flow from one function to the next 
aligned on the same row and in near or adjacent columns, the design will place 
and route much faster, and the result will perform faster than a design that has 
no Floorplanning and relies on the Xilinx place and route software to decide on 
placement. Of course, custom building each function section of your design with 
detailed Floorplanning for each function generator and flip-flop can be a 
complex, time consuming, and potentially error prone process. 
The Xilinx Place and Route software uses a hierarchical placement 
constraint system called relative location attributes. Each level of the 
hierarchy has an origin in the top left corner that has a relative location of 
row zero and column zero. As a constraint this is represented as R0C0. Rows are 
numbered from top to bottom, and columns are numbered from left to right. When a 
relative location attribute (RLOC) is assigned to a part of the hierarchy that 
is not a single CLB, then the underlying RLOCs are added to the attached 
attribute to calculate the RLOC value for each of the underlying RLOCs. This 
process continues throughout the hierarchy, resolving each CLB RLOC to a value 
that is relative to the RLOC at the top of the hierarchy. This process, and 
other issues related to how RLOCs are processed are discussed in full in the 
Xilinx "Libraries Guide" document, in the "Attributes, Constraints, and Carry 
Logic" chapter, in the "Relative Location (RLOC) Constraints" section. Although 
this section of Xilinx's documentation is quite complex, it is recommended that 
you review it to better understand how the RLOCs in the modules support 
Floorplanning. 
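As a rough illustration of how relative locations accumulate down the hierarchy, the sketch below adds each level's RLOC to the offset inherited from its parents. It is a simplified model and not the Xilinx implementation. The dictionary layout, the names `parse_rloc` and `resolve_rlocs`, and the example macro placed at R1C2 are all assumptions made for this example. The Libraries Guide remains the reference for the real rules.

```python
import re

_RLOC = re.compile(r"R(-?\d+)C(-?\d+)")


def parse_rloc(text):
    """Turn a string such as 'R4C0' into a (row, column) tuple."""
    match = _RLOC.fullmatch(text)
    if match is None:
        raise ValueError(f"not an RLOC: {text!r}")
    return int(match.group(1)), int(match.group(2))


def resolve_rlocs(node, origin=(0, 0), prefix=""):
    """Return {hierarchical name: (row, column)} for every leaf under node.

    Each node is a dict with 'name', an optional 'rloc' (default R0C0) and
    an optional list of 'children'. A node's RLOC is added to the offset
    accumulated from its ancestors.
    """
    row, col = parse_rloc(node.get("rloc", "R0C0"))
    here = (origin[0] + row, origin[1] + col)
    path = prefix + node["name"]
    children = node.get("children", [])
    if not children:
        return {path: here}
    resolved = {}
    for child in children:
        resolved.update(resolve_rlocs(child, here, path + "/"))
    return resolved


# Hypothetical example: a 10 bit counter macro placed at R1C2 in a top level.
counter = {
    "name": "counter",
    "rloc": "R1C2",
    "children": [
        {"name": f"bits_{8 - 2 * r}_{9 - 2 * r}", "rloc": f"R{r}C0"}
        for r in range(5)
    ],
}
design = {"name": "top", "children": [counter]}

if __name__ == "__main__":
    for name, (row, col) in resolve_rlocs(design).items():
        print(f"{name}: R{row}C{col}")
```

In this example the CLB holding bits 0 and 1 carries R4C0 inside the macro and resolves to R5C2 once the macro's own R1C2 is added.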
An Example design, with various levels of Floorplanning
This section examines the results of Floorplanning, and compares 
the resulting structure, the place and route time, and the design performance. 
The example, while contrived, is typical of the types of logic that benefit from 
Floorplanning. The example design comprises four sixteen bit binary up counters, 
that all feed into a selection multiplexer. The output of the selection 
multiplexer is registered, and the output of this register is connected to the 
FPGA pins. 
There are two basic timing path categories that need to be 
analyzed. The first is the maximum delay in any of the counters. And the second 
is the maximum delay from any of the counters to the multiplexer output 
register. For the counter, the maximum delay will be from the clock to out time 
of the LSB flip-flop, through the logic that establishes the next counter value, 
to the D input of the MSB flip-flop, and meeting its setup time. The reciprocal 
of this maximum internal delay within the counter is the maximum clock rate at 
which the counter will count reliably. 
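The reciprocal relationship between delay and clock rate is easy to check numerically. The sketch below is illustrative only: the three delay components and their values are made up for the example and are not taken from any data sheet, and the function names are hypothetical.

```python
def critical_path_ns(clock_to_out_ns, logic_and_routing_ns, setup_ns):
    """Sum the pieces of a register-to-register path, in nanoseconds."""
    return clock_to_out_ns + logic_and_routing_ns + setup_ns


def max_clock_mhz(delay_ns):
    """Maximum clock rate in MHz for a worst-case path delay in ns."""
    if delay_ns <= 0:
        raise ValueError("delay must be positive")
    return 1000.0 / delay_ns  # 1 / ns = GHz, so scale by 1000 for MHz


if __name__ == "__main__":
    # Made-up component values, chosen only to show the arithmetic.
    delay = critical_path_ns(2.0, 9.5, 1.5)
    print(f"{delay:.1f} ns -> {max_clock_mhz(delay):.1f} MHz")
    # prints "13.0 ns -> 76.9 MHz"
```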
Seven different levels of Floorplanning are applied to this 
simple design, using the XC4005E, XC4010E, and XC4010XL as targets. The '-2' 
speed grade is used for all examples, and place and route programs used are as 
follows: 
  - XC4005E-2 PPR V5.2.1
 - XC4010E-2 PPR V5.2.1
 - XC4010E-2 PAR M1.4
 - XC4010XL-2 PAR M1.4
  
The combination of running the XC4010E devices with both place 
and route programs allows comparison of these programs on the XC4000E families. 
Running both the XC4010E and XC4010XL on the M1.4 program, allows comparison of 
these two product families. While the goal is to show the value of 
Floorplanning, the program and product comparisons are 
interesting. 
The same seven levels of Floorplanning were applied to each of 
these four product/program combinations. The seven design styles have the 
following characteristics:  
  1. The 4 counters are binary ripple counters (CB16CE) from the Xilinx unified 
     library XC4000E; the multiplexer and output register are also taken from 
     this library. There is no Floorplanning in this style, and the ripple 
     counter, while available in the library, is a poor choice.
  2. The 4 counters are binary counters that use the built-in carry logic 
     (CC16CE) from the Xilinx unified library XC4000E; the multiplexer and 
     output register are also taken from this library. While there is no 
     explicit Floorplanning in this style, the counters include internal 
     Floorplanning, because the carry logic imposes a column structure on the 
     counters.
  3. This style adds four RLOC_ORIGIN Floorplanning constraints to the style 2 
     design, placing the four counters in adjacent columns, and aligning the 
     MSBs of the counters (and all other bits).
  4. This style replaces the un-Floorplanned output register of the previous 
     styles with a Floorplanned register, and places it in the column to the 
     right of the fourth counter. It is also aligned with regard to bit 
     positions.
  5. This style is like style 4, except the output register is placed in the 
     column to the right of the column used for the register in style 4.
  6. This style uses a Floorplanned multiplexer and output register, and places 
     it in the two columns to the right of the fourth counter. The odd bit 
     multiplexers and output register flip-flops are in one of these two 
     columns, and the even bits are in the other column.
  7. This style uses the same components as style 6, but the Floorplan has been 
     changed. The first two columns contain the first two counters, the next two 
     columns are the multiplexer and output register, and the last two columns 
     contain the third and fourth counter.
  
To understand the differences in the results for these design 
styles, the following descriptions of the behavior of the place and route 
software, as well as an analysis of the device resources should be 
helpful. 
Style 1 uses no Floorplanning or guidance on using the carry 
logic that is available in these products. The results are consistently the 
poorest. Style 2 changes the structure of the counters to use carry logic, and 
for this style through to style 7, the performance and size of the counters does 
not change much. There is no direct Floorplanning of the counters with regard to 
their relative placement. While this does not affect the counters, it may not be 
optimal for the routing from the counters to the multiplexer. As can be seen in 
the following diagrams, the style 2 designs have placed the counters near each 
other, but they are not aligned. 
Style 3 adds Floorplanning to the counters, and by aligning the 
counters, the routing to the multiplexer should be more straightforward. This 
should improve the delays from the counters through the multiplexer to the 
output register. As can be seen in the diagrams, the multiplexer logic is placed 
somewhat randomly around the core of the 4 counters. 
Style 4 places the output register in the next column to the 
right of the four counters, and the flip-flops of this register are aligned with 
the counter bits. Although this should help significantly, it does not, because 
the 8 logic blocks that hold the 16 flip-flops of the output register do not 
have sufficient gate resources to implement the 16 four-input multiplexers. Some 
of the multiplexers are placed with the flip-flops, and some are placed near 
by. 
Style 5 attempts to alleviate the problems with style 4, by 
moving the output register to the next column to the right, leaving room for the 
8 multiplexers that couldn't fit in with the flip-flops. None of the place and 
route programs take full advantage of this opportunity for 
improvement. 
Style 6 resolves the performance issue of the multiplexer, by 
replacing it with a Floorplanned multiplexer with output register. This 
multiplexer performs an additional optimization of not placing all the 
flip-flops in the same column, but rather, placing the flip-flops with the 
multiplexers. A four-to-one multiplexer requires all the gate resources of a 
CLB, so to build a 16 bit wide multiplexer with four inputs will require 16 
CLBs. To maintain the Floorplanning structure of two bits of data path 
per row, the 16 CLBs are Floorplanned to use two 
columns by eight rows, with bits 0 and 1 on the row at the bottom, and bits 14 
and 15 at the top. This exactly matches the bit position of the counters, except 
the counters have an additional block at the top, for the TC and CEO outputs. 
This is resolved by placing the counters with RLOC_ORIGIN constraints on row 1, 
and the multiplexer on row 2. 
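The one-row offset between the counter and multiplexer origins can be checked with a short calculation. The sketch assumes that the counter's extra TC/CEO block occupies the counter's relative row 0, with the data bits in the rows below it, and that rows are numbered from the top. The function names are hypothetical.

```python
def counter_bit_row(bit, width, origin_row):
    """Absolute row of a counter bit; relative row 0 is the TC/CEO block."""
    data_rows = width // 2
    relative = 1 + (data_rows - 1 - bit // 2)  # skip the extra top block
    return origin_row + relative


def mux_bit_row(bit, width, origin_row):
    """Absolute row of a multiplexer bit; no extra block at the top."""
    data_rows = width // 2
    relative = data_rows - 1 - bit // 2
    return origin_row + relative


if __name__ == "__main__":
    WIDTH = 16
    for bit in range(WIDTH):
        c_row = counter_bit_row(bit, WIDTH, origin_row=1)
        m_row = mux_bit_row(bit, WIDTH, origin_row=2)
        status = "aligned" if c_row == m_row else "MISALIGNED"
        print(f"bit {bit:2d}: counter row {c_row}, mux row {m_row} ({status})")
```

With the counter origin on row 1 and the multiplexer origin on row 2, every bit of the counter ends up on the same row as the matching multiplexer bit under these assumptions.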
At this point you may wonder what additional improvement could be 
made to style 6. Consider the routing from the left most counter to the 
multiplexer. It must pass through the other three counters to get to the 
multiplexer. Similarly, the output of counters two and three must also pass 
through the fourth counter to get to the multiplexer. Therefore, there is more 
routing congestion around counter four, although it has the shortest path to the 
multiplexer. The output of the first counter must traverse the furthest distance 
to get to the multiplexer. In synchronous designs like this, the slowest path 
out of a group of paths will be the limiting factor. For the counters to run at 
their fastest, they need to have their routing congestion minimized. For the 
paths from the four counters to the multiplexer to be minimized, the multiplexer 
and the four counters need to be placed so as to minimize the worst-case 
distance. Both of these goals are achieved in style 7 by placing the multiplexer 
and its output register in the middle of the structure, with two counters to its 
left, and two counters to its right. 
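The argument for style 7 can be illustrated with a crude column-distance model. This is a simplification: it treats the number of columns between a counter and the multiplexer as a stand-in for routing delay, and it counts how many other counter columns a signal must cross as a stand-in for congestion. The column numbers and function names are assumptions for the example, not values from the tools.

```python
def worst_case_span(counter_cols, mux_cols):
    """Largest column distance from any counter to any multiplexer column."""
    return max(abs(c - m) for c in counter_cols for m in mux_cols)


def counters_crossed(counter_cols, mux_cols):
    """Total number of other counter columns crossed on the way to the mux."""
    total = 0
    for c in counter_cols:
        nearest = min(mux_cols, key=lambda m: abs(c - m))
        lo, hi = sorted((c, nearest))
        total += sum(1 for other in counter_cols if lo < other < hi)
    return total


# Assumed column assignments: style 6 puts the mux right of all four counters,
# style 7 puts it in the middle with two counters on each side.
STYLE_6 = {"counters": [0, 1, 2, 3], "mux": [4, 5]}
STYLE_7 = {"counters": [0, 1, 4, 5], "mux": [2, 3]}

if __name__ == "__main__":
    for name, plan in (("style 6", STYLE_6), ("style 7", STYLE_7)):
        span = worst_case_span(plan["counters"], plan["mux"])
        crossed = counters_crossed(plan["counters"], plan["mux"])
        print(f"{name}: worst-case span {span} columns, {crossed} crossings")
```

Under this model, style 6 has a worst-case span of 5 columns and 6 crossings, while style 7 has a span of 3 columns and 2 crossings, which matches the direction of the reasoning above.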
As can be seen from the following tables and diagrams, style 7 
delivers the fastest counters, the fastest counter to multiplexer output 
register time, the fastest placement time, and the fastest routing time. 
Comparing the schematics for design styles 1 and 7 shows that almost no 
additional effort is needed to create style 7's result. Selecting counters and 
multiplexers that are pre-Floorplanned, together with five placement attributes, 
is all that is required. (Some thought as to what the placement constraints 
should be is obviously also needed.) 
  
  
XC4005EPC84-2 Processed with PPR V5.2.1c

| Design Style | Counter Delay (ns) | Max Frequency (MHz) | Counter to MUX REG Delay (ns) | Partition + Placement Time (s) | Routing Time (s) | CLBs Used |
|---|---|---|---|---|---|---|
| 1 | 17.1 | 58.4 | 11.8 | 4+28 | 12 | 72 |
| 2 | 13.1 | 76.3 | 10.8 | 6+15 | 13 | 48 |
| 3 | 13.4 | 74.6 | 11.7 | 6+14 | 17 | 48 |
| 4 | 13.1 | 76.3 | 14.4 | 7+12 | 17 | 48 |
| 5 | 14.3 | 69.9 | 14.5 | 6+12 | 16 | 48 |
| 6 | 13.3 | 75.1 | 9.4 | 3+11 | 16 | 48 |
| 7 | 13.1 | 76.3 | 8.9 | 3+11 | 14 | 48 |
  
  
  
XC4010EPC84-2 Processed with PPR V5.2.1c

| Design Style | Counter Delay (ns) | Max Frequency (MHz) | Counter to MUX REG Delay (ns) | Partition + Placement Time (s) | Routing Time (s) | CLBs Used |
|---|---|---|---|---|---|---|
| 1 | 17.5 | 57.1 | 12.9 | 7+53 | 32 | 88 |
| 2 | 13.3 | 75.1 | 11.2 | 4+13 | 12 | 48 |
| 3 | 13.5 | 74.0 | 12.6 | 4+11 | 15 | 48 |
| 4 | 13.1 | 76.3 | 14.6 | 4+11 | 17 | 48 |
| 5 | 13.2 | 75.7 | 14.2 | 3+11 | 14 | 48 |
| 6 | 13.3 | 75.1 | 10.2 | 2+10 | 16 | 48 |
| 7 | 13.1 | 76.3 | 8.9 | 1+10 | 15 | 48 |
  
  
  
XC4010EPC84-2 Processed with M1.3.7 (PAR -L4 -D5) (A)

| Design Style | Counter Delay (ns) | Max Frequency (MHz) | Counter to MUX REG Delay (ns) | Placement Time (s) | Routing Time (s) | CLBs Used |
|---|---|---|---|---|---|---|
| 1 | 21.9 | 45.6 | 19.4 | 65-7=58 | 574-65=509 | 55 |
| 2 | 13.7 | 72.9 | 10.0 | 47-7=40 | 142-47=95 | 48 |
| 3 | 13.8 | 72.4 | 10.3 | 38-8=30 | 170-38=132 | 48 |
| 4 | 13.8 | 72.4 | 12.7 | 28-8=20 | 132-28=104 | 56 |
| 5 | 13.7 | 72.9 | 13.1 | 28-8=20 | 128-28=100 | 56 |
| 6 | 13.7 | 72.9 | 9.4 | 15-8=7 | 80-15=65 | 48 |
| 7 | 13.7 | 72.9 | 8.9 | 14-8=6 | 75-14=61 | 48 |
  
  
  
XC4010XLPC84-2 Processed with M1.3.7 (PAR -L4 -D5) (B)

| Design Style | Counter Delay (ns) | Max Frequency (MHz) | Counter to MUX REG Delay (ns) | Placement Time (s) | Routing Time (s) | CLBs Used |
|---|---|---|---|---|---|---|
| 1 | 18.5 | 54.0 | 8.8 | 68-20=48 | 147-68=79 | 55 |
| 2 | 11.6 | 86.2 | 7.0 | 53-21=32 | 134-53=81 | 48 |
| 3 | 11.9 | 84.0 | 6.9 | 46-21=25 | 128-46=82 | 48 |
| 4 | 12.1 | 82.6 | 10.6 | 34-22=12 | 95-34=61 | 56 |
| 5 | 11.7 | 85.4 | 10.7 | 33-21=12 | 91-33=58 | 56 |
| 6 | 11.9 | 84.0 | 6.8 | 25-20=5 | 64-25=39 | 48 |
| 7 | 11.7 | 85.4 | 6.1 | 26-21=5 | 69-26=43 | 48 |
  
  
  
XC4010XLPC84-2 Processed with M1.4.12 (MAP -K, PAR -L4 -D5)

| Design Style | Counter Delay (ns) | Max Frequency (MHz) | Counter to MUX REG Delay (ns) | Placement Time (s) | Routing Time (s) | CLBs Used |
|---|---|---|---|---|---|---|
| 1 | 18.2 | 54.9 | 11.3 | 64-20=44 | 185-64=121 | 83 |
| 2 | 11.3 | 88.5 | 9.8 | 39-21=18 | 183-39=144 | 72 |
| 3 | 11.8 | 84.7 | 10.6 | 33-20=13 | 108-33=75 | 72 |
| 4 | 11.6 | 86.2 | 10.8 | 32-21=11 | 128-32=96 | 72 |
| 5 | 11.7 | 85.4 | 11.0 | 32-21=11 | 116-32=84 | 72 |
| 6 | 11.6 | 86.2 | 6.8 | 24-21=3 | 59-24=35 | 48 |
| 7 | 11.7 | 85.4 | 6.1 | 24-20=4 | 61-24=37 | 48 |
  
  
  
XC4010XLPC84-2 Processed with M1.4.12 (MAP -K, PAR -L5 -D5)

| Design Style | Counter Delay (ns) | Max Frequency (MHz) | Counter to MUX REG Delay (ns) | Placement Time (s) | Routing Time (s) | CLBs Used |
|---|---|---|---|---|---|---|
| 1 | 17.3 | 57.8 | 11.3 | 99-20=79 | 224-99=125 | 83 |
| 2 | 11.7 | 85.4 | 9.9 | 58-21=37 | 229-58=171 | 72 |
| 3 | 12.1 | 82.6 | 10.5 | 46-20=26 | 140-46=94 | 72 |
| 4 | 11.6 | 86.2 | 11.1 | 44-21=23 | 117-44=73 | 72 |
| 5 | 11.7 | 85.4 | 10.9 | 44-21=23 | 134-44=90 | 72 |
| 6 | 12.1 | 82.6 | 6.7 | 27-21=6 | 60-27=33 | 48 |
| 7 | 11.7 | 85.4 | 6.1 | 27-21=6 | 66-27=39 | 48 |
  
  
  
XC4010XLPC84-2 Processed with M1.4.12 (PAR -L4 -D5)

| Design Style | Counter Delay (ns) | Max Frequency (MHz) | Counter to MUX REG Delay (ns) | Placement Time (s) | Routing Time (s) | CLBs Used |
|---|---|---|---|---|---|---|
| 1 | 18.8 | 53.2 | 9.1 | 63-20=43 | 199-63=136 | 55 |
| 2 | 12.0 | 83.3 | 7.7 | 45-20=25 | 132-45=87 | 48 |
| 3 | 12.2 | 81.9 | 6.7 | 36-21=15 | 116-36=80 | 48 |
| 4 | 11.9 | 84.0 | 10.3 | 30-20=10 | 97-30=67 | 56 |
| 5 | 12.0 | 83.3 | 10.5 | 31-21=10 | 103-31=72 | 56 |
| 6 | 11.6 | 86.2 | 6.8 | 24-20=4 | 58-24=34 | 48 |
| 7 | 11.7 | 85.4 | 6.1 | 24-20=4 | 61-24=37 | 48 |
  
  
  
XC4010XLPC84-2 Processed with M1.4.12 (PAR -L5 -D5)

| Design Style | Counter Delay (ns) | Max Frequency (MHz) | Counter to MUX REG Delay (ns) | Placement Time (s) | Routing Time (s) | CLBs Used |
|---|---|---|---|---|---|---|
| 1 | 18.1 | 55.2 | 7.7 | 105-21=84 | 257-105=152 | 55 |
| 2 | 12.0 | 83.3 | 6.7 | 72-21=51 | 199-72=127 | 48 |
| 3 | 11.8 | 84.7 | 6.8 | 55-21=34 | 138-55=83 | 48 |
| 4 | 12.1 | 82.6 | 10.5 | 40-21=19 | 148-40=108 | 56 |
| 5 | 12.1 | 82.6 | 10.6 | 40-20=20 | 102-40=62 | 56 |
| 6 | 12.1 | 82.6 | 6.7 | 29-22=7 | 61-29=32 | 48 |
| 7 | 11.7 | 85.4 | 6.1 | 27-21=6 | 66-27=39 | 48 |

Interpreting the Floorplan Pictures
The full manual has all the pictures for all 8 of the above tables of data. This 
page only has the pictures for the last table, which is the M1 PAR V1.4.12 run 
with -L 5 and -D 5, representing high effort in both placer and router. 
At the time of writing this page, the XC4000XL is Xilinx's leading FPGA family, 
and the M1 PAR version 1.4.12 is the current version of the place and route 
software. 
The color coding of the following Floorplans is as follows: 
  - All the pictures are of XC4010XL devices, which is an array of 20 by 20 
    CLBs. These are represented by small squares. If it is empty, the CLB is 
    not used.
  - Within each CLB, colored squares on the left are F & G function generators, 
    colored squares on the right are the flip-flops, and a colored rectangle in 
    the middle represents the H function generator.
  - If a square is colored blue, then it is being used.
  - If a square is colored yellow, then it is a function generator, and the 
    carry logic is active.
  - If a square is colored magenta, then it is a function generator, and it is 
    being used for single ported RAM.
  - If a square is colored red, then it is a function generator, and it is 
    being used for dual ported RAM.
  - If a square is colored green, then it is a function generator, and it is 
    being used for ROM.
  - If an I/O cell is colored red, then it is being used for a global clock 
    buffer.
  - An "X" over an I/O cell indicates an I/O cell that is not bonded to a 
    package pin.
  - An inward pointing arrow on an I/O cell indicates usage as an input.
  - An outward pointing arrow on an I/O cell indicates usage as an output.
  - If an I/O or CLB cell has a gray background, then it means that there was 
    placement control used on that location.
  
  
  
XC4010XL-S1-F
The 4 counters are binary ripple counters (CB16CE) from the Xilinx unified
library XC4000E. The multiplexer and output register are also taken from this
library. There is no Floorplanning in this style, and the ripple counter,
while available in the library, is a poor choice.
This is also what you will get from synthesis if it does not know about the
carry logic in the XC4000 families.
  
XC4010XL-S2-F
The 4 counters are binary counters that use the built-in carry logic (CC16CE)
from the Xilinx unified library XC4000E. The multiplexer and output register
are also taken from this library. While there is no explicit Floorplanning in
this style, the counters include internal Floorplanning, because the carry
logic imposes a column structure on the counters.
This is also what you will get from synthesis if it knows about carry logic,
but you do not do any Floorplanning. While the performance for this style is
not too bad for this example, when a chip is used at 50% or more, the lack of
Floorplanning can seriously degrade performance, and routing times may become
very long.
  
XC4010XL-S3-F
This style adds four RLOC_ORIGIN Floorplanning constraints to the style 2
design, placing the four counters in adjacent columns and aligning the MSBs of
the counters (and all other bits).
The Floorplanning is shown by the gray background of the four columns that
contain the counters. Since the multiplexer is not Floorplanned, it shows up as
the CLBs that have logic in them but a white background.
  
XC4010XL-S4-F
This style replaces the un-Floorplanned output register of the previous styles
with a Floorplanned register, and places it in the column to the right of the
fourth counter. It is also aligned with regard to bit positions.
Note that the multiplexer logic is still scattered all around the Floorplanned
core. Although there is room in the Floorplanned output register CLBs to merge
some of the multiplexer, the mapper in the current version of the M1 software
will not do this.
  
XC4010XL-S5-F
This style is like style 4, except the output register is placed in the column
to the right of the column used for the register in style 4.
This opened up a column for the placer to move the multiplexer into. It looks
like half of the 16 bits of multiplexer logic have been moved into this area,
and half are still floating about. Merging the multiplexer into the
Floorplanned output register CLBs has not happened.
  
XC4010XL-S6-F
This style uses a Floorplanned multiplexer and output register built by the
FlibGen module generator, and places it in the two columns to the right of the
fourth counter. The odd bit multiplexers and output register flip-flops are in
one of these two columns, and the even bits are in the other column.
  
XC4010XL-S7-F
This style uses the same components as style 6, but the Floorplan has been
changed. The first two columns contain the first two counters, the next two
columns are the multiplexer and output register, and the last two columns
contain the third and fourth counters.
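To make the column arrangements of styles 3 and 7 a little more concrete, here is a rough Python sketch that writes out one RLOC_ORIGIN line per block, left to right. Everything specific in it is an assumption on my part: the instance names (count0, muxreg and so on) are made up, each counter is taken to be one column wide and the FlibGen multiplexer/register two columns wide, all blocks share row 1, and the constraint line format should be checked against the constraints documentation for your version of the tools.

```python
def rloc_origins(blocks, row=1, first_col=1):
    """Return one constraint line per block, packed left to right.

    blocks is a list of (instance_name, width_in_columns) pairs.
    Every block gets the same row, so equal bit positions line up
    (assuming the blocks use the same internal row layout).
    """
    lines = []
    col = first_col
    for name, width in blocks:
        # Line format is an assumption modeled on UCF-style constraints.
        lines.append(f'INST "{name}" RLOC_ORIGIN = R{row}C{col};')
        col += width
    return lines


# Style 3: four counters in adjacent columns (hypothetical names).
STYLE3 = [("count0", 1), ("count1", 1), ("count2", 1), ("count3", 1)]

# Style 7: two counters, the two-column mux/register, two counters.
STYLE7 = [("count0", 1), ("count1", 1), ("muxreg", 2),
          ("count2", 1), ("count3", 1)]

for line in rloc_origins(STYLE7):
    print(line)
```

Moving the multiplexer and register between the counter pairs, as in style 7, is then just a matter of reordering the list, which makes it cheap to try several arrangements.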
If you have read this page 
and found it useful, please send an email to philip@fliptronics.com 
 
Copyright © 1998, 1999, 2000, 2001, 2002 by Fliptronics. All rights reserved.
Fliptronics, Sunnyvale, CA 94086-7629, USA TEL: 408-737-0295, E-mail:
philip@fliptronics.com
                 |