Introduction
One can think of a clock in an SoC as analogous to the heartbeat in humans. In most chip designs today, we run clocks at fixed frequencies that guarantee correct silicon operation in all corners. This is similar to the human body operating with a constant heartbeat tuned to the most stressful conditions. As we all know, our pulse rates vary under a closed feedback loop with the surrounding operating conditions (emotional or physical), leading to a high degree of energy and performance efficiency. The concept of sleeping with a 160 bpm pulse rate seems strange to us, so why does the same concept not seem strange for SoC operation? In our pursuit of a safe margin of error under worst-case conditions, we leave efficiency (lowest power at highest performance) on the table during other, more benign, operating conditions.
Owing to process variations, different instances of the same circuit exhibit significant power/performance variability, even when they operate under the same temperature and voltage conditions. This variability gets worse with the fluctuating voltage (e.g. IR drops) and temperature (e.g. hot spots) introduced by the SoC's operation. If one can imagine that every circuit segment consisting of a register-to-register datapath has a variable natural frequency of data transfer, would it not be wonderful to have runtime performance closely track variability in real time? Going further, would it not be even better to manage SoC power through voltage regulation based on workload demand, and let the circuit immediately adjust to its best natural frequency for those local conditions? This vision would take the concept of adaptive voltage and frequency scaling closer to its physical limit.
Elastix realizes this vision by providing solutions to reduce power, increase performance and improve robustness of a circuit under varying operating conditions. It is based on a proprietary technology that enables the circuit to keep track of its variability and let it approach its natural working frequency. The technology can be applied with a design flow and tool set that is practically identical to the one used for synchronous circuits. Starting from the layout of a synchronous circuit, the resulting circuit will resemble the original one, with the inclusion of a few incremental changes that will incorporate elasticity in the clock, and be guaranteed to be functionally equivalent to the original version.
Elastic clocks run at their natural heart rate
Some of the variability factors of a circuit are random. For
example, the dopant concentration at every transistor channel may vary from
transistor to transistor regardless of their location. However, a significant
part of the variability exhibits spatial correlation. For example, temperature
affects nearby transistors in a similar way across large regions of the
die. Likewise, voltage variations have a similar effect on transistors
that are in close proximity, since they often share a power supply branch.
If the frequency of a circuit could be dynamically adapted,
it would ideally be possible to find the optimal clock period for every cycle
given the process variations and the current voltage and temperature conditions.
We will refer to this optimal frequency as the natural frequency of the
circuit. Elastix provides a technology that enables every circuit to approach
its natural frequency at every clock cycle. This technology is based on the
concept of Elastic Clocks.
Elastic Clocks rely heavily on spatial correlation to
provide an efficient solution. In its simplest form, an Elastic Clock can be
viewed as an adaptable oscillator placed close to the critical
paths, with a period that closely matches the delay of the circuit at every
cycle. Thus, the margins needed to ensure safe operation can be reduced and the
clock period can be dynamically adjusted to greatly reduce the waste of
performance and power.
In contrast, conventional rigid clocks are generated
by electronic oscillators regulated by external quartz crystals that are
unaware of the internal variability of the circuit. This is the reason why
overly conservative guard margins must be added to the clock period to cover
the circuit delays under any operating conditions.
Elastic Clocks trigger the activity of elastic modules.
However, these modules cannot be considered in isolation. They must communicate
reliably with their neighbors. Elastix provides the technology that
enables the elastic modules to communicate consistently with their environment.
Elastic handshakes harmonize the communication between modules
An elastic system may be composed of several elastic modules
that need to communicate with each other, e.g., execution unit, floating-point
unit, instruction memory, etc. To track variability and increase
local proximity, each module may have its own Elastic Clock.
When two modules need to exchange data, their Elastic Clocks
must synchronize for reliable communication. Elastix provides proprietary
solutions for two types of scenarios:
- Tightly-coupled communication, which is suitable for those
modules that have a cycle-by-cycle data exchange. An example of this scenario
would be the communication between an instruction cache and the execution unit.
- Loosely-coupled communication, which is suitable for those
modules that have occasional data exchange. This situation mostly occurs in SoC
designs in which the IP cores interact sporadically. This scenario will be
further discussed in a forthcoming section (Elastic SoCs).
For reliable and sustained tightly-coupled communication
between two modules, the Elastic Clocks must reconcile their frequencies and
synchronize at known time instants. In this way, the flip-flops receiving data
from the sender have a reliable time reference for sampling the incoming data.
This reconciliation is performed by means of elastic handshakes
that force the Elastic Clocks to wait for each other at agreed time instants.
The Elastix design flow and tools automatically insert the logic implementing
these handshakes, and generate timing constraints (in the standard SDC format)
that ensure correct synchronization at all corners after physical design.
The synchronization among different Elastic Clocks in a
complex system implies that the frequency of the system will be determined by
the module whose Elastic Clock has the lowest frequency. This module may
change dynamically because of the varying operating conditions, e.g., one
module may become critical during a certain period of time, whereas another module
may become critical later. However, the matching of frequencies through the
elastic handshakes will make the circuit work safely regardless of the operating
conditions of each individual module.
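
As a rough illustration of the paragraph above, the following Python sketch
treats the system period in each cycle as the largest natural period among the
tightly-coupled modules, and reports which module is critical in that cycle.
The module names and per-cycle periods are made-up values, and taking a plain
maximum is a simplifying assumption about what the elastic handshakes achieve,
not a description of the Elastix implementation.

```python
# Hypothetical per-cycle natural periods (in ns) for three elastic modules.
# The numbers are invented for illustration only.
natural_periods_ns = {
    "execution_unit": [1.00, 1.05, 1.20, 1.10],
    "fpu":            [1.10, 1.00, 0.95, 1.15],
    "icache":         [0.90, 1.10, 1.00, 1.00],
}


def system_periods(periods_by_module):
    """Return, per cycle, the system period and the module that sets it."""
    names = list(periods_by_module)
    result = []
    # zip(*...) regroups the per-module lists into one tuple per cycle.
    for cycle in zip(*periods_by_module.values()):
        slowest = max(range(len(names)), key=lambda i: cycle[i])
        result.append((cycle[slowest], names[slowest]))
    return result


schedule = system_periods(natural_periods_ns)
for i, (period, critical) in enumerate(schedule):
    print(f"cycle {i}: period = {period:.2f} ns, critical module = {critical}")
```

In this toy data the critical module changes from cycle to cycle, which is the
behavior the text describes.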
The figure below gives a more detailed view of the implementation
of matched delays between pairs of handshake controllers. Delays in the forward
and backward directions are included to adjust the arrival of the request
and acknowledge signals of the handshake.
To track variability, the matched delays are physically located close
to the critical paths of the logic, thus increasing the spatial
correlation between them.
Elastic Clocks can track short-term variability
Well-established techniques already exist to scale the voltage
and the frequency of a system for better adaptation to process
variability and operating conditions (AVS and DVFS). These techniques are based
on open-loop and closed-loop control to adjust voltage and frequency. An essential
component of these systems is the voltage regulator that can respond to the
requirements of a system and provide the supply voltage at a specific value. However,
voltage regulators have long response times, on the order of microseconds, and
cannot react to short-term variability.
Elastic Clocks add value to the existing
techniques: the frequency (clock period) can be scaled at every cycle. This
capability can be exploited in different situations. For
example, IR drops can be absorbed immediately by the Elastic Clocks,
which adjust the period to accommodate the resulting delay variations in the
circuit. Compared to the existing techniques, Elastic Clocks can reduce the margins
required for IR drops and, additionally, provide more robustness against
unexpected voltage fluctuations, because the clock period immediately
stretches to cover the extra delay incurred by the logic
in the circuit.
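
The margin argument can be made concrete with a small Python sketch. It
compares a rigid clock, whose single period is sized up front with a worst-case
guard band, against a clock whose period follows the delay seen in each cycle.
The delay trace and both margin percentages are assumptions chosen only to
illustrate the idea; they are not measurements of any Elastix design.

```python
# Hypothetical critical-path delay per cycle (ns); cycles 3-4 model an IR drop.
path_delay_ns = [1.00, 1.00, 1.02, 1.25, 1.18, 1.03, 1.00, 1.00]

FIXED_MARGIN = 0.30    # assumed worst-case guard band for a rigid clock (30%)
ELASTIC_MARGIN = 0.05  # assumed residual margin for a tracking clock (5%)

nominal_ns = min(path_delay_ns)

# Rigid clock: one period for every cycle, sized for the worst case up front.
fixed_period_ns = nominal_ns * (1 + FIXED_MARGIN)
fixed_total_ns = fixed_period_ns * len(path_delay_ns)

# Tracking clock: each period follows the delay observed in that cycle.
elastic_periods_ns = [d * (1 + ELASTIC_MARGIN) for d in path_delay_ns]
elastic_total_ns = sum(elastic_periods_ns)

# Both schemes must cover the real delay in every cycle to be safe.
fixed_safe = all(fixed_period_ns >= d for d in path_delay_ns)
elastic_safe = all(p >= d for p, d in zip(elastic_periods_ns, path_delay_ns))

print(f"rigid:   safe={fixed_safe}, total time = {fixed_total_ns:.2f} ns")
print(f"elastic: safe={elastic_safe}, total time = {elastic_total_ns:.2f} ns")
```

Under these assumed numbers both clocks stay safe during the voltage droop,
but the tracking clock only pays for the droop in the cycles where it occurs.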
Elastic Clocks can also adapt to variable-delay
computations. For example, the delay of a multiplier can occasionally be
reduced if dedicated logic detects that one of the operands is zero. A
similar situation occurs when carries propagate quickly in
arithmetic operations with short operands. With the capability of changing the
clock period at every cycle, Elastic Clocks can also react immediately to these
predictable situations.
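
A minimal Python sketch of the zero-operand case follows. The function name
`multiply_period_ns` and the two delay values are hypothetical; the point is
only that a per-cycle period can follow a condition that simple logic is able
to detect in advance.

```python
def multiply_period_ns(a, b, full_ns=2.0, fast_ns=0.5):
    """Return the period one multiplication needs (hypothetical delays)."""
    # A zero operand makes the product trivially zero, so the long
    # partial-product path is never exercised in that cycle.
    return fast_ns if a == 0 or b == 0 else full_ns


operands = [(3, 7), (0, 9), (5, 0), (4, 4)]

# Per-cycle period chosen from the operands versus one worst-case period.
elastic_ns = sum(multiply_period_ns(a, b) for a, b in operands)
rigid_ns = 2.0 * len(operands)

print(f"variable period: {elastic_ns} ns, fixed period: {rigid_ns} ns")
```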
Adjusting voltage and frequency for just-in-time delivery
A useful scenario for applying Elastic Clocks to
reduce power through (dynamic) adaptive voltage scaling is to use the Elastic
Clocks as distributed performance monitors. Unlike the ring oscillators
and critical path replicas used in previously available AVS solutions, the
Elastic Clocks are:
- Placed very close to the logic they must track, in order to
benefit from smaller OCV margins due to local spatial correlation.
- All kept in a "flexible lockstep" thanks to the handshaking
structure, rather than in the fixed lockstep imposed by the skew constraints
of a traditional clock distribution network. This ensures, among other things,
lower on-chip noise and lower electromagnetic emission (also called EMI or
EME) due to the temporal spreading of clock edges and the resulting chip
activity.
- Automatically taking into account the clock skew and the data
skew resulting from supply voltage changes, thus ensuring much better
robustness in cases where multiple voltage supplies must be used (e.g. for
memories and logic).
When two communicating blocks in a voltage scaling scenario exchange
data and are kept at different supply voltages, those supply voltages can be
independently tuned in a closed-loop control fashion, by monitoring the relative phase
difference between the handshaking signals. The voltage of the "late" block can
be increased, in order to compensate for its slower speed, or the voltage of the
"early" block can be lowered to reduce its power consumption.
This mechanism, in particular, ensures "just-in-time"
communication between an elasticized and a synchronous block, where now the
phase difference is measured between the clock and the handshake. This
effectively forces the elastic block to communicate correctly with the
synchronous block under every operating condition, but always at the lowest
voltage that does not result in timing violations. Process variations and
long-term operating condition variations (temperature changes, residual ripple
from the power supply) can be compensated, while short-term variations can
sometimes be absorbed by elastic clocks due to a dynamic useful skew mechanism,
if they occur far enough from the interfaces with the synchronous world.
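
The closed-loop tuning described above can be sketched in Python as a simple
step controller. Everything in this sketch is an assumption made for
illustration: the toy delay model in `block_delay_ns`, the step size, the
deadband and the voltage limits are not taken from the Elastix technology. The
controller raises the supply while the block is late, lowers it while the
block is comfortably early, and stops once the block arrives just in time.

```python
def block_delay_ns(vdd, k=0.6, vth=0.3):
    """Toy delay model (assumption): delay shrinks as vdd rises above vth."""
    return k / (vdd - vth)


def tune_voltage(target_delay_ns, vdd=1.0, step=0.005, deadband_ns=0.01,
                 vmin=0.6, vmax=1.2, max_iters=500):
    """Step vdd until the block's handshake arrives just in time.

    phase > 0 means the block is late relative to its partner;
    phase < 0 means it is early and can give up some voltage.
    """
    for _ in range(max_iters):
        phase = block_delay_ns(vdd) - target_delay_ns
        if phase > 0:
            vdd = min(vmax, vdd + step)   # late: raise voltage to speed up
        elif phase < -deadband_ns:
            vdd = max(vmin, vdd - step)   # early: lower voltage to save power
        else:
            break                         # inside the deadband: just in time
    return vdd


v_from_high = tune_voltage(target_delay_ns=1.0, vdd=1.0)
v_from_low = tune_voltage(target_delay_ns=1.0, vdd=0.7)
print(f"settled at {v_from_high:.3f} V (from 1.0 V) "
      f"and {v_from_low:.3f} V (from 0.7 V)")
```

The deadband is chosen wider than the delay change caused by one voltage step,
so the loop settles instead of oscillating around the target.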
As an additional feature, elastic circuits can also be run
in synchronous mode, where the matched delay lines between the controllers now
act purely like process monitors in traditional AVS. This provides additional
risk reduction during the early phases of the adoption of the Elastix
technology. In other words, if there are concerns about the long-term
reliability of the chip when running in elastic mode, it can still be run
purely in synchronous AVS mode at the flip of a switch.
Elastic SoCs and system-level power management
SoCs are composed of heterogeneous IP cores that implement
different tasks in a collaborative way. In the same SoC we can find
microprocessors, co-processors with DSP and multimedia functions, memory
controllers, UARTs, etc. These cores may even come from different suppliers and
have different specifications in terms of clock frequency, supply voltage and
power consumption.
An SoC typically has an interconnect network that provides
the communication layer among different tasks executed in various cores. The
communication must be performed in such a way that the timing of each
transaction guarantees no synchronization failures in the system. In
synchronous SoCs, two schemes are used for robust communication:
- Harmonize the frequencies of the communicating cores. For
example, the SoC can be designed with a master clock of 800 MHz with all the
cores working at rationally related frequencies (e.g. 800, 400 and 200 MHz).
This scheme ensures robust communication but prevents each core from working
at its optimal speed. Timing convergence is another important design issue when
the same clock must be distributed within a large chip.
- Use asynchronous interfaces between independent clock domains.
This scheme is typically implemented using flip-flop synchronizers or
asynchronous FIFOs. Even though timing convergence becomes much simpler with
this scheme, a significant latency penalty is incurred at each transaction,
usually with a significant negative impact on performance. This impact is
especially severe when the synchronization is performed in critical loops of the
system.
Elastix offers a solution for SoC communication that
inherits the best features of the previous schemes and avoids their
deficiencies. By having an SoC with elastic cores, Elastix provides a
proprietary interface (Elastic FIFOs) that enables each core to run at its
natural frequency while still retaining low-latency communication and re-using
existing communication fabrics (e.g. AHB- or AXI-based). Thus, Elastix offers a
solution that reconciles modularity, timing convergence and efficiency.
Elastic SoCs open the door to new and more efficient power
management techniques. The power consumption of each core can be simply
adjusted by changing the supply voltage, since the frequency of the Elastic
Clocks will be automatically adapted without any external intervention. With
the elastic interfaces that isolate clock domains, power management schemes at
any level of granularity can be devised and larger power savings can be obtained.
Design flow
The Elastix technology can be incorporated into a circuit
with a non-disruptive design flow. The designer can use their preferred EDA
flow to produce a synchronous netlist with a physical placement and without
clock trees. At this point, the designer may already have optimized the layout
for area, performance and/or power. The
design may contain conventional structures for low power (e.g. clock gating)
and test (e.g. scan chains and BIST). These structures will be preserved and
supported after the insertion of the Elastic Clocks.
Elasticity is incorporated in three steps:
- The circuit is partitioned into elastic modules and an Elastic Clock tree is
synthesized for each module.
- The handshakes between neighboring clocks are created.
- The delays of the Elastic Clocks are synthesized.
After the insertion of the Elastic Clocks, the final layout
can be synthesized.
Timing and matched delays
The local controllers that generate the Elastic Clocks must synchronize in a way
that the data transfers between the elastic modules are robust. For this reason,
the synthesis of the matched delays between the controllers is an essential
step in the design of elastic circuits.
The two figures below represent the setup and hold constraints related to the
data transfer between two registers. In these constraints, two paths are always
compared. For correct timing, one of them (in red) must be faster than
the other (in green).
These constraints are applied across all corners and on-chip variability (OCV)
conditions, both to synthesize the matched delays and for post-layout sign-off.
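
As a generic illustration of such two-path comparisons, the Python sketch below
computes setup and hold slack for a register-to-register transfer using the
usual textbook formulation. It is not a reproduction of the figures: the
`TransferTiming` fields and all numeric values are assumptions, and in an
elastic circuit the role of `period_ns` would presumably be played by the
matched delay between the two controllers.

```python
from dataclasses import dataclass


@dataclass
class TransferTiming:
    launch_clk_ns: float   # clock arrival at the launching register
    capture_clk_ns: float  # clock arrival at the capturing register
    data_max_ns: float     # slowest data path (clock-to-q plus logic)
    data_min_ns: float     # fastest data path (clock-to-q plus logic)
    period_ns: float       # time between launch edge and next capture edge
    setup_ns: float        # setup time of the capturing register
    hold_ns: float         # hold time of the capturing register


def setup_slack(t):
    # The slowest data must arrive before the next capture edge.
    required = t.capture_clk_ns + t.period_ns - t.setup_ns
    arrival = t.launch_clk_ns + t.data_max_ns
    return required - arrival


def hold_slack(t):
    # The fastest new data must not overtake the edge capturing the old data.
    arrival = t.launch_clk_ns + t.data_min_ns
    required = t.capture_clk_ns + t.hold_ns
    return arrival - required


example = TransferTiming(launch_clk_ns=0.20, capture_clk_ns=0.25,
                         data_max_ns=0.90, data_min_ns=0.15,
                         period_ns=1.00, setup_ns=0.05, hold_ns=0.03)

print(f"setup slack = {setup_slack(example):.3f} ns")
print(f"hold slack  = {hold_slack(example):.3f} ns")
```

A positive slack in both checks means the faster path wins each comparison
with room to spare; sign-off would repeat the checks for every corner.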
Verification
Verification is an essential step in any EDA flow and
Elastix provides a complete flow to ensure the correctness of the circuit. An
elastic circuit can be validated using a simulation setup similar to that of
a conventional synchronous circuit, reusing the same testbenches and
assertions.
Additionally, Elastix provides a flow that guarantees
full coverage to catch both functional and timing-related bugs in the
controllers and their interaction with the datapath. This flow is based on
formal verification tools and is performed in two steps:
- Functional verification, aimed at proving the functional equivalence
of the circuit with respect to the original synchronous one. Functional
equivalence can be proven with existing equivalence checking tools. Additionally,
the handshake protocol implemented by the Elastic Clocks is verified using Assertion-Based
Verification tools, where the assertions guarantee the correct generation of clock
events.
- Timing sign-off, aimed at checking that data are captured at the
sequential elements without violating any setup/hold constraint. A set of SDC
constraints is generated and verified for each design.
Summary
Elastix provides a technology that addresses one of the most
challenging problems in IC design today: power consumption. By embracing the
inherent variability manifested on every chip and incorporating it into the
clocks that synchronize the data transfers, Elastix offers a solution that
helps reduce the conservative margins required to operate at
worst-case conditions. Instead, the Elastic Clocks keep track of the
variability and adapt accordingly, so as to invest only the required energy at
every time instant. This technology can be adopted by using non-disruptive EDA
methods based on the existing commercial flows.