Evaluation Engineering - 24

ATE

PROTECTING AI CHIPS FROM THERMAL CHALLENGES DURING ATE TEST
by Carl Peach and Yi Zhang

As the industry realizes the impact that artificial intelligence
(AI) can have on a wide range of applications, many companies are seeking to develop AI chips that will speed up
machine learning and other algorithm processing.
According to IDC, there are up to three dozen venture-funded
AI chip startups and in-house chip development initiatives
within large data center operators.
An AI system has three components: large data sets, machine
learning algorithms, and computing hardware to process the
data. The demand for computing power specifically has led to
a rapidly growing and highly competitive market for AI chips.
While AI chip designers are pushing die sizes toward the
reticle limits of the silicon manufacturing processes, innovative chip architectures are being introduced to maximize
performance per square millimeter of die area. This has led to a
rapid increase in device power density, which is quickly reaching
the thermal limits of the silicon processes and device packaging
technologies. It is projected that the amount of dark silicon, the
part of a silicon die that must be powered off to meet a given
thermal design power (TDP) constraint, may reach 50% to 80% at
7 nm. AI chip designers are investing considerable time and resources
in optimizing device heat dissipation and thermal management
to minimize the amount of dark silicon on their devices.

Thermal challenges in ATE test
For these devices, the switching activity from structural scan
test during device bring-up and volume production further increases the device current draw, which exacerbates the thermal
issue in ATE test. It is likely that core supply current in AI chips
will approach 900 A (~700 W) in 2019.
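To put the quoted figures in perspective, the short calculation below derives the core voltage implied by 900 A and roughly 700 W, and then estimates how much heat the power delivery path itself would dissipate. The path resistance used here is a purely hypothetical value chosen for illustration; it does not come from the article, and real socket or probe resistances will differ.

```python
# Figures quoted in the text
core_current_a = 900.0   # core supply current, amperes
core_power_w = 700.0     # approximate core power, watts

# Implied core supply voltage: V = P / I
core_voltage_v = core_power_w / core_current_a

# HYPOTHETICAL: total series resistance of the delivery path
# (all socket or probe contacts in parallel). Not a figure from the article.
path_resistance_ohm = 100e-6  # 100 micro-ohms, assumed

# Heat dissipated in the delivery path itself: P = I^2 * R
path_loss_w = core_current_a ** 2 * path_resistance_ohm

print(f"implied core voltage: {core_voltage_v:.2f} V")
print(f"loss in the assumed delivery path: {path_loss_w:.0f} W")
```

The implied supply voltage is about 0.78 V. Under the assumed 100 micro-ohm path, about 81 W would be dissipated in the contacts alone, which suggests why the interface hardware is exposed to thermal stress as well as the device.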
This is especially problematic in device evaluation and early
production stages, where devices are not fully debugged and
test programs are still in development. An unidentified device defect
or an error in the test program could cause thermal runaway,
a condition in which device internal resistances decrease,
leading to higher currents and higher temperatures. The higher temperatures
lower the internal resistances further, until thermal
damage occurs. This damages both the devices under test (DUTs)
and the ATE's front-end hardware (FEH), including sockets,
probes, and DIB interfaces. Thermal runaway results in both a
financial loss and, often more importantly, a delay in
time-to-market.

EVALUATION ENGINEERING JUNE 2019
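The feedback loop behind thermal runaway can be illustrated with a very small lumped thermal model. The sketch below is not the authors' model: it assumes that device power rises linearly with temperature, and every parameter except the 700 W starting power (thermal resistance, thermal capacitance, damage threshold, feedback coefficients) is a hypothetical value picked only to show the two regimes.

```python
def simulate(feedback_per_k, steps=20000, dt=0.001):
    """Euler-integrate a lumped thermal model.

    Returns (final_temperature_c, trip_time_s); trip_time_s is None when the
    temperature never reaches the assumed damage threshold.
    """
    t_amb = 25.0      # ambient temperature, deg C (assumed)
    r_th = 0.1        # thermal resistance to ambient, K/W (assumed)
    c_th = 5.0        # thermal capacitance, J/K (assumed)
    p0 = 700.0        # power at ambient temperature, W (figure from the text)
    t_limit = 150.0   # damage threshold, deg C (assumed)

    temp = t_amb
    for step in range(steps):
        # Power grows with temperature: lower resistance, higher current.
        power = p0 * (1.0 + feedback_per_k * (temp - t_amb))
        # Heat in minus heat removed through the thermal resistance.
        d_temp = (power - (temp - t_amb) / r_th) / c_th
        temp += d_temp * dt
        if temp >= t_limit:
            return temp, step * dt
    return temp, None


# Loop gain = p0 * feedback_per_k * r_th. Below 1 the device settles;
# above 1 each degree of heating adds more power than cooling removes.
stable_temp, stable_trip = simulate(feedback_per_k=0.005)    # loop gain 0.35
runaway_temp, runaway_trip = simulate(feedback_per_k=0.02)   # loop gain 1.4

print(f"stable case: settles near {stable_temp:.1f} C, trip time {stable_trip}")
print(f"runaway case: threshold reached after {runaway_trip:.2f} s")
```

With these assumed values, the second case crosses the threshold in under a second, which is one way to see why a protection mechanism has to respond much faster than an operator or a busy test program could.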

ATE thermal protection requirements for high-current AI chips
There are several key criteria that need to be considered for an
ATE thermal protection system:
Real-time shutdown of power supplies. The response time
of the protection solution needs to be much faster than other
operations such as pattern bursts. It cannot rely on the operating system, which may be busy with
test-related tasks at the worst possible time. To ensure that sockets are not damaged
during package test, a 100 ms response time may be adequate,
but probe testing requires a faster response, such as 50 ms.
Entire site needs to shut down. The solution also needs to
ensure that all of the power supplies feeding a
single site are shut down when a thermal event occurs.
Applicable to a wide range of AI chip designs. AI chips
can have a variety of thermal sensors at multiple locations on
a die, some accessible by analog measurements and some accessible
by register reads. Some chips do not include on-chip thermal
sensors, for time-to-market or other technical reasons. The ATE
thermal protection solution needs to be able to work in all
these scenarios.
No device yield impact. The protection mechanism cannot
impact device test yields or ATE instrument performance and
features. For example, if a shutdown mechanism fails all sites
because of a thermal issue at only one of the sites, test yields
would be affected.
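The site-level requirements above (shut down every supply feeding the affected site, and leave healthy sites running) can be sketched in a few lines. This is only an illustration of the decision logic under stated assumptions: the class names, supply names, and the 125 C limit are all hypothetical, and in a real tester this logic would have to live in hardware or firmware rather than in software running on the operating system, given the response times discussed earlier.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Supply:
    """One power supply channel (hypothetical stand-in for an ATE instrument)."""
    name: str
    enabled: bool = True

    def shut_down(self) -> None:
        self.enabled = False


@dataclass
class Site:
    """One test site and every supply that feeds it."""
    site_id: int
    supplies: List[Supply]
    temp_limit_c: float      # assumed trip threshold, deg C
    tripped: bool = False


class SiteThermalGuard:
    """Trips only the site whose reading reaches its limit."""

    def __init__(self, sites: List[Site]) -> None:
        self.sites: Dict[int, Site] = {s.site_id: s for s in sites}

    def on_reading(self, site_id: int, temp_c: float) -> bool:
        """Handle one reading; return True if this reading tripped the site."""
        site = self.sites[site_id]
        if site.tripped or temp_c < site.temp_limit_c:
            return False
        for supply in site.supplies:   # every supply feeding this site
            supply.shut_down()
        site.tripped = True
        return True


sites = [
    Site(0, [Supply("core0"), Supply("io0")], temp_limit_c=125.0),
    Site(1, [Supply("core1"), Supply("io1")], temp_limit_c=125.0),
]
guard = SiteThermalGuard(sites)
guard.on_reading(0, 131.0)   # site 0 overheats
guard.on_reading(1, 88.0)    # site 1 is healthy

for site in sites:
    states = {s.name: s.enabled for s in site.supplies}
    print(site.site_id, "tripped" if site.tripped else "running", states)
```

Keeping the trip state per site is the design choice that matters here: a thermal event on site 0 disables both of its supplies, while site 1 continues testing and its yield is unaffected.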
Applicable to single-site and multi-site test. In addition
to a shutdown of the core supplies, other supplies, instruments,
and channels connected to that site should be shut down as

