#### Resilience through Self-Configuration in the Future Massively Defective Nanochips

Piotr Zając (a,b), <u>Jacques Henri Collet</u> (a), Jean Arlat (a), and Yves Crouzet (a)

(a) LAAS-CNRS, Université de Toulouse, 7 av. du Colonel Roche, 31077 Toulouse, France

(b) Department of Microelectronics and Computer Science, Technical University of Lodz, Al. Politechniki 11, 93-590 Lodz, Poland

{pzajac, jacques.collet, jean.arlat, yves.crouzet}@laas.fr

We typically consider nanochips with several hundred billion transistors, as expected from the reduction of dimensions.

Which problems are expected?

1) The control of the physical complexity

2) The inexorable increase of the defect rate in nanochips due to the reduction of dimensions

## **Control of physical complexity**

**Response:** Massively replicative architectures



We consider massively replicative architectures with hundreds of cores, typically 100 to 1000 cores.

#### The increase of the defect rate in nanochips

Consequence: A large fraction of cores will be defective. We consider the issue of dependability in these replicative architectures when 10 to 40% of cores are defective.
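As a rough worked illustration of these orders of magnitude, assume (this is an assumption made here for simplicity) that each of the $N$ cores is defective independently with the same probability $p$. The expected number of usable cores is then

$$
\mathbb{E}[N_{\text{good}}] = (1 - p)\,N .
$$

For the ranges quoted above, $N = 100$ and $p = 0.4$ give about $60$ usable cores, while $N = 1000$ and $p = 0.1$ give about $900$.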



- 1) Self-diagnosis of cores by <u>mutual test</u> with as little external control as possible.
- 2) Self-configuration of communications.
- 3) Self-shutdown of defective (or inaccessible) cores.
- 4) Self-allocation of tasks at runtime (including redundancy).
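The first three points can be sketched with a small simulation. This is a minimal illustration, not the authors' implementation: it assumes a 2D mesh, assumes that a good core always diagnoses its neighbors correctly, and assumes that a link stays open only when both ends are good. The names `grid_neighbors`, `mutual_test`, and `connected_zones` are hypothetical.

```python
from collections import deque

def grid_neighbors(core, width, height):
    """Yield the (up to four) mesh neighbors of the core at (x, y)."""
    x, y = core
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)

def mutual_test(width, height, defective):
    """Map each good core to the neighbors it keeps talking to.

    Assumption: a good core tests each neighbor and cuts the link if the
    neighbor is defective. Verdicts issued by defective cores are ignored,
    because their good neighbors have already isolated them.
    """
    open_links = {}
    for x in range(width):
        for y in range(height):
            core = (x, y)
            if core in defective:
                continue
            open_links[core] = [
                nb for nb in grid_neighbors(core, width, height)
                if nb not in defective
            ]
    return open_links

def connected_zones(open_links):
    """Group good cores into zones connected through open links."""
    zones = []
    unvisited = set(open_links)
    while unvisited:
        start = unvisited.pop()
        zone = {start}
        queue = deque([start])
        while queue:
            core = queue.popleft()
            for nb in open_links[core]:
                if nb in unvisited:
                    unvisited.remove(nb)
                    zone.add(nb)
                    queue.append(nb)
        zones.append(zone)
    # Largest zone first.
    return sorted(zones, key=len, reverse=True)

# 3x3 mesh where two defective cores cut off the corner core (0, 0).
links = mutual_test(3, 3, defective={(1, 0), (0, 1)})
zones = connected_zones(links)
```

In this example the corner core is good but inaccessible from the rest of the chip, which is the situation covered by the self-shutdown of inaccessible cores.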



This architectural analysis is mostly independent of the technology! It holds for quantum dots, molecular electronics, or "classical" nanoelectronics. It is an architectural approach to dependability in massively replicative and defective chips.











#### Self-Configuration of routes



**Principle: Contract Net Protocol** 

**Step 1**: The IOP emits a Request Message (RM), which is broadcast across the Single Connected Zone (flooding, possibly limited to a propagation radius). Each core that forwards the message records the route in the message's route field.

**Step 2**: Each core that receives the RM sends one Acknowledgment message back to the IOP; this message follows the RM route in the opposite direction.

**Step 3**: The IOP (i.e., the emitter) stores the discovered routes in a special buffer, the Valid Route Buffer.
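The three steps can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the zone is given as an adjacency dictionary, each core forwards only the first copy of the request it receives, and the propagation radius is counted in hops. The function `discover_routes` and its parameters are hypothetical names.

```python
from collections import deque

def discover_routes(links, iop, radius=None):
    """Flood a request from `iop` and return the Valid Route Buffer.

    links  : dict mapping each core to the cores it can talk to
    iop    : the emitter of the request message
    radius : optional maximum number of hops (None means unlimited)
    """
    valid_route_buffer = {}
    seen = {iop}                      # cores that already got the request
    queue = deque([(iop, [iop])])     # (receiving core, route field so far)
    while queue:
        core, route = queue.popleft()
        if core != iop:
            # Step 2: the acknowledgment follows the request route backwards.
            ack_path = route[::-1]
            # Step 3: the emitter stores the route, restored to forward order.
            valid_route_buffer[core] = ack_path[::-1]
        if radius is not None and len(route) - 1 >= radius:
            continue                  # propagation radius reached
        # Step 1: forward the request, appending the next hop to the route.
        for nb in links.get(core, ()):
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, route + [nb]))
    return valid_route_buffer

# Small hypothetical zone: IOP - A - C - D, plus B attached to the IOP.
zone = {
    "IOP": ["A", "B"],
    "A": ["IOP", "C"],
    "B": ["IOP"],
    "C": ["A", "D"],
    "D": ["C"],
}
routes = discover_routes(zone, "IOP")
nearby = discover_routes(zone, "IOP", radius=2)
```

Because the queue is processed in arrival order, the stored route to each core is one with the fewest hops in this model. With `radius=2`, core `D` (three hops away) is not discovered.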

# Route discovery efficiency of the flooding mechanism in a 2D-square network



## Route efficiency versus node connectivity



### Conclusion



1) It is an architectural approach to resilience in large arrays of cores, mostly independent of the node technology.

2) The keywords for massively defective nanochips with hundreds of billions of transistors are AUTONOMY, SELF-DIAGNOSIS, and SELF-CONFIGURATION of COMMUNICATIONS (black-box like chip).

a) Diagnosis through mutual test to split the chip into single connected zones, isolating the defective nodes (or the clusters of defective nodes).

b) Self-configuration of routes (i.e., route discovery) by broadcasting a message and recording the route on the fly in the message's route field.