



# On Improving the Robustness of Convolutional Neural Networks

Juan-Carlos Ruiz-García

ITACA-UPV (Spain) jcruizg@disca.upv.es

IFIP WG 10.4 SUMMER MEETING 27th-30th JUNE 2024, GOLD COAST (AUSTRALIA)



## **CNNs in Cyber-Physical Systems**

#### Use of convolutional neural networks (CNNs) for environmental sensing

Examples: transportation, space exploration

- Real-time constraints on decisions → need for local inference → use of HW accelerators to support, totally or partially, the CNN inference process
- In edge devices, robustness must be considered with attention to PPA properties (performance, power consumption and area)



#### **Prototyping HW-based CNN acceleration solutions for edge devices**

#### Graphics processing unit (**GPU**)

- ✓ Performance
- Energy consumption

#### Tensor processing unit (**TPU**)

- ✓ Performance
- Energy consumption
- × Flexibility

#### Field-programmable gate array (**FPGA**)

- ✓ Performance per watt
- ✓ Flexibility and adaptability
- × Design











```
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- RTL design: dual-port register file (two read ports, one write port)
entity DualPortRegisterFile is
   Generic (ADDRESS_SIZE  : POSITIVE;
            REGISTER_SIZE : POSITIVE);
   Port ( rst_i       : in  STD_LOGIC;
          clk_i       : in  STD_LOGIC;
          en_i        : in  STD_LOGIC;
          write_en_i  : in  STD_LOGIC;
          readReg1_i  : in  STD_LOGIC_VECTOR (ADDRESS_SIZE-1 downto 0);
          readReg2_i  : in  STD_LOGIC_VECTOR (ADDRESS_SIZE-1 downto 0);
          writeReg_i  : in  STD_LOGIC_VECTOR (ADDRESS_SIZE-1 downto 0);
          writeData_i : in  STD_LOGIC_VECTOR (REGISTER_SIZE-1 downto 0);
          readData1_o : out STD_LOGIC_VECTOR (REGISTER_SIZE-1 downto 0);
          readData2_o : out STD_LOGIC_VECTOR (REGISTER_SIZE-1 downto 0));
end DualPortRegisterFile;

architecture Behavioral of DualPortRegisterFile is
   type RegFile is array (0 to (2**ADDRESS_SIZE)-1) of STD_LOGIC_VECTOR (REGISTER_SIZE-1 downto 0);
   signal registers : RegFile := (others => (others => '0'));
begin
   -- Asynchronous reset; synchronous write gated by en_i and write_en_i
   process(rst_i, clk_i)
   begin
       if rst_i = '1' then
          registers <= (others => (others => '0'));
       elsif rising_edge(clk_i) then
           if en_i = '1' then
               if write_en_i = '1' then
                  registers(to_integer(unsigned(writeReg_i))) <= writeData_i;
               end if;
           end if;
       end if;
   end process;

   -- Combinational (asynchronous) reads on both ports
   readData1_o <= registers(to_integer(unsigned(readReg1_i)));
   readData2_o <= registers(to_integer(unsigned(readReg2_i)));
end Behavioral;
```














- Multiplicity and location of faults (single vs. multiple faults, accidental vs. malicious faults)
- Duration of fault injection campaigns: use of statistical fault injection
- Fault injection process (reproducibility, representativeness, CNN optimizations)
  - Which low-level faults can be emulated using high-level models? How?
  - How do CNN optimizations impact the fault injection process?
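
To give a sense of how statistical fault injection bounds campaign duration, the sketch below sizes a random sample of faults instead of injecting the whole fault space. It assumes the sample-size formula commonly attributed to Leveugle et al. (DATE 2009), which the slides do not spell out; the function name `sample_size` and its default parameters (1% error margin, 95% confidence, worst-case `p = 0.5`) are illustrative choices, not values taken from this talk.

```python
import math


def sample_size(population: int, margin: float = 0.01,
                t: float = 1.96, p: float = 0.5) -> int:
    """Number of faults to inject for a given error margin and confidence.

    Assumed formula: n = N / (1 + e^2 * (N - 1) / (t^2 * p * (1 - p)))
      population (N): size of the fault space (e.g. bits x injection instants)
      margin (e):     accepted error margin on the estimated failure rate
      t:              cut-off for the confidence level (1.96 for about 95%)
      p:              estimated failure probability (0.5 is the worst case)
    """
    if population <= 0:
        raise ValueError("population must be positive")
    denominator = 1.0 + (margin ** 2) * (population - 1) / (t ** 2 * p * (1.0 - p))
    return math.ceil(population / denominator)


if __name__ == "__main__":
    # Hypothetical fault space of one billion (bit, instant) pairs.
    print(sample_size(10 ** 9))
```

Under these assumptions the required sample grows very slowly with the size of the fault space, which is what makes large CNN parameter memories tractable.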

On improving the robustness of CNNs using In-Parameter Zero-Space ECCs Juan Carlos Ruiz García, (jcruizg@disca.upv.es) Workshop VERDI @ DSN2024 June 24<sup>th</sup>, Brisbane (Australia)








\* Note: The concrete division between red and green bits will vary from one CNN to another

[SAFECOMP 2024]<sup>2</sup> Use of non-significant and invariant bits for BF16 CNN protection with ECCs

[Figure: BF16 parameter bit layout (bits 15 down to 0), distinguishing invariant bits (no protection required, usable to store the ECC) from the bits to protect.]

\* Note: The concrete division between blue, red and green bits will vary from one CNN to another

<sup>1</sup> [EDCC 2024] Juan Carlos Ruiz, David de Andrés, Luis J. Saiz-Adalid, Joaquin Gracia-Moran: Zero-Space In-Weight and In-Bias Protection for Floating-Point-based CNNs. EDCC 2024: 89-96, Leuven (Belgium), April 2024.

<sup>2</sup> [SAFECOMP 2024] Juan Carlos Ruiz, David de Andrés, Luis J. Saiz-Adalid, Joaquin Gracia-Moran: In-Memory Zero-Space Floating-Point-based CNN Protection Using Non-Significant and Invariant Bits. SAFECOMP 2024, Florence (Italy), September 2024.








### Already known facts



#### INT8 CNNs are more robust than FP32/BF16 CNNs to single bitflips

- FP32: 22.84 → 01000001101101101011100001010010
  - → 01100001101101101011100001010010 (**≈ 4.2132 × 10<sup>20</sup>!!**)
- INT8: 68 → 01000100 → 01100100 (100)
- This may not be true in the case of multiple faults
  - FP32: 22.84 → 01000001101101101011100001010010
    - → bitflips confined to the low-order mantissa bits (no effect on CNN accuracy)
  - INT8: 68 → 01000100
    - → 10100100 → potential effect on CNN accuracy
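
The bitflip arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard `struct` module; the helper names (`fp32_bits`, `flip_fp32`, `flip_int8`) are hypothetical, INT8 values are treated as raw 8-bit patterns, and the choice of the three lowest mantissa bits for the multi-bit FP32 case is an assumption made for illustration.

```python
import struct


def fp32_bits(value: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of value."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def flip_fp32(value: float, mask: int) -> float:
    """Flip the bits selected by mask in the FP32 encoding of value."""
    flipped = fp32_bits(value) ^ mask
    return struct.unpack(">f", struct.pack(">I", flipped))[0]


def flip_int8(pattern: int, mask: int) -> int:
    """Flip the bits selected by mask in a raw 8-bit pattern."""
    return (pattern ^ mask) & 0xFF


if __name__ == "__main__":
    print(format(fp32_bits(22.84), "032b"))   # FP32 encoding of 22.84
    print(flip_fp32(22.84, 1 << 29))          # single flip in the exponent
    print(flip_fp32(22.84, 0b111))            # three flips in low mantissa bits
    print(flip_int8(68, 1 << 5))              # single flip: 68 -> 100
    print(flip_int8(68, 0b11100000))          # three flips in the high bits
```

A single flip in bit 29 (an exponent bit) scales 22.84 by 2<sup>64</sup>, whereas three flips in the lowest mantissa bits change it by only a few units in the last place. In INT8 every bit carries a comparatively large weight, so flipping the three high bits turns 01000100 into 10100100.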




### Importance of multi-bitflips

Accidental faults: the number of bits altered by a single ionizing particle increases as CMOS integration scales up and voltage is reduced




#### Malicious faults: a small number (5-10) of flipped bits in the parameters can be enough to crush a CNN

Adnan Siraj Rakin, Zhezhi He, and Deliang Fan, "Bitflip attack: Crushing neural network with progressive bit search" in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 1211–1220, 2019.



### **Using Integers**

#### Example using affine (asymmetric) quantization and FP32/BF16→INT8
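
As a rough illustration of affine (asymmetric) quantization, the sketch below maps a real range onto INT8 with a scale `S` and a zero point `z`, and maps back with the dequantization rule `r = S(q - z)`. It assumes a signed INT8 range of [-128, 127], round-to-nearest, and `r_min < r_max`; the function names are hypothetical and the scheme follows the generic textbook formulation rather than any specific toolchain.

```python
def affine_params(r_min: float, r_max: float,
                  q_min: int = -128, q_max: int = 127) -> tuple[float, int]:
    """Compute scale S and zero point z for the real range [r_min, r_max]."""
    # The range is widened to include 0 so that 0.0 is exactly representable.
    r_min, r_max = min(r_min, 0.0), max(r_max, 0.0)
    scale = (r_max - r_min) / (q_max - q_min)
    zero_point = round(q_min - r_min / scale)
    return scale, zero_point


def quantize(r: float, scale: float, zero_point: int,
             q_min: int = -128, q_max: int = 127) -> int:
    """q = clamp(round(r / S) + z)."""
    q = round(r / scale) + zero_point
    return max(q_min, min(q_max, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """r = S * (q - z)."""
    return scale * (q - zero_point)


if __name__ == "__main__":
    S, z = affine_params(-1.0, 3.0)
    q = quantize(1.0, S, z)
    print(S, z, q, dequantize(q, S, z))
```

The quantization error of a value inside the range stays within about half a scale step, which is the price paid for storing each parameter in 8 bits.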

UNIVERSITAT Politècnica de València





**Operations with quantized numbers**\*

$$y(j) = B(j) + \sum_{i=0}^{I} x(i) \times w(i, j)$$

Biases are adjusted so that $z_b = 0$ and $S_b = S_x \times S_w$. Recall the dequantization formula: $r = S(q - z)$.

$$S_y(q_y - z_y) = S_b \times q_b + \sum_{i=0}^{I} S_x(q_x - z_x) \times S_w(q_w - z_w)$$

$$q_y = \frac{S_x \times S_w}{S_y} q_b + \frac{S_x \times S_w}{S_y} \left[ \sum_{i=0}^{I} (q_x - z_x) \times (q_w - z_w) \right] + z_y, \text{ where } M_0 = \frac{S_x \times S_w}{S_y} \in [0, 1)$$

**Tip**: with $M_0 = 0.111_2$, $M'_0 = 2^3 \times M_0 = 111_2$ [shift left] is an integer, and $M_0 = M'_0 / 2^3$ [shift right]. So, in general, $M'_0 = 2^{32} M_0$.
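
The tip above can be sketched as integer-only arithmetic: the real multiplier $M_0$ is stored as the integer $M'_0 = 2^{32} M_0$, and the product is brought back with a right shift. This is a minimal sketch under stated assumptions: signed INT8 outputs, round-to-nearest in the shift, and hypothetical function names. Production kernels typically add further normalization steps that are not shown here.

```python
FRAC_BITS = 32  # M0' = 2**32 * M0, as in the tip above


def fixed_point_multiplier(m0: float, frac_bits: int = FRAC_BITS) -> int:
    """Integer M0' approximating M0 in [0, 1) with frac_bits fractional bits."""
    if not 0.0 <= m0 < 1.0:
        raise ValueError("M0 must lie in [0, 1)")
    return round(m0 * (1 << frac_bits))


def requantize(acc: int, m0_fixed: int, z_y: int,
               frac_bits: int = FRAC_BITS) -> int:
    """q_y = round(M0 * acc) + z_y using only integer multiply and shift."""
    rounding = 1 << (frac_bits - 1)  # adds 0.5 before the floor of the shift
    q_y = ((acc * m0_fixed + rounding) >> frac_bits) + z_y
    return max(-128, min(127, q_y))  # saturate to signed INT8 (assumption)


def quantized_neuron(q_x, q_w, q_b, z_x, z_w, z_y, m0_fixed) -> int:
    """q_y = M0 * (q_b + sum((q_x - z_x) * (q_w - z_w))) + z_y."""
    acc = q_b + sum((x - z_x) * (w - z_w) for x, w in zip(q_x, q_w))
    return requantize(acc, m0_fixed, z_y)


if __name__ == "__main__":
    m0_fixed = fixed_point_multiplier(0.25)
    print(quantized_neuron([10, 20], [3, -4], q_b=5,
                           z_x=2, z_w=0, z_y=1, m0_fixed=m0_fixed))
```

With three fractional bits, `fixed_point_multiplier(0.875, 3)` yields 7, the binary pattern 111 used in the tip. Keeping the accumulator and the multiplier as integers means the whole inference path can run without floating-point hardware.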





\* [CoRR 2021] Markus Nagel et al. "A White Paper on Neural Network Quantization", CoRR abs/2106.08295 (2021)




### **C-based implementation**


