Internal noise in deep neural networks: interplay of depth, neuron number, and noise injection step
D. A. Maksimov, V. M. Moskvitin, N. Semenova
TLDR
This paper analyzes how the point of internal noise injection (before vs. after the activation function) affects deep neural network performance, and how pooling can reduce the noise's impact.
Key contributions
- Activation functions act as effective nonlinear filters for internal noise in DNNs.
- Injecting noise before the activation function yields higher accuracy than injecting it after, with additive noise suppressed most effectively (see the sketch after this list).
- For noise after activation, multiplicative noise is less detrimental than additive noise.
- Pooling-based noise reduction improves network performance in both injection scenarios.
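To make the two injection points concrete, here is a minimal NumPy sketch of one dense layer with Gaussian noise injected either before or after the activation. The ReLU activation, the noise level `sigma`, and the multiplicative form `z * (1 + noise)` are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_layer(x, W, b, sigma=0.1, where="before", kind="additive"):
    """One dense layer with Gaussian noise injected before or after the
    activation, mirroring the paper's two scenarios. sigma, ReLU, and the
    multiplicative form are illustrative assumptions."""
    z = x @ W + b                      # pre-activation (neuron input channel)
    if where == "before":              # noise in the input channel
        noise = rng.normal(0.0, sigma, z.shape)
        z = z + noise if kind == "additive" else z * (1.0 + noise)
    a = np.maximum(z, 0.0)             # activation acts as a nonlinear noise filter
    if where == "after":               # noise within the neuron or its output channel
        noise = rng.normal(0.0, sigma, a.shape)
        a = a + noise if kind == "additive" else a * (1.0 + noise)
    return a

# Example: a hidden layer of 8 neurons on a 4-dimensional input
x = rng.normal(size=4)
W = rng.normal(size=(4, 8))
b = np.zeros(8)
y = noisy_layer(x, W, b, where="after", kind="multiplicative")
```

Note that noise injected before the activation passes through the nonlinearity, which is why the activation can filter it; noise injected after is passed on to later layers unfiltered.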
Why it matters
This research clarifies how internal noise degrades deep neural networks depending on where it is injected relative to the activation function. This is especially relevant for analog neural-network hardware, which inspired the noise models studied and in which such noise is unavoidable, and it offers practical guidance for noise mitigation strategies.
Original Abstract
This paper examines the influence of internal Gaussian noise on the performance of deep feedforward neural networks, focusing on the role of the noise injection stage relative to the activation function. Two scenarios are analyzed: noise introduced before and after the activation function, for both additive and multiplicative noise. The case of noise before the activation function is similar to perturbations in the input channel of a neuron, while noise introduced after the activation function is analogous to noise occurring either within the neuron itself or in its output channel. The types of noise and the method of their introduction were inspired by analog neural networks. The results show that the activation function acts as an effective nonlinear filter of noise. Networks with noise introduced before the activation function consistently achieve higher accuracy than those with noise applied after it, with additive noise being suppressed more effectively in this case. For noise introduced after the activation function, multiplicative noise is less detrimental than additive noise, and earlier hidden layers contribute more to performance degradation because noise is amplified cumulatively according to the statistical properties of the subsequent weight matrices. The study also demonstrates that pooling-based noise reduction is effective whether noise is introduced before or after the activation function, consistently improving network performance.
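The pooling-based noise reduction mentioned in the abstract can be understood as ensemble averaging: reading out the mean of several independently noisy copies of a neuron reduces the variance of zero-mean additive noise by the number of copies. A minimal sketch of this averaging form, assuming an illustrative ReLU activation and copy count `n_copies` (the paper's exact pooling scheme may differ):

```python
import numpy as np

rng = np.random.default_rng(1)

def pooled_noisy_neuron(x, w, b, sigma=0.2, n_copies=16):
    """Average n_copies noisy realizations of one neuron's output.
    For zero-mean additive Gaussian noise this cuts the noise variance
    by a factor of n_copies (standard deviation by sqrt(n_copies)).
    sigma and n_copies are illustrative assumptions."""
    z = x @ w + b                  # pre-activation
    a = np.maximum(z, 0.0)         # clean activation (ReLU, assumed)
    # n_copies independent noisy realizations of the same output
    noisy = a + rng.normal(0.0, sigma, size=(n_copies,) + np.shape(a))
    return noisy.mean(axis=0)      # pooled (noise-reduced) output

x = rng.normal(size=4)
w = rng.normal(size=4)
y = pooled_noisy_neuron(x, w, b=0.1)   # noise std drops from 0.2 to 0.05
```

With 16 copies the additive-noise standard deviation shrinks by a factor of 4, consistent with the paper's finding that pooling helps regardless of where the noise is injected.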