2022, volume 26, issue 4
The paper presents the results of an analysis of available sources, on the basis of which an automatic learning model for artificial neural networks was implemented.
Keywords: image recognition, artificial neural network, automated machine learning, automatic learning model, hyperparameters.
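As a loose illustration of the automated-machine-learning theme of this abstract, here is a minimal random search over hyperparameters in Python; the search space, function names, and scoring interface are assumptions made for the sketch, not the model implemented in the paper.

```python
import random

# Illustrative hyperparameter space (assumed values, not from the paper).
SPACE = {
    "lr": [1e-4, 1e-3, 1e-2],
    "hidden_units": [32, 64, 128],
    "batch_size": [16, 32, 64],
}

def sample_config():
    """Draw one random configuration from the space."""
    return {k: random.choice(v) for k, v in SPACE.items()}

def random_search(train_and_score, n_trials=20):
    """train_and_score(config) -> validation score (higher is better)."""
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = sample_config()
        score = train_and_score(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```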
We consider the problem of optimizing a convolutional neural network on a small dataset. The proposed method fine-tunes a pretrained neural network under specific constraints on the convolutional kernels: the pretrained weights are decomposed via SVD, and during fine-tuning only the singular values are trained. The paper investigates the dynamics of the singular values during training and their influence on the quality of the resulting model. The method is compared with other approaches on a radiological image classification problem.
Keywords: deep learning, convolutional neural networks, singular value decomposition.
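The scheme in this abstract admits a compact sketch. Below is a minimal PyTorch version, assuming the usual reshaping of a 4-D convolution kernel into a matrix before SVD; the class name SVDConv2d and all surrounding details are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SVDConv2d(nn.Module):
    """Wraps a pretrained Conv2d: the kernel is factored as U diag(s) V^T
    and only the singular values s are trainable (illustrative sketch)."""

    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        w = conv.weight.data                    # (out, in, kh, kw)
        self.shape = w.shape
        mat = w.reshape(w.shape[0], -1)         # flatten kernel to 2-D
        u, s, vh = torch.linalg.svd(mat, full_matrices=False)
        self.register_buffer("u", u)            # frozen left factor
        self.register_buffer("vh", vh)          # frozen right factor
        self.s = nn.Parameter(s)                # trainable singular values
        self.bias = conv.bias                   # bias left untouched here
        self.stride, self.padding = conv.stride, conv.padding

    def forward(self, x):
        mat = (self.u * self.s) @ self.vh       # rebuild U diag(s) V^T
        w = mat.reshape(self.shape)
        return F.conv2d(x, w, self.bias,
                        stride=self.stride, padding=self.padding)

# Fine-tune only the singular values of a pretrained layer.
layer = SVDConv2d(nn.Conv2d(64, 128, 3, padding=1))
opt = torch.optim.SGD([layer.s], lr=1e-3)
```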
In practical applications of neural networks, the number of parameters in the network is much larger than the number of samples in the dataset, yet the network still generalizes well. It is traditionally believed that such over-parameterized, non-convex models easily fall into local minima while searching for the optimal solution and show poor generalization performance, but in fact this is not the case. Although under certain regularization conditions the network generalization error can be controlled effectively, the generalization of large networks remains difficult to explain. In our work, we distinguish the overfitting stage from the feature-learning stage by quantifying the impact of a single-sample update during gradient descent on the entire training process, revealing that during the overfitting stage neural networks generally have less impact on other samples. In addition, we use the Fisher information matrix to mask the gradients produced by backpropagation, thereby slowing down the network's overfitting behavior and improving its generalization performance.
Keywords: neural networks, generalization, overfitting, Fisher information.
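The gradient-masking idea lends itself to a short sketch. Below is one plausible PyTorch implementation under strong assumptions: the Fisher matrix is approximated by its diagonal (averaged squared gradients, as in EWC-style estimates), and coordinates above a quantile threshold are zeroed; neither detail is claimed to match the paper's exact procedure.

```python
import torch

def diag_fisher(model, data_loader, loss_fn, n_batches=10):
    """Diagonal Fisher approximation: averaged squared gradients.
    A common estimate; that the paper uses it is an assumption."""
    fisher = [torch.zeros_like(p) for p in model.parameters()]
    for i, (x, y) in enumerate(data_loader):
        if i >= n_batches:
            break
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for f, p in zip(fisher, model.parameters()):
            if p.grad is not None:
                f += p.grad.detach() ** 2
    return [f / n_batches for f in fisher]

def masked_step(model, loss, optimizer, fisher, quantile=0.9):
    """Zero out gradient coordinates with the largest Fisher values
    before the optimizer step (illustrative masking rule)."""
    optimizer.zero_grad()
    loss.backward()
    for p, f in zip(model.parameters(), fisher):
        if p.grad is not None:
            thresh = torch.quantile(f.flatten(), quantile)
            p.grad[f > thresh] = 0.0   # suppress high-Fisher directions
    optimizer.step()
```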
It is shown that, under some natural restrictions on the element basis, for any recurrent circuit with a fixed number of inputs it is possible to construct a functionally equivalent circuit in the same basis using superposition operations and at most two applications of the feedback operation, with a linear increase in the number of delays used compared to the original circuit. Thus, the linearity of memory growth in the transition to an optimal (in terms of the number of delays) circuit with at most two feedbacks is proved. Short-term and long-term memory modules are distinguished in the structure of the recurrent circuits with at most two feedbacks constructed in this work. The result obtained is valid, in particular, for the class of finite automata, as well as for the class of neural circuits built from elements containing gates.
Keywords: recurrent circuits, feedback, linear estimation, gates.
To study the efficiency of distributed algorithms, a number of mathematical formalizations with new notions of algorithm complexity have been introduced. In this paper, the complexity of finding strongly connected components in the AMPC model is investigated. The AMPC model, unlike other, more limited distributed formalizations, allows building a query tree within a single step of the algorithm. A probabilistic algorithm is obtained that finds strongly connected components in polylogarithmic or sublinear time (depending on the amount of available local memory). The amount of required local memory is sublinear in the number of vertices of the graph.
Keywords: distributed algorithms, probabilistic algorithms, strongly connected components.
The paper investigates the question of the angle between the normal vectors of separating hyperplanes defined by linear test algorithms. The question is considered for various reference sets of tests.
Keywords: linear test algorithms, separating hyperplanes, dead-end tests.
This article considers the problem of К- and A-finite generation for precomplete classes of linear automata operating over the Galois field of two elements. The set of all studied classes forms an A-criterion system in the class of linear automata. A finite basis is presented for each class under consideration.
Keywords: finite automaton, linear automaton, composition operation, feedback operation, completeness, closed class, precomplete class, К-finitely generated class, A-finitely generated class.
The article considers the application of a cellular automaton with locators to the vector addition problem. The locator cellular automaton model assumes that each cell can transmit a signal over any distance. It is proven in this article that this capability reduces the complexity of the problem from linear to logarithmic (compared with the classical cellular automaton model).
Keywords: cellular automata, homogeneous structures, the problem of vector addition.
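A classical way to see why unbounded-range signals can turn a linear bound into a logarithmic one: carry propagation in addition is an associative prefix computation, and prefix doubling (each cell combining with a cell 2^r away at round r) finishes it in O(log n) rounds. The Python sketch below illustrates only this mechanism; it is not the locator-automaton construction from the paper.

```python
def add_parallel_prefix(x_bits, y_bits):
    """Add two equal-length little-endian bit vectors in O(log n) rounds."""
    n = len(x_bits)
    # Per-position carry status: 'g' generate, 'p' propagate, 'k' kill.
    st = ['g' if a & b else ('p' if a ^ b else 'k')
          for a, b in zip(x_bits, y_bits)]

    def combine(hi, lo):
        # Status of a block: the high part decides unless it propagates.
        return lo if hi == 'p' else hi

    # Prefix doubling: after round r, pref[i] summarizes 2^r positions.
    pref, d = st[:], 1
    while d < n:
        pref = [combine(pref[i], pref[i - d]) if i >= d else pref[i]
                for i in range(n)]
        d *= 2
    # Carry into position i is 1 iff positions 0..i-1 generate a carry.
    carry = [0] + [1 if pref[i] == 'g' else 0 for i in range(n - 1)]
    return [a ^ b ^ c for a, b, c in zip(x_bits, y_bits, carry)]

# 11 + 13 = 24; with 4-bit words the result is 24 mod 16 = 8.
assert add_parallel_prefix([1, 1, 0, 1], [1, 0, 1, 1]) == [0, 0, 0, 1]
```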
A number of facts about the completeness of linear automata have been proven to date: all precomplete classes with respect to the superposition and composition operations are known, and completeness criteria have been formulated. In this work we prove some facts about the complexity of this process, namely of obtaining the neutral element and the delay.
Keywords: linear automata, complexity estimation.
The present paper considers classes of functions obtained by neural networks over bases with max nonlinearities. First, some properties of CPL functions (continuous piecewise-linear functions) and the equivalence classes generating them are investigated. Based on these properties, a theorem is proved that neural networks built over a basis of linear functions and the max nonlinearity can exactly recover any convex CPL function. Second, the ReLU basis, a special case of max-nonlinearity bases, is investigated, and a theorem similar to the one above is proved. The question of estimating the number of neurons and layers in the resulting architectures is also discussed. All the theorems mentioned have constructive proofs, i.e., neural network architectures with the stated properties are built explicitly.
Keywords: neural networks, architecture, function recovery, function expressibility, convex functions, piecewise-linear functions, ReLU function, max function.
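For intuition: a convex CPL function is exactly a finite maximum of affine functions, and a pairwise maximum is expressible in the ReLU basis via the identity max(u, v) = v + ReLU(u - v). The numerical sketch below (with arbitrarily chosen slopes and intercepts) checks both representations; it is not the explicit constructions from the paper.

```python
import numpy as np

# Slopes a_i and intercepts b_i of the affine pieces (arbitrary example).
a = np.array([[1.0, -2.0], [0.5, 0.3], [-1.0, 1.0]])
b = np.array([0.0, 1.0, -0.5])

def convex_cpl(x):
    """f(x) = max_i (a_i . x + b_i): exact max-basis representation."""
    return np.max(a @ x + b)

def relu(t):
    return np.maximum(t, 0.0)

def max2(u, v):
    """Pairwise max in the ReLU basis: max(u, v) = v + relu(u - v)."""
    return v + relu(u - v)

x = np.array([0.7, -0.2])
vals = a @ x + b
# The ReLU-basis expression agrees with the direct max-basis one.
assert np.isclose(convex_cpl(x), max2(max2(vals[0], vals[1]), vals[2]))
```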