
CN116796815B - A neural network architecture search method and system based on polyhedron transformation representation - Google Patents

A neural network architecture search method and system based on polyhedron transformation representation

Info

Publication number
CN116796815B
Authority
CN
China
Prior art keywords
neural network
network architecture
calculation
distance
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310971483.9A
Other languages
Chinese (zh)
Other versions
CN116796815A (en)
Inventor
余腾
王子伯炎
马一鸣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sober Heterogeneous Technology Co ltd
Original Assignee
Beijing Sober Heterogeneous Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sober Heterogeneous Technology Co ltd
Priority to CN202310971483.9A
Publication of CN116796815A
Application granted
Publication of CN116796815B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • G06F 9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • G06F 9/45558 Hypervisor-specific management and integration aspects
    • G06F 2009/45595 Network integration; Enabling network access in virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract


The present invention provides a neural network architecture search method and system based on polyhedral transformation representation. The method comprises: obtaining the edit distance between any two of the received neural network architectures and converting the metric space of the edit distance into a polyhedral structure; cascading multiple kernel functions and combining the results of the kernel function calculations; selecting one or more blocks as a basis and finding the optimal neural network architecture by searching and evaluating different block structures; and decomposing the multiple neural network architecture search tasks into multiple subtasks, implementing parallel computing through communication between computing nodes. The system comprises: an edit distance calculation module, a calculation result combination module, an optimal architecture output module, and a task decomposition module. The polyhedral representation of the present invention achieves parallelism and scalability: by distributing the kernel function calculation operations across multiple faces, multiple kernel function calculations can be performed simultaneously, thereby improving computing efficiency.

Description

Neural network architecture searching method and system based on polyhedral transformation representation
Technical Field
The invention relates to the technical field of neural network architecture search, in particular to a neural network architecture search method and system based on polyhedral transformation representation.
Background
Neural network structure search is a cutting-edge research field aimed at automating the design of neural network models. Conventional neural network model design typically relies on human experience and expertise, requiring significant time and effort for trial and tuning. With advances in machine learning algorithms and search strategy techniques, neural network structure search can automatically discover more efficient and optimized neural network structures. Neural network structure search has wide application in the field of computer vision: it can be used for tasks such as image classification, target detection and image segmentation, improving the accuracy and generalization capability of models by automatically searching for the optimal network structure. In the field of natural language processing, it can be used for tasks such as text classification, machine translation and speech recognition; by searching for the optimal network structure, the performance of models in text processing and semantic understanding can be improved. Neural network structure search can also be applied to the field of reinforcement learning, automatically designing network structures suited to specific environments and tasks and thereby improving the learning and decision-making capability of an agent in complex tasks. By automatically searching for the optimal network structure, model performance can be improved, the model design process can be accelerated, and the development of artificial intelligence technology can be promoted.
In the first prior art, application number CN202210705263.7 discloses a visual analysis system and method for neural network architecture spaces. The method comprises receiving at least one neural network architecture, obtaining graph edit distances between any two of the neural network architectures, obtaining a clustering hierarchy of the architectures according to the graph edit distances, wherein each layer of the hierarchy comprises at least one architecture category and each category comprises neural network architectures, and visualizing at least one layer of the hierarchy, the visual view providing a global overview of the architectures and context information for any one of them. Although this helps an architecture designer to design, or a search tool to find, architectures that perform well on various data sets, and reduces the search space and computation cost of the search tool, the layer-based evolution of the clustering hierarchy guided by edit distances limits the final prediction accuracy of the network; because changing only one layer at a time has a weak effect on the whole neural network, the evolution proceeds very slowly.
In the second prior art, application number CN202211585844.8 discloses a network searching method and device, electronic equipment and a storage medium. The method comprises acquiring a plurality of sampling networks from a sample data set, coding the current sampling network to obtain a current structure code, aggregating the current structure code with historical structure codes to obtain a model mean value, obtaining an optimized structure code according to the optimized precision loss of the model mean value, decoding the optimized structure code to obtain an optimized network structure, and iterating until the optimized network structures of all sampling networks are obtained, so as to determine a target network structure and process objects to be processed based on it. Although this can improve the accuracy of network searching, the evolution process must compute the traditional edit distance and compare against historical architectures to find the optimal network; this computation carries severe data dependences, so only sequential kernel functions can be used. Parallel opportunities are lacking, and the sequential kernel functions in network generation become a major timing bottleneck.
In the third prior art, application number CN202010144350.0 discloses an MPI-based neural network architecture search parallelization method and equipment. The method comprises starting MPI processes according to the number of GPUs in the current multi-machine environment and arranging them in sequence; after startup, each MPI process reads data from a designated position in the training set according to its own serial number and performs gradient computation; gradient reduction is performed on the GPUs of each node according to a hierarchical structure and the results are gathered into the first GPU among them; a gradient all-reduce is performed across the first GPUs according to a ring structure; the gradient results are broadcast from the first GPU of each node according to the hierarchical structure; and the weights and bias values of the neural network are updated using the new gradient values. Although this method can effectively accelerate the search and training of the neural network architecture while preserving the recognition rate of the resulting model, greatly reducing training time and thus improving the efficiency of the automated deep learning process, the basic performance of the initial layer-structured network is relatively backward; searching over a layer structure with poor basic performance yields suboptimal results, cannot provide an optimal search space, and ultimately cannot find the best-performing neural network architecture.
The first and second prior arts both compute the traditional edit distance and compare against historical architectures, and both lack parallel opportunities. The present invention therefore provides a neural network architecture search method based on polyhedral transformation representation, which abstracts the sequential edit-distance calculation into a polyhedral transformation representation; at this new abstraction level the computational dependences are eliminated and the calculation is realized in parallel on a graphics processor, thereby improving the search speed, while block-based transformation is added to the new abstraction to improve the prediction accuracy of the resulting neural network model.
Disclosure of Invention
In order to solve the technical problems, the invention provides a neural network architecture searching method based on polyhedral transformation representation, which comprises the following steps:
Obtaining the edit distance between any two neural network architectures among the received neural network architectures, and converting the metric space of the edit distance into a polyhedral structure, wherein each face of the polyhedral structure represents a kernel function operation;
Cascading a plurality of kernel functions in order from simple to complex, wherein the output of each kernel function serves as the input of the next to form a multi-layer kernel function structure, and combining the results of the kernel function calculations together, as sketched after this list;
Selecting one or more blocks from the block structures of the current neural network architecture as a basis, performing operations on the selected blocks such as increasing their number of layers or changing their parameters, and searching and evaluating different block structures to find the optimal neural network architecture;
Decomposing a plurality of neural network architecture search tasks into a plurality of subtasks, executing different subtasks on different computing nodes for the optimal neural network architecture, realizing parallel computation through communication among the computing nodes, balancing the loads of the different computing nodes, and ensuring that computing tasks are distributed and executed uniformly across all computing nodes.
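As an illustration of the kernel cascade in the second step above, the following is a minimal Python sketch in which each kernel's output feeds the next and the per-stage results are also collected and combined. The three example kernels and the averaging combiner are illustrative assumptions, not part of the claimed method.

```python
# Minimal sketch of a simple-to-complex kernel cascade; the example kernels
# and the averaging combiner are assumptions for illustration only.
import numpy as np

def cascade(kernels, x: np.ndarray) -> np.ndarray:
    """Apply kernels in order from simple to complex: each output becomes
    the next input, and all intermediate results are combined at the end."""
    results = []
    for kernel in kernels:
        x = kernel(x)                  # output of one kernel -> input of the next
        results.append(x)
    return np.mean(results, axis=0)    # combine the per-kernel results

# Example: three kernels of increasing complexity over a distance matrix.
kernels = [
    lambda d: np.abs(d),               # simple: magnitude
    lambda d: np.exp(-d),              # medium: RBF-style response
    lambda d: np.exp(-d) / (1.0 + d),  # complex: damped response
]
print(cascade(kernels, np.random.rand(4, 4)))
```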
Optionally, the process of obtaining the edit distance comprises the following steps (a minimal code sketch follows the list):
Obtaining any two neural network architectures among the received neural network architectures, one of which is denoted the first neural network architecture diagram and the other the second neural network architecture diagram;
Converting the hierarchical relations in the first and second neural network architecture diagrams into nodes and edges in a graph, wherein the nodes represent layers in the neural network and the edges represent the connection relations among the layers;
Defining a distance parameter in the metric space for measuring the distance between two neural network architectures, with the Euclidean distance selected as the distance function;
In each iteration, comparing the layer counts of the nodes and selecting the smaller count |L| for the current iteration's distance calculation, so that min(M, N) distances can be computed in parallel during each iteration, where M denotes the number of layers of the first neural network architecture and N the number of layers of the second;
Outputting the final distance result once the distance calculation of all nodes is complete, the distance result representing the distance between the two neural network architectures.
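The steps above amount to an anti-diagonal ("wavefront") dynamic program: every cell on one anti-diagonal is independent of the others, which is exactly the min(M, N)-way parallelism per iteration described in the fourth step. A minimal Python sketch follows; encoding each layer as a feature vector and using unit insertion/deletion costs are assumptions for illustration.

```python
# Minimal sketch of the parallel edit-distance step; layers are assumed to be
# encoded as feature vectors, and insert/delete costs of 1 are assumptions.
import numpy as np

def architecture_edit_distance(a: np.ndarray, b: np.ndarray) -> float:
    """a: (M, d) layer features of the first architecture diagram,
    b: (N, d) layer features of the second architecture diagram."""
    M, N = len(a), len(b)
    D = np.zeros((M + 1, N + 1))
    D[:, 0] = np.arange(M + 1)         # cost of deleting layers of `a`
    D[0, :] = np.arange(N + 1)         # cost of inserting layers of `b`
    for k in range(2, M + N + 1):      # sweep anti-diagonals ("iterations")
        i = np.arange(max(1, k - N), min(M, k - 1) + 1)
        j = k - i                      # all (i, j) on diagonal k are independent
        subst = np.linalg.norm(a[i - 1] - b[j - 1], axis=1)  # Euclidean cost
        D[i, j] = np.minimum.reduce([
            D[i - 1, j] + 1,           # delete a layer
            D[i, j - 1] + 1,           # insert a layer
            D[i - 1, j - 1] + subst,   # substitute a layer
        ])
    return float(D[M, N])

print(architecture_edit_distance(np.random.rand(5, 3), np.random.rand(7, 3)))
```

On a parallel processor, each diagonal becomes one launch of up to min(M, N) concurrent threads, which is one way the per-iteration parallelism described above can be realized.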
Optionally, the process of cascading a plurality of kernel functions comprises the following steps (a minimal code sketch follows the list):
Generating an original neural network comparison kernel as an M×N matrix, wherein each element of the matrix represents the connection strength between the m-th layer of the new neural network and the n-th layer of the historical neural network;
Adding zero padding between the last layer of the current neural network and the first layer of the additional network to generate an M×(N+N') matrix on which the computations can be executed in parallel;
Connecting all newly generated neural networks layer by layer on the original neural network comparison kernel to generate a two-dimensional kernel for calculation and analysis.
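A minimal sketch of the zero-padded stitching step follows; the per-layer-pair "connection strength" is not specified above, so the RBF-style similarity and the single padding column are assumptions for illustration.

```python
# Minimal sketch of building the zero-padded M x (N + N') comparison kernel;
# `connection_strength` is an assumed placeholder for the real kernel.
import numpy as np

def connection_strength(new_layer: np.ndarray, hist_layer: np.ndarray) -> float:
    return float(np.exp(-np.sum((new_layer - hist_layer) ** 2)))

def stitched_kernel(new: np.ndarray, hist_a: np.ndarray, hist_b: np.ndarray,
                    pad: int = 1) -> np.ndarray:
    """new: (M, d) layers of the new network; hist_a: (N, d) and
    hist_b: (N', d) layers of the two historical networks."""
    M, N, Nq = len(new), len(hist_a), len(hist_b)
    K = np.zeros((M, N + pad + Nq))  # zero padding keeps the comparisons apart
    for m in range(M):
        for n in range(N):
            K[m, n] = connection_strength(new[m], hist_a[n])
        for n in range(Nq):
            K[m, N + pad + n] = connection_strength(new[m], hist_b[n])
    return K                         # both comparisons now form one parallel job
```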
Optionally, the splitting process of the M×(N+N') matrix comprises the following steps (a minimal code sketch follows the list):
Splitting the whole M×(N+N') matrix into a plurality of small matrices of matching sizes, the size of each small matrix being the same as that of the corresponding neural network diagram, so that the calculation task of each small matrix can be processed by one thread block;
On a graphics-processor parallel computing platform, distributing each small matrix to a thread block for computation, wherein a thread block is a group of threads executed in parallel that share memory and compute cooperatively;
For each thread block, computing within its corresponding small matrix according to the connection strength between the m-th layer of the new neural network and the n-th layer of the historical neural network;
Storing intermediate results in the shared memory of the thread blocks, and combining the results of each thread block to obtain the calculation result of the whole M×(N+N') matrix;
Connecting all newly generated neural networks layer by layer to generate a two-dimensional kernel for subsequent calculation and analysis.
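A minimal CPU-side sketch of this tiling scheme follows, emulating one GPU thread block per tile with a thread pool; the tile size, the similarity function and the use of `concurrent.futures` are assumptions for illustration. Each tile writes a disjoint region of the output, so the per-tile results merge without conflicts.

```python
# Minimal sketch of splitting the stitched matrix into tiles, one per
# "thread block"; a thread pool stands in for GPU blocks here.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def compute_tile(K, new, hist, m0, n0, tile_m, tile_n):
    # One tile: fill its own disjoint sub-matrix from the layer features.
    for m in range(m0, min(m0 + tile_m, K.shape[0])):
        for n in range(n0, min(n0 + tile_n, K.shape[1])):
            K[m, n] = np.exp(-np.sum((new[m] - hist[n]) ** 2))

def tiled_kernel(new, hist, tile=(16, 16)):
    M, N = len(new), len(hist)
    K = np.zeros((M, N))
    tile_m, tile_n = tile
    with ThreadPoolExecutor() as pool:
        tasks = [pool.submit(compute_tile, K, new, hist, m0, n0, tile_m, tile_n)
                 for m0 in range(0, M, tile_m)
                 for n0 in range(0, N, tile_n)]
        for t in tasks:
            t.result()                 # merge step: wait for every tile
    return K
```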
Optionally, the process of searching different block structures comprises the following steps (a minimal code sketch follows the list):
Adding a selected block to expand the neural network architecture; after adding a new block, connecting it in series with the existing blocks, and adjusting the weights, bias values, activation functions and regularization terms of the new block;
Searching for the optimal network architecture and block parameters using an evolutionary algorithm, and defining an objective function and constraint conditions as required to guide the search process;
Evaluating and testing the new neural network architecture after the conversion is completed, and performing further optimization and adjustment according to the evaluation results.
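A minimal sketch of block-based transformation driven by an evolutionary loop follows. A candidate is represented as a list of block descriptors; the two mutation operators (duplicate a block to add layers, rescale a block's width to change parameters) and the simple keep-the-best loop are assumptions, and `fitness` stands in for real training and evaluation.

```python
# Minimal sketch of block-based transformation with an evolutionary search;
# `fitness` is a stand-in for training and evaluating a candidate network.
import copy
import random

def mutate(blocks):
    child = copy.deepcopy(blocks)
    target = random.choice(child)
    if random.random() < 0.5:
        child.append(copy.deepcopy(target))  # add a block: more layers at once
    else:                                    # or change the block's parameters
        target["width"] = max(8, int(target["width"] * random.choice([0.5, 2.0])))
    return child

def evolve(init_blocks, fitness, generations=50, population=16):
    best, best_fit = init_blocks, fitness(init_blocks)
    for _ in range(generations):
        for child in (mutate(best) for _ in range(population)):
            fit = fitness(child)             # evaluate the transformed candidate
            if fit > best_fit:               # keep it only if it improves
                best, best_fit = child, fit
    return best

# Example with a toy objective that prefers four-block architectures.
start = [{"type": "conv", "width": 64}, {"type": "conv", "width": 128}]
print(evolve(start, fitness=lambda b: -abs(len(b) - 4), generations=10))
```

Because a whole block is duplicated or rescaled in one move, each mutation changes a group of layers at once rather than a single layer, which is the point of the block-based transformation.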
Optionally, the process of executing different subtasks on different computing nodes comprises the following steps (a minimal code sketch follows the list):
Generating asynchronous control logic with a master node that has no graphics processor, the master node being responsible for coordinating logical task allocation;
The slave nodes execute independent tasks using a central processor and an artificial intelligence accelerator card; they are responsible for asynchronously executing network generation and training, each slave node using a graphics processor to generate and train neural network architectures in parallel, while training the current neural network and saving the current model file through a shared file system.
Each node acquires the candidate neural networks generated by other nodes and their corresponding performance indexes by accessing the shared data; the node then selects the optimal candidate neural network and updates its parameters to optimize the model.
Optionally, the slave nodes each read the history of candidate neural network structures in the shared file system and transform a new set of candidate neural networks on their respective nodes.
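A minimal sketch of one slave-node step under this decoupled layout follows: the node reads every candidate already published to the shared file system, derives a new candidate from the best one, trains it locally, and publishes the result. The mount path, file naming and JSON serialization are assumptions for illustration.

```python
# Minimal sketch of a slave node exchanging candidates through a shared
# file system; the path, naming and serialization are assumptions.
import glob
import json
import os
import uuid

SHARED_DIR = "/mnt/shared/nas_run"       # assumed shared-filesystem mount

def slave_step(node_id: str, transform, train):
    os.makedirs(SHARED_DIR, exist_ok=True)
    records = [json.load(open(p)) for p in glob.glob(f"{SHARED_DIR}/*.json")]
    parent = max(records, key=lambda r: r["score"])["arch"] if records else None
    candidate = transform(parent)        # generate a new candidate on this node
    score = train(candidate)             # GPU/accelerator training runs locally
    out = os.path.join(SHARED_DIR, f"{node_id}_{uuid.uuid4().hex}.json")
    with open(out, "w") as f:            # publish the result for other nodes
        json.dump({"arch": candidate, "score": score, "node": node_id}, f)
```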
The invention provides a neural network architecture search system based on polyhedral transformation representation, which comprises:
The edit distance calculation module, responsible for receiving at least one neural network architecture, obtaining the edit distance between any two of the received architectures, and converting the metric space of the edit distance into a polyhedral structure wherein each face represents a kernel function operation;
The calculation result combination module, responsible for cascading a plurality of kernel functions in order from simple to complex, the output of each kernel function serving as the input of the next to form a multi-layer kernel function structure, and combining the calculation results of the kernel functions together;
The optimal architecture output module, responsible for selecting one or more blocks from the block structures of the current neural network architecture as a basis, performing operations on the selected blocks such as increasing their number of layers or changing their parameters, and finding the optimal neural network architecture by searching and evaluating different block structures;
The task decomposition module, responsible for decomposing a plurality of neural network architecture search tasks into a plurality of subtasks, executing different subtasks on different computing nodes for the optimal neural network architecture, realizing parallel computation through communication among the computing nodes, balancing the loads of the different computing nodes, and ensuring that computing tasks are distributed and executed uniformly across all computing nodes. A minimal wiring sketch of the four modules follows.
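As a sketch of how the four modules above could be wired together, the following class composes them as callables; the class name and signatures are illustrative assumptions rather than interfaces mandated by the system.

```python
# Minimal wiring sketch of the four-module system; the internals of each
# module are passed in as callables, so only the data flow is shown.
class PolyhedralNASSystem:
    def __init__(self, edit_distance_module, combination_module,
                 optimal_architecture_module, task_decomposition_module):
        self.edit_distance = edit_distance_module   # metric space -> polyhedron
        self.combine = combination_module           # cascade and combine kernels
        self.search = optimal_architecture_module   # block-based architecture search
        self.decompose = task_decomposition_module  # distribute subtasks to nodes

    def run(self, architectures):
        polyhedron = self.edit_distance(architectures)
        kernels = self.combine(polyhedron)
        best = self.search(kernels)
        return self.decompose(best)
```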
Optionally, the edit distance calculation module includes:
The abstraction processing unit, responsible for obtaining any two neural network architectures among the received neural network architectures, one of which is denoted the first neural network architecture diagram and the other the second neural network architecture diagram;
The relationship conversion unit, responsible for converting the hierarchical relations in the first and second neural network architecture diagrams into nodes and edges in a graph, wherein the nodes represent layers in the neural network and the edges represent the connection relations among the layers;
The parallel traversing unit, responsible for defining a distance parameter in the metric space for measuring the distance between two neural network architectures, selecting the Euclidean distance as the distance function, and traversing the two architecture diagrams in parallel;
The iteration processing unit, responsible for comparing the layer counts of the nodes in each iteration and selecting the smaller count |L| for the current iteration's distance calculation, so that min(M, N) distances are computed in parallel during each iteration, where M denotes the number of layers of the first neural network architecture and N the number of layers of the second;
The result output unit, responsible for outputting the final distance result once the distance calculation of all nodes is complete, the distance result representing the distance between the two neural network architectures.
The multi-dimensional kernel cascade method is introduced to remove the performance bottleneck usually caused by sequential computation of kernel functions. By distributing the kernel computation operations across a plurality of faces, multiple kernel functions can be computed simultaneously, achieving parallelism and scalability and improving computing efficiency. The multi-dimensional kernel cascade further optimizes the search space and enables large-scale parallel computation: by combining the computation results of the kernel functions together, the search space is reduced and computing efficiency is improved, while the parallel computation accelerates the whole neural network architecture search process. The purpose of introducing block-based transformation is to change the layer architecture of the network: unlike the traditional single-layer structure transformation, it changes a group of layers at a time, so the optimal network can be found more flexibly, and by searching and evaluating different block structures the performance and efficiency of the neural network model can be further improved. For large-scale cluster-oriented expansion, computing tasks are distributed to different computing nodes and parallel computation is realized through communication among the nodes, so that the computing resources of the cluster are fully utilized and the efficiency and scalability of the algorithm are further improved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a flowchart of the neural network architecture search method based on polyhedral transformation representation in embodiment 1 of the present invention;
FIG. 2 is a diagram of the process of obtaining the edit distance in embodiment 1 of the present invention;
FIG. 3 is a schematic diagram of a sequential kernel function based on edit distance in embodiment 2 of the present invention;
FIG. 4 is a diagram of adding parallel kernel functions for the edit distance using the polyhedral notation in embodiment 2 of the present invention;
FIG. 5 is a diagram of the process of cascading a plurality of kernel functions in embodiment 3 of the present invention;
FIG. 6 is a diagram of a series of one-dimensional kernel functions maintaining parallelism in embodiment 3 of the present invention;
FIG. 7 is a schematic diagram of stitching parallel computation for large-scale two-dimensional kernel functions in embodiment 3 of the present invention;
FIG. 8 is a diagram of the splitting process of the M×(N+N') matrix in embodiment 4 of the present invention;
FIG. 9 is a flowchart of allocating each small matrix to a thread block for computation in embodiment 4 of the present invention;
FIG. 10 is a diagram of the process of searching different block structures in embodiment 5 of the present invention;
FIG. 11 is a layer-based transformation diagram in embodiment 5 of the present invention;
FIG. 12 is a block-based transformation diagram in embodiment 5 of the present invention;
FIG. 13 is a process diagram of different computing nodes executing different subtasks in embodiment 6 of the present invention;
FIG. 14 is a diagram of decoupled task partitioning and task execution in embodiment 6 of the present invention;
FIG. 15 is a block diagram of the neural network architecture search system based on polyhedral transformation representation in embodiment 7 of the present invention;
FIG. 16 is a block diagram of the edit distance calculation module in embodiment 8 of the present invention;
FIG. 17 is a block diagram of the calculation result combination module in embodiment 9 of the present invention;
FIG. 18 is a block diagram of the optimal architecture output module in embodiment 10 of the present invention;
FIG. 19 is a block diagram of the task decomposition module in embodiment 11 of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.
The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of embodiments of the application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application as detailed in the accompanying claims. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.
Embodiment 1 As shown in FIG. 1, the embodiment of the invention provides a neural network architecture searching method based on polyhedral transformation representation, which comprises the following steps:
S100, receiving at least one neural network architecture, obtaining the edit distance between any two of the received neural network architectures, and converting the metric space of the edit distance into a polyhedral structure, wherein each face of the polyhedral structure represents a kernel function operation;
S200, cascading a plurality of kernel functions in order from simple to complex, wherein the output of each kernel function serves as the input of the next to form a multi-layer kernel function structure, and combining the results of the kernel function calculations together;
S300, selecting one or more blocks from the block structures of the current neural network architecture as a basis, performing operations on the selected blocks such as increasing their number of layers or changing their parameters, and searching and evaluating different block structures to find the optimal neural network architecture;
S400, decomposing a plurality of neural network architecture search tasks into a plurality of subtasks, executing different subtasks on different computing nodes for the optimal neural network architecture, and realizing parallel computation through communication among the computing nodes.
The working principle and beneficial effects of this technical scheme are as follows. The embodiment first receives at least one neural network architecture and obtains the edit distance between any two of them, converting the metric space of the edit distance into a polyhedral structure in which each face represents a kernel function operation. A plurality of kernel functions are then cascaded in order from simple to complex, the output of each kernel function serving as the input of the next to form a multi-layer kernel function structure, and the results of the kernel calculations are combined together. Next, one or more blocks are selected from the block structures of the current neural network architecture as a basis, operations such as increasing the number of layers of the blocks or changing their parameters are performed on the selected blocks, and different block structures are searched and evaluated to find the optimal neural network architecture. Finally, the plurality of neural network architecture search tasks are decomposed into subtasks executed on different computing nodes, parallel computation is realized through communication among the nodes, the loads of the different nodes are balanced, and computing tasks are distributed and executed uniformly. The multi-dimensional kernel cascade achieves parallelism and scalability: by distributing the kernel computation operations across multiple faces, multiple kernel computations proceed simultaneously, improving computing efficiency, and by combining the computation results of the kernel functions, the search space is reduced and efficiency is further improved. The multi-dimensional kernel cascade also accelerates the whole neural network architecture search process through parallel computation. The purpose of introducing block-based transformation is to change the layer architecture of the network: unlike the traditional single-layer structure transformation, it changes a group of layers at a time, so the optimal network can be found more flexibly, and searching and evaluating different block structures further improves the performance and efficiency of the neural network model. For large-scale cluster expansion, computing tasks are distributed to different computing nodes and parallel computation is realized through communication among them, so the computing resources of the cluster are fully utilized and the efficiency and scalability of the algorithm are further improved.
The embodiment of the invention abstracts the sequential edit-distance calculation into a polyhedral transformation representation. At the new abstraction level, the computational dependences are eliminated and the calculation is realized in parallel on a graphics processor, improving the search speed; block-based transformation is added to the new abstraction to improve the prediction accuracy of the neural network model. Optimizing the search speed and result accuracy on the basis of the abstract representation achieves up to a 10× kernel-function speedup, a 3.92× speedup of network generation, and a 20.03% accuracy gain.
Embodiment 2 As shown in FIG. 2, on the basis of embodiment 1, the edit distance acquisition process provided by the embodiment of the invention comprises the following steps:
S101, obtaining any two neural network architectures among the received neural network architectures, one of which is denoted the first neural network architecture diagram and the other the second neural network architecture diagram;
S102, converting the hierarchical relations in the first and second neural network architecture diagrams into nodes and edges in a graph, wherein the nodes represent layers in the neural network and the edges represent the connection relations among the layers;
S103, defining a distance parameter in the metric space for measuring the distance between two neural network architectures, with the Euclidean distance selected as the distance function;
S104, in each iteration, comparing the layer counts of the nodes and selecting the smaller count |L| for the current iteration's distance calculation, so that min(M, N) distances can be computed in parallel during each iteration, where M denotes the number of layers of the first neural network architecture and N the number of layers of the second;
S105, repeating steps S103 and S104 until the distance calculation of all nodes is complete, and outputting the final distance result, which represents the distance between the two neural network architectures.
The working principle and beneficial effects of this technical scheme are as follows. The embodiment first obtains any two neural network architectures, one denoted the first neural network architecture diagram and the other the second neural network architecture diagram, and abstracts both diagrams. The hierarchical relations in the two diagrams are converted into nodes and edges, where the nodes represent layers in the neural network and the edges represent the connection relations between layers. A distance parameter is then defined in the metric space to measure the distance between the two architectures, with the Euclidean distance selected as the distance function, and the two diagrams are traversed in parallel, the distance for each node being computed during each iteration. In each iteration the layer counts of the nodes are compared and the smaller count is selected for the current iteration's distance calculation, so that min(M, N) distances can be computed in parallel per iteration, where M denotes the number of layers of the first architecture and N the number of layers of the second (in FIG. 3 and FIG. 4, N denotes the N-layer historical network and M the M-layer network, with the dependency relations shown). These steps are repeated until the distance calculation of all nodes is complete, and the final distance result, representing the distance between the two architectures, is output. By abstracting the architectures into graphs, the hierarchical relations and connection patterns of different architectures can be seen intuitively. By comparing distances, the similarity and difference of different architectures can be obtained, assisting the design and selection of neural network architectures: architectures highly similar to a target architecture can be found for reference, which is very helpful when designing a new neural network architecture or selecting a suitable pre-trained model, and the method can also be used for optimization and transfer learning, since the commonalities and differences between similar architectures can guide the optimization of network parameters and the strategy of transfer learning. The embodiment of the invention thus provides a quantitative way to evaluate and compare the similarity and difference between different architectures through abstraction, distance measurement and comparison, assisting the design, selection and optimization of neural networks.
Embodiment 3 As shown in FIG. 5, on the basis of embodiment 1, the process for cascading a plurality of kernel functions provided by the embodiment of the invention comprises the following steps:
S201, connecting a new M-layer neural network architecture with two historical neural network architectures containing N and N' layers respectively, and generating an original neural network comparison kernel as an M×N matrix, wherein each element of the matrix represents the connection strength between the m-th layer of the new neural network and the n-th layer of the historical neural network;
S202, adding zero padding between the last layer of the current neural network and the first layer of the additional network to generate an M×(N+N') matrix on which the computations can be executed in parallel;
S203, connecting all newly generated neural networks layer by layer on the original neural network comparison kernel to generate a two-dimensional kernel for calculation and analysis.
The working principle and beneficial effects of this technical scheme are as follows. The embodiment connects a new M-layer neural network architecture with two historical neural network architectures containing N and N' layers respectively, and generates an original neural network comparison kernel as an M×N matrix in which each element represents the connection strength between the m-th layer of the new network and the n-th layer of the historical network. Zero padding is then added between the last layer of the current neural network and the first layer of the additional network to generate an M×(N+N') matrix on which computation is executed in parallel (in the figures, N denotes the N-layer historical network, N' the N'-layer historical network, M the M-layer network and M' the M'-layer network; the dependency relations are shown, and the zeros between adjacent networks are the zero padding). Finally, all newly generated neural networks are connected layer by layer on the original comparison kernel to generate a two-dimensional kernel for calculation and analysis. Connecting different layers improves the expressive and learning capacity of the network; the zero padding allows the comparisons to be executed in parallel, improving computation efficiency and training speed; and stitching the kernels increases the flexibility of the computation structure, improving the adaptability and generalization capability of the network. The new architecture of this embodiment is therefore significant for improving the expressive capacity, computational efficiency and adaptability of neural networks, and is expected to bring new breakthroughs for their development and application.
Embodiment 4 as shown in fig. 8, on the basis of embodiment 3, the splitting process of the matrix of M (n+n') provided in the embodiment of the present invention includes the following steps:
S2021, splitting the whole M×(N+N') matrix into a plurality of small matrices of matching sizes, the size of each small matrix being the same as that of the corresponding neural network diagram, so that the calculation task of each small matrix can be processed by one thread block;
S2022, on a graphics-processor parallel computing platform, distributing each small matrix to a thread block for computation, wherein a thread block is a group of threads executed in parallel that share memory and compute cooperatively;
S2023, computing each small matrix with parallel computation inside its thread block, each thread block computing within its corresponding small matrix according to the connection strength between the m-th layer of the new neural network and the n-th layer of the historical neural network;
S2024, storing intermediate results in the shared memory of the thread blocks, and combining the results of each thread block to obtain the calculation result of the whole M×(N+N') matrix;
S2025, repeating steps S2021-S2024, connecting all newly generated neural networks layer by layer, and generating a two-dimensional kernel for subsequent calculation and analysis.
The working principle and beneficial effects of this technical scheme are as follows. The whole M×(N+N') matrix is split into small matrices of matching sizes, each the same size as the corresponding neural network diagram, ensuring that the calculation task of each small matrix can be processed by one thread block. On the graphics-processor parallel computing platform, each small matrix is distributed to a thread block for computation (as shown in FIG. 9), a thread block being a group of threads that execute in parallel, share memory and compute cooperatively. Each thread block then computes its small matrix in parallel according to the connection strength between the m-th layer of the new neural network and the n-th layer of the historical neural network; intermediate results are stored in the thread block's shared memory, and the results of all thread blocks are merged to obtain the result of the whole M×(N+N') matrix. Steps S2021-S2024 are repeated, and all newly generated neural networks are connected layer by layer to generate a two-dimensional kernel for subsequent calculation and analysis. By splitting the matrix and assigning each small matrix to one thread block, parallel computation and task distribution are optimized, exploiting the characteristics of the parallel computing platform to accelerate the computation of the neural network and improve efficiency and performance; using shared memory further improves data-access speed, raising the efficiency and performance of the whole system.
Embodiment 5 as shown in fig. 10, on the basis of embodiment 1, the process of searching for different block structures provided by the embodiment of the present invention includes the following steps:
S301, adding a selected block to expand the neural network architecture; after adding a new block, connecting it in series with the existing blocks and adjusting the weights, bias values, activation functions and regularization terms of the new block;
S302, scaling the number of neurons inside the selected block based on the block conversion, searching for the optimal network architecture and block parameters using an evolutionary algorithm, and defining an objective function and constraint conditions as required to guide the search process;
S303, after the conversion is completed, evaluating and testing the new neural network architecture, and performing further optimization and adjustment according to the evaluation results.
The working principle and beneficial effects of this technical scheme are as follows. First, a selected block is added to expand the neural network architecture; after the new block is added, it is connected in series with the existing blocks (FIG. 11 is a layer-based transformation schematic), and the weights, bias values, activation functions and regularization terms of the new block are adjusted (FIG. 12 is a block-based transformation schematic). Second, based on the block conversion, the number of neurons inside the selected block is scaled, an evolutionary algorithm is used to search for the optimal network architecture and block parameters, and an objective function and constraint conditions are defined as required to guide the search. Finally, after the conversion is completed, the new neural network architecture is evaluated and tested, with further optimization and adjustment made according to the evaluation results. Adding new blocks increases the depth and complexity of the neural network and provides more expressive capacity, enabling it to learn more complex patterns and features and thereby improving performance and generalization. Connecting the new block in series with existing blocks and adjusting its parameters promotes the flow of information through the network, and defining a suitable objective function and constraints lets the evolutionary search adapt the architecture to the data. By adding new blocks, adjusting connections and parameters, converting and scaling blocks, and searching with an evolutionary algorithm, the method optimizes the neural network architecture and improves its performance and adaptability, so that the network suits different tasks and data sets better and achieves higher performance.
Embodiment 6 As shown in FIG. 13, on the basis of embodiment 1, the process of executing different subtasks by different computing nodes provided by the embodiment of the present invention includes the following steps:
S401, generating asynchronous control logic with a master node that has no graphics processor, the master node being responsible for coordinating logical task allocation;
S402, executing independent tasks on the slave nodes using a central processor and an artificial intelligence accelerator card, the slave nodes being responsible for asynchronously executing network generation and training, with each slave node generating and training neural network architectures in parallel using a graphics processor;
The working principle and beneficial effects of this technical scheme are as follows. First, asynchronous control logic is generated by a master node without a graphics processor; the master node is responsible for coordinating logical task allocation (FIG. 14 is a schematic diagram of decoupled task partitioning and task execution) and, through the application software, schedules generation and training tasks to idle nodes. Second, the slave nodes execute independent tasks using a central processor and an artificial intelligence accelerator card; they asynchronously perform network generation and training, each using a graphics processor to generate and train neural network architectures in parallel while training the current neural network and saving the current model file through a shared file system. All nodes can access the shared data to select the best candidate neural network for the next iteration and optimize its parameters: each node obtains the candidate networks generated by other nodes together with their performance indexes, selects the optimal candidate and updates the parameters to optimize the model. The slave nodes each read the history of candidate network structures in the shared file system and transform a new set of candidates on their own node, so candidate generation and training proceed in parallel across nodes, greatly improving throughput. By comparing performance indexes, each node keeps only the best candidate, and storing the current model file in the shared file system means the latest model is accessible at any time for subsequent inference or other operations. This distributed computing scheme enables flexible task scheduling and resource management, better meeting different application requirements; the embodiment improves the efficiency of neural network generation and training and makes better use of computing resources for faster and more flexible task processing.
Embodiment 7. As shown in FIG. 15, an embodiment of the present invention provides a neural network architecture search system based on polyhedral transformation representation, comprising:
The edit distance calculation module, responsible for receiving at least one neural network architecture, obtaining the edit distance between any two of the received architectures, and converting the metric space of the edit distance into a polyhedral structure wherein each face represents a kernel function operation;
The calculation result combination module, responsible for cascading a plurality of kernel functions in order from simple to complex, the output of each kernel function serving as the input of the next to form a multi-layer kernel function structure, and combining the calculation results of the kernel functions together;
The optimal architecture output module, responsible for selecting one or more blocks from the block structures of the current neural network architecture as a basis, performing operations on the selected blocks such as increasing their number of layers or changing their parameters, and finding the optimal neural network architecture by searching and evaluating different block structures;
The task decomposition module, responsible for decomposing a plurality of neural network architecture search tasks into a plurality of subtasks, executing different subtasks on different computing nodes for the optimal neural network architecture, and realizing parallel computation through communication among the computing nodes.
The working principle and beneficial effects of this technical scheme are as follows. The edit distance calculation module of this embodiment receives at least one neural network architecture, obtains the edit distance between any two of the received architectures, and converts the metric space of the edit distance into a polyhedral structure in which each face represents a kernel function operation. The calculation result combination module cascades a plurality of kernel functions in order from simple to complex, the output of each kernel function serving as the input of the next to form a multi-layer kernel function structure, and combines the results of the kernel calculations together. The optimal architecture output module selects one or more blocks from the block structures of the current neural network architecture as a basis, performs operations on them such as increasing their number of layers or changing their parameters, and searches and evaluates different block structures to find the optimal neural network architecture. The task decomposition module decomposes the search tasks into subtasks, executes different subtasks on different computing nodes for the optimal architecture, and realizes parallel computation through communication among the nodes. The multi-dimensional kernel cascade achieves parallelism and scalability: by distributing the kernel computation operations across multiple faces, multiple kernel computations proceed simultaneously, improving computing efficiency; combining the computation results of the kernel functions reduces the search space and further improves efficiency, while the parallel computation accelerates the whole architecture search process. The block-based transformation changes a group of layers at a time rather than a single layer, so the optimal network can be found more flexibly, further improving the performance and efficiency of the neural network model. Distributing computing tasks to different nodes and communicating among them enables large-scale cluster expansion, makes full use of the cluster's computing resources, and further improves the efficiency and scalability of the algorithm.
The embodiment of the invention abstracts the sequential edit distance calculation into a polyhedral-transformed representation. At this new level of abstraction the computation dependencies are eliminated, so the calculation can be executed in parallel on a graphics processor and the search runs faster; block-based transformation is added on top of the new abstraction to improve the prediction accuracy of the neural network model. By optimizing both search speed and result accuracy on the basis of the abstract representation, a kernel function speedup of up to 10 times is achieved, together with a 3.92-times speedup of network generation and a 20.03% gain in accuracy.
Embodiment 8. As shown in FIG. 16, on the basis of embodiment 7, the edit distance calculation module provided in the embodiment of the invention includes:
The abstract processing unit is responsible for obtaining any two neural network architectures from the received neural network architectures, one of which is named the first neural network architecture diagram and the other the second neural network architecture diagram, and for abstracting both diagrams;
the relationship conversion unit is responsible for converting the hierarchical relationship in the first neural network architecture diagram and the second neural network architecture diagram into nodes and edges in the diagram, wherein the nodes represent layers in the neural network, and the edges represent connection relationships among the layers;
the parallel traversal unit is responsible for defining a distance parameter in the metric space for measuring the distance between two neural network architectures, with the Euclidean distance function selected as the distance function, and for traversing the first and second neural network architecture diagrams in parallel, calculating the distance between nodes during the traversal;
The iteration processing unit is responsible for comparing the numbers of layers of the nodes in each iteration and selecting |L| layers for the distance calculation of the current iteration; during each iteration min(M, N) distances can be calculated in parallel, where M represents the number of layers of the first neural network architecture and N the number of layers of the second neural network architecture;
The result output unit is responsible for outputting the final distance result once the distance calculation for all nodes is completed, where the distance result represents the distance between the two neural network architectures.
The working principle and beneficial effects of the technical scheme are as follows. The abstract processing unit of this embodiment obtains any two neural network architectures from the received architectures, names one the first neural network architecture diagram and the other the second neural network architecture diagram, and abstracts both diagrams. The relationship conversion unit converts the hierarchical relationships in the two diagrams into nodes and edges of a graph, where nodes represent layers of the neural network and edges represent the connection relationships between layers. The parallel traversal unit defines a distance parameter in the metric space for measuring the distance between the two architectures, selects the Euclidean distance function as the distance function, traverses the two diagrams in parallel, and calculates the distance between nodes during the traversal. The iteration processing unit compares the numbers of layers in each iteration and selects |L| layers for the distance calculation of the current iteration; min(M, N) distances can be calculated in parallel per iteration, where M and N are the numbers of layers of the first and second architectures. The result output unit outputs the final distance result once all node distances have been calculated, and this result represents the distance between the two neural network architectures. By converting architectures into graphs and comparing node-to-node distances, the scheme gives a quantitative way to evaluate the similarity and difference of different architectures, assisting the design and selection of neural network architectures. By comparing the distances of different architectures, an architecture with higher similarity to a target architecture can be found and used as a reference, which is very helpful when designing a new neural network architecture or selecting a suitable pre-trained model; the same comparison also reveals commonalities and differences between similar architectures, thereby guiding the optimization of network parameters and the strategy of transfer learning. In summary, through abstraction, distance measurement and comparison of neural network architectures, this embodiment provides a quantitative means to evaluate and compare architectures, assisting the design, selection and optimization of neural networks.
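To make the iterative distance calculation concrete, the following is a minimal sketch assuming each layer is encoded as a small numeric feature vector (a hypothetical encoding such as [layer_type, width, kernel_size]) and that each architecture graph is flattened to a layer sequence; the inner loop over one anti-diagonal stands in for the min(M, N) GPU threads that would run in parallel.

```python
import numpy as np

def layer_distance(a, b):
    # Euclidean distance between two layer feature vectors,
    # used as the substitution cost between layers
    return float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))

def parallel_edit_distance(layers_a, layers_b, indel_cost=1.0):
    """Edit distance between two architectures given as layer sequences.
    All cells on one anti-diagonal of the dynamic-programming table are
    independent, so each sweep can update min(M, N) cells at once; the
    inner loop below is what a GPU would run as one thread per cell."""
    M, N = len(layers_a), len(layers_b)
    D = np.zeros((M + 1, N + 1))
    D[:, 0] = np.arange(M + 1) * indel_cost   # cost of deleting i layers
    D[0, :] = np.arange(N + 1) * indel_cost   # cost of inserting j layers
    for k in range(2, M + N + 1):             # sweep anti-diagonals i + j = k
        for i in range(max(1, k - N), min(M, k - 1) + 1):   # independent cells
            j = k - i
            substitute = D[i - 1, j - 1] + layer_distance(layers_a[i - 1],
                                                          layers_b[j - 1])
            D[i, j] = min(substitute,
                          D[i - 1, j] + indel_cost,   # delete layer i
                          D[i, j - 1] + indel_cost)   # insert layer j
    return D[M, N]

# hypothetical layer encodings: [layer_type, width, kernel_size]
net_a = [[1, 64, 3], [1, 128, 3], [2, 0, 2]]
net_b = [[1, 64, 3], [2, 0, 2]]
print(parallel_edit_distance(net_a, net_b))
```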
Embodiment 9. As shown in FIG. 17, on the basis of embodiment 7, the calculation result combination module provided in the embodiment of the present invention includes:
The architecture connection unit is responsible for connecting a new M-layer neural network architecture with two historical neural network architectures containing N and N' layers respectively, and for generating an original neural network comparison kernel in the form of an M×N matrix, where each element represents the connection strength between a layer of the new network and a layer of the historical network;
The matrix generation unit is responsible for adding zero padding between the last layer of the current neural network and the first layer of the additional network to generate an M×(N+N') matrix for parallel execution of the calculation;
The kernel generating unit is responsible for connecting all newly generated neural networks layer by layer on the original neural network comparison kernel to generate a two-dimensional kernel for calculation and analysis;
The working principle and beneficial effects of the technical scheme are as follows. The architecture connection unit of this embodiment connects a new M-layer neural network architecture with two historical neural network architectures containing N and N' layers respectively, and generates an original neural network comparison kernel in the form of an M×N matrix, where each element represents the connection strength between the m-th layer of the new neural network and the n-th layer of a historical neural network. The matrix generation unit adds zero padding between the last layer of the current neural network and the first layer of the additional network to generate an M×(N+N') matrix used to execute the calculation in parallel. The kernel generation unit connects all newly generated neural networks layer by layer on the original comparison kernel, producing a two-dimensional kernel for calculation and analysis. By connecting neural networks of different depths, the scheme exploits the features and information of different layers and thereby improves the learning and expressive capacity of the network; adding zero padding between the current network and the additional network allows the comparison to be executed in parallel, increasing calculation efficiency; and because new networks can be attached to the comparison kernel layer by layer, the structure can be adjusted flexibly according to the task, improving the adaptability of the model. This way of constructing the comparison kernel is therefore significant for improving the expressive capacity, calculation efficiency and adaptability of the neural network, and is expected to bring new breakthroughs to the development and application of neural networks.
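As an illustration, here is a sketch of the comparison-kernel construction and its tiling, under the assumptions that connection strength is a hypothetical inverse-distance score on layer feature vectors and that the zero padding at the junction has a configurable width; the tile loop stands in for the thread blocks of the graphics processor.

```python
import numpy as np

def connection_strength(new_layers, hist_layers):
    """M x N comparison kernel: entry (m, n) is the connection strength
    between the m-th layer of the new network and the n-th layer of a
    historical network; here a hypothetical inverse-distance score on
    layer feature vectors."""
    A = np.asarray(new_layers, float)[:, None, :]
    B = np.asarray(hist_layers, float)[None, :, :]
    return 1.0 / (1.0 + np.linalg.norm(A - B, axis=-1))

def build_comparison_kernel(new_layers, hist1_layers, hist2_layers, pad=1):
    """Concatenate the per-history kernels along the layer axis with zero
    padding at the junction, giving an M x (N + pad + N') matrix whose
    column blocks are independent and can be computed in parallel."""
    k1 = connection_strength(new_layers, hist1_layers)   # M x N
    k2 = connection_strength(new_layers, hist2_layers)   # M x N'
    zeros = np.zeros((k1.shape[0], pad))                 # zero padding
    return np.hstack([k1, zeros, k2])

def split_into_tiles(K, tile_rows, tile_cols):
    """Split the combined matrix into small tiles; on the graphics
    processor each tile would be assigned to one thread block with
    shared memory, and the per-tile results merged afterwards."""
    tiles = []
    for i in range(0, K.shape[0], tile_rows):
        for j in range(0, K.shape[1], tile_cols):
            tiles.append(((i, j), K[i:i + tile_rows, j:j + tile_cols]))
    return tiles

# hypothetical layer encodings: [layer_type, width, kernel_size]
new   = [[1, 64, 3], [1, 128, 3], [2, 256, 3]]
hist1 = [[1, 64, 3], [2, 128, 5]]
hist2 = [[1, 32, 3], [1, 64, 3], [2, 128, 3]]
K = build_comparison_kernel(new, hist1, hist2)   # shape (3, 2 + 1 + 3)
tiles = split_into_tiles(K, 2, 2)
```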
Embodiment 10. As shown in FIG. 18, on the basis of embodiment 7, the optimal architecture output module provided in the embodiment of the present invention includes:
The layer transformation unit is responsible for adding a selected block to expand the neural network architecture; after the new block is added, it is connected in series with the existing blocks, and the weights, bias values, activation functions and regularization terms of the new block are adjusted;
The block transformation unit is responsible for scaling the neuron-count parameters in the selected block on the basis of the block transformation, searching for the optimal network architecture and block parameters with an evolutionary algorithm, and defining an objective function and constraint conditions as required to guide the search process;
The evaluation unit is responsible for evaluating and testing the new neural network architecture after the transformation is completed, and for further optimizing and adjusting it according to the evaluation results.
The working principle and beneficial effects of the technical scheme are as follows. The layer transformation unit of this embodiment adds a selected block to expand the neural network architecture, connects the new block in series with the existing blocks, and adjusts the weights, bias values, activation functions and regularization terms of the new block. The block transformation unit scales the number of neurons in the selected block on the basis of the block transformation, searches for the optimal network architecture and block parameters with an evolutionary algorithm, and defines an objective function and constraint conditions as required to guide the search. After the transformation is completed, the evaluation unit evaluates and tests the new neural network architecture and performs further optimization and adjustment according to the results. Adding new blocks increases the depth and complexity of the network and provides more expressive capacity, allowing the network to learn more complex patterns and features and improving its performance and generalization. Connecting the new block in series with the existing blocks improves information transfer and fusion, and adjusting the parameters of the new block improves the flow of information, so the network makes better use of the input data. Scaling the number of neurons in the selected block adapts the capacity and complexity of the network to tasks and data sets of different scales, improving flexibility and adaptability. Using an evolutionary algorithm to search for the optimal architecture and block parameters automates the discovery of better network structures, while the objective function and constraint conditions keep the search directed. In summary, by adding new blocks, adjusting connections and parameters, transforming and scaling blocks, and searching with an evolutionary algorithm, this embodiment optimizes the neural network architecture, improves its performance and adaptability, and enables the network to better fit different tasks and data sets with higher performance and effect.
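A minimal sketch of the evolutionary search over block parameters follows; the objective function, mutation choices and constraint bounds below are placeholders, since the real system would score each candidate by training and testing the resulting network.

```python
import random

def evaluate(block):
    """Stand-in objective: the real system would train and test the
    candidate network; here a hypothetical score that prefers a
    mid-sized block (an assumption, not the patent's metric)."""
    return -abs(block["layers"] - 6) - abs(block["neurons"] - 128) / 32.0

def mutate(block, max_layers=12, max_neurons=1024):
    """Block-based transformation: add a layer to the selected block or
    scale its neuron count, subject to the constraint bounds."""
    child = dict(block)
    if random.random() < 0.5:
        child["layers"] = min(max_layers, child["layers"] + 1)   # deepen block
    else:
        child["neurons"] = min(max_neurons,
                               max(8, int(child["neurons"] * random.choice([0.5, 2.0]))))
    return child

def evolve(generations=20, pop_size=8):
    pop = [{"layers": random.randint(2, 8), "neurons": random.choice([32, 64, 128])}
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=evaluate, reverse=True)
        survivors = pop[:pop_size // 2]                           # selection
        pop = survivors + [mutate(random.choice(survivors)) for _ in survivors]
    return max(pop, key=evaluate)

best = evolve()
print(best)   # best block configuration found under the stand-in objective
```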
Embodiment 11. As shown in FIG. 19, on the basis of embodiment 7, the task decomposition module provided in the embodiment of the present invention includes:
the logic control unit is responsible for generating asynchronous control logic using a master node without a graphics processor, the master node being responsible for coordinating logic task allocation;
The node processing unit is responsible for executing independent tasks on the slave nodes using a central processor and an artificial intelligence accelerator card; the slave nodes are responsible for asynchronously executing network generation and training, and each slave node generates and trains a neural network architecture in parallel using a graphics processor.
The working principle and beneficial effects of the technical scheme are as follows. The logic control unit of this embodiment generates asynchronous control logic using a master node without a graphics processor; the master node coordinates the allocation of logic tasks, and once the logic of a generation or training task is confirmed, it dispatches the task to an idle node through the application software. The node processing unit executes independent tasks on the slave nodes using a central processor and an artificial intelligence accelerator card; the slave nodes asynchronously execute network generation and training, and each slave node generates and trains a neural network architecture in parallel using its graphics processor. Meanwhile, the slave nodes train the current neural network and save the current model file through a shared file system. All nodes can access the shared data, so in the next iteration each node can obtain the candidate neural networks generated by other nodes together with their performance indicators, select the best candidate, and update the parameters to optimize the model; the slave nodes separately read the historical candidate network structures in the shared file system and transform a new set of candidate networks on their respective nodes. By generating and training candidates in parallel across several nodes, the scheme accelerates the search and makes better use of the computing resources; because the current model file is stored in the shared file system, the latest model can be accessed at any time for subsequent inference or other operations; and the distributed design allows flexible task scheduling and resource management, so different application requirements can be better met. This embodiment therefore improves the efficiency of generating and training neural networks, better utilizes computing resources, and achieves faster and more flexible task processing.
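The following sketch mimics the master/slave coordination pattern with local processes, assuming a temporary directory as a stand-in for the shared file system and a random score in place of real training; it illustrates the dispatch-and-collect pattern, not the patent's actual scheduler.

```python
import json
import os
import random
import tempfile
from multiprocessing import Pool

def worker_task(args):
    """Slave-node role: generate and 'train' one candidate architecture,
    then save the resulting model record to the shared file system."""
    task_id, shared_dir = args
    candidate = {"id": task_id, "layers": random.randint(2, 10)}
    candidate["score"] = -abs(candidate["layers"] - 6) + random.random()  # stand-in metric
    with open(os.path.join(shared_dir, f"cand_{task_id}.json"), "w") as f:
        json.dump(candidate, f)
    return task_id

def master(num_tasks=8, num_workers=4):
    """Master-node role (no GPU work): coordinate task allocation by
    dispatching generation/training tasks to idle worker processes,
    then read all shared results and select the best candidate for
    the next iteration."""
    shared_dir = tempfile.mkdtemp()   # stand-in for the shared file system
    with Pool(num_workers) as pool:
        pool.map(worker_task, [(i, shared_dir) for i in range(num_tasks)])
    results = []
    for name in os.listdir(shared_dir):
        with open(os.path.join(shared_dir, name)) as f:
            results.append(json.load(f))
    return max(results, key=lambda c: c["score"])

if __name__ == "__main__":
    print(master())
```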
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (7)

1. A neural network architecture search method based on a polyhedral transformation representation, characterized by comprising the following steps:
Receiving at least one neural network architecture, acquiring the edit distance between any two of the received neural network architectures, and converting the metric space of the edit distance into a polyhedral structure, wherein each face of the polyhedron represents one kernel function operation;
cascading a plurality of kernel functions according to the sequence of increasing complexity, wherein the output of each kernel function is used as the input of the next kernel function to form a multi-layer kernel function structure, and combining the calculation results of the kernel functions;
Selecting one or more blocks from the block structures of the current neural network architecture as a basis, and performing operations of increasing the number of layers of the blocks or changing parameters of the blocks on the selected blocks;
Decomposing a plurality of neural network architecture search tasks into a plurality of subtasks, and having different computing nodes execute different subtasks in the search for the optimal neural network architecture;
A process for cascading a plurality of kernel functions, comprising the steps of:
Generating an original neural network comparison kernel of an M-by-N matrix, wherein each element in the matrix represents the connection strength between an M-th layer of the new neural network and an N-th layer of the historical neural network;
adding zero padding between the last layer of the current neural network and the first layer of the additional network to generate an M×(N+N') matrix for parallel execution of the computation;
Connecting all newly generated neural networks layer by layer on an original neural network comparison kernel to generate a two-dimensional kernel for calculation and analysis;
A decomposition process of the M×(N+N') matrix comprises the following steps:
Splitting the whole M×(N+N') matrix into a plurality of small matrices of matching size, wherein each small matrix has the same size as the corresponding neural network diagram, ensuring that the calculation task of each small matrix is processed by one thread block;
on a graphics processor parallel computing platform, distributing each small matrix to a thread block for calculation, wherein a thread block is a group of threads executed in parallel that share memory and compute cooperatively;
For each thread block, performing the calculation in the corresponding small matrix according to the connection strength between the m-th layer of the new neural network and the n-th layer of the historical neural network;
storing intermediate calculation results in the shared memory of the thread blocks, and combining the calculation results of each thread block to obtain the calculation result of the whole M×(N+N') matrix;
all newly generated neural networks are connected layer by layer to generate a two-dimensional kernel for subsequent calculation and analysis.
2. The neural network architecture search method based on a polyhedral transformation representation according to claim 1, wherein the edit distance acquisition process comprises the following steps:
Obtaining any two neural network architectures from the received neural network architectures, one of which is named the first neural network architecture diagram and the other the second neural network architecture diagram;
Converting the hierarchical relationship in the first neural network architecture diagram and the second neural network architecture diagram into nodes and edges in the diagram, wherein the nodes represent layers in the neural network, and the edges represent connection relationships among the layers;
Defining a distance parameter in the metric space for measuring the distance between the two neural network architectures, wherein the Euclidean distance function is selected as the distance function;
In each iteration, comparing the numbers of layers of the nodes and selecting |L| layers for the distance calculation of the current iteration, wherein min(X, Y) distances can be calculated in parallel during each iteration, X representing the number of layers of the first neural network architecture and Y the number of layers of the second neural network architecture;
and outputting the final distance result once the distance calculation for all nodes is completed, wherein the distance result represents the distance between the two neural network architectures.
3. The neural network architecture search method based on polyhedral transformation representation according to claim 1, wherein the process of different computing nodes executing different subtasks comprises the steps of:
generating asynchronous control logic using a master node without a graphics processor, wherein the master node is responsible for coordinating logic task allocation;
The slave nodes use a central processor and an artificial intelligence accelerator card to execute independent tasks; the slave nodes are responsible for asynchronously executing network generation and training, each slave node generates and trains a neural network architecture in parallel using a graphics processor, and meanwhile the slave nodes train the current neural network and save the current model file through a shared file system.
4. The neural network architecture search method of claim 3, wherein all nodes access the shared data to select the best candidate neural network in the next iteration and optimize parameters; each node obtains the candidate neural networks generated by other nodes and the corresponding performance indicators by accessing the shared data, selects the best candidate neural network, and updates the parameters to optimize the model.
5. The neural network architecture search method based on a polyhedral transformation representation as claimed in claim 3, wherein the slave nodes separately read the historical candidate neural network structures in the shared file system and transform a new set of candidate neural networks on their respective nodes.
6. A neural network architecture search system based on a polyhedral transformation representation, comprising:
The edit distance calculation module is responsible for receiving at least one neural network architecture, acquiring the edit distance between any two of the received architectures, and converting the metric space of the edit distance into a polyhedral structure, wherein each face of the polyhedral structure represents one kernel function operation;
the calculation result combination module is responsible for cascading a plurality of kernel functions in order from simple to complex, wherein the output of each kernel function serves as the input of the next to form a multi-layer kernel function structure, and the calculation results of the kernel functions are combined together;
The optimal architecture output module is in charge of selecting one or more blocks from the block structures of the current neural network architecture as a basis, performing operations of increasing the number of layers of the blocks or changing parameters of the blocks on the selected blocks, and searching and evaluating different block structures to find the optimal neural network architecture;
the task decomposition module is responsible for decomposing a plurality of neural network architecture search tasks into a plurality of subtasks, having different computing nodes execute different subtasks in the search for the optimal neural network architecture, and realizing parallel computation through communication among the computing nodes;
A process for cascading a plurality of kernel functions, comprising:
Generating an original neural network comparison kernel of an M-by-N matrix, wherein each element in the matrix represents the connection strength between an M-th layer of the new neural network and an N-th layer of the historical neural network;
adding zero padding between the last layer of the current neural network and the first layer of the additional network to generate an M×(N+N') matrix for parallel execution of the computation;
Connecting all newly generated neural networks layer by layer on an original neural network comparison kernel to generate a two-dimensional kernel for calculation and analysis;
A decomposition process of the M×(N+N') matrix comprises:
Splitting the whole M×(N+N') matrix into a plurality of small matrices of matching size, wherein each small matrix has the same size as the corresponding neural network diagram, ensuring that the calculation task of each small matrix is processed by one thread block;
on a graphics processor parallel computing platform, distributing each small matrix to a thread block for calculation, wherein a thread block is a group of threads executed in parallel that share memory and compute cooperatively;
For each thread block, performing the calculation in the corresponding small matrix according to the connection strength between the m-th layer of the new neural network and the n-th layer of the historical neural network;
storing intermediate calculation results in the shared memory of the thread blocks, and combining the calculation results of each thread block to obtain the calculation result of the whole M×(N+N') matrix;
all newly generated neural networks are connected layer by layer to generate a two-dimensional kernel for subsequent calculation and analysis.
7. The neural network architecture search system based on a polyhedral transformation representation according to claim 6, wherein the edit distance calculation module comprises:
The abstract processing unit is responsible for obtaining any two neural network architectures from the received neural network architectures, one of which is named the first neural network architecture diagram and the other the second neural network architecture diagram;
the relationship conversion unit is responsible for converting the hierarchical relationship in the first neural network architecture diagram and the second neural network architecture diagram into nodes and edges in the diagram, wherein the nodes represent layers in the neural network, and the edges represent connection relationships among the layers;
the parallel traversal unit is responsible for defining a distance parameter in the metric space for measuring the distance between two neural network architectures, with the Euclidean distance function selected as the distance function;
The iteration processing unit is responsible for comparing the numbers of layers of the nodes in each iteration and selecting |L| layers for the distance calculation of the current iteration; during each iteration min(X, Y) distances can be calculated in parallel, where X represents the number of layers of the first neural network architecture and Y the number of layers of the second neural network architecture;
and the result output unit is responsible for outputting the final distance result once the distance calculation for all nodes is completed, wherein the distance result represents the distance between the two neural network architectures.