Disclosure of Invention
The invention aims to: the invention aims to provide an autonomous task decomposition and planning method for a multi-mechanical arm on-orbit service robot. The task planning method utilizes a hierarchical task network planner to decompose a typical on-orbit service task into atomic tasks which can be executed by an end effector of the mechanical arm, and then the decomposed atomic tasks are completed by the mechanical arm, so that the decomposition and planning of the whole on-orbit service task are completed, and the on-orbit service autonomous task planning requirement of the multi-mechanical-arm space robot aiming at on-orbit service is met.
The technical scheme is as follows: in order to achieve the above purpose, the present invention adopts the following technical scheme;
(1) For a multi-arm space robot, building its kinematic model to obtain a mapping matrix from joint space to working space:
(1) For a multi-arm space robot, its kinematic model is a mapping from joint space to working space:
wherein ,a is a mapping matrix from the joint space of the mechanical arm to the working space between any coordinate system i and an adjacent coordinate system i-1 i Is along X i From axis Z i Move to Z i+1 Is a distance of (2); alpha i-1 Is wound around X i-1 From axis Z i-1 Rotate to Z i Is a function of the angle of (2); d, d i Is along Z i Axis from X i-1 Move to X i Is a distance of (2); θ i To be around Z i Axis from X i-1 Rotate to X i Is a function of the angle of (2);
(2) Establishing an on-orbit service task planner model based on a layered task network
The hierarchical task network (Hierarchical task network, HTN) planner consists of five sets of definitions: state), task Network (TN), domain description (domain description), planning problem (HTN) node.
Status: state S is a symbolic description of the world state at execution time of the plan, where S 0 The initial state is the world initial state before planning has not been executed. World state is defined as a set of conjunctive expressions containing true or false binary values, defined as:
S=p 1 ^p 2 ^...^p n
where p is a predicate expression containing true or false binary values, which determines whether the world state is true or not.
Task network TN: the task network TN is represented by a pair of tuples comprising tasks T and constraints C, wherein the tasks comprise atomic tasks (private tasks) and Compound tasks (Compound tasks). The task network doublet is defined as:
T N =(T,C)
wherein TN In order to achieve the object, the task network is a task network, wherein T can be an atomic task or a composite task, the composite task is a complex task which can be decomposed into atomic tasks through a decomposition method (method), and the atomic task is a task which cannot be decomposed and can be directly executed by an end effector of a mechanical arm. C specifies the constraints of the current task in the planning process.
Description of the field: the domain knowledge D is defined in HTN as a binary group containing operators and decomposition methods, and is the core of HTN planning. The domain description is defined as follows:
D=(O,M)
wherein O represents a finite set of operators, is an action template describing an atomic task, and M is a finite set of task decomposition methods. Any operator O e O can be described as the following triplet:
o=(a,Pre(a),Eff(a))
where a represents an atomic task that can be performed by an operator, pre (a) represents a precondition required for performing the task, and ef (a) represents an effect on a state space after performing the task. For any decomposition method mε M, the following is described:
m=(T c ,Pre(T c ),Sub(T c ))
wherein Tc For the composite task to be decomposed, pre (T c ) Representing preconditions required for decomposing the task, sub (T c ) Representing subtasks generated after the composite task is decomposed, wherein the tasks can be composite tasks or atomic tasks.
HTN programming problem: a problem with HTN planning is a four-tuple comprising an initial state space, an initial task network, domain knowledge, and a target state:
P HTN =(S 0 ,TN 0 ,D,G)
wherein ,PHIN For HTN planning problem, S 0 In an initial state, TN 0 For the initial task network, D is the domain description and G is the target state. Planning problem P for any HTN HTN If the solution exists, an action sequence is used as a solution of the problem, and the action sequence is the solution of the planning problem, so that the tasks in all task networks can be completed.
HTN task node: the HTN task node is composed of child nodes, a prize value R at the current node, a node entry number N, and an accumulated prize value Q. HTN nodes are defined as a five-tuple:
node=(n_c,n_p,N,R,Q)
the node is a current task node, n_c is used for storing child nodes, n_p is used for storing parent nodes, and if the current node is a root node, the node is empty. N is the number of node entries, R is the immediate prize value at the node, and Q is the cumulative prize value at the node. The node entering times N are used for calculating UCT values, wherein the UCT values are the basis for node selection in the node selection stage to enter the simulation stage, and are defined as follows:
wherein Qmax (n_c) is the maximum value of the cumulative prize value of all the child nodes under the node n_c, N (node) is the access times of the parent node of the node n_c, N (n_c) is the access times of the node n_c, and c is the search factor, which is a constant.
2. The method for planning on-orbit service tasks of the multi-mechanical arm space robot according to claim 1, which mainly comprises the following parts:
multi-mechanical arm space robot: an atomic task sequence or an action sequence and the like which are responsible for executing the output of the HTN planner;
HTN task planner: a planner composed of various decomposition methods, operators and corresponding domain description files;
3. the on-orbit service mission planning method of a multi-arm space robot according to claim 1, comprising the steps of:
(3-1)target planning problem input, current state information S 0 Input and task network TN input;
(3-2) extracting an initial task t from the task network TN, taking the task as a root task node, and recording the node times N (t).
(3-3) judging whether the task t is an atomic task;
and (3-4) if yes, selecting any operator O epsilon O meeting the current precondition to be executed by the mechanical arm, updating the current state, and deleting the atomic task t from the task network TN. If not, entering the next step;
(3-5) node extension: selecting any decomposition method meeting the current precondition of the task t to decompose, taking the obtained subtask t 'as a child node of the task t, and recording the node entering times N (t');
(3-6) judging whether t' is an atomic task, if not, entering a simulation stage: randomly selecting a decomposable method t 'to decompose the method t' and giving a prize value until the decomposition is stopped for an atomic task (3-7); if it is (3-7);
(3-7) backtracking all the passed task nodes from the atomic task node t' to the task node t where the task node t is currently located, and accumulating the obtained reward value to obtain a Q value;
(3-8) judging whether all the decomposition methods of the task node t are executed, if so, entering (3-9), otherwise, entering (3-5);
(3-9) the current node has no extensible task node, at which point the selection phase is entered: calculating UCT values of all the expansion task nodes t', arranging the UCT values according to descending order, selecting the task node with the maximum value as the current task node, replacing the task t in the task network TN, and returning to (3-2);
and (3-10) judging whether the task network TN is empty, if so, planning to complete and output an action sequence, otherwise, reselecting the task in the task network TN.
The beneficial effects are that: compared with the existing task planning scheme, the autonomous task decomposition and planning method of the multi-mechanical arm space robot is based on a kinematic model of the multi-mechanical arm space robot, and the task planning process is subdivided into three steps: question input, task decomposition, and task execution. The method and the system realize the effectiveness of the task execution process of the space robot, can complete the autonomous decomposition and planning of the space robot, get rid of an on-orbit service mode of manually setting an action sequence to execute on-orbit tasks, and enable the space robot with multiple mechanical arms to execute the autonomous tasks in a complex environment. At present, the on-orbit service of the multi-arm space robot is concentrated on the planning of on-orbit resources at home and abroad, and the multi-arm on-orbit service task planning research of the multi-arm on-orbit service robot is not carried out yet.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
As shown in fig. 1, the functional architecture of the present invention is first composed of two major parts: a multi-arm space robot and an HTN mission planner. And (3) establishing a kinematic model of the space robot to obtain a mapping from a joint space to a working space, and establishing a coordinate system conversion matrix of adjacent joints of the mechanical arm required by kinematics to prepare for executing a subsequently solved action sequence. The HTN task planner performs task decomposition and planning on the mechanical arm to output a motion sequence which can be directly executed by the mechanical arm. The input of the task planner is the field description and the problem description of the task to be planned, an initial task network is established for the task, the task is decomposed by means of a given decomposition method to obtain an atomic task, the task is added into the task network, the initial task is deleted from the task network, and the task planning scheme action sequence is output to be executed by the mechanical arm until the task is decomposed into an atomic task which cannot be decomposed.
As shown in fig. 2, the working steps of the invention are as follows:
(1) Target planning problem input, current state information S 0 Input and task network TN input;
(2) Extracting an initial task t from a task network TN, taking the task as a root task node, recording the node times N (t), and judging whether the task t is an atomic task or not;
(3) If yes, any operator O epsilon O meeting the current precondition is selected to be executed by the mechanical arm, the current state is updated, and the atomic task t is deleted from the task network TN. If not, entering the next step;
(4) Node expansion: selecting any decomposition method meeting the current precondition of the task t to decompose, taking the obtained subtask t 'as a child node of the task t, and recording the node entering times N (t');
(5) Judging whether t' is an atomic task, if not, entering a simulation stage: randomly selecting a decomposable method t 'to decompose the method t' and giving a reward value until the decomposition into atomic tasks stops, and entering (6); if so, directly entering (6);
(6) Backtracking all the passed task nodes from the atomic task node t' until the current task node t, and accumulating the obtained reward value to obtain a Q value;
(7) Judging whether all the decomposition methods of the task node t are executed, if so, entering (8), otherwise, entering (4);
(8) The current node has no extensible task node, and the selection phase is entered at this time: calculating UCT values of all the expansion task nodes t', selecting the task node with the maximum value to replace the task t in the task network TN, and returning to the step (2);
(9) And judging whether the task network TN is empty, if so, planning to complete and output an action sequence, otherwise, reselecting the task in the task network TN.