[ARTICLE] A reward–punishment feedback control strategy based on energy information for wrist rehabilitation – Full Text


Based on evidence from previous research on rehabilitation robot control strategies, we found that the common feature of control strategies that effectively promote subjects’ engagement is a reward–punishment feedback mechanism. This article proposes a reward–punishment feedback control strategy based on energy information. Firstly, an engagement estimation approach based on energy information is developed to evaluate subjects’ performance. Secondly, the estimated result forms a reward–punishment term, which is introduced into a standard model-based adaptive controller. The modified adaptive controller can give reward–punishment feedback to subjects according to their engagement. Finally, experiments are carried out on a wrist rehabilitation robot to evaluate the proposed control strategy with 10 healthy subjects who have no cardiovascular or cerebrovascular diseases. The results show that the mean coefficient of determination (R²) between the data obtained by the proposed approach and by the classical approach is 0.7988, which illustrates the reliability of the engagement estimation approach based on energy information. The results also demonstrate that the proposed controller has great potential to promote patients’ engagement in wrist rehabilitation.


Stroke has become one of the major diseases threatening people’s physical and mental health worldwide.1 Loss of control of the upper limbs is a common impairment underlying disability after stroke, and it seriously affects patients’ daily activities.2 Traditional physical therapy is labor intensive and demands considerable effort from therapists.3 With the development of robotics, rehabilitation robots have emerged as a new approach to rehabilitation.4 Rehabilitation robots are able to assist patients in completing training tasks without therapists. In addition, rehabilitation robots can estimate patients’ rehabilitation status accurately through a variety of sensors, which helps therapists develop follow-up treatment plans for patients.

Control of rehabilitation robots, however, remains an open research area. Control strategies that target subjects ranging from the mildly impaired to the severely impaired are the most extensively investigated controller paradigm in the rehabilitation robotics community and have been shown to be among the most promising techniques for promoting recovery after stroke.5,6 There is strong evidence that high engagement in rehabilitation training induces neural plasticity.7 Therefore, great attention is paid to investigating how robot control strategies can promote patients’ active engagement in robotic therapy.

The assist-as-needed (AAN) control strategy is one of the most popular research topics in the field of rehabilitation robot control and is considered promising for promoting patients’ engagement. As the name suggests, the AAN control strategy emphasizes that the robot supplies only as much effort as a patient needs to accomplish the training task, based on a real-time estimate of his/her performance.8 Impedance control, first proposed by Hogan, was the earliest technique applied in AAN control.9 As a representative example, Krebs et al. proposed an AAN controller based on impedance control for the MIT-Manus,10,11 which can update impedance parameters according to patients’ performance. In this kind of robotic therapy, the robot provides assistance based on specific impedance parameters when the subject is not able to track the desired trajectory, and provides no assistance when the subject is able to track or exceed the desired trajectory, so that the subject can move voluntarily. This mechanism encourages subjects to break free of the limits of the desired trajectory, which can be regarded as a reward that makes them more active. However, some subjects showed signs of slacking: they relied too much on the robot’s assistance to complete the task, and faced no punishment for doing so.12 In other words, giving only rewards without punishments leads to slackness in rehabilitation training. Therefore, it is necessary to develop control strategies that exhibit reward–punishment feedback.
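The one-sided assistance described above can be illustrated with a short sketch. It is not the controller of Krebs et al.; it is a minimal single-joint example that assumes motion in the positive direction, and the function name, the stiffness `k`, and the damping `b` are hypothetical values chosen only for illustration.

```python
def assist_torque(q_des, q, dq_des, dq, k=2.0, b=0.1):
    """One-sided impedance assistance for a single joint (illustrative only).

    q_des, q   : desired and measured joint angle (rad)
    dq_des, dq : desired and measured joint velocity (rad/s)
    k, b       : hypothetical stiffness and damping gains
    """
    lag = q_des - q  # positive when the subject is behind the desired trajectory
    if lag <= 0.0:
        # The subject tracks or leads the trajectory: no assistance,
        # so the subject is free to move voluntarily (the "reward").
        return 0.0
    # The subject lags: spring-damper assistance pulling toward the trajectory,
    # clamped so that the robot never resists the movement.
    return max(0.0, k * lag + b * (dq_des - dq))


if __name__ == "__main__":
    print(assist_torque(0.5, 0.3, 0.0, 0.0))  # subject lags, robot assists
    print(assist_torque(0.3, 0.5, 0.0, 0.0))  # subject leads, no assistance
```

Because the robot never penalizes a lagging subject in this scheme, a subject can simply wait for the assistance, which is the slacking behavior noted above.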

Wolbrecht et al. proposed an adaptive controller that includes a forgetting term to create the reward–punishment feedback mechanism.13 The adaptive law is made up of an error-based adaptive law and a forgetting law. The error-based adaptive law dominates when the tracking error is large, so that the robot assists the subject to complete the task, while the forgetting law dominates when the tracking error is small, so that the assistance force decays to promote the subject’s active engagement. This forms a mechanism that gives reward feedback to subjects by exhibiting a small tracking error when they are highly engaged and gives punishment feedback by exhibiting a large tracking error when they are slack. However, because the adaptive controller is model-based, it does not perform well when applied to wrist or finger rehabilitation: minor modeling deviations affect the wrist or finger much more than the upper limb. As a result, the tracking error does not change significantly regardless of the degree of the subject’s engagement.
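The interplay between the error-based law and the forgetting law can be sketched with a scalar, discrete-time update. This is a deliberately simplified illustration, not the adaptive law of reference 13: the single assistance parameter `a_hat`, the adaptation gain `gamma`, and the forgetting time constant `tau` are all assumptions introduced here.

```python
def update_assistance(a_hat, error, dt, gamma=5.0, tau=2.0):
    """One Euler step of a scalar adaptive law with forgetting (illustrative).

    a_hat : current assistance estimate
    error : tracking error (desired minus measured position)
    gamma : hypothetical adaptation gain
    tau   : hypothetical forgetting time constant (s)
    """
    adapt = gamma * error   # error-based term: grows assistance when error is large
    forget = -a_hat / tau   # forgetting term: decays assistance toward zero
    return a_hat + dt * (adapt + forget)


if __name__ == "__main__":
    dt = 0.01

    # Engaged subject: zero tracking error, so the assistance decays.
    a_hat = 1.0
    for _ in range(1000):
        a_hat = update_assistance(a_hat, error=0.0, dt=dt)
    print("after 10 s with zero error:", a_hat)

    # Slack subject: persistent error, so the assistance builds up
    # toward the equilibrium gamma * tau * error.
    a_hat = 0.0
    for _ in range(2000):
        a_hat = update_assistance(a_hat, error=0.1, dt=dt)
    print("after 20 s with error 0.1:", a_hat)
```

In this toy model the assistance settles where the two terms balance, at `gamma * tau * error`, so a persistent error sustains assistance while a vanishing error lets it fade.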

Another improvement to the AAN control strategy was proposed by Pehlivan et al., who introduced a minimal AAN control strategy that relies on a Kalman filter to estimate subjects’ capability.14,15 According to the estimated results, the controller updates the derivative feedback gain to modify the bounds of allowable error on the desired trajectory, which also reflects the idea of reward–punishment feedback. Subsequently, the Kalman filter was replaced by a nonlinear disturbance observer, and electromyography (EMG) sensors were used to estimate the subjects’ engagement.16

To sum up, in order to promote the engagement of subjects, the common feature of the above control strategies is that they can create a reward–punishment feedback mechanism according to the subjects’ current engagement or performance. To the best of our knowledge, previous researchers have not specifically identified this mechanism. More control strategies for rehabilitation robots support this point of view.17–25

In this article, we propose a reward–punishment feedback control strategy to promote subjects’ engagement in wrist rehabilitation. Firstly, we use the energy contributed by the subject to estimate his/her engagement. The energy is obtained by integrating the torque contributed by the subject with respect to position. Secondly, we propose an adaptive controller that includes a reward–punishment term. Unlike the adaptive controller above,13 the included term is not constant. Instead, it is updated based on the estimated results, so that the controller can give reward or punishment feedback to subjects through different tracking errors, which makes it suitable for wrist rehabilitation. Finally, the control strategy is demonstrated through experiments in which healthy subjects without cardiovascular or cerebrovascular diseases operate a wrist rehabilitation robot. The contributions of this work include the development of an engagement estimation approach that needs no extra sensors, which greatly reduces development costs. This work also proposes an improved adaptive controller including a reward–punishment term for wrist rehabilitation, which has great potential to promote subjects’ engagement.
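The energy measure mentioned above, the integral of the subject’s torque with respect to joint position, can be approximated numerically from sampled data. The sketch below uses the trapezoidal rule; it assumes that sampled subject torque and joint angle are already available (how the subject’s torque is obtained is not covered in this excerpt), and the function name is hypothetical.

```python
def subject_energy(torque, angle):
    """Approximate E = integral of torque d(angle) with the trapezoidal rule.

    torque : sequence of subject-contributed torque samples (N*m), assumed given
    angle  : sequence of joint angle samples (rad), same length as torque
    Returns the energy in joules; the sign follows the direction of motion.
    """
    if len(torque) != len(angle):
        raise ValueError("torque and angle must have the same length")
    energy = 0.0
    for i in range(1, len(angle)):
        mean_torque = 0.5 * (torque[i] + torque[i - 1])
        energy += mean_torque * (angle[i] - angle[i - 1])
    return energy


if __name__ == "__main__":
    # A constant 2 N*m torque applied over 0.5 rad of flexion.
    angles = [0.0, 0.25, 0.5]
    torques = [2.0, 2.0, 2.0]
    print(subject_energy(torques, angles))
```

Integrating against position rather than time means that torque applied while the joint is not moving contributes no energy, which is one reason such a measure reflects active movement rather than static pushing.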

This article is organized as follows. The second section presents an engagement estimation approach based on energy information and a model of the human–robot coupled system. The third section proposes an adaptive controller including a reward–punishment term and details the Lyapunov stability analysis. The fourth section introduces the specific implementation methods of three experiments. The fifth section presents and analyzes the experimental results. Finally, the discussion and conclusion are presented in the sixth section.

Engagement estimation based on energy information

We have developed a wrist rehabilitation robot, a three degree-of-freedom (DOF) device, as shown in Figure 1(a). The device is capable of independently actuating all three DOFs of the subject’s forearm and wrist. Correspondingly, the device has three joints, a flexion/extension joint, a radial/ulnar deviation joint, and a pronation/supination joint, all of which can be controlled. Each joint of the device is driven by a brushless DC motor through a belt. Because the three joints share this drive arrangement, their control methods are similar, and this article describes only the control strategy of the flexion/extension joint.

Figure 1. The mechanical structure of the wrist rehabilitation robot. (a) The directional view of the human robot coupled system. (b) The side view of the wrist rehabilitation robot.


