Volume 12, Number 3, 2015
Special Issue on Latest Advances in ILC/RLC Theory and Applications (pp.243-336)
A tremendous amount of data is generated and saved in many complex engineering and social systems every day. It is both significant and feasible to use this big data to make better decisions through machine learning techniques. In this paper, we focus on batch reinforcement learning (RL) algorithms for discounted Markov decision processes (MDPs) with large discrete or continuous state spaces, aiming to learn the best possible policy given a fixed amount of training data. Batch RL algorithms with handcrafted feature representations work well for low-dimensional MDPs. However, for many real-world RL tasks, which often involve high-dimensional state spaces, it is difficult or even infeasible to use feature engineering to design features for value function approximation. To cope with high-dimensional RL problems, the desire for data-driven features has led to much work on incorporating feature selection and feature learning into traditional batch RL algorithms. In this paper, we provide a comprehensive survey of automatic feature selection and unsupervised feature learning for high-dimensional batch RL. Moreover, we present recent theoretical developments in applying statistical learning to establish finite-sample error bounds for batch RL algorithms based on weighted Lp norms. Finally, we outline some future directions for research on RL algorithms, theories and applications.
Norm optimal iterative learning control (NOILC) has recently been applied to iterative learning control (ILC) problems in which tracking is only required at a subset of isolated time points along the trial duration. This problem addresses the practical needs of many applications, including industrial automation, crane control, satellite positioning and motion control within a medical stroke rehabilitation context. This paper provides a substantial generalization of this framework by providing a solution to the problem of convergence at intermediate points with simultaneous tracking of subsets of outputs to reference trajectories on subintervals. This formulation enables the NOILC paradigm to tackle tasks which mix "point to point" movements with linear tracking requirements and hence substantially broadens the application domain to include automation tasks which include welding or cutting movements, or human motion control where the movement is restricted by the task to straight line and/or planar segments. A solution to the problem is presented in the framework of NOILC and inherits NOILC's well-defined convergence properties. Design guidelines and supporting experimental results are included.
This paper addresses the problem of robust iterative learning control design for a class of uncertain multiple-input multiple-output discrete linear systems with actuator faults. The stability theory for linear repetitive processes is used to develop formulas for gain matrix design, together with convergence conditions in terms of linear matrix inequalities. An extension to deal with model uncertainty of the polytopic or norm-bounded form is also developed and an illustrative example is given.
Terminal iterative learning control (TILC) is developed to reduce, over iterations, the error between the system output and a fixed desired point at the terminal end of the operation interval under strictly identical initial conditions. In this work, the initial states are no longer required to be identical and may vary from iteration to iteration. In addition, the desired terminal point is no longer fixed and is allowed to change from run to run. Consequently, a new adaptive TILC is proposed with a neural network initial state learning mechanism to achieve the learning objective over iterations. The neural network is used to approximate the effect of iteration-varying initial states on the terminal output, and the neural network weights are identified iteratively along the iteration axis. A dead-zone scheme is developed such that both learning and adaptation are performed only if the terminal tracking error is outside a designated error bound. It is shown that the proposed approach is able to track run-varying terminal desired points quickly, with a specified tracking accuracy beyond the initial state variance.
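The dead-zone idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's adaptive neural-network scheme: the static terminal map `terminal_output`, its coefficients, the learning gain and the error bound are all hypothetical choices made only to show how an update is skipped once the terminal error falls inside the bound.

```python
def terminal_output(u, x0, gain=2.0, init_effect=0.5):
    """Hypothetical static terminal map: y(T) = gain * u + init_effect * x0."""
    return gain * u + init_effect * x0


def dead_zone_tilc(targets, initial_states, learning_gain=0.4, bound=0.05):
    """Run one control update per iteration, gated by a dead zone.

    targets[k] is the desired terminal point of run k and initial_states[k]
    is the initial state of run k (both may vary from run to run).
    Returns the final input and a list of (terminal_error, updated) pairs.
    """
    u = 0.0
    history = []
    for y_des, x0 in zip(targets, initial_states):
        error = y_des - terminal_output(u, x0)
        updated = abs(error) > bound  # learn only outside the dead zone
        if updated:
            u += learning_gain * error
        history.append((error, updated))
    return u, history


# Example: fixed target, identical initial states (assumed values).
final_u, log = dead_zone_tilc([1.0] * 5, [0.0] * 5)
```

In this example the terminal error contracts by a factor of roughly 0.2 per update, so learning stops as soon as the error magnitude drops to the assumed bound of 0.05.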
This paper addresses an iterative learning control (ILC) design problem for discrete-time linear systems with randomly varying trial lengths. Due to the variation of the trial lengths, a stochastic matrix and an iteration-average operator are introduced to present a unified expression of the ILC scheme. By using the lifted-system framework, the learning convergence condition of ILC in mathematical expectation is derived without using the λ-norm. It is shown that the requirement in classic ILC that all trial lengths be identical is relaxed and that the identical initialization condition can also be removed. Finally, two illustrative examples are presented to demonstrate the performance and effectiveness of the proposed ILC scheme for both time-invariant and time-varying linear systems.
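As a rough illustration of learning with trials that end early, the sketch below applies a plain P-type update to a scalar discrete-time system and leaves the input unchanged at every time step the trial did not reach. This is a simplification, not the iteration-average scheme of the paper; the plant parameters `a`, `b`, `c`, the gain and the range of trial lengths are assumptions chosen so that |1 - gain * c * b| < 1.

```python
import random


def run_trial(u, length, a=0.9, b=1.0, c=1.0, x0=0.0):
    """Simulate x(t+1) = a*x(t) + b*u(t) for `length` steps.

    Returns the outputs y(1), ..., y(length), with y = c * x.
    """
    x = x0
    outputs = []
    for t in range(length):
        x = a * x + b * u[t]
        outputs.append(c * x)
    return outputs


def ilc_varying_length(reference, trial_lengths, gain=0.8):
    """P-type ILC where trial k only lasts trial_lengths[k] steps.

    reference[t] is the desired value of y(t+1). Where the trial stopped
    early, the missing error is treated as zero, so u(t) is not updated.
    """
    u = [0.0] * len(reference)
    for length in trial_lengths:
        y = run_trial(u, length)
        for t in range(length):
            u[t] += gain * (reference[t] - y[t])
    return u


# Example with randomly drawn trial lengths between 7 and 10 (assumed range).
rng = random.Random(0)
lengths = [rng.randint(7, 10) for _ in range(200)]
learned_u = ilc_varying_length([1.0] * 10, lengths)
```

Time steps near the end of the trial are updated less often than early ones, so they converge more slowly; the early steps are corrected in every iteration.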
In this paper, a discrete-frequency technique is developed for analyzing the sufficiency and necessity of monotone convergence of a proportional higher-order-derivative iterative learning control scheme for a class of linear time-invariant systems with higher-order relative degree. The technique consists of two steps. The first step is to expand the iterative control signals, their driven outputs and the relevant signals as complex-form Fourier series and then to deduce the properties of the Fourier coefficients. The second step is to analyze the sufficiency and necessity of monotone convergence of the proposed proportional higher-order-derivative iterative learning control scheme by assessing the tracking errors in the form of Parseval's energy modes. Numerical simulations are presented to demonstrate the validity and effectiveness of the technique.
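The role of Parseval's relation in the second step can be sketched numerically. The snippet below uses the discrete Fourier transform as a stand-in for the complex-form Fourier series of the abstract and checks that the energy of a sampled tracking error equals the energy computed from its Fourier coefficients. The error samples are hypothetical and the naive transform is for illustration only.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def time_energy(x):
    """Energy computed in the time domain: sum of |x[t]|^2."""
    return sum(abs(v) ** 2 for v in x)


def frequency_energy(x):
    """Energy computed from the Fourier coefficients: (1/N) * sum of |X[k]|^2."""
    return sum(abs(c) ** 2 for c in dft(x)) / len(x)


# Hypothetical tracking-error samples over one trial of 16 steps.
error = [
    math.sin(2 * math.pi * t / 16) + 0.3 * math.cos(6 * math.pi * t / 16)
    for t in range(16)
]

# Parseval's relation: both energies agree up to rounding error.
gap = abs(time_energy(error) - frequency_energy(error))
```

Because the two energies coincide, a bound on how each Fourier coefficient of the error shrinks from one iteration to the next translates directly into a statement about the error energy.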
The repetitive control (RC) or repetitive controller problem for nonminimum phase nonlinear systems is both challenging and practical. In this paper, we consider an RC problem for the translational oscillator with a rotational actuator (TORA), which is a nonminimum phase nonlinear system. The major difficulty is to handle both a nonminimum phase RC problem and a nonlinear problem simultaneously. For this purpose, a new RC design, namely the additive-state-decomposition-based approach, is proposed, by which the nonminimum phase RC problem and the nonlinear problem are separated. This makes RC for the TORA benchmark tractable. To demonstrate the effectiveness of the proposed approach, a numerical simulation is given.
Typical masking techniques adopted in conventional secure communication schemes are additive masking and modulation by multiplication. In order to enhance security, this paper presents a nonlinear masking methodology applicable to the conventional schemes. In the proposed cryptographic scheme, the plaintext spans a pre-specified finite-time interval, is modulated through parameter modulation, and is masked chaotically by a nonlinear mechanism. An efficient iterative learning algorithm is exploited for decryption, and a sufficient condition for convergence is derived, by which the learning gain can be chosen. Case studies are conducted to demonstrate the effectiveness of the proposed masking method.
The design of an iterative learning controller (ILC) requires storing the system input, output or control parameters of previous trials to generate the input of the current trial. To apply the iterative learning controller in a real application and reduce the memory size needed for implementation, this paper proposes a current error based sampled-data proportional-derivative (PD) type iterative learning controller for control systems with initial resetting error, input disturbance and output measurement noise. The proposed iterative learning controller is simple and effective. The first contribution of this paper is to prove the learning error convergence via a rigorous technical analysis. It is shown that the learning error will converge to a residual set if a forgetting factor is introduced in the controller. All the theoretical results are also illustrated by computer simulations. The second main contribution is to realize the iterative learning controller as a digital circuit using a field programmable gate array (FPGA) chip, applied to repetitive position tracking control of direct current (DC) motors. The feasibility and effectiveness of the proposed current error based sampled-data iterative learning controller are demonstrated by the experimental results. Finally, the relationship between learning performance and design parameters is also discussed extensively.
In this paper, the iterative learning control (ILC) technique is applied to a class of discrete parabolic distributed parameter systems described by partial difference equations. A P-type learning control law is established for the system. The ILC of discrete parabolic distributed parameter systems is more complex because 3D dynamics in the time, spatial and iterative domains are involved. To overcome this difficulty, a discrete Green formula and an analogous discrete Gronwall inequality, as well as some other basic analytic techniques, are utilized. With rigorous analysis, the proposed intelligent control scheme guarantees the convergence of the tracking error. A numerical example is given to illustrate the effectiveness of the proposed method.
A new approach to adaptive distributed control is proposed for a class of networks with unknown time-varying coupling weights. The proposed approach ensures that the complex dynamical networks achieve asymptotic synchronization and that all the closed-loop signals are bounded. Furthermore, the coupling matrix is not assumed to be symmetric or irreducible, and asymptotic synchronization can be achieved even when the graph of the network is not connected. Finally, a simulation example shows the feasibility and effectiveness of the approach.
In this paper, an iterative learning control algorithm is proposed for discrete linear time-varying systems to track iteration-varying desired trajectories. A high-order internal model (HOIM) is utilized to describe the variation of desired trajectories in the iteration domain. In the sequel, the HOIM is incorporated into the design of learning gains. The learning convergence in the iteration axis can be guaranteed with rigorous proof. The simulation results with permanent magnet linear motors (PMLM) demonstrate that the proposed HOIM based approach yields good performance and achieves perfect tracking.
The genetic algorithm (GA) has received significant attention for the design and implementation of intrusion detection systems. In this paper, it is proposed to use variable length chromosomes (VLCs) in a GA-based network intrusion detection system. Fewer chromosomes with relevant features are used for rule generation. An effective fitness function is used to define the fitness of each rule. Each chromosome contains one or more rules. As each chromosome is a complete solution to the problem, fewer chromosomes are sufficient for effective intrusion detection. This reduces the computational time. The proposed approach is tested using Defense Advanced Research Projects Agency (DARPA) 1998 data. The experimental results show that the proposed approach is efficient in network intrusion detection.