Tutorial (System Identification Toolbox)

Outliers and Bad Data; Multi-Experiment Data

Real data are also subject to bad disturbances: an unusually large disturbance, a temporary sensor or transmitter failure, and so on. It is important that such outliers not be allowed to affect the models too strongly.

The robustification of the error criterion (described under LimitError) helps here, but it is always good practice to examine the residuals for unusually large values, and to go back and critically evaluate the original data responsible for them. If the raw data are obviously in error, they can be smoothed and the estimation procedure repeated.
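
As an illustration of what examining the residuals for unusually large values can look like, here is a minimal sketch in plain MATLAB. It is not the toolbox's own procedure: the residual vector e is synthetic (in practice it would be the prediction errors of an estimated model), and the robust scale estimate and the threshold factor k = 5 are assumptions chosen only for the example.

```matlab
% Synthetic stand-in for a residual sequence (assumption: in practice these
% would be the prediction errors of an estimated model).
rng(0);                                  % reproducible noise
e = 0.1*randn(500,1);                    % nominal residuals
e([120 121 340]) = [2.5; -3.0; 4.0];     % injected outliers for the demo

% Robust scale estimate: median absolute deviation, scaled so that it
% matches the standard deviation for Gaussian data. Unlike std(e), it is
% barely affected by the outliers themselves.
sigma = 1.4826*median(abs(e - median(e)));

% Flag samples whose residual exceeds k*sigma (k = 5 is a hypothetical choice).
k   = 5;
bad = find(abs(e - median(e)) > k*sigma);

disp(bad')    % sample numbers whose raw data deserve a closer look
```

The flagged sample numbers point back to the stretches of the original record that should be inspected, and possibly smoothed or excluded, before the estimation is repeated.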

Often the data have portions with bad behavior. This may be due, for example, to large disturbances or sensor failures over a period of time. There may also be time periods where "nothing happens," the input is not exciting, and so on. The best alternative is then to break up the data into informative portions. By merging the pieces into a multi-experiment iddata object, they can still be used together to build models. Multi-experiment data is also useful when several different experiments have been performed on the same plant. All estimation, simulation, and validation routines in the toolbox handle multi-experiment data transparently and take proper action to deal with the different pieces. A typical string of commands could be:

plot(Data)
Datam = merge(Data(1:340),Data(500:897), ...
              Data(1001:1200),Data(1550:2000))
m = pem(getexp(Datam,[1,2,4])) % Portions 1, 2, and 4 for estimation
compare(getexp(Datam,3),m)     % Portion 3 for validation
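
The sample ranges in the commands above are picked by eye from the plot. As a rough aid, the sketch below shows one possible way to locate stretches where the input is exciting, using plain MATLAB only. It is an illustration under stated assumptions, not a toolbox feature: the input u is synthetic, and the window length win and the threshold thr are hypothetical values that would need tuning for real data.

```matlab
% Synthetic input: exciting in two stretches, constant elsewhere
% (assumption: stands in for the input channel of a measured record).
rng(1);
u = zeros(1000,1);
u(101:400) = sign(randn(300,1));
u(601:900) = sign(randn(300,1));

win = 50;     % hypothetical window length, in samples
thr = 0.1;    % hypothetical variance threshold

% Variance of x over a trailing window of length win: E[x^2] - (E[x])^2.
b        = ones(win,1)/win;
trailvar = @(x) filter(b,1,x.^2) - filter(b,1,x).^2;

% Require excitation both in the window behind and in the window ahead of
% each sample, so the detected ranges stay inside the exciting stretches.
vPast   = trailvar(u);
vFuture = flipud(trailvar(flipud(u)));
active  = (vPast > thr) & (vFuture > thr);

% First and last sample of each contiguous active run.
d      = diff([0; active; 0]);
starts = find(d == 1);
stops  = find(d == -1) - 1;
disp([starts stops])

% The ranges could then be used as in the commands above, for example
%   merge(Data(starts(1):stops(1)), Data(starts(2):stops(2)))
```

Because both windows must show excitation, the ranges found this way are conservative: they lose a few samples at each end of an informative stretch rather than pulling in samples where nothing happens. They are best treated as a starting point to be checked against the plot.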
