Evaluation Methods

We evaluated OMNI by performing a set of experiments aimed at answering the following research questions.

  • RQ1 (Accuracy / No Overfitting): How effective are OMNI models at predicting QoS property values for system runs other than the one used to collect the execution-time observation datasets for the refinement?
  • RQ2 (Refinement Granularity): What is the effect of varying the OMNI refinement granularity on the accuracy, size, and verification time of the refined model?
  • RQ3 (Training Dataset Size): What is the effect of the training dataset size on the accuracy of the refined model?
  • RQ4 (Component Classification): What is the benefit of using a component classification step within OMNI?
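For RQ1, prediction accuracy is a matter of comparing model-predicted QoS values against values observed in held-out system runs. As an illustrative sketch only (the metric, function name, and data below are assumptions, not taken from the OMNI evaluation), one common choice is the mean absolute percentage error (MAPE):

```python
# Illustrative sketch, not OMNI's actual evaluation code: quantifying
# RQ1-style accuracy as the mean absolute percentage error (MAPE) between
# QoS values predicted by a refined model and values observed in system
# runs held out from the refinement dataset.

def mape(predicted, observed):
    """Mean absolute percentage error; observed values are assumed nonzero."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be nonempty and of equal length")
    return 100.0 * sum(abs(p - o) / abs(o)
                       for p, o in zip(predicted, observed)) / len(predicted)

# Hypothetical response-time values (ms) for three held-out runs
predicted = [120.0, 95.0, 210.0]
observed = [100.0, 100.0, 200.0]
print(round(mape(predicted, observed), 2))  # → 10.0
```

A lower MAPE on runs not used during refinement indicates that the refined model generalizes rather than overfits the observation dataset.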

To assess the generality of OMNI, we carried out our experiments in two case studies that used real systems and datasets from different application domains. The first case study is based on the travel web application presented in Section 3 and used as a motivating example earlier in the paper. In the second case study, we applied OMNI to an IT support system. This system is introduced in Section 6.1, followed by descriptions of the experiments carried out to address the four research questions in Sections 6.2 through 6.5.