The Correctness of Service in Runtime Adaptation for Context-Aware Mobile Cloud Learning

Service-Based Applications (SBAs) have become increasingly pervasive. These applications rely on third-party services available on the cloud, and the services must be aware of and adapt to their changing contexts in highly dynamic environments. SBAs with context-aware capabilities provide users with personalized services based on the user's (intrinsic) and the device's (extrinsic) contextual information, as well as the Quality of Service (QoS). The correctness of service substitution in runtime adaptation is essential for the continuity of user activity on the system. In the Mobile Cloud Learning (MCL) environment, most works focus only on intrinsic context factors such as the learner's profile and the learner's location. We therefore introduce a comprehensive Dynamic Service Adaptation of Context-Aware Mobile Cloud Learning (DACAMoL) framework, which is designed to reason over both contextual factors and QoS in service discovery, ranking, and selection. The framework represents the contextual information, service descriptions, and QoS using a semantic-based approach to improve the correctness of service substitution. In this paper, we present a quasi-experiment study to demonstrate the DACAMoL framework with a mobile app called Mudahnya BM. Mudahnya BM is a learning app for basic knowledge of the Malay language that is built using RESTful backend services. The study involved 30 participants and 33 randomized scenarios, tested using the One-Sample Wilcoxon Signed Rank test. The results show significantly better service substitutions, with 32 out of 33 educational services correctly adapted (i.e. 95% of the population).


Introduction
A context-aware service-based application is made up of a composition of services that needs to be closely monitored as requirements or contextual information change during runtime. A dynamic service adaptation process, which includes service discovery, ranking, and selection, is needed to handle these changes. Mobile cloud learning (MCL) applications are well-known examples of applications requiring dynamic adaptation [1]. Nevertheless, selecting a correct equivalent service to replace a service that has become unavailable due to changing requirements or context is still addressed only to a limited extent by current related works. Several factors must be considered when measuring the correctness of the substituted services, such as the learner's context, the device's context, and QoS [2]. In addition, systematic empirical investigations that evaluate the correctness of selecting equivalent services to achieve adaptive and personalized MCL for learners have been lacking.
Thus, this paper presents a quasi-experiment study to measure the correctness of the substituted services using the designed framework named Dynamic Adaptation of Context-Aware Mobile Cloud Learning (DACAMoL) [3]. A series of scenarios has been identified that covers all possible context changes, as well as QoS scores. DACAMoL is designed to sense the learner's context and the device's context on a mobile device at runtime. Because the reasoning process uses context representation and service descriptions with semantic enrichment, specific learning resources can be adapted for the learners. The framework consists of four main phases starting from context acquisition, context-centric adaptation, ontology reasoning, service discovery and selection, and service adaptation. The framework is further explained in [3].
The rest of the paper is organized as follows: Section 2 reviews the related works. The methodology of the study is explained in Section 3. Section 4 describes the experiment, while Section 5 presents the results and discussion. The last section provides the conclusion of the paper.

Related Works
A comprehensive review of eight studies focusing on the MCL environment has been conducted [4]. Most of these studies considered contextual information such as the learner's profile, learner's location, preferred language, device information, learner's interaction, goals, or time to perform adaptation.
Only a limited number of studies considered the device's contextual information related to the internet network connection and battery level. These contexts are only considered by [1] and [5]. The technologies used vary across the frameworks, with most of them using OWL-S to represent their contexts. In terms of the automation of the adaptation process, most of them are self-adaptive frameworks [1]-[2], [5]-[10], and only UoLmP [11] involves human-in-the-loop adaptation, as it allows users to choose their context if it cannot be detected automatically. In addition, the works in [5], [6], [10] considered functional requirements to perform the dynamic adaptation of services. Nevertheless, these works did not consider QoS correctness for their adapted services.
Based on research done by Kolikant [12], [21], correctness in a professional definition is: "…a program is considered as a working program if it exhibits correct I/O behavior for all input in the domain of the problems space". The input in this study is the contextual information, while the output is the substituted equivalent services. MCL has considered the contextual information in implying the correctness of substituted equivalent services, namely network transmission and the device's battery usage. The network transmission rate and the device's battery usage are the two inputs considered as contextual information in the context acquisition phase for dynamic service adaptation. However, some studies have not considered this contextual information [5], [6], [10]. As Sommerville mentioned [13], a low battery could lead to device and system failure. A poor network, in turn, changes the content's behavior, where different outputs are produced based on the respective context. In MCL, the network and battery usage also affect the learner's focus, as learners can get easily distracted when the learning content takes a long time to load and display [14].
These two contexts are important to consider, as they help to minimize the battery usage of the mobile app [15]. Service adaptation is required when the network is poor, that is, when it is below 66 kilobits per second (Kbps), and the device's battery level is considered low when it is below 50 percent [15], [16].
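The two thresholds above can be expressed as a small rule check. The following is a minimal sketch, not the DACAMoL implementation: the `DeviceContext` class and the function names are hypothetical, and treating either condition alone as sufficient to trigger adaptation is an assumption, since the text does not state how the two conditions are combined.

```python
from dataclasses import dataclass

# Thresholds quoted in the text [15], [16].
POOR_NETWORK_KBPS = 66      # network is poor below this transmission rate
LOW_BATTERY_PERCENT = 50    # battery is low below this level


@dataclass(frozen=True)
class DeviceContext:
    """Hypothetical container for the device's (extrinsic) context."""
    network_kbps: float
    battery_percent: float


def is_network_poor(ctx: DeviceContext) -> bool:
    return ctx.network_kbps < POOR_NETWORK_KBPS


def is_battery_low(ctx: DeviceContext) -> bool:
    return ctx.battery_percent < LOW_BATTERY_PERCENT


def needs_adaptation(ctx: DeviceContext) -> bool:
    # Assumption: either condition on its own triggers service adaptation.
    return is_network_poor(ctx) or is_battery_low(ctx)


if __name__ == "__main__":
    print(needs_adaptation(DeviceContext(network_kbps=40, battery_percent=80)))
    print(needs_adaptation(DeviceContext(network_kbps=120, battery_percent=75)))
```

Keeping the two predicates separate makes it straightforward to switch to a stricter combination (both conditions required) if that is what the framework intends.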
The contextual information is one of the components for discovering equivalent services to replace the unavailable services. The correctness of the services is essential for the continuity of user activity on the system. However, some studies [2], [8]-[11], [17] have not adequately addressed correctness in obtaining the most equivalent services to be deployed. Instead, their solutions are devoted to the aspects of effectiveness, reliability, availability, scalability, performance, and security. To validate the correctness of the adapted services, an evaluation with end-users is required, which is discussed in the next section.

The Methodology
This evaluation follows a quasi-experiment method in which we selected 30 third-year Software Engineering students from Universiti Putra Malaysia as our participants. They have prior knowledge of service-oriented architecture (SOA) and service adaptation. In the study, we developed a context-aware mobile app on the Android platform called Mudahnya BM for learning basic knowledge of the Malay language [3]. It is built using PhoneGap technology and integrated with RESTful (Representational State Transfer) backend services, which can be accessed through Uniform Resource Identifiers (URIs). The participants were asked to install the mobile app on their Android mobile devices, or we provided a mobile phone if a participant did not own a suitable device.
The validation of the correctness of the adapted educational services is based on the results in the Assessment module of the Mudahnya BM mobile app. The values of the contextual information (battery level and network status), as well as the QoS value, are randomized automatically using rule-based techniques within the stipulated range. A checklist given to each participant comprises 33 different scenarios that may arise during the operation of the mobile app (refer to Table 1). The experiment is explained in detail in the next section.

The Experiment
The objective of this experiment is to measure the correctness of the adapted educational services based on the contextual information and Quality of Service (QoS) for the Mudahnya BM mobile app. A framework is considered a working framework if it exhibits correct services for all valid scenarios that arise in the environment, as defined by Kolikant [12] in the following quote: "Working framework = exhibits correct services for all correct scenarios". Thus, the correctness of the framework is based on more than 32 correct adapted services out of a total of 33 scenarios (i.e. 95% of the population). Therefore, the hypotheses of this study are as follows:

Null Hypothesis (H0): m ≤ 32
The adapted educational services are incorrect (no) for all context change scenarios and QoS when m (median) ≤ 32.

Alternative Hypothesis (H1): m > 32
The adapted educational services are correct (yes) for all context change scenarios and QoS when m (median) > 32.

In this experiment, there are two independent variables: the context change scenarios (i.e. device's battery level, network status, learners' mark) and QoS (i.e. availability and reputation). The dependent variable is the correctness of the adapted educational services. This correctness is quantified using the median score of the 30 participants answering 33 randomized scenarios, which is evaluated using the One-Sample Wilcoxon Signed Rank Test [18]. This non-parametric test is used to determine whether the median of the sample is equal to a known standard value. After the One-Sample Wilcoxon Signed Rank Test, an effect size test is performed using an absolute standardized test [19]. An effect size is a measure of how important a difference is: a large effect size means the difference is important, while a small effect size means it is unimportant. The results are discussed in the next section.

The Result and Discussion
Based on the experimental result in Table 1, all 30 participants score above 30 across all scenarios (i.e. 33 scenarios). These scenarios consider different context changes, namely network transmission rate, device's battery level, and QoS score (i.e. availability and reputation). QoS 1 refers to high availability and high reputation, QoS 2 to high availability and low reputation, QoS 3 to low availability and high reputation, and QoS 4 to low availability and low reputation. Participants should be able to identify the context changes and the respective services that they should receive. If the service is available according to the expected outcome, Yes (✓) is ticked for the correctness. If the service is unavailable and not as stated in the expected outcome, No (×) is ticked for the correctness.
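The four QoS classes described above amount to a lookup on two binary attributes. A minimal sketch follows; the function name `qos_class` is hypothetical, and how a service's availability or reputation is judged "high" or "low" is not specified here, so the function takes those judgments as boolean inputs.

```python
def qos_class(high_availability: bool, high_reputation: bool) -> str:
    """Map the availability/reputation combination to the QoS label used in the text."""
    table = {
        (True, True): "QoS 1",    # high availability, high reputation
        (True, False): "QoS 2",   # high availability, low reputation
        (False, True): "QoS 3",   # low availability, high reputation
        (False, False): "QoS 4",  # low availability, low reputation
    }
    return table[(high_availability, high_reputation)]


if __name__ == "__main__":
    print(qos_class(True, False))
```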
Only six participants (P13, P20, P22, P26, P27, and P30) did not get the full score of 33/33. These six participants faced issues in Scenario 4, QoS 1, where they did not receive the correct service. The service that they should have retrieved according to the contextual information is the textual content. Similarly, in Scenario 6, QoS 4, the participant did not receive the right service, which is the textual content. The statistical tests were conducted using IBM SPSS Statistics version 21.0 to determine the acceptance or rejection of the null hypothesis at a given significance level.
The result shows that all 30 participants (N=30) score 30 and above. One participant (3.3%) scores 31/33 (93.9%), five participants (16.7%) score 32/33 (97%), and 24 participants (80%) score 33/33 (100%). The mean score is 32.77 and the standard deviation is 0.504. The median of the data is 33, with a minimum of 31 and a maximum of 33. Based on these descriptive data, the six participants who did not receive all services (i.e. < 33) were affected by the downtime period (i.e. below 98%) during which the service was unavailable. The correctness of the adapted educational services requires that the median exceed 32 (i.e. 95% of the population), as stated in the hypothesis in Section 4. Since the median of this experiment is 33, which is 100% (i.e. all 33 scenarios) and above 32, the null hypothesis can be rejected.
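The descriptive statistics above can be re-derived from the reported score distribution alone. The sketch below rebuilds the 30 scores from the stated frequencies (1 × 31, 5 × 32, 24 × 33) and uses Python's standard `statistics` module; it assumes the reported standard deviation is the sample (n − 1) standard deviation, which is what SPSS reports by default.

```python
import statistics

# Score distribution as reported in the text: 1 x 31, 5 x 32, 24 x 33.
scores = [31] * 1 + [32] * 5 + [33] * 24

n = len(scores)
mean = statistics.mean(scores)
sd = statistics.stdev(scores)        # sample standard deviation (n - 1)
median = statistics.median(scores)

print(f"N = {n}")
print(f"mean = {mean:.2f}, sd = {sd:.3f}")
print(f"median = {median}, min = {min(scores)}, max = {max(scores)}")
```

Under that assumption the computed mean, standard deviation, and median agree with the reported 32.77, 0.504, and 33.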
The Shapiro-Wilk normality test [20], which is suited to this dataset because the number of participants is less than 50, is conducted on the dataset. The test indicates that the data are not normally distributed, since the p-value (quoted under Sig. for Shapiro-Wilk) is 0.00, which is less than 0.05. Thus, a non-parametric test is conducted to decide whether to reject the null hypothesis. A 5% significance level was adopted (i.e. 95% confidence level). The output in Table 2 shows that there is significant evidence of a difference in the respondents' results, where the p-value is 0.00. The Wilcoxon test statistic is simply the sum of the positive ranks, but to compute the p-value (Asymp. Sig.), SPSS uses an approximation to the standard normal distribution to give the Standardized (Z) test statistic (4.60) and the resulting p-value in the Asymptotic Sig. (2-sided test) row (p < 0.001), as shown in Figure 1. The bar graph in Figure 1 shows that the observed median is slightly higher than the hypothetical median, which is 32. The value of p is 0.000, and since the p-value < 0.05, we reject the null hypothesis that the median is ≤ 32 and conclude that the median of the sample is significantly different and that the adapted educational services are correct (yes) for all context change scenarios and QoS when m (median) > 32. The objective of the experiment is achieved.
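The standardized statistic can be checked by hand from the reported score distribution. The following is a minimal sketch of the one-sample Wilcoxon signed-rank test with the usual normal approximation and tie correction, written with the standard library only. It is not the SPSS routine: the function name is hypothetical, zero differences are dropped, no continuity correction is applied, and the scores are rebuilt from the stated frequencies (1 × 31, 5 × 32, 24 × 33).

```python
import math
from collections import Counter


def wilcoxon_signed_rank_z(scores, hypothesized_median):
    """Return (W+, Z, two-sided p) via the normal approximation with tie correction."""
    # Drop observations equal to the hypothesized median (zero differences).
    diffs = [s - hypothesized_median for s in scores if s != hypothesized_median]
    n = len(diffs)  # sketch only: assumes at least one non-zero difference
    abs_sorted = sorted(abs(d) for d in diffs)

    # Assign average ranks to tied absolute differences.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and abs_sorted[j] == abs_sorted[i]:
            j += 1
        rank_of[abs_sorted[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    mean_w = n * (n + 1) / 4
    tie_term = sum(t ** 3 - t for t in Counter(abs_sorted).values()) / 48
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term
    z = (w_plus - mean_w) / math.sqrt(var_w)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, z, p_two_sided


scores = [31] * 1 + [32] * 5 + [33] * 24
w_plus, z, p = wilcoxon_signed_rank_z(scores, hypothesized_median=32)
print(f"W+ = {w_plus}, Z = {z:.2f}, p = {p:.2g}")
```

With these inputs, 25 non-zero differences remain, all tied at an absolute value of 1, so each receives rank 13; the sum of positive ranks is 312 and the standardized statistic comes out at about 4.60, in line with the value reported from SPSS.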

Figure 1. Result for Wilcoxon Test
The effect size is computed to measure how well this result generalizes to the population, using Cohen's specification, Eq. (1).
The value of the effect size is 0.84. Since this value is more than 0.5, it indicates a large effect on the generalization of this result to the population. Based on the above analysis and discussion, the adapted educational services are correct for all context changes and QoS for the Mudahnya BM mobile app.
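The effect size equation itself is not reproduced in the text. One common effect size for the Wilcoxon test is r = |Z| / √N, and, as an assumption, the sketch below uses that form because it reproduces the reported value from the reported Z (4.60) and N (30). The function name and the 0.5 cut-off constant are illustrative.

```python
import math

LARGE_EFFECT_THRESHOLD = 0.5  # cut-off for a large effect, as used in the text


def effect_size_r(z: float, n: int) -> float:
    """Effect size r = |Z| / sqrt(N) (assumed form of the missing equation)."""
    return abs(z) / math.sqrt(n)


r = effect_size_r(z=4.60, n=30)
print(f"r = {r:.2f}, large effect: {r > LARGE_EFFECT_THRESHOLD}")
```

If the authors' equation differs from this form, the agreement with 0.84 would be coincidental, so the sketch should be read as a plausibility check only.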

Conclusion
This paper has presented an empirical study, in the form of an experiment, to assess the correctness of adapted services using the designed DACAMoL framework. The framework was tested in the MCL environment, with the Mudahnya BM mobile app used as the case study. The correctness of the adapted educational services considered the contextual information and QoS for the mobile app in the dynamic service adaptation process. The evaluation results demonstrate that the DACAMoL framework achieves significant correctness, with 32 correctly adapted educational services out of a total of 33 scenarios (i.e. 95% of the population), tested using the One-Sample Wilcoxon Signed Rank test. SOA software developers, the main beneficiaries of the work, could readily adopt the framework to handle the adaptation process in the application runtime environment. As future work, we will conduct another experiment with experts to evaluate the framework against related works.

Acknowledgement
We thank the Faculty of Computer Science & Information Technology (FSKTM), UPM, for financial support.