Towards Classification Exploration in Spatial Crowdsourcing Domain: A Systematic Literature Review

Today, the spatial crowdsourcing concept is widely applied in various fields. The growing number of mobile users and the adoption of social networks have catalysed its growth, making it easy to collect and transmit various types of data from different geographical locations. However, the massive number of tasks in the spatial domain is challenging for online systems to manage, especially when the tasks are heterogeneous and the interactions are dynamic. This scenario has prompted researchers to understand different types of information in order to make task assignment reliable and efficient. This study investigates the current state of task assignment in spatial crowdsourcing. It aims to identify issues such as publication trends and the crowd-computing areas that study task assignment in crowdsourcing. We used the Systematic Literature Review (SLR) method to analyse the trends and the significance of task classification for better dynamic crowd computing.


Introduction
The spatial crowdsourcing concept has been widely applied in various fields such as transportation and commercial food delivery. Its usage is increasing because it is easy to apply, requires low expenditure, and distributes work faster than the traditional business model. It is a concept in which a worker must travel to the task location to complete the task. In addition, the advancement of mobile technology and the emergence of the Internet of Things (IoT) have catalysed the rapid growth of the concept. Spatial crowdsourcing consists of three main elements that play a crucial role in the process: the requester, the system platform, and the crowd worker. The requester outsources a task (e.g., taking a photo at a certain place) and the platform assigns the task to a crowd worker. The system platform plays an essential role in ensuring that the overall process succeeds. In a spatial crowdsourcing system, the tasks and crowd workers are heterogeneous and have dynamic interactions. The arrival and departure of tasks and crowd workers in the system are uncertain. They might leave the system at any time due to various factors such as task expiry or malicious behaviour. Consequently, these uncertain conditions risk incomplete task assignment. Incomplete assignment affects the reliability of the system and hence reduces the number of requesters, which in turn affects the whole crowdsourcing system.
To optimize task assignment, it is also important to ensure that tasks are assigned to suitable (reliable and cost-effective) crowd workers. For example, the locations of the task and the worker play an essential role in allocation decision-making, because they affect the willingness of the worker to travel to the task location. If the task location is too far from the worker's location, the task might be left unattended. In addition, human factors such as malicious behaviour might affect the reliability of task assignment. Hence, it is important for the spatial crowdsourcing system to study which worker should be assigned to which spatial task. Therefore, the objective of this study is to investigate the current state of task assignment in spatial crowdsourcing with regard to the computing scope. To conduct this review, we use a systematic literature review (SLR) method. The findings of this study give insight into the current trends in spatial crowdsourcing and offer viewpoints for other researchers to pursue in future work.

Research Methods
To conduct this study, we adapted the systematic literature review (SLR) method to guide us in evaluating and interpreting the available research on crowdsourcing. It is useful for developing supporting evidence and eliminating bias during the research process (Ali, 2018). The SLR method consists of three main phases: planning the review, conducting the review, and reporting the review.

Phase 1: Planning the review
In the planning phase, we formulate the research question in order to narrow down the article search results. To formulate the research question, we used Population, Intervention, Comparison, Outcomes, and Context (PICOC), following the authors in (Petticrew and Roberts, 2006). It helped us to structure the research question. As a result, we structured the research question as follows: "what are the focus issues in the task assignment of spatial crowdsourcing?".

Phase 2: Conducting the review
In this phase, a few strategies were employed. The first strategy is to derive keywords from the research question and reconstruct them into a search string for finding relevant articles in the digital library databases. The search string used in this study is ("spatial" OR "spatial crowdsourcing" OR "crowdsourcing" AND "assign" OR "assignment"). The second strategy is to select digital libraries for retrieving comprehensive published studies. The digital libraries selected in this study are ACM Digital Library, IEEE Xplore, Google Scholar, Mendeley, and ScienceDirect. The selected digital libraries are subscribed to by the Universiti Putra Malaysia (UPM) Library. Meanwhile, Google Scholar and Mendeley are considered to provide coverage across the boundaries of individual databases, as also used by the authors in (Geiger and Schader, 2014). Some of the articles were also retrieved using ResearchGate webpages and backward reference search. During the searching process, the articles were searched based on their relevancy to the topic. The date range of the published articles is not limited. Some features of the digital libraries, such as advanced search settings and keyword search within abstracts, were used to narrow down the search process.
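As written, the search string does not state how AND and OR are grouped. The sketch below illustrates one plausible reading, in which an article must contain at least one topic term and at least one assignment term. The grouping, the names `TOPIC_TERMS`, `ACTION_TERMS` and `matches_search_string`, and the sample titles are assumptions made for illustration; they are not taken from the review or from any digital library's search syntax.

```python
# Assumed grouping of the search string:
#   ("spatial" OR "spatial crowdsourcing" OR "crowdsourcing")
#   AND ("assign" OR "assignment")
TOPIC_TERMS = ("spatial crowdsourcing", "spatial", "crowdsourcing")
ACTION_TERMS = ("assignment", "assign")


def matches_search_string(text: str) -> bool:
    """Return True if text has at least one topic term AND one action term.

    Matching is a case-insensitive substring test, so "spatially" also
    counts as a hit for "spatial". Real search engines may tokenize
    differently.
    """
    lowered = text.lower()
    has_topic = any(term in lowered for term in TOPIC_TERMS)
    has_action = any(term in lowered for term in ACTION_TERMS)
    return has_topic and has_action


if __name__ == "__main__":
    # Made-up titles for illustration only, not articles from the review.
    sample_titles = [
        "Task assignment in spatial crowdsourcing",
        "A survey of image compression",
    ]
    for title in sample_titles:
        print(title, "->", matches_search_string(title))
```

Under this reading, a title that mentions only a topic term (or only an assignment term) is not retrieved, which is one way to keep the result set focused on task assignment.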
The third strategy is to screen and analyse the selected articles using the inclusion and exclusion criteria, that is, to filter out articles that did not meet the study requirements. The inclusion criteria cover all articles that are published in English, that appear within the relevance-sorted search results, and that are of the selected publication types (i.e., journal and conference proceeding). The articles must focus on task assignment in the spatial computing scope. On the other hand, the exclusion criteria cover articles that are not published in English, that are published in venues other than journals and conference proceedings, or that have fewer than 3 pages. Articles that discuss task assignment but not specifically for the spatial scope are also excluded.
The quality of the filtered articles was then assessed using the study quality assessment criteria. Table 1 shows the quality assessment criteria used in evaluating each article. The quality assessment consists of four questions (Q1-Q4), where each question is given a score: Yes = 1; Partially = 0.5; No = 0. Based on the score given, the articles are rated from 0 (very poor) to 4 (very good).
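The scoring rule above can be sketched in a few lines. This is a minimal illustration that assumes the answers to Q1-Q4 are recorded as the strings "yes", "partially" and "no"; the names `ANSWER_SCORES` and `quality_score` are hypothetical. Only the two endpoints of the rating scale (0 and 4) are stated in the text, so the sketch returns the numeric score and does not guess at the labels in between.

```python
# Score per answer, as stated in the text: Yes = 1, Partially = 0.5, No = 0.
ANSWER_SCORES = {"yes": 1.0, "partially": 0.5, "no": 0.0}


def quality_score(answers):
    """Sum the scores of the four quality questions (Q1-Q4).

    The result lies between 0 (very poor) and 4 (very good).
    """
    if len(answers) != 4:
        raise ValueError("expected exactly four answers (Q1-Q4)")
    return sum(ANSWER_SCORES[answer.lower()] for answer in answers)


if __name__ == "__main__":
    # Hypothetical article: yes on Q1 and Q2, partially on Q3, no on Q4.
    print(quality_score(["Yes", "Yes", "Partially", "No"]))
```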

Phase 3: Reporting the review
The reporting process includes synthesizing the data and reporting the findings, which are discussed further in the next section (Results and Discussion). Data synthesis is the process of extracting from the selected articles the information that answers our identified research question. To extract and synthesize the retrieved articles, we used Mendeley version 1.61.1, which records the reference details for each identified crowdsourcing scope. We categorise the articles based on the year of publication, type of article, methods, abstracts, and scopes.

Results and Discussion
This section describes and illustrates the findings of the investigation, with figures and tables provided as supporting evidence. Overall, 198 articles were deemed relevant to the research question. The retrieved articles were then screened and analysed based on their titles and abstracts. The results show that 87 articles are closely relevant to our scope.
Those articles were then filtered based on the inclusion and exclusion criteria. During this process, all duplicated articles, irrelevant articles, and articles that did not meet the exact requirements were excluded from the selection. The remaining articles were then evaluated using the study quality assessment criteria. After the filtering process, we arrived at 38 articles. All the selected articles are classified as good or very good: 35 articles score very good quality (92.1%) and three (3) papers score good quality (7.9%). Table 2 shows the results of the quality assessment for all articles in the final screening. Those 38 selected articles were then synthesized as supporting evidence to address the research question.
In this study, the articles are selected from journals and conference proceedings. Hence, book chapters, eBooks, patents, etc. were excluded during the selection process. Based on the results, conference publication on task assignment in crowdsourcing started in 2012. In general, the result (in Figure 2) shows that there are three (3) ... The increasing number of published articles might be influenced by the advancement of the Internet and mobile technologies and the rapid evolution of social media. The advancement of these technologies has catalysed a massive amount of data in the networks that contains spatial information such as geolocation, photos, and videos (Chi et al., 2017). The vast amount of heterogeneous data and the dynamic interaction within the environment make it challenging for the crowdsourcing system to understand and differentiate the data behaviour. In addition, the data collected or stored is often affected by noisy signals and repetitive waves, so it is difficult to produce clear and smooth data (Hassan & Curry, 2016). Consequently, the task assignment decision-making process is affected, which leads to unreliable and inefficient computing.
Therefore, further investigation of spatial task criteria is needed.
Our investigation of the publication trends reveals that researchers are studying the same scope of issues. Surprisingly, most of the articles address the issue of understanding crowdsourcing task assignment through classification. From our synthesized articles, task classification has been studied for the (i) privacy, (ii) semantic, (iii) sensing, (iv) location, (v) scheduling and (vi) software-testing domains. Figure 2 illustrates the subject matters of crowdsourcing task assignment gathered from the prior studies. For the privacy and sensing subject matters there are ten (10) and eight (8) articles, respectively. Some articles in these subject matters are inter-related, which influenced how they were read. Meanwhile, nine (9) articles focus on scheduling and five (5) articles relate to location. Three (3) articles focus on each of the software testing and semantic domains.
Based on the results, we found that the focus of task classification in spatial crowdsourcing differs from that in traditional crowdsourcing (Hassan & Curry, 2016). The task assignment approach in spatial crowdsourcing is different because of the spatio-temporal nature of the tasks, which makes them take longer to complete than tasks in traditional crowdsourcing. In traditional crowdsourcing, task classification is more focused on optimizing the quality of the results produced by the crowd workers (Dekel & Sridharan, 2012; Ho et al., 2013). Most of the literature focuses on enhancing the ability of the crowdsourcing system to predict and analyse the accuracy of the output provided by the crowd worker (Alabduljabbar, R. et al., 2019; Hussin, M., 2018). In another example, Hassan and Curry (Hassan et al., 2013) used a Bayesian approach to predict worker performance on newly assigned tasks. Meanwhile, Kazemi and Shahabi (2012) proposed maximum task assignment, which was later extended to maximum score assignment for optimizing the number of assignments and the travel distance. The authors in (Cheng et al., 2014) also focus on optimizing task reliability while catering for task diversity. In a similar research direction, Cui et al. (2017) used agents to learn and adopt human task allocation strategies in their task classification scheme.
From another perspective of task classification in spatial crowdsourcing, privacy has become one of the major concerns. This is not a surprise, because the crowd-computing applications that collect and store the tasks can reside anywhere. They also allow requesters and workers to engage in a wide range of interactions. This situation puts the requesters (users) and even the workers at risk of serious privacy threats (Zhang, X. et al., 2019). Some researchers focus on protecting the users' locations from being exposed. For example, Park et al. (2017) used an Entropy-Maximizing Observation Function and an identification algorithm to protect user identity. Ma et al. (2017) proposed the APPLET framework for encryption and for predicting recommendation results for the user.
Aside from data privacy, many researchers focus on the semantic perspective in spatial crowdsourcing in order to understand the data or descriptions provided by the user. For example, the authors in (Vasardani et al., 2012) used a task classification scheme to improve the interaction between users and services by examining the use of the preposition "at" in a set of crowdsourced place descriptions. Leung (Monteiro et al., 2017) used task classification to improve contextual information in order to help disaster monitoring and response.
Last but not least, we also found that crowdsourcing is related to the software testing domain. In this area, most of the studies use the crowd as the subject for software testing. Software testing benefits from the crowdsourcing platform because of the overwhelming testing requirements, which call for various documentation while maximizing the number of respondents in order to obtain accurate results. For example, Feng used the Pyramid Matching (SPM) technique and natural-language processing techniques to collect the output and views of public users and experts, which were integrated into the crowdsourcing platform. This helps to make communication effective, quick and reliable.
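The counts reported in this section can be cross-checked with a short script. The sketch below only re-uses the numbers stated in the text (38 selected articles, the two quality ratings, and the six subject matters); the names `quality_counts`, `subject_counts` and `percentage_shares` are hypothetical, and the script is an illustration, not part of the review protocol.

```python
# Numbers taken from the text of this section.
TOTAL_SELECTED = 38
quality_counts = {"very good": 35, "good": 3}
subject_counts = {
    "privacy": 10,
    "sensing": 8,
    "scheduling": 9,
    "location": 5,
    "software testing": 3,
    "semantic": 3,
}


def percentage_shares(counts):
    """Return each count as a percentage of the total, to one decimal place."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


if __name__ == "__main__":
    # Both breakdowns should account for all 38 selected articles.
    print(sum(quality_counts.values()) == TOTAL_SELECTED)
    print(sum(subject_counts.values()) == TOTAL_SELECTED)
    print(percentage_shares(quality_counts))
    print(percentage_shares(subject_counts))
```

Both breakdowns add up to the 38 selected articles, and the quality shares reproduce the 92.1% and 7.9% figures reported above.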

Conclusion
The spatial crowdsourcing concept has opened service opportunities to engage and utilize a large number of potential resources or workers. Its operation has contributed significantly to organisational success. However, despite its many advantages, it also brings daunting challenges that could affect the organization as a whole. Because of the nature of the spatial crowdsourcing environment, it is quite challenging to fulfil the service requirements. There is a massive number of task requests, which are heterogeneous and involve dynamic interactions between the agents. In this study, we have conducted a systematic literature review to gain a better understanding of the current state of how resources are assigned and classified, and of the focus highlighted in previous studies. Our findings show that an increasing number of studies emphasising classification were conducted from 2012 to 2019. Based on the findings, we also identified several significant subject matters in spatial crowdsourcing, namely the (i) privacy, (ii) semantic, (iii) sensing, (iv) location, (v) scheduling and (vi) software-testing domains. By understanding these subject matters, we hope that spatial crowdsourcing can be better utilized and further improved in the future.