The Limitations of Cross-Site Scripting Vulnerabilities Detection and Removal Techniques

Web applications have become important tools in our daily activities: we use them to share and obtain information, conduct business, and interact with family and friends on social media over the Internet. Despite their importance, web applications are plagued with many security vulnerabilities that enable hackers to attack them and compromise user information and privacy. Cross-site scripting vulnerabilities are a type of injection vulnerability found in web applications. They can lead to attacks on web applications because the affected web pages do not properly validate input data. Many approaches and techniques have been proposed to mitigate this type of vulnerability. However, these solutions have limitations, and cross-site scripting vulnerabilities remain a major security problem for web applications. This paper explores and presents the existing techniques for detecting and removing cross-site scripting vulnerabilities in web applications. It gives an overview of cross-site scripting as a security issue in web applications and of its different types. The advantages as well as the limitations of each technique are highlighted and discussed. Based on the limitations, some possible future research directions are identified, and recommendations are given as a reference for researchers interested in this topic.


Introduction
As technology advances, we rely on the Internet to carry out daily transactions using desktop and mobile web applications. Banking websites enable customers to transfer money online, pay bills, and conduct many transactions without having to visit a bank branch. Many online stores are also available to support e-commerce businesses around the world. Social media and other sites enable people to consume and share information online. However, this has made web applications more complex (OWASP, 2020), resulting in many security issues, including cross-site scripting (XSS). Security is addressed as an afterthought in most web applications that are currently in use. Despite efforts to integrate an application's security requirements throughout the software development lifecycle, many XSS vulnerabilities are still found in web applications. It is reported that 80% of all websites are vulnerable to XSS (Javed & Schwenk, 2014).
XSS vulnerabilities are among the top security vulnerabilities in desktop web applications (OWASP, 2020) as well as mobile web applications (Kaur et al.), and they are exploited through XSS attacks on web applications that are available online (OWASP, 2020). Hackers can inject malicious code wherever these applications accept user input, such as the username and password fields used to log in to an application. If the website does not verify the user input, or verifies it incorrectly, the hacker gains the opportunity to exploit the vulnerability and conduct malicious activities.
XSS vulnerabilities are categorized into three types: reflected, stored, and DOM-based XSS (OWASP, 2020). Reflected and stored XSS attacks occur on the server side, while DOM-based attacks occur on the client side of an application. Successful XSS attacks allow attackers to carry out malicious activities such as stealing cookies, transferring private information, hijacking a user's account, manipulating web content, or causing denial of service.
The existing XSS detection and removal techniques can be categorized into static analysis, dynamic analysis, secure programming, modeling, and hybrid analysis. Each category is discussed in the paper together with its limitations.
The rest of the paper is organised as follows. Section II presents the background and the different types of XSS. Section III details the different detection and removal techniques for XSS vulnerabilities and their limitations. Finally, Section IV concludes the paper and provides recommendations for future research.

Cross-Site Scripting Vulnerabilities
XSS vulnerabilities were first discovered in the 1990s following the emergence of the World Wide Web. They are among the most common security vulnerabilities affecting web applications (Hydara, Sultan, Zulzalil, & Admodisastro, 2015). They are classified as input validation problems that make it possible to inject malicious code into trusted web applications. This is due to the failure, during software development, to validate the user inputs that are used in the output (Acunetix, 2020; CWE, 2020; OWASP, 2020; Hussain et al., 2017).
Web applications often fail to verify user inputs, or verify them incorrectly, which gives the hacker the opportunity to exploit the vulnerability (CWE, 2020). Hence, whenever a user visits the affected web application, the browser can unknowingly execute the malicious code injected by the hacker. Once the browser executes the malicious code, the user's session can be hijacked and malicious activities can be conducted by the hacker. Figure 1 shows an overview of an XSS attack.
Successful XSS attacks can cause many problems for both the web application and its users. If input is not properly validated, an attacker can easily inject malicious scripts into an application's input fields and impersonate victims to carry out malicious activities. Such activities include stealing cookies, manipulating web content, causing denial of service, and transferring private information.

Types of Cross-Site Scripting Vulnerabilities
A reflected XSS (non-persistent or Type I XSS) attack occurs when user input is used immediately by the server side to generate an output page for the user (OWASP, 2020). If the input is not validated and is included in the generated page without any HTML encoding, client-side code contained in the input is injected into the dynamic page and executed by the browser. For example, an attacker can obtain full access to a page's content by convincing a potential victim to follow a malicious URL that injects code into the results page.
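The reflection described above can be illustrated with a minimal sketch. The two handler functions are hypothetical and stand in for any server-side code that echoes a request parameter into a page; the only library call used is Python's standard html.escape, which performs the HTML encoding the paragraph refers to.

```python
import html


def render_search_unsafe(query: str) -> str:
    # Vulnerable (hypothetical handler): the user input is echoed into the
    # generated page without any HTML encoding.
    return "<p>Results for: " + query + "</p>"


def render_search_safe(query: str) -> str:
    # html.escape replaces <, >, & and quotes with HTML entities, so the
    # browser displays the input as text instead of interpreting it.
    return "<p>Results for: " + html.escape(query, quote=True) + "</p>"


# A payload an attacker might place in a malicious URL parameter.
payload = "<script>alert(1)</script>"

unsafe_page = render_search_unsafe(payload)
safe_page = render_search_safe(payload)

print(unsafe_page)
print(safe_page)
```

In the unsafe page the script element survives intact, whereas in the safe page it appears only as encoded text.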
Stored XSS attacks are the most powerful type of XSS attack. They are also referred to as persistent or Type II XSS. This form of attack occurs when the user input is first stored on the server (in a database, file system, or other location) and is later displayed to the web application's users in a web page without any HTML encoding (OWASP, 2020).
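As an illustrative sketch of the stored case, the following example keeps comments in an in-memory SQLite database and renders them later. The table layout and function names are assumptions made for the example, not taken from any surveyed tool. Note that the parameterized query prevents SQL injection but stores the payload verbatim; whether the page is vulnerable is decided when the output is generated.

```python
import html
import sqlite3


def make_store() -> sqlite3.Connection:
    # Hypothetical comment store backed by an in-memory database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments (body TEXT NOT NULL)")
    return conn


def save_comment(conn: sqlite3.Connection, body: str) -> None:
    # The input is stored exactly as submitted.
    conn.execute("INSERT INTO comments (body) VALUES (?)", (body,))


def render_comments(conn: sqlite3.Connection, encode: bool = True) -> str:
    # Every later visitor receives the stored input; HTML encoding at
    # output time decides whether it is shown as text or run as markup.
    rows = conn.execute("SELECT body FROM comments ORDER BY rowid").fetchall()
    items = []
    for (body,) in rows:
        text = html.escape(body) if encode else body
        items.append("<li>" + text + "</li>")
    return "<ul>" + "".join(items) + "</ul>"


conn = make_store()
save_comment(conn, "Nice post!")
save_comment(conn, "<img src=x onerror=alert(1)>")

unsafe_page = render_comments(conn, encode=False)
safe_page = render_comments(conn, encode=True)
print(unsafe_page)
print(safe_page)
```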
DOM-based XSS (Type III XSS) is quite different from Types I and II. The malicious JavaScript payload does not need to be sent to the web server to enable exploitation (OWASP, 2020). A piece of JavaScript code in the page can read a URL request parameter and use it to write HTML into the page; if no HTML encoding is done, the newly written data is re-interpreted by the browser as HTML, which may introduce additional client-side script.

Related Work
Li and Xue (Li & Xue, 2014) conducted a survey of server-side techniques for securing web applications against attacks, including XSS. They discuss three common classes of security vulnerabilities (input validation, session management, and application logic vulnerabilities) and the types of attacks that exploit them, and they identify existing approaches to mitigate them. They also highlight emerging challenges imposed by new programming methodologies and technologies.
Garcia-Alfaro and Navarro-Arribas (Garcia-Alfaro & Navarro-Arribas, 2008) also surveyed the two most common XSS attacks, reflected and stored XSS, and how they are carried out. They then discuss the existing solutions to these attacks and their applicability.

Existing XSS Detection and Removal Techniques and Their Limitations
This section briefly discusses the existing techniques and approaches for the detection and removal of XSS vulnerabilities in desktop and mobile web applications. They can be categorized into static analysis, dynamic analysis, secure programming, modeling, and hybrid analysis. Table 1 summarizes the techniques and their limitations.

Static Analysis
Static analysis techniques carry out XSS vulnerability detection at the source code level of web applications (Gupta & Gupta, 2016; Kurniawan, Abbas, Trisetyarso, & Isa, 2018). They track data through an application and identify vulnerable parts of the source code, thereby detecting XSS vulnerabilities. The most common detection techniques under static analysis are static taint analysis, data flow analysis, string analysis, precise alias analysis, program slicing, and symbolic execution. The advantages of these techniques are that they can be carried out without running the code and that the detected vulnerabilities can be removed directly from the source code. The major limitation of these techniques is their high rate of false positives, which results from their conservative nature: when the analysis cannot prove that a data flow is safe, it reports the flow as vulnerable. Another limitation is that the application's source code is needed to conduct these security tests.
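A minimal sketch of static taint analysis, written with Python's standard ast module, may make both the idea and the false-positive limitation concrete. It is an assumption-laden toy, not any surveyed tool: get_param (taint source) and render (sink) are hypothetical names, only straight-line module-level code is handled, and any call named escape is treated as a sanitizer.

```python
import ast

SOURCES = {"get_param"}   # hypothetical: returns untrusted user input
SANITIZERS = {"escape"}   # e.g. html.escape
SINKS = {"render"}        # hypothetical: writes a string into the response


def call_name(node):
    """Return the simple name of a called function, or None."""
    if isinstance(node, ast.Call):
        func = node.func
        if isinstance(func, ast.Name):
            return func.id
        if isinstance(func, ast.Attribute):
            return func.attr
    return None


def is_tainted(expr, tainted):
    name = call_name(expr)
    if name in SANITIZERS:
        return False
    if name in SOURCES:
        return True
    if isinstance(expr, ast.Name):
        return expr.id in tainted
    # Conservative rule: any tainted sub-expression taints the whole expression.
    return any(is_tainted(child, tainted) for child in ast.iter_child_nodes(expr))


def find_xss_flows(source):
    """Return line numbers where tainted data reaches a sink."""
    tainted = set()
    findings = []
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.Assign):
            value_tainted = is_tainted(stmt.value, tainted)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if value_tainted:
                        tainted.add(target.id)
                    else:
                        tainted.discard(target.id)
        elif isinstance(stmt, ast.Expr) and call_name(stmt.value) in SINKS:
            if any(is_tainted(arg, tainted) for arg in stmt.value.args):
                findings.append(stmt.lineno)
    return findings


SAMPLE = '''\
name = get_param("name")
greeting = "Hello, " + name
render(greeting)
safe = html.escape(name)
render("Hello, " + safe)
checked = name if name.isalnum() else "guest"
render(checked)
'''

findings = find_xss_flows(SAMPLE)
print(findings)
```

The sample is analysed without being executed. Line 3 is a genuine flow from source to sink. Line 7 is also reported, although the value has been restricted to alphanumeric characters and cannot carry markup; the analysis cannot prove this, which is one way a conservative analysis produces false positives.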

Dynamic Analysis
Dynamic analysis techniques detect XSS vulnerabilities at runtime in web applications that are already deployed online (Gupta, Gupta, & Chaudhary, 2018; Kaur et al., 2018). They intercept and analyze input data coming into an application from users and determine whether it is harmful. Such techniques include penetration testing, web monitoring, filtering, dynamic analysis, taint tracking, and flow analysis. Their advantage is that they can be carried out without access to the application source code. Also, some malicious behaviours in an application can only be detected while the application is running. One limitation of these techniques is that attackers can use obfuscation to hide their attack patterns and carry out successful XSS attacks. Another is that these techniques require more computing resources to simulate the running environment of web applications.
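The obfuscation limitation can be sketched with a simple runtime input filter. Both filters below are hypothetical illustrations, and the single signature (an opening script tag) is an assumption chosen for brevity. The encoded payloads matter only when the application decodes its input again after the filter has run, which is assumed here.

```python
import html
import re
from urllib.parse import unquote

# Assumed signature: an opening <script tag, case-insensitive.
SCRIPT_PATTERN = re.compile(r"<\s*script", re.IGNORECASE)


def naive_filter(raw: str) -> bool:
    """Return True if the raw input matches the signature."""
    return bool(SCRIPT_PATTERN.search(raw))


def normalizing_filter(raw: str, max_rounds: int = 3) -> bool:
    """Undo URL and HTML-entity encoding repeatedly before matching."""
    value = raw
    for _ in range(max_rounds):
        decoded = html.unescape(unquote(value))
        if decoded == value:
            break
        value = decoded
    return bool(SCRIPT_PATTERN.search(value))


plain = "<script>alert(1)</script>"
encoded = "%3Cscript%3Ealert(1)%3C/script%3E"
double_encoded = "%253Cscript%253Ealert(1)%253C/script%253E"
event_handler = "<img src=x onerror=alert(1)>"

for sample in (plain, encoded, double_encoded, event_handler):
    print(sample, naive_filter(sample), normalizing_filter(sample))
```

The naive filter catches only the plain payload. Normalizing the input first also catches the encoded variants, at the cost of extra processing per request, yet the event-handler payload still passes both filters because it does not match the signature at all.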

Secure Programming
Secure programming techniques detect insecure coding in the programming environment and ensure that programming guidelines and rules are followed during the development of web applications. Their advantage is that they help developers adhere to secure coding practices while developing web applications. The limitation of these techniques is that many developers are reluctant to use them because they feel the techniques slow down their work.
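One way a secure coding rule can be built into the programming environment is an "encode by default" helper, sketched below. SafeHtml and build are hypothetical names introduced for this example; the sketch assumes templates are developer-written literals and that values are placed only in element content or quoted attribute values.

```python
import html


class SafeHtml(str):
    """Marker type (hypothetical) for markup that is already safe to emit."""


def build(template: str, **values) -> SafeHtml:
    # Every value is HTML-encoded unless it was produced by build() itself,
    # so a developer cannot forget to encode untrusted input.
    encoded = {}
    for key, value in values.items():
        if isinstance(value, SafeHtml):
            encoded[key] = value
        else:
            encoded[key] = html.escape(str(value), quote=True)
    return SafeHtml(template.format(**encoded))


user_input = '"><script>alert(1)</script>'
link = build('<a title="{title}">{label}</a>', title=user_input, label="profile")
page = build("<div>{body}</div>", body=link)
print(page)
```

The untrusted value is encoded once when the link is built, and the already-safe link is not encoded again when it is nested in the page. Ordinary string operations on a SafeHtml value return a plain str, so anything assembled outside build() is encoded again, which errs on the side of safety.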

Modeling
Modeling techniques (Elhakeem & Barry, 2013; Gol & Shah, 2015) are not as commonly used in XSS detection as the previous two categories, but they are still important solutions. Existing modeling techniques for XSS detection include model checking, finite state machines, data mining, and threading. Most models are designed to provide guidelines to developers and security testers during coding and testing. They have the advantage of helping analysts and developers to mitigate vulnerabilities in all stages of application development. The major limitation is that in most cases the guidelines are not read and followed during coding because of the time constraints of development projects.
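As a small illustration of the finite state machine idea, the sketch below models the HTML context in which user input lands and checks whether the input forces the machine out of that context. The three states and four transitions are a deliberately simplified assumption (no single-quoted attributes, comments, or script blocks) and do not reproduce any of the cited models.

```python
from enum import Enum


class State(Enum):
    TEXT = "text"
    TAG = "tag"
    ATTR_VALUE = "attribute value"


# Simplified transition table (assumption): only <, > and double quotes
# change the context.
TRANSITIONS = {
    (State.TEXT, "<"): State.TAG,
    (State.TAG, ">"): State.TEXT,
    (State.TAG, '"'): State.ATTR_VALUE,
    (State.ATTR_VALUE, '"'): State.TAG,
}


def run(markup: str, start: State = State.TEXT) -> State:
    """Consume markup and return the resulting state."""
    state = start
    for ch in markup:
        state = TRANSITIONS.get((state, ch), state)
    return state


def leaves_context(prefix: str, user_input: str) -> bool:
    """True if user_input drives the machine out of the context that
    the page prefix established for it."""
    context = run(prefix)
    state = context
    for ch in user_input:
        state = TRANSITIONS.get((state, ch), state)
        if state is not context:
            return True
    return False


attr_prefix = '<input value="'
text_prefix = "<p>Hello, "

print(run(attr_prefix))
print(leaves_context(attr_prefix, "alice"))
print(leaves_context(attr_prefix, '" onfocus="alert(1)'))
print(leaves_context(text_prefix, "<script>alert(1)</script>"))
```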

Hybrid Analysis
Because of the limitations of the previous techniques and approaches, most research studies now combine different XSS detection techniques, an approach known as hybrid analysis (Wang, Zhu, Tan, & Zhou, 2017). It is used to reduce the limitations of relying on a single technique or approach. Combining static and dynamic analyses as well as modeling and secure programming has the advantage of providing more coverage in terms of designing, coding, security testing of the code, and running the application, as well as guiding the developers. Hybrid analysis techniques still have their limitations; however, they are an improvement on previous techniques.
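A minimal sketch of how a static stage and a dynamic stage might be combined is shown below. The handlers, the probe string, and the static rule ("never calls a function named escape") are all assumptions made for illustration. The static stage over-approximates by flagging every handler without an escape call, and the dynamic stage then runs only the flagged handlers with a probe to confirm which ones actually reflect raw input.

```python
import ast
import html

# Hypothetical handlers, kept as source text so they can be analysed
# statically first and executed afterwards.
HANDLERS = {
    "greet": '''
def greet(name):
    return "<p>Hello, " + name + "</p>"
''',
    "search": '''
def search(q):
    return "<p>Results for " + html.escape(q) + "</p>"
''',
    "length": '''
def length(text):
    return "<p>" + str(len(text)) + " characters</p>"
''',
}

PROBE = "<xss-probe-7f3a>"  # arbitrary marker unlikely to occur naturally


def called_names(source):
    """Collect the simple names of all functions called in the source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Attribute):
                names.add(func.attr)
            elif isinstance(func, ast.Name):
                names.add(func.id)
    return names


def static_candidates(handlers):
    """Static stage: flag handlers that never call an escape function."""
    return [name for name, src in handlers.items()
            if "escape" not in called_names(src)]


def dynamic_confirm(handlers, names):
    """Dynamic stage: run flagged handlers and look for the raw probe."""
    confirmed = []
    for name in names:
        namespace = {"html": html}
        exec(handlers[name], namespace)
        if PROBE in namespace[name](PROBE):
            confirmed.append(name)
    return confirmed


candidates = static_candidates(HANDLERS)
confirmed = dynamic_confirm(HANDLERS, candidates)
print(candidates)
print(confirmed)
```

Here the static stage flags two handlers, and the dynamic stage dismisses the one that never emits its input, which illustrates how a combination can reduce the false positives of a static stage while limiting how much must be executed.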

Conclusion
In this short study, we have investigated the various techniques that have been proposed by previous researchers to solve XSS security problems. We have discussed the background of XSS attacks and the different types of XSS attacks. We then identified the limitations of the existing techniques as well as the needed improvements. Based on the limitations identified, new areas of research should be explored continuously with the expectation of discovering more secure ways of developing software and preventing more XSS attacks. Researchers can work on reducing the high rates of false positives and false negatives that limit static analysis and dynamic analysis, respectively. More focus on secure programming and modeling is needed to educate, train, and guide developers on the importance of vulnerability mitigation. Hybrid analysis that brings together the best features of the combined techniques is also a good research direction.

Acknowledgment
We acknowledge that this research received support from the Fundamental Research Grant Scheme FRGS/1/2015/ICT01/UPM/02/12 awarded by the Malaysian Ministry of Higher Education to the Faculty of Computer Science and Information Technology at Universiti Putra Malaysia.