Sentiment Analysis with Deep Learning: A Bibliometric Review

Sentiment analysis is an active area of research in the field of natural language processing. Prior research indicates that numerous techniques, including machine learning approaches, have been used to perform sentiment classification tasks. Deep learning is a specific type of machine learning that has been successfully applied in various fields such as computer vision and in various NLP tasks, including sentiment analysis. This paper provides a bibliometric analysis of the academic literature on sentiment analysis with deep learning methods, retrieved from Scopus up to the third quarter of 2020. We focus on the research productivity in this field, the distribution of subject categories, the sources and types of the publications, their geographic distribution, the most prolific and impactful authors and institutions, the most cited papers and the trends in keywords. This study can help researchers and practitioners keep abreast of global research trends in sentiment analysis using deep learning approaches.


Introduction
Sentiment analysis (SA) is a field of study in Natural Language Processing (NLP). It is defined as the task of classifying people's sentiments or opinions towards entities ranging from products, services and organizations to events and current issues (Liu, 2015). A sentiment or opinion comprises an entity, the aspects of that entity, and the sentiment towards each aspect, which represents its polarity. With the advent of Web 2.0 technology, an increasing number of people express their opinions on social media such as Facebook, Twitter, blogs and forums. This has resulted in a huge amount of unstructured data that needs to be analyzed so that people's sentiments can be identified (Pang & Lee, 2008; Singh et al., 2016). It is no longer practical to manually find or monitor the sentiments in this huge volume of text, hence the need for automated SA systems. As SA is an active research area in NLP, many techniques have emerged for a variety of SA tasks. These approaches can be categorized as lexicon-based or machine-learning-based techniques (Medhat, Hassan & Korashy, 2014). Lexicon-based approaches do not use machine learning methods or training data; instead, they apply techniques that are either dictionary-based, such as SentiWordNet (Han et al., 2018), or corpus-based, employing statistical analysis of document contents with methods such as the Hidden Markov Model (Soni & Sharaff, 2015). Machine learning approaches, on the other hand, are based on supervised algorithms that are trained on labelled data to classify texts into their corresponding sentiments. These include traditional machine learning methods such as Support Vector Machines (Alves et al., 2014), Maximum Entropy (Wu, Li & Xie, 2017) and Naïve Bayes (Parveen & Pandey, 2017).
Deep learning, first proposed by G.E. Hinton in 2006, is a machine learning approach also referred to as the Deep Neural Network (Hinton, Osindero & Teh, 2006). It is the application of artificial neural networks (ANN) to learning tasks using multiple layers. The basic structure of an ANN consists of three layers: the input layer, the hidden layer and the output layer. The term "deep" refers to the use of multiple hidden layers. According to Andrew Ng (2015), a leading AI scientist, the three driving forces behind the success of deep learning are the availability of huge amounts of data in the big data era, breakthroughs in algorithms (such as backpropagation and activation functions) and the increased availability of fast computational hardware such as GPUs. The advantage of deep learning over traditional machine learning is that it not only produces better results, for example in classification problems, but also enables feature learning (Bengio, Courville & Vincent, 2013), in which feature selection is performed automatically by the network. Deep learning has been successfully applied in many areas such as computer vision, speech recognition and NLP tasks such as machine translation, question answering and SA. Advances in neural algorithms have also led to variations of the ANN (Goller & Kuchler, 1996), Attention (Bahdanau, Cho & Bengio, 2014), the Transformer (Vaswani, 2017) and the more recent Transformer-based architecture known as Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2019).
Researchers in SA have likewise used these different types of deep learning architectures in their quest to improve the performance of sentiment classification tasks. To provide a clear perspective on the studies that have been done, this paper presents a bibliometric study of this important field. To our knowledge, no prior bibliometric study of this field has been done. Bibliometric analysis is defined as the use of statistical methods to evaluate scholarly publications within a certain field from an objective and quantitative perspective (Radev et al., 2016). In this paper, we employ bibliometric methods to gain insights into the developments in this field, including the research productivity, the main contributors to the research, the influential articles and the important issues of concern to the research community. The rest of this paper is organized as follows. Section 2 describes the method used for this study. Section 3 presents the findings. Section 4 concludes the paper.

Methods
In this study, we used the Scopus database as our source of data. Scopus is one of the largest abstract and citation databases of peer-reviewed literature, with more than 75 million records and 24,600 titles from 5,000 publishers ("About Scopus", 2020). To perform the document search, a list of keywords related to "deep learning" and "sentiment analysis" was determined. For example, in addition to the term "deep learning", we identified terms such as RNN, LSTM, Attention, recursive neural network and BERT, which are specific deep learning approaches. For sentiment analysis, we used similar terms such as "opinion mining" and "sentiment classification". Consequently, the following query was used to search for the publications in this study:
TITLE ( ( "deep learning" OR "deep neural" OR "recurrent" OR "recursive" OR "RNN" OR "long short term" OR "LSTM" OR "convolution" OR "CNN" OR "BERT" OR "transformer" OR "attention" ) AND ( "sentiment analysis" OR "sentiment classification" OR "opinion mining" ) )
The search was limited to document titles so as to retrieve the most relevant articles, namely those whose titles reflect the topic. For the date range, we used all years up to the present, which is 2020. This query retrieved 681 articles. The result set was then exported as a comma-separated values file, and Microsoft Excel and VOSviewer ("VOSviewer", 2020) were used to analyze the data in this file. In particular, a bibliometric analysis was conducted to reveal patterns in studies of SA with deep learning from the following aspects. First, performance analysis was carried out to identify the research productivity in this field, the sources and types of the retrieved documents, the languages of the documents, the distribution of the publications by country, the subject areas of the documents, the most active source titles, and the most active institutions and authors. Second, citation analysis was performed to identify the most impactful institutions and authors as well as the ten most impactful articles. Finally, a frequency analysis was performed to identify the most frequently used keywords, which were extracted from the title and abstract sections of the retrieved articles.
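As an illustration only, the title query described above can be assembled programmatically from its two keyword groups. The following Python sketch is not part of the original study; the helper names `or_group` and `build_title_query` are hypothetical, and the sketch only builds the query string without contacting Scopus.

```python
# Hypothetical helpers that assemble a TITLE ( ... ) query string from two
# keyword groups. Nothing here calls the Scopus service.
DEEP_LEARNING_TERMS = [
    "deep learning", "deep neural", "recurrent", "recursive", "RNN",
    "long short term", "LSTM", "convolution", "CNN", "BERT",
    "transformer", "attention",
]
SENTIMENT_TERMS = ["sentiment analysis", "sentiment classification", "opinion mining"]


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "( " + " OR ".join(f'"{t}"' for t in terms) + " )"


def build_title_query(group_a, group_b):
    """Require at least one term from each group to appear in the title."""
    return f"TITLE ( {or_group(group_a)} AND {or_group(group_b)} )"


query = build_title_query(DEEP_LEARNING_TERMS, SENTIMENT_TERMS)
print(query)
```

Keeping the two groups as separate lists makes it easier to see that a title must match at least one deep learning term and at least one sentiment term.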

Publication by Year
The research productivity in this area can be measured by the number of documents produced per year. The distribution of the 681 documents by year of publication is shown in Figure 1 and Table 1.
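The yearly tally described above can be sketched in a few lines of Python. This is an illustrative sketch, not the procedure used in the study (which relied on Microsoft Excel and VOSviewer). The sample rows and the column names `Title`, `Year` and `Document Type` are assumptions standing in for an exported CSV file, and they are not data from this study.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for an exported CSV file; rows and column names are
# assumptions for illustration, not data from this study.
SAMPLE_EXPORT = """Title,Year,Document Type
Paper A,2018,Conference Paper
Paper B,2019,Article
Paper C,2019,Conference Paper
Paper D,2020,Article
"""


def documents_per_year(csv_text):
    """Return {year: number of documents}, ordered by year."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(int(row["Year"]) for row in reader)
    return dict(sorted(counts.items()))


per_year = documents_per_year(SAMPLE_EXPORT)
for year, n in per_year.items():
    print(year, n)
```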

Document Types and Sources
All retrieved documents were also analyzed according to their types and sources. In terms of document type, more than half of the documents (429 or 63%) are Conference Papers, as shown in Table 2. These are followed by Articles (226 or 33.19%), Book Chapters (13 or 1.91%) and Reviews (8 or 1.17%). The remaining documents are classified as Erratum (2 or 0.29%), Retracted (1 or 0.15%) and Undefined (2 or 0.29%). In terms of source, there are five source types: Conference Proceeding, Journal, Book Series, Book and Trade Journal. Table 3 summarizes the distribution of the retrieved documents across these five categories. A large portion of the documents come from Conference Proceedings (318 or 47%), followed by Journals (240 or 35%) and Book Series (121 or 18%). In addition, there is one document (0.15%) from a Book and one (0.15%) from a Trade Journal.
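The percentages quoted for the document types follow directly from the counts and the total of 681 documents. The short Python sketch below reproduces that arithmetic from the counts reported above. The function name `shares` is hypothetical, and the sketch is only a check on the calculation, not the tool used in the study.

```python
# Document-type counts as reported in the text above.
DOCUMENT_TYPES = {
    "Conference Paper": 429,
    "Article": 226,
    "Book Chapter": 13,
    "Review": 8,
    "Erratum": 2,
    "Retracted": 1,
    "Undefined": 2,
}


def shares(counts, digits=2):
    """Return each category's percentage of the total, rounded to `digits`."""
    total = sum(counts.values())
    return {name: round(100 * n / total, digits) for name, n in counts.items()}


total_documents = sum(DOCUMENT_TYPES.values())
type_shares = shares(DOCUMENT_TYPES)
for name, pct in type_shares.items():
    print(f"{name}: {pct}%")
```

The counts sum to 681, which confirms that every retrieved document is assigned exactly one document type.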

Languages of Documents
Another bibliometric attribute considered in this study is the language of the documents. Table 4 shows the distribution of the documents by language. As can be seen from the table, English is the dominant language, used by most of the documents (655 or 96%). Chinese is the second most used language with a total of 22 documents (3%). This is followed by Spanish (2 or 0.3%). There is one document (0.15%) in each of French and Turkish.

Geographical Distribution of Publications
The next attribute of interest is the countries that are most prolific in publishing documents in this field. A total of 65 countries contributed to the retrieved documents. Figure 2 lists these countries with the number of documents each published. China is the dominant country in this field with more than 300 publications, followed by India with 111 publications. The United States is third with 44 publications, ahead of Japan with 27 publications. Notably, Indonesia is also active in this area, with 17 publications, the same number as the United Kingdom.

Subject Areas
The next bibliometric attribute analyzed is the subject area of the documents. Table 5 shows the distribution of the documents by subject area. Computer Science emerges as the main subject area (611 or 45%), as both deep learning (a subfield of Artificial Intelligence) and SA (a subfield of NLP) fall under Computer Science. This is followed by Engineering (206 or 15%), Mathematics (166 or 12%), Decision Sciences (90 or 7%) and Social Sciences (70 or 5%). Other subject areas, such as Materials Science, Business, Management & Accounting, and Arts & Humanities, each account for less than 5% of the published documents. Note that the number of documents (N) in the table is 1358, rather than 681, because some documents are included in more than one subject area.
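Because a document can belong to several subject areas, the shares above are computed over the number of subject-area assignments (N = 1358) rather than over the 681 documents. The following Python sketch illustrates this counting scheme. The records in `DOCUMENTS` are hypothetical and are not data from this study; only the final line uses figures reported above.

```python
from collections import Counter

# Hypothetical records: each document may carry several subject areas.
DOCUMENTS = [
    {"title": "Doc 1", "areas": ["Computer Science", "Engineering"]},
    {"title": "Doc 2", "areas": ["Computer Science"]},
    {"title": "Doc 3", "areas": ["Computer Science", "Mathematics", "Engineering"]},
]


def subject_area_shares(documents):
    """Count area assignments and express each as a share of all assignments."""
    counts = Counter(area for doc in documents for area in doc["areas"])
    # The denominator is the number of assignments, not len(documents).
    assignments = sum(counts.values())
    percentages = {a: round(100 * n / assignments) for a, n in counts.items()}
    return assignments, percentages


n_assignments, area_shares = subject_area_shares(DOCUMENTS)
print(n_assignments, area_shares)

# The same arithmetic applied to the reported Computer Science figures.
reported_cs_share = round(100 * 611 / 1358)
print(reported_cs_share)
```

In this toy example three documents yield six assignments, which mirrors how 681 documents can yield N = 1358 in Table 5.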

Source Titles
There were 160 source titles that published documents on "Sentiment Analysis" with "Deep Learning". Table 6 shows the top source titles, those with five or more publications on this topic. About 40% of the documents were published in these source titles. The most productive source title is Lecture Notes in Computer Science (LNCS), which published nearly 11% of all the documents. This is followed by IEEE Access and the ACM International Conference Proceeding Series.

Prolific and Impactful Organizations
Altogether, 1155 organizations were involved in producing the 681 documents retrieved in the area of SA based on deep learning. Of these, the top ten institutions and their countries of origin are shown in Table 7. As can be observed, Asian institutions dominate, especially those from China. The most prolific institution is the Chinese Academy of Sciences with 31 publications. This is followed by Beihang University (20), Beijing University of Posts and Telecommunications (16) and Tsinghua University (16). Only one of the top ten institutions is not from China: the Vellore Institute of Technology in India, with 15 publications.

Prolific and Impactful Authors
A total of 1591 authors contributed to the 681 documents retrieved in the area of SA based on deep learning within the stipulated period. The top ten most prolific authors are displayed in Table 9. From the perspective of citation count, the top ten most impactful authors are displayed in Table 10. The author with the most citations is Duyu Tang, affiliated with Harbin Institute of Technology, with 947 citations and an average of 237 citations per article. Notably, he is also in the list of the top ten most prolific authors, with four articles. He is followed by Yoshua Bengio and Xavier Glorot, both affiliated with Université de Montréal, and Antoine Bordes, affiliated with Université de Technologie de Compiègne, each with 880 citations. These three authors contributed one of the earliest articles to pioneer the use of deep learning in SA (Glorot, Bordes & Bengio, 2011). Although each of them has only one article, this single article has the highest impact and influence among the researchers in this field. The third most impactful authors, with 686 citations, are Ting Liu and Bing Qin, both affiliated with Harbin Institute of Technology.
Table 11 displays the ten most highly cited articles in SA based on deep learning among the 681 documents retrieved. The table shows both the number of citations and the citations per year for each document. As mentioned in the discussion of impactful authors, the article by Glorot, Bordes & Bengio (2011) is the most impactful article, with 880 citations. It is one of the earliest articles in this field and concerns the use of deep learning with domain adaptation for large-scale SA. This is followed by the article by Tang, Qin & Liu (2015), with 625 citations, which is about enhancing the RNN with gated units to improve sentiment classification. The third most cited article, by Wang et al. (2016), discusses integrating the Attention mechanism into LSTM for aspect-level sentiment classification. Of these ten articles, nine are about using and improving deep learning architectures for sentiment classification at various levels, such as the aspect, sentence or document level. Only one, by Zhang, Wang & Liu (2018), is a survey paper on research in SA based on deep learning. This paper is in fifth position with 200 citations. Overall, all of these articles are essential reading for those who want to pursue research in this field.
Table 12 lists the 20 most frequently used keywords, which provide insight into the issues discussed by the community working on deep learning in SA. Our data show that the most frequently used keyword is "Sentiment Analysis" (used in 525 articles), followed by "Deep Learning" (320), "Sentiment classification" (271), "Data Mining" (221) and "Long Short-term Memory (LSTM)" (209). Other important keywords include "Attention Mechanisms" (146), "Semantics" (139), "Convolution Neural Network (CNN)" (125), "Social Networking" (107), "Natural Language Processing" (106) and "Recurrent Neural Network (RNN)" (100). These counts show that the keyword "sentiment analysis" is used more often than the similar keywords "sentiment classification" and "opinion mining". In terms of deep learning architectures, LSTM is probably the most popular model, followed by the Attention mechanism, CNN and RNN. In addition, the keyword "social networking" reflects the role of social media as an important data source for SA studies.
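The keyword counts above report the number of articles in which each keyword appears. A minimal Python sketch of such a frequency analysis is given below. It is an illustration under stated assumptions rather than the procedure used in the study: the entries in `KEYWORD_FIELDS` are hypothetical, the ";" separator is an assumed export convention, and the case folding is a choice made here for the example.

```python
from collections import Counter

# Hypothetical keyword fields, one string per article, with ";" assumed
# as the separator. These are not data from this study.
KEYWORD_FIELDS = [
    "Sentiment Analysis; Deep Learning; LSTM",
    "sentiment analysis; Attention Mechanisms",
    "Deep Learning; Sentiment Analysis; sentiment analysis",
]


def keyword_article_counts(fields):
    """Count, for each keyword, the number of articles that use it."""
    counts = Counter()
    for field in fields:
        # Normalize case and whitespace; using a set avoids counting a
        # keyword twice when it is repeated within one article.
        keywords = {k.strip().lower() for k in field.split(";") if k.strip()}
        counts.update(keywords)
    return counts


keyword_counts = keyword_article_counts(KEYWORD_FIELDS)
top_keywords = keyword_counts.most_common(2)
print(top_keywords)
```

Counting each keyword at most once per article keeps the totals comparable to statements of the form "used in 525 articles".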

Conclusion
In this paper, we explored the trend of global research in the area of SA with deep learning approaches by performing a bibliometric analysis of the 681 publications obtained from the Scopus database, published up to around the third quarter of 2020. The results show that publications in this area started in 2011 and began to rise steadily, with an average annual growth rate of 12%, from 2013 until 2020. Nearly half of the documents are sourced from conference proceedings. Even though China is the leading country in producing these articles, almost all (96%) of the documents are in English. The findings also indicate that the publications are distributed across many subject areas, mainly Computer Science, Engineering, Mathematics, Decision Sciences and Social Sciences. Nine of the top ten most productive institutions are from China, but the most impactful ones also come from Canada, France, the USA and Singapore, in addition to China. The most highly cited articles show that the most popular type of research focuses on improving the performance of SA at different levels using various deep learning architectures such as LSTM, the Attention mechanism, RNN and CNN. The keyword analysis suggests that LSTM and the Attention mechanism are gaining attention from researchers and that social media is an important data source for performing SA. Overall, we believe that the findings of this study can help researchers gain insights into the research trends, the distributions, the main contributors to this research field and the issues that have been discussed by its research community.