LeDoCl: A Semantic Model for Legal Documents Classification using Ensemble Methods
Abstract
Natural Language Processing (NLP) is one of the components of machine learning. Topic modeling is a subcomponent of information retrieval, which is itself a broad research domain within NLP. The problem has been widely studied from the perspective of clustering algorithms such as K-means and K-fold, which tend to converge to one of several local solutions depending on the choice of initialization method. To overcome the instabilities and assumptions of existing approaches such as the Vector Space Model (VSM) and SVD, semantic-based topic modeling (SLDA) and an ensemble model with generation and integration stages are proposed. In topic modeling, instability is visible in two distinct aspects. First, when the topic descriptors are examined over multiple runs, the term rankings change considerably, and some terms may appear or disappear entirely. Second, the extent to which topics are associated with each document can vary across executions. In the proposed system, the ensemble learner combines a Kernel Support Vector Machine (KSVM) and a Random Forest to overcome this instability. The first issue, the appearance and disappearance of terms between runs, is addressed by Gibbs Sampling based Semantic LDA (GSLDA). The second issue, the alignment of topics with documents, is addressed by the ensemble SLDA (ESLDA). ESLDA shows higher retrieval accuracy and shorter running time than conventional models: accuracy reaches up to 98% with ESLDA, compared with 82% for SLDA and 78% for term frequency methods.
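The first kind of instability described in the abstract, terms appearing in or disappearing from a topic's descriptor between runs, can be made concrete with a small sketch. The snippet below is an illustration only, not the measure used in the article: it assumes the topics from different runs have already been matched to each other, and it scores one topic by the mean pairwise Jaccard similarity of its top terms. The function names and the example terms are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of terms."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def descriptor_stability(runs):
    """Mean pairwise Jaccard similarity of one topic's top terms across runs.

    `runs` holds one top-term list per run for the same (already aligned)
    topic. A value of 1.0 means every run produced the same set of terms.
    """
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical top-4 descriptors of one topic from three runs (made-up terms).
runs = [
    ["court", "appeal", "judge", "order"],
    ["court", "appeal", "judge", "tax"],
    ["court", "appeal", "order", "land"],
]
score = descriptor_stability(runs)
print(f"descriptor stability: {score:.3f}")
```

Because it compares sets, this score captures terms entering or leaving the descriptor but ignores changes in their ranking; a rank-weighted similarity would be needed to reflect reordering as well.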
Article Details
You are free to:
- Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
- The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices:
You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.