Enabling Intelligence through Deep Learning using IoT in a Classroom Environment based on a multimodal approach

Smart classrooms are becoming increasingly popular. Recent technologies such as the Internet of Things (IoT) are now being deployed across a diverse set of fields, and every educational institution has set some benchmark for adopting them in daily practice. However, due to various constraints and setbacks, the adoption of IoT in the educational sector is still at a premature stage. The success of any technological evolution depends on its full-fledged implementation for the benefit of society at large. In recent years, Deep Learning has achieved a breakthrough by outperforming traditional machine learning models on many tasks, especially Computer Vision and Natural Language Processing problems. The fusion of Computer Vision and Natural Language Processing has emerged as a new field in recent years. Combining such fused models with IoT platforms is a challenging task that has not yet attracted the attention of many researchers. Researchers in the past have shown interest in designing an intelligent classroom in different contexts. To fill this gap, we propose a conceptual model in which Deep Learning architectures fused into IoT systems result in an Intelligent Classroom. We also discuss the major challenges, limitations, and opportunities that arise with Deep Learning-based IoT solutions. In this paper, we summarize the available applications of these technologies that suit our solution. This paper can thus be taken as a starting point for our research, offering an overview of the available papers that support our proposed approach.

IoT is a technology that integrates various heterogeneous platforms such as Artificial Intelligence, Big Data, Machine Learning, Cloud Computing, and Embedded Devices. For easy implementation and configuration of an IoT system, one option is to use embedded devices such as ARM microcontrollers, Arduino boards, and Raspberry Pi boards. In our approach, the Raspberry Pi board is taken into consideration. Raspberry Pi is a credit-card-sized single-board computer available at an affordable cost [11]. It can multitask and perform many of the functions of a desktop computer. Raspberry Pi was created by a charity registered in the UK in 2008, and the board was launched in 2012. Since then, several Raspberry Pi models have been developed, such as Pi 1, Pi 2, Pi 3, and Pi 4, the most recent. This platform represents a typical solution for exploring uses of IoT technology in the educational sector. In future work, intelligence inside the classroom will be experimented with using an IoT device such as Raspberry Pi, or other resource-constrained devices such as mobile devices, with on-device inference engines.
The use of Raspberry Pi in education started in 2012, when it was released to nurture computing education in the UK: to help students learn to code and to build DIY (Do-It-Yourself) hardware and automation projects that develop their creative skills. The boards have GPIO (General Purpose Input Output) pins for physical computing in IoT systems. Raspberry Pi operates in an open-source environment under its primary operating system, Raspbian OS, and is used to encourage people to improve their programming skills. The boards are produced through three licensed manufacturers: Newark element14 (Premier Farnell), RS Components, and Egoman. This is how Raspberry Pi kits became available online.
Figure 1: Raspberry Pi and its components [12].

Research on Raspberry Pi based Deep Learning systems with Vision & Language understanding in a classroom setup is still at a developing stage. Current research papers focus primarily on how Raspberry Pi can be used for image processing based systems, while the migration of deep learning to IoT devices is at a budding stage. In this research, an attempt has been made to propose a conceptual model or architecture that offers new insights into education and promotes intelligence in the classroom using Deep Learning on IoT systems. First, existing systems that use IoT and Raspberry Pi inside the classroom in combination with image processing technologies are explored. Then, recent trends that combine Raspberry Pi with advanced technologies such as Deep Learning and the fusion of data from different modalities are examined in the context of classroom intelligence. Thus, the use of Raspberry Pi technology to promote intelligence in classrooms is emphasized and highlighted.

Smart Classrooms using Raspberry Pi
The use of Raspberry Pi kits inside the classroom is still at the testing stage, as discussed in my previous article [15]. The reported applications can be summarized as follows: smart conference rooms using Bluetooth and Kinetics [13]; biosensors with a database configured on Raspberry Pi, along with Moodle over Wi-Fi, to record attendance [14]; smart projectors that upload and download PPTs without a laptop in the class and store them on separate servers accessed with unique credentials [13]; navigation through PPTs using gestures [16]; Google voice assistant with Raspberry Pi to develop an effective classroom [17]; lecture projection systems with the NETIO application, controlled via wired or wireless connections [18] or mobile applications [19]; control and sharing of displays with remote desktops [20]; handling of Android phones on an intranet infrastructure [21]; use of Virtual Network Computing protocols to provide access to projections [22]; real-time attendance tracking and recording systems with RFID, a Knowledge Base System (KBS), and the Twilio cloud platform [23]; smart dustbins for college classrooms [24]; ELAB, an IoT platform, with Xively to promote a smart learning environment [25]; the "Magic Mirror" paradigm, an interactive display that gives updates or information through voice commands [26]; smart systems to secure hostels through RFID and MATLAB using face and thumb recognition [27]; online exam assessments via Wi-Fi and PHP [28]; voice-based controls for effective classroom or college management [29]; a security system with a PIR sensor and night vision modules for visual capture [30]; monitoring of classroom temperature, occupancy, etc. [31]; automation of blood bank services for a college through a direct link between donor and recipient via GSM, Raspberry Pi, and SMS modules in emergency situations [32]; digital notice boards with mobile alerts over Wi-Fi to send notices as text [33]; monitoring of the college environment in terms of temperature, air, and moisture levels with ThingSpeak over Zigbee communication, with results sent via an app [34]; garbage monitoring with Zigbee and GSM on a GUI platform [35]; management of electrical appliances with Flask and SQLite3 through user interfaces [36]; a lecture delivery system based on Raspberry Pi that records class lectures in audio/video formats, sends them via FTP and Python, and telecasts them using a VNC application [37]; a Learning Management System with Raspberry Pi [38]; smart blackboard/whiteboard cleaning systems that use Raspberry Pi to find the eraser position [39]; classroom timetable management via RFID, GSM, and Raspberry Pi [40]; automatic bells based on real-time clocks to signal college sessions [41]; robots that explain lab experiments inside classrooms or labs in response to voice commands [42]; and intelligent vehicle safety surveillance with RFID for precautions [43].
Several works combine Raspberry Pi with image processing. Affective computing has been used to study student activities and emotions along with face recognition systems using OpenCV [44] [45]. Ajantha et al. use the camera as an assistive device [46]. Aryuanto et al. [47] explored detecting a laser spot with the OpenCV library through a webcam on Raspberry Pi. Geraldine et al. [48] proposed finger and gesture recognition systems in real time. P. V. Vinod Kumar et al. [49] proposed QR codes integrated with Pi using image processing techniques, a UVC camera driver, and the OpenCV library on a TFT LCD display. Dhanujalakshmi et al. [50] presented fire pattern recognition with heat signatures, in which an alert message is sent through MMS. Aditi S. et al. [51] presented a Pi book reader with image-to-text and text-to-speech conversion using OpenCV. A face recognition system using OpenCV has also been used for attendance marking with Java and MySQL. LZubasi Kakawete et al. [52] proposed facial authentication using an object detection method, OpenCV, and QR codes.

A fusion of Computer Vision and Natural Language Processing:
For the fusion of computer vision and natural language processing, we have explored use cases that are suitable for a classroom environment. Owing to recent advancements in multimedia processing, it is possible to develop a methodology that combines both tasks; the data involved are termed "multimodal data sources". The theory of 'semiotics' studies the relationship between signs and their meaning, interpreted in terms of words, phrases, or sentences based on semantic representations.
This applies to complex tasks such as visual description, visual property description, and visual content retrieval. The theory is further supported by distributional semantics, where computer vision tasks are combined with natural language processing via image embeddings or word embeddings. The workflow is to map visual data to words using these embedding representations.
In general, multimodal data such as video, audio, and text can be handled with some form of cognitive perception. At the intersection of Vision & Language (V&L) understanding, reasoning over visual data can be handled with deep learning models, since they outperform traditional machine learning algorithms on these kinds of mixed data sources.
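The mapping from visual data to words through embeddings can be illustrated with a minimal sketch. It assumes that an image encoder and a word-embedding model already project into a shared vector space; the three-dimensional vectors, the vocabulary, and the function names below are hypothetical toy values chosen for illustration, not outputs of any real model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_words(image_embedding, word_embeddings, top_k=2):
    """Return the top_k words whose embeddings are closest to the image embedding."""
    scored = [
        (cosine_similarity(image_embedding, vector), word)
        for word, vector in word_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:top_k]]


# Hypothetical word embeddings in a shared image/text space (toy values).
WORD_EMBEDDINGS = {
    "blackboard": [0.9, 0.1, 0.0],
    "teacher": [0.1, 0.9, 0.1],
    "desk": [0.7, 0.2, 0.3],
    "window": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of one classroom camera frame (toy values).
image_embedding = [0.8, 0.15, 0.1]

labels = nearest_words(image_embedding, WORD_EMBEDDINGS)
print(labels)
```

In a real system the vectors would have hundreds of dimensions and come from trained encoders; the nearest-neighbour lookup itself stays the same.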

Possible Applications in a Classroom Scenario:
To make these solutions available for the educational environment, and especially for the classroom, we searched for and identified a few related use cases. They can be realized by experimenting with some of the scenarios in a classroom setup and evaluating the results. In future work, we plan to experiment with one or more of the scenarios listed below in a classroom environment.
The following use cases have been proposed for a classroom scenario:
1. A scenario such as lip reading, classroom description, or video captioning, where the visual content has to be perceived and simultaneously described in the form of text.
2. A scenario where sign language can help hearing-impaired students to understand lectures that are streamed or viewed as videos, with some form of description from speech-to-text conversion.
3. A scenario where a description of the surroundings can help visually challenged people in the classroom to understand the environment and adapt to it.
4. A scenario where the visual feeds in the classroom can be converted from speech to a similar visual context to help visually or orally challenged people.
5. A scenario where children in schools can be helped to understand the environment through stories or descriptions created from the captured images.

Existing Systems - Intelligent Classroom
Related Works
In existing systems, classrooms equipped with systems or services, especially cameras, and works in which camera feeds drive smart computer vision or image processing services, have been considered "intelligent" in different contexts, forming an "intelligent classroom". The following definitions of an intelligent classroom infrastructure have been given by different authors. In [53], a camera is used for AmI (Ambient Intelligence) along with RFID to analyze user behavior and help users adapt to their environment and needs. The author of [54] describes how an intelligent classroom setup is built by promoting Mixed Reality to enhance the classroom experience. Ping Zang [55] proposed an intelligent campus (ICIoT) that can manage the campus at run time along with WSN channels, with 16.7% efficiency in energy management using camera feeds. The author of [56] describes a classroom setup where the camera is used for behavioral analysis for user satisfaction. In [57], cameras are fixed in the classroom and machine learning algorithms detect students' activities for attendance as well as behavior analysis. In [58], an efficient power management model with automation is tested experimentally with INTEL GALILEO and Z-HOME. For the same scenario, an automated smart classroom was proposed in [59] for efficient energy management using Mega2560 and Blynk. The author of [60] surveyed how an intelligent classroom can be built to become an intellectual environment. Nafhath [61] states that cameras can be installed inside the classroom to monitor classroom activities. In [62], the author proposes the design of a conceptual model for a smart classroom based on a multiagent system. In [63], the author uses camera feeds to monitor students' behavior inside the classroom through gestures, face recognition, and lip tracking, implemented with machine learning algorithms. The author of [64] creates an IoT architecture based classroom model, built on the SCADA infrastructure model, for implementing an intelligent classroom. In [65], the author proposed emotionally aware, AI-enabled classrooms, whereas in [66], deep learning and osmotic computing were tested to provide a smart classroom.
Figure 4: A proposed conceptual model to implement Deep Learning for IoT devices in a Classroom Environment.

Proposed Conceptual Classroom Model with IoT:
As described in Figure.  extraction takes place since it is a deep learning model. Thirdly, the model is implemented: the input image information is passed to the model for feature extraction, using the pre-processed, normalized, and cleaned data. The data is then split, with the larger share used for training, so that the model generalizes well to the specified problem. The model is then trained and its hyper-parameters are tuned to improve accuracy. Next, the finalized trained model is used in the real-time scenario: the IoT device with an embedded camera module is fixed in place, inputs are collected, and the video data is split into frames and kept in local or cloud storage (GCP, AWS, Azure, etc.). A neural network accelerator can be used if necessary, and the Deep Learning computation can be switched between the local device and the cloud (GCP, AWS, Azure, etc.). Finally, on-device inference is run and visualized using a small-scale or squeezed model that fits the embedded device, obtained through model conversion with model optimization, model pruning, vector quantization, etc. At last, the decision-making process is initiated based on the scenario [8].
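Two of the data-handling steps in this pipeline, splitting video into frames and splitting the data with a larger training share, can be sketched without any camera or deep learning library. The function names, the 30 fps and 5 fps rates, and the 80/20 ratio below are assumptions made for illustration only; the paper does not specify them.

```python
import random


def sample_frame_indices(total_frames, native_fps, target_fps):
    """Indices of the frames to keep when a video is downsampled to target_fps."""
    if total_frames < 0 or native_fps <= 0 or target_fps <= 0:
        raise ValueError("frame count must be >= 0 and frame rates must be > 0")
    step = max(1, round(native_fps / target_fps))
    return list(range(0, total_frames, step))


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle the items reproducibly and split them into (train, held_out)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Assumed example: a 10-second clip recorded at 30 fps, reduced to 5 fps.
frames = sample_frame_indices(total_frames=300, native_fps=30, target_fps=5)
train, held_out = split_dataset(frames, train_fraction=0.8)
print(len(frames), len(train), len(held_out))
```

Neighbouring frames of one video are highly similar, so in practice it is usually safer to split by video or by recording session rather than by individual frame; the random split above only illustrates the proportions.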
Thus, the architecture below (Figure 5) can be taken as a basic pipeline showing how Deep Learning is combined with Computer Vision, Natural Language Processing, and IoT technology to suit a classroom environment. Once trained, the neural network model can be frozen and converted to a TensorFlow Lite (.tflite) model for deployment, making on-device inference possible at the edge on embedded devices such as a Raspberry Pi board or mobile devices.
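To give an intuition for why a converted model fits on an embedded device, the following framework-free sketch applies two common size-reduction ideas to a plain list of weights: magnitude pruning and affine 8-bit quantization. This is not the TensorFlow Lite converter and not the authors' procedure; the function names and the sample weights are hypothetical, and the quantization shown is a simple scalar scheme rather than vector quantization.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_uint8(weights):
    """Affine quantization to integer codes in 0..255.

    Returns (codes, scale, zero_point) such that
    weight is approximately (code - zero_point) * scale.
    """
    low = min(min(weights), 0.0)   # keep 0.0 inside the representable range
    high = max(max(weights), 0.0)
    scale = (high - low) / 255.0 or 1.0  # fall back to 1.0 if all weights are 0
    zero_point = round(-low / scale)
    codes = [min(255, max(0, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Map integer codes back to approximate floating-point weights."""
    return [(c - zero_point) * scale for c in codes]


# Hypothetical weights of one small layer (toy values).
weights = [0.5, -0.02, 0.9, 0.01, -1.0, 0.3]
pruned = prune_by_magnitude(weights, sparsity=0.5)
codes, scale, zero_point = quantize_uint8(pruned)
restored = dequantize(codes, scale, zero_point)
print(pruned, codes)
```

Each weight is stored in one byte instead of four, and the pruned zeros compress well; the price is a small reconstruction error of at most about one quantization step per weight.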
In this paper, along with the proposed model, we have summarized the available applications of these technologies that suit our solution. This paper can thus be taken as a starting point for the research, offering an overview of the available papers for the successful implementation of the proposed approach.
The architecture shows that distinct examples are available for these educational use cases. Taking these scenarios as examples, choosing a problem definition for building DL and IoT based systems offers a promising scope for new educational infrastructures. We can therefore explore and build more projects, products, and systems by integrating these scenarios into a high-potential classroom system. This is only a start in deriving new insights through a deeper analysis of these scenarios. Ultimately, proof-ready IoT based systems will help create a more efficient, intelligent classroom infrastructure based on Deep Learning for an IoT based classroom setup. Based on this investigation, the implementation and deployment parts will be explored in our future research.
Owing to new technological advancements and rapid growth in the era of Artificial Intelligence and its applications to various fields, the educational sector is no exception. IoT can be considered a suitable platform for integrating Deep Learning into the classroom environment.
The major difference lies in how, and with what, such a solution is built efficiently. Like humans, the machine needs to understand the surrounding world and perceive it as we see and visualize it.

Methods
This section describes how I organized my work to prepare the proposed model and how it is supported by recent work from other researchers in this field.
The papers were collected from online databases and repositories using specific keywords such as "Raspberry pi for the classroom" and "Deep Learning for Classroom Environment".
Google Scholar, IEEE Xplore, and Academia were the websites used to search for and locate academic papers, journals, and conference proceedings related to the above-mentioned keywords, based on quantitative and qualitative measures such as the total count of papers and the keywords/key terms that support my proposal.
As the collected papers were published recently, the topic can be considered to have a promising scope for future research. The cited articles have also been cited by other researchers, which suggests that they are representative of this concept in this field of research.
Many recent papers in the above-mentioned field support each task individually. Still, there is a gap in how to combine these setups into fused, multitasking models for multimodal datasets and make them suitable for the educational environment. For this work, around 79 papers were explored, collected from Google Scholar, IEEE Xplore, academic papers, journals, conference proceedings, etc.

Deep Learning for IoT Devices
In recent years, deep learning has been implemented on high-performance computing devices such as the Graphical Processing Unit (GPU) and the Vision Processing Unit (VPU). GPU systems are preferred by default for implementing Deep Learning, since they allow processing to happen efficiently without technical lag or interruption. Low-power, resource-constrained devices such as embedded devices and mobile devices, however, remain a concern.
The authors of [67] described how deep learning can be leveraged in IoT devices in two different ways, such as offloading to the cloud and building inferences. Similarly, challenges and solutions for building deep learning-based IoT devices using powerful models have been summarized. In [68], various approaches to making deep learning possible on IoT devices are reviewed and their performance on GPU and ARM processors is evaluated.
In [69], the author conducts experiments on machine learning inference at the edge on IoT devices, using 3 algorithms and 10 different datasets on Raspberry Pi.
In [70], the author discusses various challenges in deploying neural networks on microcontrollers and in developing lightweight models, using keyword spotting algorithms.

Applications in a Classroom Environment
As noted earlier, a few authors have shown how Deep Learning can be enabled on embedded devices, especially IoT devices or mobile devices with limited resources. Table 1 lists works on implementing Deep Learning for IoT devices inside the classroom, covering Computer Vision problems and setups that can support several classroom use cases. It also covers recent papers where Deep Learning has been applied in the classroom environment to solve computer vision tasks such as face detection and face recognition, as well as Natural Language Processing tasks such as Machine Translation and Speech Recognition.
| References | Concept | Year |
|---|---|---|
| [71] | Using sensors inside the classroom to sense their environmental conditions and student's activities for quality education | 2017 |
| [72] | A mobile-based application (CNN) where student's attendance was marked based on their recognized faces | 2018 |
| [73] | To take attendance based on SOTA techniques and achieved 98.67% accuracy on LFW dataset and 100% classroom data | 2018 |
| [65] | Mobile-based Cloud Hybrid Architecture for emotion recognition system using deep learning based on non-verbal cues such as gestures and facial expressions | 2018 |
| [74] | Methodology on how to create face datasets for student attendance management system based on 2 quantitative and qualitative analysis on the classifier | 2018 |
| [75] | Web-based student attendance management system along with XAMPP and MySQL for CNN model and k_NN classifier by detecting faces | |
| [66] | IoT based smart classroom based on osmotic computing and deep learning models along with fog microserver, mobile edge computing devices, etc. | 2018 |
| [76] | Student engagement analysis inside the classroom on nonverbal cues for a 350 students with 71% accuracy on Gold Standard Study dataset than Cohen Kappa | 2019 |
| [77] | To study the student's affective states on the e-learning environment based on spontaneous and posed datasets with 83%, 76% accuracy on detection and classification | 2019 |
| [78] | Web-based machine translation tools to study student's behavior among Korea-speaking students for language learning (Google translate, Navel Translate) | 2019 |
| [79] | To automatically annotate the activity inside the classroom for self-report generation based on instructor's recordings with a single voice, multi-voice, no-voice, and others for identifying the time consumption | 2019 |
| [80] | To automatically lip read the speakers using CNN and also RNN models by extracting lip region with VGG and other network models tested on datasets created with 88.2% accuracy | 2019 |
| [76] | To detect the student's emotional states based on 5 moods to match their engagement scores to enhance teaching and learning rate with 90% accuracy | 2019 |
| [81] | An automatic student monitoring system that can monitor teacher as well as student's behavior on their performance in the classroom using 1800 frames of 6 videos participated by 10-20 students | 2019 |

Table 1: Recent works by researchers to support my proposed architecture

Discussion and Findings
Based on the proposed architecture, it can be noted that IoT has started to establish itself in the educational sector, helping educational settings provide good-quality experiences across administration, teaching, learning, and other departments. This paper first explores the papers related to the possible applications of the Internet of Things (IoT) in education. Secondly, it examines the impact of Raspberry Pi as an IoT platform and the related works that support Smart Classroom implementation, showing how Raspberry Pi is used in the classroom environment along with Computer Vision applications. Thirdly, it introduces how the intersection of Computer Vision and Natural Language Processing can suit the classroom environment and explains that Deep Learning can be used with this kind of multimodal data source. Fourthly, the main objective of this proposal is to show how intelligence can be incorporated into a traditional classroom using more advanced technologies such as Deep Learning, which means applying the algorithms to several use cases and complex problems while embedding Deep Learning in the IoT devices installed inside the classroom. In support of this objective, the existing works related to the above-mentioned scenario have been highlighted and summarized. This gives rise to a new scope for applying Deep Learning in educational settings. Thus, based on the proposed model, intelligence can be brought into the classroom environment using Deep Learning along with the IoT platform for enhanced teaching and learning experiences.
This proposal can be taken as a starting point for implementation, aiming to gain more value from a deeper analysis of the problem definition for a classroom setup. To understand the surrounding world, we need to perceive what we see; the same rule applies to machines that need to understand the environment visually. The major difference lies in how, and with what, we efficiently build a solution that can understand the surrounding world.

Acknowledgment
Our sincere thanks to our institute Vinayaka Mission's Kirupananda Variyar Engineering College, Salem, Tamilnadu for extending the facilities for my research through the Centre for Research and Development (CRD).

Conclusion
The growing technological community demands the use of these technologies in diverse fields, and the educational sector is no exception. To make deep learning techniques useful for the educational sector, IoT is a suitable platform for integrating Deep Learning into the classroom environment to collect, organize, and visualize data and to take decisions based on the given scenarios. In this investigation, we have identified 5-6 major use cases that apply to a traditional classroom environment using Visual IoT. Articles were taken from the past 3 years, covering 2017-2020. Since these papers were taken from high-impact scientific journals and academic repositories, they can represent typical real-life scenarios and support a proof-ready concept of using Deep Learning for IoT devices adapted to a classroom environment, leading to a promising educational transformation. This paper offers a lead on how Computer Vision and Natural Language Processing can be integrated, with image processing and text generation performed simultaneously, focusing on their implementation inside the classroom. These use cases mainly focus on enhancing classroom experiences efficiently and intelligently. Finally, we find that future-generation classrooms can be a Deep Learning enabled IoT Classroom (DLeIC), taking educational evolution to a new dimension for both the staff and the students who are part of such systems.
The proposed architecture shows that these state-of-the-art technologies can help developers, researchers, and scientists work in new directions towards enabling intelligence in a classroom environment. Based on this architecture, we plan to use these concepts and scenarios to build systems that promote intelligent classrooms. The future of IoT and Deep Learning appears promising: as more connected things arise, more systems with less human intervention can be produced, with a positive impact on a rapidly developing higher education environment.