A Comprehensive Review of Smart Glasses Technology: Future of Eyewear

This paper breaks down the technology behind smart glasses and the advances in this emerging field, covering the hardware, the software and the research effort that major companies have invested in developing a real-life model. It also analyzes users' experience with different glasses and discusses the new benchmarks that this technology has set in sectors such as medicine, gaming, corporate, sports, entertainment and many others.

Smart Glass Platform [9]
As the display is placed very close to the human eye, the user can experience bleed-out problems in the display while viewing content. It can also increase the strain on the eye and may cause damage to it. Smart glasses are not yet mainstream because of some rudimentary constraints: their interface is completely different from that of the smartphones we use on a daily basis, and they require different input and display methods along with a different arrangement of sensors to get the work done. [21]

2. Problem Formulation
The technology behind smart glasses is intricate, and its complexity depends on the features that the device offers. Because of the small size of the device, its processing power is limited; to add features that require high computational power, the smart glasses must be connected to a server that performs the task and returns the processed result. A temperature detection module, for example, usually consists of an infrared sensor along with various output devices and LEDs that are connected to the processing unit, a power supply circuit and a control module. An output device can be anything that can be triggered, such as a buzzer. Smart glasses therefore have their own processor, are powered by a battery, and project the result on a tiny screen embedded in the glasses, obtaining it either from data processed on the device or from a server over a wireless network. The main challenge in the development of smart glasses was to find a method to sample the wide set of data accumulated every day by the various sensors, in random order and under a huge and unpredictable set of factors; the solution is discussed further in the paper. [6] [10]

Features of Smart Glasses
Since no virtual or physical keyboard is attached to smart glasses, a workaround is needed to interact with the device. Various input methods can be used to make the human-device interface more user friendly, and we came across many of them in our study. Let us start with a common one, air-writing. In this method, hand gestures are used to interact with the smart glasses by pointing at objects from the user's perspective. In addition, users can write in the air with the index finger, and the resulting trajectory acts as input; this is termed air-writing. It is a convenient method that users can learn and apply on their own. To achieve this, Mask R-CNN is used as the primary model architecture and modified to detect fingertips in real time on a GPU. First, MobileNetV2 is used as the backbone network instead of ResNet, as it has fewer model parameters. [27] The processing speed can be improved significantly by trimming bottleneck layers while maintaining detection accuracy. Furthermore, a feature pyramid network (FPN) is used to build multi-scale feature maps so that variations in object size can be handled. In neural networks, low-level features such as edges are extracted by layers closer to the input, while high-level features are extracted by layers closer to the output. However, small objects tend to vanish in deep layers. FPN is used to overcome this barrier, improving the accuracy of small-object detection and balancing processing speed against accuracy. [27] The GPU used on the server side for detecting text written in the air is an NVIDIA GTX 1080 Ti. When the model has located the trajectory of the fingertip, it sends the trajectory to the Google Input API, which recognizes the letters or characters in real time and returns them to the user.
If the model cannot detect any trajectory for a few seconds, it returns the previous ten values to the smart glasses so that the user can choose from them.
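The fallback behaviour described above (returning the previous ten values when no trajectory is detected for a few seconds) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the method names and the 3-second timeout are hypothetical and are not taken from [27]; only the history length of ten comes from the text.

```python
from collections import deque


class AirWritingSession:
    """Keeps recently recognized characters and offers them again on timeout."""

    def __init__(self, timeout_s=3.0, history_size=10):
        self.timeout_s = timeout_s                   # assumed value for "a few seconds"
        self.history = deque(maxlen=history_size)    # holds only the last ten values
        self.last_seen = None                        # time of the last recognized trajectory

    def on_recognized(self, char, now):
        """Store a character returned by the recognizer at time `now` (seconds)."""
        self.history.append(char)
        self.last_seen = now

    def poll(self, now):
        """Return the stored candidates if no trajectory was seen within the timeout."""
        if self.last_seen is not None and now - self.last_seen >= self.timeout_s:
            return list(self.history)
        return []
```

A bounded deque is used here so that older values are discarded automatically once ten entries are stored; whether the original system stores characters or whole words is not stated in the source.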
The authors present a network that detects hand regions and localizes fingertips in real time. MobileNetV2 is used as the backbone of the network, and the number of bottleneck layers is trimmed to avoid detecting repeated features. The model works accurately on 640x480 RGB images at 38 fps. The input method treats fingertip trajectories as character strokes and sends them to the Google API to recognize handwritten letters. The device also includes a pointing system, which is used to point at objects in order to interact with the smart glasses. There are at least three methods for pointing at objects with wearable devices such as smart glasses: (I) pointing at the target with the naked eye, (II) pointing at the target with a laser pointer and then searching for the laser pointer in the captured images, and (III) pointing at the object with the help of a crosshair displayed on the optical head-mounted display (OHMD). [27]
Another input method for smart glasses is Palm Type, which lets users pinpoint specific locations on their palms and fingers without visual attention and provides visual feedback through the wearable display. With wrist-worn sensors and wearable displays, Palm Type enables typing without requiring users to hold any device or to look at their hands. The latest design of Palm Type incorporates a QWERTY layout that is 39% faster than current touchpad-based keyboards, and Palm Type was preferred by 92% of the participants. Palm Type uses 15 infrared sensors to detect the user's finger position and taps, and provides visual feedback through Google Glass. [34]
Many companies have also designed light-indicator-based glasses that assist navigation through near-eye visual cues, with the light adjustable to the user's preferences. According to feedback from trials with elderly users, most of whom suffered from mild dementia, the participants' navigation tasks were supported by the smart glasses; bikers can use them as well, since their hands are not free to interact with hand-held devices.
Speech recognition is another input method for smart glasses. The choice messages, or default input messages, are generated by the smart glasses using machine learning based on user behavior and sensor data, and they include the best-fitting item in the menu choice message. If the user does not interact with the system, the system selects the default input message and proceeds to the next task. The main drawback of default input messages is the risk of unsatisfactory messages, which requires frequent user interaction over time to train the machine learning algorithm. [28]
One efficient input method is head gesture recognition, which enables the use of simple head gestures as input. It remains accurate across various wearer activities regardless of noise. Glass Gesture achieves a gesture recognition accuracy of nearly 96%. For authentication, Glass Gesture accepts authorized users in nearly 92% of trials and rejects attackers in nearly 99% of trials. The motion sensors on Glass (i.e., the accelerometer and gyroscope) can measure and detect all kinds of head movements owing to their high electromechanical sensitivity. Head gestures have two advantages over other input methods. First, in some situations it may be considered inappropriate or even rude to operate Glass through the touchpad or voice commands, whereas head gestures can be tiny and not easily noticeable, which mitigates social awkwardness. Second, the head gesture user interface can authenticate users, which increases the security of the device. [23]
Another interaction method is based on Google Glass, which has a camera and an OHMD. Because of the processing limitations of Google Glass, the captured images are sent to a computer over the local network. An L-shaped crosshair is placed at the lower left corner of the OHMD so that it interferes as little as possible with the user's vision. The pointing system is established using the camera-eye geometry and a distance-pixel curve. Further, two methods are developed to estimate the viewing point on a plane surface. Experiments showed that this method can achieve an angular error of less than 0.32°. [4]
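To give a feel for what an angular error of 0.32° means in practice, the small sketch below converts between an angular error and the corresponding offset of the estimated viewing point on a plane. It is not the estimation method of [4]; it assumes a plane perpendicular to the line of sight, and the function names and the 2 m viewing distance are illustrative assumptions.

```python
import math


def angular_error_deg(offset_m, distance_m):
    """Angle (degrees) subtended at the eye by an offset on a perpendicular plane."""
    return math.degrees(math.atan2(offset_m, distance_m))


def max_offset_m(error_deg, distance_m):
    """Largest on-plane offset (metres) consistent with a given angular error."""
    return distance_m * math.tan(math.radians(error_deg))


# Assumed viewing distance of 2 m: 0.32 degrees corresponds to roughly 1.1 cm.
offset = max_offset_m(0.32, 2.0)
```

Under these assumptions, a pointing error of 0.32° keeps the estimated viewing point within about a centimetre of the true point on a surface two metres away.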

Components of Smart Glasses
Smart glasses comprise many components. Let us start with the display: most glasses today use one of two major display types, which are shown in Table 1.2. The data displayed is collected using a camera sensor, which also plays a vital role in other functions such as hand gesture recognition, feed capture and many more. The types of cameras used in smart glasses include RGB, depth and infrared cameras, which support different computer vision tasks. [12] Because of the computational limitations of smart glasses, the computer vision algorithms used must be efficient and must work precisely with low-fps video. Eye-blink detection has various applications in health care, human-computer interaction and driving safety. As the closing of the eyelid is the key step in detecting a blink, an eigen-eye approach is used to detect eyelid closure, and a Gradient Boosting (GB) algorithm is then trained on the results of the eigen-eye approach to recognize eye-blink patterns. The camera is placed so that it points towards the corresponding eye in order to capture the activities of that eye. A mini computer, the MK802, is attached to the smart glasses to provide processing capability. It has an Allwinner A10 1.0 GHz Cortex-A8 CPU and 512 MB of RAM, and it supports Android ICS (4.0) and Linux Linaro as operating systems. Figure 2.1 shows a design of smart glasses used to detect blinking of the human eye.
Various sensors are used within the device to collect and process data, depending on the application. For voice input, a microphone converts sound into an electrical signal, which is then processed by speech recognition algorithms. Recent advances in this field have made speech recognition more accurate and responsive. [28] In our study we also came across many features that bring smart glasses very close to smartphones. For example, the Global Positioning System (GPS) enables smart glasses to support geolocation-based applications such as navigation, real-time tracking and many more. A trackpad or external controller provides an easy way to take input and serves as a medium for interacting with the digital interface on the optical display of the smart glasses. An accelerometer measures the rate of change of velocity of a body in its own instantaneous rest frame, which helps in assessing the wearer's activity, such as walking, sitting, running or, in particular, hand gestures. A gyroscope is used to measure angular velocity and head orientation. Eye tracking takes the user experience to a whole new level; its objective is to locate the object that the user intends to select, driven by eye movement. Magnetometers measure the strength and direction of magnetic fields, which is very important for precise navigation and maps. To carry the device around we need a battery, and to charge the battery we need a charging system. Battery specifications differ from device to device, as the energy requirements of the sensors embedded in smart glasses differ. [12] Apart from the other use cases of smart glasses, detecting the wearer's heart rate (HR) with a photoplethysmography (PPG) sensor is another feature that can be added to smart glasses.
Reflectance-mode sensors are used for long-term HR monitoring: light falls on the skin and is reflected by the subcutaneous tissue beneath it. In this type of sensor, an LED and a photodetector are placed close to each other on the skin surface, at a location with a good concentration of blood vessels beneath the skin. [29] Many factors contribute to the absorption of light during PPG. Biological substances such as bone, skin, tissue, non-pulsatile venous blood and arterial blood (vascular elements) absorb light constantly, which is represented by the DC level in the plethysmogram. The cardiac cycle, consisting of diastole and systole, is represented by the alternating signal, which results from pulsatile arterial blood flow. [21] [25] Pulse-Glasses is an example of such a system; it consists of a pulse sensor, a rechargeable battery and a microcontroller. The pulse sensor sends analog PPG values to the microcontroller, which transfers the data to an Android mobile phone using the Bluetooth Low Energy (BLE) protocol. The pulse sensor in Pulse-Glasses is placed on one of the two nose pads and consists of a green LED that emits light and a detector facing the skin that senses it. All connections from the pulse sensor to the microcontroller and battery run through the plastic frame of the glasses. The microcontroller is placed on one side of the frame and the rechargeable battery (3.7 V, 3700 mAh) on the other. The Arduino Blend Micro (ABM), manufactured by Red Bear Ltd, serves as the development board of Pulse-Glasses. The Atmel ATmega32U4 microcontroller is hosted on the single-board circuit of the ABM, which also contains a BLE chip, the Nordic nRF8001. The ABM collects 0-5 V filtered analog signals from the pulse sensor.
Fig. 4.1 Image of Google Glass [16]
The ABM sets the initial threshold level at 2.5 V. When the rising edge of the signal crosses this threshold, a timer is started and runs until the signal crosses the threshold again. When this second crossing occurs, the time taken for one cycle is recorded in milliseconds and stored in a register, and the timer restarts from 0. When the register is full, the average of the stored times is computed, and 60000 is divided by this average to obtain the HR in beats per minute. Once the HR has been calculated, the newest recorded times overwrite the oldest values and a new HR is computed. These glasses can also connect to a cloud server and thus function as an IoT device, allowing users to keep an eye on their heart rate without any obstruction. [25]
Fig 2.2 Microprocessor and Architecture of a Smart Glass (minimum viable project) [11]
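The threshold-based heart-rate calculation described in this section can be sketched as follows. This is a minimal illustration rather than the Pulse-Glasses firmware: the function names, the register length of 8 and the sampling period are assumptions, while the 2.5 V threshold comes from the text. Since the beat-to-beat interval is measured in milliseconds and one minute has 60000 ms, the HR in beats per minute is 60000 divided by the mean interval.

```python
from collections import deque

THRESHOLD_V = 2.5    # initial threshold level stated in the text
REGISTER_SIZE = 8    # hypothetical register length; not given in the source


def beat_intervals_ms(samples, sample_period_ms, threshold=THRESHOLD_V):
    """Return the times (ms) between successive rising-edge threshold crossings."""
    if not samples:
        return []
    intervals = []
    last_cross = None
    prev = samples[0]
    for i in range(1, len(samples)):
        cur = samples[i]
        if prev < threshold <= cur:            # rising edge crosses the threshold
            t = i * sample_period_ms
            if last_cross is not None:
                intervals.append(t - last_cross)
            last_cross = t                      # timer restarts from this crossing
        prev = cur
    return intervals


def heart_rate_bpm(intervals_ms, register_size=REGISTER_SIZE):
    """Average the newest intervals and convert the mean to beats per minute."""
    register = deque(intervals_ms, maxlen=register_size)  # newest values replace oldest
    if not register:
        return None
    mean_ms = sum(register) / len(register)
    return 60000.0 / mean_ms
```

For example, a signal whose rising edges are 800 ms apart yields 60000 / 800 = 75 beats per minute.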

Advances in Smart Glasses Technology
To assist blind people in navigation tasks, one method calculates the distance between the wearer and obstacles using deep learning and stereo cameras mounted on smart glasses. In this design, the stereo cameras are used to calculate the distance and a gyro sensor is used to distinguish obstacles from the floor. A buzzer and a vibration motor are attached to the device at three positions: front, left and right. They operate according to the distance between the obstacle and the wearer, and the buzzer and vibration motor on the side where the obstacle is present are switched ON. [5] For example, if an obstacle approaches from the left, the buzzer and vibration motor on the left side operate. CdS sensors are used to distinguish day from night; at night, to prevent mishaps, an LED turns ON to indicate the wearer's position to other people. To detect the type of obstacle, YOLOv3 is used as the deep learning algorithm. This task requires a lot of processing power, so it cannot be performed on a low-level MCU. To process the information quickly, image data is sent to a server over a wireless connection. All processing is done on the server, which sends the response signals directly to the output devices via the microcontroller. [18]
Apart from detecting the distance between the blind user and obstacles, there is also a system that can detect the presence of humans at night and in poor lighting conditions using thermal images, although such images do not perform up to the mark during the day because of the reduced thermal contrast between the environment and people. Therefore, to detect pedestrians in difficult weather conditions, thermal images can be augmented with their saliency maps, which serve as an attention mechanism for detecting people on the street; deep learning combined with saliency maps can be used for pedestrian detection during the day as well as at night. [18] An infrared (IR) thermal camera records the heat generated or reflected by objects and converts this energy into temperature values, which in turn form an image. However, a thermal image sensor does not provide the color information available in the visible spectrum; instead it delivers the detected temperature range in the form of thermograms. The information provided by thermal imaging sensors is therefore much less than that of visible-light cameras. Factors such as changes in ambient temperature, the quality and type of the thermal image, the recording distance of the thermal image sensor and the weather conditions affect the recording quality and thus the accuracy of pedestrian detection on the street. To counter the problem of ambient temperature changes, the mid-wave infrared (MWIR) and long-wave infrared (LWIR) bands can be used; they are negligibly affected by solar radiation, but they do not perform accurately in foggy or rainy weather. MWIR sensors are more favorable in warmer climates, while LWIR sensors are preferred in colder climates.
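As a rough illustration of how temperature values can form an image, the sketch below maps a grid of temperatures to 8-bit grayscale intensities. The min-max scaling used here is an assumption made for illustration (the source does not state how the thermograms are rendered), and the function name is hypothetical.

```python
def to_thermogram(temps_c):
    """Map a 2-D grid of temperatures (degrees C) to 8-bit grayscale intensities."""
    flat = [t for row in temps_c for t in row]
    t_min, t_max = min(flat), max(flat)
    span = t_max - t_min
    if span == 0:
        # A uniform scene carries no contrast; render it as a flat image.
        return [[0 for _ in row] for row in temps_c]
    # Assumed min-max scaling: coldest pixel -> 0, hottest pixel -> 255.
    return [[round(255 * (t - t_min) / span) for t in row] for row in temps_c]
```

With this kind of scaling, a person stands out clearly only when body temperature differs noticeably from the background, which is consistent with the contrast problem noted above for daytime thermal images.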
To build the model, a thermal imagery database containing images of different scenes under different lighting and weather conditions is needed. To detect humans in the scene, the authors used their self-created dataset, UNIRI-TID, which simulates realistic conditions for detecting humans in difficult weather. The architecture used for this problem is YOLOv3, which uses a network of 53 convolutional layers with 3x3 and 1x1 filters. Instead of the max-pooling layers typically used in CNNs, convolutional layers with a stride of 2 are used to downsample the feature maps, which prevents the loss of low-level features that is often attributed to pooling.
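The effect of stride-2 convolutions on the feature-map size can be illustrated with the standard output-size formula, floor((n + 2p - k) / s) + 1. The sketch below is only an illustration: the 416-pixel input size and the padding of 1 are assumptions not given in the source, and the chain of five stride-2 layers follows the usual YOLOv3 backbone layout.

```python
def conv_output_size(n, kernel, stride, padding):
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def downsample_chain(n, num_stride2_layers):
    """Feature-map sizes after successive 3x3, stride-2, padding-1 convolutions."""
    sizes = [n]
    for _ in range(num_stride2_layers):
        n = conv_output_size(n, kernel=3, stride=2, padding=1)
        sizes.append(n)
    return sizes


# Assumed 416x416 input: each stride-2 layer halves the feature-map size.
sizes = downsample_chain(416, 5)   # [416, 208, 104, 52, 26, 13]
```

Each stride-2 layer halves the resolution much as a 2x2 pooling layer would, but with learned weights instead of a fixed maximum operation.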
The authors divide the model into two variants, bY and tY. Model bY achieves a precision of 100% and a recall of 15.5%, while model tY achieves a precision of 100% and a recall of approximately 50%. The results show that, compared with the base model, model tY detects more of the people in the given images without any false positives. Across weather conditions, the authors report an AP score of 97.85% for foggy and clear weather and 98.08% for a rainy day. The in-depth analysis of smart glasses is shown in

Comparison of Various Smart Glasses
Various methods have been developed so far for interacting with smart glasses. Some adopt concepts from optics to display data to the user on the glass fitted in front of the eye, while others use LEDs, a microphone and a mobile device to let the smart glasses interact with the user. Another system, known as the pointing technique, is used to point at objects in order to perform various tasks on them. There are at least three methods for pointing at objects with wearable devices such as smart glasses: (I) pointing at the target with the naked eye, (II) pointing at the target with a laser pointer and then searching for the laser pointer in the captured images, and (III) pointing at the object with the help of a crosshair displayed on the OHMD. [32]
Table 2.2 Comparison of various Smart Glasses [12]
The system proposed in [4] is based on Google Glass, which has a camera and an OHMD. Because of the processing limitations of Google Glass, the captured images are sent to a computer over the local network. An L-shaped crosshair is placed at the lower left corner of the OHMD so that it interferes as little as possible with the user's vision. The pointing system is established using the camera-eye geometry and a distance-pixel curve. Further, two methods are developed to estimate the viewing point on a plane surface. Experiments showed that this method can achieve an angular error of less than 0.32°.
The authors of [3] present a design of smart glasses with a head-mounted display (HMD) intended to assist senior citizens in their day-to-day navigational tasks. The system contains LEDs that act as indicators and Bluetooth to connect with mobile devices. To assist the user in navigation, the authors devised various LED blinking combinations that guide the user along a certain path. The smart glasses are connected via Bluetooth to an Android device, which receives commands from a remote caretaker over an internet connection and blinks the LEDs so as to guide the user along a particular path.
The glasses are also equipped with a Bluetooth microphone to enable communication between the user and the remote caretaker, and with a camera that can send videos and pictures of the user's field of view to the caretaker for increased protection. The target audience of this prototype is senior citizens suffering from memory loss, so the LED blinking patterns are kept simple to make navigational assistance easy. [3] [26]
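The idea of translating caretaker commands into simple LED blinking patterns might look roughly like the sketch below. Everything in it is hypothetical: the source does not specify the actual patterns, the number of LEDs or the command names, so the table and the function names serve only to illustrate the mapping.

```python
# Hypothetical command-to-pattern table; the source does not specify the patterns.
# Each step is (left_led_on, right_led_on, duration_ms).
LED_PATTERNS = {
    "left":     [(True, False, 300), (False, False, 300)],
    "right":    [(False, True, 300), (False, False, 300)],
    "straight": [(True, True, 300), (False, False, 300)],
    "stop":     [(True, True, 1000)],
}


def pattern_for(command):
    """Return the blink steps for a caretaker command, defaulting to 'stop'."""
    return LED_PATTERNS.get(command, LED_PATTERNS["stop"])


def total_duration_ms(command, repeats=1):
    """Total time needed to play a pattern `repeats` times."""
    return repeats * sum(step[2] for step in pattern_for(command))
```

Falling back to a "stop" pattern for unknown commands is a design choice made here for safety, in keeping with the text's emphasis on simple patterns for users with memory loss.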

Challenges
One challenge of using smart glasses is reading on the go: walking has an adverse effect on reading, whether on a smartphone or on smart glasses, and it affects comprehension and workload. Studies have shown that this effect can be reduced by overlaying the text in front of the user's eye in the middle of the glass. However, more research is needed to find the best way to make reading a seamless experience. [14] Regular consumers look to smart glasses for entertainment and experience-enhancing purposes. The other main group is professional consumers, who are expected to benefit from the technology's 'hands-free' features. So far, however, no company has met these expectations with a cost-efficient, compact design. There is also a risk potential: non-sustainable augmentation or enhancement may reduce some cognitive capacities, e.g., navigation skills, as they are "outsourced" to technology, or may delegate crucial tasks to less skilled personnel. [13]

8. Scope of the Technology
Smart glasses are used in various sectors such as education, health (as a fitness tracker to count steps and measure heart and respiration rate), tourism (for giving a tour of a place), retail stores and entertainment. In addition, a study was conducted on game input for smart glasses. The results show that users significantly preferred non-touch and non-handheld interaction, such as in-air gestures, over handheld input devices. For touch input without handheld devices, users preferred interacting with their palms over wearable devices (51% versus 20%). Users also preferred interactions that are less noticeable because of concerns about social acceptance, and preferred in-air gestures in front of the torso rather than in front of the face (63% versus 37%). Smart glasses can also be used to highlight the route for the user. If the user is driving a car, the glasses can also suggest a speed.
They can also guide warehouse employees to the products they need to transport, highlighting multiple products of the same order in the same color.
The eye tracking technology can also be used to track the eye movement of the employee. This will help determine whether the employee is tired and needs a break or when an employee has finished all the work and is sitting idle. In construction sites, smart glasses can be augmented with the design of the building which will help the engineers to find mistakes and it will also help the workers to prevent accidents like drilling through a water pipe. These are only a fraction of possible scenarios for smart-glass applications. And it is clear that each brings a series of ethical questions that need to be answered. [26]

Conclusion and Future Scope
One way to look at smart glasses starts from the fact that human vision is limited to the visible band of the electromagnetic spectrum; to extend it to challenging environmental conditions, people will tend to turn to commercially available glasses. Companies working on smart glasses have to find a way to offer easy and effortless operation and a good customer experience, all under appropriate ergonomic considerations, in order to reach the mainstream. Even though the outlook for smart glasses is promising, it is not clear whether users will adopt smart glasses for daily use in the same way as today's smartphones, as battery life and input methods remain problematic. However, it seems that smart glasses will first serve as specialized task-oriented devices, for instance industrial glasses, smart helmets, sport-activity coaching devices and the like. [12]