ARTIFICIAL NEURAL NETWORKS

Aditya Kumar
10 min read · Feb 21, 2021

NEURAL NETWORKS

To understand what a neural network is, it helps to first understand what machine learning is. Machine learning is a type of artificial intelligence where data is collected and used to understand the behavior of a particular process and then predict how that process will act in future settings as the system is continually fed new data.

→ A neural network is a type of machine learning used for detecting patterns in unstructured data, such as images, transcriptions or sensor readings.

For example:

“In neural networks, when data is collected about a particular process, the model that is used to learn about and understand that process and predict how that process will perform in the future is a simplified representation of how a brain neuron works.”

“A brain neuron receives an input and based on that input, fires off an output that is used by another neuron. The neural network simulates this behavior in learning about collected data and then predicting outcomes.”
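To make the neuron analogy concrete, here is a minimal sketch of a single artificial neuron in Python; the inputs and weights are made up purely for illustration. It receives a few inputs, combines them as a weighted sum, and “fires” an output through a sigmoid activation.

```python
import numpy as np

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum of inputs plus a bias,
    squashed through a sigmoid so the output lies between 0 and 1."""
    z = np.dot(inputs, weights) + bias
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative values only: three inputs with hand-picked weights.
output = neuron(np.array([0.5, 0.8, 0.2]),
                np.array([0.4, -0.6, 0.9]),
                bias=0.1)
print(output)  # a value in (0, 1), the neuron's "firing" strength
```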

Then come the layers of a neural network.


LAYERS OF NEURAL NETWORKS:

✏ INPUT LAYER:

🔺The activity of the input units represents the raw information that is fed into the network.

✏ HIDDEN LAYER:

🔺The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units.

🔺There may be one or more hidden layers.

✏ OUTPUT LAYER:

🔺The behavior of the output unit depends on the activity of the hidden units and the weights between the hidden and the output units.
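Putting the three layers together, here is a minimal sketch of a forward pass through a tiny network; the layer sizes and random weights are illustrative only, not taken from any particular model.

```python
import numpy as np

# Illustrative sizes: 4 input units, 5 hidden units, 3 output units.
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(4, 5))   # weights between input and hidden units
W_output = rng.normal(size=(5, 3))   # weights between hidden and output units

def forward(x):
    # Hidden-unit activity depends on the input activities and the
    # input-to-hidden weights, passed through a non-linearity.
    hidden = np.tanh(x @ W_hidden)
    # Output-unit behaviour depends on the hidden activities and the
    # hidden-to-output weights.
    return hidden @ W_output

print(forward(np.array([1.0, 0.5, -0.2, 0.3])))
```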


ADVANTAGES OF ARTIFICIAL NEURAL NETWORK

Neural networks can provide highly accurate and robust solutions for complex non-linear tasks, such as fraud detection, business lapse/churn analysis, risk analysis and data-mining.

✏ A neural network can perform tasks that a linear program cannot.

✏ When an element of the neural network fails, the network can continue without any problem because of its parallel nature.

✏ A neural network learns and does not need to be reprogrammed.

✏ It can be implemented in almost any application without major difficulty.

APPLICATIONS OF ARTIFICIAL NEURAL NETWORK

Speech recognition

🔹 The voice Web requires a voice recognition and authentication system incorporating a reliable speech recognition technique for secure information access on the Internet. In line with this requirement, we investigate the applicability of artificial neural networks to speech recognition.

🔹 In our experiment, a total of 200 vowel signals from individuals of different genders and races were recorded.

🔹 The filtering process was performed using the wavelet approach to de-noise and compress the speech signals.

🔹 An artificial neural network, specifically the probabilistic neural network (PNN) model, was then employed to recognize and classify the vowel signals into their respective categories. A series of parameter settings for the PNN model was investigated and the results obtained were analyzed and discussed.
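The experiment’s actual data and settings aren’t reproduced here, but the core idea of a probabilistic neural network is simple enough to sketch. In the toy example below, the “vowel features” are made-up two-dimensional points standing in for the wavelet-compressed signals; each training sample contributes a Gaussian kernel, and the class with the strongest summed response wins.

```python
import numpy as np

def pnn_classify(x, train_X, train_y, sigma=0.5):
    """Probabilistic neural network: each training sample contributes a
    Gaussian kernel; the class with the highest average response wins."""
    scores = {}
    for label in np.unique(train_y):
        samples = train_X[train_y == label]
        dists = np.sum((samples - x) ** 2, axis=1)
        scores[label] = np.mean(np.exp(-dists / (2 * sigma ** 2)))
    return max(scores, key=scores.get)

# Hypothetical stand-ins for wavelet-compressed vowel features (2-D here).
rng = np.random.default_rng(1)
train_X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(2, 0.3, (20, 2))])
train_y = np.array([0] * 20 + [1] * 20)   # two vowel classes
print(pnn_classify(np.array([1.9, 2.1]), train_X, train_y))  # -> class 1
```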

Face Recognition

Face recognition is a visual pattern recognition problem. In detail, a face recognition system takes an arbitrary image as input and searches a database to output the identity of the people in that image. A face recognition system generally consists of four modules:

✏ detection,

✏ alignment,

✏ feature extraction,

✏ and matching,

where localization and normalization (face detection and alignment) are processing steps before face recognition (facial feature extraction and matching) is performed.

The face is usually further normalized with respect to photometrical properties such as illumination and gray scale. After a face is normalized geometrically and photometrically, feature extraction is performed to provide effective information that is useful for distinguishing between faces of different persons and stable with respect to the geometrical and photometrical variations.
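Of the four modules, the matching step is the easiest to show in isolation. The sketch below assumes detection, alignment and feature extraction have already produced a fixed-length feature vector; the identities and embeddings are hypothetical, and the comparison is a simple cosine similarity against an enrolled database.

```python
import numpy as np

def match_face(query_embedding, database):
    """Matching module: compare the feature vector extracted from a
    normalized face against enrolled identities via cosine similarity."""
    best_id, best_score = None, -1.0
    for identity, emb in database.items():
        score = np.dot(query_embedding, emb) / (
            np.linalg.norm(query_embedding) * np.linalg.norm(emb))
        if score > best_score:
            best_id, best_score = identity, score
    return best_id, best_score

# Hypothetical 4-D embeddings standing in for real feature-extractor output.
db = {"alice": np.array([0.9, 0.1, 0.0, 0.2]),
      "bob":   np.array([0.1, 0.8, 0.3, 0.0])}
print(match_face(np.array([0.85, 0.15, 0.05, 0.25]), db))  # ('alice', ...)
```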

USE-CASES OF NEURAL NETWORKS

Right now, deep learning is the cutting-edge area of artificial intelligence. While computers have traditionally been very fast, they have not been that clever, at least not compared to a human brain. Where computers have lagged behind us humans is in their inability to learn from errors, and in the fact that they needed a human to give precise instructions on what to do, when and how.

Deep learning, the technology behind speech and image recognition, has changed all that and is already being used in many applications today. Here’s a look at how one of the pioneers of deep learning, Google, is using it across many products and services to boost its customer offerings and business performance.

How Google uses Deep Learning in practice

🔹 Google first explored the possibilities of deep learning back in 2011, with the Google Brain project. The next year, the company announced that it had managed to build a neural network that could simulate human cognitive processes. Running on an incredible 16,000 processors, the system studied 10 million images and learned how to identify a cat in a photo. It may not sound that exciting, but it was a big leap in deep learning.

🔹 Google signalled how important deep learning was to its business with its 2014 acquisition of DeepMind, a UK deep learning specialist. The startup’s pioneering work involved connecting cutting-edge neuroscience research to machine learning techniques, resulting in systems that acted more like “real” intelligence (i.e. the human brain).

🔹 Google’s first practical use of deep learning was in image recognition, where it was used to sort through millions of internet images and accurately classify them according to what was in them, in order to return better search results. Now, Google’s use of deep learning in image analytics has extended to image enhancement. The systems can fill in or restore missing details in images, simply by learning from what’s already there in the image, as well as what it’s learned from other, similar images.

🔹 In video analytics, Google Cloud Video Intelligence has opened up video analytic technology to a much wider audience, allowing video stored on Google’s servers to be analysed for context and content. This enables the generation of automated summaries, or even security alerts when the AI detects something fishy going on in a video.

🔹 Language processing is the second area where Google has implemented deep learning. Google Assistant’s speech recognition AI uses deep learning to understand spoken commands and questions, thanks to techniques developed by the Google Brain project. Google’s translation tool now also comes under the Google Brain umbrella, and operates in a deep learning environment.

🔹 The third key use of deep learning at Google is to provide better video recommendations on YouTube, by studying viewers’ habits and preferences as they stream content, and working out what would keep them tuned in. Google already knew from the data that suggesting videos that viewers might want to watch next would keep them hooked, and keep those advertising dollars rolling in. Google Brain is once again the, well, brains behind this technology.

🔹 In other areas of the business, Waymo, Google’s self-driving car division, uses deep learning algorithms in its autonomous systems, to enable self-driving cars to get better at analysing what’s going on around them, and react accordingly. And DeepMind is, at present, working on projects in the healthcare field, using deep learning techniques to spot early signs of cancerous tissue growth and eye damage.

Apple

Apple started using deep learning for face detection in iOS 10. With the release of the Vision framework, developers can now use this technology and many other computer vision algorithms in their apps.

🔹 Apple first released face detection in a public API in the Core Image framework through the CIDetector class. This API was also used internally by Apple apps, such as Photos. The earliest release of CIDetector used a method based on the Viola-Jones detection algorithm. We based subsequent improvements to CIDetector on advances in traditional computer vision.

🔹 With the advent of deep learning, and its application to computer vision problems, the state-of-the-art in face detection accuracy took an enormous leap forward. We had to completely rethink our approach so that we could take advantage of this paradigm shift. Compared to traditional computer vision, the learned models of deep learning require orders of magnitude more memory, disk storage and computational resources.

🔹 As capable as today’s mobile phones are, the typical high-end mobile phone was not a viable platform for deep-learning vision models. Most of the industry got around this problem by providing deep-learning solutions through a cloud-based API. In a cloud-based solution, images are sent to a server for analysis using deep learning inference to detect faces. Cloud-based services typically use powerful desktop-class GPUs with large amounts of memory available.

🔹 Apple’s iCloud Photo Library is a cloud-based solution for photo and video storage. However, due to Apple’s strong commitment to user privacy, we couldn’t use iCloud servers for computer vision computations. Every photo and video sent to iCloud Photo Library is encrypted on the device before it is sent to cloud storage, and can only be decrypted by devices that are registered with the iCloud account. Therefore, to bring deep learning based computer vision solutions to our customers, we had to address directly the challenges of getting deep learning algorithms running on iPhone.

🔹 Apple faced several challenges. The deep-learning models need to be shipped as part of the operating system, taking up valuable NAND storage space. They also need to be loaded into RAM and require significant computational time on the GPU and/or CPU. Unlike cloud-based services, whose resources can be dedicated solely to a vision problem, on-device computation must take place while sharing these system resources with other running applications. Finally, the computation must be efficient enough to process a large Photos library in a reasonably short amount of time, but without significant power usage or thermal increase.

Use-Case

We experimented with several ways of training such a network. For example, a simple procedure for training is to create a large dataset of image tiles of a fixed size corresponding to the smallest valid input to the network such that each tile produces a single output from the network.

The training dataset is ideally balanced, so that half of the tiles contain a face (positive class) and the other half do not contain a face (negative class). For each positive tile, we provide the true location (x, y, w, h) of the face.

We train the network to optimize the multitask objective described previously. Once trained, the network is able to predict whether a tile contains a face, and if so, it also provides the coordinates and scale of the face in the tile.
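Apple’s exact objective isn’t spelled out here, but a common way to combine the two tasks is a classification term for face / no-face plus a box-regression term that only applies to positive tiles. A rough, assumed sketch:

```python
import numpy as np

def multitask_loss(face_logit, box_pred, is_face, box_true, box_weight=1.0):
    """Illustrative multitask objective: cross-entropy for face / no-face,
    plus a box-regression penalty that only counts for positive tiles."""
    p = 1.0 / (1.0 + np.exp(-face_logit))        # predicted face probability
    cls_loss = -(is_face * np.log(p) + (1 - is_face) * np.log(1 - p))
    reg_loss = is_face * np.sum((box_pred - box_true) ** 2)  # (x, y, w, h) error
    return cls_loss + box_weight * reg_loss

# Positive tile with a true face at (x, y, w, h) = (0.4, 0.5, 0.2, 0.3).
print(multitask_loss(2.0, np.array([0.42, 0.48, 0.22, 0.28]),
                     1, np.array([0.4, 0.5, 0.2, 0.3])))
```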

Since the network is fully convolutional, it can efficiently process an arbitrarily sized image and produce a 2D output map. Each point on the map corresponds to a tile in the input image and contains the prediction from the network regarding the presence or absence of a face in that tile and its location/scale within the input tile (see inputs and outputs of DCN in Figure 1).

Given such a network, we could then build a fairly standard processing pipeline to perform face detection, consisting of a multi-scale image pyramid, the face detector network, and a post-processing module. We needed a multi-scale pyramid to handle faces across a wide range of sizes. We apply the network to each level of the pyramid and candidate detections are collected from each level. (See Figure 2.) The post-processing module then combines these candidate detections across scales to produce a list of bounding boxes that correspond to the network’s final prediction of the faces in the image.
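As a rough illustration of that pipeline (not Apple’s actual code), the sketch below runs a hypothetical detector callable over a crude image pyramid, maps each candidate back to original-image coordinates, and keeps the top-scoring ones; a real post-processing module would merge overlapping boxes with non-maximum suppression.

```python
import numpy as np

def detect_faces(image, detector, scales=(1.0, 0.5, 0.25)):
    """Pipeline sketch: run the detector over an image pyramid, collect
    candidate boxes from every level, then keep the best ones."""
    candidates = []
    for s in scales:
        step = int(round(1 / s))
        level = image[::step, ::step]      # crude stand-in for real resampling
        for (x, y, w, h, score) in detector(level):
            # Map the candidate box back to original-image coordinates.
            candidates.append((x * step, y * step, w * step, h * step, score))
    # Post-processing placeholder: sort by score and keep the top ten.
    return sorted(candidates, key=lambda c: c[-1], reverse=True)[:10]

# Hypothetical detector that returns one dummy candidate per pyramid level.
dummy_detector = lambda level: [(10, 10, 32, 32, float(level.mean()))]
print(detect_faces(np.ones((128, 128)), dummy_detector))
```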

This strategy brought us closer to running a deep convolutional network on device to exhaustively scan an image. But network complexity and size remained key bottlenecks to performance. Overcoming this challenge meant not only limiting the network to a simple topology, but also restricting the number of layers of the network, the number of channels per layer, and the kernel size of the convolutional filters. These restrictions raised a crucial problem: our networks that were producing acceptable accuracy were anything but simple, most going over 20 layers and consisting of several network-in-network modules.

We decided to leverage an approach, informally called “teacher-student” training. This approach provided us a mechanism to train a second thin-and-deep network (the “student”), in such a way that it matched very closely the outputs of the big complex network (the “teacher”) that we had trained as described previously.
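The article doesn’t give the exact training recipe, but a common formulation of teacher-student (knowledge distillation) training penalises the student both for missing the ground truth and for drifting away from the teacher’s outputs. A minimal sketch under that assumption:

```python
import numpy as np

def distillation_loss(student_out, teacher_out, target, alpha=0.5):
    """Teacher-student sketch: the thin student network is trained to match
    the big teacher's outputs as well as the ground-truth labels."""
    match_teacher = np.mean((student_out - teacher_out) ** 2)
    match_truth = np.mean((student_out - target) ** 2)
    return alpha * match_teacher + (1 - alpha) * match_truth

teacher_out = np.array([0.9, 0.1, 0.8])   # big, accurate network's predictions
student_out = np.array([0.7, 0.2, 0.6])   # thin-and-deep student's predictions
target = np.array([1.0, 0.0, 1.0])        # ground-truth labels
print(distillation_loss(student_out, teacher_out, target))
```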

RESULT

Now, finally, Apple had an algorithm for a deep neural network for face detection that was feasible for on-device execution. They iterated through several rounds of training to obtain a network model that was accurate enough to enable the desired applications. While this network was accurate and feasible, a tremendous amount of work still remained to make it practical for deploying on millions of user devices.

I hope you found this article useful.

If you have any suggestions, feel free to connect with me on LinkedIn; here is the link to my LinkedIn profile.

Thank you everyone for reading!
