IU Indianapolis ScholarWorks :: Browsing by Subject "Deep Learning"

Browsing by Subject "Deep Learning"

Now showing 1 - 10 of 17

3-D Scene Reconstruction for Passive Ranging Using Depth from Defocus and Deep Learning
(2019-08) Emerson, David R.; Christopher, Lauren A.; Ben Miled, Zina; King, Brian; Salama, Paul
Depth estimation is increasingly becoming more important in computer vision. The requirement for autonomous systems to gauge their surroundings is of the utmost importance in order to avoid obstacles, preventing damage to itself and/or other systems or people. Depth measuring/estimation systems that use multiple cameras from multiple views can be expensive and extremely complex. And as these autonomous systems decrease in size and available power, the supporting sensors required to estimate depth must also shrink in size and power consumption. This research will concentrate on a single passive method known as Depth from Defocus (DfD), which uses an in-focus and out-of-focus image to infer the depth of objects in a scene. The major contribution of this research is the introduction of a new Deep Learning (DL) architecture to process the the in-focus and out-of-focus images to produce a depth map for the scene improving both speed and performance over a range of lighting conditions. Compared to the previous state-of-the-art multi-label graph cuts algorithms applied to the synthetically blurred dataset the DfD-Net produced a 34.30% improvement in the average Normalized Root Mean Square Error (NRMSE). Similarly the DfD-Net architecture produced a 76.69% improvement in the average Normalized Mean Absolute Error (NMAE). Only the Structural Similarity Index (SSIM) had a small average decrease of 2.68% when compared to the graph cuts algorithm. This slight reduction in the SSIM value is a result of the SSIM metric penalizing images that appear to be noisy. In some instances the DfD-Net output is mottled, which is interpreted as noise by the SSIM metric. This research introduces two methods of deep learning architecture optimization. The first method employs the use of a variant of the Particle Swarm Optimization (PSO) algorithm to improve the performance of the DfD-Net architecture. The PSO algorithm was able to find a combination of the number of convolutional filters, the size of the filters, the activation layers used, the use of a batch normalization layer between filters and the size of the input image used during training to produce a network architecture that resulted in an average NRMSE that was approximately 6.25% better than the baseline DfD-Net average NRMSE. This optimized architecture also resulted in an average NMAE that was 5.25% better than the baseline DfD-Net average NMAE. Only the SSIM metric did not see a gain in performance, dropping by 0.26% when compared to the baseline DfD-Net average SSIM value. The second method illustrates the use of a Self Organizing Map clustering method to reduce the number convolutional filters in the DfD-Net to reduce the overall run time of the architecture while still retaining the network performance exhibited prior to the reduction. This method produces a reduced DfD-Net architecture that has a run time decrease of between 14.91% and 44.85% depending on the hardware architecture that is running the network. The final reduced DfD-Net resulted in a network architecture that had an overall decrease in the average NRMSE value of approximately 3.4% when compared to the baseline, unaltered DfD-Net, mean NRMSE value. The NMAE and the SSIM results for the reduced architecture were 0.65% and 0.13% below the baseline results respectively. This illustrates that reducing the network architecture complexity does not necessarily reduce the reduction in performance. Finally, this research introduced a new, real world dataset that was captured using a camera and a voltage controlled microfluidic lens to capture the visual data and a 2-D scanning LIDAR to capture the ground truth data. The visual data consists of images captured at seven different exposure times and 17 discrete voltage steps per exposure time. The objects in this dataset were divided into four repeating scene patterns in which the same surfaces were used. These scenes were located between 1.5 and 2.5 meters from the camera and LIDAR. This was done so any of the deep learning algorithms tested would see the same texture at multiple depths and multiple blurs. The DfD-Net architecture was employed in two separate tests using the real world dataset. The first test was the synthetic blurring of the real world dataset and assessing the performance of the DfD-Net trained on the Middlebury dataset. The results of the real world dataset for the scenes that were between 1.5 and 2.2 meters from the camera the DfD-Net trained on the Middlebury dataset produced an average NRMSE, NMAE and SSIM value that exceeded the test results of the DfD-Net tested on the Middlebury test set. The second test conducted was the training and testing solely on the real world dataset. Analysis of the camera and lens behavior led to an optimal lens voltage step configuration of 141 and 129. Using this configuration, training the DfD-Net resulted in an average NRMSE, NMAE and SSIM of 0.0660, 0.0517 and 0.8028 with a standard deviation of 0.0173, 0.0186 and 0.0641 respectively.
A-MnasNet and Image Classification on NXP Bluebox 2.0
(ASTES, 2021-01) Shah, Prasham; El-Sharkawy, Mohamed; Electrical and Computer Engineering, School of Engineering and Technology
Computer Vision is a domain which deals with the challenge of enabling technology with vision capabilities. This goal is accomplished with the use of Convolutional Neural Networks. They are the backbone of implementing vision applications on embedded systems. They are complex but highly efficient in extracting features, thus, enabling embedded systems to perform computer vision applications. After AlexNet won the ImageNet Large Scale Visual Recognition Challenge in 2012, there was a drastic increase in research on Convolutional Neural Networks. The convolutional neural networks were made deeper and wider, in order to make them more efficient. They were able to extract features efficiently, but the computational complexity and the computational cost of those networks also increased. It became very challenging to deploy such networks on embedded hardware. Since embedded systems have limited resources like power, speed and computational capabilities, researchers got more inclined towards the goal of making convolutional neural networks more compact, with efficiency of extracting features similar to that of the novel architectures. This research has a similar goal of proposing a convolutional neural network with enhanced efficiency and further using it for a vision application like Image Classification on NXP Bluebox 2.0, an autonomous driving platform by NXP Semiconductors. This paper gives an insight on the Design Space Exploration technique used to propose A-MnasNet (Augmented MnasNet) architecture, with enhanced capabilities, from MnasNet architecture. Furthermore, it explains the implementation of A-MnasNet on Bluebox 2.0 for Image Classification.
Adversaria Attacks and Defense Mechanisms to Improve Robustness of Deep Temporal Point Processes
(2022-08) Khorshidi, Samira; Mohler, George; Al Hasan, Mohammad; Raje, Rajeev; Durresi, Arjan
Temporal point processes (TPP) are mathematical approaches for modeling asynchronous event sequences by considering the temporal dependency of each event on past events and its instantaneous rate. Temporal point processes can model various problems, from earthquake aftershocks, trade orders, gang violence, and reported crime patterns, to network analysis, infectious disease transmissions, and virus spread forecasting. In each of these cases, the entity’s behavior with the corresponding information is noted over time as an asynchronous event sequence, and the analysis is done using temporal point processes, which provides a means to define the generative mechanism of the sequence of events and ultimately predict events and investigate causality. Among point processes, Hawkes process as a stochastic point process is able to model a wide range of contagious and self-exciting patterns. One of Hawkes process’s well-known applications is predicting the evolution of viral processes on networks, which is an important problem in biology, the social sciences, and the study of the Internet. In existing works, mean-field analysis based upon degree distribution is used to predict viral spreading across networks of different types. However, it has been shown that degree distribution alone fails to predict the behavior of viruses on some real-world networks. Recent attempts have been made to use assortativity to address this shortcoming. This thesis illustrates how the evolution of such a viral process is sensitive to the underlying network’s structure. In Chapter 3 , we show that adding assortativity does not fully explain the variance in the spread of viruses for a number of real-world networks. We propose using the graphlet frequency distribution combined with assortativity to explain variations in the evolution of viral processes across networks with identical degree distribution. Using a data-driven approach, by coupling predictive modeling with viral process simulation on real-world networks, we show that simple regression models based on graphlet frequency distribution can explain over 95% of the variance in virality on networks with the same degree distribution but different network topologies. Our results highlight the importance of graphlets and identify a small collection of graphlets that may have the most significant influence over the viral processes on a network. Due to the flexibility and expressiveness of deep learning techniques, several neural network-based approaches have recently shown promise for modeling point process intensities. However, there is a lack of research on the possible adversarial attacks and the robustness of such models regarding adversarial attacks and natural shocks to systems. Furthermore, while neural point processes may outperform simpler parametric models on in-sample tests, how these models perform when encountering adversarial examples or sharp non-stationary trends remains unknown. In Chapter 4 , we propose several white-box and black-box adversarial attacks against deep temporal point processes. Additionally, we investigate the transferability of whitebox adversarial attacks against point processes modeled by deep neural networks, which are considered a more elevated risk. Extensive experiments confirm that neural point processes are vulnerable to adversarial attacks. Such a vulnerability is illustrated both in terms of predictive metrics and the effect of attacks on the underlying point process’s parameters. Expressly, adversarial attacks successfully transform the temporal Hawkes process regime from sub-critical to into a super-critical and manipulate the modeled parameters that is considered a risk against parametric modeling approaches. Additionally, we evaluate the vulnerability and performance of these models in the presence of non-stationary abrupt changes, using the crimes and Covid-19 pandemic dataset as an example. Considering the security vulnerability of deep-learning models, including deep temporal point processes, to adversarial attacks, it is essential to ensure the robustness of the deployed algorithms that is despite the success of deep learning techniques in modeling temporal point processes. In Chapter 5 , we study the robustness of deep temporal point processes against several proposed adversarial attacks from the adversarial defense viewpoint. Specifically, we investigate the effectiveness of adversarial training using universal adversarial samples in improving the robustness of the deep point processes. Additionally, we propose a general point process domain-adopted (GPDA) regularization, which is strictly applicable to temporal point processes, to reduce the effect of adversarial attacks and acquire an empirically robust model. In this approach, unlike other computationally expensive approaches, there is no need for additional back-propagation in the training step, and no further network isrequired. Ultimately, we propose an adversarial detection framework that has been trained in the Generative Adversarial Network (GAN) manner and solely on clean training data. Finally, in Chapter 6 , we discuss implications of the research and future research directions.
AI on the Edge with CondenseNeXt: An Efficient Deep Neural Network for Devices with Constrained Computational Resources
(2021-08) Kalgaonkar, Priyank B.; El-Sharkawy, Mohamed A.; King, Brian S.; Rizkalla, Maher E.
Research work presented within this thesis propose a neoteric variant of deep convolutional neural network architecture, CondenseNeXt, designed specifically for ARM-based embedded computing platforms with constrained computational resources. CondenseNeXt is an improved version of CondenseNet, the baseline architecture whose roots can be traced back to ResNet. CondeseNeXt replaces group convolutions in CondenseNet with depthwise separable convolutions and introduces group-wise pruning, a model compression technique, to prune (remove) redundant and insignificant elements that either are irrelevant or do not affect performance of the network upon disposition. Cardinality, a new dimension to the existing spatial dimensions, and class-balanced focal loss function, a weighting factor inversely proportional to the number of samples, has been incorporated in order to relieve the harsh effects of pruning, into the design of CondenseNeXt’s algorithm. Furthermore, extensive analyses of this novel CNN architecture was performed on three benchmarking image datasets: CIFAR-10, CIFAR-100 and ImageNet by deploying the trained weight on to an ARM-based embedded computing platform: NXP BlueBox 2.0, for real-time image classification. The outputs are observed in real-time in RTMaps Remote Studio’s console to verify the correctness of classes being predicted. CondenseNeXt achieves state-of-the-art image classification performance on three benchmark datasets including CIFAR-10 (4.79% top-1 error), CIFAR-100 (21.98% top-1 error) and ImageNet (7.91% single model, single crop top-5 error), and up to 59.98% reduction in forward FLOPs compared to CondenseNet. CondenseNeXt can also achieve a final trained model size of 2.9 MB, however at the cost of 2.26% in accuracy loss. Thus, performing image classification on ARM-Based computing platforms without requiring a CUDA enabled GPU support, with outstanding efficiency.
Deep Learning with Go
(2020-05) Stinson, Derek L.; Ben Miled, Zina; King, Brian; Rizkalla, Maher
Current research in deep learning is primarily focused on using Python as a support language. Go, an emerging language, that has many benefits including native support for concurrency has seen a rise in adoption over the past few years. However, this language is not widely used to develop learning models due to the lack of supporting libraries and frameworks for model development. In this thesis, the use of Go for the development of neural network models in general and convolution neural networks is explored. The proposed study is based on a Go-CUDA implementation of neural network models called GoCuNets. This implementation is then compared to a Go-CPU deep learning implementation that takes advantage of Go's built in concurrency called ConvNetGo. A comparison of these two implementations shows a significant performance gain when using GoCuNets compared to ConvNetGo.
Design Space Exploration of Convolutional Neural Networks for Image Classification
(2020-12) Shah, Prasham; Mohamed, El-Sharkawy; King, Brian; Rizkalla, Maher
Computer vision is a domain which deals with the goal of making technology as efficient as human vision. To achieve that goal, after decades of research, researchers have developed algorithms that are able to work efficiently on resource constrained hardware like mobile or embedded devices for computer vision applications. Due to their constant efforts, such devices have become capable for tasks like Image Classification, Object Detection, Object Recognition, Semantic Segmentation, and many other applications. Autonomous systems like self-driving cars, Drones and UAVs, are being successfully developed because of these advances in AI. Deep Learning, a part of AI, is a specific domain of Machine Learning which focuses on developing algorithms for such applications. Deep Learning deals with tasks like extracting features from raw image data, replacing pipelines of specialized models with single end-to-end models, making models usable for multiple tasks with superior performance. A major focus is on techniques to detect and extract features which provide better context for inference about an image or video stream. A deep hierarchy of rich features can be learned and automatically extracted from images, provided by the multiple deep layers of CNN models. CNNs are the backbone of Computer Vision. The reason that CNNs are the focus of attention for deep learning models is that they were specifically designed for image data. They are complicated but very effective in extracting features from an image or a video stream. After AlexNet won the ILSVRC in 2012, there was a drastic increase in research related with CNNs. Many state-of-the-art architectures like VGG Net, GoogleNet, ResNet, Inception-v4, Inception-Resnet-v2, ShuffleNet, Xception, MobileNet, MobileNetV2, SqueezeNet, SqueezeNext and many more were introduced. The trend behind the research depicts an increase in the number of layers of CNN to make them more efficient but with that, the size of the model increased as well. This problem was fixed with the advent of new algorithms which resulted in a decrease in model size. As a result, today we have CNN models, which are implemented on mobile devices. These mobile models are compact and have low latency, which in turn reduces the computational cost of the embedded system. This thesis resembles similar idea, it proposes two new CNN architectures, A-MnasNet and R-MnasNet, which have been derived from MnasNet by Design Space Exploration. These architectures outperform MnasNet in terms of model size and accuracy. They have been trained and tested on CIFAR-10 dataset. Furthermore, they were implemented on NXP Bluebox 2.0, an autonomous driving platform, for Image Classification.
Design Space Exploration of Convolutional Neural Networks for Image Classification
(2020-12) Shah, Prasham; Mohamed, El-Sharkawy; King, Brian; Rizkalla, Maher
Computer vision is a domain which deals with the goal of making technology as efficient as human vision. To achieve that goal, after decades of research, researchers have developed algorithms that are able to work efficiently on resource constrained hardware like mobile or embedded devices for computer vision applications. Due to their constant efforts, such devices have become capable for tasks like Image Classification, Object Detection, Object Recognition, Semantic Segmentation, and many other applications. Autonomous systems like self-driving cars, Drones and UAVs, are being successfully developed because of these advances in AI. Deep Learning, a part of AI, is a specific domain of Machine Learning which focuses on developing algorithms for such applications. Deep Learning deals with tasks like extracting features from raw image data, replacing pipelines of specialized models with single end-to-end models, making models usable for multiple tasks with superior performance. A major focus is on techniques to detect and extract features which provide better context for inference about an image or video stream. A deep hierarchy of rich features can be learned and automatically extracted from images, provided by the multiple deep layers of CNN models. CNNs are the backbone of Computer Vision. The reason that CNNs are the focus of attention for deep learning models is that they were specifically designed for image data. They are complicated but very effective in extracting features from an image or a video stream. After AlexNet won the ILSVRC in 2012, there was a drastic increase in research related with CNNs. Many state-of-the-art architectures like VGG Net, GoogleNet, ResNet, Inception-v4, Inception-Resnet-v2, ShuffleNet, Xception, MobileNet, MobileNetV2, SqueezeNet, SqueezeNext and many more were introduced. The trend behind the research depicts an increase in the number of layers of CNN to make them more efficient but with that, the size of the model increased as well. This problem was fixed with the advent of new algorithms which resulted in a decrease in model size. As a result, today we have CNN models, which are implemented on mobile devices. These mobile models are compact and have low latency, which in turn reduces the computational cost of the embedded system. This thesis resembles similar idea, it proposes two new CNN architectures, A-MnasNet and R-MnasNet, which have been derived from MnasNet by Design Space Exploration. These architectures outperform MnasNet in terms of model size and accuracy. They have been trained and tested on CIFAR-10 dataset. Furthermore, they were implemented on NXP Bluebox 2.0, an autonomous driving platform, for Image Classification.
Developing Bottom-Up, Integrated Omics Methodologies for Big Data Biomarker Discovery
(2020-11) Kechavarzi, Bobak David; Wu, Huanmei; Doman, Thompson; Dow, Ernst; Liu, Yunlong; Liu, Xiaowen; Yan, Jingwen
The availability of highly-distributed computing compliments the proliferation of next generation sequencing (NGS) and genome-wide association studies (GWAS) datasets. These data sets are often complex, poorly annotated or require complex domain knowledge to sensibly manage. These novel datasets provide a rare, multi-dimensional omics (proteomics, transcriptomics, and genomics) view of a single sample or patient. Previously, biologists assumed a strict adherence to the central dogma: replication, transcription and translation. Recent studies in genomics and proteomics emphasize that this is not the case. We must employ big-data methodologies to not only understand the biogenesis of these molecules, but also their disruption in disease states. The Cancer Genome Atlas (TCGA) provides high-dimensional patient data and illustrates the trends that occur in expression profiles and their alteration in many complex disease states. I will ultimately create a bottom-up multi-omics approach to observe biological systems using big data techniques. I hypothesize that big data and systems biology approaches can be applied to public datasets to identify important subsets of genes in cancer phenotypes. By exploring these signatures, we can better understand the role of amplification and transcript alterations in cancer.
Evaluating the Implementation of Deep Learning in LibreHealth Radiology on Chest X-Rays
(Springer, 2019-04-25) Purkayastha, Saptarshi; Buddi, Surendra Babu; Yadav, Bhawana; Nuthakki, Siddhartha; Gichoya, Judy W.
Respiratory diseases are the dominant cause of deaths worldwide. In the US, the number of deaths due to chronic lung infections (mostly pneumonia and tuberculosis), lung cancer and chronic obstructive pulmonary disease has increased. Timely and accurate diagnosis of the disease is highly imperative to diminish the deaths. Chest X-ray is a vital diagnostic tool used for diagnosing lung diseases. Delay in X-Ray diagnosis is run-of-the-mill milieu and the reasons for the impediment are mostly because the X-ray reports are arduous to interpret, due to the complex visual contents of radiographs containing superimposed anatomical structures. A shortage of trained radiologists is another cause of increased workload and thus delay. We integrated CheXNet, a neural network algorithm into the LibreHealth Radiology Information System, which allows physicians to upload Chest X-rays and identify diagnosis probabilities. The uploaded images are evaluated from labels for 14 thoracic diseases. The turnaround time for each evaluation is about 30 seconds, which does not affect clinical workflow. A Python Flask application hosted web service is used to upload radiographs into a GPU server containing the algorithm. Thus, the use of this system is not limited to clients having their GPU server, but instead, we provide a web service. To evaluate the model, we randomly split the dataset into training (70%), validation (10%) and test (20%) sets. With over 86% accuracy and turnaround time under 30 seconds, the application demonstrates the feasibility of a web service for machine learning based diagnosis of 14-lung pathologies from Chest X-rays.
HBONext: An Efficient Dnn for Light Edge Embedded Devices
(2021-05) Joshi, Sanket Ramesh; El-Sharkawy, Mohamed; King, Brian; Rizkalla, Maher
Every year the most effective Deep learning models, CNN architectures are showcased based on their compatibility and performance on the embedded edge hardware, especially for applications like image classification. These deep learning models necessitate a significant amount of computation and memory, so they can only be used on high-performance computing systems like CPUs or GPUs. However, they often struggle to fulfill portable specifications due to resource, energy, and real-time constraints. Hardware accelerators have recently been designed to provide the computational resources that AI and machine learning tools need. These edge accelerators have high-performance hardware which helps maintain the precision needed to accomplish this mission. Furthermore, this classification dilemma that investigates channel interdependencies using either depth-wise or group-wise convolutional features, has benefited from the inclusion of Bottleneck modules. Because of its increasing use in portable applications, the classic inverted residual block, a well-known architecture technique, has gotten more recognition. This work takes it a step forward by introducing a design method for porting CNNs to lowresource embedded systems, essentially bridging the difference between deep learning models and embedded edge systems. To achieve these goals, we use closer computing strategies to reduce the computer’s computational load and memory usage while retaining excellent deployment efficiency. This thesis work introduces HBONext, a mutated version of Harmonious Bottlenecks (DHbneck) combined with a Flipped version of Inverted Residual (FIR), which outperforms the current HBONet architecture in terms of accuracy and model size miniaturization. Unlike the current definition of inverted residual, this FIR block performs identity mapping and spatial transformation at its higher dimensions. The HBO solution, on the other hand, focuses on two orthogonal dimensions: spatial (H/W) contraction-expansion and later channel (C) expansion-contraction, which are both organized in a bilaterally symmetric manner. HBONext is one of those versions that was designed specifically for embedded and mobile applications. In this research work, we also show how to use NXP Bluebox 2.0 to build a real-time HBONext image classifier. The integration of the model into this hardware has been a big hit owing to the limited model size of 3 MB. The model was trained and validated using CIFAR10 dataset, which performed exceptionally well due to its smaller size and higher accuracy. The validation accuracy of the baseline HBONet architecture is 80.97%, and the model is 22 MB in size. The proposed architecture HBONext variants, on the other hand, gave a higher validation accuracy of 89.70% and a model size of 3.00 MB measured using the number of parameters. The performance metrics of HBONext architecture and its various variants are compared in the following chapters.

Browsing by Subject "Deep Learning"

Results Per Page

Sort Options