
Deep Learning with Theano (PDF)

284 pages · 2017 · 4.894 MB · English

Preview Deep Learning with Theano

Deep Learning with Theano
Build the artificial brain of the future, today
Christopher Bourez

BIRMINGHAM - MUMBAI

Copyright © 2017 Packt Publishing
First published: July 2017
Production reference: 1280717
Published by Packt Publishing Ltd., Livery Place, 35 Livery Street, Birmingham B3 2PB, UK.
ISBN 978-1-78646-582-5
www.packtpub.com

Contents

Preface

Chapter 1: Theano Basics
  The need for tensors
  Installing and loading Theano
  Conda package and environment manager
  Installing and running Theano on CPU
  GPU drivers and libraries
  Installing and running Theano on GPU
  Tensors
  Graphs and symbolic computing
  Operations on tensors
  Dimension manipulation operators
  Elementwise operators
  Reduction operators
  Linear algebra operators
  Memory and variables
  Functions and automatic differentiation
  Loops in symbolic computing
  Configuration, profiling and debugging
  Summary

Chapter 2: Classifying Handwritten Digits with a Feedforward Network
  The MNIST dataset
  Structure of a training program
  Classification loss function
  Single-layer linear model
  Cost function and errors
  Backpropagation and stochastic gradient descent
  Multiple layer model
  Convolutions and max layers
  Training
  Dropout
  Inference
  Optimization and other update rules
  Related articles
  Summary

Chapter 3: Encoding Word into Vector
  Encoding and embedding
  Dataset
  Continuous Bag of Words model
  Training the model
  Visualizing the learned embeddings
  Evaluating embeddings – analogical reasoning
  Evaluating embeddings – quantitative analysis
  Application of word embeddings
  Weight tying
  Further reading
  Summary

Chapter 4: Generating Text with a Recurrent Neural Net
  Need for RNN
  A dataset for natural language
  Simple recurrent network
  LSTM network
  Gated recurrent network
  Metrics for natural language performance
  Training loss comparison
  Example of predictions
  Applications of RNN
  Related articles
  Summary

Chapter 5: Analyzing Sentiment with a Bidirectional LSTM
  Installing and configuring Keras
  Programming with Keras
  SemEval 2013 dataset
  Preprocessing text data
  Designing the architecture for the model
  Vector representations of words
  Sentence representation using bi-LSTM
  Outputting probabilities with the softmax classifier
  Compiling and training the model
  Evaluating the model
  Saving and loading the model
  Running the example
  Further reading
  Summary

Chapter 6: Locating with Spatial Transformer Networks
  MNIST CNN model with Lasagne
  A localization network
  Recurrent neural net applied to images
  Unsupervised learning with co-localization
  Region-based localization networks
  Further reading
  Summary

Chapter 7: Classifying Images with Residual Networks
  Natural image datasets
  Batch normalization
  Global average pooling
  Residual connections
  Stochastic depth
  Dense connections
  Multi-GPU
  Data augmentation
  Further reading
  Summary

Chapter 8: Translating and Explaining with Encoding-Decoding Networks
  Sequence-to-sequence networks for natural language processing
  Seq2seq for translation
  Seq2seq for chatbots
  Improving efficiency of sequence-to-sequence networks
  Deconvolutions for images
  Multimodal deep learning
  Further reading
  Summary

Chapter 9: Selecting Relevant Inputs or Memories with the Mechanism of Attention
  Differentiable mechanism of attention
  Better translations with the attention mechanism
  Better annotation of images with the attention mechanism
  Storing and retrieving information in Neural Turing Machines
  Memory networks
  Episodic memory with dynamic memory networks
  Further reading
  Summary

Chapter 10: Predicting Time Sequences with Advanced RNN
  Dropout for RNN
  Deep approaches for RNN
  Stacked recurrent networks
  Deep transition recurrent network
  Highway networks design principle
  Recurrent Highway Networks
  Further reading
  Summary

Chapter 11: Learning from the Environment with Reinforcement
  Reinforcement learning tasks
  Simulation environments
  Q-learning
  Deep Q-network
  Training stability
  Policy gradients with REINFORCE algorithms
  Related articles
  Summary

Chapter 12: Learning Features with Unsupervised Generative Networks
  Generative models
  Restricted Boltzmann Machines
  Deep belief nets
  Generative adversarial networks
  Improving GANs
  Semi-supervised learning
  Further reading
  Summary

Chapter 13: Extending Deep Learning with Theano
  Theano Op in Python for CPU
  Theano Op in Python for the GPU
  Theano Op in C for CPU
  Theano Op in C for GPU
  Coalesced transpose via shared memory (NVIDIA Parallel Forall)
  Model conversions
  The future of artificial intelligence
  Further reading
  Summary

Index

Preface

Gain insight and practice with neural net architecture design to solve problems with artificial intelligence. Understand the concepts behind the most advanced networks in deep learning. Leverage the Python language with Theano to easily compute derivatives and minimize objective functions of your choice.

What this book covers

Chapter 1, Theano Basics, helps the reader learn the main concepts of Theano in order to write code that can compile on different hardware architectures and automatically optimize complex mathematical objective functions.

Chapter 2, Classifying Handwritten Digits with a Feedforward Network, introduces a simple, well-known, historical example that served as early proof of the superiority of deep learning algorithms. The initial problem was to recognize handwritten digits.
Chapter 3, Encoding Word into Vector: one of the main challenges with neural nets is connecting real-world data to the input of a neural net, in particular for categorical and discrete data. This chapter presents an example of how to build an embedding space through training with Theano. Such embeddings are very useful in machine translation, robotics, image captioning, and so on, because they translate real-world data into arrays of vectors that can be processed by neural nets.

Chapter 4, Generating Text with a Recurrent Neural Net, introduces recurrence in neural nets with a simple practical example: generating text. Recurrent neural nets (RNNs) are a popular topic in deep learning, enabling more possibilities for sequence prediction, sequence generation, machine translation, and connected objects. Natural Language Processing (NLP) is a second field of interest that has driven the research for new machine learning techniques.

Chapter 5, Analyzing Sentiment with a Bidirectional LSTM, applies embeddings and recurrent layers to a new natural language processing task, sentiment analysis. It acts as a kind of validation of the prior chapters. In the meantime, it demonstrates an alternative way to build neural nets with Theano, using a higher-level library, Keras.

Chapter 6, Locating with Spatial Transformer Networks, applies recurrence to images in order to read multiple digits on a page at once. This time, we take the opportunity to rewrite the classification network for handwritten digit images, and our recurrent models, with the help of Lasagne, a library of built-in modules for deep learning with Theano. The Lasagne library helps design neural networks for faster experimentation. With this help, we'll address object localization, a common computer vision challenge, using Spatial Transformer modules to improve our classification scores.

Chapter 7, Classifying Images with Residual Networks, classifies images of any type with state-of-the-art accuracy.
In the meantime, to build more complex nets with ease, we introduce Lasagne, a library based on the Theano framework, with many components already implemented to help build neural nets faster.

Chapter 8, Translating and Explaining through Encoding-Decoding Networks, presents encoding-decoding techniques. Applied to text, these techniques are heavily used in machine translation and simple chatbot systems. Applied to images, they serve scene segmentation and object localization. Finally, image captioning is a mixed task: encoding images and decoding into text. This chapter goes one step further with a very popular high-level library, Keras, which simplifies the development of neural nets with Theano even more.

Chapter 9, Selecting Relevant Inputs or Memories with the Mechanism of Attention: to solve more complicated tasks, the machine learning world has been looking for a higher level of intelligence, inspired by nature: reasoning, attention, and memory. In this chapter, the reader will discover memory networks applied to a major goal of artificial intelligence for natural language processing (NLP): language understanding.

Chapter 10, Predicting Time Sequences with Advanced RNN: time sequences are an important field where machine learning has been used heavily. This chapter covers advanced techniques with Recurrent Neural Networks (RNNs) to obtain state-of-the-art results.

Chapter 11, Learning from the Environment with Reinforcement: reinforcement learning is the vast area of machine learning that consists of training an agent to behave in an environment (such as a video game) so as to optimize a quantity (maximizing the game score) by performing certain actions in the environment (pressing buttons on the controller) and observing what happens. The new reinforcement learning paradigm opens a completely new path for designing algorithms and interactions between computers and the real world.
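The agent-environment loop just described is the setting for Q-learning, one of the algorithms Chapter 11 covers. The toy two-state environment below is purely illustrative (it is not from the book); it only shows the shape of the tabular update rule:

```python
import numpy as np

np.random.seed(0)
n_states, n_actions = 2, 2
Q = np.zeros((n_states, n_actions))     # action-value table
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount, exploration

def step(state, action):
    # Hypothetical deterministic world: taking action 1 in state 0 pays
    # reward 1.0 and moves to state 1; every other move pays nothing.
    if state == 0 and action == 1:
        return 1, 1.0
    return 0, 0.0

state = 0
for _ in range(500):
    # epsilon-greedy action selection
    if np.random.rand() < epsilon:
        action = np.random.randint(n_actions)
    else:
        action = int(Q[state].argmax())
    next_state, reward = step(state, action)
    # Q-learning update: move Q toward the bootstrapped target.
    target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (target - Q[state, action])
    state = next_state
```

After training, the table ranks the rewarded action highest in state 0; the deep Q-network of Chapter 11 replaces this table with a neural net trained in Theano.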
Chapter 12, Learning Features with Unsupervised Generative Networks: unsupervised learning consists of training algorithms that do not require labeled data. These algorithms try to infer the hidden labels from the data, called factors, and, for some of them, to generate new synthetic data. Unsupervised training is very useful in many cases: when no labeling exists, when labeling the data with humans is too expensive, or when the dataset is too small and feature engineering would overfit the data. In this last case, extra amounts of unlabeled data train better features as a basis for supervised learning.

Chapter 13, Extending Deep Learning with Theano, extends the set of possibilities in deep learning with Theano. It addresses ways to create new operators for the computation graph, either in Python for simplicity or in C to overcome the Python overhead, and either for the CPU or for the GPU. It also introduces the basic concepts of parallel programming for the GPU. Lastly, we open the field of general intelligence, based on the first skills developed in this book, to develop new skills in a gradual way, improving itself one step further.

Why Theano?

Investing time and development effort in Theano is very valuable, and to understand why, it is important to explain that Theano belongs among the best deep learning technologies and is also much more than a deep learning library. Three reasons make Theano a good investment:

• It has performance comparable to other numerical or deep learning libraries
• It comes with a rich Python ecosystem
• It enables you to evaluate any function constrained by data, given a model, leaving you free to compile a solution for any optimization problem
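The last bullet is the heart of the matter: given data and a model, Theano derives and compiles the gradient updates for you. For comparison, here is the same kind of least-squares minimization with the gradient written out by hand in NumPy; the data is synthetic and purely illustrative, and the `grad` line is exactly what Theano's `T.grad` would produce automatically:

```python
import numpy as np

# Synthetic regression problem: recover true_w from (X, y) pairs.
rng = np.random.RandomState(0)
X = rng.randn(100, 3)
true_w = np.array([1.0, -2.0, 0.5])
y = X.dot(true_w)

w = np.zeros(3)   # model parameters to learn
lr = 0.1          # learning rate

for _ in range(200):
    err = X.dot(w) - y
    # Hand-derived gradient of the mean squared error with respect to w.
    grad = 2 * X.T.dot(err) / len(X)
    w -= lr * grad
```

The loop drives `w` toward `true_w`. With Theano, only the cost expression is written; the derivative and the compiled update function come for free, which is what makes it practical to swap in arbitrary objective functions.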
