ARTIFICIAL NEURAL NETWORKS TECHNOLOGY

A DACS State-of-the-Art Report

Contract Number F30602-89-C-0082
(Data & Analysis Center for Software)
ELIN: A011

August 20, 1992

Prepared for:
Rome Laboratory RL/C3C
Griffiss AFB, NY 13441-5700

Prepared by:
Dave Anderson and George McNeill
Kaman Sciences Corporation
258 Genesse Street
Utica, New York 13502-4627

TABLE OF CONTENTS

1.0 Introduction and Purpose
2.0 What are Artificial Neural Networks?
    2.1 Analogy to the Brain
    2.2 Artificial Neurons and How They Work
    2.3 Electronic Implementation of Artificial Neurons
    2.4 Artificial Network Operations
    2.5 Training an Artificial Neural Network
        2.5.1 Supervised Training
        2.5.2 Unsupervised, or Adaptive Training
    2.6 How Neural Networks Differ from Traditional Computing and Expert Systems
3.0 History of Neural Networks
4.0 Detailed Description of Neural Network Components and How They Work
    4.1 Major Components of an Artificial Neuron
    4.2 Teaching an Artificial Neural Network
        4.2.1 Supervised Learning
        4.2.2 Unsupervised Learning
        4.2.3 Learning Rates
        4.2.4 Learning Laws
5.0 Network Selection
    5.1 Networks for Prediction
        5.1.1 Feedforward, Back-Propagation
        5.1.2 Delta Bar Delta
        5.1.3 Extended Delta Bar Delta
        5.1.4 Directed Random Search
        5.1.5 Higher-order Neural Network or Functional-link Network
        5.1.6 Self-Organizing Map into Back-Propagation
    5.2 Networks for Classification
        5.2.1 Learning Vector Quantization
        5.2.2 Counter-propagation Network
        5.2.3 Probabilistic Neural Network
    5.3 Networks for Data Association
        5.3.1 Hopfield Network
        5.3.2 Boltzmann Machine
        5.3.3 Hamming Network
        5.3.4 Bi-directional Associative Memory
        5.3.5 Spatio-Temporal Pattern Recognition (Avalanche)
    5.4 Networks for Data Conceptualization
        5.4.1 Adaptive Resonance Network
        5.4.2 Self-Organizing Map
    5.5 Networks for Data Filtering
        5.5.1 Recirculation
6.0 How Artificial Neural Networks Are Being Used
    6.1 Language Processing
    6.2 Character Recognition
    6.3 Image (data) Compression
    6.4 Pattern Recognition
    6.5 Signal Processing
    6.6 Financial
    6.7 Servo Control
    6.8 How to Determine if an Application is a Neural Network Candidate
7.0 New Technologies that are Emerging
    7.1 What Currently Exists
        7.1.1 Development Systems
        7.1.2 Hardware Accelerators
        7.1.3 Dedicated Neural Processors
    7.2 What the Next Developments Will Be
8.0 Summary
9.0 References

List of Figures

Figure 2.2.1 A Simple Neuron
Figure 2.2.2 A Basic Artificial Neuron
Figure 2.2.3 A Model of a "Processing Element"
Figure 2.2.4 Sigmoid Transfer Function
Figure 2.4.1 A Simple Neural Network Diagram
Figure 2.4.2 Simple Network with Feedback and Competition
Figure 4.0.1 Processing Element
Figure 4.1.1 Sample Transfer Functions
Figure 5.0.1 An Example Feedforward Back-propagation Network
Figure 5.2.1 An Example Learning Vector Quantization Network
Figure 5.2.2 An Example Counter-propagation Network
Figure 5.2.3 A Probabilistic Neural Network Example
Figure 5.3.1 A Hopfield Network Example
Figure 5.3.2 A Hamming Network Example
Figure 5.3.4 Bi-directional Associative Memory Example
Figure 5.3.5 A Spatio-temporal Pattern Network Example
Figure 5.4.2 An Example Self-organizing Map Network
Figure 5.5.1 An Example Recirculation Network
List of Tables

Table 2.6.1 Comparison of Computing Approaches
Table 2.6.2 Comparisons of Expert Systems and Neural Networks
Table 5.0.1 Network Selector Table

1.0 Introduction and Purpose

This report is intended to help the reader understand what Artificial Neural Networks are, how to use them, and where they are currently being used.

Artificial Neural Networks are being touted as the wave of the future in computing. They are indeed self-learning mechanisms which don't require the traditional skills of a programmer. But unfortunately, misconceptions have arisen. Writers have hyped that these neuron-inspired processors can do almost anything. These exaggerations have created disappointments for some potential users who have tried, and failed, to solve their problems with neural networks. These application builders have often come to the conclusion that neural nets are complicated and confusing. Unfortunately, that confusion has come from the industry itself. An avalanche of articles has appeared touting a large assortment of different neural networks, all with unique claims and specific examples. Currently, only a few of these neuron-based structures, paradigms actually, are being used commercially. One particular structure, the feedforward, back-propagation network, is by far the most popular. Most of the other neural network structures represent models for "thinking" that are still evolving in the laboratories. Yet, all of these networks are simply tools, and as such the only real demand they make is that the network architect learn how to use them. This report is intended to help that process by explaining these structures, right down to the rules on how to tweak the "nuts and bolts." This report also discusses what types of applications currently utilize the different structures and how some structures lend themselves to specific solutions.

In reading this report, a reader who wants a general understanding of neural networks should read sections 2, 3, 6, 7 and 8. These sections provide an understanding of neural networks (section 2), their history (section 3), how they are currently being applied (section 6), the tools to apply them plus the probable future of neural processing (section 7), and a summary of what it all means (section 8). A more serious reader is invited to delve into the inner workings of neural networks (section 4) and the various ways neural networks can be structured (section 5).

2.0 What are Artificial Neural Networks?

Artificial Neural Networks are relatively crude electronic models based on the neural structure of the brain.
The brain basically learns from experience. It is natural proof that some problems beyond the scope of current computers are indeed solvable by small, energy-efficient packages. This brain modeling also promises a less technical way to develop machine solutions. This new approach to computing also provides a more graceful degradation during system overload than its more traditional counterparts.

These biologically inspired methods of computing are thought to be the next major advancement in the computing industry. Even simple animal brains are capable of functions that are currently impossible for computers. Computers do rote things well, like keeping ledgers or performing complex math. But computers have trouble recognizing even simple patterns, much less generalizing those patterns of the past into actions of the future.

Now, advances in biological research promise an initial understanding of the natural thinking mechanism. This research shows that brains store information as patterns. Some of these patterns are very complicated and give us the ability to recognize individual faces from many different angles. This process of storing information as patterns, utilizing those patterns, and then solving problems encompasses a new field in computing. This field, as mentioned before, does not utilize traditional programming but involves the creation of massively parallel networks and the training of those networks to solve specific problems. This field also utilizes words very different from traditional computing, words like behave, react, self-organize, learn, generalize, and forget.

2.1 Analogy to the Brain

The exact workings of the human brain are still a mystery. Yet, some aspects of this amazing processor are known. In particular, the most basic element of the human brain is a specific type of cell which, unlike the rest of the body, doesn't appear to regenerate. Because this type of cell is the only part of the body that isn't slowly replaced, it is assumed that these cells are what provide us with our abilities to remember, think, and apply previous experiences to our every action. These cells, all 100 billion of them, are known as neurons. Each of these neurons can connect with up to 200,000 other neurons, although 1,000 to 10,000 connections are typical. The power of the human mind comes from the sheer number of these basic components and the multiple connections between them. It also comes from genetic programming and learning.

The individual neurons are complicated. They have a myriad of parts, sub-systems, and control mechanisms. They convey information via a host of electrochemical pathways. There are over one hundred different classes of neurons, depending on the classification method used. Together these neurons and their connections form a process which is not binary, not stable, and not synchronous. In short, it is nothing like the currently available electronic computers, or even artificial neural networks.

These artificial neural networks try to replicate only the most basic elements of this complicated, versatile, and powerful organism. They do it in a primitive way. But for the software engineer who is trying to solve problems, neural computing was never about replicating human brains. It is about machines and a new way to solve problems.

2.2 Artificial Neurons and How They Work

The fundamental processing element of a neural network is a neuron. This building block of human awareness encompasses a few general capabilities.
Basically, a biological neuron receives inputs from other sources, combines them in some way, performs a generally nonlinear operation on the result, and then outputs the final result. Figure 2.2.1 shows the relationship of these four parts.

[Figure 2.2.1: A Simple Neuron. The four parts of a typical nerve cell: dendrites accept inputs; the soma processes the inputs; the axon turns the processed inputs into outputs; synapses are the electrochemical contacts between neurons.]

Within humans there are many variations on this basic type of neuron, further complicating man's attempts at electrically replicating the process of thinking. Yet, all natural neurons have the same four basic components. These components are known by their biological names: dendrites, soma, axon, and synapses. Dendrites are hair-like extensions of the soma which act like input channels. These input channels receive their input through the synapses of other neurons. The soma then processes these incoming signals over time and turns the processed value into an output, which is sent out to other neurons through the axon and the synapses.

Recent experimental data has provided further evidence that biological neurons are structurally more complex than the simplistic explanation above suggests. They are significantly more complex than the existing artificial neurons that are built into today's artificial neural networks. As biology provides a better understanding of neurons, and as technology advances, network designers can continue to improve their systems by building upon man's understanding of the biological brain.

But currently, the goal of artificial neural networks is not the grandiose recreation of the brain. On the contrary, neural network researchers are seeking an understanding of nature's capabilities with which people can engineer solutions to problems that have not been solved by traditional computing. To do this, the basic unit of neural networks, the artificial neuron, simulates the four basic functions of natural neurons. Figure 2.2.2 shows a fundamental representation of an artificial neuron.

[Figure 2.2.2: A Basic Artificial Neuron. Inputs x0 through xn are each multiplied by a weight w0 through wn; a summation function forms I = Σ wj xj, and a transfer function produces the output Y = f(I) on the output path of the processing element.]

In Figure 2.2.2, the various inputs to the network are represented by the mathematical symbol x(n). Each of these inputs is multiplied by a connection weight. These weights are represented by w(n). In the simplest case, these products are simply summed, fed through a transfer function to generate a result, and then output. This process lends itself to physical implementation on a large scale in a small package. This electronic implementation is still possible with other network structures which utilize different summing functions as well as different transfer functions.
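To make the arithmetic of Figure 2.2.2 concrete, the following is a minimal sketch in C (the language the report later notes is commonly supported by development products) of a weighted-sum-and-transfer neuron. The report supplies no code; the function names and the choice of a sigmoid transfer function here are illustrative assumptions, not the report's implementation.

```c
/* Minimal sketch of the artificial neuron of Figure 2.2.2:
   I = sum of w[j] * x[j], output Y = f(I).  Illustrative only. */
#include <math.h>
#include <stdio.h>

/* Sigmoid transfer function: squashes any sum into the range (0, 1). */
double transfer(double I)
{
    return 1.0 / (1.0 + exp(-I));
}

/* Weighted sum of n inputs, followed by the transfer function. */
double neuron(const double x[], const double w[], int n)
{
    double I = 0.0;
    for (int j = 0; j < n; j++)
        I += w[j] * x[j];      /* summation function */
    return transfer(I);        /* transfer function  */
}

int main(void)
{
    double x[3] = { 1.0, 0.5, -1.0 };  /* illustrative inputs  */
    double w[3] = { 0.8, -0.2, 0.4 };  /* illustrative weights */
    printf("Y = %f\n", neuron(x, w, 3));
    return 0;
}
```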
Some applications require "black and white," or binary, answers. These applications include the recognition of text, the identification of speech, and the image deciphering of scenes. These applications are required to turn real-world inputs into discrete values. These potential values are limited to some known set, like the ASCII characters or the most common 50,000 English words. Because of this limitation of output options, these applications don't always utilize networks composed of neurons that simply sum up, and thereby smooth, inputs. Such networks may instead utilize the binary properties of ORing and ANDing of inputs. These functions, and many others, can be built into the summation and transfer functions of a network.

Other networks work on problems where the resolution is not just one of several known values. These networks need to be capable of an infinite number of responses. Applications of this type include the "intelligence" behind robotic movements. This "intelligence" processes inputs and then creates outputs which actually cause some device to move. That movement can span an infinite number of very precise motions. These networks do indeed want to smooth their inputs which, due to limitations of sensors, come in non-continuous bursts, say thirty times a second. To do that, they might accept these inputs, sum that data, and then produce an output by, for example, applying a hyperbolic tangent as a transfer function. In this manner, output values from the network are continuous and satisfy more real-world interfaces. Other applications might simply sum and compare to a threshold, thereby producing one of two possible outputs, a zero or a one. Other functions scale the outputs to match the application, such as the values minus one and one. Some functions even integrate the input data over time, creating time-dependent networks.

2.3 Electronic Implementation of Artificial Neurons

In currently available software packages these artificial neurons are called "processing elements" and have many more capabilities than the simple artificial neuron described above. Those capabilities will be discussed later in this report. Figure 2.2.3 is a more detailed schematic of this still simplistic artificial neuron.

[Figure 2.2.3: A Model of a "Processing Element". Inputs weighted by *w0 through *wn feed a selectable summation function (sum, max, min, average, OR, AND, etc.); its result passes through a selectable transfer function (hyperbolic tangent, linear, sigmoid, sine, etc.) to the outputs, all governed by a learning and recall schedule and a learning cycle.]

In Figure 2.2.3, inputs enter the processing element from the upper left. The first step is for each of these inputs to be multiplied by its respective weighting factor (w(n)). Then these modified inputs are fed into the summing function, which usually just sums these products. Yet, many different types of operations can be selected. These operations could produce a number of different values which are then propagated forward; values such as the average, the largest, the smallest, the ORed values, the ANDed values, etc. Furthermore, most commercial development products allow software engineers to create their own summing functions via routines coded in a higher-level language (C is commonly supported). Sometimes the summing function is further complicated by the addition of an activation function which enables the summing function to operate in a time-sensitive way.

Either way, the output of the summing function is then sent into a transfer function, which turns this number into a real output via some algorithm. It is this algorithm that takes the input and turns it into a zero or a one, a minus one or a one, or some other number. The transfer functions that are commonly supported are sigmoid, sine, hyperbolic tangent, etc. The transfer function can also scale the output or control its value via thresholds. The result of the transfer function is usually the direct output of the processing element. An example of how a transfer function works is shown in Figure 2.2.4. This sigmoid transfer function takes the value from the summation function, called sum in Figure 2.2.4, and turns it into a value between zero and one.
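As a sketch of the selectable behavior Figure 2.2.3 describes, the C fragment below lets the summation rule (sum, maximum, or average) and the transfer rule (sigmoid, hyperbolic tangent, or simple threshold) be chosen independently. The type and function names are assumptions made for illustration; the commercial packages of the period exposed this flexibility through their own interfaces, often via user-written C routines.

```c
/* Illustrative processing element with selectable summation and
   transfer rules, in the spirit of Figure 2.2.3.  Not a real API. */
#include <math.h>
#include <stdio.h>

typedef enum { RULE_SUM, RULE_MAX, RULE_AVERAGE } SumRule;
typedef enum { XFER_SIGMOID, XFER_TANH, XFER_THRESHOLD } TransferRule;

/* Combine the weighted inputs according to the chosen summation rule. */
double combine(const double x[], const double w[], int n, SumRule rule)
{
    double acc = w[0] * x[0];        /* start from the first weighted input */
    for (int j = 1; j < n; j++) {
        double v = w[j] * x[j];
        if (rule == RULE_MAX)
            acc = (v > acc) ? v : acc;   /* keep the largest product */
        else
            acc += v;                    /* running sum              */
    }
    return (rule == RULE_AVERAGE) ? acc / n : acc;
}

/* Turn the combined value into the element's output. */
double transfer_out(double sum, TransferRule rule)
{
    switch (rule) {
    case XFER_SIGMOID: return 1.0 / (1.0 + exp(-sum));  /* range (0, 1)  */
    case XFER_TANH:    return tanh(sum);                /* range (-1, 1) */
    default:           return (sum >= 0.0) ? 1.0 : 0.0; /* binary 0 or 1 */
    }
}

int main(void)
{
    double x[2] = { 0.9, -0.3 }, w[2] = { 1.2, 0.7 };
    double s = combine(x, w, 2, RULE_SUM);
    printf("sigmoid=%f  tanh=%f  threshold=%f\n",
           transfer_out(s, XFER_SIGMOID),
           transfer_out(s, XFER_TANH),
           transfer_out(s, XFER_THRESHOLD));
    return 0;
}
```

The threshold case produces the zero-or-one answers described above for binary applications, while the hyperbolic tangent case yields the continuous, scaled outputs suited to applications such as servo control.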
