Unit-1 (MLT) - Lecture Notes 1
Machine Learning Techniques - Dr. A.P.J. Abdul Kalam Technical University


Machine Learning Techniques (KCS-055)
Departmental Elective II
Unit 1: Introduction

1. Introduction

1.1 What is Learning
Learning is the process of acquiring new understanding, knowledge, behaviors, skills, values, attitudes, and preferences. The ability to learn is possessed by humans, animals, and some machines; there is also evidence for some kind of learning in certain plants.

1.2 Learning in Computers
Ever since computers were invented, we have wondered whether they might be made to learn. If we could understand how to program them to improve automatically with experience, the impact would be dramatic. Imagine computers learning from medical records which treatments are most effective for new diseases, or personal software assistants learning the evolving interests of their users in order to highlight especially relevant stories from the online morning newspaper. A successful understanding of how to make computers learn would open up many new uses of computers and new levels of competence and customization.

We do not yet know how to make computers learn nearly as well as people learn. However, algorithms have been invented that are effective for certain types of learning tasks, and a theoretical understanding of learning is beginning to emerge. Many practical computer programs have been developed to exhibit useful types of learning, and significant commercial applications have begun to appear. For problems such as speech recognition, algorithms based on machine learning outperform all other approaches that have been attempted to date. In the field known as data mining, machine learning algorithms are used routinely to discover valuable knowledge from large commercial databases containing equipment maintenance records, loan applications, financial transactions, medical records, and the like. As our understanding of computers continues to mature, it seems inevitable that machine learning will play an increasingly central role in computer science and computer technology. In recent years, many successful ML applications have been developed, ranging from data-mining programs that learn to detect fraudulent credit card transactions, to information-filtering systems that learn users' reading preferences, to autonomous vehicles that learn to drive on public highways.

1.3 Machine Learning is Multidisciplinary
Machine learning draws on concepts and results from many fields, including statistics, artificial intelligence, philosophy, information theory, biology, cognitive science, computational complexity, and control theory.

2. Introduction to Machine Learning from a Programming Perspective

What is machine learning? Let us try to understand machine learning from a programming perspective. The field of Machine Learning (ML) is concerned with the question of how to construct computer programs that automatically improve with experience. So, how is programming different from machine learning? We will answer that question first and then slowly move into the basic terminology of machine learning and the various modules of machine learning systems. Let us look at two problems.

The first problem (see Figure 1): write a program to add two numbers a and b. Most of you will wonder why this is even a question; it is such a basic problem that it is probably among the first programs all of us have written. How do we write this program? We simply write a function f() that takes two arguments a and b and returns a + b. This is a program all of you are familiar with: we can add two numbers very easily by writing a computer program, for example as sketched below.
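A minimal sketch of this function in Python (the function name f and the sample values are illustrative, not from the original notes):

```python
def f(a, b):
    # The rule (a + b) is written by the programmer and applied to the data.
    return a + b

print(f(3, 4))  # 7
```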

Now let us try to solve a slightly different problem with the same technique and see whether we can solve it, or whether we need more tools in our toolkit. The second problem: suppose we have a bunch of handwritten digits, say 8, 9 and 2, written inside a fixed area. The task is to write a program that recognizes these digits, i.e., a function that, given the picture of a digit, returns the digit it shows. Can you write a program to recognize handwritten digits just as you did for the addition of two numbers?

Some of you may have started thinking about writing rules for the different kinds of digits. But are such rules really scalable? If I write a number in a slightly different orientation or in a very different style, the rules will probably break; they cannot cater to all situations. Yet as human beings we are able to recognize these numbers. What makes us able to recognize them? We will come to that question in a moment. Before that, can we write down the process of recognizing these digits just as we did for the addition problem? When we were given two numbers a and b, we immediately came up with a step, a function, to add them: simply a + b. But as you must be experiencing right now, it is incredibly hard to come up with a step-by-step procedure to recognize digits.

So how do we solve this problem? Before getting into the solution, let us ask what the difference between the two problems is: why are we able to solve the first problem very easily, while recognizing digits with a computer is much harder? In the first problem the formula for adding two numbers was known to us, so given a and b we simply compute a + b. In the second problem we can recognize the digits with our own vision, but we are unable to come up with steps we can code so that the computer can also recognize them. So we need something else: machine learning.

Let us take a step back and ask why we are able to recognize these digits. We have been seeing digits like these since childhood; when we started our formal education we were introduced to them. Somehow our brain is trained to recognize these digits even when they are written in a slightly different style or orientation. Can we mimic that training and give the same training to a computer? This is the question that ML tries to explore. Let us now write down the key difference between the traditional programming paradigm and ML.

In the traditional programming world we have a program: we give some data as input and we also supply the rules (rather, we code these rules into the program); we then pass the data through the program, the rules are applied to the data, and we get the output. We did exactly this when adding two numbers; likewise, when we sort numbers we give the computer step-by-step instructions for how to sort them.

Now let us look at how machine learning operates. Recall the handwritten digit recognition example: we have data, but we do not have rules, so we cannot write a traditional computer program. What we can do is provide lots of examples of handwritten digit images along with the corresponding digit. For example, 8 is the digit corresponding to this image, 9 is the digit corresponding to that image, 2 is the digit corresponding to another image, and so on. We have many images of handwritten digits together with their actual labels, which are simply the numbers written in those images. We provide the data and the intended output as input to ML, and machine learning comes up with rules, also called patterns or models.

You can now see a clear difference: in traditional programming the rules are on the input side, whereas in ML the rules are on the output side, and the output, which was on the output side in traditional programming, has moved to the input side (see Figure 1). A traditional program takes data and rules as input and applies the rules to the data to produce the output. In ML, we give the data and the desired output as input, and ML produces the rules, patterns, or model that it finds in the input data.

Let us write down the steps in the ML process. We have data and we have labels. We input them to an ML trainer. The trainer looks at the input data and the corresponding labels (output) and forms rules. This gives us a model; the model is nothing but a mapping from input to output. Once we have this model, we can take new data and pass it through the model to get the output. From that point on the process is exactly the same as in the programming world, because once the model is known we know exactly the formula that maps input to output. The work done in ML training is to take the data and the desired output and use the ML trainer to produce a model; once we have the model, we can use it to get the output on new data.

We can now see that there are two stages in the machine learning process:
1. Training
2. Inference (prediction)
The stage in which we start with data and arrive at a model is called the training stage. The phase in which we apply the model to new data and obtain the output is called inference or prediction.
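A minimal sketch of these two stages, assuming the scikit-learn library and its built-in handwritten-digit dataset (the choice of library and of a k-nearest-neighbours model is an illustrative assumption, not something specified in the notes):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Data and labels: images of handwritten digits and the digit each one shows.
digits = load_digits()
X_train, X_new, y_train, y_new = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0)

# Training stage: the trainer looks at data + labels and produces a model.
model = KNeighborsClassifier()
model.fit(X_train, y_train)

# Inference (prediction) stage: the model maps new data to outputs.
predictions = model.predict(X_new)
print(predictions[:10])
```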

3. Definition of Machine Learning

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

For example, assume that a machine has to predict whether a customer will buy a specific product, let's say "Antivirus", this year or not. The machine will do this by looking at previous knowledge/past experience, i.e., the data of products that the customer has bought every year; if he buys Antivirus every year, then there is a high probability that the customer is going to buy an antivirus this year as well. This is how machine learning works at the basic conceptual level.

4. History of ML

1950 — Alan Turing creates the "Turing Test" to determine if a computer has real intelligence. To pass the test, a computer must be able to fool a human into believing it is also human.

1952 — Arthur Samuel wrote the first computer learning program. The program played the game of checkers, and the IBM computer improved at the game the more it played, studying which moves made up winning strategies and incorporating those moves into its program.

1957 — Frank Rosenblatt designed the first neural network for computers (the perceptron), which simulated the thought processes of the human brain.

1967 — The "nearest neighbor" algorithm was written, allowing computers to begin using very basic pattern recognition. It could be used to map a route for a traveling salesman, starting at a random city but ensuring all cities are visited during a short tour.

1979 — Students at Stanford University invent the "Stanford Cart", which can navigate obstacles in a room on its own.

1981 — Gerald Dejong introduces the concept of Explanation Based Learning (EBL), in which a computer analyses training data and creates a general rule it can follow by discarding unimportant data.

1985 — Terry Sejnowski invents NetTalk, which learns to pronounce words the same way a baby does.

1990s — Work on machine learning shifts from a knowledge-driven approach to a data-driven approach. Scientists begin creating programs for computers to analyze large amounts of data and draw conclusions, or "learn", from the results.

1997 — IBM's Deep Blue beats the world champion at chess.

2006 — Geoffrey Hinton coins the term "deep learning" to explain new algorithms that let computers "see" and distinguish objects and text in images and videos.

2010 — The Microsoft Kinect can track 20 human features at a rate of 30 times per second, allowing people to interact with the computer via movements and gestures.

2011 — IBM's Watson beats its human competitors at Jeopardy.

2011 — Google Brain is developed, and its deep neural network can learn to discover and categorize objects much the way a cat does.

2012 — Google's X Lab develops a machine learning algorithm that is able to autonomously browse YouTube videos and identify the videos that contain cats.

2014 — Facebook develops DeepFace, a software algorithm that is able to recognize or verify individuals in photos to the same level as humans can.

2015 — Amazon launches its own machine learning platform.

2015 — Microsoft creates the Distributed Machine Learning Toolkit, which enables the efficient distribution of machine learning problems across multiple computers.

2015 — Over 3,000 AI and robotics researchers, endorsed by Stephen Hawking, Elon Musk and Steve Wozniak (among many others), sign an open letter warning of the danger of autonomous weapons that select and engage targets without human intervention.

2016 — Google's artificial intelligence algorithm beats a professional player at the Chinese board game Go, which is considered the world's most complex board game and many times harder than chess. The AlphaGo algorithm developed by Google DeepMind won five games out of five in the competition.

5. Relation of Artificial Intelligence, Machine Learning and Deep Learning

Artificial Intelligence (AI) - the broad discipline of creating intelligent machines.

Machine Learning (ML) - refers to systems that can learn from experience.
Deep Learning (DL) - refers to systems that learn from experience on large data sets.
AI is the broadest of the three: ML is a subfield of AI, and DL is in turn a subfield of ML.

6. Types of Learning

6.1 Supervised Learning: Supervised learning is when the model is trained on a labelled dataset. A labelled dataset is one which has both input and output parameters. In this type of learning, both the training and validation datasets are labelled, as shown in the figures below.

Both of the above figures show labelled datasets.

Figure A: A dataset of a shopping store, useful for predicting whether a customer will purchase a particular product under consideration based on his/her gender, age and salary.
Input: Gender, Age, Salary
Output: Purchased, i.e., 0 or 1; 1 means the customer will purchase and 0 means the customer won't purchase it.

Figure B: A meteorological dataset which serves the purpose of predicting wind speed based on different parameters.
Input: Dew Point, Temperature, Pressure, Relative Humidity, Wind Direction
Output: Wind Speed
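The figures themselves are not reproduced in these notes; as a rough sketch, a labelled dataset like Figure A or Figure B pairs each row of inputs with a known output (all values below are made-up placeholders, not taken from the actual figures):

```python
# Hypothetical rows in the style of Figure A (Gender, Age, Salary -> Purchased).
figure_a = [
    # (gender, age, salary)   purchased (1 = yes, 0 = no)
    (("Male",   25, 30000),   0),
    (("Female", 42, 72000),   1),
    (("Female", 35, 50000),   1),
]

# Hypothetical rows in the style of Figure B (weather inputs -> Wind Speed).
figure_b = [
    # (dew point, temperature, pressure, humidity, wind direction)  wind speed
    ((12.0, 25.3, 1012.0, 60.0, 180.0), 3.4),
    (( 8.5, 19.1, 1008.5, 72.0,  90.0), 5.1),
]

# Each example pairs the input parameters with the known output label.
inputs, outputs = zip(*figure_a)
print(inputs[0], "->", outputs[0])
```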



Types of Supervised Learning:

1. Classification: A supervised learning task where the output has defined labels (discrete values). For example, in Figure A above, the output Purchased has the defined labels 0 and 1; 1 means the customer will purchase and 0 means the customer won't purchase. Classification can be either binary or multi-class. In binary classification the model predicts either 0 or 1 (yes or no), whereas in multi-class classification the model predicts one of more than two classes. Example: Gmail classifies mail into several classes such as Social, Promotions, Updates and Forums.

2. Regression: A supervised learning task where the output is a continuous value. For example, in Figure B above, the output Wind Speed does not take discrete values but is continuous within a particular range. The goal here is to predict a value as close to the actual output value as our model can, and evaluation is then done by calculating an error value. The smaller the error, the greater the accuracy of our regression model.

6.2 Unsupervised Learning

Unsupervised learning is the training of a machine using information that is not labelled (no output label is present), allowing the algorithm to act on that information without guidance. Here the task of the machine is to group information according to similarities, patterns and differences without any prior training on the data. Unlike supervised learning, no teacher is provided, which means no training labels are given to the machine. The machine is therefore restricted to finding the hidden structure in unlabelled data by itself. For instance, suppose the machine is given an image containing both dogs and cats that it has never seen before.

The machine has no idea about the features of dogs and cats, so it cannot categorize the images as "dogs" and "cats". But it can still categorize them according to their similarities, patterns and differences, i.e., it can split the pictures into two groups: the first may contain all the pictures with dogs in them, and the second all the pictures with cats. Nothing was learned beforehand, meaning there were no training labels or examples.

Unsupervised learning is classified into two categories of algorithms:

Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behaviour (a small sketch follows this list).

Association: An association rule learning problem is where you want to discover rules that describe large portions of your data, such as "people that buy X also tend to buy Y".
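A small sketch of clustering, assuming scikit-learn (the use of k-means and the toy points below are illustrative assumptions, not from the notes): the algorithm groups unlabelled points purely by similarity, without being given any output labels.

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabelled data: each row is a point; no output labels are provided.
points = np.array([
    [1.0, 1.1], [0.9, 1.0], [1.2, 0.8],   # one natural group
    [8.0, 8.2], [7.9, 8.1], [8.3, 7.8],   # another natural group
])

# The algorithm discovers the groupings itself (here, 2 clusters).
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
print(labels)  # e.g. [0 0 0 1 1 1] -- cluster ids found by the algorithm
```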

6.3 Semi-Supervised Learning

The most basic disadvantage of any supervised learning algorithm is that the dataset has to be hand-labelled, either by a machine learning engineer or a data scientist. This is a very costly process, especially when dealing with large volumes of data. The most basic disadvantage of any unsupervised learning is that its range of applications is limited. To counter these disadvantages, the concept of semi-supervised learning was introduced. In this type of learning, the algorithm is trained on a combination of labelled and unlabelled data. Typically, this combination contains a very small amount of labelled data and a very large amount of unlabelled data. The basic procedure is that the programmer first clusters similar data using an unsupervised learning algorithm and then uses the existing labelled data to label the rest of the unlabelled data. Intuitively, one may picture the three types of learning as follows: supervised learning is a student under the supervision of a teacher both at home and at school; unsupervised learning is a student who has to figure out a concept by himself; and semi-supervised learning is a teacher who teaches a few concepts in class and gives homework questions based on similar concepts.

6.4 Reinforcement Learning

Reinforcement learning addresses the question of how a system that senses and acts in its environment can learn to choose optimal actions to achieve its goals. This very generic problem covers tasks such as learning to control a mobile robot, learning to optimize operations in factories, and learning to play board games. Each time the system performs an action in its environment, a trainer may provide a reward or penalty to indicate the desirability of the resulting state. The task of the agent is to learn to choose sequences of actions that produce the greatest cumulative reward.

Consider the scenario of teaching new tricks to your cat:

- As the cat doesn't understand English or any other human language, we can't tell her directly what to do. Instead, we follow a different strategy.
- We emulate a situation, and the cat tries to respond in many different ways. If the cat's response is the desired one, we give her fish; otherwise she gets some penalty.

- Now, whenever the cat is exposed to the same situation, she executes a similar action even more enthusiastically, in expectation of getting more reward (food).
- That is how the cat learns "what to do" from positive experiences.
- At the same time, the cat also learns what not to do when faced with negative experiences.
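A very small sketch of this reward-driven loop in Python (the "situation", the actions, and the reward values are invented for illustration; this is a simplified action-value update in the spirit of reinforcement learning, not a full algorithm):

```python
import random

# One situation, several possible responses the agent (cat) can try.
actions = ["sit", "jump", "run_away"]
desired = "sit"                       # the trainer rewards this response

values = {a: 0.0 for a in actions}    # agent's current estimate of each action's worth
alpha, epsilon = 0.5, 0.2             # learning rate and exploration rate

for episode in range(200):
    # Mostly pick the action currently believed best, sometimes explore.
    if random.random() < epsilon:
        action = random.choice(actions)
    else:
        action = max(values, key=values.get)

    # Trainer gives a reward for the desired behaviour, a small penalty otherwise.
    reward = 1.0 if action == desired else -0.1

    # Move the estimate for the chosen action toward the observed reward.
    values[action] += alpha * (reward - values[action])

print(values)  # the rewarded action ends up with the highest estimated value
```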

7. Successful Applications of Machine Learning

Learning to recognize spoken words: All of the most successful speech recognition systems employ machine learning in some form. For example, the SPHINX system (e.g., Lee 1989) learns speaker-specific strategies ...

