What is the definition of Machine Learning?


I asked the question "Algorithm to detect nudity with good accuracy" here on the site, and some people mentioned Machine Learning.

From what was said in one comment and another on the subject, it was implied (very superficially) that it is a way for a machine to "learn" a behavior from predefined data.

I would like a more detailed explanation on the subject:

  • What is Machine Learning?

  • Is it related to (or is it a form of) Artificial Intelligence?

  • What is a simple example we can cite on this subject?

5 answers

26


What is Machine Learning?

Machine Learning can be translated into Portuguese simply as "Aprendizado de Máquina" (or "Aprendizado Computacional", computer learning). The term refers to a huge set of techniques that aim to build computational systems whose behavior is defined based on existing data. Since the behavior of the system is not directly programmed, but adapted from some previously acquired "knowledge", this approach bears similarity to the way animals (among them, we humans) learn from experience.

Is it related to (or is it a form of) Artificial Intelligence?

Certainly, yes. Defining concisely what intelligence is is an arduous task, because intelligence has several important aspects: the recognition and manipulation of symbols, language and verbal and written communication, vision, planning, adaptation based on experience (also called learning), etc. Artificial Intelligence, as a sub-area of Engineering/Computer Science, has several concerns, and one of them is to simulate adaptation and learning in order to solve complex problems. If you look at the diagram of the general model of an intelligent agent below, which I originally mentioned in my answer about what artificial intelligence is, you will realize that the little box with the question mark needs to contain all the "logic" that allows the agent to perceive changes in the world through its sensors and to decide the best way to act on the world, according to its intentions, through its actuators (effectors):

[image: general model of an intelligent agent, with sensors perceiving the world and effectors acting on it]

Certainly, learning would be in there, since it is an important aspect of (artificial) intelligence.

What is a simple example we can cite on this subject?

Just as it is difficult to define what intelligence is, it is also difficult to define concisely what learning is. The most trivial notion (and the one that matters in the scope of computer science) is that learning is the ability to adapt with experience. An intelligent computer system would be able to learn if it altered its behavior as it observed the effects of its own actions, and of the actions of others, in the environment in which it acts. The opposite of such a system is one that insists on a certain action even after it has shown itself to be ineffective for its purposes (although continuing to do the same thing while expecting different results is, according to Einstein, the definition of insanity rather than stupidity, hehehe).

Adaptation is still something broad. For example, the first computational models of cellular automata, particularly the famous Game of Life (if you have some spare time, play with a JavaScript implementation of it at this link), sought to build systems capable of replication. If the copy of itself is not exactly the same as the previous one, it can allow/implement adaptation to special needs of the environment, in a way similar to what occurs in evolution. In fact, a related paradigm called Genetic Algorithms uses a similar principle of adapting individuals to perform interesting searches and/or optimizations in problem solving. However, saying that this type of approach is machine learning is quite debatable.

Machine learning more traditionally involves the construction of systems capable of extracting information from known data and using this learned behavior to solve new problems. Therefore, many of the techniques used in this area are also used in statistics, business intelligence, data mining, data science, etc. In fact, there are basically three main approaches to machine learning:

1. Predictive or Supervised Learning

In this type of approach, the algorithm uses as input a set of data previously collected from the real world and used for "training" before actual use (hence the word "supervised"). This data set has one part (usually called x) which contains the features of interest of the problem (imagine x as a vector of values, so that the data set contains several rows x1, x2, x3, etc., one for each example collected from the real world), and another part (usually called y) which contains the value resulting from the characteristics in x, or the class of the real-world example (imagine each y1, y2, y3 indicating what the respective rows x1, x2, x3 describe or represent). Thus, the idea is that the system "learns" the mapping between x and y from the training data, so that later it is able to "predict" the value of y for a new x, that is, the value of a function or the class to which a new example (a new vector with all the measured characteristics) belongs.
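To make the x/y notation concrete, here is a tiny sketch in Python (the feature values and labels are invented just for illustration):

import numpy as np

# Each row of X is one example (x1, x2, x3): a vector of measured features.
# Each entry of y says what the corresponding row represents (its value or class).
X = np.array([[5.1, 3.5],     # x1
              [7.0, 3.2],     # x2
              [6.3, 3.3]])    # x3
y = np.array(["good", "rotten", "good"])   # y1, y2, y3

# Supervised learning: "learn" the mapping X -> y from this training data,
# then "predict" y for a new, previously unseen x.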

The most common example of an "algorithm" in this type of approach is Linear Regression. With this method, it is possible to estimate a linear function (an equation of the form y = ax + b) describing the behaviour of a data set (a mapping x -> y) that has a linear correlation. Having "learned" this function, it is possible to estimate the value of y for any new x simply by evaluating it with the new parameters.
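As a minimal sketch of that idea, assuming invented data with a roughly linear relation and using NumPy's least-squares fit:

import numpy as np

# Invented training data following, approximately, y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# "Learn" the mapping: estimate a and b of y = a*x + b by least squares
a, b = np.polyfit(x, y, deg=1)

# "Predict" y for a new x using the learned function
x_new = 10.0
print(f"y = {a:.2f}x + {b:.2f} -> prediction for x={x_new}: {a * x_new + b:.2f}")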

Other algorithms may, rather than trying to estimate a numerical value, predict an enumerative value, which "classifies" a vector x of measures of interest. For example, one could build a system capable of labeling images of oranges with the options "rotten" and "good", or of identifying moving objects in a video among the options "car", "bike" and "truck". For a more detailed explanation of this approach, and also some other concrete examples, please read my other answer here on SOpt.

The perceptrons, mentioned by @Gomiero in his reply, and neural networks do essentially this same mapping (the output of a neuron may indicate a value, if used for regression, or a class, if used for classification), and so they are generally considered supervised learning methods (although neural networks can also be used to extract interesting patterns from data, in the sense of the next type of learning). But there are other methods worth studying, such as inductive decision trees, where training data is used to build a tree of checks that decides the class of a vector. A decision tree is nothing more than a sequence of chained ifs that check each of the given attributes (the values of the vector x) to decide what the answer is (y). There are algorithms that allow building the tree from the training data, such as ID3, which uses the entropy in the data to decide which attributes to check before the others (those offering the most immediate gain at each decision).
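Two small sketches of what is described above (the attribute names and values are invented): a decision tree as a sequence of chained ifs, and the entropy measure that ID3 uses to choose which attribute to test first:

from collections import Counter
from math import log2

# A hand-written "decision tree" for the orange example: chained ifs over the
# attributes of x (names invented) that decide the class y.
def classify_orange(x):
    if x["dark_spots"] > 0.5:
        return "rotten"
    if x["firmness"] < 0.3:
        return "rotten"
    return "good"

# Entropy of a list of class labels: ID3 prefers the attribute whose test
# reduces this value the most (the largest information gain).
def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

print(classify_orange({"dark_spots": 0.1, "firmness": 0.8}))  # -> good
print(entropy(["good", "good", "rotten", "rotten"]))          # -> 1.0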

2. Descriptive or Unsupervised Learning

In this approach the algorithm does not receive a previous data set from which to learn a "mapping". The idea is that the system is able, by itself (hence the expression "unsupervised"), to extract interesting patterns from the data. While in the previous approach the system is fed data pairs (input and output examples) in the training phase, in this approach the system is fed only input data - the output is deduced directly by the system.

A cool and simple example of an algorithm widely used in the unsupervised approach is K-Means. The "k" comes from the number of desired groups (this is the least the designer needs to know about the problem). The algorithm works like this (a sketch in code follows the list):

  • First, k vectors are randomly chosen as the probable centers of the groups.
  • Then, the distances between the other vectors and these centers are calculated. The vectors closest to each temporary center are "grouped" with it.
  • For each group, the "geometric center" of the group is calculated (which is basically the mean value of all the data in the group - hence the rest of the algorithm's name), and the center of the group is moved to this geometric center.
  • The previous steps are repeated until the system reaches convergence (i.e., the centers of the groups no longer change).
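A minimal NumPy sketch of those four steps (no special care taken with empty groups or bad initializations; it is just an illustration):

import numpy as np

def kmeans(points, k, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Randomly choose k vectors as the probable centers of the groups
    centers = points[rng.choice(len(points), size=k, replace=False)]
    while True:
        # 2. Group each vector with the closest temporary center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        groups = dists.argmin(axis=1)
        # 3. Move each center to the geometric center (the mean) of its group
        new_centers = np.array([points[groups == i].mean(axis=0) for i in range(k)])
        # 4. Repeat until convergence (the centers no longer change)
        if np.allclose(new_centers, centers):
            return centers, groups
        centers = new_centers

# Example: two obvious groups in the plane
pts = np.array([[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]])
centers, groups = kmeans(pts, k=2)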

A visual illustration of the algorithm, for a problem with k=2 (i.e., two "classes" in the data), is as follows (the following image is an animated gif - each frame is 4 seconds long):

[animated GIF: K-Means iterations converging on a data set with k=2]

Its use is very broad. For example, in image processing, one can often do the segmentation (extraction) of elements of interest using this algorithm. The data "points" are the pixel values (luminous intensity in one of the RGB bands or, more commonly, in grayscale), and the number of classes is given by the designer, who knows how many "elements" are in the image. The following concrete example was taken from my master's degree: in a microscopic image captured from an aluminum plate subjected to blasting with steel shot (image on the left), I needed to separate the craters (impacts of the spheres/grains) from the rest of the image (to make an important measurement in the process). Knowing that the image contains essentially three elements (the plate, the streaks and the craters), I used K-Means with k=3 to group the pixels into these three groups (middle image) and then chose the group with the darkest average value, throwing the rest away (making it white), to generate an image (image on the right) that could continue being processed by other algorithms:

[images: original microscope image (left), K-Means grouping with k=3 (middle), resulting segmented image (right)]
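Just to illustrate how that segmentation maps onto the algorithm, a rough sketch reusing the kmeans function above (the image here is random noise standing in for the real microscope image):

import numpy as np

# Stand-in for the real microscope image: in practice it would be loaded from a file
img = np.random.randint(0, 256, size=(64, 64)).astype(float)

pixels = img.reshape(-1, 1)              # each grayscale pixel is a 1-D "point"
centers, groups = kmeans(pixels, k=3)    # three groups: plate, streaks, craters
darkest = centers.ravel().argmin()       # the group with the darkest average value
mask = groups.reshape(img.shape) == darkest
result = np.where(mask, img, 255.0)      # everything outside that group becomes white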

Another example of the use of this algorithm is in games. Having a database with information on players' performance in a game over some time interval, one can process the data with K-Means to "automatically" infer three groupings (again, k=3). As the data deals with performance, the groups can be imagined as characterizations of beginner, intermediate and specialist players, for example. Once these groups are well defined (by the central vector of each group), a new player can be automatically classified as belonging to one of them based on the distance to the centers (they essentially belong to the group whose center is closest).
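A sketch of that last step, with invented centers (hours played, win rate) and an invented new player:

import numpy as np

centers = np.array([[10.0, 0.20],     # beginner
                    [120.0, 0.50],    # intermediate
                    [800.0, 0.80]])   # specialist
labels = ["beginner", "intermediate", "specialist"]

new_player = np.array([150.0, 0.55])
nearest = np.linalg.norm(centers - new_player, axis=1).argmin()
print(labels[nearest])                 # -> intermediate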

There are other algorithms worth studying, such as K-NN (k-nearest neighbours).

3. Reinforcement Learning

In this approach the system adjusts its evaluation rules based on feedback observed from the world. Unlike the previous approaches, where a mass of data is used to build a prediction model or to infer an interesting pattern, in this case the system is essentially built as a stochastic (non-deterministic transition) state machine, where nodes are states of the world and transitions are actions that can be performed leading from one state to another. Transitions have associated "rewards", which are adjusted as the states actually reached (non-deterministically) are compared with the expected states.
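The answer does not name a specific algorithm, but a common concrete instance of this idea is Q-learning, where a table of state/action values is adjusted towards the rewards actually observed; a minimal sketch with an invented toy world:

import random

# Invented toy world: 3 states, 2 actions, stochastic transitions.
n_states, n_actions = 3, 2
rewards = [[0.0, 1.0], [0.0, 0.0], [5.0, -1.0]]   # rewards[state][action]

def step(state, action):
    # Non-deterministic transition: usually advances to the next state, sometimes stays
    next_state = min(state + 1, n_states - 1) if random.random() < 0.8 else state
    return next_state, rewards[state][action]

Q = [[0.0] * n_actions for _ in range(n_states)]  # value of each (state, action)
alpha, gamma = 0.1, 0.9                           # learning rate and discount (invented)

state = 0
for _ in range(1000):
    action = random.randrange(n_actions)          # explore by acting at random
    next_state, reward = step(state, action)
    # Adjust the value of (state, action) towards the observed reward plus the
    # discounted value of the state that was actually reached
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = 0 if next_state == n_states - 1 else next_state  # restart at the last state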

I admittedly have very little (almost no, actually) experience with this approach, so my explanation is quite simplistic. Anyway, everything I have explained so far definitely does not cover everything you can study in machine learning. For example, there are many approaches that are inherently probabilistic, such as particle filters, which can be understood as reinforcement learning in the sense that the current state is constantly refined based on the adjustment of probabilities.

  • 1

    Thanks for the great (and literally big) reply. I'll have to set aside some time to read it calmly ;)

  • Oh, you're welcome. Sorry if it's too long.

  • 1

    The irony of artificial intelligence: I put your text into Google Translate so it could read it to me.

  • And it worked, right? :)

  • 1

    +1 for the "Aprendizado Computacional" (computer learning) translation option. "Aprendizado de Máquina", at least for me, sounds too literal and strange.

  • +1 for the excellent answer.


8

What is Machine Learning?

According to Wikipedia, it is the set of algorithms and techniques that allow the computer to learn, that is, that allow the computer to improve its performance at some task.

Is it related to (or is it a form of) Artificial Intelligence?

Yes, it is. It is a sub-area (or sub-field) of artificial intelligence.

What is a simple example we can cite on this subject?

I believe the simplest example is the functioning of a perceptron, or binary classifier (artificial neuron):

X1 ----- P1 -------          +----------------------------+
                    \        !  const threshold = 0.5;    !
X2 ----- P2 -----    \       !                            !
                  \   \      !  sum = ∑ (Xi*Pi)           !
                    ---O-----+  if (sum > threshold)      +-------> output
                  /   /      !    output = 1;             !
X3 ----- P3 -----    /       !  else                      !
                    /        !     output = 0;            !
X4 ----- P4 -------          +----------------------------+



The perceptron "learns" through training as follows (a very simplified explanation):

Phase 1 - feedforward:

  • The signals X1, X2 ... Xn are the input signals.

  • These signals are multiplied by the weights (P1, P2, .... Pn)

  • The sum of Xi*Pi is computed and this value is compared with a threshold or bias (boundary)

  • If it is greater than or equal, the output is 1

  • If it is smaller, the output is 0


Phase 2 - backpropagation (this is where learning takes place):

  • The "error": the difference between the expected output and the perceptron output (SE - SP)

  • The "correction": the error is multiplied by a learning rate (a previously defined constant, e.g.: 0.01)

  • The calculated "correction" is added to each weight (e.g.: P1 = P1 + correction; P2 = P2 + correction, ...)

  • All the weights are updated and the process goes back to Phase 1, until the error is zero (or small enough to consider that the network has "learned")

After repeating this process a (very large) number of times, if the weights converge (the error tends to zero), it is because the perceptron has "learned" (a code sketch of this loop appears after the spreadsheet example below).


See it working here


In this example spreadsheet, the perceptron learns the behavior of an "OR" logic gate.

Learning occurs in the fifth training iteration, when it "understands" what the output should be.

The values (weights and learning rate) were adjusted so that learning takes place more slowly, to facilitate the understanding of the algorithm.

Since it is a very simple example, if the learning rate has a higher value the perceptron "learns" as early as the second iteration.
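For those who prefer code to a spreadsheet, here is a minimal Python sketch of the same training loop for the OR gate (the initial weights and learning rate are assumptions of mine, as is the usual detail of multiplying the correction by the respective input Xi):

# Training data for the OR gate: inputs (X1, X2) and expected output (SE)
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights = [0.0, 0.0]       # P1, P2 (assumed initial values)
threshold = 0.5            # as in the diagram above
learning_rate = 0.1        # assumed constant

for iteration in range(1, 101):
    total_error = 0
    for x, expected in data:
        # Phase 1 (feedforward): weighted sum compared with the threshold
        s = sum(xi * pi for xi, pi in zip(x, weights))
        output = 1 if s > threshold else 0
        # Phase 2 (learning): error = SE - SP; correction added to the weights
        error = expected - output
        total_error += abs(error)
        weights = [pi + learning_rate * error * xi for pi, xi in zip(weights, x)]
    if total_error == 0:   # the weights converged: the perceptron "learned" OR
        print(f"learned after {iteration} iterations, weights = {weights}")
        break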

To learn more about the subject, I recommend the books and links indicated in the "References" area of Wikipedia in English:

Wikipedia - Machine Learning

Other Examples of Machine Learning Algorithms (in English):

Naive Bayes classifier

Random Forest

Support vector machine

1

Machine learning is one of the most relevant sub-areas of Artificial Intelligence (AI); it arose from the idea of creating programs that learn a certain behavior or pattern automatically from examples or observations. You can read more about Machine Learning and some of its concepts in the post "Machine Learning and Azure".

Now, to get a better idea of how it works and see an example that is easy to understand, I suggest reading the post "Creating a predictive model in Azure Machine Learning".

These two posts will give you a good idea on this subject!

1

Machine learning is a data analysis method that automates the development of analytical models. Using algorithms that learn iteratively from data, machine learning allows computers to find hidden insights without being explicitly programmed to look for something specific.

0

Machine learning is a sub-field of artificial intelligence dedicated to the development of algorithms and techniques that enable the computer to learn, that is, that allow the computer to improve its performance at some task. Whereas in artificial intelligence there are two types of reasoning - inductive, which extracts rules and patterns from large data sets, and deductive - machine learning only concerns itself with the inductive.

Some parts of machine learning are closely linked to data mining and statistics. Its research focuses on the properties of statistical methods, as well as on their computational complexity.

Its practical applications include:

  1. recognition of purchase or consumption patterns
  2. natural language processing
  3. search engines
  4. medical diagnosis
  5. bioinformatics
  6. speech recognition
  7. handwriting recognition
  8. computer vision
  9. robot locomotion

Source: Wikipédia
