All the examples cited by @Victorstafusa in their answer are complex games. Let's look at Block Breaker, since it was the example in the question.
To define an intelligent agent, you must determine its PEAS:
- P: performance
- E: environment
- A: actuators
- S: sensors
An equivalent model is PAGE, though it has ontological limitations that PEAS handles better (I sketch both in code after this list):
- P: percepts
- A: actions
- G: goals
- E: environment
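As a minimal sketch (the names here are my own, not from any library), a PEAS description can be written down as plain data:

```python
from dataclasses import dataclass

# Hypothetical illustration: a PEAS description as plain data.
@dataclass
class PEAS:
    performance: str  # how success is measured
    environment: str  # where the agent acts
    actuators: str    # how the agent acts on the environment
    sensors: str      # how the agent perceives the environment

block_breaker = PEAS(
    performance="game score (points, lives)",
    environment="the Block Breaker playing field",
    actuators="paddle movement",
    sensors="positions of the ball, paddle and blocks",
)
```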
Have you noticed that both models require you to determine the environment? So what is the environment of Block Breaker?
I will set aside the question of multiple environments, since here the agent acts in a single environment.
To begin describing the environment for the agent, we need to pin down at least the following properties (sketched in code after this list):
- dynamism
is the environment dynamic or static? That is, does its internal state change independently of the agent?
- determinism
is the environment deterministic or stochastic? Is there any possibility of a random event changing its internal state? Note: chaos (as in chaos theory, the butterfly effect) is not random, just hard to predict
- number of agents
how many intelligent agents are acting in this environment?
- observability
is the environment fully observable, or does the agent have only partial knowledge of it?
- continuity of time
are the agent's actions episodic? Or does an action have consequences on the internal state of the system that must be taken into account by a future action?
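To make the checklist concrete, here is a sketch of those five properties as a data structure (the naming is mine, assuming nothing beyond the list above):

```python
from dataclasses import dataclass

# Hypothetical illustration: the five environment properties above.
@dataclass
class EnvironmentProperties:
    dynamic: bool           # does the state change without the agent acting?
    deterministic: bool     # is the next state fully set by the current one?
    num_agents: int         # how many intelligent agents act here?
    fully_observable: bool  # can the agent see the whole state?
    episodic: bool          # or do actions have future consequences?
```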
Answering for Block Breaker (with a code sketch after this list):
- it is a dynamic environment: it evolves continuously in time; this follows from the game's mechanics, since it is an action game requiring real-time intervention
- as far as I can tell, it is a deterministic environment: given a configuration, the next step is already determined; but it can be stochastic if the collision physics allows two identical collisions to produce distinct results (I have seen this in Voodoo's Idle Balls game for iPhone); it could also be stochastic if blocks appeared on the map without agent intervention
- it is a typical single-agent system: there is only one agent in the environment
- it is fully observable by the agent: there is no hidden information; there may be information that is presented but not yet understood by the agent (for example, gray blocks require three collisions to break, golden blocks never break, a blue block is worth more points than a red one), but that does not mean the information isn't there
- it is not episodic: an action interferes with the internal state of the environment
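Filling in that structure for Block Breaker, continuing the sketch above:

```python
block_breaker_env = EnvironmentProperties(
    dynamic=True,           # evolves in real time
    deterministic=True,     # assuming non-random collision physics
    num_agents=1,           # single-agent system
    fully_observable=True,  # no hidden information
    episodic=False,         # actions affect future states
)
```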
Well, that covers the basic characteristics of the environment. Done? Well, not yet...
The environment is also governed by a set of "laws". When we are talking about games, we call these "laws" "mechanics". The mechanics determine how the universe evolves by itself (it is a dynamic environment, not a static one) and also how you can interact with the game.
We have four main mechanics in this game (see the sketch after this list):
- uniform rectilinear motion
the ball follows its course in a straight line at constant speed
- collisions
when the ball comes into contact with something (wall, block, paddle), there is a collision that changes the state of the ball and of the object it hit
- ball leak
if the ball passes the bottom edge, it has leaked out of play
- paddle movement
the player decides how to move the paddle
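Here is a rough sketch of how those four mechanics might look in code. The `game` container, the field bounds and the collision test are all assumptions of mine, and `Block.on_hit` and `on_ball_leak` belong to the rules layer sketched right after:

```python
LEFT, RIGHT, BOTTOM, TOP = 0.0, 800.0, 0.0, 600.0  # assumed field bounds

def hits(ball, rect):
    # Crude circle-vs-rectangle overlap test (rect.x, rect.y is the center).
    nearest_x = max(rect.x - rect.width / 2, min(ball.x, rect.x + rect.width / 2))
    nearest_y = max(rect.y - rect.height / 2, min(ball.y, rect.y + rect.height / 2))
    return (ball.x - nearest_x) ** 2 + (ball.y - nearest_y) ** 2 <= ball.radius ** 2

def step(game, dt):
    ball = game.ball
    # 1. Uniform rectilinear motion: the ball follows its course.
    ball.x += ball.vx * dt
    ball.y += ball.vy * dt
    # 2. Collisions change the state of the ball and of what it hit
    # (a real engine would also resolve penetration and pick the hit side).
    if ball.x - ball.radius < LEFT or ball.x + ball.radius > RIGHT:
        ball.vx = -ball.vx
    if ball.y + ball.radius > TOP:
        ball.vy = -ball.vy
    if hits(ball, game.paddle):
        ball.vy = -ball.vy
    for block in game.blocks:
        if not block.broken and hits(ball, block):
            ball.vy = -ball.vy
            game.score += block.on_hit()  # the rules decide what happens
    # 3. Ball leak: the ball passed the bottom edge.
    if ball.y - ball.radius < BOTTOM:
        on_ball_leak(game)
    # 4. Paddle movement is driven by the agent's action, handled elsewhere.
```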
On top of the mechanics we have the rules. For example, a ball leak implies losing a life (or losing points). Gray blocks withstand two collisions and only break on the third. The paddle moves continuously, limited to a maximum speed, or else it makes magical jumps between positions. The ball gains or loses speed when colliding with a moving paddle. Breaking blocks generates points. There are other possible rules, but they are beside the point here.
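Continuing the sketch, the rules layer could look like this (hit counts and point values are invented for illustration; position and size fields are omitted for brevity):

```python
# Hypothetical rule layer on top of the mechanics.
class Block:
    def __init__(self, color):
        self.color = color
        self.hits_taken = 0
        self.broken = False

    def on_hit(self):
        """Returns the points awarded by this collision."""
        self.hits_taken += 1
        required = 3 if self.color == "gray" else 1  # gray breaks on the third hit
        if self.hits_taken >= required:
            self.broken = True
            return {"blue": 30, "red": 10}.get(self.color, 10)  # invented values
        return 0

def on_ball_leak(game):
    game.lives -= 1  # or subtract points, depending on the chosen rules
```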
Hmmm, did you notice that while describing the rules of the game we touched on another aspect of the environment? These are the "elements" of the environment, the "objects" contained in it. These objects fall into the following classes:
- ball
- paddle
- block
- wall
- bottomless pit
These classes are important because they are what the agent will perceive. The agent's percepts will be relative to each object and its intrinsic properties, which depend on the class (sketched in code after the list below).
- ball
this object has a center position (x, y), a velocity (direction and magnitude) and a radius
- paddle
this object has a center position (x, y) and dimensions from which to compute its four vertices; it may also have a maximum speed if it is specified that it does not jump between positions
- block
this object has a center position (x, y) and dimensions from which to compute its four vertices; it also has a color, with possible implications for the rules
- wall
this object sits on the top and side edges of the environment and carries no other relevant information
- bottomless pit
this object sits at the bottom edge of the game and carries no other relevant information
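As a minimal sketch of that percept ontology (field names are my own, assuming exactly the properties listed above):

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical percept ontology: what the agent sees of each class.
@dataclass
class BallPercept:
    x: float          # center position
    y: float
    vx: float         # velocity: direction and magnitude
    vy: float
    radius: float

@dataclass
class PaddlePercept:
    x: float
    y: float
    width: float      # enough to derive the four vertices
    height: float
    max_speed: Optional[float] = None  # None if the paddle may jump positions

@dataclass
class BlockPercept:
    x: float
    y: float
    width: float
    height: float
    color: str        # may matter to the rules

@dataclass
class EdgePercept:
    edge: str  # "top"/"left"/"right" for walls, "bottom" for the pit;
               # no other relevant information
```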
And what are the agent's possible actuations? Well, just moving the paddle. But this can happen in two ways (sketched in code after this list):
- move toward a given side at speed X (perhaps X can be chosen by the agent, perhaps it is always the maximum speed)
- move to a specific position (if a maximum speed is defined, it moves at that speed; otherwise it is teleportation)
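Those two actuation styles could be expressed as action types like these (illustrative names, not from any library):

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical action types available to the agent.
@dataclass
class MoveAtSpeed:
    direction: int  # -1 = left, +1 = right
    speed: float    # possibly fixed at the paddle's maximum speed

@dataclass
class MoveToPosition:
    x: float  # reached at max speed if one is defined, otherwise a teleport

Action = Union[MoveAtSpeed, MoveToPosition]
```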
And how do I measure the agent's performance? Well, we can start with the game's score.
In the specific case of Block Breaker, actions do not influence performance directly: no matter how much you move the paddle, it only means something when the paddle meets the ball. So the agent needs to plan its actions knowing that the reward is not immediate. This is particularly challenging if the agent knows as little of the environment's ontology as you are requesting. A learning function based on a quasi-Pavlovian action/reward or action/punishment model does not work for this type of environment (especially if the ball speed is low).
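One common way to handle non-immediate reward (a standard reinforcement-learning idea, offered here only as a sketch; the answer itself does not prescribe it) is the discounted return, which credits an action with the discounted sum of the rewards that follow it:

```python
# Sketch: discounted return, the usual device for delayed rewards.
def discounted_return(rewards, gamma=0.99):
    """Value of a trajectory: r0 + gamma*r1 + gamma^2*r2 + ..."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total

# A block broken several steps after the paddle hit still credits that hit:
print(discounted_return([0, 0, 0, 10]))  # ~9.7 with gamma=0.99
```

The parameter gamma controls how far into the future credit propagates: the closer to 1, the more the agent values distant block breaks relative to immediate ones.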
Are you talking about genetic algorithms?
– Victor Stafusa
@Victorstafusa Yes, I want the algorithm to learn by itself how it wants to play. I understand more or less how genetic algorithms work; I just don't quite know how to fit one into the code.
– Lucas Caresia
@Lucascarezia unsupervised learning does not have to use a genetic algorithm. It can be a utility-based agent. There are neural networks that learn without supervision
– Jefferson Quesado
If you want to study the subject further, people have managed to make an AI that plays Mario. I believe they shared how they did it. If I'm not mistaken, they modeled a utility-based agent
– Jefferson Quesado
Exactly, I don't want it to be supervised. But my question is how to do this with the game and the information I have at the moment, or perhaps what additional information I need to make it possible.
– Lucas Caresia