Spaces
Now that you've seen how state describes the internal world of your environment, we'll move on to our next concept and talk about the interface an agent sees. That's what spaces are for.
What is a Space?
A space is a contract that describes the shape, bounds, and dtype of the observations and actions flowing in and out of your environment.
You can think of it like a type signature for RL environment data that tells the agent what it can see (observations) and how it can act (actions).
Every environment must have the following:
- Observation space - what the agent sees (env.observation_space)
- Action space - how the agent interacts with the environment (env.action_space)
Why Use Them?
For three reasons:
- Agents can easily shape their policy networks - a policy needs to know the action space to build its output head, and the observation space to build its input layer.
- We can easily expand environments using wrappers - certain Envrax built-in wrappers like GrayscaleObservation check that the observation input is uint8[H, W, 3]. Without spaces, we'd have to add extra logic within our training loop.
- You can catch bugs early - if your env claims Box(0, 1, (4,)) but actually returns shape (3,), tests can verify the contract in seconds.
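The third point can be sketched as a tiny contract test. This is a minimal illustration using only Python's standard library: BoxClaim and check_observation are hypothetical names, not part of Envrax, and the check only handles flat, one-dimensional observations.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class BoxClaim:
    """Hypothetical stand-in for a declared Box(low, high, shape)."""

    low: float
    high: float
    shape: Tuple[int, ...]


def check_observation(claim: BoxClaim, obs: Sequence[float]) -> List[str]:
    """Return the contract violations found (an empty list means obs conforms)."""
    problems = []
    actual_shape = (len(obs),)
    if actual_shape != claim.shape:
        problems.append(f"shape {actual_shape} != declared {claim.shape}")
    if any(not (claim.low <= v <= claim.high) for v in obs):
        problems.append("value outside declared bounds")
    return problems


claim = BoxClaim(low=0.0, high=1.0, shape=(4,))
good = check_observation(claim, [0.1, 0.5, 0.9, 0.0])
bad = check_observation(claim, [0.1, 0.5, 0.9])  # one element short

print(good)  # no violations
print(bad)   # reports the shape mismatch
```

A check like this runs in a unit test without any training loop, which is why a declared space makes such mismatches cheap to find.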
Built-In Spaces
Envrax ships with three space types: Discrete, Box and MultiDiscrete.
All of them implement three methods from the Space contract:
- sample(rng) - draw a random element
- contains(x) - check if x is a valid element of the space
- batch(n) - return a batched version of the space with a leading dimension n
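One way to picture this contract is as an abstract base class. The sketch below is an assumption-laden illustration rather than Envrax's implementation: SpaceSketch, BatchedSketch, and CoinSketch are hypothetical names, random.Random stands in for whatever RNG argument the real sample(rng) expects, and a batched element is modelled as a plain list of length n instead of an array.

```python
import random
from abc import ABC, abstractmethod
from typing import Any, List


class SpaceSketch(ABC):
    """Hypothetical stand-in for the Space contract (not Envrax's class)."""

    @abstractmethod
    def sample(self, rng: random.Random) -> Any:
        """Draw a random element of the space."""

    @abstractmethod
    def contains(self, x: Any) -> bool:
        """Return True if x is a valid element of the space."""

    def batch(self, n: int) -> "SpaceSketch":
        """Return a space whose elements carry a leading dimension n."""
        return BatchedSketch(self, n)


class BatchedSketch(SpaceSketch):
    """n independent elements of a base space, held in a list of length n."""

    def __init__(self, base: SpaceSketch, n: int) -> None:
        self.base = base
        self.n = n

    def sample(self, rng: random.Random) -> List[Any]:
        return [self.base.sample(rng) for _ in range(self.n)]

    def contains(self, x: Any) -> bool:
        return (
            isinstance(x, list)
            and len(x) == self.n
            and all(self.base.contains(v) for v in x)
        )


class CoinSketch(SpaceSketch):
    """Toy space with exactly two elements: 0 and 1."""

    def sample(self, rng: random.Random) -> int:
        return rng.randrange(2)

    def contains(self, x: Any) -> bool:
        return isinstance(x, int) and x in (0, 1)


rng = random.Random(0)
coin = CoinSketch()
flips = coin.batch(5).sample(rng)  # a list of five 0/1 values
```

Because every space answers the same three questions, code such as a test helper or a vectorised environment can work with any space without knowing its concrete type.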
Discrete
Discrete spaces are among the simplest available and are commonly used when the agent or environment deals with a single choice from a finite set of options.
Here are some example use cases:
- Action space - agent moves with 4 movements: [up, right, down, left]
- Observation space - environment is a 4x4 grid world of indices [x, y]
We can make one like so:
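As a minimal sketch of the 4-movement action space, assuming Discrete(n) covers the integers 0 to n - 1 as in Gym-style APIs: DiscreteSketch below is a hypothetical standard-library stand-in rather than Envrax's own class, random.Random stands in for whatever RNG argument Envrax's sample(rng) expects, and batch(n) is left out for brevity.

```python
import random


class DiscreteSketch:
    """Hypothetical stand-in for Discrete(n): the integers 0, 1, ..., n - 1."""

    def __init__(self, n: int) -> None:
        if n <= 0:
            raise ValueError("n must be positive")
        self.n = n  # plain Python int, so it can size loops, lists, and shapes

    def sample(self, rng: random.Random) -> int:
        # randrange(n) draws uniformly from 0..n-1.
        return rng.randrange(self.n)

    def contains(self, x) -> bool:
        return isinstance(x, int) and 0 <= x < self.n


MOVES = ["up", "right", "down", "left"]
action_space = DiscreteSketch(len(MOVES))

rng = random.Random(42)
action = action_space.sample(rng)

print(MOVES[action])                 # name of the sampled move
print(action_space.contains(action))
print(action_space.contains(4))      # 4 is out of range for n = 4
```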
Because n is a static Python int, you can use it directly in shape declarations or jnp.arange(space.n) without issues.
Box
Box spaces are another common type, often used for continuous-valued observations or actions with per-dimension bounds.
Compared with Discrete, which enumerates a finite set of choices, Box describes bounded continuous ranges.
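To make per-dimension bounds concrete, here is a minimal sketch built on Python's standard library. BoxSketch is a hypothetical one-dimensional stand-in and not Envrax's Box: it takes one low and one high value per dimension, treats both bounds as inclusive (an assumption), and uses random.Random in place of the real RNG argument.

```python
import random
from typing import List, Sequence


class BoxSketch:
    """Hypothetical 1-D stand-in for a Box with per-dimension bounds."""

    def __init__(self, low: Sequence[float], high: Sequence[float]) -> None:
        if len(low) != len(high):
            raise ValueError("low and high must have the same length")
        if any(lo > hi for lo, hi in zip(low, high)):
            raise ValueError("each low must be <= the matching high")
        self.low = list(low)
        self.high = list(high)
        self.shape = (len(self.low),)

    def sample(self, rng: random.Random) -> List[float]:
        # One uniform draw per dimension, each within its own bounds.
        return [rng.uniform(lo, hi) for lo, hi in zip(self.low, self.high)]

    def contains(self, x: Sequence[float]) -> bool:
        return len(x) == len(self.low) and all(
            lo <= v <= hi for lo, v, hi in zip(self.low, x, self.high)
        )


# Position in [-1, 1] and velocity in [-5, 5].
obs_space = BoxSketch(low=[-1.0, -5.0], high=[1.0, 5.0])
obs = obs_space.sample(random.Random(0))

print(obs_space.shape)
print(obs_space.contains(obs))
print(obs_space.contains([0.0, 6.0]))  # velocity exceeds its upper bound
```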
Integer dtypes are also supported:
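As a rough illustration of an integer-typed Box, the sketch below models a 2x2 grayscale patch with uint8-style pixel values. It uses nested Python lists instead of arrays, the helper names are hypothetical, and treating both bounds as inclusive is an assumption rather than documented Envrax behaviour.

```python
import random

# Hypothetical stand-in for an integer-typed Box: whole-number pixels in
# [0, 255], mirroring the value range of a uint8 image.
LOW, HIGH, SHAPE = 0, 255, (2, 2)


def sample_pixels(rng: random.Random):
    rows, cols = SHAPE
    # randint is inclusive on both ends, so 255 is a possible value.
    return [[rng.randint(LOW, HIGH) for _ in range(cols)] for _ in range(rows)]


def contains_pixels(x) -> bool:
    rows, cols = SHAPE
    return (
        len(x) == rows
        and all(len(row) == cols for row in x)
        and all(isinstance(v, int) and LOW <= v <= HIGH for row in x for v in row)
    )


patch = sample_pixels(random.Random(1))

print(contains_pixels(patch))
print(contains_pixels([[0, 256], [0, 0]]))  # 256 does not fit the declared range
```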
MultiDiscrete
MultiDiscrete is less common than the others and is used when an action is a vector of independent discrete choices, e.g., a game pad with a directional stick (4 options) and two buttons (with 2 options each):
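A minimal sketch of that gamepad, written with Python's standard library. MultiDiscreteSketch is a hypothetical stand-in, not Envrax's MultiDiscrete; nvec lists the number of options per component, and random.Random takes the place of the real RNG argument.

```python
import random
from typing import List, Sequence


class MultiDiscreteSketch:
    """Hypothetical stand-in: one independent discrete choice per entry of nvec."""

    def __init__(self, nvec: Sequence[int]) -> None:
        if any(n <= 0 for n in nvec):
            raise ValueError("every entry of nvec must be positive")
        self.nvec = list(nvec)

    def sample(self, rng: random.Random) -> List[int]:
        # Component i is drawn independently from 0..nvec[i]-1.
        return [rng.randrange(n) for n in self.nvec]

    def contains(self, x: Sequence[int]) -> bool:
        return len(x) == len(self.nvec) and all(
            isinstance(v, int) and 0 <= v < n for v, n in zip(x, self.nvec)
        )


# Directional stick (4 options) plus two buttons (2 options each).
gamepad = MultiDiscreteSketch([4, 2, 2])
action = gamepad.sample(random.Random(7))
stick, button_a, button_b = action

print(gamepad.contains(action))
print(gamepad.contains([4, 0, 0]))  # stick value 4 is out of range
```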
Each element i of the sampled action satisfies 0 <= action[i] < nvec[i].
Picking the Right Space
A quick decision table:
| Your data is... | Space |
|---|---|
| One categorical choice ("up", "down", ...) | Discrete |
| A continuous array (positions, velocities, pixels) | Box |
| A vector of independent categorical choices | MultiDiscrete |
If none fit, you're probably modelling something more exotic (e.g. a Tuple or Dict) that Envrax doesn't currently support. In this case, there are two options:
- Encode it as a flat Box or MultiDiscrete and decode it yourself inside your environment.
- Build your own by subclassing Space and implementing sample / contains / batch. You can learn more about this in the advanced tutorial - Creating a Custom Space.
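The first option can be sketched as a pair of encode and decode helpers. The composite observation here (a grid position plus a door state) is a made-up example, and encode, decode, NVEC, and DOOR_STATES are hypothetical names. Since every field is a categorical choice, the flat form would fit a MultiDiscrete with nvec = [4, 4, 2].

```python
from typing import Any, Dict, List

# Hypothetical composite observation: a position on a 4x4 grid plus a door state.
NVEC = [4, 4, 2]  # options per flat component: x, y, door
DOOR_STATES = ["closed", "open"]


def encode(obs: Dict[str, Any]) -> List[int]:
    """Flatten the structured observation into one integer per component."""
    x, y = obs["position"]
    return [x, y, DOOR_STATES.index(obs["door"])]


def decode(flat: List[int]) -> Dict[str, Any]:
    """Rebuild the structured observation from its flat form."""
    x, y, door = flat
    return {"position": (x, y), "door": DOOR_STATES[door]}


original = {"position": (2, 3), "door": "open"}
flat = encode(original)
restored = decode(flat)

print(flat)                  # [2, 3, 1]
print(restored == original)  # True
```

Keeping the encode and decode pair next to each other makes it easy to test that one is the inverse of the other.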
Recap
And that's that! Nice job! Let's quickly recap:
- Spaces are contracts that describe the shape, bounds, and dtype of observations and actions
- Use Discrete(n) for a single categorical choice
- Use Box(low, high, shape, dtype) for continuous arrays or images
- Use MultiDiscrete(nvec) for a vector of independent categorical choices
All three Space methods — sample(rng), contains(x), and batch(n) — are available on every space, ready for use in testing, wrappers, and VecEnv.
Next Steps
Two foundational pieces down. Next up: how environments hold their static, per-env settings via EnvConfig!
- Environment Configuration - learn how to extend EnvConfig with your own static fields and how it differs from EnvState.