Autonomous Navigation and SLAM: How Robots Find Their Way

At UDHY – Explore Autonomous Navigation and SLAM (Simultaneous Localisation and Mapping ), we help learners grasp core concepts of how robots perceive, map, and navigate complex environments.

In this section, you will learn: Autonomous Navigation and SLAM: How Robots Find Their Way

Master SLAM and sensor fusion to help robots map, localize, and navigate safely — advanced robotics skills for real‑world autonomous systems.
Prerequisites: UDHY Deep Learning for Robotics · UDHY Advanced Robotics Course · ROS 2 basics (Jazzy Jalisco)

⏱ 12–16 hours · Self-paced 📋 5 modules 💻 3 code samples ✅ Free at UDHY.com

TL;DR — Quick Insights

SLAM is how robots answer “where am I?” in environments they have never seen before — without Global Positioning System (GPS), without pre-built maps.
Every autonomous vehicle, warehouse robot, and delivery drone uses SLAM or a derivative as the foundation of its navigation system.
The Nav2 stack in ROS 2 provides production-ready autonomous navigation tools used in commercial robots today — and you will implement it in this course.
Path planning algorithms (A*, RRT*, Dijkstra) determine not just where to go but how to get there safely — understanding these is essential for any robotics engineer.

Table Of Contents

Introduction
Module 1: The Navigation Problem — What Does "Getting From A to B" Really Require?
Module 2: SLAM — Simultaneous Localisation and Mapping
Module 3: Path Planning — Finding the Optimal Route
Module 4: ROS 2 Navigation Stack — Nav2 in Practice
Module 5: Learning-Based Navigation — Where Deep Learning Meets SLAM
What You Have Learned
References

Introduction

I have been in this industry for more than a decade — co-founding Moovita, Singapore’s first autonomous vehicle company, and spending years as a Principal Research Scientist at A*STAR developing navigation systems for autonomous vehicles. Navigation is where I have spent the most time — it is simultaneously one of the most beautifully mathematical problems in robotics and one of the most practically consequential.

Autonomous robot using SLAM and sensor fusion for safe navigation. — Robots rely on SLAM and sensor fusion to map and navigate safely.

Consider what a delivery robot in a hospital corridor must do: enter a building it has never been in before, build a map of the environment while simultaneously figuring out where it is within that map, plan a path to the pharmacy on the third floor, avoid doctors and patients in real time, recover when someone moves a trolley that was not there yesterday, and arrive at the right room without a single GPS signal.

Every one of those tasks requires a different component of what roboticists call the navigation stack. This course teaches you every layer — from the mathematics of SLAM to the implementation of ROS 2 Nav2 to the deep learning techniques that are transforming classical navigation in 2026.

This course completes the advanced AI trilogy. Deep Learning for Robotics gave the robot perception — the ability to see and understand the world. Reinforcement Learning for Robotics gave it the ability to learn how to act. This course gives it the ability to go somewhere — to navigate purposefully through the physical world.

Module 1: The Navigation Problem — What Does “Getting From A to B” Really Require?

1.1 The Four Core Navigation Challenges

Navigation in robotics is not a single problem — it is four interrelated problems that must be solved simultaneously:

Localisation: Where am I? The robot must know its position and orientation in the world with sufficient precision to make useful decisions. GPS works outdoors with 3–5 metre accuracy — too imprecise for indoor robots or the centimetre-level accuracy needed for autonomous vehicles. Indoor robots typically use LiDAR-based SLAM, visual odometry, or Ultra-Wideband (UWB) radio positioning.

Mapping: What does the environment look like? The robot must build and maintain a representation of its surroundings — obstacles, free space, landmarks — that is accurate enough to plan safe paths.

Path Planning: Given where I am and where I want to go, what is the optimal route? This involves global planning (the overall route from current position to goal) and local planning (adjusting moment-by-moment to avoid dynamic obstacles).

Motion Control: Given a planned path, how do I execute it precisely? This involves converting high-level waypoints into low-level motor commands — joint torques, wheel velocities, gripper forces — that physically move the robot along the planned path.

1.2 Why Classical Navigation Is Not Enough

Classical navigation relied on pre-built maps, precise localisation sensors, and deterministic environments. In a factory with fixed walls and predictable robot paths, this works. In the real world — hospitals, warehouses, outdoor environments — it fails:

Environments change: furniture moves, doors open and close, construction happens
Pre-built maps go stale the moment the environment changes
GPS fails indoors and in urban canyons
Static planners cannot avoid pedestrians, other robots, or unexpected obstacles

This is why SLAM, combined with deep learning-based semantic understanding, has become the foundation of modern robot navigation. As we explored in Why Self-Driving Cars Still Fail, the challenge is not building a robot that works in a known environment — it is building one that works when the environment does not match expectations.

Think about it: Design the navigation system for an autonomous delivery robot navigating from a loading dock to an apartment door in a residential building. What sensing, mapping, and planning components would it need? Where would it most likely fail?

Module 2: SLAM — Simultaneous Localisation and Mapping

2.1 The Chicken-and-Egg Problem

SLAM solves what is fundamentally a circular dependency:

To localise (know where I am), I need a map
To build a map, I need to know where I am

SLAM solves both simultaneously — building the map and estimating position within it at the same time. This is a remarkable computational achievement, and it is why SLAM was considered unsolvable for decades before probabilistic methods cracked it in the 1990s.

2.2 How SLAM Works — The Probabilistic Foundation

Modern SLAM represents both the robot’s position and the map as probability distributions, not exact values. At any moment, the robot maintains a belief — a probability distribution over all possible positions it might be in — and updates this belief with each new sensor observation.

The Extended Kalman Filter (EKF-SLAM): The classic approach — maintains a Gaussian distribution over robot pose and landmark positions, updating with each sensor reading. Works well in small environments with clear landmarks. Computationally expensive as map size grows (O(n²) complexity in number of landmarks).

Particle Filter SLAM (FastSLAM): Represents the robot’s position as a set of particles (hypotheses), each carrying its own map estimate. More robust to non-Gaussian noise and better handles the robot kidnapping problem (being moved to an unknown location). Used in early Roomba navigation systems.

Graph-Based SLAM: The modern standard. Represents the robot’s trajectory as a graph of poses connected by edges (sensor measurements). Optimises the entire graph simultaneously to find the globally consistent map and trajectory. Used in all serious production navigation systems.

2.3 Practical Example: LiDAR SLAM with Python

# Required: pip install open3d numpy scipy matplotlib
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import cdist

# ─────────────────────────────────────────────
# Simplified 2D LiDAR SLAM demonstration
# Implements the core scan-matching loop used in
# production SLAM systems like Cartographer and SLAM Toolbox
# ─────────────────────────────────────────────

class SimpleSLAM:
    """
    Demonstrates the core SLAM loop:
    1. Take a LiDAR scan
    2. Match it to the existing map (ICP scan matching)
    3. Update robot position estimate
    4. Add new scan data to the map
    """

    def __init__(self):
        self.map_points = np.empty((0, 2))    # Global map (x,y coordinates)
        self.robot_pose = np.array([0.0, 0.0, 0.0])  # [x, y, theta]
        self.pose_history = [self.robot_pose.copy()]

    def simulate_lidar_scan(self, true_pose, noise_std=0.02):
        """Simulate a LiDAR scan of a rectangular room"""
        angles = np.linspace(0, 2*np.pi, 360)
        room_walls = [
            (0, 0, 10, 0),   # Bottom wall
            (10, 0, 10, 8),  # Right wall
            (10, 8, 0, 8),   # Top wall
            (0, 8, 0, 0),    # Left wall
        ]

        scan_points = []
        for angle in angles:
            ray_dir = np.array([np.cos(angle + true_pose[2]),
                               np.sin(angle + true_pose[2])])
            min_dist = 15.0

            # Find closest wall intersection
            for (x1, y1, x2, y2) in room_walls:
                wall_vec = np.array([x2-x1, y2-y1])
                wall_normal = np.array([-(y2-y1), x2-x1])
                denom = np.dot(ray_dir, wall_normal)
                if abs(denom) > 1e-6:
                    t = np.dot(np.array([x1, y1]) - true_pose[:2], wall_normal) / denom
                    if 0 < t < min_dist:
                        u = np.dot(true_pose[:2] + t*ray_dir - np.array([x1,y1]),
                                  wall_vec) / np.dot(wall_vec, wall_vec)
                        if 0 <= u <= 1:
                            min_dist = t

            # Add measurement noise (simulates real LiDAR uncertainty)
            noisy_dist = min_dist + np.random.normal(0, noise_std)
            scan_points.append([
                true_pose[0] + noisy_dist * np.cos(angle + true_pose[2]),
                true_pose[1] + noisy_dist * np.sin(angle + true_pose[2])
            ])

        return np.array(scan_points)

    def icp_match(self, source, target, max_iterations=50):
        """
        Iterative Closest Point (ICP) scan matching
        Finds the transformation that best aligns source to target
        Core algorithm in all LiDAR-based SLAM systems
        """
        src = source.copy()
        total_rotation = 0.0
        total_translation = np.zeros(2)

        for _ in range(max_iterations):
            if len(target) == 0:
                break

            # Find closest target point for each source point
            distances = cdist(src, target)
            closest_idx = distances.argmin(axis=1)
            matched_target = target[closest_idx]

            # Only use good matches (within 0.5m)
            valid_mask = distances.min(axis=1) < 0.5
            if valid_mask.sum() < 10:
                break

            src_valid = src[valid_mask]
            tgt_valid = matched_target[valid_mask]

            # Compute optimal rotation and translation (SVD)
            src_centroid = src_valid.mean(axis=0)
            tgt_centroid = tgt_valid.mean(axis=0)
            H = (src_valid - src_centroid).T @ (tgt_valid - tgt_centroid)
            U, _, Vt = np.linalg.svd(H)
            R = Vt.T @ U.T
            t = tgt_centroid - R @ src_centroid

            # Apply transformation
            src = (R @ src.T).T + t
            total_rotation += np.arctan2(R[1,0], R[0,0])
            total_translation += t

        return total_translation, total_rotation

    def update(self, true_pose):
        """Full SLAM update step"""
        # Get new scan
        scan = self.simulate_lidar_scan(true_pose)

        # Match scan to existing map (localisation)
        if len(self.map_points) > 100:
            translation, rotation = self.icp_match(scan, self.map_points)
            self.robot_pose[0] += translation[0]
            self.robot_pose[1] += translation[1]
            self.robot_pose[2] += rotation

        # Add scan to map (mapping) — subsample to control density
        new_points = scan[::5]  # Every 5th point
        self.map_points = np.vstack([self.map_points, new_points]) \
            if len(self.map_points) > 0 else new_points
        self.pose_history.append(self.robot_pose.copy())


# Run SLAM simulation
slam = SimpleSLAM()
true_trajectory = [
    [1, 1, 0], [3, 1, 0], [5, 1, 0.2], [7, 2, 0.5],
    [8, 4, 1.2], [7, 6, 2.0], [5, 7, 2.5], [3, 7, 3.0],
    [1, 6, 3.5], [1, 4, 4.2], [1, 2, 4.8]
]

print("Running SLAM...")
for pose in true_trajectory:
    slam.update(pose)
    print(f"True: ({pose[0]:.1f}, {pose[1]:.1f}) | "
          f"Estimated: ({slam.robot_pose[0]:.2f}, {slam.robot_pose[1]:.2f})")

# Visualise the built map and robot trajectory
plt.figure(figsize=(10, 8))
plt.scatter(slam.map_points[:,0], slam.map_points[:,1],
           s=0.5, c='blue', alpha=0.3, label='Built Map')
history = np.array(slam.pose_history)
plt.plot(history[:,0], history[:,1], 'r-o', markersize=4, label='Estimated Path')
true_path = np.array(true_trajectory)
plt.plot(true_path[:,0], true_path[:,1], 'g--', label='True Path')
plt.legend(); plt.title("SLAM: Built Map and Robot Trajectory")
plt.xlabel("X (metres)"); plt.ylabel("Y (metres)")
plt.savefig("slam_result.png"); plt.show()
print("SLAM complete. Map saved.")

What this produces: A visual map of a room built entirely from noisy LiDAR scans, with the robot’s estimated trajectory overlaid. The difference between the green (true) and red (estimated) paths illustrates the localisation error that sophisticated SLAM systems minimise. Real production systems (Google Cartographer, SLAM Toolbox) use the same fundamental ICP + graph optimisation approach, with additional loop closure detection for large-scale environments.

Module 3: Path Planning — Finding the Optimal Route

3.1 The Planning Problem

Given a map and a goal, path planning answers: what sequence of positions should the robot move through to reach the goal safely and efficiently? This is a search problem — finding the lowest-cost path through a graph of possible robot configurations.

3.2 Classical Planning Algorithms

Dijkstra’s Algorithm: Explores all paths from start outward in order of increasing cost. Guaranteed to find the shortest path. Used when the full map is known and computation time is not critical.

A (A-Star):* Dijkstra with a heuristic — it prioritises exploring nodes that appear closer to the goal. Dramatically faster than Dijkstra in practice, while still guaranteed to find the optimal path if the heuristic is admissible (never overestimates). The most widely used path planning algorithm in robotics.

RRT (Rapidly-exploring Random Tree Star):* Samples random configurations in the robot’s state space and connects them into a tree. Particularly powerful for high-dimensional spaces (robot arms with many joints) where grid-based methods are computationally intractable. The algorithm of choice for manipulation planning.

3.3 A* Implementation for Robot Navigation

# Required: pip install numpy matplotlib heapq
import heapq
import numpy as np
import matplotlib.pyplot as plt

def astar(grid, start, goal):
    """
    A* path planning on a 2D occupancy grid
    grid: 2D array where 0=free, 1=obstacle
    start, goal: (row, col) tuples
    Returns: list of (row, col) waypoints from start to goal
    """
    rows, cols = grid.shape
    open_set = [(0, start)]       # (f_score, position)
    came_from = {}
    g_score = {start: 0}         # Cost from start to each node
    f_score = {start: heuristic(start, goal)}

    def heuristic(a, b):
        # Euclidean distance heuristic
        return np.sqrt((a[0]-b[0])**2 + (a[1]-b[1])**2)

    # 8-directional movement (includes diagonals)
    directions = [(-1,0),(1,0),(0,-1),(0,1),
                  (-1,-1),(-1,1),(1,-1),(1,1)]

    while open_set:
        current_f, current = heapq.heappop(open_set)

        if current == goal:
            # Reconstruct path
            path = []
            while current in came_from:
                path.append(current)
                current = came_from[current]
            path.append(start)
            return path[::-1]  # Reverse to get start→goal order

        for dr, dc in directions:
            neighbor = (current[0]+dr, current[1]+dc)
            nr, nc = neighbor

            if not (0 <= nr < rows and 0 <= nc < cols):
                continue  # Out of bounds
            if grid[nr][nc] == 1:
                continue  # Obstacle

            # Diagonal moves cost √2, straight moves cost 1
            move_cost = np.sqrt(2) if abs(dr)+abs(dc) == 2 else 1.0
            tentative_g = g_score[current] + move_cost

            if neighbor not in g_score or tentative_g < g_score[neighbor]:
                came_from[neighbor] = current
                g_score[neighbor] = tentative_g
                f_score[neighbor] = tentative_g + heuristic(neighbor, goal)
                heapq.heappush(open_set, (f_score[neighbor], neighbor))

    return None  # No path found

# Create a test environment — 20×20 grid with obstacles
grid = np.zeros((20, 20), dtype=int)

# Add obstacles (walls and objects the robot must navigate around)
grid[5, 2:15] = 1     # Horizontal wall
grid[2:10, 10] = 1    # Vertical wall
grid[12:18, 5] = 1    # Another vertical obstacle
grid[14, 5:15] = 1    # Another horizontal wall

start = (0, 0)
goal  = (19, 19)

path = astar(grid, start, goal)

if path:
    print(f"Path found: {len(path)} waypoints")
    print(f"Path length: {sum(np.sqrt((path[i+1][0]-path[i][0])**2 + (path[i+1][1]-path[i][1])**2) for i in range(len(path)-1)):.2f} grid units")

    # Visualise
    plt.figure(figsize=(8, 8))
    plt.imshow(grid, cmap='binary')
    path_arr = np.array(path)
    plt.plot(path_arr[:,1], path_arr[:,0], 'b-', linewidth=2, label='A* Path')
    plt.plot(start[1], start[0], 'go', markersize=12, label='Start')
    plt.plot(goal[1], goal[0], 'r*', markersize=15, label='Goal')
    plt.legend(); plt.title("A* Robot Path Planning")
    plt.savefig("astar_path.png"); plt.show()
else:
    print("No path found — goal unreachable")

Module 4: ROS 2 Navigation Stack — Nav2 in Practice

4.1 Nav2 — The Production Navigation Framework

Nav2 (Navigation2) is the ROS 2 navigation stack — a production-ready, modular framework used in commercial robots from Amazon Robotics, Fetch Robotics, and Clearpath Robotics. It implements the complete navigation pipeline:

SLAM Toolbox — online SLAM using graph-based optimisation
AMCL (Adaptive Monte Carlo Localisation) — localisation in a known map using particle filters
Nav2 Planner Server — global path planning (A*, Smac Planner)
Nav2 Controller Server — local trajectory following with obstacle avoidance (DWB, TEB)
Behaviour Trees — high-level task sequencing for complex navigation behaviours

4.2 Setting Up Nav2 for a Differential Drive Robot

# Install Nav2 on Ubuntu 24 with ROS 2 Jazzy Jalisco
sudo apt install ros-jazzy-navigation2 ros-jazzy-nav2-bringup

# Install Turtlebot3 simulation for testing
sudo apt install ros-jazzy-turtlebot3-gazebo ros-jazzy-turtlebot3-navigation2

# Set your robot model
export TURTLEBOT3_MODEL=burger

Module 5: Learning-Based Navigation — Where Deep Learning Meets SLAM

5.1 Beyond Classical Navigation

Classical SLAM and A* work well in static, well-structured environments. They fail when:

The environment changes faster than the map can update
Lighting conditions make sensor data unreliable
The robot needs semantic understanding (“navigate to the meeting room” not “navigate to coordinates 4.2, 7.8”)

Deep learning addresses all three limitations, and this is the frontier of navigation research in 2026.

5.2 Neural SLAM and Semantic Mapping

Neural SLAM replaces the hand-crafted map representation with a learned neural representation. The robot’s map is stored as a latent vector in a neural network — richer and more flexible than a 2D grid, but harder to interpret and debug.

Semantic SLAM adds object-level understanding to the map. Instead of a grid of free/occupied cells, the robot builds a map of objects — “chair at position (2.3, 4.1, 0.0)”, “door at (5.0, 2.2, 0.0)”, “person moving at (3.0, 3.5, 0.0) with velocity (0.3, 0.0)”. This enables natural language navigation instructions and long-term reasoning about the environment.

The connection to what you learned in Course 1 is direct — the YOLO object detector and LiDAR point cloud processor built in UDHY’s Advanced Robotics Course are the perception components that feed a semantic SLAM system. The deep learning pipeline powers the semantic understanding; the SLAM framework provides the geometric consistency.

5.3 End-to-End Navigation with Neural Networks

The most ambitious approach — end-to-end neural navigation — removes the modular pipeline entirely. A single neural network takes sensor inputs (camera, LiDAR, GPS) and a goal specification, and directly outputs motor commands. No explicit SLAM, no explicit planning, no hand-crafted components.

This approach, exemplified by DeepMind’s Navigation Agent (2017) and its successors, learns to navigate purely from experience in simulation. The connection to Course 2 is direct — end-to-end navigation is a reinforcement learning problem: the agent receives a reward for reaching the goal and a penalty for collisions, and learns a navigation policy through millions of simulated episodes.

The tradeoff: end-to-end systems are less interpretable, harder to debug, and struggle with systematic errors. Production systems in 2026 use hybrid approaches — classical SLAM for reliable geometric mapping, neural networks for semantic understanding and difficult perception tasks, and learned planners for handling situations the classical planner cannot handle.

What You Have Learned

By completing this course you can now:

Explain the four core navigation challenges and how they interact
Describe how SLAM solves the localisation-and-mapping circular dependency
Implement ICP scan matching and A* path planning in Python
Configure and command a Nav2-enabled robot in ROS 2
Explain how deep learning extends classical navigation with semantic understanding and end-to-end approaches

You have now completed the UDHY Advanced AI for Robotics trilogy. Your next step is UDHY’s Expert Robotics Course — where VLA models, Physical AI, and production ROS 2 deployment bring everything together into systems that operate in the real world.

1. What is the difference between SLAM and GPS navigation?

GPS provides absolute position on Earth with 3–5 metre accuracy outdoors — it tells you where you are on a global coordinate system. SLAM builds a local map of the immediate environment and localises within it, achieving centimetre-level accuracy but only relative to local landmarks, not global position. GPS fails indoors, in tunnels, and in urban canyons with tall buildings — exactly where robots most need to operate. SLAM works everywhere sensors can perceive the environment.

2. What is the best SLAM library for beginners?

Start with SLAM Toolbox — the standard SLAM package for ROS 2, well-documented and actively maintained. For pure Python learning without ROS, the simplified implementation in this course demonstrates the core concepts. For production: Google Cartographer (graph-based, highly accurate) and ORB-SLAM3 (visual SLAM using cameras only, no LiDAR needed).

3. Do autonomous vehicles use SLAM?

Yes — with important differences from mobile robot SLAM. Autonomous vehicles use HD (High Definition) mapping: pre-built, centimetre-accurate maps created by survey vehicles. At runtime, the AV localises within this pre-built map using LiDAR scan matching — essentially the “L” part of SLAM without the “M” in real-time. This is covered in UDHY’s Sensor Fusion Explained and Why Self-Driving Cars Still Fail.

4. How does Nav2 compare to ROS 1’s move_base?

Nav2 is Nav2 is a complete rewrite of move_base for ROS 2 with four major improvements: behaviour tree-based task execution (more flexible and debuggable than move_base’s state machine), lifecycle management (Nav2 nodes can be paused and restarted cleanly), plugin-based architecture (swap planners and controllers without recompiling), and active community maintenance. All new robotics projects should use Nav2 — move_base is deprecated.

5. What sensors do I need for SLAM on a budget?

The minimum viable SLAM setup: any 2D LiDAR (RPLIDAR A1 at ~$100 is the most popular budget option) plus wheel odometry (built into most differential drive robots). For better performance: 3D LiDAR (Ouster OS0, $1,500+) or depth camera (Intel RealSense D435i, ~$200). Camera-only SLAM (ORB-SLAM3) works with any RGB camera — even a webcam — though accuracy is lower than LiDAR-based approaches. Hardware recommendations are in UDHY’s How to Choose the Right LiDAR.

References

Durrant-Whyte, H. & Bailey, T. (2006). Simultaneous Localisation and Mapping: Part I. IEEE Robotics & Automation Magazine. ieeexplore.ieee.org
Macenski, S. et al. (2023). SLAM Toolbox: SLAM for the Dynamic World. Journal of Open Source Software. joss.theoj.org
Hart, P. et al. (1968). A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Transactions on Systems Science. — Original A* paper.
Macenski, S. et al. (2023). From the Desks of ROS Maintainers: A ROS 2 Nav2 Book. nav2.org
Davison, A. et al. (2007). MonoSLAM: Real-Time Single Camera SLAM. IEEE TPAMI. ieeexplore.ieee.org
Campos, C. et al. (2021). ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial, and Multimap SLAM. IEEE T-RO. arxiv.org
WPI Robotics Engineering. (2025–2026). Advanced Topics: Autonomous Navigation and SLAM. wpi.edu
Deep Robot Perception Course. (2026). Robot Perception and Manipulation. University of Michigan. deeprob.org

Designed by Dr. Dilip Kumar Limbu — Former Principal Research Scientist, ASTAR · Co-Founder, Moovita, Singapore’s first autonomous vehicle company · 30 years building real-world autonomous systems. UDHY.com*