Backtracking: algorithms explained (7 of 8)

Backtracking is an important algorithm for software engineers to know. It can efficiently solve constraint satisfaction problems by eliminating partial solutions that won’t complete to a valid solution.

Below, we’ll explain exactly what backtracking is, how it works, and when and how to implement it. We’ve also included a handy cheat sheet so you can check its space-time complexity at a glance.

Here’s an outline of what we’ll cover:

  1. Backtracking basics
  2. Backtracking cheat sheet
  3. How to prepare for a coding interview

Let's get into it.

1. Backtracking basics

In order to crack questions about backtracking, you’ll need to have a strong understanding of the algorithm, how it works, and when to use it.

1.1 What is backtracking?

Backtracking is a form of brute-force problem solving, but with the ability to discard potential solutions early, before they are fully explored. It is an algorithmic paradigm for incrementally finding solutions to problems. As soon as a candidate will not result in the final complete solution being valid, the algorithm will “backtrack.” Therefore, backtracking can be applied when it is possible to test the validity of partial solutions.

Graph coloring provides us with an example of a backtracking algorithm in action. Suppose we have the following map of the continents and have a problem that requires us to give each continent a color such that no two adjacent continents have the same color, using the fewest colors possible. Such a condition is known as a constraint. We can test partial solutions against this constraint and discard them early if it’s clear the constraint won’t be met. This is the essence of backtracking.

Backtracking map coloring

Here, Europe and Asia should not be allowed to have the same color, and it's possible to color the map using only three colors.

It’s easier to view this problem as an undirected graph, like this:

Backtracking undirected graph

Each node is a continent, and the edges between the nodes represent adjacent continents. Now our problem is coloring each node so that no two adjacent nodes have the same color, using the fewest colors (e.g. EU and AS should have different colors).

The solution starts by giving the American (AM) continent the first color – red. Europe (EU) is next, and since AM is already colored red, EU takes the next available color – blue. The algorithm prunes the search tree here by not considering red for EU, because all complete solutions that color AM and EU both red will be invalid. Therefore, the subsolution coloring both AM and EU red is also invalid.

Asia (AS) is next. AS can take red, since it is not connected to AM and the objective is to use the fewest colors. If no coloring of Africa (AF) and Australia and the Pacific Islands (AU) allows AS to be red, then the algorithm will backtrack to this point, and choose the next available color for AS. In that instance, blue will be skipped, since EU has been colored blue in this partial solution, so green might be next. However, in this example red is a valid color for AS with AF getting the color green and AU the color blue.

Backtracking example

1.1.1 Use cases for backtracking

Backtracking can be applied to the following problems:

  • Most Constraint Satisfaction Problems (CSP) like:
    • The knight’s tour problem
    • The n queens problem
    • Solving sudoku
    • Solving crosswords
    • Graph coloring
  • Most brute-force algorithms
  • String pattern matching
  • To adapt a longest common subsequence (LCS) algorithm to print all the LCS in order
  • Finding all subsets of a set
  • Finding all paths from a source to a destination

        1.1.2 Example implementation

        Backtracking implementations are commonly recursive and similar to depth-first search (DFS), except that only valid subsolutions are expanded in backtracking, whereas DFS will expand all the subsolutions of a node in the search space.

        Our graph coloring results in the following implementation:

        Here backtracking is just a wrapper around our recursive function. It prepares our solution colors to None for each node in the graph. The first node to color is then selected and if the recursive call succeeds, the solution can be returned.

        The recursive call has the base case of current being None, meaning all the nodes have a valid color. Else, every possible color for the current node will be tested by adding it to the solution. If the complete solution is valid then the recursion can end, else backtracking needs to happen by setting the color back to None and the next color or node can be tested.

        There are two helper functions that are specific to each problem being solved with backtracking. The first is select_unassigned to select the next node that needs to be colored. In this implementation, this helper function colors the nodes from 0 to n by finding the first open spot in solution. When no open spot is found, it means all the nodes are colored, so None is returned to stop the recursion. The second helper, get_domain_values, returns valid values for each node at a specific subsolution. For our example, this is where the algorithm should not consider red as a color for EU when AM is already red. This is done by looping through all the available colors and only yielding when a neighboring node does not already have the same color (the break together with the else is used in python to continue the outer loop).

        The variable graph is an adjacency matrix where a graph[i][j] = 1 means that i and j are neighbors and AM is i = 0; EU is 2; AS is 3; AU is 4; and AF is 5.

        Some heuristics can be used on the helper functions to improve performance. For the example above, after AM is red and EU is blue, the only value left for AF is green. But AF is last in the adjacency matrix, so it will be evaluated last. By evaluating AF earlier, the search space can be pruned earlier. Choosing the node with the least possible values is called the minimum-remaining-values (MRV) heuristic, and changes the implementation as follows:

        Here, all the empty spots in solution are again found. But rather than stopping at the first empty spot, all valid colors for a spot are found with their count stored in dict in the first loop. If dict is empty, then all the nodes have a color and the recursion can stop, else the MRV is found using a heap.

        Other heuristics are also possible. One is deciding the very first node to color using the degree heuristic to reduce the branching factor of future choices. The degree heuristic will choose the node with the largest number of constraints first – for our continents example, it is AF, since it has 4 edges while the other nodes have 3 or 2 edges. The degree heuristic can also be used to break a tie in the MRV heuristic.

        Another heuristic is for getting the next color in get_domain_values. In our example, after AM is red and EU is green, it would be a bad idea to give AS blue since that will reduce the valid colors for AF (AF gets blue in the final solution). So trying red for AS first is a better idea by using the least constraining value heuristic.

        The idea with heuristics is to reduce branching by selecting nodes that will fail as early as possible, but to go for a maximum search space depth by selecting values (colors) that will fail as late as possible. Together, these two tactics result in earlier success.

        1.1.3 How backtracking compares to other algorithms

        A brute-force depth-first search (DFS) is similar to backtracking. A brute-force DFS will fully consider all the values (colors) for a node, but backtracking is able to prune its search space by only evaluating values that result in a valid subsolution.

        An algorithm that does not need to backtrack when it hits a dead-end is a greedy algorithm. A greedy algorithm only works for optimization problems with the greedy choice property to render the need for backtracking obsolete. But a greedy algorithm can only return a single solution, while backtracking can return all possible solutions.

        2. Backtracking cheat sheet

        Backtracking cheat sheet

        You can download the cheat sheet here.

        2.1 Related algorithms and techniques

        2.2 Cheat sheet explained

        The cheat sheet above is a summary of information you might need to know for an interview, but it’s usually not enough to simply memorize it. Instead, aim to understand each result so that you can give the answer in context.

        The cheat sheet is broken into time complexity and space complexity (the processing time and memory requirements for backtracking). Backtracking is related to several data structures, which are also included in the cheat sheet.

        For more information about time and space requirements of different algorithms, read our complete guide to big-O notation and complexity analysis.

        2.2.1 Time complexity

        In our graph coloring example, the worst case will give each node its own color. Therefore, let m = |V| be the number of colors. This is also our branching factor at each node in the search space. Since every node needs to be assigned a color, the search space has a height of |V| to give a time-complexity of O(m^|V|).

        For all implementations, the get_domain_values will always determine the branching factor and the height will be the same size as solution. But for a problem like the n queens problem, solution is a 2D array with size n^2.

        2.2.2 Space complexity

        For the graph coloring problem, both solution and the implicit stack have a size of |V| to give a O(|V|) space complexity. The complexity will be bound to solution, and therefore gives the n queens problem a space complexity of O(n^2).

        2.2.3 Related algorithms: dynamic programming and local search

        Graph coloring can also be solved with other algorithmic paradigms, like dynamic programming. A dynamic programming solution will take O(2.4423^|V|) time and space. In general, dynamic programming offers a polynomial time complexity.

        Another way of solving a CSP is to use a local search algorithm. For the n queens problem, a local search algorithm will randomly fill the board with all the queens and then iteratively move conflicting queens (to the least conflicting spot, for example). One such implementation uses the min-conflicts heuristic to solve a 1 million queen problem in an average of 50 steps for a constant time complexity. However, local searches can get stuck at a local minimum.

        3. How to prepare for a coding interview

        3.1 Refresh your knowledge

        Before you start practicing interviews, you’ll want to make sure you have a strong understanding of not only backtracking but also the rest of the relevant algorithms and data structures. Check out our guides for detailed explanations and useful cheat sheets.

        Algorithms explained:

        Data structures explained: 

        3.2 Practice on your own

        Once you’re confident in all of these, you’ll want to start working through lots of coding problems.

        Check out the articles in our algorithms questions series for algorithm questions you can practice:

        We also recommend working through our list of 73 data structure interview questions.

        One of the main challenges of coding interviews is that you have to communicate what you are doing as you are doing it. Talking through your solution out loud is therefore very helpful. This may sound strange to do on your own, but it will significantly improve the way you communicate your answers during an interview.

        It's also a good idea to work on a piece of paper or google doc, given that in the real interview you'll have to code on a whiteboard or virtual equivalent.

        3.3 Practice with others

        However, sooner or later you’re probably going to want some expert interventions and feedback to really improve your coding interview skills.

        That’s why we recommend practicing with ex-interviewers from top tech companies. If you know a software engineer who has experience running interviews at a big tech company, then that's fantastic. But for most of us, it's tough to find the right connections to make this happen. And it might also be difficult to practice multiple hours with that person unless you know them really well.

        Here's the good news. We've already made the connections for you. We’ve created a coaching service where you can practice 1-on-1 with ex-interviewers from leading tech companies. Learn more and start scheduling sessions today.

        cta_illustration arrow_left Browse FAANG ex-interviewers

        Any questions about backtracking?

        If you have any questions about backtracking or coding interviews in general, don't hesitate to ask them in the comments below. All questions are good questions, so go ahead!