To ace your coding interview for a software engineering job, you’ll need to understand strings. They come up frequently in coding interviews and are fundamental to many other data structures too.
Let’s take a look at some typical string questions.
- Given a string, create a new string without vowels and print that string.
- Given a string, create a new string with the same characters in a random order.
- Given a string containing some words in (possibly nested) parentheses, calculate if all parentheses match.
- Find the longest common prefix of two given strings.
- Given two strings, find the shortest edit distance to transform the first into the second.
Below, we take a look at some more questions and provide you with links to high quality solutions to them. We explain how strings work, their variations, and the most important things you need to know about them. Finally, we give you a preparation plan so that you ace every part of your coding interview.
This is an overview of what we’ll cover:
Easy string interview questions
- Medium string interview questions
- Hard string interview questions
- String basics
- String cheat sheet
- Mock interviews for software engineers
You might be tempted to try to read all of the possible questions and memorize the solutions, but this is not feasible. Interviewers will always try to find new questions, or ones that are not available online. Instead, you should use these questions to practice the fundamental concepts of strings.
As you consider each question, try to replicate the conditions you’ll encounter in your interview. Begin by writing your own solution without external resources in a fixed amount of time.
If you get stuck, go ahead and look at the solutions, but then try the next one alone again. Don’t get stuck in a loop of reading as many solutions as possible! We’ve analyzed dozens of questions and selected ones that are commonly asked and have clear and high quality answers.
Here are some of the easiest questions you might get asked in a coding interview. These questions are often asked during the ‘phone screen’ stage, so you should be comfortable answering them without being able to write code or use a whiteboard.
1.1 Remove Vowels from a String
1.2 Defanging an IP Address
1.3 Jewels and Stones
1.4 Shuffle String
1.5 Split a String in Balanced Strings
1.6 To Lower Case
1.7 Unique Morse Code Words
1.8 Count Substrings with Only One Distinct Letter
1.9 Robot Return to Origin
1.10 Fizz Buzz
1.11 First Unique Character in a String
1.12 Reverse String
1.13 Valid Anagram
1.14 Valid Palindrome
1.15 Implement Strstr()
1.16 Valid Parentheses
1.17 Roman to Integer
1.18 Longest Common Prefix
1.19 Excel Sheet Column Number
1.20 Palindrome Permutation
Here are some moderate-level questions that are often asked in a video call or onsite interview. You should be prepared to write code or sketch out the solutions on a whiteboard if asked.
2.1 Longest Substring Without Repeating Characters
2.2 Longest Palindromic Substring
2.3 String to Integer (atoi)
2.4 Letter Combinations of a Phone Number
2.5 Generate Parentheses
2.6 Count and Say
2.7 Group Anagrams
2.8 Decode Ways
2.9 Palindrome Partitioning
2.10 Word Break
2.11 Fraction to Recurring Decimal
2.12 Largest Number
2.13 Implement Trie (Prefix Tree)
2.14 Basic Calculator II
2.15 Longest Substring with At Least K Repeating Characters
2.16 Palindrome Partitioning
2.17 Reorganize String
2.18 ZigZag Conversion
2.19 Decode String
2.20 Multiply Strings
Similar to the medium section, these more difficult questions may be asked in an onsite or video call interview. You will likely be given more time if you are expected to create a full solution.
3.1 Regular Expression Matching
3.2 Wildcard Matching
3.3 Minimum Window Substring
3.4 Word Ladder
3.5 Word Break II
3.6 Word Search II
3.7 Serialize and Deserialize Binary Tree
3.8 Longest Valid Parentheses
3.9 Edit Distance
3.10 Alien Dictionary (string/graph)
3.11 Design Search Autocomplete System
In order to crack the questions above and others like them, you’ll need to have a strong understanding of strings and how they work. Let’s get into it.
4.1 What is a string?
A string is an ordered sequence, or string, of characters. It is usually considered a data type and is often included as part of language primitives. In most languages, strings are implemented using an array of bytes. The bytes are encoded using some character encoding. Earlier systems used ASCII encoding, with Unicode encoding used in later systems.
4.1.1 Types of strings (Java, Python, C++)
C++ provides a mutable string, meaning the string contents can be changed after creation.
Python and Java provide an immutable string type. Immutable strings require a new string to be created if any changes are made. This has performance implications for string concatenation when joining a number of strings, running in quadratic time.
Java provides a mutable StringBuilder class, which should be used for concatenating multiple strings. In Python, generally the `.join` method on strings should be used, which accepts an iterable of multiple strings. Using Java's StringBuilder or Python’s `.join` provides a linear time solution for multiple string concatenations.
Another class of strings are C-style strings, so named as they are used in the C language. These strings are simple arrays. A special terminating character is stored directly after the last character of the string. This terminating character marks the end of the string within the array, or buffer. C++ has support for C-style strings, but it is preferable to use C++ style strings in pure C++ code.
Java, Python and C++ strings are more complex data structures, with class or struct methods for manipulating and querying them.
4.1.2 How strings store data
There are generally two types of string implementations: null-terminated strings (C strings), and non-null-terminated strings. The principal difference between these types of strings is how the termination and tracking the length of the string is handled.
Null-terminated strings (C strings) are character arrays terminated with a null (NUL) character, typically a byte with all bits set to 0. The length of the string is calculated by counting the characters in the string up to the terminating character. Care must be taken in working with C-style strings, as writing and reading past the end of the string terminator can result in buffer overflow errors, causing unexpected behaviour and security issues by overwriting or reading unrelated memory contents.
Non-null-terminated strings, as used in C++, Java and Python implementations, have the underlying array and the length stored separately. In Java this is a class with private fields for the character array and length, which are not directly accessible from outside the class.
C++, Java and Python strings have useful class and instance methods to work with strings. Commonly used are:
- Substring queries to return a new string from a subset of the original string
- Replace methods to return a new string with a specific sequence of character substituted for another
- String formatters to create strings from a template and data variables
- Trimming methods to create new strings without leading or trailing whitespace
- Upper and lower case conversion methods
- Methods to check if a string contains a given substring
For more advanced pattern matching in strings, Regular Expressions, or regexes, are commonly used. Most modern string implementations have support for regex queries.
4.1.3 Deep dive into tries
A trie, or prefix tree, is a type of search tree, often used with strings. The strangely spelt name is from "reTrieval," but is mostly pronounced as "try." Tries are useful for answering questions like, "Given an input sequence 'cod,' what possible words could be formed?" This makes tries useful for auto-completing words in text input in a range of use cases, including search and texting.
A trie data structure is made of nodes indexed with a single character key. Each node has references to its child nodes. The number of child nodes is bounded by the number of unique characters in the alphabet space making up the stored words. Often this is described as 26 for the English language, but it can be larger when taking into consideration accented characters, or if there are numeric and special characters in the alphabet space. Each node also has a flag signaling the end of a word, represented by a blue node below:
Since each node only contains one character, the complete key to that node is distributed from its parents down to that node. Retrieval of a key is a depth-first search of the trie. A search for a key ends when the end of the word node is reached. Inserting a new key is done by following the trie through each character of the new key, and adding each character as a trie node where needed, and marking the node of the last character as the end of the word.
Inserting and searching is O(keySize) time. Space requirements are much larger than a simple string though, at O(AlphabetSpace*keySize*n), n being the total number of keys in the trie.
String definition: A string is a sequence of characters, often implemented as an array.
You can download this cheat sheet here.
5.1 Related algorithms and techniques
- State Machines
- String Builder
- Two pointers
- Levenshtein Distance
- Boyer-Moore string search algorithm
- Knuth-Morris-Pratt Algorithm
- Rubin-Karp Algorithm
- Bitap algorithm
- BW Data transform
5.2 Cheat sheet explained
The cheat sheet above is a summary of information you might need to know for an interview, but it’s usually not enough to simply memorize it. Instead, aim to understand each result so that you can give the answer in context.
The cheat sheet is broken into time complexity (the processing time for the various string operations) and space complexity (the amount of memory required). While you might be asked about these directly in relation to string data structures, it’s more likely that you will need to know these in relation to specific string-related algorithms, such as substring searching, which is what the second section of the cheat sheet details.
5.2.1 Time complexity
Accessing a character at a particular index in a string is an O(1) operation, as strings are stored as arrays. Deleting characters from immutable strings means creating a new string and copying the remaining characters over, so this is an O(n) operation. For mutable strings, it means removing characters and shifting remaining characters, which is also O(n). Inserting into a string is a similar level of work as deleting, again O(n) time for both mutable and immutable strings.
Searching for a substring using built-in methods in most languages is generally O(n*m), depending on the algorithm used. We explain this more in the algorithm section.
Slicing, or splitting, a string refers to creating multiple substrings by splitting a string on a character or character sequence. This is often used in parsing CSV type files. It runs in O(n) time, as each character needs to be copied out into the new strings.
Joining, or concatenating, strings is a more complex operation. For immutable strings, concatenating multiple strings can result in O(n2) time complexity for large numbers of strings. In such cases, StringBuilder, or Python's '.join()' method is recommended. For concating two mutable strings, it is O(n+m), where n and m represent the number of characters in either string. Mutable strings can concatenate faster, depending on the underlying implementation. For implementations where the array is doubled in size each time a resizing is needed, string concatenation is closer to regular array time complexity.
Comparing strings is also O(n) time, as generally each character needs to be checked.
For tries, there are two significant operations: Inserting a new key, and retrieving (searching for) a key. Both run in O(m) time, where m is the size of the key. This makes tries very scalable, as the time complexity is not dependent on the size of the trie itself.
5.2.2 String algorithm complexity
Radix sort is usually shown in the context of sorting numbers, but it can be used to sort strings, if the buckets are indexed by characters rather than numbers. Essentially the alphabet space of the strings becomes the base as used by the radix sort algorithm.
Many string algorithms center around searching for all occurrences of a substring within a string. This is a fundamental operation in text-centered applications, from HTTP processing to word processing. The naive implementation slides the search pattern by one character over the length of the string to check for the pattern starting at each index. This makes it an O(m*(n-m+1)) operation, as the pattern needs to be checked at each character in the string. The more sophisticated algorithms in the list attempt to reduce this complexity. These algorithms reduce the time by using strategies such as precomputing tables, creating rolling hashes, and using caching state from previous matching. They also use the fact that useful information for shortcutting the naive search approach is available in the search pattern itself, for example, whether or not the search pattern has repeating characters can allow some sliding indexes to be skipped. In all, these algorithms provide ways to reduce searching to O(n) time complexity.
Before you start practicing interviews, you’ll want to make sure you have a strong understanding of not only linked lists but also the rest of the relevant data structures. Check out our guides for questions, explanations and helpful cheat sheets.
- Linked lists
- Stacks and Queues
- Graphs (coming soon)
- Map (coming soon)
- Heap (coming soon)
Once you’re confident on all of these, you’ll want to start practicing answering coding questions in an interview situation. One way of doing this is by practicing out loud, which is a very underrated way of preparing. However, sooner or later you’re probably going to want some expert interventions and feedback to really improve your interview skills.
That’s why we recommend practicing with ex-interviewers from top tech companies. If you know a software engineer who has experience running interviews at a big tech company, then that's fantastic. But for most of us, it's tough to find the right connections to make this happen. And it might also be difficult to practice multiple hours with that person unless you know them really well.
Here's the good news. We've already made the connections for you. We’ve created a coaching service where you can practice system design interviews 1-on-1 with ex-interviewers from leading tech companies. Learn more and start scheduling sessions today.
Any questions about string interview questions?
If you have any questions about strings or coding interviews in general, don't hesitate to ask them in the comments below. All questions are good questions, so go ahead!