Design an algorithm to encode a list of strings to a string. The encoded string is then sent over the network and is decoded back to the original list of strings.
Key Challenge: Handle strings that may contain any characters, including delimiters, numbers, and special characters.
Input: strs = ["neet", "code", "love", "you"]
Output: ["neet", "code", "love", "you"]
Input: strs = ["we", "say", ":", "yes"]
Output: ["we", "say", ":", "yes"]
0 <= strs.length < 100
0 <= strs[i].length < 200
strs[i] contains only UTF-8 characters
Time Complexity: O(n) where n is total characters in all strings
Space Complexity: O(1) excluding input/output
Use format "length#string" for each string. The length prefix tells us exactly how many characters to read, and the '#' delimiter clearly separates length from content.
Encode Process:
- For each string, prepend its length followed by '#'
- Concatenate all encoded strings
- Example: ["abc", "def"] → "3#abc3#def"
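The encode steps above can be sketched in C++ roughly as follows. This is a minimal sketch, not a reference implementation: the function name encodeStrings is made up for illustration, and the length is measured in bytes (std::string::size()), which is an assumption about how multi-byte UTF-8 characters are counted.

```cpp
#include <string>
#include <vector>

// Sketch of the encode step: each string becomes "<length>#<content>".
// encodeStrings is an illustrative name, not part of the original text.
std::string encodeStrings(const std::vector<std::string>& strs) {
    std::string encoded;
    for (const std::string& s : strs) {
        // size() counts bytes, so a multi-byte UTF-8 character adds more than 1.
        encoded += std::to_string(s.size());
        encoded += '#';
        encoded += s;
    }
    return encoded;
}
```

Because the decoder only has to read back the same number of bytes, counting bytes rather than characters stays consistent as long as both sides use the same rule.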
Decode Process:
- Find the '#' delimiter to separate length from content
- Extract the length (number before '#')
- Extract exactly that many characters after '#'
- Move to next encoded string
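The decode steps above might look like this in C++. It is a sketch under the assumption that the input was produced by a matching length#string encoder; malformed input is not handled, and the name decodeStrings is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sketch of the decode step for the "<length>#<content>" format.
// Assumes well-formed input; decodeStrings is an illustrative name.
std::vector<std::string> decodeStrings(const std::string& encoded) {
    std::vector<std::string> result;
    std::size_t i = 0;
    while (i < encoded.size()) {
        // The first '#' at or after i ends the length digits. A '#' inside the
        // content is never examined here, because we jump over the content.
        std::size_t hash = encoded.find('#', i);
        std::size_t len = std::stoul(encoded.substr(i, hash - i));
        result.push_back(encoded.substr(hash + 1, len));
        i = hash + 1 + len;  // move to the next encoded string
    }
    return result;
}
```

The search for '#' always starts at the beginning of a length prefix, which contains only digits, so the first '#' found is the separator even when the content itself contains '#'.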
Why This Works:
- Length prefix tells us exactly how many characters to read
- '#' delimiter clearly separates length from content
- Handles any characters in strings (including numbers and '#')
- No ambiguity in parsing - each string boundary is clearly defined
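To see why the length prefix matters, it can help to compare it with a delimiter-only scheme. The sketch below uses two hypothetical helpers, joinWithHash and encodeWithLength (neither name comes from the original text): two different lists collapse to the same string when joined with '#' alone, while the length-prefixed form keeps them distinct.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Naive scheme for comparison only: join with '#', no length prefix.
std::string joinWithHash(const std::vector<std::string>& strs) {
    std::string out;
    for (std::size_t i = 0; i < strs.size(); ++i) {
        if (i > 0) out += '#';
        out += strs[i];
    }
    return out;
}

// Length-prefixed scheme: "<length>#<content>" for each string.
std::string encodeWithLength(const std::vector<std::string>& strs) {
    std::string out;
    for (const std::string& s : strs) {
        out += std::to_string(s.size());
        out += '#';
        out += s;
    }
    return out;
}

// joinWithHash({"a#b"})          gives "a#b"
// joinWithHash({"a", "b"})       gives "a#b"     (same string, ambiguous)
// encodeWithLength({"a#b"})      gives "3#a#b"
// encodeWithLength({"a", "b"})   gives "1#a1#b"  (distinct)
```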
Example Walkthrough:
Input: ["we", "say", ":", "yes"]
Encode: "2#we3#say1#:3#yes"
Decode:
- Read "2#" → length=2, read "we"
- Read "3#" → length=3, read "say"
- Read "1#" → length=1, read ":"
- Read "3#" → length=3, read "yes"
Time Complexity: O(n)
Space Complexity: O(1)
Escape the characters that have special meaning in the encoding (here \\ for a backslash and \: for a colon), then terminate each string with an unescaped ":;" delimiter.
#include <string>
#include <vector>
using namespace std;

class Codec {
public:
    // Escapes '\' and ':' in each string, then appends the ":;" terminator.
    string encode(vector<string>& strs) {
        string result;
        for (const string& str : strs) {
            for (char c : str) {
                if (c == '\\') {
                    result += "\\\\";  // Escape backslash
                } else if (c == ':') {
                    result += "\\:";   // Escape colon
                } else {
                    result += c;
                }
            }
            result += ":;";  // Delimiter between strings
        }
        return result;
    }

    // Reverses encode(): an unescaped ':' followed by ';' ends a string.
    vector<string> decode(string s) {
        vector<string> result;
        string current;
        for (size_t i = 0; i < s.length(); i++) {
            if (s[i] == '\\' && i + 1 < s.length()) {
                // Handle escape sequence
                if (s[i + 1] == '\\') {
                    current += '\\';
                    i++;  // Skip next backslash
                } else if (s[i + 1] == ':') {
                    current += ':';
                    i++;  // Skip next colon
                }
            } else if (s[i] == ':' && i + 1 < s.length() && s[i + 1] == ';') {
                // End of string delimiter
                result.push_back(current);
                current.clear();
                i++;  // Skip semicolon
            } else {
                current += s[i];
            }
        }
        return result;
    }
};
Limitations:
- More complex to implement and debug
- Requires defining comprehensive escape sequences
- Less efficient for strings with many special characters
Time Complexity: O(n)
Space Complexity: O(1)
Convert strings to base64 format, separate with delimiters.
#include <sstream>
#include <string>
#include <vector>
using namespace std;

class Codec {
private:
    string base64_encode(const string& input) {
        const string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
        string result;
        // Unsigned accumulator: repeated shifts wrap instead of overflowing a signed int.
        unsigned int val = 0;
        int valb = -6;
        for (unsigned char c : input) {
            val = (val << 8) + c;
            valb += 8;
            while (valb >= 0) {
                result.push_back(chars[(val >> valb) & 0x3F]);
                valb -= 6;
            }
        }
        // Emit the leftover bits, then pad to a multiple of 4 characters.
        if (valb > -6) result.push_back(chars[((val << 8) >> (valb + 8)) & 0x3F]);
        while (result.size() % 4) result.push_back('=');
        return result;
    }

    string base64_decode(const string& input) {
        const string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
        string result;
        unsigned int val = 0;
        int valb = -8;
        for (char c : input) {
            size_t pos = chars.find(c);
            if (pos == string::npos) break;  // Stops at '=' padding
            val = (val << 6) + static_cast<unsigned int>(pos);
            valb += 6;
            if (valb >= 0) {
                result.push_back(char((val >> valb) & 0xFF));
                valb -= 8;
            }
        }
        return result;
    }

public:
    string encode(vector<string>& strs) {
        string result;
        for (const string& str : strs) {
            result += base64_encode(str) + "|";  // '|' is not in the base64 alphabet
        }
        return result;
    }

    vector<string> decode(string s) {
        vector<string> result;
        stringstream ss(s);
        string token;
        // Every encoded string ends with '|', so an empty token is a real
        // empty string and must not be skipped.
        while (getline(ss, token, '|')) {
            result.push_back(base64_decode(token));
        }
        return result;
    }
};
Limitations:
- Increases string size by ~33%
- More complex encoding/decoding process
- Overkill for this problem
- Length-prefix encoding: Prepend each string with its length for unambiguous parsing
- Delimiter-based parsing: Use special character ('#') to separate length from content
- Substring extraction: Parse encoded data by extracting specific portions
- String concatenation: Efficiently combine multiple encoded strings
- State machine parsing: Track position and parse length then content sequentially
- Boundary detection: Find delimiters to separate different parts of encoded data
- Sequential processing: Process encoded data from left to right without backtracking
- string: Text manipulation and processing
- vector<string>: Dynamic array of strings
- to_string(): Convert integer to string
- substr(): Extract substring from string
- stoi(): Convert string to integer
- Empty strings: Length 0 is correctly encoded and decoded ("" → "0#")
- Strings containing numbers: Length prefix is clearly separated by '#'
- Strings containing '#': Length tells us exactly where string ends
- Empty input list: Returns empty string, decodes to empty list
- Single string input: Works correctly with one element
Edge Case Examples:
Input: [""] → Encode: "0#" → Decode: [""]
Input: ["#"] → Encode: "1##" → Decode: ["#"]
Input: ["123"] → Encode: "3#123" → Decode: ["123"]
Input: ["a#b"] → Encode: "3#a#b" → Decode: ["a#b"]
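The edge cases above can be checked with a small round-trip helper. This is a sketch assuming the length#string format described earlier; encodeList, decodeList, and roundTrips are hypothetical names introduced only for this check.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative length#string encoder (names are assumptions, not from the text).
std::string encodeList(const std::vector<std::string>& strs) {
    std::string out;
    for (const std::string& s : strs) {
        out += std::to_string(s.size());
        out += '#';
        out += s;
    }
    return out;
}

// Matching decoder; assumes the input came from encodeList.
std::vector<std::string> decodeList(const std::string& s) {
    std::vector<std::string> result;
    std::size_t i = 0;
    while (i < s.size()) {
        std::size_t hash = s.find('#', i);
        std::size_t len = std::stoul(s.substr(i, hash - i));
        result.push_back(s.substr(hash + 1, len));
        i = hash + 1 + len;
    }
    return result;
}

// True when decoding the encoded list gives back the original list.
bool roundTrips(const std::vector<std::string>& strs) {
    return decodeList(encodeList(strs)) == strs;
}
```

Calling roundTrips on each of the inputs listed above ([""], ["#"], ["123"], ["a#b"]) and on an empty list is a quick way to confirm the format handles them.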
- 394. Decode String - Nested string decoding with brackets
- 443. String Compression - Compressing strings in-place
- 151. Reverse Words in a String - String manipulation and parsing
- 8. String to Integer (atoi) - Parsing strings with state machines
- Think about edge cases early in the design process
- Choose a format whose delimiters cannot be confused with data content (for example, by pairing the delimiter with a length prefix)
- Consider memory optimization through pre-allocation
- Test thoroughly with various input combinations
- Understand the encoding format before implementing
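As one possible illustration of the pre-allocation tip, the encoder can compute the output size first and reserve it before appending. This is a sketch, not a measured optimization: encodedSize and encodeReserved are hypothetical names, and whether the extra pass pays off depends on the input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Exact size of the "length#string" encoding: digits + '#' + content per string.
std::size_t encodedSize(const std::vector<std::string>& strs) {
    std::size_t total = 0;
    for (const std::string& s : strs) {
        total += std::to_string(s.size()).size() + 1 + s.size();
    }
    return total;
}

// Same output as a plain length#string encoder, but with capacity reserved
// up front so the result string should not need to grow while appending.
std::string encodeReserved(const std::vector<std::string>& strs) {
    std::string out;
    out.reserve(encodedSize(strs));
    for (const std::string& s : strs) {
        out += std::to_string(s.size());
        out += '#';
        out += s;
    }
    return out;
}
```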