TLDR
Use deep learning to solve complicated algorithmic coding interview questions to ace your next interview.
It’s been a while since I’ve practiced programming interview questions, and I worry that my skills are lacking. It’s always important to work through some every now and then to stay sharp so here we go:
Let’s start with Fizz Buzz:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
import tensorflow as tf
import numpy as np
INPUT_SIZE = 10 # We can encode up to 2^15 numbers with binary encoding.
HIDDEN_SIZE = 100 # Hidden layer of size 100
OUTPUT_SIZE = 4 # One hot encoding for the 4 possible outputs: "Fizz" "Buzz" "FizzBuzz", ""
NUM_EPOCHS = 10000
def binary_encode(number):
    data = np.zeros(INPUT_SIZE)
    for bitshift in range(INPUT_SIZE):
        data[bitshift] = number >> bitshift & 1 # Get the bit at position bitshift
    return data
def fizz_buzz_encode(i):
    if   i % 15 == 0: return np.array([0, 0, 0, 1])
    elif i % 5  == 0: return np.array([0, 0, 1, 0])
    elif i % 3  == 0: return np.array([0, 1, 0, 0])
    else:             return np.array([1, 0, 0, 0])
def fizz_buzz_decode(encoding):
    max_idx = np.argmax(encoding)
    if max_idx == 0:
        return ""
    if max_idx == 1:
        return "fizz"
    if max_idx == 2:
        return "buzz"
    if max_idx == 3:
        return "fizzbuzz"Now, we can generate some training data (we will generate training data from 101 to 1024) since our actual problem will solve fizzbuzz for 1–100.
1
2
X_train = np.array([binary_encode(i) for i in range(101, 2 ** INPUT_SIZE)])
y_train = np.array([fizz_buzz_encode(i) for i in range(101, 2 ** INPUT_SIZE)])And now, let’s define our multilayer perceptron model that will actually perform the bulk of the work. First, we define the inputs to the model:
Next, let’s define the weights and biases that we will learn via backpropagation:
1
2
3
4
5
w1 = tf.get_variable("w1", [INPUT_SIZE, HIDDEN_SIZE], initializer=tf.random_normal_initializer())
b1 = tf.get_variable("b1", [HIDDEN_SIZE])
w2 = tf.get_variable("w2", [HIDDEN_SIZE, OUTPUT_SIZE],
initializer=tf.random_normal_initializer())
b2 = tf.get_variable("b2", [OUTPUT_SIZE])And now, let’s feed our input through the model:
1
2
3
z1 = tf.add(tf.matmul(X, w1), b1)
a1 = tf.nn.relu(z1)
z2 = tf.add(tf.matmul(a1, w2), b2)Then, let’s compute the loss and tell tensorflow to minimize that loss:
1
2
cost = tf.nn.softmax_cross_entropy_with_logits(logits=z2, labels=Y)
optimizer = tf.train.GradientDescentOptimizer(0.001).minimize(cost)Now, let’s actually feed our real training data into that model:
And print out the results:
1
2
3
4
5
6
7
8
9
10
11
12
def real_fizz_buzz(i):
    txt = ""
    if i % 3 == 0:
        txt += "fizz"
    if i % 5 == 0:
        txt += "buzz"
    return txt
    
for i in range(len(X_test)):
    text = fizz_buzz_decode(pred[i])
    true_text = real_fizz_buzz(i + 1)
   print("{}: {} ({})".format(i + 1, text, "✔" if text == true_text else "x"))Results:
1
2
3
4
5
6
7
8
1:  (✔)
2:  (✔)
3:  (x)
...
97:  (✔)
98:  (✔)
99: fizz (✔)
Total Accuracy: 89.89%For all that work, we achieved an embarrassingly low 90% accuracy. I was going to try to fix this but then quickly lost interest. Let’s move onto something a bit more interesting:
This sounds like a problem that can be solved with an LSTM. Let’s generate some data.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
import tensorflow as tf
import random
import numpy as np
CHARS = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
STRING_LENGTH = 12
num_examples = 10000
# Args:
#   n: Number of examples to generate.
# Returns:
#   strings_v: numpy array of the form (n, STRING_LENGTH, len(CHARS)). One hot encoding of
sequences of text
#   strings: Array of actual generated random text:
#   uniques_v: numpy array of the form (n, len(CHARS)). One hot encoding of number of unique
characters
#   uniques: numpy array of length n, number of unique characters for each sequence.
def generate_data(n=num_examples):
    chars_to_idx = { c: i for i, c in enumerate(CHARS)}
    
    strings_v = np.zeros([n, STRING_LENGTH, len(CHARS)])
    strings = [''] * n
    uniques = np.zeros(n)
    uniques_v = np.zeros([n, len(CHARS)])
    for x in range(n):
        for y in range(STRING_LENGTH):
            random.shuffle(CHARS)
            char = CHARS[0]
            
            strings_v[x][y][chars_to_idx[char]] = 1
            strings[x] += char
            
        uniques[x] = len(set(strings[x]))
        uniques_v[x][len(set(strings[x])) - 1] = 1
        
    return strings_v, strings, uniques_v, uniquesNext let’s create our LSTM model:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
HIDDEN_LAYERS = 64
X = tf.placeholder("float", [None, STRING_LENGTH, len(CHARS)])
y = tf.placeholder("float", [None, len(CHARS)])
X_seq = tf.unstack(X, STRING_LENGTH, 1)
lstm_cell = tf.contrib.rnn.BasicLSTMCell(HIDDEN_LAYERS)
#sequence of 12 chars to output of 7
outputs, states = tf.contrib.rnn.static_rnn(lstm_cell, X_seq, dtype=tf.float32)
final_output = outputs[-1]
weights = tf.get_variable("weights", [HIDDEN_LAYERS, len(CHARS)],
initializer=tf.random_normal_initializer())
biases = tf.get_variable("biases", [len(CHARS)], initializer=tf.random_normal_initializer())
prediction = tf.add(tf.matmul(final_output, weights), biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))
optimizer = tf.train.AdamOptimizer(1e-2)
train_op = optimizer.minimize(cost)Now, let’s go ahead and run our model:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
init = tf.global_variables_initializer()
costs = []
EPOCHS = 300
with tf.Session() as sess:
    sess.run(init)
    for i in range(EPOCHS):
        _, c, _ = sess.run([train_op, cost, prediction], feed_dict={X:X_train, y: y_train})
        costs.append(c)
        if i % 10 == 0:
            print('cost: {}, epoch: {}'.format(c, i))
            
            X_test, X_test_strings, y_test, y_test_strings = generate_data(100)
            p = sess.run(prediction, feed_dict={X: X_test, y: y_test})
            prediction_idxs = np.argmax(p, axis=1)
            prediction_vals = prediction_idxs + 1
correct = 0.0
            for i in range(len(y_test_strings)):
                string = X_test_strings[i]
                actual_val = y_test_strings[i]
                predicted_val = prediction_vals[i]
                # Print the first 5 examples
                if i < 5:
                    print('string: {}, pred: {}, actual: {}'.format(string, predicted_val,
actual_val))
                    
                if predicted_val == actual_val:
                    correct += 1
            print("{}% accuracy\n\n".format(correct * 100 / len(y_test_strings)))
cost: 0.0496655665338, epoch: 280
string: eeaccbdeagac, pred: 6, actual: 6.0
string: gefaddbbcfac, pred: 7, actual: 7.0
string: acedcbgdcagf, pred: 7, actual: 7.0
string: aadebcdacefg, pred: 7, actual: 7.0
string: abeeaebcbbag, pred: 5, actual: 5.0
100% accuracycost: 0.0413268692791, epoch: 290
string: ebbfbfaebede, pred: 5, actual: 5.0
string: fgaccdbabedg, pred: 7, actual: 7.0
string: dgdcbaefcdad, pred: 7, actual: 7.0
string: ffagdfedccad, pred: 6, actual: 6.0
string: gfggfgbebcdb, pred: 6, actual: 6.0
99% accuracyHopefully the interviewer won’t be disappointed that we cannot solve this problem for any string that’s greater than MAX_LENGTH, but overall, it achieved a 99% accuracy. I didn’t deal with variable length inputs in this case, we could’ve easily done that by adding padding.
Ex: “acabdb” => “acbd”
Oof. Thats a much tougher problem. In this case, the value we need to return is a sequence of characters instead of a single character or number so we will need some form of sequence to sequence model in order to learn this relationship.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn
import random
#"abc" => "abc"
#"aabbac" => "abc"
#"abacd" => "abcd"
MAX_LENGTH = 6 # Max length of 6 
chars = ["a", "b", "c", "d", "e", "f"]
all_chars = chars + [' '] # Space for padding
NUM_EXAMPLES = 50000
# Args:
#   n: number of examples to generate
# Returns:
#   strings: list of strings that may contain duplicates
#   solutions: strings without duplicates
#   strings_v: One hot encoding of strings with duplicates (without padding)
#   solutions_v: One hot encoding of solutions (with padding)
def generate_data(n=NUM_EXAMPLES):
    all_chars_to_idx = { c:i for i, c in enumerate(all_chars) }
    strings_v = np.zeros((NUM_EXAMPLES, MAX_LENGTH, len(all_chars)))
    solutions_v = np.zeros((NUM_EXAMPLES, MAX_LENGTH, len(all_chars)))
    
    strings = [''] * NUM_EXAMPLES
    solutions = [''] * NUM_EXAMPLES
    
    for i in range(NUM_EXAMPLES):
        for l in range(MAX_LENGTH):
            char = random.choice(chars) # only sample from valid characters
            strings[i] += char
            if char not in solutions[i]:
                solutions[i] += char
                
        # Pad solutions strings
        num_missing = MAX_LENGTH - len(solutions[i])
        solutions[i] += ' ' * num_missing
    
    for x in range(len(strings)):
        for y in range(MAX_LENGTH):
            string_char = strings[x][y]
            strings_v[x][y][all_chars_to_idx[string_char]] = 1
            
            solution_char = solutions[x][y]
            solutions_v[x][y][all_chars_to_idx[solution_char]] = 1
            
    return strings, solutions, strings_v, solutions_vAgain, we can one hot encode the sequences, but this time, we will add some padding since the length of the output string is variable. Instead of trying to return a variable length sequence, we will just return a sequence that is equal to the length of the input, and pad the output with spaces.
For example, “abdab” would map to a string of the same length, but with spaces as padding: “abd “.
Let’s create a test and training split.
1
2
3
4
5
6
7
8
9
10
strings, solutions, strings_v, solutions_v = generate_data()
split_at = len(strings) - (len(strings) // 10)
strings_train = strings[:split_at]
solutions_train = solutions[:split_at]
X_train = strings_v[:split_at]
y_train = solutions_v[:split_at]
strings_test = strings[split_at:]
solutions_test = solutions[split_at:]
X_test = strings_v[split_at:]
y_test = solutions_v[split_at:]Now we can set up our model. Let’s start with the encoder:
1
2
3
4
5
6
encoded_input = tf.placeholder(tf.float32, shape=(None, MAX_LENGTH, len(all_chars)))
decoded_input = tf.placeholder(tf.float32, shape=(None, MAX_LENGTH, len(all_chars)))
with tf.name_scope("basic_rnn_seq2seq") as scope:
    encoded_sequence = tf.unstack(encoded_input, MAX_LENGTH, 1)
    encoder_cell = rnn.BasicLSTMCell(128, forget_bias=1.0)
    encoded_outputs, states = rnn.static_rnn(encoder_cell, encoded_sequence, dtype=tf.float32)And now, the decoder:
1
2
3
4
with tf.name_scope("lstm_decoder") as scope:
    decoded_sequence = tf.unstack(decoded_input, MAX_LENGTH, 1)
    decoder_cell = rnn.BasicLSTMCell(128, reuse=True)
    decoded_outputs, _ = rnn.static_rnn(decoder_cell, decoded_sequence, initial_state=states, dtype=tf.float32)Now, we can compute the predictions by multiplying the decoder’s hidden layer output with a fully connected layer:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
with tf.name_scope("fully_connected") as scope:
    weights = tf.get_variable('weights', (128, len(all_chars)),
initializer=tf.random_normal_initializer())
    biases = tf.get_variable('biases', (len(all_chars)),
initializer=tf.random_normal_initializer())
    
    predictions = []
    encoded_sequence = tf.unstack(decoded_input, MAX_LENGTH, 1)
for output in decoded_outputs:
        prediction = tf.add(tf.matmul(output, weights), biases)
        predictions.append(prediction)
concatenated_outputs = tf.stack(predictions, 0)
    concatenated_outputs = tf.transpose(concatenated_outputs, perm=[1, 0, 2])
    concatenated_inputs = tf.concat(decoded_input, 0)Now, we can compute the loss:
1
2
3
4
5
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=concatenated_outputs,
labels=concatenated_inputs))
# FC Layer
optimizer = tf.train.AdamOptimizer(1e-2)
train_op = optimizer.minimize(cost)And now let’s run our model:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
def decode_guess(one_hot):
    return ''.join([all_chars[m] for m in np.argmax(one_hot, axis=1)])
init = tf.global_variables_initializer()
costs = []
with tf.Session() as sess:
    sess.run(init)
    for i in range(30):
        e, r, c, t, c_out, c_in = sess.run([encoded_outputs, predictions, cost, train_op,
concatenated_outputs, concatenated_inputs], feed_dict={encoded_input: X_train, decoded_input:
y_train})
        costs.append(c)
        if i % 5 == 0:
            print('training cost: {}, epoch: {}'.format(c, i))
            
            results = sess.run(predictions, feed_dict={encoded_input: X_test, decoded_input:
y_test})
            guesses = np.array(results).transpose(1, 0, 2)
            for i in range(5):
                string = strings_test[i]
                solution = solutions_test[i]
                guess_decoded = decode_guess(guesses[i])
                print("{}: {} - {}".format(string, solution, guess_decoded))
                
            correct = 0.0
            for i in range(len(strings_test)):
                string = strings_test[i]
                solution = solutions_test[i]
                guess_decoded = decode_guess(guesses[i])
                if i < 5:
                    print("input: {}, solution: {}, prediction: {}".format(string, solution,
guess_decoded))
if solution == guess_decoded:
                    correct += 1
                    
            print("{} % accuracy".format(correct / len(strings_test) * 100))And here are the results:
1
2
3
4
5
6
7
8
9
10
11
12
13
training cost: 2.07600975037, epoch: 0
cebbdf: cebdf  -       
dffcbc: dfcb   -       
aadeab: adeb   -       
faceec: face   -       
bfaeec: bfaec  -       
0.0 % accuracy...training cost: 0.219934388995, epoch: 25
cebbdf: cebdf  - cebdf 
dffcbc: dfcb   - dfcb  
aadeab: adeb   - adeb  
faceec: face   - face  
bfaeec: bfaec  - bfaec 
100.0 % accuracyIt looks like by 25 epochs, our model has a fairly good understanding of how to solve this problem.
Please do not solve real interview problems in this way.
Start building for free
No need for a credit card to get started.
Trying out Mage to build ranking models won’t cost a cent.
No need for a credit card to get started. Trying out Mage to build ranking models won’t cost a cent.