Using tf.keras instead of tf.layers breaks exercise 1.2 #73

Open
juliuskunze opened this issue Dec 15, 2018 · 1 comment
juliuskunze commented Dec 15, 2018

Starting from the correct solution for exercise 1.2, changing mlp from

    for h in hidden_sizes[:-1]:
        x = tf.layers.dense(x, units=h, activation=activation)
    return tf.layers.dense(x, units=hidden_sizes[-1], activation=output_activation)

to

    for h in hidden_sizes[:-1]:
        x = tf.keras.layers.Dense(units=h, activation=activation)(x)
    return tf.keras.layers.Dense(units=hidden_sizes[-1], activation=output_activation)(x)

consistently resulted in the solution being recognized as incorrect over 10 runs. Before the change, it was recognized as correct in 10/10 runs.
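
For reference, here is the Keras variant as a self-contained function (a minimal sketch; the mlp signature and default arguments are assumed to match the Spinning Up helper rather than copied from it):

    import tensorflow as tf

    def mlp(x, hidden_sizes=(32,), activation=tf.tanh, output_activation=None):
        # Hidden layers use `activation`; the final layer uses `output_activation`.
        for h in hidden_sizes[:-1]:
            x = tf.keras.layers.Dense(units=h, activation=activation)(x)
        return tf.keras.layers.Dense(units=hidden_sizes[-1], activation=output_activation)(x)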

Explicitly setting kernel_initializer=tf.glorot_uniform_initializer() and bias_initializer=tf.zeros_initializer() on all layers, in either of the two variants, does not change the result.
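
For concreteness, the loop body with the initializers spelled out looks like this in the two variants (a sketch, not the exact code used):

    # tf.layers variant with explicit initializers
    x = tf.layers.dense(x, units=h, activation=activation,
                        kernel_initializer=tf.glorot_uniform_initializer(),
                        bias_initializer=tf.zeros_initializer())

    # tf.keras variant with the same explicit initializers
    x = tf.keras.layers.Dense(units=h, activation=activation,
                              kernel_initializer=tf.glorot_uniform_initializer(),
                              bias_initializer=tf.zeros_initializer())(x)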

Comparing tf.keras.layers.Dense and tf.layers.dense in an isolated test showed the same empirical initialization statistics and, when starting from the same initialization, the same optimization behaviour.
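
A minimal version of such an isolated check, assuming TF 1.x graph mode (the layer sizes are arbitrary):

    import tensorflow as tf

    # Build one layer of each kind on the same input, initialize, and
    # compare the empirical statistics of their weights.
    tf.reset_default_graph()
    x = tf.placeholder(tf.float32, shape=(None, 64))

    y_legacy = tf.layers.dense(x, units=32, activation=tf.tanh, name='legacy_dense')
    y_keras = tf.keras.layers.Dense(units=32, activation=tf.tanh, name='keras_dense')(x)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for v in tf.trainable_variables():
            w = sess.run(v)
            print('%s  mean=%+.4f  std=%.4f' % (v.name, w.mean(), w.std()))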

Comparing the source code of tf.keras.layers.Dense and tf.layers.dense, the only difference is that tf.layers.dense returns a Dense object that inherits not only from tf.keras.layers.Dense but also from the legacy base.Layer, which is therefore almost certainly related to the cause.
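
The inheritance difference can be seen directly from the method resolution order of the two classes (TF 1.x assumed; tf.layers.dense is a thin functional wrapper around tf.layers.Dense):

    import tensorflow as tf

    # tf.layers.Dense subclasses both tf.keras.layers.Dense and the
    # legacy tensorflow.python.layers.base.Layer; the pure Keras class does not.
    print(tf.layers.Dense.__mro__)
    print(tf.keras.layers.Dense.__mro__)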

Increasing the number of episodes from 20 to 25 seems to resolve the issue, suggesting that the tf.keras variant learns slightly more slowly for some reason. Since the Keras API will become the standard in TensorFlow 2.0, I am keen to understand why this is happening.

Contributor

jachiam commented Dec 15, 2018

That is bizarre and unsettling. Thanks for the heads-up. If I can find the bandwidth for it, I'll try to suss it out (although to be honest, odds are low that I'll have time for this soon).
