Using tf.keras instead of tf.layers breaks exercise 1.2 #73

Open
juliuskunze opened this issue Dec 15, 2018 · 1 comment
juliuskunze commented Dec 15, 2018

Starting from the correct solution for exercise 1.2, changing mlp from

    for h in hidden_sizes[:-1]:
        x = tf.layers.dense(x, units=h, activation=activation)
    return tf.layers.dense(x, units=hidden_sizes[-1], activation=output_activation)

to

    for h in hidden_sizes[:-1]:
        x = tf.keras.layers.Dense(units=h, activation=activation)(x)
    return tf.keras.layers.Dense(units=hidden_sizes[-1], activation=output_activation)(x)

consistently resulted in the solution being recognized as incorrect over 10 runs. Before the change, it was recognized as correct in 10/10 runs.
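
For reference, here is the Keras variant as a self-contained function (a minimal sketch; the mlp signature and default arguments are assumed to match the Spinning Up helper rather than copied from it):

    import tensorflow as tf

    def mlp(x, hidden_sizes=(32,), activation=tf.tanh, output_activation=None):
        # Hidden layers use `activation`; the final layer uses `output_activation`.
        for h in hidden_sizes[:-1]:
            x = tf.keras.layers.Dense(units=h, activation=activation)(x)
        return tf.keras.layers.Dense(units=hidden_sizes[-1], activation=output_activation)(x)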

Explicitly setting kernel_initializer=tf.glorot_uniform_initializer() and bias_initializer=tf.zeros_initializer() on all layers, in either of the two variants, does not change the result.
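
For concreteness, the loop body with the initializers spelled out looks like this in the two variants (a sketch, not the exact code used):

    # tf.layers variant with explicit initializers
    x = tf.layers.dense(x, units=h, activation=activation,
                        kernel_initializer=tf.glorot_uniform_initializer(),
                        bias_initializer=tf.zeros_initializer())

    # tf.keras variant with the same explicit initializers
    x = tf.keras.layers.Dense(units=h, activation=activation,
                              kernel_initializer=tf.glorot_uniform_initializer(),
                              bias_initializer=tf.zeros_initializer())(x)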

Comparing tf.keras.layers.Dense and tf.layers.dense in an isolated test showed the same empirical initialization statistics and, when starting from the same initialization, the same optimization behaviour.
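
A minimal version of such an isolated check, assuming TF 1.x graph mode (the layer sizes are arbitrary):

    import tensorflow as tf

    # Build one layer of each kind on the same input, initialize, and
    # compare the empirical statistics of their weights.
    tf.reset_default_graph()
    x = tf.placeholder(tf.float32, shape=(None, 64))

    y_legacy = tf.layers.dense(x, units=32, activation=tf.tanh, name='legacy_dense')
    y_keras = tf.keras.layers.Dense(units=32, activation=tf.tanh, name='keras_dense')(x)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for v in tf.trainable_variables():
            w = sess.run(v)
            print('%s  mean=%+.4f  std=%.4f' % (v.name, w.mean(), w.std()))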

Comparing the source code of tf.keras.layers.Dense and tf.layers.dense, the only difference is that tf.layers.dense returns a Dense object that inherits not only from tf.keras.layers.Dense but also from the legacy base.Layer, which is therefore almost certainly related to the cause.
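
The inheritance difference can be seen directly from the method resolution order of the two classes (TF 1.x assumed; tf.layers.dense is a thin functional wrapper around tf.layers.Dense):

    import tensorflow as tf

    # tf.layers.Dense subclasses both tf.keras.layers.Dense and the
    # legacy tensorflow.python.layers.base.Layer; the pure Keras class does not.
    print(tf.layers.Dense.__mro__)
    print(tf.keras.layers.Dense.__mro__)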

Increasing the number of episodes from 20 to 25 seems to resolve the issue, suggesting that the tf.keras variant learns slightly more slowly for some reason. Since the Keras API will become the standard in TensorFlow 2.0, I am keen to understand why this is happening.

Contributor

jachiam commented Dec 15, 2018

That is bizarre and unsettling. Thanks for the heads-up. If I can find the bandwidth for it, I'll try to suss it out (although to be honest, odds are low that I'll have time for this soon).
