First of all, the paper is really cool and an excellent choice, and I really enjoyed seeing you go through it. The mistake is not a big deal; it will probably just make the model work a bit better, since you are adding some parameters.
In the paper they do not use a bias on the linear projection layer (theta); see the formula on page 4. You did. The reason this is unnecessary is that the projection is followed by a linear activation, so all of those biases just get summed together, and you may as well use a single bias to make it easier on the computer.
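To make the bias point concrete, here is a minimal sketch in plain Python (not the Keras code from the video). The weights `W1`, `W2`, the biases `b1`, `b2`, and the input `x` are made-up values, and the helpers `matvec` and `add` are hypothetical names used only for this illustration. It just checks that summing two projections that each carry a bias gives the same result as summing bias-free projections and adding one merged bias.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def add(u, v):
    """Element-wise sum of two vectors of equal length."""
    return [a + b for a, b in zip(u, v)]


# Made-up values for illustration only.
W1 = [[1.0, 2.0], [0.0, 1.0]]
W2 = [[0.5, -1.0], [2.0, 0.0]]
b1 = [0.25, -0.5]
b2 = [1.0, 0.5]
x = [3.0, -2.0]

# Each projection has its own bias, then the outputs are summed
# (a linear activation leaves them unchanged).
with_two_biases = add(add(matvec(W1, x), b1), add(matvec(W2, x), b2))

# Bias-free projections, summed, plus a single merged bias.
merged_bias = add(b1, b2)
with_one_bias = add(add(matvec(W1, x), matvec(W2, x)), merged_bias)

print(with_two_biases)
print(with_one_bias)
```

Both lists come out identical, so the extra bias vectors add parameters without adding anything the model can express.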
On a bigger note, I think you are better off using a Keras model instead of writing a custom layer. For one, it's shorter. More importantly, custom layers risk errors if you don't update get_config, whereas models can be used as is. Also, in my experience playing around with custom layers, they can sometimes run very slowly, since you are not letting TensorFlow optimize everything it can. Pretty much anything you need can be done with models and Lambda layers.
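A rough sketch of the two options, assuming TensorFlow 2 with `tf.keras`. The names `build_projection` and `Projection`, and the sizes 8 and 4, are hypothetical and not taken from the video or the paper. The first option wraps a built-in `Dense` layer in a functional model; the second is a custom layer, which has to override `get_config` so its constructor arguments can be restored.

```python
import tensorflow as tf


def build_projection(input_dim, units):
    """Option 1: a functional model built from stock layers."""
    inputs = tf.keras.Input(shape=(input_dim,))
    outputs = tf.keras.layers.Dense(units, use_bias=False)(inputs)
    return tf.keras.Model(inputs, outputs, name="projection")


class Projection(tf.keras.layers.Layer):
    """Option 2: a custom layer that does the same projection."""

    def __init__(self, units, use_bias=False, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.use_bias = use_bias
        self.dense = tf.keras.layers.Dense(units, use_bias=use_bias)

    def call(self, inputs):
        return self.dense(inputs)

    def get_config(self):
        # Without this override, `units` and `use_bias` would be missing
        # from the config and the layer could not be rebuilt from it.
        config = super().get_config()
        config.update({"units": self.units, "use_bias": self.use_bias})
        return config


model = build_projection(input_dim=8, units=4)

layer = Projection(units=4)
restored = Projection.from_config(layer.get_config())
```

The functional version needs no extra bookkeeping, while the custom layer only round-trips because of the `get_config` override.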
The only real reason to go ahead and make a custom layer is if you want something truly unique, like the Snake layer in TensorFlow Addons. Still, I do appreciate the learning experience; it's a good tool to have, and I learned a lot from seeing you work.