Cannot find a better model for this NLP problem. Any help? #430
Unanswered
MiguelCanalGarcia
asked this question in Q&A
Replies: 1 comment 1 reply
-
What accuracy do you get when you build this model and evaluate it on the test dataset? If your training loss and validation loss are not close to each other, your model is definitely overfitting or underfitting. If the loss looks fine but the accuracy is still poor, try adding more layers or tuning the number of units in the existing layers; that tends to help in many cases.
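A minimal sketch of that check, using matplotlib (an assumption, not part of the original code) and the History object that model.fit returns (named history in the code further down):

import matplotlib.pyplot as plt

def plot_history(history):
    # Compare training vs. validation curves for each tracked metric.
    for i, metric in enumerate(['loss', 'accuracy']):
        plt.subplot(1, 2, i + 1)
        plt.plot(history.history[metric], label='train')
        plt.plot(history.history['val_' + metric], label='validation')
        plt.title(metric)
        plt.legend()
    plt.show()

plot_history(history)

If the two loss curves diverge after a few epochs, the model is overfitting; if both stay high, it is underfitting.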
-
Hello, I am trying to use the imdb_reviews dataset to reach 0.9 accuracy on the test set. However, I am not able to come up with a model capable of that. Could someone help me?
LOADING DATA
import tensorflow as tf
import tensorflow_datasets as tfds
import numpy as np
(train, test), metadata = tfds.load('imdb_reviews',
                                    split=['train', 'test'],
                                    as_supervised=True,
                                    with_info=True)
PREPARING DATA
training_sentences = []
training_labels = []
testing_sentences = []
testing_labels = []
for s, l in train:
    training_sentences.append(s.numpy().decode('utf8'))
    training_labels.append(l.numpy())

for s, l in test:
    testing_sentences.append(s.numpy().decode('utf8'))
    testing_labels.append(l.numpy())
training_labels_final = np.array(training_labels)
testing_labels_final = np.array(testing_labels)
TOKENIZATION
vocab_size = 1000
embedding_dim = 12
max_len = 120
trun_type = 'post'
oov_tok = '<OOV>'  # out-of-vocabulary token
tokenizer = tf.keras.preprocessing.text.Tokenizer(num_words=vocab_size,
                                                  oov_token=oov_tok)
tokenizer.fit_on_texts(training_sentences)
sequences = tokenizer.texts_to_sequences(training_sentences)
padded_seq = tf.keras.preprocessing.sequence.pad_sequences(sequences,
                                                           maxlen=max_len,
                                                           truncating=trun_type)
test_sequences = tokenizer.texts_to_sequences(testing_sentences)
test_padded_seq = tf.keras.preprocessing.sequence.pad_sequences(test_sequences,
                                                                maxlen=max_len,
                                                                truncating=trun_type)
MODEL
model_3 = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(120,)),
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_len),
    tf.keras.layers.Conv1D(filters=36, kernel_size=5, padding='same', activation='relu'),
    tf.keras.layers.GlobalMaxPooling1D(),            # collapse the sequence dimension
    tf.keras.layers.Dense(1, activation='sigmoid'),  # single probability for binary sentiment
])
model_3.compile(
    loss='binary_crossentropy',
    optimizer=tf.keras.optimizers.Adam(),
    metrics=['accuracy']
)

history = model_3.fit(
    padded_seq, training_labels_final,
    validation_data=(test_padded_seq, testing_labels_final),
    epochs=10,
)