A Keras implementation of an encoder-decoder architecture for a chatbot
Demonstrates how to implement a basic word-level sequence-to-sequence model.
In the general case, input sequences and output sequences have different lengths (e.g., machine translation), and the entire input sequence is required before prediction of the target can start. This requires a more advanced setup, which is what people commonly mean by "sequence-to-sequence models" when no further context is given.
- An RNN acts as the "encoder": it processes the input sequence and returns its own internal state
```python
from keras.layers import Input, LSTM

# MAX_LEN, HIDDEN_DIM and embed_layer (a shared Embedding layer, see below)
# are defined elsewhere in the project
encoder_inputs = Input(shape=(MAX_LEN,), dtype='int32')
encoder_embedding = embed_layer(encoder_inputs)
encoder_LSTM = LSTM(HIDDEN_DIM, return_state=True)
encoder_outputs, state_h, state_c = encoder_LSTM(encoder_embedding)
encoder_states = [state_h, state_c]  # the internal state handed to the decoder
```
- Another RNN acts as the "decoder": initialized with the encoder's states, it is trained to predict the next word of the target sequence given the previous words
```python
from keras.layers import Dense, TimeDistributed
from keras.models import Model

decoder_inputs = Input(shape=(MAX_LEN,))
decoder_embedding = embed_layer(decoder_inputs)
decoder_LSTM = LSTM(HIDDEN_DIM, return_state=True, return_sequences=True)
decoder_outputs, _, _ = decoder_LSTM(
    decoder_embedding, initial_state=encoder_states)
# Name the projection layer so it can be reused by the inference model below
output_layer = TimeDistributed(Dense(VOCAB_SIZE, activation='softmax'))
outputs = output_layer(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], outputs)
```
- The model is trained to turn the target sequence into the same sequence offset by one timestep in the future; in this context the training procedure is called teacher forcing
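As a minimal sketch of the training step (the arrays `encoder_input_data` and `decoder_input_data` are hypothetical names for the preprocessed index sequences; they are not defined in this README), the targets are just the decoder inputs shifted left by one timestep:

```python
import numpy as np

# Hypothetical preprocessed arrays of word indices, shape (samples, MAX_LEN)
decoder_target_data = np.zeros_like(decoder_input_data)
decoder_target_data[:, :-1] = decoder_input_data[:, 1:]  # shift left by one timestep

model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy')
model.fit([encoder_input_data, decoder_input_data],
          np.expand_dims(decoder_target_data, -1),  # sparse loss expects a trailing axis
          batch_size=64, epochs=10, validation_split=0.1)
```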
In order to decode an unknown sequence, a different approach is taken (a sampling loop is sketched after the code below):
- Encode the input sequence into state vectors
- Start with a dummy target sequence containing only the start token (here `bos`)
- Feed the state vectors and the current target sequence to the decoder to get a probability distribution over the next word
- Sample a word and append it to the target sequence
- Repeat until the sampled word is the end-of-sequence token (here `eos`) or the length limit is reached
```python
encoder_model = Model(encoder_inputs, encoder_states)

# The state inputs must match the LSTM's hidden size
decoder_state_input_h = Input(shape=(HIDDEN_DIM,))
decoder_state_input_c = Input(shape=(HIDDEN_DIM,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_LSTM(
    decoder_embedding, initial_state=decoder_states_inputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs,
                      [output_layer(decoder_outputs), state_h, state_c])
```
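Putting the two inference models together, the sampling loop sketched above looks roughly like this (`word2idx`/`idx2word` lookup tables are assumed, and greedy argmax sampling is used for simplicity):

```python
import numpy as np

def decode_sequence(input_seq):
    # Encode the input once to obtain the initial decoder states
    states = encoder_model.predict(input_seq)
    # Dummy target sequence seeded with the start token
    target_seq = np.zeros((1, MAX_LEN), dtype='int32')
    target_seq[0, 0] = word2idx['bos']  # exact token string depends on preprocessing
    decoded = []
    for t in range(MAX_LEN - 1):
        # Re-feed the growing prefix with the encoder states; the output at
        # position t is the distribution over the next word
        probs, _, _ = decoder_model.predict([target_seq] + states)
        sampled_idx = int(np.argmax(probs[0, t, :]))  # greedy sampling
        word = idx2word[sampled_idx]
        if word == 'eos':
            break
        decoded.append(word)
        target_seq[0, t + 1] = sampled_idx
    return ' '.join(decoded)
```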
- We'll be using the Cornell Movie-Dialogs Corpus as our dataset
- Data is extracted using `process_data.py` into `encoders_inputs.txt` (context) and `decoder_inputs.txt` (response); see the sketch below
- `<BOS>` and `<EOS>` tags are added to `decoder_inputs` to mark sequence beginning and end
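A rough sketch of what this extraction step might produce (illustrative only; `dialog_pairs` is a hypothetical list of (context, response) strings, and the real `process_data.py` may differ):

```python
with open('data/encoders_inputs.txt', 'w') as enc_f, \
     open('data/decoder_inputs.txt', 'w') as dec_f:
    for context, response in dialog_pairs:
        enc_f.write(context + '\n')
        dec_f.write('<BOS> ' + response + ' <EOS>\n')  # tag the decoder side only
```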
- Sentences are tokenized, indexed, and embedded with pretrained GloVe vectors; shape `(samples, max_limit, glove_dim)` (an embedding sketch follows this list)
- Embedding matrix; shape `(vocab, glove_dim)`
- Encoder input - sequence; shape `(None, 20)`
- Encoder output - states; shape `(None, 200)`
- Decoder input - sequence; shape `(None, 20)`
- Decoder output - outputs; shape `(None, 20, 200)`
- Time distributed - outputs; shape `(None, 20, 15000)`
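The `embed_layer` used in the snippets above can be built from pretrained GloVe vectors roughly as follows (a sketch: the GloVe file name and the `word2idx` lookup are assumptions, and the 200-dimensional vectors are inferred from the shapes above):

```python
import numpy as np
from keras.layers import Embedding

GLOVE_DIM = 200  # inferred from the state shape (None, 200) above

# Parse the pretrained vectors into a word -> vector dict
embeddings_index = {}
with open('glove.6B.200d.txt', encoding='utf-8') as f:  # assumed file name
    for line in f:
        values = line.split()
        embeddings_index[values[0]] = np.asarray(values[1:], dtype='float32')

# Rows of the matrix line up with the word indices; shape (vocab, glove_dim)
embedding_matrix = np.zeros((VOCAB_SIZE, GLOVE_DIM))
for word, i in word2idx.items():  # word2idx: assumed word -> index lookup
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector

embed_layer = Embedding(VOCAB_SIZE, GLOVE_DIM,
                        weights=[embedding_matrix],
                        input_length=MAX_LEN,
                        trainable=False)
```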
- Download the data into the `data` folder
- `main.py` (with no args): training
- `main.py -r`: reply mode
- `main.py -r --message=$message`: replies to `$message`