The packing problem has gained much relevance with the recent transformation of the delivery and retail industry. Companies all over the world now run massive logistics and operations schemes, and the effectiveness of their warehouses is tightly bound to how well their products are packed into trucks for distribution. Optimizing this process can lead to major improvements in performance, time use, and resource management, and ultimately to increased profits.

Seeking to deliver this optimization, this work proposes a new method called "Deep Box Packing" (DBP), an online system that provides an optimized packing strategy for an arbitrary set of three-dimensional boxes arriving in real time. DBP was trained using Deep Reinforcement Learning and leverages the power of attention mechanisms in a modified version of the Transformer network, called here the Mapping Transformer. It was conceived to work under partial information, in real time, and to answer all three inherent questions of packing at every moment: which box to take (selection), where to place it in the container (position), and how to place it (orientation). Its reward function was tailored not only to optimize the final volume utilization of the container but also to enforce the feasibility of the packing sequence, upholding constraints such as box stability and the accessibility of packing positions from the entrance of the container.

Under this scenario, DBP achieved outstanding results on the tested instances, reaching up to 100% volume utilization with fully feasible packings. In comparative tests, DBP considerably improved on the results of a wall-building heuristic (WB-LB-Greedy) and showed high generalization capacity across different sizes of the information window (the number of boxes from the whole sequence it can see and choose from at any moment). Visual step-by-step analyses of DBP's behavior on generated packing sequences further showed that it acquires a solid geometric understanding and has great potential to be extended to a real warehouse scenario.
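As a rough illustration of this reward design, the sketch below couples the volume-utilization gain of a single placement with binary feasibility checks. It is a hypothetical example only: the function name, signature, and sign convention are assumptions and do not reproduce the Constructive-Utilization reward implemented in reward.py.

```python
# Hypothetical sketch of a utilization-plus-feasibility reward; names,
# signature, and sign convention are assumptions, not the code in reward.py.

def step_reward(box_volume, container_volume, is_stable, is_accessible):
    """Reward for placing one box.

    box_volume / container_volume is the utilization gained by this placement;
    the feasibility flags gate the reward so infeasible placements are penalized.
    """
    utilization_gain = box_volume / container_volume
    feasible = is_stable and is_accessible
    # Reward utilization only for feasible placements; penalize infeasible ones.
    return utilization_gain if feasible else -utilization_gain


# Example: a 2x2x2 box in a 10x10x10 container.
print(step_reward(8, 1000, True, True))   # 0.008 (stable and reachable)
print(step_reward(8, 1000, True, False))  # -0.008 (position not accessible)
```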
## DBP file structure:
* run.sh: Script for running DBP directly from the terminal (calls main.py)
* main.py: File for training/testing DBP (calls data.py for generating data, DBP.py for packing, and draw.py for visualizing and logging results)
* DBP.py: Packs sequences of boxes into containers according to the DBP algorithm (calls transformer.py for the neural network architecture and reward.py for computing the reward function used in the training procedure)
* data.py: Generates box sequences and container placeholders (a minimal generation sketch follows this list)
* transformer.py: Neural Network architecture used by DBP (Mapping Transformer)
* reward.py: Implements the Constructive-Utilization reward function
* draw.py: File for drawing and logging results obtained from DBP
* watch.py: Independent file used after training/testing for visualizing packing results interactively, step by step
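As a minimal sketch of what such generated data might look like, the snippet below draws random integer box dimensions and builds an empty occupancy grid as a container placeholder. All names, size ranges, and the grid representation are assumptions for illustration, not the actual parameters or code of data.py.

```python
# Hypothetical sketch of box-sequence generation; sizes, ranges, and the
# container representation are assumptions, not the parameters of data.py.
import random

def generate_sequence(num_boxes, max_edge=5, seed=None):
    """Return a list of (width, height, depth) boxes with random integer edges."""
    rng = random.Random(seed)
    return [tuple(rng.randint(1, max_edge) for _ in range(3))
            for _ in range(num_boxes)]

def empty_container(width, height, depth):
    """Container placeholder: a 3D occupancy grid, 0 = free cell."""
    return [[[0] * depth for _ in range(height)] for _ in range(width)]

boxes = generate_sequence(10, seed=42)
container = empty_container(10, 10, 10)
print(boxes[:3])
```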
## DBP folders:
* data: Holds the box sequences generated by data.py (generated sequences can be used by multiple training/testing procedures)
* results: Holds the results for each training/test run. Already contains the experiment folder mentioned in Chapter 4 of this thesis for an information window of size 10, as well as its test results for 100 validation containers.
## WB-LB-Greedy file structure:
* run.sh: Script for running WB-LB-Greedy directly from the terminal (calls main.py)
* main.py: File for running WB-LB-Greedy (calls data.py for generating data and pack.py for packing & logging)
* pack.py: Packs a sequence of boxes into containers according to the WB-LB-Greedy algorithm (calls tools.py for running WB-LB-Greedy and for visualizing results; a simplified placement sketch follows this list)
* tools.py: The WB-LB-Greedy algorithm and auxiliary functions for visualizing its results
* data.py: Generates box sequences and container placeholders
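The core placement rule of a left-bottom greedy can be sketched as follows. This is a simplified, illustrative reimplementation on an integer occupancy grid with a single fixed orientation; the actual scan order, tie-breaking, and wall-building logic of WB-LB-Greedy in pack.py and tools.py may differ.

```python
# Illustrative left-bottom greedy placement on an integer occupancy grid;
# a simplification of the WB-LB-Greedy idea, not the code in pack.py/tools.py.

def fits(grid, x, y, z, w, h, d):
    """Check that a w*h*d box at (x, y, z) lies inside the grid on free cells."""
    X, Y, Z = len(grid), len(grid[0]), len(grid[0][0])
    if x + w > X or y + h > Y or z + d > Z:
        return False
    return all(grid[i][j][k] == 0
               for i in range(x, x + w)
               for j in range(y, y + h)
               for k in range(z, z + d))

def place_lb_greedy(grid, box):
    """Place the box at the first feasible position, scanning bottom, then
    left, then deep; the thesis heuristic's exact tie-breaking may differ."""
    w, h, d = box
    X, Y, Z = len(grid), len(grid[0]), len(grid[0][0])
    for y in range(Y):           # lowest layer first (bottom)
        for x in range(X):       # then leftmost
            for z in range(Z):   # then deepest into the container
                if fits(grid, x, y, z, w, h, d):
                    # Mark the occupied cells and return the chosen position.
                    for i in range(x, x + w):
                        for j in range(y, y + h):
                            for k in range(z, z + d):
                                grid[i][j][k] = 1
                    return (x, y, z)
    return None  # box does not fit anywhere

grid = [[[0] * 10 for _ in range(10)] for _ in range(10)]  # 10x10x10 container
print(place_lb_greedy(grid, (2, 3, 2)))  # (0, 0, 0) on an empty container
```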
## WB-LB-Greedy folders:
* data: Holds the box sequences generated by data.py (generated sequences can be used by multiple training/testing procedures)
* results: Holds the results for each run of the WB-LB-Greedy algorithm
Thanks to "TAP-Net: Transport-and-Pack using Reinforcement Learning" authors for the code made available at https://vcc.tech/research/2020/TAP, which was used as base for this thesis.