Add object detection training docs #1435

Open
wants to merge 5 commits into base: main

Alex-idk

Adds documentation for training a custom model.

@Alex-idk requested a review from a team as a code owner on September 27, 2024, 11:58

Contributor

@gerth2 left a comment

Timely. I'll reference this and see if I can give it a shot.

```bash
git clone https://github.com/ultralytics/yolov5.git
git clone https://github.com/airockchip/yolov5.git airockchip-yolov5
# Download the ONNX-to-RKNN conversion script
wget https://gist.githubusercontent.com/Alex-idk/9a512ca7bd263892ff6991a856f1a458/raw/e8c0c9d8d5a1a60a2bbe72c065e04a261300baac/onnx2rknn.py
```

Contributor

Script looks reasonable, seems like the same thing most online resources have been hinting at.

Do we wanna pull this into the photonvision repo proper?

Contributor

I think this script should be owned by the PV organization. Once we start supporting more platforms, it can be updated to go from a single .onnx file to all of our supported platforms.

Contributor

As part of this PR, commit this file to our scripts folder or something in the monorepo.

Author

It is inside the scripts folder. However, I have since updated the script and haven't tested the new version yet.


#### Training Command

Please research what each of these parameters does and adjust them to fit your dataset and training needs. Make sure to change the number of classes in the `models/yolov5s.yaml` file to match the number of classes in your dataset; otherwise you will run into problems with class labeling. As of September 2024, only YOLOv5s models have been tested.
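For illustration, the class-count change described above can be sketched in Python. This is a minimal sketch, not part of the PR: the helper name `set_num_classes` is hypothetical, and it assumes the model config carries a top-level `nc: <number>` line, as the stock `models/yolov5s.yaml` does.

```python
import re


def set_num_classes(yaml_text: str, num_classes: int) -> str:
    """Return yaml_text with the first top-level `nc:` value replaced.

    Hypothetical helper: assumes the config has a line like
    `nc: 80  # number of classes` starting in column 0.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    pattern = re.compile(r"^(nc:\s*)\d+", flags=re.MULTILINE)
    # A callable replacement avoids ambiguous backreferences such as "\13".
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{num_classes}", yaml_text, count=1
    )
    if count == 0:
        raise ValueError("no top-level 'nc:' line found")
    return new_text


if __name__ == "__main__":
    sample = "# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.33\n"
    print(set_num_classes(sample, 3))
```

Editing the text with a regular expression rather than a YAML library keeps the comments in the file intact, at the cost of only handling the simple `nc: <integer>` form.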
Contributor

"you must know what you are doing" lolz.

Is there anything major you could recommend an initial user look at, just to get them started? Things like "Be careful touching these parameters" or "These parameters are inter-linked" or "Here's the values we started with for the note detector" or similar?

Author

@Alex-idk Sep 27, 2024

The main parameters for training are dependent on your dataset and hardware. I can add a link to a docs page that explains what they do. But I'm also assuming that the people who are making a custom model already have an idea of what they are doing. The special thing about doing it for PV is the conversion to RKNN.

Please research what each of these parameters does and adjust them to fit your dataset and training needs. Make sure to change the number of classes in the `models/yolov5s.yaml` file to match the number of classes in your dataset; otherwise you will run into problems with class labeling. As of September 2024, only YOLOv5s models have been tested.

```bash
python train.py --img 640 --batch 16 --epochs 10 --data path/to/dataset/data.yaml --cfg 'models/yolov5s.yaml' --weights '' --cache
```

Contributor

Might still need a concrete example of what data.yaml is

Contributor

It might just be best to point people towards some other resources for preparing yolov5 datasets.
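As a rough illustration of what a `data.yaml` might contain, here is a minimal Python sketch that renders one. It assumes the commonly used YOLOv5 dataset layout with `train`, `val`, `nc`, and `names` keys; the helper name `build_data_yaml`, the directory paths, and the class name are all hypothetical placeholders, and the YOLOv5 dataset documentation remains the authoritative reference.

```python
def build_data_yaml(train_dir: str, val_dir: str, class_names: list[str]) -> str:
    """Render a minimal YOLOv5-style dataset description.

    Hypothetical helper. Assumes class names contain no quote characters,
    since they are written as single-quoted YAML strings.
    """
    if not class_names:
        raise ValueError("at least one class name is required")
    names = ", ".join(f"'{name}'" for name in class_names)
    return (
        f"train: {train_dir}\n"
        f"val: {val_dir}\n"
        f"nc: {len(class_names)}\n"
        f"names: [{names}]\n"
    )


if __name__ == "__main__":
    # Placeholder paths and class name, for illustration only.
    print(build_data_yaml("images/train", "images/val", ["note"]))
```

Deriving `nc` from the length of the class list keeps it in step with `names`, which is the same count that has to go into the model config.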

#### Export Command

```bash
cd /path/to/airockchip-yolov5 && python export.py --weights '/path/to/best.pt' --rknpu --include 'onnx'
```

Contributor

If the instructions above were used, the path should just be airockchip-yolov5, right?

Contributor

Rather than including this as a separate step, could it be merged into the current conversion command?

My thought is people who want to share models (say the PV devs, or community members) could benefit from a single script that goes from the weights to all of our supported platforms.

```bash
python onnx2rknn.py /path/to/best.onnx /path/to/export/best.rknn /path/to/imagePaths.txt
```
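The reviewers' idea of a single weights-to-RKNN step could be sketched as a thin wrapper that chains the two commands quoted in this thread. This is only a sketch: `build_pipeline_commands` and `run_pipeline` are hypothetical names, and it assumes `export.py` writes the `.onnx` file next to the weights with the same stem, which should be verified against the airockchip fork.

```python
import subprocess
from pathlib import Path


def build_pipeline_commands(weights, image_list, export_repo="airockchip-yolov5"):
    """Return (working_dir, argv) pairs for the export and conversion steps.

    Hypothetical helper. Assumes export.py places best.onnx beside best.pt.
    """
    weights = Path(weights)
    onnx_path = weights.with_suffix(".onnx")  # assumed export output location
    rknn_path = weights.with_suffix(".rknn")
    export_cmd = ["python", "export.py", "--weights", str(weights),
                  "--rknpu", "--include", "onnx"]
    convert_cmd = ["python", "onnx2rknn.py", str(onnx_path),
                   str(rknn_path), str(image_list)]
    # The export runs inside the cloned fork; the conversion runs in place.
    return [(export_repo, export_cmd), (None, convert_cmd)]


def run_pipeline(weights, image_list, export_repo="airockchip-yolov5"):
    """Run both steps in order, stopping at the first failure."""
    for cwd, cmd in build_pipeline_commands(weights, image_list, export_repo):
        subprocess.run(cmd, cwd=cwd, check=True)
```

Because the export step runs with the fork as its working directory, absolute paths for the weights and image list are the safer choice when calling `run_pipeline`.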

If you have any questions about this process, feel free to mention `alex_idk` in the PhotonVision Discord server.
Contributor

A wise man once told me to never put my name on anything :)

@Alextopher
Contributor

I'm going to try following these instructions and see if I can get it to work 👍

@Alex-idk
Author

I will boot up a VM later, try this process from scratch, and look for anything I missed.

Contributor

@Alextopher left a comment

Training right now, will report back soon.

#### Conversion Command

Run the script, passing in the ONNX model and a text file containing paths to images from your dataset:
Contributor

@Alextopher Sep 27, 2024

> and a text file containing paths to images from your dataset:

So we've got to create this file ourselves?

Author

Yeah, I can probably make a revision of the conversion script to just take a path and make the file itself.

Contributor

`find /path -type f -name "*.jpg" > images.txt`

Author

That works too.
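The `find` one-liner could also be folded into the conversion script itself, as suggested earlier in this thread. A minimal Python sketch of that idea follows; the function name `write_image_list` and the default extension list are assumptions for illustration, and whether `onnx2rknn.py` needs absolute paths has not been checked here.

```python
from pathlib import Path


def write_image_list(dataset_dir, list_path, extensions=(".jpg", ".jpeg", ".png")):
    """Write one image path per line to list_path and return the image count.

    Hypothetical helper: walks dataset_dir recursively and keeps files whose
    suffix (case-insensitive) is in `extensions`.
    """
    root = Path(dataset_dir)
    images = sorted(
        p.resolve()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )
    Path(list_path).write_text(
        "".join(f"{p}\n" for p in images), encoding="utf-8"
    )
    return len(images)


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    count = write_image_list("path/to/dataset", "images.txt")
    print(f"wrote {count} image paths")
```

Unlike the shell version, this matches several extensions regardless of case and writes the paths in sorted order, so repeated runs produce the same file.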

@Alextopher
Contributor

Alextopher commented Sep 27, 2024

I was able to follow the instructions and got the model running on an Orange Pi. I'll verify tonight whether the model is working.

As for the state of this PR - I think this is a huge help for the PV developers. I'm no longer concerned about the bus factor regarding training object detection models. However, I don't think these instructions would be enough for the average team.

I wonder if we can just change the name of this content from "training a custom model" to "converting trained model to RKNN" for now. I think a proper "training custom model" section would require teaching how to set up an environment, installing all of the dependencies, how to label data, export to YOLO formats, training parameters, and only then conversion and uploading.

@mcm001
Contributor

mcm001 commented Oct 13, 2024

Yeah, broadly agree with @Alextopher -- it's a great set of docs for developers, and as long as we call that out I'd like to get this shipped before it bitrots :)

@Alex-idk
Author

Sorry for the late response, I've been busy with school and theater. Here is a revised version of the docs that removes the model training part. I will work on an update to the conversion script so that it takes a directory instead of a file with paths, and clean it up a bit.

Is the scripts folder a good place for the conversion script to go in the PV repo?

@mcm001
Contributor

mcm001 commented Oct 18, 2024

Sounds legit

@mcm001
Contributor

mcm001 commented Oct 25, 2024

Any chance we can get this shoved in for beta (maybe tentatively this weekend?)

@mcm001
Contributor

mcm001 commented Nov 1, 2024

Is this ready?

@mcm001
Contributor

mcm001 commented Nov 2, 2024

immediately starts pushing new stuff
lol

@Sam948-byte
Contributor

Covered by #1723 and #1715. We can still leave it open to document this alternative process, but maybe make it a draft?

5 participants