Description
Hi,
I want to fine-tune Gemma-3-4b-pt for an item detection task.
My results so far: the training loss reaches around 0.02 by the end. I also added a cross-validation loss, but it always fluctuates around 0.9. The test results are not satisfying: the model gives random output on every test sample.
I built my own dataset of around 1000 images with bounding boxes, and applied the logic of create_dataset.py to create the <locxxxx> column.
My data follows a format similar to the license-detection-paligemma dataset: here
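
For context, here is a minimal sketch of how a pixel bounding box can be turned into a <locxxxx> string. It assumes the PaliGemma convention (coordinates ordered y_min, x_min, y_max, x_max, each normalized to 1024 bins and zero-padded to four digits). The function name bbox_to_loc_tokens is hypothetical and is not taken from create_dataset.py, so the details there may differ.

```python
def bbox_to_loc_tokens(bbox, width, height, bins=1024):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into location tokens.

    Assumption: PaliGemma-style ordering (y_min, x_min, y_max, x_max) with
    each coordinate mapped to an integer bin in [0, bins - 1].
    """
    x_min, y_min, x_max, y_max = bbox
    # Normalize each coordinate by the matching image dimension.
    normalized = [y_min / height, x_min / width, y_max / height, x_max / width]
    tokens = []
    for value in normalized:
        index = int(round(value * (bins - 1)))
        # Clamp so boxes that spill outside the image still give valid tokens.
        index = min(max(index, 0), bins - 1)
        tokens.append(f"<loc{index:04d}>")
    return "".join(tokens)


if __name__ == "__main__":
    # A box covering the whole 640x480 image.
    print(bbox_to_loc_tokens((0, 0, 640, 480), width=640, height=480))
```

Comparing a few rows of the generated column against a conversion like this is a quick way to rule out a coordinate-order or scaling mismatch in the dataset.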

I'm training on a single A100-40GB and use model.gradient_checkpointing_enable() to avoid OOM errors. The other parameters I set are batch_size=4 and epochs=50.
As mentioned in the latest PR, I trained in only one stage, with the embedding and attention layers trained together.
I get responses like this:
detect
;
locate
[ 29 3109
]
[outputs/output_0.png] No bounding box detected. Skipping visualization.
or
detect
The leaf is a plant structure consisting of two layers of tissues. It has a variety of functions in plant reproduction, photosynthesis, gas exchange, evaporation, storage and
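
To illustrate why neither response yields a box, here is a rough sketch of a location-token parser. It assumes the same PaliGemma-style format (groups of four <locxxxx> tokens ordered y_min, x_min, y_max, x_max over 1024 bins). The name parse_loc_tokens is hypothetical, and the repository's own visualization code may parse differently.

```python
import re

# Matches one location token such as <loc0123> and captures the four digits.
LOC_PATTERN = re.compile(r"<loc(\d{4})>")


def parse_loc_tokens(text, width, height, bins=1024):
    """Return a list of pixel boxes (x_min, y_min, x_max, y_max) found in text.

    Assumption: tokens come in groups of four, ordered
    y_min, x_min, y_max, x_max; leftover tokens are ignored.
    """
    values = [int(v) for v in LOC_PATTERN.findall(text)]
    scale = bins - 1
    boxes = []
    for i in range(0, len(values) - 3, 4):
        y_min, x_min, y_max, x_max = values[i:i + 4]
        boxes.append((
            x_min / scale * width,
            y_min / scale * height,
            x_max / scale * width,
            y_max / scale * height,
        ))
    return boxes


if __name__ == "__main__":
    # The first response quoted above contains no <locxxxx> tokens at all.
    print(parse_loc_tokens("detect\n;\nlocate\n[ 29 3109\n]", 640, 480))
```

With a parser like this, both responses above produce an empty list, which matches the "No bounding box detected" message.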
You can try my training script with: python train.py --dataset_id zhiyingzou0202/object_detection_bbox_paligemma --batch_size 4 --epochs 25 --include_loc_tokens
Thanks!