Finetuning Gemma3 on my dataset

Hi,

I want to fine-tune **Gemma-3-4b-pt** for an item detection task.

My results: the **training loss** reaches around 0.02 by the end. I also added a **cross-validation loss**, but it fluctuates around 0.9 throughout. The test results are not satisfying: the model gives random output on every test sample.

I built a dataset of around 1000 images with bounding boxes, and applied the logic of `create_dataset.py` to create the `<locxxxx>` column.
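
For readers unfamiliar with the `<locxxxx>` format, here is a minimal sketch of how a pixel-space box can be turned into location tokens. It assumes the PaliGemma convention (coordinates binned into 1024 steps, ordered `y_min, x_min, y_max, x_max`, followed by the label). The function name `bbox_to_loc_tokens` and the example values are hypothetical, and this is not necessarily what `create_dataset.py` does line for line.

```python
def bbox_to_loc_tokens(bbox, image_width, image_height, label):
    """Convert a pixel-space (x_min, y_min, x_max, y_max) box to a loc-token string."""
    x_min, y_min, x_max, y_max = bbox

    def to_bin(value, size):
        # Normalize to [0, 1], scale to 1024 bins, and clamp to 0..1023.
        return min(1023, max(0, int(value / size * 1024)))

    # Assumed ordering (PaliGemma convention): y_min, x_min, y_max, x_max.
    bins = [
        to_bin(y_min, image_height),
        to_bin(x_min, image_width),
        to_bin(y_max, image_height),
        to_bin(x_max, image_width),
    ]
    return "".join(f"<loc{b:04d}>" for b in bins) + f" {label}"


# Hypothetical example: a 640x480 image with one box labelled "plate".
example = bbox_to_loc_tokens((100, 50, 300, 200), 640, 480, "plate")
print(example)
```

If the dataset column uses a different coordinate order or bin count than the decoding step expects, the model can reach a low training loss and still produce boxes that look random, so it is worth checking one row by hand.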

My data follows a format similar to the _license-detection-paligemma_ dataset: [here](https://huggingface.co/datasets/zhiyingzou0202/object_detection_bbox_paligemma)
<img width="1120" height="377" alt="Image" src="https://github.com/user-attachments/assets/e231b8c6-5288-49ad-95ff-ea340fa8b3d2" />

I'm training on a single **A100-40GB**, and I call `model.gradient_checkpointing_enable()` to avoid OOM errors. The other parameters I set are `batch_size=4, epoch=50`.
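
To put these settings in perspective, here is a rough sketch of the resulting schedule. It assumes exactly 1000 training images (the real count is only "around 1000"), no gradient accumulation, and that the last partial batch is kept. The helper `training_schedule` is hypothetical and not part of the training script.

```python
import math


def training_schedule(num_images, batch_size, epochs):
    """Return optimizer steps per epoch and in total for a simple training loop."""
    # One optimizer step per batch; the last partial batch is assumed to be kept.
    steps_per_epoch = math.ceil(num_images / batch_size)
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * epochs,
        # Each image is seen once per epoch.
        "passes_per_image": epochs,
    }


schedule = training_schedule(num_images=1000, batch_size=4, epochs=50)
print(schedule)
```

Under these assumptions every image is seen 50 times over 12,500 steps, which would be consistent with a training loss near 0.02 while the validation loss stays flat.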

As mentioned in the latest PR, I trained in a single stage, with the embedding and attention layers trained together.
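
As an illustration of what "embedding and attention trained together" can mean in practice, the sketch below picks parameter names by substring. The substrings `embed_tokens` and `self_attn`, the helper `select_trainable`, and the example names are assumptions for illustration only; the actual selection logic in the training script may differ.

```python
def select_trainable(param_names, keywords=("embed_tokens", "self_attn")):
    """Return the parameter names that contain any of the given substrings."""
    return [name for name in param_names if any(key in name for key in keywords)]


# Hypothetical parameter names, only to show the filtering behaviour.
names = [
    "language_model.embed_tokens.weight",
    "language_model.layers.0.self_attn.q_proj.weight",
    "language_model.layers.0.mlp.up_proj.weight",
]
selected = select_trainable(names)
print(selected)

# With a PyTorch model, the same filter could be applied roughly like this:
#   trainable = set(select_trainable(n for n, _ in model.named_parameters()))
#   for n, p in model.named_parameters():
#       p.requires_grad_(n in trainable)
```

Printing the selected names before training is a cheap way to confirm that the intended layers, and only those, are being updated.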

I get responses like this: 
```
detect

;

locate

[ 29 3109

]
[outputs/output_0.png] No bounding box detected. Skipping visualization.

```
or
```
detect

The leaf is a plant structure consisting of two layers of tissues. It has a variety of functions in plant reproduction, photosynthesis, gas exchange, evaporation, storage and
```
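
To quantify how often the outputs contain a well-formed detection, a small parser like the following can be run over the generated text. It is a sketch that assumes the expected output is four `<locNNNN>` tokens followed by a label, with detections separated by `;`. The name `parse_detections` is hypothetical and is not the parser used by the visualization code.

```python
import re

# Four consecutive loc tokens, then an optional label up to the next ';' or '<'.
LOC_GROUP = re.compile(r"((?:<loc\d{4}>){4})\s*([^;<]*)")


def parse_detections(text):
    """Return a list of ((c0, c1, c2, c3), label) tuples found in the text."""
    detections = []
    for match in LOC_GROUP.finditer(text):
        coords = tuple(int(v) for v in re.findall(r"<loc(\d{4})>", match.group(1)))
        detections.append((coords, match.group(2).strip()))
    return detections


good = parse_detections("<loc0106><loc0160><loc0426><loc0480> plate")
bad = parse_detections("detect\n\n;\n\nlocate\n\n[ 29 3109\n\n]")
print(good, bad)
```

The first response shown above yields no detections with this parser, which matches the "No bounding box detected" message: the model is emitting plain digits rather than `<locNNNN>` tokens.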

You can try my training script with `python train.py --dataset_id zhiyingzou0202/object_detection_bbox_paligemma --batch_size 4 --epochs 25 --include_loc_tokens`

Thanks!

 cc @sergiopaniego @ariG23498 @haixuantao

Finetuning Gemma3 on my dataset #43
