Dec 30, 2024

Awesome Described Object Detection

@chi_xie

News

[02/14/2024] Evaluation results for several SOTA methods (SPHINX (the first MLLM evaluated!), G-DINO, UNINEXT, etc.) have been released, together with a leaderboard for D³. :fire::fire:

[10/12/2023] We released an awesome-described-object-detection list to collect and track related works.

[09/22/2023] Our DOD paper just got accepted by NeurIPS 2023! :fire:

[07/25/2023] This toolkit is available on PyPI now. You can install this repo with pip install ddd-dataset.

[07/25/2023] The paper preprint introducing the DOD task and the dataset is available on arXiv. Check it out!

[07/18/2023] We have released our Description Detection Dataset (D³) and the first version of the toolbox. You can download it now for your project.

[07/14/2023] Our GRES paper has been accepted by ICCV 2023.

Task and Dataset Highlight

The D³ dataset is meant for the Described Object Detection (DOD) task. In the image below we show the difference between Referring Expression Comprehension (REC), Object Detection/Open-Vocabulary Detection (OVD) and Described Object Detection (DOD). OVD detects objects based on a category name, and each category can have zero to multiple instances; REC grounds one region based on a language description, whether or not the object truly exists; DOD detects all instances in each image of the dataset, based on a flexible reference. Related works are tracked in the awesome-DOD list.
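
To make the contrast between REC and DOD outputs concrete, here is a minimal sketch. It is an illustration only, not the method of any model in the paper: the function names, the `(box, score)` candidate format and the score threshold are all assumptions introduced for this example.

```python
def rec_predict(scored_boxes):
    """REC-style output: always exactly one box, the highest-scoring candidate.

    scored_boxes is a hypothetical list of (box, score) pairs for one description.
    """
    if not scored_boxes:
        raise ValueError("REC needs at least one candidate box")
    return max(scored_boxes, key=lambda item: item[1])[0]


def dod_predict(scored_boxes, threshold=0.5):
    """DOD-style output: every box whose score passes the threshold, possibly none.

    The threshold value is an assumption made for illustration.
    """
    return [box for box, score in scored_boxes if score >= threshold]
```

When the described object is absent from the image and every candidate scores low, `rec_predict` still returns a box, while `dod_predict` returns an empty list. When several instances match, `dod_predict` returns all of them.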

For more information on the characteristics of this dataset, please refer to our paper.

Download

Currently we host the D³ dataset on cloud drives. You can download the dataset from Google Drive or Baidu Pan.

After downloading d3_images.zip (images in the dataset), d3_pkl.zip (dataset information for this toolkit) and d3_json.zip (annotations for evaluation), extract these three zip files to your own IMG_ROOT, PKL_ANNO_PATH and JSON_ANNO_PATH directories. These paths are used when you run inference or evaluation on this dataset.
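
The extraction step can be scripted with the Python standard library. The following is a minimal sketch, not part of the toolkit: the helper name `extract_archives` and any directory names you pass to it are assumptions, and the internal folder layout of each archive is not described here, so check the extracted contents before pointing the toolkit at them.

```python
import zipfile
from pathlib import Path


def extract_archives(download_dir, targets):
    """Extract each archive listed in `targets` into its own target directory.

    `targets` maps an archive file name (e.g. "d3_images.zip") to the directory
    it should be extracted into. Returns a dict of archive name -> target Path.
    """
    download_dir = Path(download_dir)
    extracted = {}
    for archive_name, target_dir in targets.items():
        archive_path = download_dir / archive_name
        if not archive_path.is_file():
            raise FileNotFoundError(f"missing archive: {archive_path}")
        target_dir = Path(target_dir)
        target_dir.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(target_dir)
        extracted[archive_name] = target_dir
    return extracted
```

For example, `extract_archives("downloads", {"d3_images.zip": "data/d3/images", "d3_pkl.zip": "data/d3/pkl", "d3_json.zip": "data/d3/json"})` would unpack the three archives into three separate directories. All of these paths are placeholders to replace with your own.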

Installation

git clone https://github.com/shikras/d-cube.git
# option 1: install it as a Python package
cd d-cube
python -m pip install .

# option 2: copy the d-cube/d_cube directory into the root directory of your own repository

Usage

Please refer to the documentation 📚 for more details.
Our toolbox is similar to cocoapi in style.

Here is a quick example of how to use D³:

from d_cube import D3

# IMG_ROOT and PKL_ANNO_PATH are the directories you extracted the dataset to (see Download)
d3 = D3(IMG_ROOT, PKL_ANNO_PATH)
all_img_ids = d3.get_img_ids()  # get all image ids in the dataset
all_img_info = d3.load_imgs(all_img_ids)  # load image info for a list of image ids
img_path = all_img_info[0]["file_name"]  # path of one image, so you can load it and run inference
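
To run inference over the whole dataset rather than a single image, the same two calls can be wrapped in a small helper. This is a sketch under stated assumptions: `list_image_paths` is a hypothetical helper, not a toolkit function, and it assumes that `"file_name"` is stored relative to the image root. If the toolkit already returns full paths, drop the join.

```python
import os


def list_image_paths(d3, img_root):
    """Return one path per image known to a D3-like dataset object.

    `d3` only needs the get_img_ids() and load_imgs() methods shown above.
    """
    img_ids = d3.get_img_ids()
    img_infos = d3.load_imgs(img_ids)
    # Assumption: "file_name" is relative to img_root.
    return [os.path.join(img_root, info["file_name"]) for info in img_infos]
```

Because the helper relies only on the two methods and the `"file_name"` key used in the quick example, it can be tried with a stub object before the dataset is downloaded.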

Some frequently asked questions are answered in this Q&A file.

Citation

If you use our D³ dataset, this toolbox, or otherwise find our work valuable, please cite our papers:

@inproceedings{xie2023DOD,
  title={Described Object Detection: Liberating Object Detection with Flexible Expressions},
  author={Xie, Chi and Zhang, Zhao and Wu, Yixuan and Zhu, Feng and Zhao, Rui and Liang, Shuang},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS)},
  year={2023}
}

@inproceedings{wu2023gres,
  title={Advancing Referring Expression Segmentation Beyond Single Image},
  author={Wu, Yixuan and Zhang, Zhao and Xie, Chi and Zhu, Feng and Zhao, Rui},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2023}
}

More works related to Described Object Detection are tracked in this list: awesome-described-object-detection.
