[go: up one dir, main page]

Skip to content
/ RMSIN Public

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation

Notifications You must be signed in to change notification settings

Lsan2401/RMSIN

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

RMSIN

This repository is the offical implementation for "Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation." Pipeline Image

Setting Up

Preliminaries

The code has been verified to work with PyTorch v1.7.1 and Python 3.7.

  1. Clone this repository.
  2. Change directory to root of this repository.

Package Dependencies

  1. Create a new Conda environment with Python 3.7 then activate it:
conda create -n RMSIN python==3.7
conda activate RMSIN
  1. Install PyTorch v1.7.1 with a CUDA version that works on your cluster/machine (CUDA 10.2 is used in this example):
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.2 -c pytorch
  1. Install the packages in requirements.txt via pip:
pip install -r requirements.txt

The Initialization Weights for Training

  1. Create the ./pretrained_weights directory where we will be storing the weights.
mkdir ./pretrained_weights
  1. Download pre-trained classification weights of the Swin Transformer, and put the pth file in ./pretrained_weights. These weights are needed for training to initialize the model.

Datasets

We perform all experiments on our proposed dataset RRSIS-D. RRSIS-D is a new Referring Remote Sensing Image Segmentation benchmark which containes 17,402 image-caption-mask triplets. It can be downloaded from Google Drive or Baidu Netdisk (access code: sjoe).

Usage

  1. Download our dataset.
  2. Copy all the downloaded files to ./refer/data/. The dataset folder should be like this:
$DATA_PATH
├── rrsisd
│   ├── refs(unc).p
│   ├── instances.json
└── images
    └── rrsisd
        ├── JPEGImages
        ├── ann_split

Training

We use DistributedDataParallel from PyTorch for training. To run on 4 GPUs (with IDs 0, 1, 2, and 3) on a single node:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node 4 --master_port 12345 train.py --dataset rrsisd --model_id RMSIN --epochs 40 --img_size 480 2>&1 | tee ./output

Testing

python test.py --swin_type base --dataset rrsisd --resume ./your_checkpoints_path --split val --workers 4 --window12 --img_size 480

Acknowledgements

Code in this repository is built on LAVT. We'd like to thank the authors for open sourcing their project.

About

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages