IMAGGarment: Fine-Grained Garment Generation for Controllable Fashion Design

1 National University of Singapore; 2 Nanjing University of Science and Technology;
3 Nanjing University; 4 Nanjing Forestry University

Abstract

This paper presents IMAGGarment, a fine-grained garment generation (FGG) framework that enables high-fidelity garment synthesis with precise control over silhouette, color, and logo placement. Unlike existing methods that are limited to single-condition inputs, IMAGGarment addresses the challenges of multi-conditional controllability in personalized fashion design and digital apparel applications. Specifically, IMAGGarment employs a two-stage training strategy to separately model global appearance and local details, while enabling unified and controllable generation through end-to-end inference. In the first stage, we propose a global appearance model that jointly encodes silhouette and color using a mixed attention module and a color adapter. In the second stage, we present a local enhancement model with an adaptive appearance-aware module to inject user-defined logos and spatial constraints, enabling accurate placement and visual consistency. To support this task, we release GarmentBench, a large-scale dataset comprising over 180K garment samples paired with multi-level design conditions, including sketches, color references, logo placements, and textual prompts. Extensive experiments demonstrate that our method outperforms existing baselines, achieving superior structural stability, color fidelity, and local controllability performance. Code, models, and datasets are publicly available at https://github.com/muzishen/IMAGGarment.

Motivation

[Figure: motivation]

Method

[Figure: method overview]

Our framework comprises two components: a global appearance model (stage I) and a local enhancement model (stage II). Together they explicitly disentangle and jointly control global appearance and local details under multi-conditional guidance, enabling accurate synthesis of garment silhouette, color, and logo placement. The global appearance model first generates the latent of a coarse garment image, conditioned on the textual prompt, garment silhouette, and color palette. The local enhancement model then refines this latent by integrating the user-defined logo and spatial constraints, producing the final high-fidelity garment image with fine-grained controllability.
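The data flow between the two stages can be illustrated with a toy sketch. This is not the released implementation: the class names, function names, and the list-based "latent" are all hypothetical stand-ins, and the real models are diffusion networks rather than simple fill and paste operations. The sketch only shows which conditions enter at which stage.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Toy 2-D grid standing in for a diffusion latent (assumption for illustration).
Latent = List[List[float]]


@dataclass
class GlobalConditions:
    """Stage I inputs (hypothetical container)."""
    prompt: str                  # textual prompt; unused by this toy stand-in
    silhouette: List[List[int]]  # binary sketch mask, 1 = inside the garment
    color: float                 # one scalar standing in for a color palette


@dataclass
class LocalConditions:
    """Stage II inputs (hypothetical container)."""
    logo: List[List[float]]      # small logo patch
    position: Tuple[int, int]    # (row, col) of the patch's top-left corner


def global_appearance_stage(cond: GlobalConditions) -> Latent:
    """Stand-in for stage I: fill the silhouette with the color value."""
    return [[cond.color if cell else 0.0 for cell in row]
            for row in cond.silhouette]


def local_enhancement_stage(latent: Latent, cond: LocalConditions) -> Latent:
    """Stand-in for stage II: place the logo at the requested position."""
    refined = [row[:] for row in latent]  # copy so the coarse latent is kept
    r0, c0 = cond.position
    for dr, logo_row in enumerate(cond.logo):
        for dc, value in enumerate(logo_row):
            r, c = r0 + dr, c0 + dc
            # Ignore logo cells that fall outside the latent grid.
            if 0 <= r < len(refined) and 0 <= c < len(refined[r]):
                refined[r][c] = value
    return refined


def generate(global_cond: GlobalConditions, local_cond: LocalConditions) -> Latent:
    """End-to-end inference: coarse latent first, then local refinement."""
    coarse = global_appearance_stage(global_cond)
    return local_enhancement_stage(coarse, local_cond)
```

Calling `generate` with a silhouette mask, a color value, and a one-cell logo returns a grid in which the garment region carries the color and the logo cell overrides it at the given position, mirroring the coarse-then-refine order described above.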

Example

Example comparison on seen and unseen garments

Example comparison on cross-dataset results
IMAGDressing-v1 Features

GarmentBench Dataset Demo

GarmentBench Dataset

GarmentBench is the first publicly available dataset that pairs garments with textual descriptions, sketches, color references, logos, and logo locations. It contains 189,966 garments.
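One way to picture a single sample is as a record holding the five condition types listed above. The sketch below is an assumption made for illustration: the field names, the file-path layout, and the `(x, y, width, height)` box convention are hypothetical and may not match the released dataset format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GarmentSample:
    """Hypothetical record for one garment and its design conditions."""
    caption: str                        # textual description
    sketch_path: str                    # path to the silhouette sketch
    color_path: str                     # path to the color reference
    logo_path: str                      # path to the logo image
    logo_box: Tuple[int, int, int, int]  # logo location as (x, y, width, height)


def is_valid(sample: GarmentSample, image_size: Tuple[int, int]) -> bool:
    """Check that the caption is non-empty and the logo box lies inside the image."""
    x, y, w, h = sample.logo_box
    width, height = image_size
    return (
        bool(sample.caption.strip())
        and w > 0 and h > 0
        and x >= 0 and y >= 0
        and x + w <= width and y + h <= height
    )
```

A check like `is_valid` is a cheap sanity filter to run before training, since a logo box that extends past the image border would make the spatial constraint ambiguous.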

Citation Information

@article{shen2025IMAGGarment-1,
    title={IMAGGarment-1: Fine-Grained Garment Generation for Controllable Fashion Design},
    author={Shen, Fei and Yu, Jian and Wang, Cong and Jiang, Xin and Du, Xiaoyu and Tang, Jinhui},
    journal={arXiv preprint arXiv:2504.13176},
    year={2025}
}