Consistency Margin Adapter for Few-Shot Medical Anomaly Detection

Authors

  • Liqiang Song, College of Computer Science and Technology, Qingdao University, Qingdao 266071, China
  • Yu Zhu, College of Computer Science and Technology, Qingdao University, Qingdao 266071, China

DOI:

https://doi.org/10.63313/AJET.9033

Keywords:

Medical anomaly detection, Few-shot, Vision-language models

Abstract

Vision-language models such as CLIP have become a versatile foundation for zero- and few-shot medical anomaly detection, yet most adapter designs still rely on shallow projections between visual tokens and handcrafted textual prompts. We introduce a Consistency Margin Adapter that aligns multi-layer visual representations with normal prompts via learnable directional shifts, semantic margin objectives, layer-wise consistency, and entropy regularization. The adapter produces reliable heatmaps from only k-shot support samples and integrates seamlessly with existing CLIP+CoOp pipelines. Extensive experiments on BUSI, BrainMRI, and CheXpert demonstrate superior image- and pixel-level AUROC over prior adapters, particularly in the low-shot regime.
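The abstract names four training signals: directional shifts applied to multi-layer visual features, a semantic margin between normal and abnormal prompt similarities, layer-wise consistency, and entropy regularization. The paper's exact formulation is not given here, so the following is only a minimal NumPy sketch of how such terms could plausibly be combined; all function names, variable names, and the hinge/variance/softmax-entropy choices are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two feature vectors.
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

def adapter_losses(layer_feats, shifts, normal_txt, abnormal_txt, margin=0.2):
    """Hypothetical sketch of the abstract's objectives.

    layer_feats:  list of (d,) visual features from several CLIP layers
    shifts:       learnable directional shifts, one per layer (assumption)
    normal_txt:   text embedding of the "normal" prompt
    abnormal_txt: text embedding of the "abnormal" prompt
    """
    sims_n, sims_a = [], []
    for f, s in zip(layer_feats, shifts):
        f_shift = f + s  # learnable directional shift toward the normal prompt
        sims_n.append(cosine(f_shift, normal_txt))
        sims_a.append(cosine(f_shift, abnormal_txt))
    sims_n, sims_a = np.array(sims_n), np.array(sims_a)

    # Semantic margin: normal similarity should beat abnormal by >= margin.
    l_margin = np.maximum(0.0, margin - (sims_n - sims_a)).mean()

    # Layer-wise consistency: per-layer normal scores should agree
    # (variance across layers as one simple proxy).
    l_consist = sims_n.var()

    # Entropy regularization on the (normal, abnormal) softmax per layer.
    z = np.stack([sims_n, sims_a], axis=-1)
    p = np.exp(z) / np.exp(z).sum(-1, keepdims=True)
    l_entropy = -(p * np.log(p + 1e-8)).sum(-1).mean()

    return l_margin, l_consist, l_entropy
```

When the shifted features already align with the normal prompt well beyond the margin, the margin term vanishes and only the consistency and entropy terms continue to shape the per-layer scores, which is one way a k-shot support set could be used without overfitting.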


Published

2026-01-30

Section

Articles

How to Cite

Consistency Margin Adapter for Few-Shot Medical Anomaly Detection. (2026). Academic Journal of Emerging Technologies, 2(2), 11-19. https://doi.org/10.63313/AJET.9033