ArXiv TLDR

Addressing Image Authenticity When Cameras Use Generative AI

arXiv: 2604.21879

Umar Masud, Abhijith Punnappurath, Luxi Zhao, David B. Lindell, Michael S. Brown

cs.CV, cs.AI

TLDR

This paper introduces a method to recover the unhallucinated version of camera images, addressing authenticity concerns when cameras integrate generative AI.

Key contributions

  • Addresses image authenticity issues arising from generative AI integration in camera ISPs.
  • Proposes an image-specific MLP decoder and modality-specific encoder to reverse AI hallucinations.
  • Enables recovery of the 'unhallucinated' image version post-capture, without camera ISP access.
  • Solution is compact (180 KB) and can be stored as metadata in standard image formats.
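The 180 KB figure fits naturally into JPEG's APPn metadata segments, with one caveat: each segment's 16-bit length field caps its payload at roughly 64 KB, so a blob that size must be split across several segments (the same trick ICC color profiles use). The sketch below is purely illustrative — `embed_payload`, the APP15 marker choice, and the stub JPEG are hypothetical, not the paper's actual storage format:

```python
import struct

def embed_payload(jpeg_bytes: bytes, payload: bytes,
                  marker: int = 0xEF, chunk: int = 65000) -> bytes:
    """Insert `payload` as a run of APPn segments right after the JPEG SOI marker.

    Each segment = 0xFF, marker byte, big-endian length (payload + 2), payload.
    """
    assert jpeg_bytes[:2] == b"\xff\xd8", "not a JPEG stream"
    segs = b"".join(
        struct.pack(">BBH", 0xFF, marker, len(payload[i:i + chunk]) + 2)
        + payload[i:i + chunk]
        for i in range(0, len(payload), chunk)
    )
    return jpeg_bytes[:2] + segs + jpeg_bytes[2:]

# Hypothetical 180 KB encoder+decoder blob and a stub JPEG (SOI + EOI only).
blob = bytes(180 * 1024)
stub = b"\xff\xd8\xff\xd9"
out = embed_payload(stub, blob)
```

A standards-compliant decoder simply skips unknown APPn segments, so the image stays viewable while the recovery weights travel with the file.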

Why it matters

The paper tackles an emerging problem: generative AI altering images directly inside the camera, where users may never realize the output is not what the sensor saw. It offers a practical, post-capture way to recover the original content, helping preserve trust in digital photography.

Original Abstract

The ability of generative AI (GenAI) methods to photorealistically alter camera images has raised awareness about the authenticity of images shared online. Interestingly, images captured directly by our cameras are considered authentic and faithful. However, with the increasing integration of deep-learning modules into cameras' capture-time hardware -- namely, the image signal processor (ISP) -- there is now a potential for hallucinated content in images directly output by our cameras. Hallucinated capture-time image content is typically benign, such as enhanced edges or texture, but in certain operations, such as AI-based digital zoom or low-light image enhancement, hallucinations can potentially alter the semantics and interpretation of the image content. As a result, users may not realize that the content in their camera images is not authentic. This paper addresses this issue by enabling users to recover the 'unhallucinated' version of the camera image to avoid misinterpretation of the image content. Our approach works by optimizing an image-specific multi-layer perceptron (MLP) decoder together with a modality-specific encoder so that, given the camera image, we can recover the image before hallucinated content was added. The encoder and MLP are self-contained and can be applied post-capture to the image without requiring access to the camera ISP. Moreover, the encoder and MLP decoder require only 180 KB of storage and can be readily saved as metadata within standard image formats such as JPEG and HEIC.
