Approaching Under-Explored Image-Space Problems with Optimization | TU Wien

Information

Publication Type: PhD-Thesis
Workgroup(s)/Project(s):
- Advanced Visual and Geometric Computing for 3D Capture, Display, and Fabrication
Date: December 2024
Date (Start): April 2019
Date (End): 19. December 2024
TU Wien Library: AC17414787
Open Access: yes
1st Reviewer: Amal Dev Parakkat
2nd Reviewer: Jorge Manuel de Oliveira Henrique
Rigorosum: 19. December 2024
First Supervisor: Michael Wimmer
Pages: 110
Keywords: variable-rate shading, light-fields, limited animation, anime, convolutional neural networks

Abstract

This doctoral dissertation delves into three distinct yet interconnected problems in the realm of interactive image-space computing in computer graphics, each of which has not been tackled by existing literature.

The first problem centers on the prediction of visual error metrics in real-time applications, specifically in the context of content-adaptive shading and shading reuse. Utilizing convolutional neural networks, this research aims to estimate visual errors without requiring reference or rendered images. The models developed can account for 70%–90% of the variance and achieve computation times that are an order of magnitude faster than existing methods. This enables a balance between resource-saving and visual quality, particularly in deferred shading pipelines, and can achieve up to twice the performance compared to state-of-the-art methods depending on the portion of unseen image regions.

The second problem focuses on the burgeoning field of light-field cameras and the challenges associated with depth prediction. This research argues for the refinement of cost volumes rather than depth maps to increase the accuracy of depth predictions. A set of cost-volume refinement algorithms is proposed, which dynamically operate at runtime to find optimal solutions, thereby enhancing the accuracy and reliability of depth estimation in light fields.

The third problem tackles the labor-intensive nature of hand-drawn animation, specifically in the detailing of character eyes. An unsupervised network is introduced that blends inpainting and image-to-image translation techniques. This network employs a novel style-aware clustering method and a dual-discriminator optimization strategy with a triple-reconstruction loss. The result is an improvement in the level of detail and artistic consistency in hand-drawn animation, preferred over existing work 95.16% of the time according to a user study.

Optimization techniques are the common thread that ties these problems together. While dynamic optimization at runtime is employed for cost volume refinement, deep-learning methods are used offline to train global solutions for the other two problems. This research not only fills gaps in the existing literature but also paves the way for future explorations in the field of computer graphics and optimization, offering new avenues for both academic research and practical applications.

Additional Files and Images

thesis

BibTeX

@phdthesis{cardoso-thesis,
  title =      "Approaching Under-Explored Image-Space Problems with
               Optimization",
  author =     "Joao Afonso Cardoso",
  year =       "2024",
  abstract =   "This doctoral dissertation delves into three distinct yet
               interconnected problems in the realm of interactive
               image-space computing in computer graphics, each of which
               has not been tackled by existing literature. The first
               problem centers on the prediction of visual error metrics in
               real-time applications, specifically in the context of
               content-adaptive shading and shading reuse. Utilizing
               convolutional neural networks, this research aims to
               estimate visual errors without requiring reference or
               rendered images. The models developed can account for
               70%–90% of the variance and achieve computation times that
               are an order of magnitude faster than existing methods. This
               enables a balance between resource-saving and visual
               quality, particularly in deferred shading pipelines, and can
               achieve up to twice the performance compared to
               state-of-the-art methods depending on the portion of unseen
               image regions. The second problem focuses on the burgeoning
               field of light-field cameras and the challenges associated
               with depth prediction. This research argues for the
               refinement of cost volumes rather than depth maps to
               increase the accuracy of depth predictions. A set of
               cost-volume refinement algorithms is proposed, which
               dynamically operate at runtime to find optimal solutions,
               thereby enhancing the accuracy and reliability of depth
               estimation in light fields. The third problem tackles the
               labor-intensive nature of hand-drawn animation, specifically
               in the detailing of character eyes. An unsupervised network
               is introduced that blends inpainting and image-to-image
               translation techniques. This network employs a novel
               style-aware clustering method and a dual-discriminator
               optimization strategy with a triple-reconstruction loss. The
               result is an improvement in the level of detail and artistic
               consistency in hand-drawn animation, preferred over existing
               work 95.16% of the time according to a user
               study. Optimization techniques are the common thread that
               ties these problems together. While dynamic optimization at
               runtime is employed for cost volume refinement,
               deep-learning methods are used offline to train global
               solutions for the other two problems. This research not only
               fills gaps in the existing literature but also paves the way
               for future explorations in the field of computer graphics
               and optimization, offering new avenues for both academic
               research and practical applications.",
  month =      dec,
  pages =      "110",
  address =    "Favoritenstrasse 9-11/E193-02, A-1040 Vienna, Austria",
  school =     "Research Unit of Computer Graphics, Institute of Visual
               Computing and Human-Centered Technology, Faculty of
               Informatics, TU Wien",
  keywords =   "variable-rate shading, light-fields, limited animation,
               anime, convolutional neural networks",
  URL =        "https://www.cg.tuwien.ac.at/research/publications/2024/cardoso-thesis/",
}