Project Name
Algorithms and Architectures for Light Fields Coding
Emphasis
Research
Start Date - End Date
20/01/2020 - 31/12/2024
Unit of Origin
Current Coordinator
CNPq Area
Exact and Earth Sciences
Abstract
This project aims to develop algorithms and dedicated hardware architectures for light-field coding. Novel coding algorithms will be proposed targeting the prediction and residual coding steps of the light-field codec, considering their impact in terms of complexity and memory pressure. Based on the proposed algorithms, novel hardware architectures will be designed, including processing units and memory hierarchies. From a technical perspective, developing systems able to encode or decode light fields brings a set of challenges. On the one hand, there is the huge amount of data associated with light fields and the need for efficient coding algorithms. On the other hand, it is necessary to develop high-throughput systems able to process light fields in real time while respecting processing, memory and energy constraints. Novel algorithms will be proposed targeting high coding efficiency while considering computation, memory traffic and energy consumption. Dedicated hardware architectures will be developed, including processing units and a memory hierarchy, to implement the proposed algorithms while applying low-power design techniques. The architectures will be described in VHDL, validated, synthesized for ASIC and characterized at distinct energy operation points. The project coordinator has extensive experience in this field of research and has coauthored one book (published by an international press) and dozens of scientific papers published in premier conferences and journals. The execution of this project will increase the visibility of the master's and PhD programs of the Programa de Pós-Graduação em Computação at UFPel. This project not only promotes interaction among local researchers and students but also strengthens relations with other Brazilian and international universities.

General Objective

MAIN GOAL: The main technical goal of this project is to develop algorithms and dedicated hardware architectures for light-field coding. Novel coding algorithms will be proposed targeting the prediction and residual coding steps of the light-field codec, considering their impact in terms of complexity and memory pressure. Based on the proposed algorithms, novel hardware architectures will be designed, including processing units and memory hierarchies. The specific goals associated with this main technical goal are detailed in the following.

SPECIFIC GOALS:
The first specific objective of this project is the writing of a survey article on the light-field coding solutions published in the literature, based on a broad exploration of related works. This survey will be a key reference for the planned activities.
The development of MV-HEVC-based algorithms for prediction and transform is another specific objective of this project. The main idea is to provide a fast solution for light-field coding by developing a modified version of the HTM reference software (ISO/IEC, 2016a) with support for light-field image and video coding, as an “extension” of the MV-HEVC standard, keeping the MV-HEVC coding structure and reusing many of the already implemented coding tools. Moreover, many characteristics of light-field images, considering videos captured by cameras with a microlens array, will be explored, such as intra self-similarity (inter-microlens redundancy) and the redundancy inside each microlens. The MV-HEVC tools for inter or disparity prediction can be used to exploit the inter-microlens similarity, while conventional intra prediction techniques can handle intra-microlens redundancies.
Another specific objective of this project is the development of new codec algorithms for light-field image and video coding. The idea of this effort is to develop solutions optimized for light-field coding, unconstrained by previous coding standards, providing completely new reference encoder implementations. We intend to develop new algorithms specific to light-field coding based, for example, on non-block-based coding (TAUBMAN, 2002), EPI-based (Epipolar-Plane Image) coding (BRITES, 2015), EPI-hypercube-based coding (BRITES, 2015), graph and N-D transforms (COHEN, 2016), among others, exploiting the characteristics of light-field images.
This project also intends to develop dedicated hardware architectures for the proposed algorithms. The focus is to provide energy-efficient dedicated hardware designs for real-time coding of high-definition light-field images and videos. The developed hardware designs will first be evaluated on FPGA prototyping boards for validation and for initial results on hardware resource utilization and performance. The development of a demonstration running on an FPGA board is also an objective of this project, in which the most prominent solution designed in hardware will be used to promote the main project results. The ASIC implementation of the developed designs is also planned, considering the use of standard-cell libraries. The ASIC synthesis flow will provide more precise and reliable results for on-chip area and energy consumption of the developed designs.
The publication of the achieved results in relevant international and national conferences and journals is also a specific objective of this project. We also plan at least one patent filing covering the most relevant proposed solution. Finally, a proposal to a standardization body is also an important specific objective of this project.

Justification

Despite being introduced as a research topic to the computer graphics community three decades ago (ADELSON, 1992), digital light-field imaging has only recently become a hot topic of interest to both academia and the consumer market. This has happened mainly due to the increasing availability of digital sensors, computational and storage resources, and the constant improvement of color displays, which together allow capturing, processing, recording and displaying the huge amounts of visual information comprised in light-field images and videos (LEVOY, 2006). The appeal of light-field imaging is also growing because the stereo-based solutions for 3D image and video proposed in recent years present several limitations that have prevented the massive deployment of such technology.
Light-field imaging is based on the plenoptic function, a theoretical 7D model that represents all the visual information in the world. The plenoptic function P(x,y,z,θ,Φ,t,λ) models the intensity of light seen from any viewpoint or 3D spatial position (x,y,z), at any angular viewing direction (θ,Φ), at any time (t), and for each possible wavelength (λ). As representing the seven dimensions incurs very large amounts of data, in practical scenarios the plenoptic function is generally simplified. The concept of light fields derives from the plenoptic function by introducing a set of constraints, leading to a 4D function. As only the visible electromagnetic spectrum is of interest to represent visual information, the wavelength dimension can be discarded and, instead, three functions (one for each of the R, G and B color channels) are used. Also, the camera position (x,y,z) can be replaced by an index k that represents the camera position in a one-dimensional array of cameras. These two simplifications result in a 4D monochromatic model P(k,u,v,t), where the (u,v) pair is equivalent to the ray orientation (θ,Φ) of the 7D function.
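For readability, the two simplifications above can be summarized as a short derivation (LaTeX notation, using exactly the symbols of this paragraph):
```latex
% 7D plenoptic function -> 4D light field (per color channel)
\begin{align*}
  P(x, y, z, \theta, \Phi, t, \lambda)
    &\longrightarrow P_c(x, y, z, \theta, \Phi, t), \quad c \in \{R, G, B\}
       && \text{(drop } \lambda\text{; one function per color channel)} \\
    &\longrightarrow P_c(k, u, v, t)
       && \text{(replace } (x,y,z) \text{ by } k \text{ and } (\theta,\Phi) \text{ by } (u,v)\text{)}
\end{align*}
```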
The use of light fields in digital image and video applications has great potential to be a disruptive solution in this scenario, increasing the quality of the represented images, the visual comfort for spectators, and the opportunities for application developers. Augmented reality, virtual reality and mixed reality are examples of technologies that could be strongly impacted by the development of efficient solutions for light fields. Among many possibilities, light-field image processing allows changing the focus, lighting, perspective and viewpoint of the acquired image, leading to a great user experience, especially in interactive technologies.
The process of acquiring, compressing, rendering and displaying light fields is complex and challenging (CHAN, 2004). After acquisition, light-field data can be converted to a format suitable for compression or representation. As the amount of acquired data is enormous, compression is mandatory. Finally, before displaying, the data must be decompressed and rendered. The next paragraphs briefly describe these steps of the light-field imaging flow. As the focus of this project is centered on compression solutions for light-field images and videos, special attention is given to this topic.
Acquisition: Light fields are acquired by capturing a scene with a more or less dense array of traditional cameras or, more often, with a compact sensor equipped with microlenses that sample individual rays of light arriving from different directions, the so-called in-camera light fields (IHRKE, 2016). Commercial light-field cameras available nowadays, such as the Lytro Illum camera (LYTRO, 2016), are based on in-camera light-field imaging. Basically, the real-world light field is first captured by a main lens and then acquired by a set of internal microlenses placed in front of a 2D sensor. With a dense collection of lenses, it is possible to generate the information necessary to correct views and adjust observation positions, producing perspectives of the scene where no camera has ever stood. In other words, any viewpoint outside the object convex hull can be generated.
Coding/Compression: After being acquired by a light-field camera, the light-field image is usually represented as a 2D structure of micro-images (also called sub-aperture images). The resolution of each micro-image corresponds to the number of directions in which the radiance is measured by the set of microlenses within the camera. Typical light fields (captured with Lytro Illum cameras) are composed of a matrix of 625x434 micro-images. Each micro-image is composed of 15x15 samples, each one corresponding to one angular ray captured by the corresponding microlens. By rearranging the light-field data according to the angular view, rather than by microlens, a 15x15 matrix that represents all possible views within the light field is obtained. Notice that the amount of data captured by such systems is enormous. For example, the array of microlenses in Lytro Illum cameras allows capturing 7728x5368 10-bit samples in the GRGB format, which amounts to 98.9 MB per image. Compression is thus essential, and the research community has started studies on this topic only in the last few years.
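As an illustration of the rearrangement described above, the sketch below (Python/NumPy) converts a lenslet-style image into the 15x15 matrix of sub-aperture views; the rectangular micro-image grid, the random input and the single color channel are simplifying assumptions for demonstration only.
```python
import numpy as np

# Dimensions taken from the paragraph above (Lytro Illum-like light field):
# a 625x434 grid of micro-images, each with 15x15 angular samples.
MI_ROWS, MI_COLS = 434, 625   # micro-image grid (rows x columns)
ANG_V, ANG_U = 15, 15         # angular samples per micro-image

# Hypothetical lenslet-style image (single channel, 10-bit samples),
# with micro-images tiled side by side on a rectangular grid.
lenslet = np.random.randint(0, 1024, size=(MI_ROWS * ANG_V, MI_COLS * ANG_U), dtype=np.int16)

# Split each spatial axis into (micro-image index, angular index)...
lf4d = lenslet.reshape(MI_ROWS, ANG_V, MI_COLS, ANG_U)

# ...and reorder the axes so the angular indices come first:
# views[v, u] is the (v, u) sub-aperture view, with one sample per microlens.
views = lf4d.transpose(1, 3, 0, 2)

center_view = views[ANG_V // 2, ANG_U // 2]
print(views.shape, center_view.shape)   # (15, 15, 434, 625) (434, 625)
```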
The compression of light fields is a new and not yet extensively explored problem, so that only a small number of incipient solutions have been published in the literature so far. The most straightforward solution for light-field compression would be to employ existing image/video coding standards. In fact, any current standard could be used without modifications to compress light fields. However, such standards do not take advantage of the notoriously strong redundancy that exists between the micro-images within a light-field image and between the different views of the rearranged samples. To exploit this redundancy, a few works adapt current state-of-the-art video encoders and compression techniques, aiming to increase their compression efficiency for light-field images.
As mentioned, the amount of data associated with light fields is extremely high and demands high processing power, leading to high energy consumption, especially when considering moving light fields (analogous to videos). However, evolving the coding algorithms alone is not enough to guarantee the evolution of this field and its applicability in real-world applications. It is necessary to develop dedicated hardware architectures for light-field coding able to deliver high throughput and low energy consumption. To reach such efficient architectures, one must jointly consider coding algorithms, hardware accelerators, memory hierarchy, and power control and adaptation. Therefore, this project focuses on algorithms and dedicated hardware architectures for light-field coding.

Methodology

Since this project proposes the development of algorithms and hardware for light-field compression, this will be the scope of all activities carried out during the period of its execution. To reach the main project objectives, we have defined a set of specific aims and the activities related to those aims, as presented below.

AIM 1: The development of MV-HEVC-based algorithms for prediction and transform is a specific aim of this project. The main idea is to provide a fast solution for light-field coding as an “extension” of the MV-HEVC standard, keeping the MV-HEVC coding structure and reusing many of the already implemented coding tools. Moreover, many characteristics of light-field images, considering videos captured by cameras with a microlens array, will be explored.
A1.1 - Modified Intra-Frame Prediction: Collocated blocks of images extracted from the same light field present strong correlations; therefore, strategies such as mode inheritance will be investigated.
A1.2 - Modified Motion Estimation (ME) and Disparity Estimation (DE): When considering video/moving light fields, it is necessary to efficiently exploit the temporal and inter-picture correlations (between distinct pictures extracted from the same light field). This activity aims at evaluating alternatives to current ME and DE techniques.
A1.3 - Self-Similarity Prediction: The light field may be represented as a large array of micro-images. As neighboring micro-images present similarities, we intend to propose alternative solutions using the concept of self-similarity prediction (a minimal sketch of this idea is given after this list).
A1.4 - EPI-based (Epipolar-Plane Image) Prediction: The epipolar-plane image is a powerful light-field representation composed of straight lines whose slope is inversely proportional to the depth. We intend to exploit this behavior to propose efficient prediction algorithms.
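A minimal sketch of the self-similarity idea behind A1.3, assuming a SAD matching criterion and a conservative causal search area; the block size, search window and exhaustive search are illustrative choices, not the algorithms to be developed in this project.
```python
import numpy as np

def self_similarity_search(frame, top, left, block=15, search=45):
    """Find, in the already-coded (causal) area, the block that best predicts
    the current block at (top, left). Returns the displacement and its SAD."""
    cur = frame[top:top + block, left:left + block].astype(np.int32)
    best_sad, best_vec = None, (0, 0)
    for dy in range(-search, 1):
        for dx in range(-search, search + 1):
            ry, rx = top + dy, left + dx
            if ry < 0 or rx < 0 or rx + block > frame.shape[1]:
                continue
            # Conservative causality check: candidate must lie entirely above
            # the current block row, or entirely to its left in the same row.
            if not (ry + block <= top or rx + block <= left):
                continue
            ref = frame[ry:ry + block, rx:rx + block].astype(np.int32)
            sad = int(np.abs(cur - ref).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_vec = sad, (dy, dx)
    return best_vec, best_sad

# Toy usage: a frame tiled with similar 15x15 micro-images plus noise.
tile = np.random.randint(0, 1024, size=(15, 15))
frame = np.tile(tile, (10, 10)) + np.random.randint(0, 8, size=(150, 150))
print(self_similarity_search(frame, top=60, left=60))
```
The returned displacement (dy, dx) would play the role of a "self-similarity vector", signaled much like a motion or disparity vector.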

AIM 2: Another specific objective of this project is the development of new codec algorithms for light-field image and video coding, unconstrained by previous coding standards, providing completely new reference encoder implementations.
A2.1 - EPI hypercube coding: Similarly to the previously discussed EPI-based prediction, here we intend to exploit the redundancies/correlations present in EPIs (BRITES, 2015). Moreover, EPIs may be combined in order to build 4D hypercubes with high correlation across the different dimensions. Encoding light fields based on EPI hypercubes may lead to high coding efficiency.
A2.2 - N-D transforms: Motivated by the employment of higher-dimensional data structures, such as EPI hypercubes, we intend to evaluate the use of higher-dimensionality transforms (COHEN, 2016); a minimal sketch of such a transform is given after this list.
A2.3 - Graph transforms: Given the shape of the microlenses used in current light-field cameras, the micro-images are typically hexagonal. Square transforms are therefore not ideal in this scenario, so we intend to consider the use of graph transforms in the encoder.
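As a purely illustrative example of the higher-dimensionality transforms mentioned in A2.2, the sketch below applies a separable 4D DCT-II to a synthetic, smoothly varying 4D block using SciPy; the block dimensions and the keep-5%-of-coefficients rule are assumptions for demonstration, not part of the planned codec.
```python
import numpy as np
from scipy.fft import dctn, idctn

# Synthetic 4D block (v, u, y, x): two angular and two spatial dimensions,
# built to be smooth/correlated, as real light-field data tends to be.
v, u, y, x = np.meshgrid(np.arange(15), np.arange(15), np.arange(8), np.arange(8), indexing='ij')
block = np.cos(0.1 * (y + 0.5 * u)) + 0.05 * x + 0.02 * v

# Separable 4D DCT-II along all four dimensions.
coeffs = dctn(block, type=2, norm='ortho', axes=(0, 1, 2, 3))

# Crude energy-compaction check: keep only the 5% largest-magnitude coefficients.
thr = np.quantile(np.abs(coeffs), 0.95)
sparse = np.where(np.abs(coeffs) >= thr, coeffs, 0.0)
recon = idctn(sparse, type=2, norm='ortho', axes=(0, 1, 2, 3))
print('MSE with 5% of the coefficients:', float(np.mean((block - recon) ** 2)))
```
In activities A2.1/A2.2, EPI hypercubes extracted from real light fields would take the place of the synthetic block used here.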


AIM 3: This project also aims at developing dedicated hardware architectures for the proposed algorithms. The focus is to provide energy-efficient dedicated hardware designs for real-time coding of high-definition light-field images and videos. The developed hardware designs will first be evaluated on FPGA prototyping boards for validation and for initial results on hardware resource utilization and performance. The ASIC implementation of the developed designs is also planned, considering the use of standard-cell libraries. The ASIC synthesis flow will provide more precise and reliable results for on-chip area and energy consumption of the developed designs.
A3.1 - Hardware architectures for light-field prediction: The best prediction algorithms developed in the previous activities will be designed in hardware, targeting energy-efficient implementations.
A3.2 - Hardware architecture for transforms: A hardware architecture for the most promising transform algorithm of the new codecs will be designed, targeting an energy-efficient solution (a small integer-arithmetic model of such a transform is sketched below).
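To make concrete the kind of arithmetic a transform datapath like the one in A3.2 implements, the sketch below models in Python a hardware-friendly 4-point integer transform in the style of the HEVC core transform (integer multiply-accumulate and shifts, no floating point); the shift values and block size are illustrative assumptions, not the architecture to be designed.
```python
import numpy as np

# HEVC-style 4-point integer core transform matrix (constants 64, 83, 36).
T4 = np.array([[ 64,  64,  64,  64],
               [ 83,  36, -36, -83],
               [ 64, -64, -64,  64],
               [ 36, -83,  83, -36]], dtype=np.int64)

def forward_4x4(residual, shift=9):
    # 2-D separable transform: columns then rows (Y = T X T^T), with a right
    # shift after each stage (the shift value is illustrative, not the exact
    # HEVC normalization).
    tmp = (T4 @ residual) >> shift
    return (tmp @ T4.T) >> shift

residual = np.random.randint(-255, 256, size=(4, 4), dtype=np.int64)
print(forward_4x4(residual))
```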

AIM 4: The publication of the achieved results in relevant international conferences (ICIP, ISCAS, ICME, PCS, ACM MM) and journals (TMM, TCSVT, TIP, JRTIP) is also an objective of this project. A proposal to a Standards Development Organization (SDO) is another objective.
A4.1 - Survey writing: Based on the bibliography overview and on the implementations, we plan to write a survey article on light-field coding techniques and challenges.
A4.2 - Article publications: Writing and submission of articles with the main project results to conferences and journals.
A4.3 - Contribution to SDO: Elaboration of contributions to Standards Development Organizations.

AIM 5: The technology transfer of the achieved results is the final aim of this project. The group plans patent filings covering the most important results, as well as the presentation and negotiation of these inventions with companies around the globe. The group will also encourage the PhD and master's students involved in this project to open a startup at the university incubator.
A5.1 - Patent filings: Filing of patents covering the main project results.
A5.2 - Technology transfer: Negotiation of technology transfer for the main achieved results.
A5.3 - Startup creation: Stimulate and support the PhD and master's students involved in this project in opening a startup at the university incubator.
Table 1 presents the schedule of the activities presented above and their respective bimesters (B) of execution.

Indicators, Goals and Results

The expected results at the end of this project are listed below:

• ONE survey about state-of-the-art solutions for light-field coding.
• ONE MV-HEVC-based algorithm for light-field Intra prediction.
• ONE MV-HEVC-based algorithm for light-field motion/disparity estimation (inter prediction).
• ONE dedicated hardware architecture for energy-efficient real-time intra prediction in light-field video coding.
• ONE dedicated hardware architecture for energy-efficient real-time inter prediction in light-field video coding.
• ONE new codec algorithm for light-field prediction.
• ONE new codec algorithm for light-field transform.
• ONE dedicated hardware architecture for energy-efficient real-time transform in light-field video coding.
• ONE demo running on an FPGA board.
• SIX papers published in international conferences such as:
o ACM/IEEE Design, Automation and Test in Europe (DATE)
o ACM/EDAC/IEEE Design Automation Conference (DAC)
o IEEE International Symposium on Circuits and Systems (ISCAS)
o IEEE International Conference on Image Processing (ICIP)
o IEEE/ACM Symposium on Integrated Circuits and Systems Design (SBCCI)
o IEEE International Conference on Electronics, Circuits, and Systems (ICECS)
o IEEE Latin American Symposium on Circuits and Systems (LASCAS)
• THREE articles in international journals such as:
o IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
o IEEE Transactions on Circuits and Systems (TCAS)
o IEEE Transactions on Computer-Aided Design (TCAD)
o IEEE Transactions on Very Large Scale Integration (TVLSI)
o IEEE Transactions on Consumer Electronics (TCE)
o IEEE Transactions on Multimedia (TMM)
o Elsevier Signal Processing: Image Communication
o Springer Analog Integrated Circuits and Signal Processing (ALOG)
o Springer Journal of Real-Time Image Processing (JRTIP)
• ONE patent filing.
• ONE proposal to a Standards Development Organization.

Project Team

Name | Weekly Hours | Start Date | End Date
BRUNO ZATT | 10 | |
DANIEL MUNARI VILCHEZ PALOMINO | 2 | |
DOUGLAS SILVA CORRÊA | | |
EDUARDO AMARO DA ROSA | | |
GUILHERME RIBEIRO CORRÊA | 2 | |
JONES WILLIAM GÖEBEL | | |
LUCAS DIAS DOS SANTOS | | |
LUCIANO VOLCAN AGOSTINI | 2 | |
MARCELO SCHIAVON PORTO | 2 | |
MATEUS SANTOS DE MELO | | |
MATHEUS DA ROSA MOELLER CHAVES | | |
MATHEUS DA SILVA JAHNKE | | |
ÍGOR DE SOUZA ROSLER | | |

Funding Sources

Acronym / Name | Amount | Administrator
CNPq / Conselho Nacional de Desenvolvimento Científico e Tecnológico | R$ 39.600,00 | Coordinator
CAPES / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior | R$ 30.217,00 | Coordinator

Funds Raised

Source | Amount | Administrator
Instituto Serrapilheira | R$ 100.000,00 | Coordinator

Expense Allocation Plan

Description | Amount
Scholarships | R$ 78.000,00
Other charges | R$ 5.000,00
Laboratory supplies | R$ 1.600,00
Machine and equipment maintenance supplies | R$ 5.000,00
Equipment and permanent assets (furniture, machines, books, appliances, etc.) | R$ 50.000,00
