Retina and real-world vision {#tutorial_bioinspired_retina_model}
=============================================================

Goal
----

I present here a model of the human retina that shows some interesting properties for image
preprocessing and enhancement. In this tutorial you will learn how to:

- discover the two main channels coming out of your retina
- see the basics of using the retina model
- discover some parameter tweaks

General overview
----------------

The proposed model originates from Jeanny Herault's research @cite Herault2010 at
[Gipsa](http://www.gipsa-lab.inpg.fr). It is involved in image processing applications with the
[Listic](http://www.listic.univ-savoie.fr) lab (code maintainer and user). This is not a complete
model, but it already presents interesting properties that can be used for enhanced image
processing. The model makes the following human retina properties available :

- spectral whitening, which has 3 important effects: cancellation of high spatio-temporal frequency
  signals (noise), enhancement of mid-frequency details and reduction of low-frequency luminance
  energy. This *all in one* property directly cleans visual signals of the classical
  undesired distortions introduced by image sensors and input luminance range.
- local logarithmic luminance compression, which allows details to be enhanced even in low light
  conditions.
- decorrelation of the details information (Parvocellular output channel) and transient
  information (events and motion, made available at the Magnocellular output channel).

The first two points are illustrated below :

In the figure below, the OpenEXR image sample *CrissyField.exr*, a High Dynamic Range image, is
shown. In order to make it visible on this web page, the original input image is linearly rescaled
to the classical image luminance range [0-255] and is converted to 8bit/channel format.
Such a strong
conversion hides many details because of too strong local contrasts. Furthermore, noise energy is
also strong and pollutes visual information.

![image](images/retina_TreeHdr_small.jpg)

In the following image, applying the ideas proposed in @cite Benoit2010, as your retina does, local
luminance adaptation, spatial noise removal and spectral whitening work together and transmit
accurate information on lower range 8bit data channels. In this picture, noise is largely
removed and local details hidden by strong luminance contrasts are enhanced. The output image keeps
its naturalness and visual content is enhanced. Color processing is based on the color
multiplexing/demultiplexing method proposed in @cite Chaix2007 .

![image](images/retina_TreeHdr_retina.jpg)

*Note :* the image sample can be downloaded from the [OpenEXR website](http://www.openexr.com).
Regarding this demonstration, before retina processing, the input image has been linearly rescaled
within 0-255 while keeping its float channel format. 5% of its histogram ends have been cut (this
mostly removes wrong HDR pixels). Check out the sample
*opencv/samples/cpp/OpenEXRimages_HighDynamicRange_Retina_toneMapping.cpp* for similar
processing. The following demonstration will only consider classical 8bit/channel images.

The retina model output channels
--------------------------------

The retina model presents two outputs that benefit from the behaviors cited above.

- The first one is called the Parvocellular channel. It is mainly active in the foveal retina area
  (high resolution central vision with color sensitive photo-receptors). Its aim is to provide
  accurate color vision for visual details remaining static on the retina. On the other hand,
  objects moving on the retina projection are blurred.
- The second well known channel is the Magnocellular channel.
  It is mainly active in the retina
  peripheral vision and sends signals related to change events (motion, transient events, etc.).
  These outgoing signals also help the visual system to focus/center the retina on
  'transient'/moving areas for more detailed analysis, thus improving visual scene context and
  object classification.

**NOTE :** regarding the proposed model, contrary to the real retina, we apply these two channels on
the entire input images using the same resolution. This allows enhanced visual details and motion
information to be extracted on all the considered images... but remember that these two channels
are complementary. For example, if the Magnocellular channel gives strong energy in an area, then
the Parvocellular channel is certainly blurred there since there is a transient event.

As an illustration, we apply in the following the retina model on a webcam video stream of a dark
visual scene. In this visual scene, captured in an amphitheater of the university, some students are
moving while talking to the teacher.

In this video sequence, because of the dark ambiance, the signal to noise ratio is low and color
artifacts are present on the edges of visual features because of the low quality image capture
tool-chain.

![image](images/studentsSample_input.jpg)

Below is shown the retina foveal vision applied on the entire image. In the used retina
configuration, global luminance is preserved and local contrasts are enhanced. Also, the signal to
noise ratio is improved : since high frequency spatio-temporal noise is reduced, enhanced details
are not corrupted by any enhanced noise.

![image](images/studentsSample_parvo.jpg)

Below is the Magnocellular output of the retina model. Its signals are strong where
transient events occur. Here, a student is moving at the bottom of the image, thus generating high
energy. The rest of the image is static; however, it is corrupted by strong noise.
Here, the
retina filters out most of the noise, thus generating few false motion area 'alarms'. This channel
can be used as a transient/moving areas detector : it would provide relevant information for a low
cost segmentation tool that would highlight areas in which an event is occurring.

![image](images/studentsSample_magno.jpg)

Retina use case
---------------

This model can be used basically for spatio-temporal video effects but also with the aim of :

- performing texture analysis with enhanced signal to noise ratio and enhanced details that are
  robust against input image luminance ranges (check out the Parvocellular retina channel output)
- performing motion analysis that also benefits from the previously cited properties.

Literature
----------

For more information, refer to the following papers : @cite Benoit2010

- Please have a look at the reference work of Jeanny Herault that you can read in his book @cite Herault2010

This retina filter code includes the research contributions of PhD/research colleagues whose
code has been rewritten by the author :

- take a look at the *retinacolor.hpp* module to discover Brice Chaix de Lavarene's PhD work on color
  mosaicing/demosaicing and his reference paper @cite Chaix2007

- take a look at *imagelogpolprojection.hpp* to discover retina spatial log sampling, which
  originates from Barthelemy Durette's PhD with Jeanny Herault. A Retina / V1 cortex projection is
  also proposed and originates from Jeanny's discussions. More information can be found in the
  above cited book by Jeanny Herault.

Code tutorial
-------------

Please refer to the original tutorial source code in file
*opencv_folder/samples/cpp/tutorial_code/bioinspired/retina_tutorial.cpp*.

@note do not forget that the retina model is included in the following namespace: cv::bioinspired

To compile it, assuming OpenCV is correctly installed, use the following command. It requires the
opencv_core *(cv::Mat and friends objects management)*, opencv_highgui *(display and image/video
read)* and opencv_bioinspired *(Retina description)* libraries to compile. The command also links
opencv_videoio and opencv_imgcodecs.

@code{.sh}
# compile
g++ retina_tutorial.cpp -o Retina_tuto -lopencv_core -lopencv_highgui -lopencv_bioinspired -lopencv_videoio -lopencv_imgcodecs

# Run commands : add 'log' as a last parameter to apply a spatial log sampling (simulates retina sampling)
# run on webcam
./Retina_tuto -video
# run on a video file
./Retina_tuto -video myVideo.avi
# run on an image
./Retina_tuto -image myPicture.jpg
# run on an image with log sampling
./Retina_tuto -image myPicture.jpg log
@endcode

Here is a code explanation :

The Retina definition is present in the bioinspired package and a simple include allows you to use it.
You
can instead use the specific header *opencv2/bioinspired.hpp* if you prefer, but then include the
other required opencv modules : *opencv2/core.hpp* and *opencv2/highgui.hpp*
@code{.cpp}
#include "opencv2/opencv.hpp"
@endcode
Provide the user with some hints to run the program with a help function
@code{.cpp}
// the help procedure
static void help(std::string errorMessage)
{
    std::cout<<"Program init error : "<<errorMessage<<std::endl;
    std::cout<<"\nProgram call procedure : retinaDemo [processing mode] [Optional : media target] [Optional LAST parameter: \"log\" to activate retina log sampling]"<<std::endl;
    std::cout<<"\t[processing mode] :"<<std::endl;
    std::cout<<"\t -image : for still image processing"<<std::endl;
    std::cout<<"\t -video : for video stream processing"<<std::endl;
    std::cout<<"\t[Optional : media target] :"<<std::endl;
    std::cout<<"\t if processing an image or video file, then, specify the path and filename of the target to process"<<std::endl;
    std::cout<<"\t leave empty if processing video stream coming from a connected video device"<<std::endl;
    std::cout<<"\t[Optional : activate retina log sampling] : an optional last parameter can be specified for retina spatial log sampling"<<std::endl;
    std::cout<<"\t set \"log\" without quotes to activate this sampling, output frame size will be divided by 4"<<std::endl;
    std::cout<<"\nExamples:"<<std::endl;
    std::cout<<"\t-Image processing : ./retinaDemo -image lena.jpg"<<std::endl;
    std::cout<<"\t-Image processing with log sampling : ./retinaDemo -image lena.jpg log"<<std::endl;
    std::cout<<"\t-Video processing : ./retinaDemo -video myMovie.mp4"<<std::endl;
    std::cout<<"\t-Live video processing : ./retinaDemo -video"<<std::endl;
    std::cout<<"\nPlease start again with new parameters"<<std::endl;
    std::cout<<"****************************************************"<<std::endl;
    std::cout<<" NOTE : this program generates the default retina parameters file 'RetinaDefaultParameters.xml'"<<std::endl;
    std::cout<<" => you can use this to fine tune parameters and load them if you save to file 'RetinaSpecificParameters.xml'"<<std::endl;
}
@endcode
Then, start the main program and first declare a *cv::Mat* matrix in which input images will be
loaded. Also allocate a *cv::VideoCapture* object ready to load video streams (if necessary)
@code{.cpp}
int main(int argc, char* argv[]) {
    // declare the retina input buffer... that will be fed differently depending on the input media
    cv::Mat inputFrame;
    cv::VideoCapture videoCapture; // in case a video media is used, its manager is declared here
@endcode
In the main program, before processing, first check the input command parameters. Here it loads a
first input image coming from a single loaded image (if the user chose command *-image*) or from a
video stream (if the user chose command *-video*). Also, if the user added the *log* command at the
end of the program call, the spatial logarithmic image sampling performed by the retina is taken
into account by the Boolean flag *useLogSampling*.
@code{.cpp}
    // welcome message
    std::cout<<"****************************************************"<<std::endl;
    std::cout<<"* Retina demonstration : demonstrates the use of a wrapper class of the Gipsa/Listic Labs retina model."<<std::endl;
    std::cout<<"* This demo will try to load the file 'RetinaSpecificParameters.xml' (if exists).\nTo create it, copy the autogenerated template 'RetinaDefaultParameters.xml'.\nThen tweak it with your own retina parameters."<<std::endl;
    // basic input arguments checking
    if (argc<2)
    {
        help("bad number of parameter");
        return -1;
    }

    bool useLogSampling = !strcmp(argv[argc-1], "log"); // check if user wants retina log sampling processing

    std::string inputMediaType=argv[1];

    //////////////////////////////////////////////////////////////////////////////
    // checking input media type (still image, video file, live video acquisition)
    if (!strcmp(inputMediaType.c_str(), "-image") && argc >= 3)
    {
        std::cout<<"RetinaDemo: processing image "<<argv[2]<<std::endl;
        // image processing case
        inputFrame = cv::imread(std::string(argv[2]), 1); // load as a 3-channel color image (OpenCV stores it in BGR order)
    }else
        if (!strcmp(inputMediaType.c_str(), "-video"))
        {
            if (argc == 2 || (argc == 3 && useLogSampling)) // attempt to grab images from a video capture device
            {
                videoCapture.open(0);
            }else// attempt to grab images from a video filestream
            {
                std::cout<<"RetinaDemo: processing video stream "<<argv[2]<<std::endl;
                videoCapture.open(argv[2]);
            }

            // grab a first frame to check if everything is ok
            videoCapture>>inputFrame;
        }else
        {
            // bad command parameter
            help("bad command parameter");
            return -1;
        }
@endcode
Once all input parameters are processed, a first image should have been loaded. If not, display
an error and stop the program :
@code{.cpp}
if (inputFrame.empty())
{
    help("Input media could not be loaded, aborting");
    return -1;
}
@endcode
Now, everything is ready to run the retina model. I propose here to allocate a retina instance and
to manage the optional log sampling. The Retina constructor expects at least a cv::Size
object that gives the size of the input data to be managed. One can activate other options
such as color and its related color multiplexing strategy (here Bayer multiplexing is chosen using
*enum cv::bioinspired::RETINA_COLOR_BAYER*). If using log sampling, the image reduction factor
(smaller output images) and log sampling strength can be adjusted.
@code{.cpp}
// pointer to a retina object
cv::Ptr<cv::bioinspired::Retina> myRetina;

// if the last parameter is 'log', then activate log sampling (favour foveal vision and subsamples peripheral vision)
if (useLogSampling)
{
    myRetina = cv::bioinspired::createRetina(inputFrame.size(), true, cv::bioinspired::RETINA_COLOR_BAYER, true, 2.0, 10.0);
}
else// -> else allocate "classical" retina :
    myRetina = cv::bioinspired::createRetina(inputFrame.size());
@endcode
Once done, the proposed code writes a default xml file that contains the default parameters of the
retina. This is useful to make your own config using this template. Here the generated template xml
file is called *RetinaDefaultParameters.xml*.
@code{.cpp}
// save default retina parameters file in order to let you see this and maybe modify it and reload using method "setup"
myRetina->write("RetinaDefaultParameters.xml");
@endcode
In the following line, the retina attempts to load another xml file called
*RetinaSpecificParameters.xml*. If you created it and introduced your own setup, it will be loaded;
otherwise, default retina parameters are used.
@code{.cpp}
// load parameters if file exists
myRetina->setup("RetinaSpecificParameters.xml");
@endcode
It is not required here but, just to show it is possible, you can reset the retina buffers to zero
to force it to forget past events.
@code{.cpp}
// reset all retina buffers (imagine you close your eyes for a long time)
myRetina->clearBuffers();
@endcode
Now, it is time to run the retina ! First create some output buffers ready to receive the two retina
channel outputs
@code{.cpp}
// declare retina output buffers
cv::Mat retinaOutput_parvo;
cv::Mat retinaOutput_magno;
@endcode
Then, run the retina in a loop, load new frames from the video sequence if necessary and get the
retina outputs back into the dedicated buffers.
@code{.cpp}
// processing loop with no stop condition
while(true)
{
    // if using video stream, then, grabbing a new frame, else, input remains the same
    if (videoCapture.isOpened())
        videoCapture>>inputFrame;

    // run retina filter on the loaded input frame
    myRetina->run(inputFrame);
    // Retrieve and display retina output
    myRetina->getParvo(retinaOutput_parvo);
    myRetina->getMagno(retinaOutput_magno);
    cv::imshow("retina input", inputFrame);
    cv::imshow("Retina Parvo", retinaOutput_parvo);
    cv::imshow("Retina Magno", retinaOutput_magno);
    cv::waitKey(10);
}
@endcode
That's done ! But if you want to secure the system, take care to manage exceptions. The retina can
throw some when it sees irrelevant data (no input frame, wrong setup, etc.).
So, I recommend
surrounding all the retina code with a try/catch block like this :
@code{.cpp}
try{
    // pointer to a retina object
    cv::Ptr<cv::bioinspired::Retina> myRetina;
    [---]
    // processing loop with no stop condition
    while(true)
    {
        [---]
    }

}catch(const cv::Exception& e)
{
    std::cerr<<"Error using Retina : "<<e.what()<<std::endl;
}
@endcode

Retina parameters, what to do ?
-------------------------------

First, it is recommended to read the reference paper @cite Benoit2010

Once done, open the configuration file *RetinaDefaultParameters.xml* generated by the demo and let's
have a look at it.
@code{.xml}
<?xml version="1.0"?>
<opencv_storage>
<OPLandIPLparvo>
    <colorMode>1</colorMode>
    <normaliseOutput>1</normaliseOutput>
    <photoreceptorsLocalAdaptationSensitivity>7.5e-01</photoreceptorsLocalAdaptationSensitivity>
    <photoreceptorsTemporalConstant>9.0e-01</photoreceptorsTemporalConstant>
    <photoreceptorsSpatialConstant>5.7e-01</photoreceptorsSpatialConstant>
    <horizontalCellsGain>0.01</horizontalCellsGain>
    <hcellsTemporalConstant>0.5</hcellsTemporalConstant>
    <hcellsSpatialConstant>7.</hcellsSpatialConstant>
    <ganglionCellsSensitivity>7.5e-01</ganglionCellsSensitivity></OPLandIPLparvo>
<IPLmagno>
    <normaliseOutput>1</normaliseOutput>
    <parasolCells_beta>0.</parasolCells_beta>
    <parasolCells_tau>0.</parasolCells_tau>
    <parasolCells_k>7.</parasolCells_k>
    <amacrinCellsTemporalCutFrequency>2.0e+00</amacrinCellsTemporalCutFrequency>
    <V0CompressionParameter>9.5e-01</V0CompressionParameter>
    <localAdaptintegration_tau>0.</localAdaptintegration_tau>
    <localAdaptintegration_k>7.</localAdaptintegration_k></IPLmagno>
</opencv_storage>
@endcode
Here are some hints, but actually the best parameter setup depends more on what you want to do with
the retina than on the input images that you give to the retina.
Apart from the more specific case
of High Dynamic Range (HDR) images, which require a dedicated setup for a specific luminance
compression objective, the retina behaviors should be rather stable from content to content. Note
that OpenCV is able to manage such HDR formats thanks to its OpenEXR image compatibility.

Then, if the application target requires details enhancement prior to specific image processing, you
need to know if mean luminance information is required or not. If not, the retina can cancel or
significantly reduce its energy, thus giving more visibility to higher spatial frequency details.


#### Basic parameters

The simplest parameters are as follows :

- **colorMode** : lets the retina process color information (if 1) or gray scale images (if 0). In
  the latter case, only the first channel of the input will be processed.
- **normaliseOutput** : each channel has such a parameter: if the value is set to 1, then the
  considered channel's output is rescaled between 0 and 255. In this case, be aware of the
  Magnocellular output level (motion/transient channel detection). Residual noise will also be
  rescaled !

**Note :** using color requires color channels multiplexing/demultiplexing, which also demands more
processing. You can expect much faster processing using gray levels : it would require around 30
products per pixel for all of the retina processes and it has recently been parallelized for
multicore architectures.

#### Photo-receptors parameters

The following parameters act on the entry point of the retina - photo-receptors - and have an impact
on all of the following processes. These sensors are low pass spatio-temporal filters that smooth
temporal and spatial data and also adjust their sensitivity to local luminance, thus improving
details extraction and high frequency noise canceling.

- **photoreceptorsLocalAdaptationSensitivity** between 0 and 1.
  Values close to 1 give a strong
  luminance log compression effect at the photo-receptors level. Values closer to 0 provide a more
  linear sensitivity. Increased alone, it can burn the *Parvo (details channel)* output image. If
  adjusted together with **ganglionCellsSensitivity**, images can be very contrasted
  whatever the local luminance... at the cost of a decrease in naturalness.
- **photoreceptorsTemporalConstant** sets up the temporal constant of the low pass filter
  effect at the entry of the retina. A high value leads to a strong temporal smoothing effect :
  moving objects are blurred and can disappear while static objects are favored. But when starting
  the retina processing, the stable state is reached later.
- **photoreceptorsSpatialConstant** specifies the spatial constant related to the photo-receptors'
  low pass filter effect. This parameter specifies the minimum value of the spatial signal period
  allowed in what follows. Typically, this filter should cut high frequency noise. A 0 value
  cuts none of the noise, while higher values start to cut high spatial frequencies, and
  progressively lower frequencies... Be careful not to go to high levels if you want to see some
  details of the input images ! A good compromise for color images is a 0.53 value since such a
  choice won't affect the color spectrum too much. Higher values would lead to gray and blurred
  output images.

#### Horizontal cells parameters

This parameter set tunes the neural network connected to the photo-receptors, the horizontal cells.
It modulates photo-receptors sensitivity and completes the processing for final spectral whitening
(part of the spatial band pass effect, thus favoring visual details enhancement).

- **horizontalCellsGain** here is a critical parameter ! If you are not interested in the mean
  luminance and just want to focus on details enhancement, then set this parameter to zero.
  However, if
  you want to keep some environment luminance data, let some low spatial frequencies pass into the
  system and set a higher value (\<1).
- **hcellsTemporalConstant** similar to photo-receptors, this parameter acts on the temporal
  constant of a low pass temporal filter that smooths input data. Here, a high value generates a
  strong retina after effect while a lower value makes the retina more reactive. This value should
  be lower than **photoreceptorsTemporalConstant** to limit strong retina after effects.
- **hcellsSpatialConstant** is the spatial constant of these cells' low pass filter.
  It specifies the lowest spatial frequency allowed in what follows. Visually, a high value leads
  to very low spatial frequencies processing and leads to salient halo effects. Lower values
  reduce this effect but should not go lower than the value of
  **photoreceptorsSpatialConstant**. Those 2 parameters actually specify the spatial band-pass of
  the retina.

**NOTE** Once the processing managed by the previous parameters is done, input data is cleaned of
noise and luminance is already partly enhanced. The following parameters act on the last processing
stages of the two outgoing retina signals.

#### Parvo (details channel) dedicated parameter

- **ganglionCellsSensitivity** specifies the strength of the final local adaptation occurring at
  the output of this details-dedicated channel. Parameter values remain between 0 and 1. Low values
  tend to give a linear response while higher values enhance the remaining low contrast areas.

**Note :** this parameter can correct possibly burned images by favoring low energy details of
the visual scene, even in bright areas.

#### IPL Magno (motion/transient channel) parameters

Once the image information is cleaned, this channel acts as a high pass temporal filter that
selects only the signals related to transient signals (events, motion, etc.). A low pass spatial
filter smooths the extracted transient data while a final logarithmic compression enhances low
transient events, thus enhancing event sensitivity.

- **parasolCells_beta** can be considered as an amplifier gain at the
  entry point of this processing stage. Generally set to 0.
- **parasolCells_tau** the temporal smoothing effect that can be added
- **parasolCells_k** the spatial constant of the spatial filtering effect. Set it to a high value
  to favor low spatial frequency signals, which are less subject to residual noise.
- **amacrinCellsTemporalCutFrequency** specifies the temporal constant of the high pass filter.
  High values allow slow transient events to be selected.
- **V0CompressionParameter** specifies the strength of the log compression. Its behavior is similar
  to the previous descriptions, but here it enhances the sensitivity to transient events.
- **localAdaptintegration_tau** generally set to 0, has no real use here for now.
- **localAdaptintegration_k** specifies the size of the area on which local adaptation is
  performed. Low values lead to short range local adaptation (higher sensitivity to noise), high
  values secure log compression.
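
The temporal high pass behavior of this channel can be pictured with a small per-pixel sketch. It
is only an illustration under simplifying assumptions, not the Magnocellular implementation of the
module: the class name `ToyTransientDetector` and its `smoothing` parameter are hypothetical, and
the spatial smoothing and logarithmic compression stages are left out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Toy per-pixel temporal high-pass filter (illustration only, not the OpenCV Magno channel).
class ToyTransientDetector
{
public:
    // smoothing in [0, 1): closer to 1 means a slower memory, so slower changes are still reported.
    explicit ToyTransientDetector(double smoothing) : smoothing_(smoothing) {}

    // Returns |frame - memory| per pixel, then updates the memory with a first-order low-pass.
    std::vector<double> process(const std::vector<double>& frame)
    {
        if (memory_.size() != frame.size())
            memory_ = frame; // first frame (or size change): start from a stable state
        std::vector<double> transient(frame.size(), 0.0);
        for (std::size_t i = 0; i < frame.size(); ++i)
        {
            transient[i] = std::abs(frame[i] - memory_[i]);
            memory_[i] = smoothing_ * memory_[i] + (1.0 - smoothing_) * frame[i];
        }
        return transient;
    }

private:
    double smoothing_;
    std::vector<double> memory_;
};
```

Feeding the same frame repeatedly yields a zero response, while a pixel that changes produces a
response that fades as the memory catches up. This is the qualitative behavior that makes such a
channel usable as a transient area detector.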