Mastering OpenCV with Practical Computer Vision Projects

  • Allows anyone with basic OpenCV experience to rapidly obtain skills in many computer vision topics, for research or commercial use
  • Each chapter is a separate project covering a computer vision problem, written by a professional with proven experience on that topic
  • All projects include a step-by-step tutorial and full source-code, using the C++ interface of OpenCV

Book Details

Language : English
Paperback : 340 pages [ 235mm x 191mm ]
Release Date : December 2012
ISBN : 1849517827
ISBN 13 : 9781849517829
Author(s) : Daniel Lélis Baggio, Shervin Emami, David Millán Escrivá, Khvedchenia Ievgen, Naureen Mahmood, Jason Saragih, Roy Shilkrot
Topics and Technologies : All Books, Application Development, Open Source, Web Graphics & Video

Table of Contents

Chapter 1: Cartoonifier and Skin Changer for Android
Chapter 2: Marker-based Augmented Reality on iPhone or iPad
Chapter 3: Marker-less Augmented Reality
Chapter 4: Exploring Structure from Motion Using OpenCV
Chapter 5: Number Plate Recognition Using SVM and Neural Networks
Chapter 6: Non-rigid Face Tracking
Chapter 7: 3D Head Pose Estimation Using AAM and POSIT
Chapter 8: Face Recognition using Eigenfaces or Fisherfaces
  • Chapter 1: Cartoonifier and Skin Changer for Android
    • Accessing the webcam
    • Main camera processing loop for a desktop app
    • Generating a black-and-white sketch
    • Generating a color painting and a cartoon
    • Generating an "evil" mode using edge filters
    • Generating an "alien" mode using skin detection
      • Skin-detection algorithm
      • Showing the user where to put their face
      • Implementation of the skin-color changer
    • Porting from desktop to Android
      • Setting up an Android project that uses OpenCV
        • Color formats used for image processing on Android
        • Input color format from the camera
        • Output color format for display
      • Adding the cartoonifier code to the Android NDK app
        • Reviewing the Android app
        • Cartoonifying the image when the user taps the screen
        • Saving the image to a file and to the Android picture gallery
    • Showing an Android notification message about a saved image
      • Changing cartoon modes through the Android menu bar
    • Reducing the random pepper noise from the sketch image
    • Showing the FPS of the app
    • Using a different camera resolution
    • Customizing the app
    • Summary
  • Chapter 2: Marker-based Augmented Reality on iPhone or iPad
    • Creating an iOS project that uses OpenCV
      • Adding OpenCV framework
      • Including OpenCV headers
    • Application architecture
    • Marker detection
      • Marker identification
        • Grayscale conversion
        • Image binarization
        • Contours detection
        • Candidates search
      • Marker code recognition
        • Reading marker code
        • Marker location refinement
    • Placing a marker in 3D
      • Camera calibration
      • Marker pose estimation
    • Rendering the 3D virtual object
      • Creating the OpenGL rendering layer
      • Rendering an AR scene
    • Summary
    • References
  • Chapter 3: Marker-less Augmented Reality
    • Marker-based versus marker-less AR
    • Using feature descriptors to find an arbitrary image on video
      • Feature extraction
      • Definition of a pattern object
      • Matching of feature points
        • PatternDetector.cpp
      • Outlier removal
        • Cross-match filter
        • Ratio test
        • Homography estimation
        • Homography refinement
      • Putting it all together
    • Pattern pose estimation
      • PatternDetector.cpp
      • Obtaining the camera-intrinsic matrix
        • Pattern.cpp
    • Application infrastructure
      • ARPipeline.hpp
      • ARPipeline.cpp
      • Enabling support for 3D visualization in OpenCV
      • Creating OpenGL windows using OpenCV
      • Video capture using OpenCV
      • Rendering augmented reality
        • ARDrawingContext.hpp
        • ARDrawingContext.cpp
      • Demonstration
        • main.cpp
    • Summary
    • References
  • Chapter 4: Exploring Structure from Motion Using OpenCV
    • Structure from Motion concepts
    • Estimating the camera motion from a pair of images
      • Point matching using rich feature descriptors
      • Point matching using optical flow
      • Finding camera matrices
    • Reconstructing the scene
    • Reconstruction from many views
    • Refinement of the reconstruction
    • Visualizing 3D point clouds with PCL
    • Using the example code
    • Summary
    • References
  • Chapter 6: Non-rigid Face Tracking
    • Overview
    • Utilities
      • Object-oriented design
      • Data collection: Image and video annotation
        • Training data types
        • Annotation tool
        • Pre-annotated data (The MUCT dataset)
    • Geometrical constraints
      • Procrustes analysis
      • Linear shape models
      • A combined local-global representation
      • Training and visualization
    • Facial feature detectors
      • Correlation-based patch models
        • Learning discriminative patch models
        • Generative versus discriminative patch models
      • Accounting for global geometric transformations
      • Training and visualization
    • Face detection and initialization
    • Face tracking
      • Face tracker implementation
      • Training and visualization
      • Generic versus person-specific models
    • Summary
    • References
  • Chapter 7: 3D Head Pose Estimation Using AAM and POSIT
    • Active Appearance Models overview
    • Active Shape Models
      • Getting the feel of PCA
      • Triangulation
      • Triangle texture warping
    • Model Instantiation – playing with the Active Appearance Model
    • AAM search and fitting
    • POSIT
      • Diving into POSIT
      • POSIT and head model
      • Tracking from webcam or video file
    • Summary
    • References
  • Chapter 8: Face Recognition using Eigenfaces or Fisherfaces
    • Introduction to face recognition and face detection
      • Step 1: Face detection
        • Implementing face detection using OpenCV
        • Loading a Haar or LBP detector for object or face detection
        • Accessing the webcam
        • Detecting an object using the Haar or LBP Classifier
      • Detecting the face
      • Step 2: Face preprocessing
        • Eye detection
        • Eye search regions
      • Step 3: Collecting faces and learning from them
        • Collecting preprocessed faces for training
        • Training the face recognition system from collected faces
        • Viewing the learned knowledge
        • Average face
        • Eigenvalues, Eigenfaces, and Fisherfaces
      • Step 4: Face recognition
        • Face identification: Recognizing people from their face
        • Face verification: Validating that it is the claimed person
      • Finishing touches: Saving and loading files
      • Finishing touches: Making a nice and interactive GUI
        • Drawing the GUI elements
        • Checking and handling mouse clicks
    • Summary
    • References

Daniel Lélis Baggio

Daniel Lélis Baggio started his work in computer vision through medical image processing at InCor (Instituto do Coração – Heart Institute) in São Paulo, where he worked with intra-vascular ultrasound image segmentation. Since then, he has focused on GPGPU and ported the segmentation algorithm to work with NVIDIA's CUDA. He has also dived into six-degrees-of-freedom head tracking with a natural user interface group through a project called ehci. He now works for the Brazilian Air Force.

Shervin Emami

Shervin Emami (born in Iran) taught himself electronics and hobby robotics during his early teens in Australia. While building his first robot at the age of 15, he learned how RAM and CPUs work. He was so amazed by the concept that he soon designed and built a whole Z80 motherboard to control his robot, and wrote all the software purely in binary machine code using two push buttons for 0s and 1s. After learning that computers can be programmed in much easier ways such as assembly language and even high-level compilers, Shervin became hooked on computer programming and has been programming desktops, robots, and smartphones nearly every day since then. During his late teens he created Draw3D, a 3D modeler with 30,000 lines of optimized C and assembly code that rendered 3D graphics faster than all the commercial alternatives of the time; but he lost interest in graphics programming when 3D hardware acceleration became available.

At university, Shervin took a subject on computer vision and became highly interested in it, so for his first thesis in 2003 he created a real-time face detection program based on Eigenfaces, using OpenCV (beta 3) for camera input. For his master's thesis in 2005 he created a visual navigation system for several mobile robots using OpenCV (v0.96). From 2008, he worked as a freelance Computer Vision Developer in Abu Dhabi and the Philippines, using OpenCV for a large number of short-term commercial projects that included:

  • Detecting faces using Haar or Eigenfaces
  • Recognizing faces using Neural Networks, EHMM, or Eigenfaces
  • Detecting the 3D position and orientation of a face from a single photo using AAM and POSIT
  • Rotating a face in 3D using only a single photo
  • Face preprocessing and artificial lighting using any 3D direction from a single photo
  • Gender recognition
  • Facial expression recognition
  • Skin detection
  • Iris detection
  • Pupil detection
  • Eye-gaze tracking
  • Visual-saliency tracking
  • Histogram matching
  • Body-size detection
  • Shirt and bikini detection
  • Money recognition
  • Video stabilization
  • Face recognition on iPhone
  • Food recognition on iPhone
  • Marker-based augmented reality on iPhone (the second-fastest iPhone augmented reality app at the time)

OpenCV was putting food on the table for Shervin's family, so he began giving back to OpenCV through regular advice on the forums and by posting free OpenCV tutorials on his website. In 2011, he contacted the owners of other free OpenCV websites to write this book. He also began working on computer vision optimization for mobile devices at NVIDIA, working closely with the official OpenCV developers to produce an optimized version of OpenCV for Android. In 2012, he also joined the Khronos OpenVL committee for standardizing the hardware acceleration of computer vision for mobile devices, on which OpenCV will be based in the future.

David Millán Escrivá

David Millán Escrivá was eight years old when he wrote his first program, in BASIC on an 8086 PC, which enabled the 2D plotting of basic equations. In 2005, he finished his studies in IT at the Universitat Politécnica de Valencia with honors in human-computer interaction supported by computer vision with OpenCV (v0.96). His final project was based on this subject, and he published it at the Spanish HCI congress. He participated in Blender, an open source 3D-software project, and worked on his first commercial movie, Plumiferos - Aventuras voladoras, as a Computer Graphics Software Developer. David now has more than 10 years of experience in IT, covering computer vision, computer graphics, and pattern recognition, and has worked on different projects and startups applying his knowledge of computer vision, optical character recognition, and augmented reality. He is the author of the "DamilesBlog", where he publishes research articles and tutorials about OpenCV, computer vision in general, and Optical Character Recognition algorithms. David has reviewed the book gnuPlot Cookbook by Lee Phillips, published by Packt Publishing.

Khvedchenia Ievgen

Khvedchenia Ievgen is a computer vision expert from Ukraine. He started his career with research and development of a camera-based driver assistance system for Harman International. He then began working as a Computer Vision Consultant for ESG. He is now a self-employed developer focusing on augmented reality applications. Ievgen is the author of the Computer Vision Talks blog, where he publishes research articles and tutorials pertaining to computer vision and augmented reality.

Naureen Mahmood

Naureen Mahmood is a recent graduate from the Visualization department at Texas A&M University. She has experience working in various programming environments, animation software, and microcontroller electronics. Her work involves creating interactive applications using sensor-based electronics and software engineering. She has also worked on creating physics-based simulations and their use in special effects for animation.

Jason Saragih

Jason Saragih received his B.Eng degree in mechatronics (with honors) and Ph.D. in computer science from the Australian National University, Canberra, Australia, in 2004 and 2008, respectively. From 2008 to 2010 he was a Postdoctoral fellow at the Robotics Institute of Carnegie Mellon University, Pittsburgh, PA. From 2010 to 2012 he worked at the Commonwealth Scientific and Industrial Research Organization (CSIRO) as a Research Scientist. He is currently a Senior Research Scientist at Visual Features, an Australian tech startup company. Dr. Saragih has made a number of contributions to the field of computer vision, specifically on the topic of deformable model registration and modeling. He is the author of two non-profit open source libraries that are widely used in the scientific community: DeMoLib and FaceTracker, both of which make use of generic computer vision libraries including OpenCV.

Roy Shilkrot

Roy Shilkrot is a researcher and professional in the area of computer vision and computer graphics. He obtained a B.Sc. in Computer Science from Tel-Aviv-Yaffo Academic College, and an M.Sc. from Tel-Aviv University. He is currently a PhD candidate in the Media Laboratory of the Massachusetts Institute of Technology (MIT) in Cambridge. Roy has over seven years of experience as a Software Engineer in start-up companies and enterprises. Before joining the MIT Media Lab as a Research Assistant he worked as a Technology Strategist in the Innovation Laboratory of Comverse, a telecom solutions provider. He also dabbled in consultancy, and worked as an intern for Microsoft Research in Redmond.

What you will learn from this book

  • Perform face analysis, including simple face, eye, and skin detection, Fisherfaces face recognition, 3D head orientation, and complex facial feature tracking
  • Do Number Plate Detection and Optical Character Recognition (OCR) using Artificial Intelligence (AI) methods including SVMs and Neural Networks
  • Learn Augmented Reality for desktop and iPhone or iPad using simple artificial markers or complex markerless natural images
  • Generate a 3D object model by moving a plain 2D camera, using 3D Structure from Motion (SfM) camera reprojection methods
  • Redesign desktop real-time computer vision applications as Android and iOS mobile apps
  • Use simple image filter effects including cartoon, sketch, paint, and alien effects
  • Execute Human-Computer Interaction with an Xbox Kinect sensor using the whole body as a dynamic input

In Detail

Computer Vision is fast becoming an important technology and is used in everything from Mars robots, national security systems, automated factories, driver-less cars, and medical image analysis to new forms of human-computer interaction. OpenCV is the most common library for computer vision, providing hundreds of complex and fast algorithms. But it has a steep learning curve and limited in-depth tutorials.

Mastering OpenCV with Practical Computer Vision Projects is the perfect book for developers with just basic OpenCV skills who want to try practical computer vision projects, as well as for seasoned OpenCV experts who want to add more Computer Vision topics to their skill set or gain more experience with OpenCV's new C++ interface before migrating from the C API to the C++ API.

Each chapter is a separate project including the necessary background knowledge, so try them all one-by-one or jump straight to the projects you're most interested in.

Create working prototypes from this book, including real-time mobile apps, Augmented Reality, 3D shape from video, face and eye tracking, a fluid wall using Kinect, number plate recognition, and so on.

Mastering OpenCV with Practical Computer Vision Projects gives you rapid training in nine computer vision areas with useful projects.

Each chapter in the book is an individual project, and each project comes with step-by-step instructions, clearly explained code, and the necessary screenshots.

Who this book is for

You should have basic OpenCV and C/C++ programming experience before reading this book, as it is aimed at Computer Science graduates, researchers, and computer vision experts widening their expertise.
