2017-05-20

Facial Recognition using F# and EmguCV

Read Time: 22 minutes

As the future rushes upon us, there is the growing desire to have a more integrated interaction with our computers. One way is to have our computers recognize us; enter facial recognition. Often this can be done with complex tools, but it is encouraging to be able to do it with something as simple as F# and EmguCV. With these tools in hand, facial recognition can be built into personal projects with ease.

For those not familiar, EmguCV is a one of the available OpenCV .NET wrapper packages. OpenCV has facial recognition built in, so this post will mostly devolve into the details of wiring it up using F#. But once complete, this is a good integration point for additional functionality. Time to get started.

Using Paket, here is a sample paket.dependencies file.

1
2
3

source https://nuget.org/api/v2

nuget EmguCV

Note: This project requires an additional step. I prefer Paket for package management, but that has it’s own set of implications. For this project there is a manual step after Paket has downloaded the packages. To get all things to work, the native dlls must be copied into the same directory as the EmguCV dlls. For me, the command looked like:

cp packages\EmguCV\build\native\x64\* packages\EmguCV\lib\net30

If you don’t do this, you’re likely to get an error that looks like this:

System.TypeInitializationException: The type initializer for 'Emgu.CV.CvInvoke' threw an exception. ---> System.DllNotFoundException: cvextern ...

Once package installation is complete, it is time to link up the library references.

System.IO.Directory.SetCurrentDirectory(__SOURCE_DIRECTORY__)

#I "../packages"
#r "../packages/EmguCV/lib/net30/Emgu.CV.UI.dll"
#r "../packages/EmguCV/lib/net30/Emgu.CV.World.dll"

open System
open System.Drawing
open System.Drawing.Imaging
open System.IO
open Emgu
open Emgu.CV
open Emgu.CV.CvEnum
open Emgu.CV.Structure
open Emgu.CV.UI
open Emgu.Util

Aiming toward my goal of higher interactivity, I put together a little starter app that uses the webcam to see who is sitting at the computer. If it doesn’t recognize the person, it prompts to add them to its database so they can be recognized in the future. Once it knows who they are, it just says hi. There are also some exploratory commands. I know, it isn’t much, but its a nice start to a larger, long-term project.

Building from the ground up, the first part is interacting with the camera. A simple capture.QueryFrame().Bitmap pulls a bitmap. I use a couple wrapper functions to support returning an image appropriate (Image<Gray,Byte>) for the FacialRecognition calls as well as the ability to save a single and series of images from the camera.

/// Camera/photo code
module Camera =
    let rand = new Random()

    /// Capture image from webcam as a bitmap 
    let captureImageBitmap () =
        let (capture:Capture) = new Capture() 
        let (imageBitmap:Bitmap) = capture.QueryFrame().Bitmap 
        imageBitmap 

    /// Capture image from webcam and return a FacialRecognition image
    let captureImage () = 
        let (image:Image<Gray,Byte>) = new Emgu.CV.Image<Gray,Byte>(captureImageBitmap())
        image

    /// Capture image from webcam and save as jpg
    let captureAndSaveImage (filename:string) = 
        let imageBitmap = captureImageBitmap()
        imageBitmap.Save(filename, ImageFormat.Jpeg)

    /// Take photos of person and return list of result photo files
    let takePhotos count (delayMs:int) dir person =
        [1..count]
        |> List.map (fun i -> 
            let filename = Path.Combine(dir, (sprintf "%s_%d.jpg" person (rand.Next(1000000000))))
            captureAndSaveImage filename
            System.Threading.Thread.Sleep(delayMs)
            filename)

Note: One extra take-away from the above code is that F# supports triple slash comments that show up in tooltips.

Triple-slash Tooltip

Next, I setup a simple database of photos and people. There are two components to the database module. The first is general db-ish type things. This includes person id/name lookup (person.txt), as well as a list of all photos taken (photos.txt). I wrap a couple lookup functions using Map as well as the ability to add to the files. In the future, these will be refactored into a real database, but simple text files work well for a demo. The second component is the OpenCV trained facial recognizer. It consists of the ability to train, as well as save the training results (trained.txt).

/// Recognition database
module Db =

    /// Number of photos during image capture
    let photosToTake = 50

    /// Delay (ms) between photos during image capture
    let delayMs = 100

    /// Data directory
    let dataDir = "../data/"

    /// Lookup table filename: Person id, name
    let personFilename = Path.Combine(dataDir, "person.txt")

    /// Lookup table filename: Photos id, path
    let photosFilename = Path.Combine(dataDir,  "photos.txt")

    /// Trained data filename
    let trainedFilename = Path.Combine(dataDir, "trained.txt")

    /// Person-level summary of prediction match v total result
    type ValidationResult = { name:string; matchCount:int; totalCount:int }

    /// Create a map from a lookup file
    let private makeMap filename fn = 
        if File.Exists(filename) 
        then 
            File.ReadAllLines(filename)
            |> Array.map (fun x -> x.Split [| '|' |])
            |> Array.map fn
            |> Map.ofArray
        else Map.ofList []
        
    /// Lookup person's id (using name)
    let private lookupId () = makeMap personFilename (fun x -> (x.[1], int x.[0]))

    /// Lookup person's name (using id)
    let private lookupName () = makeMap personFilename (fun x -> (int x.[0], x.[1]))

    /// Get a new id for a person
    let private getNewId () = 
        let ids = 
            Map.toArray (lookupName())
            |> Array.map fst
        if Array.length ids = 0 then 1 else (Array.max ids) + 1

    /// Get person's id (by name)
    let private getPersonId (map:Collections.Map<string,int>) (name:string) =
        if map.ContainsKey name then Some (map.Item name) else None

    /// Get person's name (by id)
    let private getPersonName (map:Collections.Map<int,string>) (id:int) =
        if map.ContainsKey id then Some (map.Item id) else None

    /// Append to new entries to photo db
    let private appendPhotosLines photosFilename lines =
        IO.File.AppendAllLines(photosFilename, lines)

    /// Add a person to the person lookup file
    let private addPersonId filename person id = 
        File.AppendAllLines(filename, [| sprintf "%d|%s" id person |])

    /// Add new photos just taken to photos file
    let private appendToPhotosFile dbFilename imageNames person = 
        let id =
            match getPersonId (lookupId()) person with
            | Some(x) -> x
            | None    ->
                        let id' = getNewId()
                        addPersonId personFilename person id' |> ignore
                        id'

        imageNames
        |> List.map (fun x -> sprintf "%d|%s" id x)
        |> appendPhotosLines dbFilename  

    /// ... Face Detection Functions (see below) ...

    /// List of people in the db
    let PersonList () =
        lookupId() 
        |> Map.toSeq 
        |> Seq.map fst


    /// Take a photo and lookup person's name
    /// Return their name
    let lookupPerson () =
        // take photo
        let image = Camera.captureImage()

        // Lookup in db
        // if found, return name, else return none
        let trainer = getTrainer()
        match trainer with
        | Some(trainer) -> getPersonName (lookupName()) (trainer.Predict(image).Label)
        | _             -> None

    /// Take photos and add person to database
    let addPerson name =
        // take photos
        let photoList = Camera.takePhotos photosToTake delayMs dataDir name 

        // Add to photosfile
        appendToPhotosFile photosFilename photoList name 

        // Train with new photos
        let trainer = trainFaceDetector photosFilename trainedFilename
        trainer.Save trainedFilename

    /// Create a blank user (with a smiley face photo)
    let createBlankUser () =
        let blankUserId = 0
        let blankUserName = "blank"
        let blankImageName = Path.Combine(dataDir, "blank.jpg")

        let createBlankImage (filename:string) =
            let blankImage = new Bitmap(640, 480)
            let g = Graphics.FromImage(blankImage)
            g.FillEllipse(new SolidBrush(Color.Yellow), 220, 140, 200, 200)
            g.FillEllipse(new SolidBrush(Color.White), 260, 180, 40, 40)
            g.FillEllipse(new SolidBrush(Color.White), 340, 180, 40, 40)
            g.DrawArc(new Pen(Color.White, 5.F), new Rectangle(260, 230, 120, 70), 10.F, 160.F)
            blankImage.Save(filename, ImageFormat.Jpeg)

        if not (File.Exists(personFilename)) 
        then 
            addPersonId personFilename blankUserName blankUserId
            createBlankImage blankImageName
            appendPhotosLines photosFilename [ sprintf "%d|%s" blankUserId blankImageName ]
        else ()

    /// Db initial setup
    let init() = 
        createBlankUser()

The application is a fun way to demonstrate functionality, but the really interesting parts (the OpenCV calls) can get lost of big blocks of code. I’ve pulled them out to make scanning for the interesting bits easier. Training turns out to be pretty easy. I’ve opted to use the FisherFace model, but Emgu also supports EigenFace and LBPHFace models. To train, call the CV.Face.FischerFaceRecognizer.Train function. It takes two arrays; one of images, and a corresponding one of int labels. It also assumes you have multiple classes, which makes sense, since what would you be training otherwise? To accomodate always having at least two classes, I created a Db.init() function that creates a blank user with a single smiley face image. Hopefully I don’t get classified as a ☺. Currently the training is one big pass, a future refactor will include iterative training.

Once trained, face prediction is done with the CV.Face.FischerFaceRecognizer.Predict call. It returns the predicted int label of the face. This maps to id, so its a simple lookup at that point for the name. All of the other stuff is boilerplate to load images and return results. The last part of this puzzle is loading saved predictions, using CV.Face.FisherFaceRecognizer.Load.

/// Train face detector
let private trainFaceDetector photosFilename trainedFilename =
    // Get labels and photos for training
    let (ids, photos) = 
        IO.File.ReadAllLines(photosFilename)
        |> Array.map (fun x -> 
            let columns = x.Split [| '|' |]
            (int columns.[0], columns.[1]))
        |> Array.map (fun (id, photoFilename) ->
            let image = new Image<Gray, Byte>(photoFilename)
            (id, image))
        |> Array.unzip 

    // Train based on photos
    let trainer = new CV.Face.FisherFaceRecognizer() 
    trainer.Train<Gray, Byte>(photos, ids)

    // Save trained data
    trainer.Save(trainedFilename)

    trainer


/// Perform prediction validation for a set of photos and trainer
let validatePredictions photosFilename (trainer:Face.FisherFaceRecognizer) =
        File.ReadAllLines(photosFilename)
        |> Array.map (fun x -> x.Split [| '|' |])
        |> Array.map (fun x ->
            let image = new Image<Gray, Byte>(x.[1])
            let predicted = trainer.Predict(image)
            { 
                ValidationResult.name = lookupName().Item predicted.Label;
                matchCount = (if int x.[0] = predicted.Label then 1 else 0);
                totalCount = 1
            })
        |> Array.groupBy (fun x -> x.name)
        |> Array.map (fun x -> 
            { 
                ValidationResult.name = fst x;
                matchCount = snd x |> Array.map (fun y -> y.matchCount) |> Array.sum;
                totalCount = snd x |> Array.map (fun y -> y.totalCount) |> Array.sum
            })


/// Load the trained face recognizer
let getTrainer () =
    let trainer = new CV.Face.FisherFaceRecognizer()
    if File.Exists(trainedFilename) 
    then 
        trainer.Load(trainedFilename)
        Some trainer
    else None

Next, putting it all together. By this point the interesting things are already complete, all that’s left is wrapper code. In the App module I build out the commands as well as the main loop. I also have a small Db.init() call to create the blank image that I mentioned earlier. Beyond that, the functions speak for themselves (and there are comments), so I won’t go into detail here.

/// Application
module App = 

    /// Add a person to the photo db
    let addPerson () = 
        printfn "Name to add (ensure person is in front of the camera): "
        let name = Console.ReadLine()
        printfn "Taking photos and training..."
        Db.addPerson name

    /// Lookup person currently in front of camera
    let whoAmI () =
        let person = Db.lookupPerson()
        match person with 
        | Some(person) ->
            printfn "You are %s" person
        | None -> 
            printfn "I don't recognize you.  Sorry."

    /// Display a list of known people in the db
    let reportPeople () =
        printfn "People"
        printfn "------"
        Db.PersonList()
        |> Seq.iter (printfn "%s")

    /// Display a validation report for recognition
    let reportValidation () =
        // run validation against existing photos
        let trainer = Db.getTrainer()
        match trainer with
        | Some(trainer) -> 
            Db.validatePredictions Db.photosFilename trainer
            |> Array.iter (fun x -> printfn "%10s %5d %5d %5.2f" x.name x.matchCount x.totalCount ((float x.matchCount) / (float x.totalCount)))
        | _             -> printfn "No training data"

    /// Show available commands
    let showHelp() =
        printfn "Commands: [addperson|whoami|people|validate|help|exit]"

    /// Execute provided command
    /// Return true to keep processing, false to exit
    let doCommand (command:string) = 
        match command.ToLower() with
        | "addperson"   -> addPerson(); true
        | "whoami"      -> whoAmI(); true
        | "people"      -> reportPeople(); true
        | "validate"    -> reportValidation(); true
        | "help"        -> showHelp(); true
        | "exit"        -> false
        | _             -> printfn "unknown command"; true


    /// Get a person's name (by lookup or prompt to add to db)
    let getName() =
        let person = Db.lookupPerson()
        match person with 
        | Some(person) ->
            printfn "Hi %s" person
            Some person
        | None -> 
            printfn "I don't recognize you. What is your name? "
            let name = Console.ReadLine()
            printfn "Taking photos and training..."
            Db.addPerson name
            // Note: Could return name, but I want to explicitly force a lookup
            //Some name    
            None

    /// Main
    let rec main name = 
        match name with
        | Some(name) ->
            Console.Write("> ")
            let line = Console.ReadLine()
            let keepGoing = doCommand line
            if keepGoing then main (Some name) else ()
        | None       ->
            let name = getName()
            main name

Db.init()
App.main None

The code is all together, it’s time to take the application for a test drive.

Application Execution

Great! It can tell the difference between people. (Sidebar: the detection isn’t perfect; but more, and better quality, data often helps with accuracy.) But I can’t leave well enough alone. It feels too impersonal, if only it knew how I was feeling. Lucky for me, Amazon’s Rekognition api holds the key to some fun bits. The TLDR; the api provides, among other things, its prediction of how a person in a picture is feeling. Other interesting components are: age range, gender, do they have glasses, and feature location.

Before I get into the code, the first requirement is an AWS account. Second, an IAM must be created with Rekognition service permissions. Third, add the IAM credentials to the credentials file, ~/.aws/credentials.

1
2
3

[default]
aws_access_key_id = <your id here>
aws_secret_access_key = <your key here>

Note: I’ll make a quick mention here about SSL certs. There was an easy to overcome snag when making the AWS call. When running the code in VSCode/Ionide, it ran fine. When running it from the commandline using fsharpi, I got an error. Specifically this: Amazon.Runtime.AmazonServiceException: A WebException with status TrustFailure was thrown. ---> System.Net.WebException: Error: TrustFailure (The authentication or decryption has failed.) ---> System.IO.IOException: The authentication or decryption has failed. ---> System.IO.IOException: The authentication or decryption has failed. ---> Mono.Security.Protocol.Tls.TlsException: Invalid certificate received from server. Error code: 0xffffffff800b010a at Mono.Security.Protocol.Tls.RecordProtocol.EndReceiveRecord (System.IAsyncResult asyncResult) .... There are several ways to resolve this error. I opt’d to solve this by importing the mozilla certs for mono using mozroots.exe. Other options can be found at (http://www.mono-project.com/docs/faq/security/); your mileage may vary.

Once these components are in place, a couple small modifications are required. First, add a new package to my package.dependencies file.

1	nuget AWSSDK.Rekognition

Next, add the Rekognition dll’s and namespace.

#r "../packages/AWSSDK.Core/lib/net45/AWSSDK.Core.dll"
#r "../packages/AWSSDK.Rekognition/lib/net45/AWSSDK.Rekognition.dll"

open Amazon
open Amazon.Rekognition

I’m going to put all the code into a new module. To get what we need out of the api, the basic workflow is:

Take photo as Bitmap.
Create a request object (with the image attached).
Call the Rekognition api with the request object.
Grab the attributes of the response object I care about.

For details on the api, I recommend looking at the Rekognition DetectFaces Documentation. There are a couple small details for my implementation I want to mention. The api grabs all faces it can find, so FaceDetail is an array. My use case presumes one person, or if more, it just takes the first one it finds. The api emotion is a list of possible emotions with their probabilities. This isn’t very friendly looking, so I only show the highest probability emotion. The whoAmI function provides some addition interesting reporting from the image.

/// Additional face analysis features
module FaceExtra = 

    /// Take an image and return an aws Model.Image
    let private bitmapToModelImage (image:Bitmap) = 
        // Load image into memorystream
        let ms = new MemoryStream()
        image.Save(ms, ImageFormat.Jpeg);

        // Convert memorystream to aws' Model.Image
        let modelImage = new Model.Image()
        modelImage.Bytes <- ms

        // Return model image
        modelImage


    /// Take an image filename and return an aws DetectFacesRequest 
    let private buildRequest (image:Bitmap) = 
        let request = new Model.DetectFacesRequest()

        // Get all attributes back from api
        let attributeList = new Collections.Generic.List<string>()
        attributeList.Add("ALL")
        request.Attributes <- attributeList

        // Set Image
        request.Image <- bitmapToModelImage image

        // Return request
        request


    /// Given a list of emotions, return the highest confidence one (in tuple form)
    let getMainEmotion (emotions:Collections.Generic.List<Model.Emotion>) = 
        if emotions.Count <> 0 
        then
            emotions
            |> Seq.sortByDescending (fun x -> x.Confidence)
            |> Seq.head
            |> (fun e -> (e.Type.Value, e.Confidence))
        else ("Unknown", float32 0.)


    /// Query the rekognition api using the provided bitmap image
    let getFaceDetails (image:Bitmap) = 
        let request = buildRequest image

        let rekognition = new Amazon.Rekognition.AmazonRekognitionClient(Amazon.RegionEndpoint.USEast1)

        let detectedFaces = rekognition.DetectFaces(request)

        detectedFaces


    /// Take a snapshot and determine the person's emotional state.
    let getCurrentEmotion () =
        let details = getFaceDetails (Camera.captureImageBitmap())

        if details.FaceDetails.Count <> 0
        then Some ((fst (getMainEmotion details.FaceDetails.[0].Emotions)).ToLower())
        else None


    /// Make a friendly description string for showing if face has an attribute
    let attributeDisplay (value:bool) (description:string) =
        if value then sprintf "Has %s" description else sprintf "No %s" description 


    /// Build a simple report string
    let getFaceReport () = 
        let details = getFaceDetails (Camera.captureImageBitmap())

        if details.FaceDetails.Count <> 0
        then
            let face = details.FaceDetails.[0]
            sprintf 
                "Gender: %s\r\nAge: %d - %d\r\nEmotions: %s\r\n%s\r\n%s\r\n%s\r\n%s" 
                face.Gender.Value.Value 
                face.AgeRange.Low 
                face.AgeRange.High
                (String.Join(", ", face.Emotions |> Seq.map (fun x -> sprintf "%s (%f)" x.Type.Value x.Confidence)))
                (attributeDisplay face.Beard.Value "beard")
                (attributeDisplay face.Mustache.Value "mustache")
                (attributeDisplay face.Eyeglasses.Value "glasses")
                (attributeDisplay face.Sunglasses.Value "sunglasses")
        else ""

The additional functionality gets wired into the whoAmI and getName calls. This is a pretty simple add.

/// Lookup person currently in front of camera
let whoAmI () =
    let person = Db.lookupPerson()
    match person with 
    | Some(person) ->
        let report = FaceExtra.getFaceReport()
        printfn "You are %s\r\n%s" person report 
    | None -> 
        printfn "I don't recognize you.  Sorry."

/// Get a person's name (by lookup or prompt to add to db)
let getName() =
    let person = Db.lookupPerson()
    match person with 
    | Some(person) ->
        let emotion = FaceExtra.getCurrentEmotion()
        match emotion with
        | Some(emotion) -> printfn "Hi %s, you seem %s" person emotion
        | None          -> printfn "Hi %s" person
        Some person
    | None -> 
        printfn "I don't recognize you. What is your name? "
        let name = Console.ReadLine()
        printfn "Taking photos and training..."
        Db.addPerson name
        // Note: Could return name, but I want to explicitly force a lookup
        //Some name    
        None

Time to checkout how the new functionality looks.

With Emotion

Cool. It did a pretty good job. The great thing is, this kind of technology will only get better. This has been fun, but the post has already gone longer than intended, so I’ll end it here. I hope you enjoyed this little glimpse into facial recognition and information extraction. Until next time.