Working with the Vision framework in playgrounds

Photo by Shiau Ling / Unsplash

When playgrounds were first introduced, I didn't grasp the idea. Time has passed, and I finally started to treat playgrounds like... playgrounds: a place where I could quickly test my ideas, make prototypes, and more.

What is great about playgrounds is the constant feedback they provide. In a few seconds, you know whether the changes you made are working properly or not. This is a game-changer for anyone used to compiling and running the application to check how the new code behaves.

This time I won't introduce any new requests. Instead, I will show you how you can work with the Vision framework in playgrounds, where you can verify whether the framework suits your needs or experiment with the requests.

First, we need to create a playground. In Xcode click on File > New > Playground or use shift + option + command + N. Then select Blank playground from the list.

With the playground in place, we need to add the imports first:

import UIKit
import Vision

Then it's time for the request. I will reuse the code from Image classification using the Vision framework from last week:

func process(_ image: UIImage) {
    // Vision works with CGImage, so we bail out if we can't get one
    guard let cgImage = image.cgImage else { return }
    let request = VNClassifyImageRequest()
    
    // The handler needs the orientation translated from UIKit to Core Graphics
    let requestHandler = VNImageRequestHandler(cgImage: cgImage,
                                               orientation: .init(image.imageOrientation),
                                               options: [:])
    
    // Perform the request off the main thread
    DispatchQueue.global(qos: .userInitiated).async {
        do {
            try requestHandler.perform([request])
        } catch {
            print("Can't make the request due to \(error)")
        }
        
        guard let results = request.results as? [VNClassificationObservation] else { return }
        
        // Keep only the classifications we are reasonably confident about
        results
            .filter { $0.confidence > 0.7 }
            .forEach { print("\($0.identifier) - \(Int($0.confidence * 100))%") }
    }
}

And an extension that translates the UIKit image orientation into its Core Graphics counterpart:

extension CGImagePropertyOrientation {
    init(_ uiOrientation: UIImage.Orientation) {
        switch uiOrientation {
        case .up: self = .up
        case .upMirrored: self = .upMirrored
        case .down: self = .down
        case .downMirrored: self = .downMirrored
        case .left: self = .left
        case .leftMirrored: self = .leftMirrored
        case .right: self = .right
        case .rightMirrored: self = .rightMirrored
        @unknown default:
            self = .up
        }
    }
}

This extension is described in Detecting body pose using Vision framework. If you want to know more about the code, please refer to the mentioned articles.

We have the processing function ready. Now we need images. The playground has a special place called Resources for providing images and any other assets:

We drag and drop the images:
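Anything we drop into Resources ends up in the playground's main bundle, which is why UIImage(named:) can find the files by their names. If you want to double-check that an asset really made it there, a quick lookup through Bundle.main does the trick; a minimal sketch, using the cupcake file as an example:

// Assets from the Resources folder are copied into the playground's main bundle.
// The file name below is only an example; use the names of your own assets.
if let url = Bundle.main.url(forResource: "cupcake", withExtension: "jpg") {
    print("Found cupcake.jpg at \(url)")
} else {
    print("cupcake.jpg is missing from Resources")
}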

Instantiate the cupcake image in the playground:

let cupcake = UIImage(named: "cupcake.jpg")!

Add code for processing and start the playground by tapping the play icon:

process(cupcake)

The result is printed to the console:

food - 86%
baked_goods - 78%
dessert - 71%
cake - 71%
cupcake - 71%

And that's it! In a couple of minutes, we created functional proof-of-concept code. No application is needed. We could focus on the main task alone.

We could end here, but there are a few cool features playgrounds have to offer, and it would be a shame not to use them.

What we can improve is how we present inputs and outputs. If we look closely at the line where we created the image:

We will see two icons on the right. If we tap the eye, we get a large preview:

The other icon displays the current value inside the playground:
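This works for any expression, not just images. Putting a value on its own line is enough to get an entry in the results sidebar, for example:

// Every top-level expression gets its own entry in the results sidebar.
cupcake.size   // shows the image dimensions inline
cupcake        // tap the eye icon to get a Quick Look preview of the image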

Now let's present the processing results better. First, we need to make a small change to our process function. We drop the dispatch queue code so it runs on the main thread, and additionally return the results as an array of strings:

func process(_ image: UIImage) -> [String] {
    guard let cgImage = image.cgImage else { return [] }
    let request = VNClassifyImageRequest()
    
    let requestHandler = VNImageRequestHandler(cgImage: cgImage,
                                               orientation: .init(image.imageOrientation),
                                               options: [:])
    
    do {
        try requestHandler.perform([request])
    } catch {
        print("Can't make the request due to \(error)")
    }
    
    guard let results = request.results as? [VNClassificationObservation] else { return [] }
    
    return results
        .filter { $0.confidence > 0.7 }
        .map { "\($0.identifier) - \((Int($0.confidence * 100)))%" }
}

It's the easiest way to get the results right where we want them:

Note: As you can see, the rectangle button can show more than just images.

With this approach, we have a clear overview of the inputs and outputs we want to check, and we don't have to constrain ourselves to a single image:

We can do the same for the saliency requests described in Saliency detection using the Vision framework:
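For example, a minimal sketch of the same pattern built around the attention-based saliency request could look like the code below; the output formatting is mine, not code from that article:

// A sketch only: the same pattern applied to attention-based saliency.
// Returns the bounding boxes of the salient regions as strings.
func processSaliency(_ image: UIImage) -> [String] {
    guard let cgImage = image.cgImage else { return [] }
    let request = VNGenerateAttentionBasedSaliencyImageRequest()
    
    let requestHandler = VNImageRequestHandler(cgImage: cgImage,
                                               orientation: .init(image.imageOrientation),
                                               options: [:])
    
    do {
        try requestHandler.perform([request])
    } catch {
        print("Can't make the request due to \(error)")
        return []
    }
    
    guard let results = request.results as? [VNSaliencyImageObservation] else { return [] }
    
    // Each observation can carry the rectangles of the detected salient objects
    return results
        .flatMap { $0.salientObjects ?? [] }
        .map { "salient region: \($0.boundingBox)" }
}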

Or for the animal detection requests from Animals detection using the Vision framework:
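A similar sketch for animal detection, again only as an illustration of the pattern:

// A sketch only: the same pattern applied to animal detection.
// Returns the recognized animal labels with their confidence.
func processAnimals(_ image: UIImage) -> [String] {
    guard let cgImage = image.cgImage else { return [] }
    let request = VNRecognizeAnimalsRequest()
    
    let requestHandler = VNImageRequestHandler(cgImage: cgImage,
                                               orientation: .init(image.imageOrientation),
                                               options: [:])
    
    do {
        try requestHandler.perform([request])
    } catch {
        print("Can't make the request due to \(error)")
        return []
    }
    
    guard let results = request.results as? [VNRecognizedObjectObservation] else { return [] }
    
    // Each observation carries one or more labels, e.g. "Cat" or "Dog"
    return results
        .flatMap { $0.labels }
        .map { "\($0.identifier) - \(Int($0.confidence * 100))%" }
}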

Or any other request.

Playgrounds are a great tool to have.

This is the whole code needed:

import UIKit
import Vision


func process(_ image: UIImage) -> [String] {
    guard let cgImage = image.cgImage else { return [] }
    let request = VNClassifyImageRequest()
    
    let requestHandler = VNImageRequestHandler(cgImage: cgImage,
                                               orientation: .init(image.imageOrientation),
                                               options: [:])
    
    do {
        try requestHandler.perform([request])
    } catch {
        print("Can't make the request due to \(error)")
    }
    
    guard let results = request.results as? [VNClassificationObservation] else { return [] }
    
    return results
        .filter { $0.confidence > 0.7 }
        .map { "\($0.identifier) - \((Int($0.confidence * 100)))%" }
}

extension CGImagePropertyOrientation {
    init(_ uiOrientation: UIImage.Orientation) {
        switch uiOrientation {
        case .up: self = .up
        case .upMirrored: self = .upMirrored
        case .down: self = .down
        case .downMirrored: self = .downMirrored
        case .left: self = .left
        case .leftMirrored: self = .leftMirrored
        case .right: self = .right
        case .rightMirrored: self = .rightMirrored
        @unknown default:
            self = .up
        }
    }
}

let cupcake = UIImage(named: "cupcake.jpg")!
process(cupcake)
let plane = UIImage(named: "plane.jpg")!
process(plane)
let lake = UIImage(named: "lake.jpg")!
process(lake)

⚠️ Remember to add your assets to Resources and provide the proper image names in UIImage(named: "cupcake.jpg")!.

If you want to play with Vision and see it for yourself, you can check the latest version of my vision demo application here, where you will find TheVision.playground with three pages for you to experiment with:

⚠️ Playgrounds are dependent on the project, so you need to build the project first before using them.
⚠️ I was working with these playgrounds on Xcode 13.
⚠️ If you are using an M1 Mac and run Xcode using Rosetta, playgrounds won't work.


Have fun!

If you have any feedback, or just want to say hi, you are more than welcome to write me an e-mail or tweet to @tustanowskik.

If you want to be up to date and always be the first to know what I'm working on, follow @tustanowskik on Twitter.

Thank you for reading!

P.S. Another way of getting quick feedback is using unit tests and snapshot tests, but that is a topic for another article. Let me know if you would like to know more!

The photos I used in this article were taken by Ibrahim Boran, Lacie Slezak, Ursa Bavcar, and The Lucky Neko.

This article was featured in SwiftLee Weekly #85 and Awesome Swift Weekly #280 🎉

If you want to help me stay on my feet during the night when I'm working on my blog - now you can.
