Do you know what a dialogue is?
Dialogue is a written or spoken conversational exchange between two or more people, and a literary and theatrical form that depicts such an exchange.
To be honest, I have a problem with this definition:
between two or more people
Why people? Why can't we converse with other beings? What is the difference?
This made me think about the Turing test:
The Turing test, originally called the imitation game by Alan Turing in 1950, is a test of a machine's ability to exhibit intelligent behaviour equivalent to, or indistinguishable from, that of a human. Turing proposed that a human evaluator would judge natural language conversations between a human and a machine designed to generate human-like responses. The evaluator would be aware that one of the two partners in conversation is a machine, and all participants would be separated from one another. The conversation would be limited to a text-only channel such as a computer keyboard and screen so the result would not depend on the machine's ability to render words as speech. If the evaluator cannot reliably tell the machine from the human, the machine is said to have passed the test. The test results do not depend on the machine's ability to give correct answers to questions, only how closely its answers resemble those a human would give.
It's a simpler, and actually real, predecessor of the Voight-Kampff test.
When the evaluator is conversing with a human, it's a dialogue. When the evaluator is conversing with a machine, it's not a dialogue anymore. At least according to Wikipedia. But what happens before the evaluator decides who is a person and who is not? Can we call this part of the test a dialogue? A dialogue with a machine?
I will leave this question open. Sorry, but I'm not a philosopher.
My previous article was about Speech recognition using the Speech framework. I presented a way we can talk to our applications and make them recognize our speech. It's time to allow the applications to speak back. It's time to give them a voice.
Did you notice the AV prefix in AVSpeechSynthesizer? No new and fancy frameworks this time, just good ol' AVFoundation.
Let's imagine we are working on a cooking application. We want the user to be able to use it without touching or even looking at the screen. Consider the scenario where we want the application to tell the user that the chicken should be placed in the oven:
Bake the chicken in the oven for fifteen minutes
First, we decide what the application will say by using AVSpeechUtterance:
let englishUtterance = AVSpeechUtterance(string: "Bake the chicken in the oven for fifteen minutes")
I encourage you to immediately add:
englishUtterance.prefersAssistiveTechnologySettings = true
⚠️ There are a few ways to tweak how the application speaks our message. But what about users with disabilities who rely on VoiceOver? There is a high chance that our voice won't match the VoiceOver voice, which is confusing and uncomfortable for the user. This line makes sure that when VoiceOver is on, our application uses the same voice.
Next, we create an AVSpeechSynthesizer, which we will use in a moment to speak our utterance:
let synthesizer = AVSpeechSynthesizer()
If you prefer a simple approach you can add:
synthesizer.usesApplicationAudioSession = false
But note that:
If the value of this property is false, the capture session makes use of a private AVAudioSession instance for audio recording, which may cause interruption if your app uses its own audio session for playback.
The last part is passing the utterance to the speech synthesizer:
synthesizer.speak(englishUtterance)
As soon as you do this you will hear the application talking to you.
let englishUtterance = AVSpeechUtterance(string: "Bake the chicken in the oven for fifteen minutes")
englishUtterance.prefersAssistiveTechnologySettings = true
let synthesizer = AVSpeechSynthesizer()
synthesizer.usesApplicationAudioSession = false
synthesizer.speak(englishUtterance)
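One thing the snippet above does not show: outside a playground, the synthesizer generally has to stay alive while it is speaking, and a local variable that goes out of scope may cut the speech off. Here is a minimal sketch of one way to handle that, assuming a hypothetical `RecipeSpeaker` wrapper (the class name and its `speak(_:)` method are my own illustration, not part of AVFoundation):

```swift
import AVFoundation

/// Hypothetical wrapper that keeps a single synthesizer alive
/// for as long as the wrapper itself is retained.
final class RecipeSpeaker {
    // Stored property, so the synthesizer outlives any single call.
    private let synthesizer = AVSpeechSynthesizer()

    /// Builds an utterance for `text`, enqueues it, and returns it.
    @discardableResult
    func speak(_ text: String) -> AVSpeechUtterance {
        let utterance = AVSpeechUtterance(string: text)
        // Follow the user's assistive technology voice settings when VoiceOver is on.
        utterance.prefersAssistiveTechnologySettings = true
        synthesizer.speak(utterance)
        return utterance
    }
}

// Keep `speaker` somewhere long-lived, for example as a property of a view model.
let speaker = RecipeSpeaker()
let spoken = speaker.speak("Bake the chicken in the oven for fifteen minutes")
```

Returning the utterance is only a convenience for inspection; the important part of the design is that the synthesizer is a stored property rather than a local variable.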
Yes. It's that easy.
But that's not all. You can specify a concrete language, and speech synthesis supports many of them, including Polish, which I use every day:
let polishUtterance = AVSpeechUtterance(string: "Piecz kurczaka w piekarniku przez piętnaście minut")
polishUtterance.prefersAssistiveTechnologySettings = true
let polishVoice = AVSpeechSynthesisVoice(language: "pl-PL")
polishUtterance.voice = polishVoice
let synthesizer = AVSpeechSynthesizer()
synthesizer.usesApplicationAudioSession = false
synthesizer.speak(polishUtterance)
As you can see, we can create a voice matching the language of the text. When you have a voice, you need to pass it to the utterance:
let polishVoice = AVSpeechSynthesisVoice(language: "pl-PL")
polishUtterance.voice = polishVoice
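The voice initializer is failable, so it may be worth handling the case where no voice matches the requested language. A small sketch, assuming a hypothetical `makeUtterance(_:languageCode:)` helper (the function is mine, introduced only for illustration):

```swift
import AVFoundation

/// Hypothetical helper: builds an utterance and attaches a voice for
/// `languageCode` only when the system can provide one.
func makeUtterance(_ text: String, languageCode: String) -> AVSpeechUtterance {
    let utterance = AVSpeechUtterance(string: text)
    // The initializer returns nil when no matching voice is available.
    if let voice = AVSpeechSynthesisVoice(language: languageCode) {
        utterance.voice = voice
    }
    // If `voice` stays nil, the synthesizer falls back to the system default voice.
    return utterance
}

let polish = makeUtterance("Piecz kurczaka w piekarniku przez piętnaście minut",
                           languageCode: "pl-PL")
```

Whether falling back to the default voice is acceptable depends on the application; skipping the utterance or showing the text on screen instead may be a better choice when the languages differ.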
You can paste the code samples into a playground to hear how they sound.
AVSpeechUtterance has a few configuration options:
rate - Lower values correspond to slower speech, and higher values correspond to faster speech.
pitchMultiplier - The baseline pitch the speech synthesizer uses when speaking the utterance.
preUtteranceDelay, postUtteranceDelay - When multiple utterances are enqueued, these values set the delays between them: one before the utterance starts, the other after it ends.
volume - The volume of the speech.
voice - The voice used to read the text. You can use a voice that doesn't match the language of the text, but this won't end well.
You can use AVSpeechSynthesisVoice.speechVoices() to see the available voices:
Language: ar-SA, Name: Maged, Quality: Default [com.apple.ttsbundle.Maged-compact]
Language: cs-CZ, Name: Zuzana, Quality: Default [com.apple.ttsbundle.Zuzana-compact]
Language: da-DK, Name: Sara, Quality: Default [com.apple.ttsbundle.Sara-compact]
Language: de-DE, Name: Anna, Quality: Default [com.apple.ttsbundle.Anna-compact]
Language: el-GR, Name: Melina, Quality: Default [com.apple.ttsbundle.Melina-compact]
Language: en-AU, Name: Karen, Quality: Default [com.apple.ttsbundle.Karen-compact]
Language: en-GB, Name: Daniel, Quality: Default [com.apple.ttsbundle.Daniel-compact]
Language: en-IE, Name: Moira, Quality: Default [com.apple.ttsbundle.Moira-compact]
Language: en-IN, Name: Rishi, Quality: Default [com.apple.ttsbundle.Rishi-compact]
Language: en-US, Name: Samantha, Quality: Default [com.apple.ttsbundle.Samantha-compact]
Language: en-ZA, Name: Tessa, Quality: Default [com.apple.ttsbundle.Tessa-compact]
Language: es-ES, Name: Mónica, Quality: Default [com.apple.ttsbundle.Monica-compact]
Language: es-MX, Name: Paulina, Quality: Default [com.apple.ttsbundle.Paulina-compact]
Language: fi-FI, Name: Satu, Quality: Default [com.apple.ttsbundle.Satu-compact]
Language: fr-CA, Name: Amélie, Quality: Default [com.apple.ttsbundle.Amelie-compact]
Language: fr-FR, Name: Thomas, Quality: Default [com.apple.ttsbundle.Thomas-compact]
Language: he-IL, Name: Carmit, Quality: Default [com.apple.ttsbundle.Carmit-compact]
Language: hi-IN, Name: Lekha, Quality: Default [com.apple.ttsbundle.Lekha-compact]
Language: hu-HU, Name: Mariska, Quality: Default [com.apple.ttsbundle.Mariska-compact]
Language: id-ID, Name: Damayanti, Quality: Default [com.apple.ttsbundle.Damayanti-compact]
Language: it-IT, Name: Alice, Quality: Default [com.apple.ttsbundle.Alice-compact]
Language: ja-JP, Name: Kyoko, Quality: Default [com.apple.ttsbundle.Kyoko-compact]
Language: ko-KR, Name: Yuna, Quality: Default [com.apple.ttsbundle.Yuna-compact]
Language: nl-BE, Name: Ellen, Quality: Default [com.apple.ttsbundle.Ellen-compact]
Language: nl-NL, Name: Xander, Quality: Default [com.apple.ttsbundle.Xander-compact]
Language: no-NO, Name: Nora, Quality: Default [com.apple.ttsbundle.Nora-compact]
Language: pl-PL, Name: Zosia, Quality: Default [com.apple.ttsbundle.Zosia-compact]
Language: pt-BR, Name: Luciana, Quality: Default [com.apple.ttsbundle.Luciana-compact]
Language: pt-PT, Name: Joana, Quality: Default [com.apple.ttsbundle.Joana-compact]
Language: ro-RO, Name: Ioana, Quality: Default [com.apple.ttsbundle.Ioana-compact]
Language: ru-RU, Name: Milena, Quality: Default [com.apple.ttsbundle.Milena-compact]
Language: sk-SK, Name: Laura, Quality: Default [com.apple.ttsbundle.Laura-compact]
Language: sv-SE, Name: Alva, Quality: Default [com.apple.ttsbundle.Alva-compact]
Language: th-TH, Name: Kanya, Quality: Default [com.apple.ttsbundle.Kanya-compact]
Language: tr-TR, Name: Yelda, Quality: Default [com.apple.ttsbundle.Yelda-compact]
Language: zh-CN, Name: Ting-Ting, Quality: Default [com.apple.ttsbundle.Ting-Ting-compact]
Language: zh-HK, Name: Sin-Ji, Quality: Default [com.apple.ttsbundle.Sin-Ji-compact]
Language: zh-TW, Name: Mei-Jia, Quality: Default [com.apple.ttsbundle.Mei-Jia-compact]
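A list like the one above can be produced with a few lines of code. This is only a sketch: the exact voices, and the format in which they print, depend on the device and the system version.

```swift
import AVFoundation

// Fetch every voice installed on this device and sort by language code.
let voices = AVSpeechSynthesisVoice.speechVoices()
    .sorted { $0.language < $1.language }

// Print one voice per line: language, name, and identifier.
for voice in voices {
    print("\(voice.language)  \(voice.name)  [\(voice.identifier)]")
}

// Keep only the voices for a single language, here Polish.
let polishVoices = voices.filter { $0.language == "pl-PL" }
```

Filtering by language code like this can be handy when you want to offer the user a choice between several voices for the same language.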
⚠️ You need to set these properties before enqueuing the utterance, because setting them afterward has no effect.
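To make the configuration options more concrete, here is a sketch that tunes two utterances and only then enqueues them. The specific values and the "Preheat the oven" text are arbitrary choices of mine for illustration, not recommendations.

```swift
import AVFoundation

let first = AVSpeechUtterance(string: "Preheat the oven")
// Slightly slower than the default, never below the allowed minimum.
first.rate = max(AVSpeechUtteranceMinimumSpeechRate,
                 AVSpeechUtteranceDefaultSpeechRate * 0.9)
first.pitchMultiplier = 1.1      // documented range is 0.5 to 2.0
first.volume = 0.8               // 0.0 is silent, 1.0 is loudest
first.postUtteranceDelay = 0.5   // seconds of silence after this utterance

let second = AVSpeechUtterance(string: "Bake the chicken in the oven for fifteen minutes")
second.preUtteranceDelay = 0.5   // seconds of silence before this utterance

// Configure first, enqueue last. The synthesizer speaks utterances in the order received.
let synthesizer = AVSpeechSynthesizer()
synthesizer.speak(first)
synthesizer.speak(second)
```

With both delays set, the pause between the two sentences is the sum of the first utterance's postUtteranceDelay and the second's preUtteranceDelay.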
This will get you going, but it will take a lot more to make your application pass the Turing test.
If you want to stay up to date and be the first to know what I'm working on, follow @tustanowskik on Twitter.
Thank you for reading!