Prerequisites

  • An account on https://developer.amazon.com/, which is where you set up the definition of the Alexa Skill

  • An account on https://aws.amazon.com to write the Lambda function which responds to the Skill

  • A laptop or computer to use

  • Some way of running Python code - good options are Python in a terminal, Jupyter notebook, or Trinket

  • An Alexa device (optional as you can use the web interface to test)

  • Access to the code - the original open-sourced version from Amazon is here on GitHub, and my latest version is here

  • If running as a workshop, volunteers who understand Python to help the students

Schedule

A typical workshop schedule is

  • 1 hour Python introduction (students typically split into groups of 2-3 and work together for the duration of the workshop)

  • 1/2 hour introduction to Alexa Skills and voice user interface (VUI) design

  • 2-3 hours building and improving an Alexa skill

Python Introduction

The workshop begins with a short Python tutorial covering the basics that are useful for building the Alexa skill. The tutorial code can be run in several ways - one of the easiest is to copy and paste it into an online coding platform in your browser, such as Trinket. Working through the tutorial takes about an hour.

Introduction to Alexa Skills & VUI Design

To design an Alexa Skill, we need to know about some basic concepts - utterances, intents and slots.

An ‘utterance’ is something that someone says - it isn’t necessarily a full sentence as people don’t always speak grammatically. For example:

  • “Play some music”

  • “Who is Serena Williams”

  • “Tell me about London”

In a voice interface, the computer has to work out what the user asked for and decide how to reply. Language is complex and ambiguous, so different people ask for the same thing in many different ways. We therefore have to categorise the user’s utterance before we can choose a reply. An ‘intent’ is the name for a group of utterances which mean the same sort of thing. For example:

  • PlayMusicIntent

    • “play some music”

    • “i want to hear music”

    • “please make music play”

  • GetWeatherForecastIntent

    • “what’s the weather forecast”

    • “what will the weather be like”

    • “tell me about the weather”

  • GetFactIntent

    • “tell me an interesting fact”

    • “can i hear a fact”

    • “i want to know a fun fact”
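
The grouping above can be sketched in Python as a dictionary from intent names to sample utterances. This is only a toy illustration: the name `find_intent` is made up here, and it matches exact phrases only, whereas a real Alexa skill relies on Amazon's language understanding to generalise beyond the samples.

```python
# Toy illustration: each intent name maps to sample utterances that mean the same thing.
SAMPLE_UTTERANCES = {
    "PlayMusicIntent": [
        "play some music",
        "i want to hear music",
        "please make music play",
    ],
    "GetWeatherForecastIntent": [
        "what's the weather forecast",
        "what will the weather be like",
        "tell me about the weather",
    ],
    "GetFactIntent": [
        "tell me an interesting fact",
        "can i hear a fact",
        "i want to know a fun fact",
    ],
}


def find_intent(utterance):
    """Return the intent whose samples contain the utterance, or None if there is no match."""
    cleaned = utterance.lower().strip()
    for intent_name, samples in SAMPLE_UTTERANCES.items():
        if cleaned in samples:
            return intent_name
    return None


print(find_intent("Can I hear a fact"))  # GetFactIntent
print(find_intent("order a pizza"))      # None
```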

A ‘slot’ is a specific thing (usually a noun) that the user is asking about, like a music artist, a song name, a city, a time, or a person.

  • PlayMusicIntent

    • “play some music by the beatles”

    • “i want to hear hey jude”

    • “please make summertime by nina simone play”

  • GetWeatherForecastIntent

    • “what’s the weather forecast tomorrow”

    • “what will the weather be like in london”

    • “tell me about the weather next week in cambridge”

  • GetFactIntent

    • “tell me an interesting fact about Mae Jemison”

    • “can i hear a fact about amelia earhart”

    • “i want to know a fun fact about london”

An utterance can have more than one slot in it. Sometimes the same words can be a different type of slot in different utterances, e.g. “play the song new york” vs “what’s the weather in new york”.
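
By the time a request reaches the skill's code, Alexa has already turned the spoken words into an intent name and a set of slot values. The sketch below shows roughly what that looks like for the “new york” example, trimmed to the relevant fields. The slot names `Song` and `City` and the helper `get_slot_values` are invented for this illustration.

```python
def get_slot_values(request):
    """Collect the filled slots of an intent request as a {slot name: value} dictionary."""
    slots = request["intent"].get("slots", {})
    return {name: slot["value"] for name, slot in slots.items() if "value" in slot}


# The same words fill a different kind of slot depending on the intent.
play_request = {
    "type": "IntentRequest",
    "intent": {
        "name": "PlayMusicIntent",
        "slots": {"Song": {"name": "Song", "value": "new york"}},
    },
}
weather_request = {
    "type": "IntentRequest",
    "intent": {
        "name": "GetWeatherForecastIntent",
        "slots": {"City": {"name": "City", "value": "new york"}},
    },
}

print(get_slot_values(play_request))     # {'Song': 'new york'}
print(get_slot_values(weather_request))  # {'City': 'new york'}
```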

Alexa Skill Building

We use the basic Alexa skill from the GitHub repository as the basis for the practical session. This is a fully working skill, which the students can build and then improve. Amazon provides full documentation for setting up a skill - you need to set up both an Alexa Skill and a Lambda function, and have them talk to each other.

Our example skill has three intents:

  • saySomethingIntent

  • personalFactIntent

  • giveMeQuoteIntent

The skill in the GitHub repository has several files which are used in https://developer.amazon.com/ to build the Alexa skill. Most of the information is copied and pasted into the skill development interface:

  • The list of example utterances for the skill in utterances.txt

  • A list of people names for the slot “Person” in LIST_OF_PERSON

  • A list of subjects for the slot “Subject” in LIST_OF_SUBJECT

  • The intent schema in intent_schema.json - this lists the three intents for this skill with some information about the slots in each, in a machine-readable JSON format. From this file we only need the names of the intents.
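
To show what pulling the intent names out of a schema might look like, here is a small Python sketch. The schema text is hypothetical: it is written in the style of Amazon's older intent schema format, and pairing the “Person” slot with personalFactIntent and the “Subject” slot with giveMeQuoteIntent is an assumption. The real intent_schema.json in the repository may be laid out differently.

```python
import json

# Hypothetical schema contents for illustration only; check the repository for the real file.
EXAMPLE_SCHEMA = """
{
  "intents": [
    {"intent": "saySomethingIntent"},
    {"intent": "personalFactIntent",
     "slots": [{"name": "Person", "type": "LIST_OF_PERSON"}]},
    {"intent": "giveMeQuoteIntent",
     "slots": [{"name": "Subject", "type": "LIST_OF_SUBJECT"}]}
  ]
}
"""

schema = json.loads(EXAMPLE_SCHEMA)

# Only the intent names are needed when filling in the skill development interface.
intent_names = [entry["intent"] for entry in schema["intents"]]
print(intent_names)  # ['saySomethingIntent', 'personalFactIntent', 'giveMeQuoteIntent']
```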

The repository also has the code for the Lambda function in lambda_function.py. This code is set up in a Lambda function on https://aws.amazon.com, and it composes the skill’s response to the three intents.
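
As a rough idea of how such a Lambda function can compose responses, here is a minimal sketch. It is not the code from the repository: the handler name `lambda_handler`, the helper `build_response`, and all the spoken text are assumptions made for this example. Only the three intent names come from the skill itself.

```python
def build_response(speech_text, end_session=True):
    """Wrap plain text in the JSON structure that Alexa expects back from a skill."""
    return {
        "version": "1.0",
        "sessionAttributes": {},
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point called by AWS Lambda with the request that Alexa sends."""
    request = event["request"]
    if request["type"] == "LaunchRequest":
        # The user opened the skill without asking for anything specific.
        return build_response("Welcome. Ask me to say something.", end_session=False)
    if request["type"] == "IntentRequest":
        intent_name = request["intent"]["name"]
        if intent_name == "saySomethingIntent":
            return build_response("Hello from the workshop.")
        if intent_name == "personalFactIntent":
            return build_response("Here is a fact about a person.")
        if intent_name == "giveMeQuoteIntent":
            return build_response("Here is a quote.")
    return build_response("Sorry, I did not understand that.")


# Try the handler locally with a hand-written event, without deploying anything.
test_event = {"request": {"type": "IntentRequest", "intent": {"name": "saySomethingIntent"}}}
reply = lambda_handler(test_event, None)
print(reply["response"]["outputSpeech"]["text"])  # Hello from the workshop.
```

Calling the handler with a hand-written event like this is a quick way to check changes in a terminal or notebook before pasting the code into AWS.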

Once the skill and the Lambda function are both set up and talking to each other, it’s possible to test the skill in the skill developer interface or on an Alexa device that’s connected to your Amazon account.

Some ideas for improving the skill during the workshop are:

  1. Change the output of saySomethingIntent so that Alexa’s response changes

  2. Look for the get_person_fact() function and try adding information about some more people

  3. Look for the get_quote() function and try adding some extra quotes for each topic

  4. Try adding a new topic that you can ask for quotes about

  5. Invent a new intent and add it to both the skill and the Lambda function code
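
For ideas 2 to 4, one possible shape for the data behind these functions is a pair of dictionaries, sketched below. The function names get_person_fact() and get_quote() come from the skill, but their signatures, the dictionary names, and the placeholder quotes are assumptions for this example, so the real lambda_function.py may be organised differently.

```python
import random

# Idea 2: add more people by adding entries to this dictionary (keys are lower case).
PERSON_FACTS = {
    "mae jemison": "Mae Jemison was the first African American woman to travel into space.",
    "amelia earhart": "Amelia Earhart was the first woman to fly solo across the Atlantic.",
}

# Idea 3: add extra quotes to a topic's list. The quotes here are placeholders.
QUOTES = {
    "science": ["Placeholder science quote one.", "Placeholder science quote two."],
    "courage": ["Placeholder courage quote one."],
}

# Idea 4: a new topic is just a new key with its own list of quotes.
QUOTES["space"] = ["Placeholder space quote one."]


def get_person_fact(person):
    """Return a fact about the person, or an apology if we know nothing about them."""
    return PERSON_FACTS.get(person.lower(), "Sorry, I don't know about " + person + ".")


def get_quote(subject):
    """Return a random quote for the subject, or an apology for an unknown subject."""
    quotes = QUOTES.get(subject.lower())
    if not quotes:
        return "Sorry, I don't have any quotes about " + subject + "."
    return random.choice(quotes)


print(get_person_fact("Mae Jemison"))
print(get_quote("space"))
```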