
Dynamic Carousels from your DialogFlow webhook

After recently starting to work with DialogFlow and Actions on Google, the one thing that I learned is that the documentation is really patchy and often provides snippets of JSON without the context of how everything should fit together. I’ll probably do a series of posts in this vein, but for this inaugural DialogFlow entry I’m going to focus on carousels!

What is a DialogFlow carousel?

Carousels in DialogFlow are used to present the user with multiple options such as a list of products from a shop or results from a search.

[Image: Google Assistant Carousel]

It’s fairly straightforward to set up static carousel responses in the DialogFlow console. In the “Google Assistant” responses tab, hit “Add Responses” and choose “Carousel Card”. Repeat for the number of cards needed. The blank cards have fields for an image URL, alt text for that image, a title, a description, a key and some synonyms (more on these last two later).

[Image: Adding a static carousel card]

If you fill in some information, you get the idea. Below is an example using characters from the game Dota 2.

[Image: Static carousel items]

But what if you want this content to be based on the results of a search or some other action? We could always do the search on the webhook side, put all of the data into a context and reference it in the carousel cards using the hash notation (e.g. #results.title1 where results is the name of a context and title1 is the title of the first result). But what if the number of cards that we want to show is variable? We may have 5 results in some cases and only 3 in others!

Dynamic Carousels

For users of one of the SDKs, you’re in luck because creating carousels with a dynamic number of items is baked into the SDK. No JSON wrangling for you!

For the rest of us, ultimately, we need to return JSON in the format that DialogFlow is expecting for a carousel to show up. Carousel responses have partial documentation. Unfortunately, this documentation does not give you the full JSON payload required. Through some trial and error, using bits from the “custom_payload” response and scouring the internet for information from others, I’ve been able to work out the structure of JSON that is expected!

Here’s the JSON that I have been using:
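
(What follows is a sketch of the shape rather than my exact payload: it assumes the v1 webhook format with the Actions on Google payload under data.google, and the keys, titles, image URLs and the hero_selected event name are illustrative, borrowing from the Dota 2 example above.)

```json
{
  "speech": "Here are some heroes",
  "displayText": "Here are some heroes",
  "data": {
    "google": {
      "expectUserResponse": true,
      "richResponse": {
        "items": [
          {
            "simpleResponse": {
              "textToSpeech": "Here are some heroes that might suit you"
            }
          }
        ],
        "suggestions": [
          { "title": "Start again" }
        ]
      },
      "systemIntent": {
        "intent": "actions.intent.OPTION",
        "data": {
          "@type": "type.googleapis.com/google.actions.v2.OptionValueSpec",
          "carouselSelect": {
            "items": [
              {
                "optionInfo": {
                  "key": "AXE",
                  "synonyms": ["first", "first one", "axe"]
                },
                "title": "Axe",
                "description": "A durable melee hero",
                "image": {
                  "url": "https://example.com/axe.png",
                  "accessibilityText": "Axe"
                }
              },
              {
                "optionInfo": {
                  "key": "CRYSTAL_MAIDEN",
                  "synonyms": ["second", "second one", "cm"]
                },
                "title": "Crystal Maiden",
                "description": "A ranged support hero",
                "image": {
                  "url": "https://example.com/cm.png",
                  "accessibilityText": "Crystal Maiden"
                }
              }
            ]
          }
        }
      }
    }
  },
  "contextOut": [
    {
      "name": "carousel",
      "lifespan": 5,
      "parameters": {
        "followup-event": "hero_selected"
      }
    }
  ]
}
```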

From the documentation, we can see what the carousel part of the response should look like: it’s the carouselSelect block inside the systemIntent. The envelope around it is the part that took the trial and error to piece together.

You’re not limited to one message type in a response either. I regularly combine a simple_response that introduces the results, a carousel_card to display the results and then a suggestion_chip response to suggest follow-on actions to the user.

Handling carousel item selection

Obviously, there’s not much point in giving users multiple options if you don’t know which one the user has chosen! To receive the user’s choice, we need an intent that can be triggered with the event actions_intent_OPTION. Events are essentially ways of tagging an intent so that it can be triggered programmatically (more information on events in DialogFlow can be found here). In this case, when someone makes a choice from a carousel, the event actions_intent_OPTION is fired. There are two different ways of handling this event:

  1. If you only have one carousel, you just need an intent that knows what to do with the carousel choice.
  2. If you have multiple carousels, you need an intent specifically for handling carousel choices. This will capture the chosen option, put it into a context and trigger the correct event.

Note on option 2: You could also do this by having your carousel-handling action call the right code internally, but to me it seems more correct not to give that responsibility to code that should just be dealing with carousels.

With both approaches, the option that the user selected is put into a context named actions_intent_OPTION. That context looks a little like this:
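
(The values below are illustrative; the key reuses the Dota example from earlier.)

```json
{
  "name": "actions_intent_OPTION",
  "lifespan": 0,
  "parameters": {
    "OPTION": "AXE"
  }
}
```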

The value of OPTION will be the key of the selected carousel item. As this context only has a lifespan of 0, it will expire after this call. The easiest thing to do is to take this value and put it into a context that has a longer lifespan. Looking back at the first JSON snippet, there’s a context named carousel with a parameter named followup-event. We can add the option value to this context and then tell DialogFlow to follow the current action up with the action associated with the event in followup-event.
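
As a sketch, the response from the intent that handles actions_intent_OPTION can hand control over using the v1 followupEvent field (the event and parameter names here are illustrative and match the earlier snippet):

```json
{
  "followupEvent": {
    "name": "hero_selected",
    "data": {
      "selectedHero": "AXE"
    }
  },
  "contextOut": [
    {
      "name": "carousel",
      "lifespan": 5,
      "parameters": {
        "selected-option": "AXE"
      }
    }
  ]
}
```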

To summarise that, here’s the sequence:

[Image: Carousel flow]

Carousel item keys and synonyms

As mentioned above, whatever you set as the key for your carousel item is what is sent back as the option. This means that you don’t have to try to interpret what the user has said or typed; DialogFlow will do that for you and let you know which option was chosen.

How will DialogFlow know which option the user has selected? If the user taps or clicks a carousel option then it’s easy for DialogFlow to know which option to return to the webhook. When it comes to speech or typing, you can use the “synonyms” feature to make it easier for DialogFlow to work out what was chosen. For example, the screenshot with the Dota heroes has short and alternative names for each hero, but we can also add in numbers and ordinals for each choice (first, second, etc). This way, when a user says “choose the second one”, DialogFlow will match that to our synonym and pass the associated key to the webhook.

Another option for selecting from carousels is to use the select.number follow-up intent. This adds a bunch of common “number selection” phrases that you can use in your webhook, but then you need to know which number corresponds to which option value, which may mean storing some more data in a context for later recall.

Summary

Here’s a quick summary of what this post was about:

  • Dynamic carousels are not very well documented in the DialogFlow or Actions on Google docs
  • Example JSON to return a response with carousel items from your webhook
  • How to handle the event triggered by a carousel item choice
  • How to have a single intent for handling all carousel item choices
  • An explanation of how carousel item keys and synonyms work and how they can make your assistant app more user friendly

NDC London 2017 & SwanseaCon 2017

In January this year I gave my first ever (lightning) conference talk!

As well as a big thank you to the NDC submissions panel for thinking me worthy, I’d also like to extend my gratitude to all of the people who make NDC happen. As exciting as it was to be a speaker at such a high-calibre conference, for me, the truly best part was being able to see some amazing talks by some of the most accomplished (and smart) people in our industry.

It doesn’t take much reading through my previous posts to guess that my talk was about chat bots. The goal of my talk was to live-code a chatbot in under 15 minutes that I could then interact with in Skype, with the underlying theme being that creating chat bots is actually pretty straightforward and requires little to get going.

If anyone is interested, have a watch of my talk on YouTube!

I enjoyed talking at NDC so much that I have since submitted a talk to SwanseaCon 2017 and been accepted to speak. In keeping with the obvious theme, my talk will once again be about chat bots. It will include a live coding session similar to the one I did at NDC, but will also include insights into how we developed our customer service chat bot at Just Eat.

Functionally testing chatbots (part 2)

Introducing BotSpec!

In part one of this series I outlined the problems with testing Bot Framework chatbots and how there is a real gap in the tooling available right now. Today I can happily announce that what I would consider the first usable version of my own chatbot testing framework is now available on NuGet!

The goal for BotSpec is to provide a simple and discoverable interface for testing chatbots made with the Bot Framework. Hopefully I’m on the right path with the first version.

Simple use of BotSpec

Here’s some example code that shows off the basic features of BotSpec:
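
A sketch of the kind of test I mean is below. The Expect root object, the {Property}Matching naming and the test names come from the walkthrough that follows; the send and attachment helpers shown are illustrative rather than BotSpec’s exact surface:

```csharp
using NUnit.Framework;
// BotSpec usings omitted; see the package for the exact namespaces.

[TestFixture]
public class ExampleBotTests
{
    private Expect _expect;

    [OneTimeSetUp]
    public void SetUp()
    {
        // Authenticates against the Bot Framework with your Direct Line token
        // and starts a new conversation.
        _expect = new Expect("your-direct-line-token");
    }

    [Test]
    public void Saying_hello_should_return_Hello()
    {
        // Illustrative send/expect pair: send a text activity, then expect a
        // reply whose Text matches the regex "Hello".
        _expect.SendMessage("hello");
        _expect.Message.TextMatching("Hello");
    }

    [Test]
    public void Saying_thumbnail_card_should_return_a_thumbnail_card()
    {
        // Three expectations, all satisfied by the one thumbnail card the bot
        // returns (property values here are illustrative).
        _expect.SendMessage("thumbnail card");
        _expect.ThumbnailCard.TitleMatching("A thumbnail card");
        _expect.ThumbnailCard.SubtitleMatching("with a subtitle");
        _expect.ThumbnailCard.TextMatching("and some text");
    }
}
```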

Code walkthrough

Set Up

In the OneTimeSetUp method we create an instance of Expect. This is the root object for interacting with and testing your chatbot. On creation, Expect will take your token and authenticate with the Bot Framework and start a new conversation.

Hello!

The first test, Saying_hello_should_return_Hello, shows the most basic of bot interactions: sending a message to the bot and expecting a response. Activities you send are most likely to be text-based messages (this includes button presses) but more complex activities that include attachments are also supported.

The test then sets the expectation that we will receive some activity that has text matching the phrase “Hello”. Pretty simple stuff so far, I know.

Drilling down into attachments

The last test shown, Saying_thumbnail_card_should_return_a_thumbnail_card, is an example of an expectation for an attachment that meets the 3 given criteria. The three expect statements will check that any activity received satisfies the condition given for each one. In our case one activity will satisfy all three as the one thumbnail card returned has all of the properties that we are looking for.

BotSpec features

As well as the simple things shown in the code example above there are a few nifty things that BotSpec can do to make testing easier.

Regex comparison and regex group matching

All of the methods that test strings are named {Property}Matching where {Property} is the name of the property being tested. The word Matching is used because it is not an equality check; these methods take a regex and use it to check whether the property is what we are expecting. (Note: I am considering adding in some options for string checking, with string.Equals() being the default and regex being one of the options.)

In the majority of cases this will be enough, but sometimes you will need to keep a part of a response from your bot to check later. When this is the case, there is an option for using a group-matching regex. It’s a bit more complicated than the stuff we’ve seen so far, so I will lead with an example:
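
Something like this (a fragment continuing the sketch above; the send call and the exact out-parameter type are illustrative):

```csharp
// Ask the bot something that replies with a number in it, e.g.
// "Your order number is 1234", then pull that number out for later.
_expect.SendMessage("place my order");

IList<string> matches;
_expect.Message.TextMatching(
    @"Your order number is [\d]*",     // does the reply match at all?
    @"Your order number is ([\d]*)",   // same regex, with a capture group around the number
    out matches);

var orderNumber = matches[0];          // e.g. "1234", kept for later assertions
```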

We’re still using the TextMatching method but this one takes a few more arguments:

  • The first one is the regex that we used before; this will similarly be used to check whether the given property matches the regex supplied.
  • The second string looks very similar to the first but has brackets around the [\d]* part. The brackets form a capture group, and whenever we see something that matches the regex, we keep the match inside the brackets for later.
  • All of the matches that we get from our group match will be collected and made available as a list of matches in an out param.

Attachment retrieval and extraction

Attachments can either be sent with the activity or be referenced with a URL. Out of the box, BotSpec will work out for you whether to fetch the attachment via the given URL or to deserialise it from the provided JSON.

This is all using the default attachment extractor, which currently extracts attachments in the following way:

  • Works out whether the attachment content is a part of the activity or whether it resides remotely and needs to be retrieved
  • Selects all attachments that have the ContentType which matches the specified attachment type
  • Retrieves content with the provided URLs using the specified attachment retriever (more on that below)
  • Deserialises the content JSON to the specified type

The default attachment extractor can be overridden by setting AttachmentExtractorSettings.AttachmentExtractorType to Custom and assigning a custom IAttachmentExtractor implementation to AttachmentExtractorSettings.CustomAttachmentExtractor.

Similarly, the default attachment retriever can be overridden by setting AttachmentRetrieverSettings.AttachmentRetrieverType to Custom and assigning a custom IAttachmentRetriever implementation to AttachmentRetrieverSettings.CustomAttachmentRetriever. The default uses a simple WebClient to download the content as a string (it’s expecting JSON so doesn’t handle image content very well at the moment).
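
For example (the settings classes and interfaces are the ones named above; the Custom enum values and the MyAttachmentExtractor/MyAttachmentRetriever implementations are placeholders):

```csharp
// Swap in a custom extractor (a class implementing IAttachmentExtractor).
AttachmentExtractorSettings.AttachmentExtractorType = AttachmentExtractorType.Custom;
AttachmentExtractorSettings.CustomAttachmentExtractor = new MyAttachmentExtractor();

// Swap in a custom retriever (a class implementing IAttachmentRetriever).
AttachmentRetrieverSettings.AttachmentRetrieverType = AttachmentRetrieverType.Custom;
AttachmentRetrieverSettings.CustomAttachmentRetriever = new MyAttachmentRetriever();
```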

Waiting for a number of messages before asserting

Sometimes you may be expecting a bot to return a set number of responses for a given interaction and can only be sure that your expectation is met once all of the messages have been received. An example of this could be asking your bot for your “top 10 selling products”. As bot messages are not guaranteed to be delivered in order, you may want to wait for 11 messages (1 for your bot to inform the user that it is looking and 1 for each of the top 10 products) before checking the content of these messages.

This is as simple as telling BotSpec how many activities you’re expecting before carrying on the assertion chain:
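
In sketch form (the exact BotSpec call is illustrative here; the shape is what matters):

```csharp
_expect.SendMessage("top 10 selling products");

// Hypothetical option name: wait until 11 activities have arrived (retrying
// once a second, up to 10 times) before running the expectation against them.
_expect.Messages(expectedCount: 11)
       .TextMatching("Here are your top 10 selling products");
```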

Currently (subject to change because I know this should be more flexible), BotSpec will wait for one second and then try again for a total of 10 tries before failing.

Part of the reasoning for this feature was also that the Bot Framework is still very new and my personal experience is that clients dealing with it should be as fault tolerant as possible.

Wrap up

If you like the sound of BotSpec, grab the NuGet and start testing all of your bots!

If you’re interested in how BotSpec works under the hood, I’ll delve into the inner workings in my next post but if you can’t wait until then, have a look at the code on GitHub. If you discover any bugs or have any suggestions feel free to raise an issue on the project or even create a PR and become a contributor.

Direct Line v3 and the new C# Direct Line Client

Intro

One of the great things about the Bot Framework is that, out of the box, there’s a bunch of channels to hook your bot up to without having to worry about any of the plumbing of communicating with those services. The currently supported list of channels can be found here (although the newly announced Microsoft Teams has not been added to that list yet).

But what do you do when the channels provided aren’t quite enough? That’s simple, you turn to Direct Line! Direct Line is a REST API for the Bot Framework that allows its users to create their own integrations. A great example of this is when you have a mobile app that you want to integrate a chat bot into directly. Microsoft aren’t going to make your app a channel for the Bot Framework as no one else would be able to integrate with it, but you can still get your users using your bot with Direct Line.

v3 of Direct Line

All of the documentation for the Direct Line API can be found on the Bot Framework site. Now, a little background on the first available version of the Direct Line API (v1.1): it sucked. It was quite flaky, prompts were styled as text and attachments were stripped out. There were a number of things that you could do to work around these issues but it was a pain. The new v3 version, however, is awesome and takes all of that pain away.

A client for v3

As well as releasing a new version of the API itself, the Bot Framework team released a new version of its Direct Line client NuGet package. At the time of writing, the new version is still in beta so you will need to include pre-release packages in your search to find it. That particular fact caused me several hours of pain in a project of my own, only to find out later that I wasn’t using the latest package.

Getting started with the new Direct Line Client

Let’s take a look at the simplest way to get started using the Direct Line client. The 3 things that we want to be able to do are: start a conversation, send some text as an activity and get the responses.

Start a conversation

To get started we need a class that creates an instance of DirectLineClient and calls StartConversationAsync on the ConversationsResource:
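
As a minimal sketch (the class and field names are mine, and error handling is left out):

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Bot.Connector.DirectLine;

public class DirectLineBotConnection
{
    private readonly DirectLineClient _client;
    private Conversation _conversation;
    private string _watermark; // used later when retrieving messages

    public DirectLineBotConnection(string directLineSecret)
    {
        // The secret comes from the Direct Line channel configuration for your bot.
        _client = new DirectLineClient(directLineSecret);
    }

    public async Task StartConversationAsync()
    {
        // Start a new conversation and keep hold of it so that later sends and
        // retrieves can reference its ConversationId.
        _conversation = await _client.Conversations.StartConversationAsync();
    }
}
```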

We create an instance of the client with our secret as an argument (your secret can be generated when activating the Direct Line channel for your bot), then call StartConversationAsync to start a conversation. We want to keep a reference to the returned Conversation object so that we can tell the Bot Framework that any messages we send or retrieve are for this specific conversation.

Sending messages

Once our conversation is started, we can start sending messages:
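
Continuing the sketch above, sending a plain text message looks something like this (the From id is whatever uniquely identifies your user):

```csharp
public async Task SendMessageAsync(string message)
{
    var activity = new Activity
    {
        Type = "message",                              // leaving Type unset gets the activity ignored
        Text = message,                                // the text to send to the bot
        From = new ChannelAccount { Id = "user-1234" } // an id unique to this sender
    };

    // Post the activity to the conversation we started earlier.
    await _client.Conversations.PostActivityAsync(_conversation.ConversationId, activity);
}
```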

The Activity object has many fields and it’s hard to work out what the minimum required set is. To send a plain text message, the only things required are your message as a string, a ChannelAccount object that identifies the sender and a string which tells the Bot Framework what kind of activity we are sending (in most cases, this will be message). The ChannelAccount requires an id that is unique to each sender (the Bot Framework also allows group conversations with bots). Once that’s all set up, calling PostActivityAsync with the ConversationId from earlier and our activity will send our message to the bot.

Retrieving messages

Retrieving messages is very similar to sending messages:
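
Again continuing the same sketch, we keep track of the watermark between calls:

```csharp
public async Task<IList<Activity>> GetNewMessagesAsync()
{
    // Ask for any activities newer than the watermark we last saw.
    var activitySet = await _client.Conversations.GetActivitiesAsync(
        _conversation.ConversationId, _watermark);

    // Remember where we got up to so that the next call only returns new activities.
    _watermark = activitySet.Watermark;

    return activitySet.Activities;
}
```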

A call to GetActivitiesAsync with the ConversationId and watermark will ask the Bot Framework for any messages that are newer than the value of the watermark. From the response, we keep track of what the watermark now is so that we only retrieve new messages each time. Although the Watermark property type is string, it’s actually just a sequence number, starting at 1, for each activity in a given conversation.

Summary

This is just a quick intro into using the new Direct Line C# client. There are a few other things that the client can do that we didn’t cover (resuming a conversation from before and uploading files to a conversation) but this should be enough to get going with.

A quick TL;DR would look like this:

  • Show pre-release packages when installing the Direct Line Client (currently, v3 is still in beta)
  • Initialise the client with your Direct Line secret
  • Start a conversation and keep a reference to at least the conversation id
  • When sending messages, be sure to include a ChannelAccount with an id unique to each user and to set the activity type to message
  • When receiving messages, keep a reference to the current watermark to only get new messages
  • The code shown here is available on my GitHub

EDIT: I originally didn’t specify that an Activity sent to the Bot Framework requires a type. Not including a type will cause the Bot Framework to ignore your message! Almost always this type will be “message” but at times it may be something else.

Functionally testing chatbots (part 1)

This is the first post in a series where I will talk about my experiences trying to functionally test chatbots built with the Bot Framework. This first post will cover the roadblocks that I encountered when trying to create functional tests that were easily reproducible and automatable. The rest of the series will look at a framework for testing Bot Framework chatbots that I have recently been developing.

When I first started thinking about how to test chatbots that I’ve written I had the following thought:

Although the Bot Framework is very new, it should be straight forward to write functional tests because interacting with a bot is just sending HTTP requests to a web API.

Yes, you do interact with the bot via a web API over HTTP, which already has proven methods of functional testing, but testing a chatbot is very different. Anything more than the most basic of chatbots will have conversation state and conversation flow to worry about. There is also the fact that chatbots can respond multiple times to a single user message, and that after sending a message you don’t immediately get a response; you have to ask for any new responses since the last time you checked.

I initially started writing unit tests using my usual trio of NUnit, NSubstitute and Fluent Assertions for testing, mocking and asserting respectively. These tests quickly became unwieldy and involved more setup than testing.

Mocking all of the dialog dependencies as well as the factory creating dialogs and everything that the DialogContext does quickly makes tests look and feel very bloated. Also, due to the way that the MessagesController (example of this in the second code block here) uses the Conversations static object, unit testing our controller is tricky and requires what I would consider more effort than it’s worth.

In bots that I have written, my approach has been to treat dialogs like MVC/Web API controllers. What this means is that I try to keep them as thin as possible so that they only manage Bot Framework specific actions like responding to the user. Everything else is pushed down into service classes where I can unit test to my heart’s content! Coupling this approach with the difficulty of unit testing dialogs and the solitary controller responsible for calling them, I have opted to cover them with only functional and end-to-end tests.

One advantage of testing dialogs at a higher level is that a BDD-style approach lends itself really nicely to conversation flow. Conversations can easily be expressed using a “Given, When, Then” syntax, which allows our tests to be easy to understand at a glance while also covering large portions of our conversation flow in one test.

Knowing that I wanted to use a BDD approach, I instantly added SpecFlow to a bot project and got to work on writing steps, but then I discovered that SpecFlow currently doesn’t fully support async/await. Since the Bot Framework relies heavily on async/await, SpecFlow was no longer an option.

As with the earlier unit tests and their masses of setup, even if SpecFlow had the async support that I needed, actually writing the tests in a succinct and clear way would still be difficult. Let’s take the Sandwich Bot Sample as an example. This bot uses FormFlow to create a dialog that can build a sandwich order through a predefined set of steps. If we were to write an end-to-end test for a simple order in BDD style, it would have around 20 steps. Each one of those steps would either be sending a message or checking incoming messages for the content that we expect. We might receive multiple messages and have to check them all. Each one of those messages might have multiple attachments in the form of choice buttons. All of these buttons would have to be checked for the specific text that we want to assert on in our test.

To me, this seemed like there was something missing. I don’t want to keep writing code that checks all new messages from my bot for an attachment that has a button with text matching a pattern. Also, what do we do about the fact that a reply may not be instantly available after we send our message? Do we retry? Do we fail the test?

I’ve tried to take these questions and formulate them into a library that will ease the burden of testing chatbot conversation flow. It’s still a work in progress with lots of work still to be done, but I believe that it can be useful for anyone writing chatbots with the intention of deploying them in the enterprise, where verification of new code is generally of high importance.

My library can be found on GitHub but isn’t yet available as a NuGet package as it’s not complete enough to publish (plus the name will probably change because the current one kinda sucks). The remaining posts in this series will look at how I have built this library, the reasons for certain architectural decisions and, hopefully, how anyone building chatbots with the Bot Framework can use it to ensure that their bot is still doing what it’s supposed to.

IoC in the Bot Framework

My first post is going to surface some information that is pretty difficult to find and doesn’t exist all in one place. Hopefully it will be of benefit to other people using the Bot Framework.

If you’ve used the Microsoft Bot Framework before, you’ll know that even though it’s well ahead of any other bot frameworks in terms of functionality, writing your bot code can be a bit tricky. There’s a bunch of hoops to jump through and the documentation isn’t always the most helpful.

One of the things that I’ve struggled with is that everything needs to be serializable. In theory this doesn’t sound like a problem until you want to separate your conversational bot code and your “service” code (i.e. calls to external APIs or data sources).

Solution 1: Service locator

The first solution that I came across was using a static factory for getting my dependencies.
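
It looked something like this (the service names are illustrative):

```csharp
// A static factory that news up concrete services on demand. It keeps the
// dialog free of non-serializable fields, but it is a classic service locator.
public static class ServiceFactory
{
    public static IProductService CreateProductService()
    {
        return new ProductService();
    }

    public static IOrderService CreateOrderService()
    {
        return new OrderService();
    }
}

// Inside a dialog method, dependencies are grabbed as needed rather than injected:
var productService = ServiceFactory.CreateProductService();
```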

This simple solution solved the problem with a small amount of code. Unfortunately it’s an implementation of the widely known anti-pattern: the Service Locator.

For all of the usual reasons, the static factory wasn’t great. There was no way to mock the services in the factory as they were just new’d up at run time, it definitely violates the dependency inversion principle of SOLID, and as I add more dependencies it will just grow and grow unless split into even more static factories!

Solution 2: Slightly improved service locator

The next evolution from here was to use an IoC container to create all of the dependencies and then to use the static factory to access the container and get the required services for the dialog class.
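
Roughly, with Autofac as the container (again, the service names are illustrative):

```csharp
using Autofac;

// Dependencies are now created by an IoC container, but dialogs still go
// through a static factory to get at them; an improved service locator.
public static class ServiceFactory
{
    public static IContainer Container { get; set; }

    public static T Resolve<T>()
    {
        return Container.Resolve<T>();
    }
}

// At application start-up (or in a test, where mocks can be registered instead):
var builder = new ContainerBuilder();
builder.RegisterType<ProductService>().As<IProductService>();
ServiceFactory.Container = builder.Build();

// Inside a dialog:
var productService = ServiceFactory.Resolve<IProductService>();
```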

Slightly better than before. I can now set up my container in my tests and have the factory return mocked interfaces where appropriate. For quite some time, this is the pattern that I used. I knew it wasn’t the best and I knew that it was still just a service locator.

Solution 3: Constructor injection via magic

I spent some time looking around for solutions and then finally in the depths of the BotBuilder GitHub issues I found these two issues raised by the same user:

https://github.com/Microsoft/BotBuilder/issues/106
https://github.com/Microsoft/BotBuilder/issues/938

Will gives a link to the Alarm Bot example in his comment, which is a good example of how to create the Autofac bindings required to use constructor injection.

So, using some Autofac magic and a badly documented aspect of the Bot Framework, we can add dependencies as constructor arguments and not have them serialized!

Following the Autofac guide for Web API, I first installed the Autofac.WebApi2 NuGet package and then updated my Global.asax.cs to look like this:
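
A sketch of that Global.asax.cs, following the standard Autofac Web API integration (the service and dialog registrations are illustrative; depending on how your dialogs are constructed you may also need to register them with the Bot Framework’s own container, as the Alarm Bot example shows):

```csharp
using System.Reflection;
using System.Web.Http;
using Autofac;
using Autofac.Integration.WebApi;

public class WebApiApplication : System.Web.HttpApplication
{
    protected void Application_Start()
    {
        var builder = new ContainerBuilder();

        // Register all Web API controllers in this assembly (including MessagesController).
        builder.RegisterApiControllers(Assembly.GetExecutingAssembly());

        // Illustrative registrations: the services our dialogs depend on, and the dialogs themselves.
        builder.RegisterType<ProductService>().As<IProductService>();
        builder.RegisterType<RootDialog>().AsSelf().InstancePerDependency();

        var container = builder.Build();

        // Hand Web API dependency resolution over to Autofac.
        GlobalConfiguration.Configuration.DependencyResolver =
            new AutofacWebApiDependencyResolver(container);

        GlobalConfiguration.Configure(WebApiConfig.Register);
    }
}
```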

Now I get all of the usual benefits of doing dependency injection in a more traditional way. I can create mocks and pass them into the dialog for my tests, the dialog doesn’t have to worry about where the service comes from, and I can more easily control the lifecycle of my dependencies.

Realistically I don’t want to write unit tests against my dialog as there is too much to mock (IDialogContext does a lot of stuff!), but now I can easily extract my conversational logic into another class and test it in isolation. That’s for another post though!

EDIT: this DI approach has since stopped working for me despite being the officially documented strategy. I’ve opened an issue on the Bot Framework’s GitHub. There’s been some discussion and it has since been tagged with “bug” but it doesn’t look like it’s going anywhere fast. Stay tuned for more updates!