Conversational AI (artificial intelligence) is the study of techniques for building software agents that can engage in natural conversational interactions with humans.
Conversational AI allows inquiries to be handled by technology, in the form of a chatbot or voice assistant. These systems use artificial intelligence to replicate the kind of interaction that users would expect from a helpful, well-informed human being.
Machine Learning algorithms learn by example and can be continuously taught and re-trained over time, allowing them to adapt and improve with experience.
Natural language processing (NLP) is an area of artificial intelligence concerned with the automated analysis and generation of human language. 
Conversational AI combines machine learning with natural language processing. This enables it to process and understand what a user is writing or saying, then generate appropriate responses in a natural way. Over time, the system will automatically refine its responses and adapt to changing circumstances.
This article reviews two Conversational AI platforms: Alan AI and Azure Speech. The two tools are compared on the features they provide, how their processes work, documentation, pricing, and integration with different programming languages and solutions. Finally, a short guide shows how to implement each solution in a ReactJS application.
ALAN AI – An Overview
Alan is a Voice AI platform that allows you to add a voice interface to apps.
Alan offers a conversational AI platform to build voice assistants. The platform comprises a set of tools and components to design, embed, and host voice interfaces for apps:
- Alan Studio: A web portal for developers to create voice scripts for their apps.
- Alan SDKs: The client SDKs allow developers to embed Alan's voice in apps through the Alan button. This button lets users communicate with the app via voice and execute commands defined in voice scripts.
- Alan Cloud: The solution architecture is serverless; voice scripts are run and managed on VMs in the cloud. The cloud is where voice scripts created are executed and all voice processing takes place. Alan trains on intents using the terminology for the app and learns to understand the logic of the domain. If necessary, it is possible to migrate the data to a private cloud.
How Alan AI works
Alan acts as an AI backend. It lets the app 'understand' human language and makes it possible to interact with the app through voice. To build a voice interface with Alan, you need to complete the following tasks:
- Create a voice script in Alan Studio: The voice script defines the dialog (intents and responses) for your app and is hosted and executed in Alan Cloud.
- Integrate the Alan voice into the app with the Alan SDKs: Alan integrates with client apps through Alan's SDKs. To integrate an app with Alan, you embed the code for the Alan button in the app project. Alan provides common client API methods for the Alan button. These methods are accessible on the client app side.
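As a rough sketch, the client API surface of the Web SDK looks like the following (method names are taken from Alan's documentation; `alanBtnInstance` is assumed to be the object returned by `alanBtn({...})`):

```javascript
// Illustrative calls on the Alan button instance returned by alanBtn({...}).
alanBtnInstance.activate();               // open the microphone programmatically
alanBtnInstance.playText("Hi there!");    // make the assistant speak a phrase
alanBtnInstance.playCommand({ cmd: "go" }); // send a command to the client as if it came from the voice script
alanBtnInstance.deactivate();             // close the microphone
```

This is not a full implementation, only an indication of how the client app can drive the assistant beyond the button UI itself.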
Alan AI has a comprehensive documentation website. It provides a guide on how to use Alan Studio to create voice scripts, examples of voice scripts, GitHub samples for the existing SDK integrations, and video resources. SDK integrations include, among others:
- Web frameworks (e.g., ReactJS)
- Apache Cordova
- React Native
- Microsoft Power Apps
The enterprise price varies depending on the customer’s needs, such as:
- Number of projected users
- Projected go-live date
- Language(s) to be supported
- Project and use case description
- Whether the voice experience will be developed internally or built out by the Alan team
There is no limitation on the number of users interacting with the voice assistant in your app. On Free and Pay as You Go pricing plans, the number of voice interactions is only limited by the number available on the billing account.
Integrating Alan AI in a ReactJS app

1. Go to Alan Studio.
2. Sign up with a Google or GitHub account, or with your email address.
3. In Alan Studio, click ‘Create Voice Assistant.’ Select to create an empty project and give it any name you want.
4. Create React App:
npx create-react-app my-app
5. Install the Alan Web component:
npm i @alan-ai/alan-sdk-web
6. Add the Alan Button to the app in the App.js file:
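A minimal `App.js` along these lines should work (based on Alan's documented React example; the key value is a placeholder):

```jsx
import { useEffect, useRef } from "react";
import alanBtn from "@alan-ai/alan-sdk-web";

function App() {
  const alanBtnInstance = useRef(null);

  useEffect(() => {
    // Creates the Alan button and connects it to your Alan Studio project.
    alanBtnInstance.current = alanBtn({
      key: "YOUR_KEY_FROM_ALAN_STUDIO_HERE",
      onCommand: (commandData) => {
        // Handles any commands sent from the voice script.
        console.log(commandData);
      },
    });
  }, []);

  return <div className="App">My first Alan-enabled app</div>;
}

export default App;
```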
In the key field, replace YOUR_KEY_FROM_ALAN_STUDIO_HERE with the Alan SDK key for your Alan Studio project. In Alan Studio, at the top of the code editor, click Integrations, copy the value provided in the Alan SDK Key field, and paste it into key.
7. Add voice commands:
In Alan Studio, open the project and, in the code editor, add the following intents:
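For example (the response texts here are placeholders; `intent` and `play` are the Alan Studio scripting API, and the script runs in Alan Cloud, not in the React app):

```javascript
// Alan Studio voice script
intent('What is your name?', p => {
    p.play('My name is Alan.');
});

intent('How are you doing?', p => {
    p.play('Doing great, thank you! How about you?');
});
```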
Now in the app, click the Alan button and ask: “What is your name?” or “How are you doing?” Alan will reply with the responses defined in the intents.
Azure Speech Service – An Overview
Azure Cognitive Services are cloud-based services with REST APIs and client library SDKs that help you build cognitive intelligence into applications. You can add cognitive features to applications without artificial intelligence or data science skills. Azure Cognitive Services comprises various AI services that enable you to build cognitive solutions that can see, hear, speak, understand, and even make decisions.
Categories of Cognitive Services:
The catalog of cognitive services is categorized into five main pillars:
- Vision
- Speech
- Language
- Decision
- Web search
Azure Speech is a subset of Cognitive Services; the Speech service adds speech-enabled features to applications.
The voice assistant service provides fast, reliable interaction between a device and an assistant implementation. It uses either (1) Direct Line Speech (via Azure Bot Service), for adding voice capabilities to bots, or (2) Custom Commands, for voice-commanding scenarios.
Direct Line Speech is an end-to-end solution for creating a voice assistant. It is powered by the Bot Framework and its Direct Line Speech channel, which is optimized for voice-in, voice-out interaction with bots.
Custom Commands makes it easy to build rich voice commanding apps optimized for voice-first interaction experiences. It provides a unified authoring experience, an automatic hosting model, and relatively lower complexity, helping to focus on building the best solution for the voice commanding scenarios.
| If you want… | Then consider… | For example… |
| --- | --- | --- |
| Open-ended conversation with robust skills integration and full deployment control | Azure Bot Service bot with Direct Line Speech channel | “I need to go to Seattle”, “What kind of pizza can I order?” |
| Voice commanding or simple task-oriented conversations with simplified authoring and hosting | Custom Commands | “Turn on the overhead light”, “Make it 5 degrees warmer” |

Table 1 – Voice assistant options
How Azure Speech works
Direct Line Speech offers the highest levels of customization and sophistication for voice assistants. It’s designed for conversational scenarios that are open-ended, natural, or hybrids of the two with task completion or command-and-control use.
1. Create a Bot Using the Bot Framework: Bot Framework Composer enables creators to design conversational bots, virtual agents, digital assistants, and all other dialog interfaces—offering flexible, accessible, and powerful ways to connect with customers, employees, and one another.
2. Create the corresponding Azure resources configuration to voice enable the created bot using Direct Line Speech.
3. Using Speech SDK, the client application connects to the Direct Line Speech channel and streams audio.
4. Optionally, higher accuracy keyword verification happens on the service.
5. The audio is passed to the speech recognition service and transcribed to text.
6. The recognized text is passed to the bot (for example, the Echo Bot sample) as a Bot Framework Activity.
7. The response text is turned into audio by the Text-to-Speech (TTS) service, and streamed back to the client application for playback.
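The client-side part of this flow (steps 3 to 7) can be sketched with the Speech SDK for JavaScript; the key and region values below are placeholders, and the event handlers only log results rather than playing audio:

```javascript
// Sketch: connect a client to the Direct Line Speech channel.
// Assumes a bot is already registered with Direct Line Speech.
import {
  BotFrameworkConfig,
  AudioConfig,
  DialogServiceConnector,
} from "microsoft-cognitiveservices-speech-sdk";

const config = BotFrameworkConfig.fromSubscription(
  "YOUR_SPEECH_KEY",     // placeholder: Speech resource key
  "YOUR_SPEECH_REGION"   // placeholder: Speech resource region
);
const audio = AudioConfig.fromDefaultMicrophoneInput();
const connector = new DialogServiceConnector(config, audio);

// Transcribed text comes back as recognition results (step 5)...
connector.recognized = (_sender, event) => {
  console.log(`Recognized: ${event.result.text}`);
};

// ...and the bot's reply arrives as an activity, with TTS audio
// streamed back for playback (steps 6 and 7).
connector.activityReceived = (_sender, event) => {
  console.log("Bot activity:", event.activity);
};

connector.connect();
connector.listenOnceAsync(); // stream one utterance from the microphone
```

This is a sketch under those assumptions, not a full implementation; a real client would also handle errors, keyword activation, and audio playback.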
The Azure Speech service has many resources and examples in its documentation, including YouTube tutorials. However, the steps to create a voice assistant vary with the complexity of the intended solution. Creating voice commands is more straightforward through Speech Studio, but according to the documentation, the full AI capabilities are only available when using a voice-enabled bot through the Direct Line Speech solution, which involves more configuration steps as well as the learning curve of the Bot Framework.
The Azure Speech software development kit (SDK) exposes many of the Speech service capabilities to power speech-enabled applications. The Speech SDK is available in many programming languages and across many platforms, including:
- C#
- C++
- Go
- Java
- JavaScript
- Objective-C / Swift
- Python
| Tier | Feature | Model | Price |
| --- | --- | --- | --- |
| Free – Web/Container (1 concurrent request) | Speech to Text | Standard | 5 audio hours free per month |
| Free | Speech to Text | Custom | 5 audio hours free per month; endpoint hosting: 1 model free per month |
| Free | Conversation Transcription Multichannel Audio | | 5 audio hours free per month |
| Free | Text to Speech | Standard | 5 million characters free per month |
| Free | Text to Speech | Neural | 0.5 million characters free per month |
| Free | Text to Speech | Custom | 5 million characters free per month; endpoint hosting: 1 model free per month |
| Standard – Web/Container (100 concurrent requests for Base model, 20 for Custom model) | Speech to Text | Standard | $1 per audio hour |
| Standard | Speech to Text | Custom | $1.40 per audio hour; endpoint hosting: $0.0538 per model per hour |
| Standard | Conversation Transcription Multichannel Audio | | $2.10 per audio hour |
| Standard | Text to Speech | Standard | $4 per 1M characters |
| Standard | Text to Speech | Neural | $16 per 1M characters; long audio creation: $100 per 1M characters |
| Standard | Text to Speech | Custom | $6 per 1M characters; endpoint hosting: $0.0537 per model per hour |
| Standard | Text to Speech | Custom Neural | Training: $52 per compute hour (up to $4,992 per training); real-time synthesis: $24 per 1M characters; endpoint hosting: $4.04 per model per hour; long audio creation: $100 per 1M characters |

Table 2 – Speech service pricing
Integrating Azure Speech in a ReactJS app

To create and voice-enable the bot:
1. Create the bot in Bot Framework Composer
2. Create the new Azure resources
3. Build, test, and deploy the bot to an Azure App Service
4. Register your bot with the Direct Line Speech channel
To integrate with a ReactJS application:
1. Create a new ReactJS app
2. Install the Bot Framework Web Chat node module:
npm i botframework-webchat
3. Retrieve your Direct Line Speech credentials
4. Render Web Chat using the Direct Line Speech adapters
5. For full customizability, you can use React to recompose components of Web Chat
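A sketch of the credential and rendering steps, using `createDirectLineSpeechAdapters` from `botframework-webchat` (the region and key values are placeholders; in practice, fetching credentials from your own backend is safer than embedding keys in the client):

```jsx
import React, { useEffect, useState } from "react";
import ReactWebChat, { createDirectLineSpeechAdapters } from "botframework-webchat";

function SpeechChat() {
  const [adapters, setAdapters] = useState();

  useEffect(() => {
    (async () => {
      // fetchCredentials supplies the Direct Line Speech region and key
      // (or an authorization token) to the Web Chat adapters.
      const adapterSet = await createDirectLineSpeechAdapters({
        fetchCredentials: async () => ({
          region: "YOUR_SPEECH_REGION",       // placeholder
          subscriptionKey: "YOUR_SPEECH_KEY", // placeholder
        }),
      });
      setAdapters(adapterSet);
    })();
  }, []);

  // Spreads the directLine and speech ponyfill adapters into Web Chat.
  return adapters ? <ReactWebChat {...adapters} /> : <p>Connecting…</p>;
}

export default SpeechChat;
```

This is a minimal sketch under those assumptions; for full customization, Web Chat's components can be recomposed with React as noted in step 5.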
Conversational AI Platforms: In Summary
Both Alan and Azure Speech seem to provide great solutions for voice assistant problems.
In terms of the learning curve, Alan AI is the easier option to get started with, as its documentation is straightforward about how to create the conversational flows and train the AI in Alan Studio. With the Speech cognitive services, there are more resources to learn and master, including the Bot Framework and other important Azure prerequisites, before you can start developing a voice assistant solution.
Alan AI provides stronger support in terms of SDKs, integrating with more programming languages, and offers many examples of how to build a solution in the developer's chosen language. The Azure Speech service has more components for any given scenario; although it offers a simpler path through Custom Commands, it is not as straightforward as the Alan technology. Its SDK integration is also harder to follow in the documentation, because the examples are oriented towards Microsoft solutions such as C#.
In terms of pricing, the recommendation is to contact Alan AI and Microsoft for a more detailed price calculation based on your project's requirements. Keep in mind that, depending on the Azure solution, additional resources will need to be considered in the estimate.
Conversational AI is a powerful tool that lets users get assistance, and self-serve, faster within applications. These two providers are examples of available technologies that give developers and companies the tools to innovate their services and applications.