One conversational AI platform to rule them all

MS-DOS Emerges, 1983, courtesy RegMedia.co.uk

I expect that by the end of this decade, over 80% of all business communications will be automated at a human level.

That covers how businesses talk to customers, how customers talk to businesses, and how businesses talk to each other. It means that every single company will have dozens or hundreds of business processes automated by text or voice conversational AI applications built on top of a major platform vendor. To explain what I mean, we need another history tour and some analogies.

If you think about it, you will understand two things: 1) we all stand on the shoulders of giants, and all "revolutions" are just incremental paradigm shifts, and 2) everything repeats itself, but at a different level.

When you write a program, such as a text editor, and want to display a menu item at the top of a window or read a file from disk, you ask the operating system to do it for you through a very specific set of function calls that differ from one operating system to another. Together, these functions are called an API (application programming interface): this is the interface that an operating system, like Windows, provides to application developers, who in turn create text editors, browsers, etc. The term API was used in this sense long before Web APIs (REST APIs) appeared. It's a collection of thousands upon thousands of detailed functions that developers can use to make the operating system do things like display a button, scrollbar, or menu, read or write a file, and even extremely complex things like displaying a rendered web page in a window.

If your program uses API calls for Windows, it won't work on macOS, which has a completely different set of API functions. This is one of the important reasons why Windows applications don't work on macOS or Linux (and vice versa). If you want the same application code written for Windows to run on Linux, you'll have to reimplement the entire Windows API on Linux from scratch. That API consists of thousands of complex functions, entities, and concepts, so it would mean almost as much work as writing Windows from the ground up, which took Microsoft thousands of man-years. And if you make one small mistake, forget to implement even one function that the application needs, or implement one that behaves differently, the program will crash.
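To make this concrete, here is a minimal sketch in Python. Both "operating systems" and all of their function names are made up for illustration; they are not real Windows or macOS calls. The point is only that an application written against one set of function names stops working the moment you hand it a platform that offers the same abilities under different names.

```python
class WindowsLikeAPI:
    """Hypothetical stand-in for one operating system's API."""

    def create_menu(self, title):
        return f"[win] menu '{title}' created"

    def read_file(self, path):
        return f"[win] contents of {path}"


class MacLikeAPI:
    """Hypothetical stand-in for another OS: same abilities, different names."""

    def add_menu_item(self, label):
        return f"[mac] menu '{label}' added"

    def load_contents(self, path):
        return f"[mac] contents of {path}"


def text_editor(os_api):
    """An 'application' written against the Windows-like API only."""
    return [os_api.create_menu("File"), os_api.read_file("notes.txt")]


def try_run(os_api):
    """Run the editor and report whether it survived."""
    try:
        text_editor(os_api)
        return "ok"
    except AttributeError as err:
        # The platform does not provide a function the app depends on.
        return f"crash: {err}"


if __name__ == "__main__":
    print(try_run(WindowsLikeAPI()))
    print(try_run(MacLikeAPI()))
```

The editor runs on the platform it was written for and fails on the other one at the very first call. Making it run there would mean re-creating every function it relies on, with the same names and the same behavior.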

Now back to conversational AI (voice or text, it doesn’t really matter). When you automate some business process, let’s say an outbound NPS survey, you may call the result a voicebot, but in reality it’s just a conversational voice application, an application just like your favorite word processor.

This means that whatever platform you’ve used to create your conversational application, you are locked in. It will be virtually impossible to switch to another platform vendor without breaking your app. This is called vendor lock-in. Right now it’s not obvious, because as I write these words in 2020, the penetration of conversational AI in the world is negligible.

In the future, we’ll see all aspects of customer-facing activity being automated at a human level (i.e. indistinguishable from talking to a real person). It will happen across all industries and functions: sales, customer service, support, marketing, HR, etc. To enable such broad automation, the underlying platform for building these automations has to be insanely flexible and provide numerous features to developers. Yes, the 80/20 rule works here as well, and 80% of users will use only 20% of the features, but it will always be a different 20%.
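A quick back-of-the-envelope simulation shows why "a different 20%" matters so much. This is a sketch under made-up assumptions: a platform with 100 features, 50 users, and every user picking their own 20 features uniformly at random. Real usage is certainly less uniform than that, so treat the numbers as illustrative only.

```python
import random

FEATURES = 100   # assumed size of the platform's feature set
USERS = 50       # assumed number of users (developers)
PER_USER = 20    # each user touches only 20% of the features


def simulate(seed=0):
    """Return each user's feature set and the union across all users."""
    rng = random.Random(seed)
    per_user = [
        set(rng.sample(range(FEATURES), PER_USER)) for _ in range(USERS)
    ]
    used_by_someone = set().union(*per_user)
    return per_user, used_by_someone


if __name__ == "__main__":
    per_user, used = simulate()
    print(f"each user uses {PER_USER} of {FEATURES} features")
    print(f"features used by at least one user: {len(used)}")
```

No single user needs more than a fifth of the platform, yet taken together the users touch nearly all of it. That is why a platform cannot get away with shipping only "the popular 20%".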

Since we're talking about writing applications for Dasha, we're also talking about an API: not the external one but the internal one. Here I mean the platform functions that are exposed to a Dasha developer, like answering machine or voicemail detection, gender or age identification, dictation mode, active listening, voice cloning, intonation control, emotion detection, and hundreds of other functions. Multiply that by a vast, exponentially increasing amount of data for machine learning, and it becomes obvious that there will be only one, maybe two major conversational AI platforms in the future.

Once you’ve invested a lot of time into automating your customer-facing workloads on top of a conversational AI platform, the same mechanisms that made Microsoft Windows a monopoly kick in.

The variety of applications that can be built on top of Dasha is potentially huge. If you're still thinking in terms of "it’s only about call centers", then think again. That's not what we're building at Dasha; in the medium term, we're building a universal platform for modeling any conversation that can exist. Here is what you'll be able to build using the Dasha Platform:

  • Voice-based AI service that helps restaurants take orders over the phone

  • Automate the hardest parts of outbound calling by detecting voicemails, filtering out bad numbers, and navigating phone directories to get your reps into live conversations quickly

  • Automate call centers end-to-end (both inbound/outbound)

  • Book a table at a restaurant: build your own Google Duplex in an afternoon

  • Robocalls protection (https://assistant.dasha.ai)

  • NPS surveys for SMB (https://delight.dasha.ai)

  • COVID contact tracer (https://covid.dasha.ai)

  • Receptionist

  • Open a garage door with your voice

  • Control your Tesla with your voice

  • Add voice control to any existing app or website

  • Build a Discord bot

  • Add voice alerts to your Kubernetes cluster

  • Create your own Alexa

  • Make any device a smart device

  • Something else that we can't even fathom yet.

Automation of business communications is only the first and most obvious application. In the late 1970s and 1980s, the first mass-market programs were business apps such as text editors and spreadsheets (it was no accident that VisiCalc, a spreadsheet, became the first killer app for personal computers). Business applications were followed by consumer ones.

Dasha's master plan is simple:

Step 1. We automate repetitive business communications.

Step 2. We automate all business communications.

Step 3. We automate the entirety of human communications.

Remember operating systems? An OS manages the complex underlying hardware so that user applications can run. As such, people don't care about the OS. What they care about are the programs that run on it: Google Chrome, Telegram, Word, Excel. Operating systems themselves are not particularly useful. People use an OS because of the apps available on it. So the most useful OS is the one with the most useful programs.

So if you are a vendor of an OS (platform) like Dasha, you need to make sure that developers want to write applications for your platform and use your internal API. There are two ways to achieve this:

  1. Build a kickass, powerful platform that lets developers do whatever they need.

  2. Deliver an excellent developer experience. Make the life of a developer using this platform as easy and cool as possible. Here I mean development and debugging tools, the onboarding process, code samples, and documentation.
