Voice AI and site reliability engineering? What a ridiculous matchup. The two live in completely different worlds: the former in the world of pesky call centers, the latter in the world of ponytailed sysadmins.
Intro and definitions
- Webhooks
- HTTPS requests
- Site reliability monitoring (we will use Better Uptime)
- Conversational voice AI (we will use Dasha)
- Site reliability engineering - principles of software engineering applied to infrastructure and operations problems.
- Incident - something bad that happens to your site. In other words, your site was up, an incident happens, now it's down. Incidents are reported by site reliability monitors.
- Incident acknowledgement - once an incident is reported, it has to be confirmed by a responsible individual (you).
- Incident resolution - after an incident is acknowledged, it has to be resolved (that is, if you want your website to be up and running again).
- Conversational AI app - an automated conversation built for a specific need, integrated with tools you already use, powered by AI.
What's wrong with the current incident reporting workflow?
- The responsible user may be away from their machine when the incident happens. They can check some of the systems from their phone and maybe even resolve the issue, but why force them to waste time logging into Better Uptime when they can just put the phone on speaker and resolve everything with the AI app?
- The user might indeed be at their computer, but resolving the incident would mean logging onto half a dozen websites and clicking tons of things. With the AI app they can get all the info they need through a simple voice question and set the appropriate incident statuses in Better Uptime.
- And I'm sure you will find a dozen more ways in which this tech would be useful in your devops workflow.
- Create a Better Uptime account.
- Activate Dasha API key.
- Create a server for Better Uptime to monitor.
- Create a server to listen to webhooks from Better Uptime and launch the Dasha app.
- Build a Dasha conversational app that will call you when an incident is created; set up external functions and HTTPS requests to Better Uptime.
- Set up nodemailer to email yourself a transcript of the conversation.
- Set up .env.
- Set up a tunnelling service to your localhost servers.
- Set up the monitors and webhooks on Better Uptime.
- Test.
1. Create a Better Uptime Account
2. Activate Dasha API key
code --install-extension dasha-ai.dashastudio && npm i -g "@dasha.ai/cli@latest"
dasha account login
dasha account info
git clone https://github.com/dasha-samples/blank-slate-app
cd blank-slate-app
npm i
3. Create a server for Better Uptime to monitor
const http = require('http');

const hostname = '127.0.0.1';
const port = 3000;

// Minimal server for Better Uptime to monitor: it always answers 200
const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello World. Server is up.');
});

server.listen(port, hostname, () => {
  // backticks are required for ${...} interpolation
  console.log(`Server running at http://${hostname}:${port}/`);
});
node helloworld.js
4. Create a server to listen to webhooks from Better Uptime and launch the Dasha app
const dasha = require("@dasha.ai/sdk");
const fs = require("fs");
const express = require("express");
const bodyParser = require("body-parser");
const json2html = require("node-json2html");
const axios = require("axios").default;
require("dotenv").config();

const hook = express();
const PORT = 1919;

// parse the JSON bodies of incoming webhooks
hook.use(bodyParser.json());

hook.get("/", (req, res) => {
  res.setHeader("Content-Type", "text/plain");
  res.end("Hello World. Server running on port " + PORT + ". Listening for incidents on http://1543a913a2c7.ngrok.io As soon as incident is identified, I will initiate a call from Dasha AI to acknowledge or address the incident. ");
});

// backticks are required for ${...} interpolation
hook.listen(PORT, () => console.log(`🚀 Server running on port ${PORT}`));
hook.post("/hook", async (req, res) => {
  console.log(req.body);
  // Respond right away; responding to the webhook is important
  res.status(200).end();

  // save the incident ID from the JSON body
  // we will need it to send acknowledged and resolved requests to Better Uptime
  const incidentId = req.body.data.id;
  // we also save the acknowledged and resolved statuses
  // we will need these to keep Dasha from calling you when your incident is acknowledged or resolved
  const acknowledged = req.body.data.attributes.acknowledged_at;
  const resolved = req.body.data.attributes.resolved_at;

  // log the statuses
  console.log("incidentID: " + incidentId);
  console.log("acknowledged: " + acknowledged);
  console.log("resolved: " + resolved);

  // Better Uptime sends out webhooks on created, acknowledged, resolved statuses for each incident
  // we only need to run the Dasha app when the incident is created, thus we do the following
  // (resolved is checked first so that an incident resolved without being acknowledged does not trigger a call):
  if (resolved != null) {
    console.log("Incident " + incidentId + " resolved.");
  } else if (acknowledged != null) {
    console.log("Incident " + incidentId + " acknowledged.");
  } else {
    console.log("Incident " + incidentId + " created. Expect a call from Dasha.");
    // Launch the function running the Dasha app
    await calldasha(incidentId);
  }
});
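The branching in the handler can also be pulled out into a small pure function, which makes it easy to check without sending real webhooks. This is only a sketch: `incidentState` and `shouldCall` are hypothetical names that are not part of the original index.js, and the only fields assumed are the `acknowledged_at` and `resolved_at` attributes the handler already reads.

```javascript
// Hypothetical helper (not in the original index.js).
// Maps the attributes of an incident webhook to a lifecycle state.
function incidentState(attributes) {
  const { acknowledged_at: acknowledged, resolved_at: resolved } = attributes;
  // A resolved incident never needs a call, acknowledged or not.
  if (resolved != null) return "resolved";
  if (acknowledged != null) return "acknowledged";
  return "created";
}

// Only a freshly created incident should trigger a call from Dasha.
function shouldCall(attributes) {
  return incidentState(attributes) === "created";
}

console.log(incidentState({ acknowledged_at: null, resolved_at: null })); // created
console.log(shouldCall({ acknowledged_at: "2021-06-01T10:00:00Z", resolved_at: null })); // false
```

Inside the handler, `if (shouldCall(req.body.data.attributes)) await calldasha(incidentId);` would then replace the if/else chain.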
async function calldasha(incidentId) {
  const app = await dasha.deploy("./app");

  // external functions begin
  // external functions are called from your Dasha conversation in the body of main.dsl file
  // external functions can be used for calculations, data storage, in this case, to
  // call external services with HTTPS requests. You can call an external function from DSL
  // in your node.js file and have it do literally anything you can do with Node.js.
  // external functions end

  await app.start();

  const conv = app.createConversation({ phone: process.env.phone, name: process.env.name });
  conv.audio.tts = "dasha";

  if (conv.input.phone === "chat") {
    await dasha.chat.createConsoleChat(conv);
  } else {
    // register the transcription logger once
    conv.on("transcription", console.log);
  }

  const result = await conv.execute();
  console.log(result.output);

  // create directory to save transcriptions
  fs.mkdirSync("transcriptions", { recursive: true });
  const transcription = JSON.stringify(result.transcription);

  // save the transcript of the conversation in a file
  // or you can upload incident transcriptions to your incident management system here
  fs.writeFileSync("transcriptions/" + (incidentId ?? "test") + ".log", transcription);

  // and email it to yourself
  // note: "${speaker}" here is json2html template syntax, not a JavaScript template literal
  const transcript = json2html.render(transcription, {
    "<>": "li",
    "html": [{ "<>": "span", "text": "${speaker} at ${startTime}: ${text} " }]
  });
  sendemail(transcript);

  await app.stop();
  app.dispose();
}
node index.js
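Before pointing Better Uptime at this server, it can help to post a fake incident to it locally. The sketch below is an assumption-laden test aid, not part of the original project: `buildTestIncident` and `sendTestIncident` are hypothetical helpers, and the payload contains only the fields the /hook handler reads (`data.id`, `data.attributes.acknowledged_at`, `data.attributes.resolved_at`). A real Better Uptime payload carries more.

```javascript
const http = require("http");

// Build a minimal payload with only the fields the /hook handler reads.
// Passing the id as a string is an assumption about the payload format.
function buildTestIncident(id, { acknowledged = null, resolved = null } = {}) {
  return {
    data: {
      id: String(id),
      attributes: { acknowledged_at: acknowledged, resolved_at: resolved },
    },
  };
}

// POST the payload to the local webhook server (port 1919 in this tutorial).
// Resolves with the HTTP status code of the response.
function sendTestIncident(payload, port = 1919) {
  const body = JSON.stringify(payload);
  return new Promise((resolve, reject) => {
    const req = http.request(
      {
        host: "127.0.0.1",
        port,
        path: "/hook",
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Content-Length": Buffer.byteLength(body),
        },
      },
      (res) => {
        res.resume(); // drain the response so the socket is released
        res.on("end", () => resolve(res.statusCode));
      }
    );
    req.on("error", reject);
    req.end(body);
  });
}
```

With index.js running, `sendTestIncident(buildTestIncident("12345")).then(console.log)` should behave like a newly created incident, which means Dasha will actually place a call. Passing `{ acknowledged: new Date().toISOString() }` as the second argument to `buildTestIncident` exercises the "acknowledged" branch without triggering one.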
5. Build a Dasha conversational app that will call you when an incident is created
context {
    input phone: string;
    input name: string;
}

// declare external functions here
external function acknowledge(): string;
external function resolve(): string;
external function getstatusof(what: string?): string;

start node root {
    do {
        #connectSafe($phone);
        wait *;
    }
    transitions {
        hello: goto hello on true;
    }
}

node hello {
    do {
        #sayText("Hello " + $name + "! This is Dasha calling you regarding your website. There has been an incident. ");
        #sayText("You can acknowledge or resolve the incident right on the call with me. ");
        #sayText(" Please note, I will listen and take notes until you mention that you are ready to resolve or acknowledge. ", interruptible:true);
        wait *;
    }
    transitions {
    }
}
- Acknowledge the incident.
- Resolve the incident.
- Ignore the incident.
- Ask Dasha to wait while the user thinks something over or looks something up.
- Ask Dasha to repeat what she last said.
- Easter egg digression "oops".
- Journal digression - tells Dasha not to react unless an intent is identified. This lets us passively record any notes the user makes to themselves while resolving the incident.
- Get status of vital services. *
Creating the data set to train the Dasha neural network
{
  "version": "v2",
  "intents": {
    "yes": {
      "includes": [ "yes", "sure", "yep", "I confirm", "confirmed", "I do", "yeah", "that's right" ],
      "excludes": []
    },
    "no": { "includes": [ ] },
    "repeat": { "includes": [ ] },
    "acknowledge": {
      "includes": [
        "I acknowledge the incident",
        "I can acknowledge the incident",
        "I do acknowledge the incident",
        "acknowledge",
        "acknowledge please",
        "incident acknowledged"
      ]
    },
    "resolve": { "includes": [ ] },
    "ignore": { "includes": [ ] },
    "oops": { "includes": [ ] },
    "wait": { "includes": [ ] },
    "status": {
      "includes": [
        "What is the status of (kubernetes)[statusentity]",
        "Dasha, what's the status of (kubernetes)[statusentity] and (TLS)[statusentity]",
        "What's the status of (kubernetes)[statusentity]",
        "Tell me about the status (healthcheck)[statusentity]",
        "Give me an update on the status of (healthcheck)[statusentity]",
        "Status (healthcheck)[statusentity] and (TLS)[statusentity]",
        "Dasha, let's look at the status of (TLS)[statusentity]"
      ]
    }
  },
  "entities": {
    "statusentity": {
      "open_set": false,
      "values": [
        {
          "value": "kubernetes",
          "synonyms": [ "Kubernetes cluster", "cooper netease", "kubernetes", "Kubernetes instances", "for burnett", "Kubernetes deploy" ]
        },
        {
          "value": "TLS",
          "synonyms": [ "SSL", "TLS/SSL", "TLS certificate", "certificate", "SSL certificate" ]
        },
        {
          "value": "healthcheck",
          "synonyms": [ "site healthchecks", "health check ", "site health checks", "health checks" ]
        }
      ],
      "includes": []
    }
  }
}
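Since the phrases annotate entities inline as (value)[entity], a typo in an entity name is easy to miss. The following is a rough sanity check, not a Dasha tool: `findUndeclaredEntities` is a hypothetical helper, and the `broken` intent in the sample is made up purely to show a mismatch.

```javascript
// Hypothetical helper: list entity names that are annotated in intent phrases
// as "(value)[entity]" but are missing from the "entities" section.
function findUndeclaredEntities(dataset) {
  const declared = new Set(Object.keys(dataset.entities || {}));
  const missing = new Set();
  const annotation = /\(([^)]+)\)\[([^\]]+)\]/g;
  for (const intent of Object.values(dataset.intents || {})) {
    for (const phrase of intent.includes || []) {
      for (const match of phrase.matchAll(annotation)) {
        // match[1] is the annotated value, match[2] is the entity name
        if (!declared.has(match[2])) missing.add(match[2]);
      }
    }
  }
  return [...missing];
}

const sample = {
  intents: {
    status: { includes: ["What is the status of (kubernetes)[statusentity]"] },
    broken: { includes: ["Restart (nginx)[service]"] }, // made-up intent for illustration
  },
  entities: { statusentity: { open_set: false, values: [] } },
};

console.log(findUndeclaredEntities(sample)); // [ 'service' ]
```

Running it against the parsed contents of your own data file should print an empty array when every annotation points at a declared entity.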
Writing the digressions in main.dsl
// acknowledge flow begins
digression acknowledge {
    conditions { on #messageHasIntent("acknowledge"); }
    do {
        #sayText("Can you please confirm that you want me to acknowledge the incident?");
        wait *;
    }
    transitions {
        acknowledge: goto acknowledge_2 on #messageHasIntent("yes");
        donotacknowledge: goto waiting on #messageHasIntent("no");
    }
}

node acknowledge_2 {
    do {
        external acknowledge();
        #sayText("Got it. I have set the status in Better Uptime as acknowledged. The next step is to resolve the incident.");
        wait *;
    }
    transitions {
    }
}

node waiting {
    do {
        #sayText("Okay. I will wait for your instructions then. ");
        wait *;
    }
}
Getting your node.js external functions to call external APIs
// external functions are called from your Dasha conversation in the body of main.dsl file
// external functions can be used for calculations, data storage, in this case, to
// call external services with HTTPS requests. You can call an external function from DSL
// in your node.js file and have it do literally anything you can do with Node.js.

// External function. Acknowledge an incident in Better Uptime by posting over HTTPS
app.setExternal("acknowledge", (args, conv) => {
  // this keeps the code from throwing an error if we are testing with blank data
  // (== null also catches undefined, which is what incidentId is when no incident was passed)
  if (incidentId == null) return;

  const config = {
    // remember to set your betteruptimetoken in .env
    headers: { Authorization: "Bearer " + process.env.betteruptimetoken }
  };
  const bodyParameters = { key: "value" };

  axios.post("https://betteruptime.com/api/v2/incidents/" + incidentId + "/acknowledge", bodyParameters, config)
    .then(console.log)
    .catch(console.log);
});

// External function. Resolve an incident in Better Uptime by posting over HTTPS
app.setExternal("resolve", (args, conv) => {
  if (incidentId == null) return;

  const config = {
    headers: { Authorization: "Bearer " + process.env.betteruptimetoken }
  };
  const bodyParameters = { key: "value" };

  axios.post("https://betteruptime.com/api/v2/incidents/" + incidentId + "/resolve", bodyParameters, config)
    .then(console.log)
    .catch(console.log);
});

// external function getting status of additional services
app.setExternal("getstatusof", (args, conv) => {
  switch (args.what) {
    case "kubernetes":
      return "Kubernetes is up and running";
    case "healthcheck":
      return "Site health checks are not responding";
    case "TLS":
      return "TLS Certificate is active";
    default:
      // fallback so Dasha always has a string to say
      return "I do not have a status for " + args.what;
  }
});
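The acknowledge and resolve functions differ only in the last URL segment, so the request details can be built in one place. The sketch below is one possible way to do that. `incidentRequest` is a hypothetical helper rather than part of the Dasha SDK or the Better Uptime API, and it reuses only the URL pattern and Bearer header shown in the functions above.

```javascript
// Hypothetical helper: build the URL and axios config for an incident action.
const BASE_URL = "https://betteruptime.com/api/v2/incidents/";

function incidentRequest(incidentId, action, token) {
  // nothing to send when testing without a real incident
  if (incidentId == null) return null;
  if (action !== "acknowledge" && action !== "resolve") {
    throw new Error("Unsupported action: " + action);
  }
  return {
    url: BASE_URL + encodeURIComponent(incidentId) + "/" + action,
    config: { headers: { Authorization: "Bearer " + token } },
  };
}

console.log(incidentRequest("12345", "resolve", "my-token").url);
// https://betteruptime.com/api/v2/incidents/12345/resolve
```

An external function could then do `const r = incidentRequest(incidentId, "acknowledge", process.env.betteruptimetoken); if (r) axios.post(r.url, {}, r.config);` and keep the same `.then`/`.catch` logging as before.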
Additional digressions in your Dasha app
// get status of vital services
digression status {
    conditions { on #messageHasIntent("status") && #messageHasData("statusentity"); }
    do {
        for (var e in #messageGetData("statusentity")) {
            var result = external getstatusof(e.value);
            #sayText(result);
        }
        return;
    }
}
// additional digressions
digression @wait {
    conditions { on #messageHasAnyIntent(digression.@wait.triggers) priority 900; }
    var triggers = ["wait", "wait_for_another_person"];
    var responses: Phrases[] = ["i_will_wait"];
    do {
        for (var item in digression.@wait.responses) {
            #say(item, repeatMode: "ignore");
        }
        #waitingMode(duration: 70000);
        return;
    }
    transitions {
    }
}

// this digression tells Dasha to only respond to user replies that trigger an intent
// this is a very helpful little piece of code for our particular use case because
// the user might talk to themselves as they are resolving the incident
// everything the user says to themselves is logged (thus: journal) in the transcript
// which can then be appended to the incident report
digression journal {
    conditions { on true priority -1; }
    do {
        return;
    }
}

digression repeat {
    conditions { on #messageHasIntent("repeat"); }
    do {
        #repeat();
        return;
    }
}

digression oops {
    conditions { on #messageHasIntent("oops"); }
    do {
        #sayText("What happened " + $name + "? Did you use the wrong terminal again?");
        return;
    }
}
6. Set up nodemailer to email yourself a transcript of the conversation
function sendemail(transcript) {
  const nodemailer = require('nodemailer');
  require('dotenv').config();

  var transporter = nodemailer.createTransport({
    service: 'gmail',
    auth: {
      // be sure to specify the credentials in your .env file
      user: process.env.gmailuser,
      pass: process.env.gmailpw
    }
  });

  var mailOptions = {
    from: process.env.gmailuser,
    to: process.env.sendto,
    subject: 'Incident conversation transcript',
    html: '<h2>Conversation transcript:</h2><p>' + transcript + '</p>'
  };

  transporter.sendMail(mailOptions, function(error, info) {
    if (error) {
      console.log(error);
    } else {
      console.log('Email sent: ' + info.response);
    }
  });
}
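Because the transcript contains whatever was said on the call and ends up inside an HTML email, it may be worth escaping it before sending. The sketch below shows one way to build the list by hand. `escapeHtml` and `renderTranscript` are hypothetical helpers, and the `speaker`, `startTime` and `text` fields are assumed from the json2html template used in calldasha.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render transcript entries as an HTML list, one <li> per utterance.
function renderTranscript(entries) {
  const items = entries.map(
    (e) =>
      `<li><span>${escapeHtml(e.speaker)} at ${escapeHtml(e.startTime)}: ${escapeHtml(e.text)}</span></li>`
  );
  return `<ul>${items.join("")}</ul>`;
}

console.log(renderTranscript([{ speaker: "human", startTime: "10:00", text: "restart <nginx>" }]));
// <ul><li><span>human at 10:00: restart &lt;nginx&gt;</span></li></ul>
```

The returned string can be passed to `sendemail` in place of the json2html output, provided `result.transcription` is an array of such entries.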
7. Set up .env
betteruptimetoken =
name =
phone =
sendto =
gmailuser =
gmailpw =
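A missing value in .env tends to surface late, for example as a failed email after the call has already happened. A small startup check along these lines can catch it earlier. This is a sketch: `missingEnv` is a hypothetical helper, and the key list simply mirrors the variables in the .env file above.

```javascript
// Variable names taken from the .env file above.
const REQUIRED = ["betteruptimetoken", "name", "phone", "sendto", "gmailuser", "gmailpw"];

// Return the required keys that are absent or blank in the given environment object.
function missingEnv(env, required = REQUIRED) {
  return required.filter((key) => !env[key] || String(env[key]).trim() === "");
}

console.log(missingEnv({ name: "Ann", phone: "chat" }));
// [ 'betteruptimetoken', 'sendto', 'gmailuser', 'gmailpw' ]
```

In index.js this could run right after `require("dotenv").config()`: call `missingEnv(process.env)` and log the result and exit if the array is not empty.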
8. Set up a tunnelling service to your localhost servers
node helloworld.js
node index.js
./ngrok http 3000
./ngrok http 1919
9. Set up the monitors and webhooks on Better Uptime
- 12345.ngrok.io - localhost: 3000 (helloworld.js) (the server we are monitoring)
- 67890.ngrok.io - localhost: 1919 (index.js) (the server which catches webhooks)