The purpose of this document is to share the Emotion Identification API specification so that potential GoVivace customers can test their integration. The contents of this document are GoVivace proprietary and subject to change.
This API exposes emotion identification routines as a RESTful web service. The emotion identification process assumes that the input audio is an 8 kHz, 16-bit linear PCM file. If WAV format is used, the first 44 bytes (the WAV header) are simply treated as audio, which has been found to work fine.
The emotion identification service accepts POST requests at the specified URI, with the audio in the body of the message. For example, with curl, the command is:
curl --request POST --data-binary @sample1.wav "https://services.govivace.com:7687/EmotionId?action=identify&format=8K_PCM16&key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
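You need to give three query parameters: action (the operation to perform, here identify), format (the audio format, here 8K_PCM16), and key (your authentication key).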
Here, sample1.wav is an 8 kHz, 16-bit linear PCM audio file. The body of the POST contains the entire audio file in that format.
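The same POST request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names build_identify_request and identify_emotion are hypothetical, and the sketch assumes the server accepts the default Content-Type that urllib sets for a request body (the same default curl uses with --data-binary).

```python
import urllib.parse
import urllib.request

# Endpoint taken from the curl example in this document.
BASE_URL = "https://services.govivace.com:7687/EmotionId"


def build_identify_request(audio: bytes, key: str,
                           audio_format: str = "8K_PCM16") -> urllib.request.Request:
    """Build (but do not send) the POST request for emotion identification."""
    query = urllib.parse.urlencode(
        {"action": "identify", "format": audio_format, "key": key}
    )
    # The raw audio bytes go in the request body, as with curl --data-binary.
    return urllib.request.Request(
        url=f"{BASE_URL}?{query}", data=audio, method="POST"
    )


def identify_emotion(path: str, key: str) -> str:
    """Send the audio file at `path` and return the response body as text."""
    with open(path, "rb") as audio_file:
        audio = audio_file.read()
    request = build_identify_request(audio, key)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling identify_emotion("sample1.wav", key) with a valid key would return the server's response text. Building the request in a separate function makes it possible to inspect the URL and body without contacting the server.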
(For WebSocket API)
After the last block of speech data, a special 3-byte ANSI-encoded string “EOS” (“end-of-stream”) needs to be sent to the server. This tells the server that no more speech is coming.
After sending “EOS”, the client has to keep the WebSocket open to receive the result from the server. The server closes the connection itself when results have been sent to the client. No more audio can be sent via the same WebSocket after an “EOS” has been sent. In order to process a new audio stream, a new WebSocket connection has to be created by the client.
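The streaming sequence described above (audio blocks followed by the 3-byte "EOS" marker) can be sketched as follows. This is an illustration under stated assumptions, not the actual client.py: the names iter_blocks and stream_audio are hypothetical, the block size and pacing are arbitrary choices, and `send` stands for whatever binary-send function your WebSocket library provides.

```python
import time
from typing import Callable, Iterator

EOS = b"EOS"  # 3-byte end-of-stream marker sent after the last audio block


def iter_blocks(audio: bytes, block_size: int) -> Iterator[bytes]:
    """Yield consecutive blocks of at most `block_size` bytes."""
    for start in range(0, len(audio), block_size):
        yield audio[start:start + block_size]


def stream_audio(send: Callable[[bytes], None], audio: bytes,
                 rate: int = 8000, blocks_per_second: int = 4,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Send `audio` in paced blocks, then the EOS marker.

    `rate` is in bytes per second. Returns the number of audio blocks sent
    (the EOS message is not counted).
    """
    block_size = max(1, rate // blocks_per_second)
    blocks_sent = 0
    for block in iter_blocks(audio, block_size):
        send(block)
        blocks_sent += 1
        # Pace the stream so that roughly `rate` bytes are sent per second.
        sleep(len(block) / rate)
    # No more audio may be sent on this connection after EOS.
    send(EOS)
    return blocks_sent
```

After stream_audio returns, the client should keep the connection open and read until the server delivers the result and closes the connection. A new connection is needed for each audio stream.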
python client.py --uri "wss://services.govivace.com:7687/EmotionId?action=identify&format=8K_PCM16&key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" --save-json-filename sample1_emotion.json --rate 8000 sample1.wav
Options for the Python client:
➢ --save-json-filename: Save the intermediate JSON to the specified file
➢ --rate: Rate in bytes/sec at which audio should be sent to the server
➢ --uri: Server WebSocket URI
➢ --action: Action to perform, such as identify
➢ --key: Authentication key
➢ --file_format: File format (default is 8K_PCM16)
"message": "Emotion identification is successful",
The server sends emotion identification results and other information to the client in JSON format. The response can contain the following fields:
● status: response status (integer), see codes below
● message: status message
● processing_time: total amount of time spent on the server side to process the audio
● identified_emotion: the identified emotion (neutral, angry, sad, happy, or dominant)
● emotion_score: confidence score of the identified emotion
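A client might read these fields from the response as in the sketch below. The function name parse_emotion_response is hypothetical, and the sample payload is made up for illustration: only the field names and the message text come from this document, while the status, processing_time, identified_emotion, and emotion_score values are placeholders, not real server output.

```python
import json

# Field names as listed in this document.
KNOWN_FIELDS = (
    "status",
    "message",
    "processing_time",
    "identified_emotion",
    "emotion_score",
)


def parse_emotion_response(text: str) -> dict:
    """Parse a JSON response and return the documented fields.

    Fields missing from the response are returned as None, since the
    response is only said to possibly contain each field.
    """
    data = json.loads(text)
    return {name: data.get(name) for name in KNOWN_FIELDS}


# Illustrative payload only: all values except the message are placeholders.
sample = json.dumps({
    "status": 0,
    "message": "Emotion identification is successful",
    "processing_time": 0.5,
    "identified_emotion": "neutral",
    "emotion_score": 0.9,
})

result = parse_emotion_response(sample)
print(result["identified_emotion"], result["emotion_score"])
```

Using data.get for every field keeps the parser tolerant of responses that omit some fields, for example when the request fails.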
The following status codes are currently in use: