
Fun with Theresa May’s first Speech as PM and the Google Cloud Speech and Natural Language APIs

It’s not often you get brand new PM Theresa May and fun in the same sentence, but when I learnt yesterday that Google was launching their Cloud Speech and Natural Language APIs, I knew right away what would make a good proof of concept: Ms May’s inaugural speech.

If you are wondering what Cloud Speech API and Natural Language APIs are, they are basically Machine Learning models trained by Google and offered as a service, so you can just send your audio files to Speech API and get back a transcript. And once you have the transcript, you can use the Natural Language API to get a list of names, organisations, and locations mentioned in the text. Not only that, you can also run sentiment analysis on the text and even get back a lot of syntax/semantics information in case you want to dig deeper.

Got your attention? Cool. Let me show you what I did. I took the video from https://www.youtube.com/watch?v=FDyZ8trge2E and extracted the audio. To use the Cloud Speech API, your audio must be in a particular format, so I converted it with the following command line:


sox Theresa_May_First_speech_as_Prime_Minister_BBC_News.flac --rate 16k --bits 16 --channels 1 theresa_may_mp_extract.flac trim 01:30 =02:12

Note that I am extracting just my 42 favourite seconds of the audio.

I upload my file to Google Cloud Storage and, having set up my cloud account previously, I can already run this:


curl -s -k -H "Content-Type: application/json"  -H "Authorization: Bearer ya29.Ci8nAzIo94pFL-CPsdRgLH1qkIM4Zo_Iq-getfRDJJ41ZVvpK9aFODnwaXPuzYXmzw" https://speech.googleapis.com/v1beta1/speech:syncrecognize  -d @audio_may.json
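The audio_may.json file posted by the command above is not reproduced in this post. As a minimal sketch, assuming the v1beta1 request shape (a "config" object with "encoding", "sampleRate" and "languageCode", plus an "audio" object pointing at a Cloud Storage URI), it could be generated like this. The bucket name and the "en-GB" language code are assumptions for illustration, not values taken from the original request.

```python
import json


def build_speech_request(gcs_uri, sample_rate=16000, language_code="en-GB"):
    """Build a request body for speech:syncrecognize (v1beta1 field names assumed)."""
    return {
        "config": {
            "encoding": "FLAC",            # matches the .flac file produced by sox
            "sampleRate": sample_rate,     # matches the --rate 16k option
            "languageCode": language_code,
        },
        "audio": {"uri": gcs_uri},         # file previously uploaded to Cloud Storage
    }


# Hypothetical bucket name; replace it with your own.
request_body = build_speech_request(
    "gs://my-example-bucket/theresa_may_mp_extract.flac"
)
print(json.dumps(request_body, indent=2))
```

Redirecting the printed output to audio_may.json would give curl the body it sends with -d @audio_may.json.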

The Speech API sends back this beautiful JSON:


{
"results": [
{
"alternatives": [
{
"transcript": "that means fighting against the burning injustice but if you're born poor you will die on average 9 years earlier than others if you're black you're treated more harshly by the criminal justice system then if your weight is there a white working class boy you're less likely than anybody else in Britain to go to university is right Estate school you're less likely to reach the top professions then if your educated privately if you're a woman you will earn less than a man if you suffer from mental health problem there's not enough help to hand if you're young you'll find it harder than ever before to own your own home",
"confidence": 0.935612
}
]
}
]
}

Not bad at all. Except for the missing punctuation marks and a few minor problems (such as "then" instead of "than"), everything looks hunky dory. Great diction, Theresa!
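To pull the transcript out of a response like the one above programmatically, here is a small sketch. It assumes only the structure visible in that JSON ("results", each with a list of "alternatives" carrying "transcript" and "confidence"); the function name is made up for illustration, and the sample transcript is shortened.

```python
import json

# Shortened copy of the response shown above.
response_text = """
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "that means fighting against the burning injustice",
          "confidence": 0.935612
        }
      ]
    }
  ]
}
"""


def best_transcripts(response):
    """Return (transcript, confidence) for the top alternative of each result."""
    picked = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        # Pick the alternative with the highest reported confidence.
        best = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
        picked.append((best["transcript"], best.get("confidence")))
    return picked


pairs = best_transcripts(json.loads(response_text))
for transcript, confidence in pairs:
    print(confidence, transcript)
```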

Let’s send this text to the Cloud Natural Language API and see what it says. To do so, I need to prepare a simple JSON with the content and the type of processing I want to do:


{
"document":{
"type":"PLAIN_TEXT",
"content":"that means fighting against the burning injustice but if you're born poor you will die on average 9 years earlier than others if you're black you're treated more harshly by the criminal justice system then if your weight is there a white working class boy you're less likely than anybody else in Britain to go to university is right Estate school you're less likely to reach the top professions then if your educated privately if you're a woman you will earn less than a man if you suffer from mental health problem there's not enough help to hand if you're young you'll find it harder than ever before to own your own home"
},
"features": {
"extractDocumentSentiment": true,
"extractEntities": true
}

}

I am asking only for entities and sentiment. I don’t care about the syntax of this particular text.

And I run this command:
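Writing that request by hand gets fiddly once the text contains quotes or line breaks. A minimal sketch of building it in code, so the JSON escaping is handled for you, could look like this. The function name is hypothetical, and "extractSyntax" is my assumption for the name of the syntax feature flag, since only the other two flags appear in the request above.

```python
import json


def build_annotate_request(text, entities=True, sentiment=True, syntax=False):
    """Build a request body for documents:annotateText from plain text."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "features": {
            "extractSyntax": syntax,                  # assumed flag name
            "extractEntities": entities,
            "extractDocumentSentiment": sentiment,
        },
    }


body = build_annotate_request('a transcript with "quotes" inside')
encoded = json.dumps(body, indent=2)  # json.dumps escapes the inner quotes
print(encoded)
```

The printed output could be saved as nlp_may.json and posted as is.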


curl -s -k -H "Content-Type: application/json"  -H "Authorization: Bearer ya29.Ci8nA8vXgbR9VQ9NtXbhpj3Nz1oLwzShqn0gt0Ts2RnrCR-UDHg9W-mynn_WERa_9Q" https://language.googleapis.com/v1beta1/documents:annotateText  -d @nlp_may.json

As a result, I get this JSON back:


{
"sentences": [],
"tokens": [],
"entities": [
{
"name": "Britain",
"type": "LOCATION",
"metadata": {
"wikipedia_url": "http://en.wikipedia.org/wiki/United_Kingdom"
},
"salience": 0.010285536,
"mentions": [
{
"text": {
"content": "Britain",
"beginOffset": -1
}
}
]
}
],
"documentSentiment": {
"polarity": -1,
"magnitude": 0.7
},
"language": "en"
}

So, it seems Ms May mentioned Britain, and the sentiment of that extract was very negative, as the polarity of -1 shows. The magnitude, which measures the strength of the sentiment, is quite low at only 0.7. Dear Theresa, why so sad? Of course I chose an isolated extract, but if I take the full text as seen here things will surely change.
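Reading the response by eye works for one entity, but not for a whole speech. Here is a minimal sketch of summarising it in code, assuming only the fields visible in the JSON above. The negative/neutral/positive labels are my own simplification of the polarity score, not something the API returns.

```python
# Trimmed copy of the response shown above.
response = {
    "entities": [
        {
            "name": "Britain",
            "type": "LOCATION",
            "metadata": {
                "wikipedia_url": "http://en.wikipedia.org/wiki/United_Kingdom"
            },
            "salience": 0.010285536,
        }
    ],
    "documentSentiment": {"polarity": -1, "magnitude": 0.7},
    "language": "en",
}


def summarize(response):
    """Return (entities, sentiment label, magnitude) from an annotateText response."""
    entities = [
        (e["name"], e["type"], e.get("metadata", {}).get("wikipedia_url"))
        for e in response.get("entities", [])
    ]
    sentiment = response.get("documentSentiment", {})
    polarity = sentiment.get("polarity", 0)
    if polarity < 0:
        label = "negative"
    elif polarity > 0:
        label = "positive"
    else:
        label = "neutral"
    return entities, label, sentiment.get("magnitude", 0.0)


entities, label, magnitude = summarize(response)
print(entities, label, magnitude)
```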

I change my JSON request to send the full text, and what I get now is much more interesting. The list of entities is much more complete and now includes The European Union, Northern Ireland, Scotland, Wales, Great Britain, Her Majesty the Queen (Hi Elizabeth!), David Cameron, The Conservative and Unionist Party, and Buckingham Palace. Each of those entities comes with a link to the corresponding Wikipedia page. I kid you not. Just wow!

And what about sentiment? Well, as expected, when you take the full document the outcome is more positive (0.1 on a scale from -1 to 1) and the strength is clearly higher, at a magnitude of 9.7.

I took a video, transcribed it, automatically extracted entities of different types with links to their corresponding pages and could guess the sentiment and strength of the speech. I never suspected Brexit could be this fun.

I can already think of a million interesting applications that would benefit from something like this.

And don’t get me started about the Cloud Vision API… but that might be for another occasion.

** All the JSON requests and responses are available at https://gist.github.com/javier/c3e0075f8077be87bb157471847211e2

And remember, if you need professional help with anything Google Cloud Platform or big data related, teowaki is always happy to help.

Javier Ramírez. Co-founder of Teowaki. Google Developer Expert and Google Authorized Trainer for the Google Cloud Platform.

Teowaki puts you on the map

Time flies when you are having fun. It has already been one month since we announced our public launch, and we have been busy adding a lot of small things to make teowaki even better for you.

After our Xmas break, we started the year by improving our search engine and adding individual pages for links, shouts and jesters. Then we added the “personas” feature to your profile, so you can let everybody know your different online identities. In the meantime, we got the opportunity to speak to local developer communities in Zaragoza and London, sharing with them the technologies we are using at teowaki.

And today we are proud to announce our first geolocation features. You can now add your location to your profile, so other users can see where you are based. This is the cornerstone for the rest of our geospatial functionalities. In a few weeks you will be able to filter your search results by proximity (search for people or teams close to you) or send shouts to users in a given area.
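To give an idea of what filtering by proximity involves, here is a minimal sketch that uses the haversine great-circle distance. It is only an illustration: it is not teowaki's actual implementation, and the function names, coordinates and 300 km radius are made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(origin, places, radius_km):
    """Return the names of the places no further than radius_km from origin."""
    return [
        name
        for name, lat, lon in places
        if haversine_km(origin[0], origin[1], lat, lon) <= radius_km
    ]


# Approximate city-centre coordinates, for illustration only.
places = [
    ("Zaragoza", 41.6488, -0.8891),
    ("London", 51.5074, -0.1278),
    ("Madrid", 40.4168, -3.7038),
]
nearby = within_radius((41.6488, -0.8891), places, 300)
print(nearby)
```

A real service with many users would more likely lean on a geospatial index in its data store than compute every distance in application code.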

How does it work?

When you visit your profile settings, your browser will ask for permission to use your computer’s location. Unless you allow your browser to pass your information to teowaki, we won’t be able to guess your current city and country automatically.

browser_permissions

A note on your privacy: We know it is technically possible to try and guess your current location using other techniques, like checking your IP against a database, but we think you as a user should have the last word on whether you want us to geolocate you or not. We won’t try to guess any geospatial information about you unless you specifically allow us to do so.*

Once you allow us to guess your location, teowaki will show your position on a map. In the rare cases where we can’t automatically locate you, or if our guess is wrong, you can enter your city and country in the location text box and we will map it.

settings_map_and_flag

Even if you want us to keep your location, so we can use it for proximity searches, you can still uncheck the option to share your location publicly. In that case, we will store your location on our servers and use it internally, but we will never disclose it to other users. You can switch this check on and off as many times as you want. You are in total control of what is shared about you and when.

Don’t forget to use the “Update” button to save your settings.

What does it look like?

Once you enter your location and give your permission, the name of your base location and a small map will be displayed on your profile.

user_public_profile

Feedback

Is there any way we can make this better for you? Let us know at hello@teowaki.com

* Our analytics backend geolocates every request we get by analyzing its IP address. This is done as a separate process and we don’t associate this information with your user; we record it anonymously, for statistical purposes, for every request that hits our servers.
