Fun with Theresa May’s first Speech as PM and the Google Cloud Speech and Natural Language APIs

It’s not often you get brand new PM Theresa May and fun in the same sentence, but when I learnt yesterday that Google was launching its Cloud Speech and Natural Language APIs, I knew right away what would make a good proof of concept: Ms May’s inaugural speech.

If you are wondering what the Cloud Speech and Natural Language APIs are, they are basically machine learning models trained by Google and offered as a service. You send your audio files to the Speech API and get back a transcript. Once you have the transcript, you can use the Natural Language API to get a list of the names, organisations, and locations mentioned in the text. Not only that, you can also run sentiment analysis on the text and even get back a lot of syntax/semantics information in case you want to dig deeper.

Got your attention? Cool. Let me show you what I did. I took the video from https://www.youtube.com/watch?v=FDyZ8trge2E and extracted the audio. The Cloud Speech API expects audio in a particular format, so I converted it with the following command:


sox Theresa_May_First_speech_as_Prime_Minister_BBC_News.flac --rate 16k --bits 16 --channels 1 theresa_may_mp_extract.flac trim 01:30 =02:12

Note that I am extracting just my 42 favourite seconds of the audio (from 01:30 to 02:12).

I upload my file to Google Cloud Storage and, having set up my cloud account previously, I can already run this:


curl -s -k -H "Content-Type: application/json"  -H "Authorization: Bearer ya29.Ci8nAzIo94pFL-CPsdRgLH1qkIM4Zo_Iq-getfRDJJ41ZVvpK9aFODnwaXPuzYXmzw" https://speech.googleapis.com/v1beta1/speech:syncrecognize  -d @audio_may.json
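The contents of audio_may.json are not shown here. As a rough sketch only, and assuming the v1beta1 request shape (a `config` object with `encoding`, `sampleRate` and `languageCode`, plus an `audio` object pointing at a Cloud Storage URI), the body could be generated with a few lines of Python. The bucket name `my-bucket` and the `en-GB` language code are placeholders of mine, and the field names should be checked against the API version you are calling.

```python
import json


def build_speech_request(gcs_uri, sample_rate=16000, language="en-GB"):
    """Build a syncrecognize request body (field names assumed from v1beta1)."""
    return {
        "config": {
            "encoding": "FLAC",         # matches the sox output format
            "sampleRate": sample_rate,  # matches --rate 16k
            "languageCode": language,   # assumption: British English
        },
        "audio": {"uri": gcs_uri},      # file previously uploaded to Cloud Storage
    }


# Hypothetical bucket name; replace it with your own.
body = build_speech_request("gs://my-bucket/theresa_may_mp_extract.flac")
print(json.dumps(body, indent=2))
```

Redirecting the printed output to audio_may.json gives a file that the curl command can send with `-d @audio_may.json`.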

I get back this beautiful JSON:


{
"results": [
{
"alternatives": [
{
"transcript": "that means fighting against the burning injustice but if you're born        poor        you will die on average 9 years earlier than others if you're black you're treated more harshly by the criminal justice system then if your weight is there a white working class boy you're less likely than anybody else in Britain to go to university is right Estate school you're less likely to reach the top professions then if your educated privately if you're a woman you will earn less than a man if you suffer from mental health problem there's not enough help to hand if you're young you'll find it harder than ever before to own your own home",
"confidence": 0.935612
}
]
}
]
}

Not bad at all. Except for the missing punctuation marks and a few minor problems (then instead of than), everything looks hunky-dory. Great diction, Theresa!
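If you would rather pull the transcript out of that response programmatically than copy and paste it, something along these lines should do. It is a minimal sketch that assumes the response shape shown above (`results`, each with a list of `alternatives`); the helper name `best_transcript` is mine. It keeps the highest-confidence alternative of each result and collapses any stray runs of whitespace.

```python
def best_transcript(response):
    """Join the highest-confidence alternative of every result into one string."""
    parts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        top = max(alternatives, key=lambda a: a.get("confidence", 0.0))
        # split() + join() collapses repeated spaces inside the transcript
        parts.append(" ".join(top["transcript"].split()))
    return " ".join(parts)


# Shortened version of the response above, for illustration.
sample = {
    "results": [
        {
            "alternatives": [
                {
                    "transcript": "if you're born        poor        you will die",
                    "confidence": 0.935612,
                }
            ]
        }
    ]
}
print(best_transcript(sample))
```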

Let’s send this text to the Cloud Natural Language API and see what it says. To do so, I need to prepare a simple JSON with the content and the type of processing I want to do:


{
"document":{
"type":"PLAIN_TEXT",
"content":"that means fighting against the burning injustice but if you're born poor you will die on average 9 years earlier than others if you're black you're treated more harshly by the criminal justice system then if your weight is there a white working class boy you're less likely than anybody else in Britain to go to university is right Estate school you're less likely to reach the top professions then if your educated privately if you're a woman you will earn less than a man if you suffer from mental health problem there's not enough help to hand if you're young you'll find it harder than ever before to own your own home"
},
"features": {
"extractDocumentSentiment": true,
"extractEntities": true
}

}

I am asking only for entities and sentiment. I don’t care about the syntax of this particular text.
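Writing that request by hand is fine for one extract, but quotes inside a transcript need escaping, so generating it is safer. Here is a small sketch that builds the same structure as the JSON above; the function name `build_annotate_request` is just an illustration, and only the two feature flags used in this post are included.

```python
import json


def build_annotate_request(text, entities=True, sentiment=True):
    """Build an annotateText body with the same fields as the hand-written JSON."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "features": {
            "extractDocumentSentiment": sentiment,
            "extractEntities": entities,
        },
    }


request_body = build_annotate_request('she said "burning injustice"')
# json.dumps takes care of escaping the inner double quotes
print(json.dumps(request_body, indent=2))
```

Saving the printed output as nlp_may.json gives a file that can be sent with the command below.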

And I run this command:


curl -s -k -H "Content-Type: application/json"  -H "Authorization: Bearer ya29.Ci8nA8vXgbR9VQ9NtXbhpj3Nz1oLwzShqn0gt0Ts2RnrCR-UDHg9W-mynn_WERa_9Q" https://language.googleapis.com/v1beta1/documents:annotateText  -d @nlp_may.json

As a result, I get this JSON back:


{
"sentences": [],
"tokens": [],
"entities": [
{
"name": "Britain",
"type": "LOCATION",
"metadata": {
"wikipedia_url": "http://en.wikipedia.org/wiki/United_Kingdom"
},
"salience": 0.010285536,
"mentions": [
{
"text": {
"content": "Britain",
"beginOffset": -1
}
}
]
}
],
"documentSentiment": {
"polarity": -1,
"magnitude": 0.7
},
"language": "en"
}

So, it seems Ms May mentioned Britain, and the sentiment of that extract was very negative, as we can see from the polarity of -1. The strength of the sentiment is quite bland, though, at a magnitude of only 0.7. Dear Theresa, why so sad? Of course I chose an isolated extract; if I take the full text as seen here, things will surely change.
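Reading the response by eye works for a single entity, but it gets tedious quickly. As a sketch, and assuming the response keeps the shape shown above (`entities` with `name`, `type`, `salience` and `metadata.wikipedia_url`, plus `documentSentiment` with `polarity` and `magnitude`), a short helper can list the entities by salience and return the sentiment figures. The name `summarise_annotation` is mine.

```python
def summarise_annotation(response):
    """Return (entity rows, polarity, magnitude) from an annotateText response."""
    entities = sorted(
        response.get("entities", []),
        key=lambda e: e.get("salience", 0.0),
        reverse=True,  # most salient entity first
    )
    rows = [
        (e["name"], e["type"], e.get("metadata", {}).get("wikipedia_url"))
        for e in entities
    ]
    sentiment = response.get("documentSentiment", {})
    return rows, sentiment.get("polarity"), sentiment.get("magnitude")


# Trimmed copy of the response shown above.
response = {
    "entities": [
        {
            "name": "Britain",
            "type": "LOCATION",
            "metadata": {"wikipedia_url": "http://en.wikipedia.org/wiki/United_Kingdom"},
            "salience": 0.010285536,
        }
    ],
    "documentSentiment": {"polarity": -1, "magnitude": 0.7},
}

rows, polarity, magnitude = summarise_annotation(response)
for name, kind, url in rows:
    print(f"{kind}: {name} ({url})")
print(f"polarity={polarity} magnitude={magnitude}")
```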

I change my JSON request to send the full text, and what I get now is much more interesting. The list of entities is much more complete and now includes the European Union, Northern Ireland, Scotland, Wales, Great Britain, Her Majesty the Queen (Hi Elizabeth!), David Cameron, the Conservative and Unionist Party, and Buckingham Palace. Each of those entities features a link to the corresponding Wikipedia page. I kid you not. Just wow!

And what about sentiment? Well, as expected, when you take the full document the outcome is more positive (0.1 on a scale from -1 to 1) and the strength is clearly higher, at a magnitude of 9.7.

I took a video, transcribed it, automatically extracted entities of different types with links to their corresponding pages and could guess the sentiment and strength of the speech. I never suspected Brexit could be this fun.

I can only start thinking of a million interesting applications that would benefit from something like this.

And don’t get me started about the Cloud Vision API… but that might be for another occasion.

** you have all the JSON requests and responses available at https://gist.github.com/javier/c3e0075f8077be87bb157471847211e2

And remember, if you need professional help with anything Google Cloud Platform or big data related, teowaki is always happy to help.

Javier Ramírez. Co-founder of Teowaki. Google Developer Expert and Google Authorized Trainer for the Google Cloud Platform.
Announcements

Notifications Digest

A few weeks ago we launched our spin-off project Datawaki, but this doesn’t mean we are not working on improving teowaki.

Since we launched e-mail notifications, we have received lots of comments requesting digest notifications. If there is a lot of activity in your team, receiving a pile of emails every day can be annoying.

We know it well, since we get notifications ourselves every time we commit, push, or deploy changes and every time we modify any issue on our issue tracker. Even small teams like ours can get dozens of notifications a day, and we think that’s wrong.

We are happy to announce the notifications digest, which frees your inbox from unwanted messages while still giving you a summary of what’s going on.

Until now, in your Account Settings page you could only choose whether or not to receive notifications by email. Now, when you check the notifications checkbox, you can choose the frequency: individual emails (same behaviour as before), or a daily or weekly digest with all the notifications.

Your inbox will be much emptier, but you’ll still keep in touch with your developer friends. If you miss any feature in teowaki, just let us know at hello@teowaki.com. We are here to help you!

Announcements, Preview

Our spin-off project Datawaki is now public

You might have been wondering what teowaki has been up to, as it’s been a while since our last post. It turns out we had a very busy May and June, speaking at conferences in Berlin, Barcelona, Kiev, Tel Aviv and London, and attending a few other conferences, including Google I/O in San Francisco.

Every time I talked to other developers and told them about our internal analytics solution (based on BigQuery and Redis), I could see a lot of interest. Some even asked me if they could use it for their own projects. So we decided to extract that functionality out of teowaki, prepare it for public use, and make a spin-off project. I’m proud to say Datawaki is now live.

With Datawaki you can analyse anything and everything going on in your application. You only need to provide a JSON with the relevant fields for your business, and Datawaki will take care of storing it, sending real-time alerts via e-mail on important events, providing timely reports and allowing you to run interactive queries at any time. Over billions of rows. In a few seconds.

You don’t need to worry about storage, backups or allocating development resources. Just send us your events and we’ll do the rest.

As of today you can only register if you are using Heroku. We will get the JSON directly from your log file, so integration is very easy on your end. We will provide a public REST API in the future so anyone can manage their events with Datawaki.

This is just the beginning. We are now waiting for your feedback to know what you want, so we can make Datawaki better and brighter for you all. If you are a Heroku user and would like to give Datawaki a try, please send an e-mail to hello@datawaki.com and we will invite you over.

Announcements, Development

teowaki Developers Centre

You have heard me say —probably many times— teowaki has a nice, usable, RESTful API. But you had to trust me on that, because we hadn’t had the time to document it properly.

From today you don’t need to trust me anymore, since you can play with the API by yourself following the documentation we have published at teowaki Developers Centre.

If you are already familiar with REST, you can proceed directly to our API overview, or to our hypermedia documentation. If you want to know more about REST you can try the REST basic concepts tutorial. And if you like APIs, you probably want to take a look at our Developer Tools section.

A few examples from our API using cURL from the command line:

Get the public contents about redis
curl -H "accept:application/json" "https://api.teowaki.com/search?q=redis"

Get the profile of the user Ada
curl -H "accept:application/json" "https://api.teowaki.com/people/ada"

Or directly from your browser:

https://api.teowaki.com/search.json?q=redis
https://api.teowaki.com/people/ada.json
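The same two requests can be prepared from Python with nothing but the standard library. This is only a sketch: it uses the two endpoints from the cURL examples above, the helper names `search_request` and `profile_request` are mine, and the requests are built but not sent, since I am not assuming anything about what the API returns.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

API_ROOT = "https://api.teowaki.com"


def search_request(query):
    """Prepare a GET request for public contents matching the query."""
    url = f"{API_ROOT}/search?{urlencode({'q': query})}"
    return Request(url, headers={"Accept": "application/json"})


def profile_request(username):
    """Prepare a GET request for a user's public profile."""
    url = f"{API_ROOT}/people/{quote(username, safe='')}"
    return Request(url, headers={"Accept": "application/json"})


for req in (search_request("redis"), profile_request("ada")):
    print(req.get_method(), req.full_url)
```

Passing either object to `urllib.request.urlopen` would actually send it.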

If you want to access private contents, you will need to authenticate using OAuth2 first. But with the OAuth2 intro in the Developers Centre that should be a breeze.

Give it a try, and tell us about all the awesome things you are building with it at hello@teowaki.com or @teowaki. We will feature the coolest uses of the API on our blog.

news

Finalists at Startup in Action, Rome

I was in the middle of a G+ hangout with Diego yesterday when the subject of an incoming email caught my eye. I could hardly believe it. teowaki had been selected as a finalist for the Startup in Action contest.

In their own words, Startup in Action is “a great opportunity for hi-tech Startups to be known by programmers, engineers, Italian ICT companies and large international companies which will attend to Codemotion Rome”.

It turns out Codemotion is one of my favourite events, and I was already very happy because I had been confirmed as a speaker a few weeks ago. So you can imagine our excitement when we received the news from Startup in Action.

Diego will be joining me at Codemotion Rome, where we will have a mini-stand for teowaki. We hope to have a lot of interesting discussions and we are looking forward to hearing what Italian developers think of our product.

If you are attending Codemotion Rome, please come and say hi at our stand. We’ll be happy to see you… and if you are good we might even give you some sweets!

Announcements

How can I help you?

We work hard to make teowaki easy to use. But still, sometimes people ask me questions about teowaki. Things that are not really obvious, like “Do I need to pay to use teowaki?” or “with all the collaboration tools out there, why did you decide to build a new one?”.

To answer those questions (and many more) we just launched our Help Centre. At the moment we have a few entries about what teowaki is, how notifications work, how to use your account settings, what integrations are available, and general details about payments and price plans.

We will be adding more sections in the coming weeks. If you think there is something important missing, please contact hello@teowaki.com and let us know.

Announcements, Development

Announcing our engineering blog

Teowaki is a tool for developers. And it’s made by developers. So it was high time to start writing about the technology that powers teowaki. I am proud to announce teowaki’s engineering blog today.

In the first post Diego tells us how he upgrades our servers with Ansible. Stay tuned for insights, tricks and new posts on how we use NoSQL, big data, AngularJS, hypermedia APIs, cloud services and everything that keeps teowaki up and running.

We hope you will enjoy our new blog as much as we enjoy writing it.

If you have any suggestions for new articles or you want to be featured as a guest engineer, please write to us at engineering@teowaki.com.
