Tuesday, May 1, 2018

iOS 11.4 beta enables AirPlay control via Siri

Zac Hall from 9to5Mac discovered an awesome new feature I’ve been clamoring for in the latest iOS 11.4 Beta. You can now ask Siri to AirPlay content to your Apple TV and/or HomePod, including multi-room audio playback!

This is a feature I have always thought Apple should implement to make the initiation of AirPlay even quicker, so I’m very excited. Hope it stays in through the final release.

Thursday, April 19, 2018

Siri isn’t dumb, she’s less consistent

Everyone loves to hate on Siri. The common trope is that she’s dumb or not up to par with the other voice assistants (chiefly Alexa and Google Assistant). I believe this perception largely stems from Siri’s greatest opportunity for improvement: general knowledge. 1

Table Stakes

Let’s first address the table stakes among digital assistants — weather, sports, news, smart home functions, etc. I feel they all do these jobs equally well, with only minor differences.

For example: let’s say my living room Lutron Caséta dimmer is at 5%, but I want to raise it to 100%. If I tell Alexa to “turn on the living room lights”, Alexa is smart enough to interpret my intent as a human would and just raises the lights. (A human might have more snark at first.) Siri, on the other hand, does not understand my intent. If I issue the same command to her, she does nothing because the lights are already on. Like a child, she might as well be saying “the lights are already on, duh”. I must specifically ask Siri to “set the lights to 100%” or some variation.

It’s a minor annoyance, and although I prefer Alexa’s handling of the situation, there is still feature parity here.
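
The difference comes down to how each assistant maps the utterance to an intent. Here’s a toy sketch of the Alexa-style interpretation; the function and the threshold logic are hypothetical, inferred from observed behavior rather than any documented Amazon API:

```python
def handle_turn_on(current_level: int) -> int:
    """Interpret "turn on the lights" as "make the lights fully on",
    even when they are already on at a dim level -- the behavior Alexa
    appears to exhibit. Levels are percentages (0 = off)."""
    if current_level < 100:
        return 100  # raise to full, regardless of on/off state
    return current_level  # already at 100%; nothing to do
```

Siri’s literal interpretation would instead branch on the on/off state first and bail out as soon as the light registers as “on”.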

General Knowledge

By contrast, I feel this is the main area in which Siri lacks consistent feature parity with the others. Even in my own circle of friends and family, the questions that fail the most fall into this category. These are usually questions I would never ask Siri myself, since I know she can’t answer them accurately (if at all). Here are just a few examples, comparing Siri and Alexa.

Are tomatoes a fruit?

  • Siri: Wolfram Alpha results with no direct answer to the question.
  • Alexa: “Yes, a tomato is a fruit.”

What is the largest freshwater lake in the world?

  • Siri: “Here’s what I found on the web.”
  • Alexa: “The largest freshwater lake by area is Lake Superior, at 31,795.5 square miles.”

What time is Brooklyn Nine-Nine on?

  • Siri: “Sorry, I couldn’t find anything called ‘Brooklyn Nine-Nine’ playing nearby.”
  • Alexa: “Season five of Brooklyn Nine-Nine airs on Fox Tuesdays at 9:30pm Eastern and 8:30pm Central.”

Now, I will say that Siri answered most of my general knowledge questions correctly (about 70% of them) while I was looking for the above examples. However, every time Siri answers incorrectly or in an unexpected way, trust in the service takes another hit.

The negative perception of Siri will continue to grow until Apple addresses this area and others (hopefully in some capacity at this year’s WWDC). This isn’t Siri’s only problem, but I think it’s the biggest one; severely reducing the dumps to web searches (like above) is another. As Siri and voice input are increasingly positioned at the forefront of new computing methods, the last thing Apple needs is to be thought of as behind. Does all this make Siri dumb? No. It makes her less consistent.


  1. General Knowledge. /salute 

Friday, December 8, 2017

TechCrunch: Apple to acquire Shazam →

Ingrid Lunden and Katie Roof for TechCrunch:

As Spotify continues to inch towards a public listing, Apple is making a move of its own to step up its game in music services. Sources tell us that the company is close to acquiring Shazam, the popular app that lets people identify any song, TV show, film or advert in seconds, by listening to an audio clip or (in the case of, say, an ad) a visual fragment, and then takes you to content relevant to that search.

We have heard that the deal is being signed this week, and will be announced on Monday, although that could always change.

One source describes the deal as in the nine figures; another puts it at around £300 million ($401 million). We are still asking around. Notably, though, the numbers we’ve heard are lower than the $1.02 billion (according to PitchBook) post-money valuation the company had in its last funding round, in 2015.

Obvious Apple Music and Siri benefits aside, Apple must be really impressed with Shazam’s underlying technology to make this purchase. I’ve never seen anyone use Shazam on a TV show or in any capacity other than identifying music, but there could be some real benefits to tried and tested audio recognition down the line (e.g. AR, advanced Siri functions, HomePod).

Monday, September 25, 2017

Apple switches Siri and Spotlight Search provider from Bing to Google →

Matthew Panzarino for TechCrunch:

Apple is switching the default provider of its web searches from Siri, Search inside iOS (formerly called Spotlight) and Spotlight on the Mac. So, for instance, if Siri falls back to a web search on iOS when you ask it a question, you’re now going to get Google results instead of Bing.

Consistency is Apple’s main motivation given for switching the results from Microsoft’s Bing to Google in these cases. Safari on Mac and iOS already currently use Google search as the default provider, thanks to a deal worth billions to Apple (and Google) over the last decade. This change will now mirror those results when Siri, the iOS Search bar or Spotlight is used.

On privacy:

As is expected with Apple now, searches and results are all encrypted and anonymized and cannot be attributed to any individual user. Once you click on the ‘Show Google results’ link, of course, you’re off to Google and its standard tracking will apply. Clicking directly on a website result will take you straight there, not through Google.

This is surprising. I tried out a couple of searches with Siri on my iPad (see below). Sure enough, it’s already serving Google results, including YouTube videos.

This is an interesting move, and I can’t say I’ve ever cared much for Bing search. While Google has always been accurate for me, I don’t really agree with its approach to privacy and tracking. I would have liked to see Apple team up with DuckDuckGo (already a Safari search option). If you care about search privacy, check them out.

Siri Google search.
Siri YouTube search.

Sunday, August 27, 2017

How Siri’s voice has improved from iOS 9 to iOS 11 →

Earlier this week, Apple posted three new entries on their Machine Learning Journal detailing multiple aspects of how Siri has been improved over time.

The one linked above centers on how Siri’s voice has been vastly improved since iOS 9.

Starting in iOS 10 and continuing with new features in iOS 11, we base Siri voices on deep learning. The resulting voices are more natural, smoother, and allow Siri’s personality to shine through.

How speech synthesis works:

Building a high-quality text-to-speech (TTS) system for a personal assistant is not an easy task. The first phase is to find a professional voice talent whose voice is both pleasant and intelligible and fits the personality of Siri. In order to cover some of the vast variety of human speech, we first need to record 10–20 hours of speech in a professional studio. The recording scripts vary from audio books to navigation instructions, and from prompted answers to witty jokes. Typically, this natural speech cannot be used as it is recorded because it is impossible to record all possible utterances the assistant may speak.

This next figure illustrates how speech synthesis works via the selection of half-phones for each part of speech:

Figure 1. Illustration of unit selection speech synthesis using half-phones. The synthesized utterance “Unit selection synthesis” and its phonetic transcription using half-phones are shown at the top of the figure. The corresponding synthetic waveform and its spectrogram are shown below. The speech segments delimited by the lines are continuous speech segments from the database that may contain one or more half-phones.
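
The selection the figure illustrates is classically framed as a least-cost path search: pick one recorded unit per half-phone position so that the combined target cost (how well a unit matches the desired half-phone) and join cost (how smoothly adjacent units concatenate) is minimal. A toy dynamic-programming sketch with stand-in cost functions follows; the real costs are computed from acoustic and linguistic features not shown here:

```python
def select_units(candidates, target_cost, join_cost):
    """Choose one recorded unit per half-phone position, minimizing the
    sum of target costs (unit fit) and join costs (concatenation
    smoothness) with a standard dynamic-programming (Viterbi) search."""
    # prev maps unit -> (best cumulative cost, back-pointer to previous unit)
    prev = {u: (target_cost(0, u), None) for u in candidates[0]}
    history = [prev]
    for i in range(1, len(candidates)):
        cur = {}
        for u in candidates[i]:
            best_u, (best_c, _) = min(
                prev.items(), key=lambda kv: kv[1][0] + join_cost(kv[0], u))
            cur[u] = (best_c + join_cost(best_u, u) + target_cost(i, u), best_u)
        history.append(cur)
        prev = cur
    # Trace the cheapest path back through the back-pointers.
    last = min(prev, key=lambda u: prev[u][0])
    path = [last]
    for i in range(len(history) - 1, 0, -1):
        path.append(history[i][path[-1]][1])
    return list(reversed(path))

# Two candidate units per position; costs are toy numbers.
candidates = [["a1", "a2"], ["b1", "b2"]]
target = lambda i, u: {"a1": 0, "a2": 1, "b1": 1, "b2": 0}[u]
print(select_units(candidates, target, lambda u, v: 0))  # -> ['a1', 'b2']
# With a steep join cost out of "a1", the search trades a worse unit
# for a smoother join:
print(select_units(candidates, target,
                   lambda u, v: 10 if u == "a1" else 0))  # -> ['a2', 'b2']
```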

On Siri’s new iOS 11 voice:

For iOS 11, we chose a new female voice talent with the goal of improving the naturalness, personality, and expressivity of Siri’s voice. We evaluated hundreds of candidates before choosing the best one. Then, we recorded over 20 hours of speech and built a new TTS voice using the new deep learning based TTS technology. As a result, the new US English Siri voice sounds better than ever. Table 1 contains a few examples of the Siri deep learning-based voices in iOS 11 and 10 compared to a traditional unit selection voice in iOS 9.

Make sure you check out the audio comparisons on the page from iOS 9 through iOS 11. After using Siri extensively on iOS 11, I can truly say the new voice is better than ever, and absolutely more natural and expressive.

Reading these journal entries makes you realize how difficult speech synthesis and recognition really are. Maybe we can go a little easier on Siri when she doesn’t understand or perform exactly how we expect every time. To err is human, and digital assistants are becoming increasingly human-like, after all.

I’m super excited to see how Siri and other assistants improve over the next few years. I think we’re going to see the bar raised exponentially thanks to machine learning.

Monday, August 14, 2017

David Pogue Reviews Bixby →

David Pogue from Yahoo gave Samsung’s voice assistant Bixby a run-through, and it didn’t impress much. Here are some particular downsides from Pogue’s article:

Bixby is especially pathetic when it comes to navigation.

  • What pizza places are nearby? (Bixby: “Looks like there’s a connection problem.”)
  • Find me an Italian restaurant nearby. (Bixby opens Google Maps—promising!—but then stops, saying, “It looks like we experienced a slight hiccup.”)
  • Give me directions to JFK airport. (Bixby: “Which one?”)
  • Give me directions to the Empire State Building. (The “slight hiccup” error message appears after 10 seconds.)
  • In all cases, Bixby is very, very slow—plenty of videos online show how badly it lags behind Siri or Google Assistant.

It’s also fairly confusing. Most response bubbles include the baffling phrase, “You’re in native context.” And every so often, you’re awarded Bixby XP points for using Bixby. Samsung suggests that if you accumulate enough, you’ll be able to earn valuable prizes. OK, but if you have to bribe your customers to use your app…

This is hilarious. Samsung is resorting to gamification in hopes it will entice people to use Bixby. This is so incredibly ass-backwards. Imagine if you could win Apple points for using Siri or Amazon credits for using Alexa 1. How about this, Samsung: build a worthy product that compels people to use it because of how great it is, not because they can win imaginary points.

Like I’ve said before, no virtual assistant is perfect, but Samsung is incredibly late to this game. Since there’s a precedent now where every manufacturer needs their own virtual assistant, I suppose it’s no surprise. I’m sure Bixby will get better with time, but imagine how far ahead Alexa, Siri, and Google Assistant will be when that happens.

Side note: my favorite blunder from the video is when Pogue asks when Abraham Lincoln died and Bixby responds “Which one?”.


  1. On second thought, Bezos should get on this. 

Tuesday, July 25, 2017

Feature Request: smart assistant adaptive volume and stringed requests for smart home devices

We’re gradually increasing our reliance on smart assistants, but they are far from perfect. Going hand-in-hand with them is the next mainstream computing input method: voice. Sure, voice control has been around for a while, but we’re turning the corner on it being used in extremely meaningful ways throughout the course of our daily lives.

As a big proponent of voice input and smart assistants, here are a couple of improvements that would be a step in the right direction for the interaction experience.

Adaptive Volume

Picture this: your little one just fell asleep, and you go to turn on the nightlight in the room with your Amazon Echo like you always do. It goes a little something like this.

You: Alexa, turn on the nightlight — oh shit…
Alexa at full volume: OKAY!!!

Now you have to coerce your little one back to sleep. This can apply to using Siri on the iPhone or iPad, too. Sometimes I want to set the Good Night scene using Siri on my phone, but Siri’s volume is set differently from the system volume, so I’d rather not chance what it was last set to.

Ideal Solution

These assistants need to find a way to adapt their volume for the situation, based on multiple factors. If it’s late at night and quiet, it’s probably safe to say I don’t want to hear any feedback at all from Alexa, Siri, or the like. Maybe at a volume level of 3-4, but definitely nothing louder.

Conversely, if there’s a lot of noise in the room, bump that volume up so I can hear the response. All of these devices have multiple microphones built in, so it’s just a matter of software.

In short: don’t take my manual volume change as law if it doesn’t make sense for the situation. This is an instance where a computer should be allowed to decide something for us.
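
A sketch of what such a policy could look like, using ambient noise from the microphones and the local time. Everything here is hypothetical (the thresholds, the 0–10 scale, the function itself; no assistant exposes an interface like this), but it shows how little logic the heuristic actually needs:

```python
def adaptive_volume(ambient_db: float, hour: int, user_volume: int) -> int:
    """Pick a response volume (0-10) from ambient noise and time of day,
    instead of blindly reusing the last manual setting."""
    QUIET_ROOM_DB = 40   # assumed: below this, the room is 'quiet'
    NOISY_ROOM_DB = 70   # assumed: above this, responses get drowned out
    if (hour >= 22 or hour < 7) and ambient_db < QUIET_ROOM_DB:
        return min(user_volume, 3)   # late and quiet: whisper-level feedback
    if ambient_db > NOISY_ROOM_DB:
        return max(user_volume, 8)   # noisy room: make sure it's audible
    return user_volume               # otherwise the manual setting stands

print(adaptive_volume(30, 23, 7))  # nightlight scenario -> 3
```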

Alternate Solution

Give us a volume request modifier. Two examples:

You: Alexa, quietly turn on the nightlight.
Alexa changes to low volume: “Okay.”
Alexa then reverts back to original volume.

Or

You: Alexa, loudly, what time is it in New York?
Alexa changes to full volume: “THE TIME IN NEW YORK IS 11AM!!!”
Alexa then reverts back to original volume.
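
Parsing such a modifier could be as simple as peeling a known adverb off the front of the request. A toy sketch; the adverb list and the override labels are made up for illustration, since a real assistant would handle this in its language-understanding layer:

```python
def parse_volume_modifier(utterance: str):
    """Strip a leading volume adverb from a request, returning
    (volume_override, remaining_command). The override reverts after
    the response in the scenarios described above."""
    words = utterance.lower().replace(",", "").split()
    overrides = {"quietly": "low", "loudly": "full"}
    if words and words[0] in overrides:
        return overrides[words[0]], " ".join(words[1:])
    return None, " ".join(words)

print(parse_volume_modifier("quietly turn on the nightlight"))
# -> ('low', 'turn on the nightlight')
```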

Stringed Requests for Smart Home Commands

Pretty straightforward. Let us string at least two commands together for controlling smart home devices. Perhaps I want to selectively control two devices at a time with Siri that aren’t part of a scene I’ve already configured. For example:

Hey Siri, turn off the foyer and living room lights.

Or

Hey Siri, unlock the door and turn on the porch light.

This would be a huge step in improving the manual control experience for smart home devices, instead of issuing one command at a time.
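
A naive splitter that shares the verb across clauses shows the basic mechanics. This is only a sketch with a made-up verb list; real requests (like “the foyer and living room lights”, where “lights” is shared too) would need proper natural-language understanding:

```python
def split_stringed_request(utterance: str) -> list[str]:
    """Split a compound smart-home request on 'and' into individual
    commands, re-attaching the verb when a clause omits it."""
    VERBS = ("turn on", "turn off", "unlock", "lock", "set")
    parts = [p.strip() for p in utterance.lower().split(" and ")]
    commands = [parts[0]]
    for part in parts[1:]:
        if part.startswith(VERBS):
            commands.append(part)  # clause carries its own verb
        else:
            # Inherit the verb from the previous command.
            verb = next(v for v in VERBS if commands[-1].startswith(v))
            commands.append(f"{verb} {part}")
    return commands

print(split_stringed_request("unlock the door and turn on the porch light"))
# -> ['unlock the door', 'turn on the porch light']
```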

Monday, July 24, 2017

The Rock x Siri Dominate the Day

Takeaways

  • The short is better than I thought it was going to be. At least it’s not Planet of the Apps.
  • Think they could’ve come up with a better name than “The Rock x Siri”. Why not “The Rock & Siri”?
  • Dwayne Johnson was the highest paid actor of 2016, making over $64.5 million.
  • They cut some of Siri’s responses when unnecessary. No “Calling so-and-so, iPhone” or similar.
  • They didn’t use Siri’s new iOS 11 voice; a missed opportunity to showcase it. It’s a shame, because Siri sounds a whole lot better in iOS 11, although it’s a little jarring at first since she sounds like a different ‘person’.
  • Voice is the newest mainstream input method for computing, and is far from perfect.
  • If Siri worked this perfectly every time, more people would be inclined to use it. I think Apple is painting a picture here of the possibilities and use cases Siri offers—not that people don’t already know most of these commands, but it serves as a reminder: “oh yeah, I can try Siri for that”. My advice: use Siri as much as you can, so when its accuracy eventually improves, you’ll be a step ahead.

Wednesday, May 31, 2017

Siri Smart Speaker reportedly enters manufacturing

Hot on the heels of the recently-announced Amazon Echo Show and Essential Home, Bloomberg is reporting the rumored Apple “Siri-Speaker” has entered production. This comes ahead of Apple’s Worldwide Developers Conference (WWDC) next week, where the device is expected to be announced.

I can’t wait to see what Apple does in this arena, because they’ll likely include features nobody else has thought of yet, as they did when entering newer markets like Apple Watch and AirPods. I’m also looking forward to native HomeKit integration with my smart devices. Right now, I’m using an Echo Dot and love it, so next week should be exciting.

Here are my previous thoughts on how Apple can differentiate its Smart Speaker from the rest.

On a related note, I’m also working on a detailed write-up of my Smart Home configuration, so look for that coming soon.