The weakest link

Contemporary voice recognition systems over-emphasize learning based on explicit “a=b” training; that is, there is a critical absence of negative training.

I imagine a parent and child: the parent says “It’s time for …” as a peal of thunder ripples through the room. This might be used as a comedic device precisely because we would not expect the child to respond “Yes, daddy, I’ve turned on the lights in the kitchen”. I’ve yet to hear a voice system ask me what “ACHOO” means or just say “what was that?”

After a hiatus, I return to Windows speech recognition and am confused by just how far ahead it is of the technology we rely on in Siri, Alexa, Google Home, and even Microsoft’s own Cortana.

For training it still relies on the old “speak these words” explicit recognition training. This is basically the same tech that shipped with Windows 7, which comes back to my point: this approach was already not even state-of-the-art when Windows 7 shipped.

I believe a far better approach would be a decoupled training procedure: don’t tell the training system what the user is being asked to say. Instead, use a combination of pre-scripted phrases, common keywords, and insight into the state of the network to decide what to ask me.

Then, ask the user to exclude options until they are down to something close enough to need individual words correcting.

There are two major gains here: 1. The user gets clear feedback on where the system is struggling to understand; 2. Instead of teaching the system that “*cough*pi” means “pizza” and that “zzaplease” means “please”, I can acknowledge the system’s ability to match sounds to speech.

The problem of purely positive training is compounded by the assumption engineers make that the systems will only hear deliberate communication.

Think about this: You cough and your voice system activates. You say “I wasn’t talking to you”, and you get a witty reply.

Except: you actually just trained the system that it heard its activation word. It may have changed in recent months, but it was certainly true at the start of the year that all the big systems had this flaw.

Nor does being quiet help.

I think this is part of why all of the current systems have the ability to suddenly become dumber on you. Perhaps the microphone is suddenly muffled, or perhaps the subtle changes of you having a cold for a day totally reaffirmed some weak association in the engine and it’ll take you months to untrain it again so it recognizes your regular voice.

It’s my hunch this is why there is so often a clear honeymoon period with devices like Alexa, Google Home etc: you become less forgiving, the system becomes over-confident, the first thing you say gets misunderstood, and your speech pattern changes as you become annoyed, angry or bothered by the device. So instead of your normal voice being the voice it expects, your angry or shouty voice is the one it trains itself on the majority of the time.

Alexa does let you give corrective feedback via the Alexa app, but that quickly becomes burdensome and, after the first few months, seems largely ineffective.

Positive AND negative training are the way forward.

Cold hard cache.

Time to crawl the interwebs. I’m looking for something relatively small and lightweight: a binary blob cache that I can drop into place, import as a module in Python, and have relatively easy access to.

The keys are likely to be large, and the blobs may be several MB. I don’t care a great deal about persistence.

What I’m looking to achieve is something like ‘distcc’ for asset conversion. The backend doesn’t need to know that; it’s just going to get semi-opaque key values that ultimately serve to compartmentalize hash spaces.
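To make the shape of that concrete, here is a minimal sketch of the kind of interface I have in mind: an in-memory, LRU-evicting blob cache. The class name and size budget are my own invention for illustration, not any particular library:

```python
from collections import OrderedDict

class BlobCache:
    """Minimal in-memory blob cache: opaque string keys, bytes values,
    least-recently-used eviction once a byte budget is exceeded."""

    def __init__(self, max_bytes=256 * 1024 * 1024):
        self._store = OrderedDict()  # key -> bytes, oldest first
        self._max_bytes = max_bytes
        self._used = 0

    def put(self, key, blob):
        if key in self._store:
            self._used -= len(self._store.pop(key))
        self._store[key] = blob
        self._used += len(blob)
        # Evict least-recently-used entries until we fit the budget.
        while self._used > self._max_bytes:
            _, evicted = self._store.popitem(last=False)
            self._used -= len(evicted)

    def get(self, key):
        blob = self._store.get(key)
        if blob is not None:
            self._store.move_to_end(key)  # mark as recently used
        return blob
```

A networked equivalent (memcached, Redis, etc.) drops in behind the same two calls, which is all the asset-conversion front end would need to see.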

Mashinky

I played a lot of Railroad and Transport Tycoon. Some of the recent attempts to recreate the experience fall flat because they either look hideous or they are too busy delivering an ultra-realistic world/train-driving simulation.
 
On Steam, “Mashinky” popped onto my queue, and it looks like Jan Zeleny may have found a great middle ground.
There is a tycoon-like, low-rez tiled experience which he maps into a more beautifully rendered 3d mode with fancy camera options.
Steam reviews are mixed, but this is now at the top of my queue to try, along with Secret World.

CPPCon 2017

I love and hate conventions, so I don’t go to them all that often.

Although I’ve watched CPPCon videos, I hadn’t considered it something you attended until this year; I wasn’t really convinced it would be worth going.

The agenda for the first few days proposed some very interesting stuff, and I decided to dip my toe.

Beware, AI…

Ever done one of those puzzles where you have to change the word “FISH” into “SOAP” one letter at a time? Imagine a more scrabble-like two-player version where each player starts from one word and they work towards the middle together.

The recent stink about Facebook shutting down some chatbots is the clickbait version of this story: some Facebook guys created code that tried to do roughly the same thing, and the dorks got carried away using words like “the machines” and “invent” and “language”.

I suspect that Facebook shut down the project because it was pointless and stupid and the coders were a little bit too whimsical.

What they did was take the task of “bartering” and reduce it to a simple numbers game; think of a sort of co-operative scrabble/fish (cards) version of the earlier puzzle where you don’t have to trade a card if it isn’t a fair trade, and the game ends the first time neither of you offers a fair trade.

You do this by drawing two hands. Each hand can be described numerically as a list of (card number and quantity). That is: jack, jack, ace, three = (card 11 * 2), (card 1 * 1), (card 3 * 1). Take the word ‘card’ out and we have (in json/python): [(11, 2), (1, 1), (3, 1)].

The Facebook guys wrote small programs that took two such lists and built a new list: the cards they want to trade. jack for queen, jack for king would be [(11, 12), (11, 13)] (jack is 11, queen 12, king 13).

These lists were sent between the programs using Messenger. To do this, the programmers – not the programs – replaced the numbers with words to generate a text message they could send. At the other end, the same code mapped the words back into numbers.
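To make that concrete, here is a hedged sketch of the round trip as I understand it from the description above; the word table and function names are mine, not Facebook’s actual code:

```python
# Hypothetical reconstruction: trades are (offered_card, wanted_card)
# pairs; numbers become words for Messenger, and words become numbers
# again on the receiving side.
WORDS = {1: "ace", 3: "three", 11: "jack", 12: "queen", 13: "king"}
NUMBERS = {word: num for num, word in WORDS.items()}

def encode(trades):
    """[(11, 12), (11, 13)] -> 'jack queen jack king'"""
    return " ".join(WORDS[n] for pair in trades for n in pair)

def decode(message):
    """'jack queen jack king' -> [(11, 12), (11, 13)]"""
    nums = [NUMBERS[word] for word in message.split()]
    return list(zip(nums[0::2], nums[1::2]))
```

Nothing in that round trip is a language; it is a fixed, programmer-chosen substitution over a list of number pairs.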

So far, this is all very computationally simple, and I’m sure there was some level of “AI research” or “machine learning” code involved, but the approach taken and the underlying task they focused on resulted in nothing special. The programs didn’t “know” anything; they just needed to succeed in choosing a number sequence that went from their first hand to their last hand without choosing numbers that were “too big” (I’m simplifying the concept of filtering here).

The programs did not become self-aware, did not know they were “communicating”, only “communicated” in so much as the line sendMessage('jack queen jack king') as code is “communicating” (it’s a techie term, not the literal English “communicate”), and they most certainly did not invent a language. They simply did literally what they’d been told to do and nothing else.

Honestly: what happened is that some idiots got their project cancelled and bitched about it by describing it like an 8-year-old would…

“We wanted the other machine to trade our machine a jack for a queen, but instead of developing the ability to speak english and saying ‘Trade you a jack for a queen’ via a speaker box, it was really spooky… our machine said ‘jack queen’, and the other machine – the one with the red eyes and the laser beams – it said ‘queen jack’. Holy shit! Sure, we wrote code to print “something something” but … it was doing it. All on its own, when we clicked Run.

“Obviously it didn’t say that, it just printed 11 12 and 12 11, but when we ran the program that converted the numbers into text and sent them to Messenger, you could see it right there, on Facebook! In text! ‘jack queen’ and ‘queen jack’. The machines were talking to each other! It was, like, they had invented their own language.

“First time round, we couldn’t get the other computer to receive the messages, we had to copy and paste them into a program to convert text into numbers on the other machine, but when we did that, when we converted the text into numbers, and ran our program, it printed out some more numbers. It was like the machine understood what was being said to it. Totally freaky.”

TL;DR: There was definitely some “artificial” intelligence behind the story.

Mr #4 if you read this – someone needs to be “transferred to the Feed-PE team”.

 

Wink 2 review

We moved into a rental house a couple of months ago and I decided to finally explore my interest in smart-home automation. I picked up an Ikea Tradfri gateway, remote, and bulb; a Wink 2 hub with some Cree bulbs; and some random china-cheapo bulbs from Amazon.

Ikea’s offering was cheap and cheerful, but only the Ikea stuff would talk to its gateway, so you’re going to need another hub unless you’re only going to buy Ikea kit. The little remote is nice, though. I turned the Ikea hub back off and went ahead with Wink.

The Wink 2 hub is a nice-looking piece of hardware for this sort of thing; the box, packaging, and manual all impressed. It was able to talk to all the non-Hue devices I bought, with the exception of the Ikea remote. That’s a shame, because it’s a nice little remote, but I’m not running a gateway just for a remote, so that’s out of play.

Generally, I was very pleased with the hub.

Home-automation follow-along

I’m experimenting with Home Automation using Python. For those of you curious about how it works and how confused you’ll need to get, I wanted to provide this little tutorial/follow-along. You don’t have to participate and you can skip bits you don’t care about.

NOTE: The purpose of this post is to show you the workings behind the workings, you aren’t going to have to get your hands this dirty to work with most home automation systems.

Python-Hue and Python-Homeassistant

I’ll be demonstrating talking to a Hue hub and my local Home Assistant install. If you’re using some other hub (Wink, etc.), finding the appropriate way to talk to its API is left as an exercise for the reader. You should still be able to follow along in spirit.
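As a taste of what “talking to a Hue hub” means in practice, here is a minimal sketch against the Hue v1 REST API (a PUT to /api/&lt;username&gt;/lights/&lt;id&gt;/state). The bridge IP and username below are placeholders; you get real ones from your own bridge’s pairing step:

```python
import json
import urllib.request

def light_state_request(bridge_ip, username, light_id, on):
    """Build the URL and JSON body for the Hue v1 call that
    turns a light on or off."""
    url = f"http://{bridge_ip}/api/{username}/lights/{light_id}/state"
    body = json.dumps({"on": on}).encode()
    return url, body

def set_light(bridge_ip, username, light_id, on):
    """Send the request; the bridge answers with a JSON success/error list."""
    url, body = light_state_request(bridge_ip, username, light_id, on)
    req = urllib.request.Request(url, data=body, method="PUT")
    return urllib.request.urlopen(req)
```

That really is all a “Smart” light boils down to at the wire level: a tiny HTTP request to the hub.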

Dumb vs Smart?

The prefix “Smart”, for a light bulb, motion sensor, or pretzel mulcher, is generally an allusion to a device that can participate in smart home automation.

The minimum bar for “Smart” is providing information to a controller, usually wirelessly. The next step is being able to be controlled the same way.

Motion sensor: A “dumb” sensor sits between the light it controls and the electricity supply. It acts like a physical switch. No motion? No power to the light. When it senses motion, it closes the switch and current flows to the light.

A “Smart” sensor, by contrast, detects motion and turns it into information that it sends to an intermediate device, which will be called a “controller”, “gateway”, or “hub” depending on precisely what that thing does.

In order for the sensor to turn a light on or off, a virtual connection is made by configuring or programming something to say “When the sensor sees motion, take this action”.

The fundamental value in “Smart” is the ability (and perhaps desire) to make decisions such as “if the sensor sees motion and it is dark then turn on the patio light”.
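That decision logic is the whole trick, and it fits in a few lines. Here is a toy illustration in Python; the Hub class and the device names are made up for this post, not any real hub’s API:

```python
# Toy controller: 'Smart' devices report state; the hub re-evaluates
# its rules against the combined state and emits actions.
class Hub:
    def __init__(self):
        self.rules = []   # (predicate, action) pairs
        self.state = {}   # last reported value per device

    def report(self, device, value):
        """Called when a device sends information to the hub."""
        self.state[device] = value
        return [action for predicate, action in self.rules
                if predicate(self.state)]

hub = Hub()
# "If the sensor sees motion and it is dark, turn on the patio light."
hub.rules.append((
    lambda s: s.get("patio_motion") and s.get("is_dark"),
    "light.patio_on",
))
```

The sensor never knows about the light; the rule lives in the hub, which is exactly what makes the connection “virtual” and reconfigurable.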

Short-range vehicle intercom

Throwing this out there. Approaching a stopped vehicle is fraught with problems, but today we have ways we could defuse/address some of that.

The first element is a very short-range communication link with assorted restrictions and security features, targeting the scenario where a car is stopped with a second vehicle stopped directly behind it, perhaps 3-6 feet away. Encrypted, not for privacy but to ensure range-based negotiation.

Second, we need some way to talk to the occupants of the vehicle.

I would start with a simple voice link: a microphone and speaker near the driver, but with proximity to the passenger(s), so that emergency responders can be talking to the passengers on arrival while obstacles that would otherwise make communication difficult are being cleared.

Your hackles are probably raised at this point, so let me interject some protections here.

  • Complete system disabled when vehicle is in motion,
  • Microphone disabled by default and connected to a light that clearly indicates when the microphone is powered,
  • Require the “caller” to speak for > N.n seconds before enabling microphone,
  • Allow the user to choose between manual activation and automatic activation based on (a) turning on the blinkers, (b) deployment of one or more airbags,
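Those protections boil down to a small gate on mic power. A sketch, with a made-up 3-second stand-in for the unspecified “N.n seconds”:

```python
def mic_may_enable(vehicle_in_motion, caller_speech_seconds,
                   user_opted_in=True, min_caller_seconds=3.0):
    """Decide whether the cabin microphone may be powered, following
    the protections listed above. Thresholds are illustrative only."""
    if vehicle_in_motion:        # whole system disabled while moving
        return False
    if not user_opted_in:        # the occupant always keeps final say
        return False
    # Mic stays off until the caller has spoken long enough.
    return caller_speech_seconds > min_caller_seconds
```

The indicator light would simply mirror this function’s result, so the mic can never be hot without the light showing it.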

In the US this could defuse a tension-laden pull-over for a tail light: instead of a dark figure holding a gun approaching your vehicle, you start with a preliminary conversation.

You could also address additional issues by providing the option to have the device “call in” when it is activated and register the contact with its location, the details of the “caller”, and perhaps validate that the caller is legitimate.

I personally wouldn’t have a problem with (optionally) enabling video and an automatic driver’s-ID field as part of the handshake. One reason officers approach cars is the opportunity to visually ID wanted people, and some officers might actually prefer the mark-1 eyeball over a quick voice chat.

So I’d have no problem with allowing the officer to see me/my vehicle and check my ID and everything, without having to approach me with his gun.

But this system won’t ever be perfect, there will be exploits, from the media spying on people to nefarious and potentially dangerous/lethal stuff.

So the system needs to be very optional: from not having such a device at all (requiring the good old walk up to the car), to buying a voice-only system, to having a bells-and-whistles system, and in every case being able to turn everything definitively and impenetrably off.

Further, this system should in no way be tied to control of the vehicle. In the first version of this idea on Facebook, someone suggested the officer might need to be able to stop a “drive off”. Except the officer is still behind the wheel of his car and can just give chase.

While this is mostly about making the pull-over less dangerous for everyone, consider the dangerous situation of a car stopped in the middle of a freeway/motorway. Now it’s possible for 2-way communication with the occupants who might be in distress or trouble.

For a driver with a heart condition, it might reduce response time by critical seconds; firemen could communicate with trapped occupants and give them life-saving advice…

This isn’t intended to be a complete ready-to-go design, but perhaps enough inspiration for someone to put something like this together.

Save some lives.

City/Empire/Culture building games?

I was asked several times yesterday what my favorite kind of game is, and I don’t know if I have an answer. Then I happened to be background-watching a YouTube video about Sid Meier’s “Starships”. How the heck could I have forgotten the man-months I spent playing Blue Byte’s “Settlers” games? “Civ 3”? “Age of Empires”?
Platformers and side-scrollers: I remember “Jet-Set Willy”, “Monty Mole”, “WizBall”, and after that I maybe remember “Turrican” and “Giana Sisters”, “Duke Nukem” briefly (also “Bubble Bobble”, I’m ashamed to admit). I’d seen it all already, and I wasn’t interested in the newcomers.
Graphical adventures rarely floated my boat – I was late to the party with “Monkey Island” and “Day of the Tentacle”. I spent some time in Myst.
I played hundreds of other games I just don’t remember. The “Jedi Knight” and “Tie Fighter vs” games ate chunks of my life. “Doom”, “Quake”.
The games I played most and returned to most? My first ever experience with a computer was at school, with 5 of us at a time sharing an Apple Lisa and playing a settler-type game. For the longest time, those were my favorite kinds of games, with space games (“Elite”, “Federation of Free Traders”, etc.; edit: how could I forget Paul Woakes’ amazing Mercenary games?) being my favorites.
I’ve tinkered with a few in recent years but never gotten hooked. They became too predictable and grindy. It seems like, instead of expanding in meaningful ways, people just hacked on new ‘cool’ features to the repertoire.
Last “Civ” I played was “Civ 5” and it was just unfulfilling. The wonders weren’t … wonderous. Just a building with stats. Overall, the game just felt too much like it was designed or tuned by engineers.
I liked the earlier “Caesar” games, but it seems empire games at the moment fall into one of just two categories: 1/ Fight: grind a big enough army; 2/ Decorate: optimize city layout so you can make more of the same. Nothing seems to hit that nice middle ground that I remember “Civ 3” or “Settlers III” (I think) having.
Any suggestions for games to check out?

I’m actually happy with Win10.

There are several ways you can make Windows 10 more bearable/comfortable.

I like the start screen because I’ve organized mine: got rid of everything MS had put on it and just put my stuff in groups that make sense to me. It’s essentially just like most people used to have their desktops but with some structure to it.

If you have it set to be a start menu, that might be annoying. I would encourage you to instead get Stardock or something similar and give yourself a Win7-style program menu.

Tip #1 – Start -> Search
Pressing the Windows key or clicking Start opens the menu/screen but it also opens an input box. This isn’t always obvious to everyone. You don’t have to click anything at this point – just type, and it will search for what you’re typing.

e.g. Hit Start and type cmd.

Tip #2 – Start +
Few people know that you can access the items on your taskbar based on their position. My left-most icon is my browser, and I can launch it with the keyboard by pressing Windows+1. The next icon is my email, that’s Windows+2. The third is Windows explorer, that’s Windows+3. And so on.

Tip #3 – Hidden taskbar, crouching start.
I keep my taskbar auto-hidden for a little extra screen space, and the annoyance of having to mouse for it is gone for me now because pressing Start brings it up along with the Start menu/screen.

Tip #4 – Pin
Tap Start and type calc. The top entry will be “Calculator”. Right-click on it. Your options will include Pin (or unpin) to Start and Pin (or unpin) to Taskbar. If you have the option to Pin to Start, go ahead and do it. Then try moving the tile to someplace you’d like.

Try right-clicking the tile and see what your options are.

Most things that have icons – from Control Panel to folders – can be pinned to the Start Menu or Taskbar. Have folders hidden 300 levels deep somewhere? Pin ’em.

I use a combination of things pinned to the Taskbar (for Tip #2) and everything else pinned to the start menu.

The beauty of this: My desktop now has very little stuff on it – a few folders organizing data, files, etc. I actually keep urgent bookmarks on the right side of my screen – something I couldn’t do before because of clutter.

The taskbar is less cluttered, just stuff I want to access with a Windows key and develop muscle memory for.

Everything else is nicely grouped and organized on the Start Menu.

As much as I hate live tiles, I actually make use of several of them on the left-most side: News, Weather, Mail, and Photos. These form a great little change-of-context summary when I glance at the top left of my screen after pressing the button, and I ignore them otherwise.

Tip #5 – Windows Keys
There are a whole bunch of things that I’ve learned have Windows Keys associated with them, that I make heavy use of.

My favorites:

  Windows + D => hide windows and show Desktop
  Windows + H => screensHot + sHare this window (!!!)
  Windows + I => Settings
  Windows + R => Start -> Run
  Windows + S => [Cortana] Search
  Windows + W => Windows Workspace (check out Screen Sketch!)

Misc/useful ones:

  Windows + A => Action Center
  Windows + E => Explorer
  Windows + G => Game Bar (when a game is running)
  Windows + L => Lock (avoid pressing if you don't know your password)
  Windows + T => Cycle thru taskbar items
  Windows + U => Accessibility options
  Windows + X => Alternate Start Menu