Everywhere you go these days – every store, every urban street, every car park, every cash machine – there are video cameras watching where you go and what you do. Just what are "they" (whoever "they" are) doing with all of that video surveillance?

Consider the Fraunhofer Institute's platform-independent development kit SHORE (example above), which stands for Sophisticated High-speed Object Recognition Engine. The software can detect full-frontal faces with 91.5% accuracy, determine gender with 94.3% accuracy, estimate age with a mean absolute error of 6.85 years, and do all of that in real time.

SHORE can also recognize facial expressions (happy, surprised, angry, and sad), detect faces whether or not it can identify them, and it has a short-term memory for recognized faces to support tracking within a scene.
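To make the output of such an engine concrete, here's what a per-face analysis record might look like. This is a hypothetical sketch; the field names are mine, not SHORE's actual API, and the example values are invented:

```python
from dataclasses import dataclass, field

@dataclass
class FaceAnalysis:
    """One detected face in a frame (hypothetical fields, not SHORE's API)."""
    box: tuple                 # (x, y, width, height) in pixels
    gender: str                # "female" or "male"
    gender_confidence: float   # SHORE reports roughly 94.3% gender accuracy
    age_estimate: float        # years; mean absolute error around 6.85 years
    expressions: dict = field(default_factory=dict)  # scores per expression
    track_id: int = -1         # short-term identity for tracking in the scene

# An invented example of what one frame's analysis might yield.
face = FaceAnalysis(box=(120, 40, 64, 64),
                    gender="female", gender_confidence=0.94,
                    age_estimate=31.0,
                    expressions={"happy": 0.82, "surprised": 0.05,
                                 "angry": 0.02, "sad": 0.11},
                    track_id=7)
dominant = max(face.expressions, key=face.expressions.get)  # "happy"
```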

Some years ago I was in London – in Harrods to be precise – going up an escalator, and lining the walls were flat-panel displays showing ads. I started thinking about what could be done by placing a small camera beside each display and making a video of the people passing.

Using SHORE to process the video would enable us to figure out whether customers were looking at the ads and which ads they paid the most attention to, based not just on whether they looked, but on how long they looked and whether their eyes tracked the ads. Ideally, we'd also want to detect and measure changes in pupil size, which are known to correlate with interest level.
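The "how long they looked" part is just bookkeeping over gaze samples. Here's a minimal sketch, assuming a hypothetical upstream gaze estimator that emits (timestamp, ad_id) pairs, with ad_id set to None when the shopper isn't looking at any display:

```python
from collections import defaultdict

def dwell_times(samples):
    """Sum per-ad gaze duration from (timestamp_sec, ad_id) samples.

    Each interval between consecutive samples is credited to the ad
    being watched at the interval's start (a simple approximation).
    """
    totals = defaultdict(float)
    for (t0, ad), (t1, _) in zip(samples, samples[1:]):
        if ad is not None:
            totals[ad] += t1 - t0
    return dict(totals)

# A shopper glances at ad "A", looks away, then watches ad "B" longer.
samples = [(0.0, "A"), (0.5, "A"), (1.0, None),
           (1.5, "B"), (3.0, "B"), (3.5, None)]
print(dwell_times(samples))  # {'A': 1.0, 'B': 2.0}
```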

Tracking every beat of your heart

We could possibly even measure customers' heart and breathing rates visually as they looked at the ads. A friend of mine, Christopher Small, recently told me about a video-processing technique he's been working on at Quanta Research Cambridge called Eulerian Video Magnification, which amplifies microscopic movements. He's made an online demonstration of this available called Videoscope ... in particular, check out the "face" and "wrist" examples.
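The core idea behind Eulerian Video Magnification is to temporally filter each pixel's intensity over time and magnify the tiny residual variations. The real technique filters a spatial pyramid of the video with a tuned temporal bandpass; the toy sketch below just shows the principle on a single pixel's 1-D time series:

```python
def amplify_motion(series, alpha=10.0, window=5):
    """Toy 1-D illustration of Eulerian-style amplification: low-pass each
    sample with a moving average, then magnify the residual by alpha.
    (Real EVM bandpass-filters a spatial pyramid over time; this only
    demonstrates the amplify-the-tiny-variation idea.)"""
    out = []
    for i, x in enumerate(series):
        lo = max(0, i - window + 1)
        baseline = sum(series[lo:i + 1]) / (i + 1 - lo)  # moving average
        out.append(baseline + alpha * (x - baseline))    # boost deviation
    return out

# A faint 0.01-unit pulse (e.g. a heartbeat's skin-color shift)
# riding on a steady brightness of 100.
signal = [100.0, 100.0, 100.01, 100.0, 99.99, 100.0]
boosted = amplify_motion(signal)
```

After amplification the peak-to-peak swing of the boosted series is much larger than the original's 0.02, which is what makes the pulse visible.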

So, armed with all of this data about customers and their responses to ads, we'd know which ads worked best for which demographic dimensions. If we were processing the video in real time, we could show each customer individually ads that were far more effective because they'd be tailored to whatever we determined their demographic profile to be. And, over time, using A/B testing, we could continuously refine our targeting ability.
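The A/B bookkeeping is straightforward: count impressions and engagements per (demographic segment, ad) pair and serve the ad with the best observed engagement rate. A minimal sketch, with hypothetical segment and ad names:

```python
from collections import defaultdict

class AdSelector:
    """Per-segment A/B record-keeping (a hypothetical sketch)."""

    def __init__(self, ads):
        self.ads = ads
        self.shown = defaultdict(int)    # (segment, ad) -> impressions
        self.engaged = defaultdict(int)  # (segment, ad) -> engagements

    def record(self, segment, ad, engaged):
        self.shown[(segment, ad)] += 1
        if engaged:
            self.engaged[(segment, ad)] += 1

    def rate(self, segment, ad):
        n = self.shown[(segment, ad)]
        return self.engaged[(segment, ad)] / n if n else 0.0

    def best_ad(self, segment):
        """Serve the ad with the highest observed engagement rate."""
        return max(self.ads, key=lambda ad: self.rate(segment, ad))

sel = AdSelector(["handbags", "electronics"])
segment = ("female", "25-34")
for _ in range(8):
    sel.record(segment, "handbags", engaged=True)
    sel.record(segment, "electronics", engaged=False)
print(sel.best_ad(segment))  # handbags
```

A production system would add exploration (e.g. occasionally serving a random ad) so new creatives get a chance to prove themselves; this sketch is pure exploitation for clarity.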

To go further, we could use full-blown facial recognition: when we identified someone we had already seen and characterized, we could deliver even more effective ads to them.

And how about analyzing how people navigate around a store? Store layout matters, as I'm sure you've noticed in stores that don't physically work well.

The problem of visually tracking people in a scene has become very easy to solve. Just over three years ago in Gearhead I reviewed a product called Vitamin D that does exactly that; it can distinguish people from other moving objects even when they are partially obscured!

What's particularly interesting about this product is that it was built using a technology based on a model of how the human brain works, which was developed by Numenta, a company co-founded by Jeff Hawkins, the founder of Palm.

Now, aggregate all of this intelligence about customers captured by the camera-augmented displays around the store (both electronic and static), along with the analysis of how people move through the store. That movement analysis could ultimately be linked with the systems that recognize individuals, allowing detailed behavior and path analysis for each customer visit. Put it all together and we start to develop astounding insight into customer behavior and preferences in the real world.

Link that big pool of big data to the till data, and we could correlate the entire in-store customer experience (where they went, what ads they saw, what smells we added to the air, what lighting we used, etc.) with revenue potential. What we've done is turn the arts of analyzing and tuning advertising and store layout design into a science.
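That correlation step is, at its simplest, a join between per-visit ad exposures and till receipts. A hypothetical sketch, with an invented schema:

```python
def revenue_by_ad(visits, sales):
    """Average spend per visit for each ad a visitor was exposed to.

    visits: list of (visit_id, ads_seen) pairs (hypothetical schema)
    sales:  {visit_id: till_spend}; visits absent from sales spent nothing
    """
    totals, counts = {}, {}
    for visit_id, ads_seen in visits:
        spend = sales.get(visit_id, 0.0)
        for ad in ads_seen:
            totals[ad] = totals.get(ad, 0.0) + spend
            counts[ad] = counts.get(ad, 0) + 1
    return {ad: totals[ad] / counts[ad] for ad in totals}

visits = [("v1", ["perfume"]), ("v2", ["perfume", "shoes"]), ("v3", ["shoes"])]
sales = {"v1": 50.0, "v2": 30.0}     # v3 bought nothing
print(revenue_by_ad(visits, sales))  # {'perfume': 40.0, 'shoes': 15.0}
```

Exposure here correlates with spend but doesn't prove the ad caused it; that's exactly what the A/B testing mentioned earlier is for.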

Of course this data will be very valuable and, in theory, we could share data on both anonymous and recognized individuals with other businesses, allowing behavior to be tracked across multiple locations or used without any prior interaction. That means we'd have to pay attention to protecting the privacy of recognized individuals, for example by generating something like an MD5 hash of their recognition data and storing only the hash. The entire consumer experience could then be "optimized" without actually identifying individuals and destroying their privacy.
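The hashing idea can be sketched in a few lines. The article suggests something like an MD5 hash; the sketch below is the same idea with a per-store key (an HMAC, still using MD5 as the digest), which is an assumption of mine, added so the pseudonyms can't be recomputed by anyone without the key. The quantization step is also my assumption, there to make slightly noisy recognition features hash to the same ID:

```python
import hashlib
import hmac

SITE_KEY = b"per-store secret"  # hypothetical; keep this out of shared data

def pseudonym(face_signature, key=SITE_KEY):
    """Turn a face-recognition feature vector into a stable pseudonymous ID.

    Quantize the features (so tiny measurement jitter maps to the same
    bucket), then take a keyed hash so only the key holder can link a
    pseudonym back to a signature."""
    quantized = ",".join(f"{v:.2f}" for v in face_signature).encode()
    return hmac.new(key, quantized, hashlib.md5).hexdigest()

alice = [0.12, 0.87, 0.33]
same_person = pseudonym(alice) == pseudonym([0.121, 0.869, 0.331])  # True
```

One caveat worth noting: a stable pseudonym still lets every visit by the same person be linked together, so this protects identity, not anonymity of behavior.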

Yep, welcome to the "Minority Report" shopping experience without the holograms.

So, how long until this happens? Or has it already happened?