Artificial intelligence tools like deep learning have come on in leaps and bounds in recent years, powered by two main resources: cheap processing power, delivered by the cloud; and stacks and stacks of user data. For a company like Apple — which has banged the drum of user privacy like no other — this presents a problem. How does it deliver deep learning smarts without constantly sending your data to the cloud? And how does it get that data in the first place?
At this year’s WWDC, the company promised that it can square the circle. New deep learning-powered features like facial recognition in your photos and conversational suggestions in your messages were shown, but Apple stressed that all the computation was happening on your device. “When it comes to performing analysis of your data,” said Apple’s Craig Federighi, “we’re doing it on your devices, keeping your personal data under your control.” Google’s analysis, by comparison, happens in the cloud.
Apple also said it would be using a technique known as “differential privacy” to comb through users’ data while keeping those users anonymous. This includes techniques like hashing, subsampling, and noise injection to scramble each user’s data before it is analyzed. This makes it difficult, in theory, to ever trace information back to an individual user, while still providing Apple’s computer scientists with workable datasets with which to train their deep learning tools.
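To give a rough sense of how noise injection can hide an individual while keeping the aggregate useful, here is a minimal sketch of randomized response, a classic textbook mechanism for local differential privacy. It is an illustration only and not Apple's system, whose details were not described on stage; the function names, the epsilon value, and the simulated 30 percent usage rate are all assumptions made for the example.

```python
import math
import random


def randomized_response(truth: bool, epsilon: float, rng: random.Random) -> bool:
    """Report the true answer with probability e^eps / (1 + e^eps), else the opposite."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return truth if rng.random() < p_truth else not truth


def estimate_true_rate(reports: list[bool], epsilon: float) -> float:
    """Recover an estimate of the population rate from the noisy reports."""
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # observed = p * rate + (1 - p) * (1 - rate), solved for rate:
    return (observed + p - 1.0) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so the run is repeatable
    epsilon = 1.0            # hypothetical privacy budget; smaller means noisier
    # Simulated population: 30% of 100,000 users use some feature (made-up figure).
    truths = [i < 30_000 for i in range(100_000)]
    reports = [randomized_response(t, epsilon, rng) for t in truths]
    print(f"estimated rate: {estimate_true_rate(reports, epsilon):.3f}")
```

Any single report might be a lie, so it reveals little about the person who sent it, yet across many users the known flip probability can be factored out to estimate the overall rate. The trade-off is accuracy: the lower the epsilon, the more reports are needed for a usable estimate.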
Apple says it even brought in a third-party privacy researcher (and differential privacy expert), Aaron Roth of the University of Pennsylvania, to assess the company’s efforts. In what came as a surprise to no one, he was impressed, and was quoted on stage as saying that Apple’s efforts positioned it as the “clear privacy leader among technology companies today.”
The question is: how much of this is bluster? How good is Apple’s AI going to be compared to Google’s, which isn’t as hamstrung by pledges of user privacy? And how will deep learning operate locally? Apps that use local deep learning could be slower (working on just your phone rather than a rack of servers), they could be bigger (downloading all those neural network models), and balancing out these constraints might mean they end up simply not working as well as the competition.
And unfortunately, while we’ll be able to judge pretty quickly how well Apple’s on-device deep learning functions (does it categorize your photos better or worse than Google’s? Just try both!), the success or failure of a privacy model isn’t always as obvious. Problems tend to emerge over time, and despite what Apple says, security researchers have found that “anonymized” data doesn’t always stay anonymous. We’ll have to closely watch the company’s results, and not just its presentations, to find out.