The democratization of deepfake technology brings new perils for business

Reality Defender's CEO and Deloitte's chief futurist explore AI pitfalls and deepfake risks, discussing mitigation strategies for both users and organizations.

Today’s guests:

  • Mike Bechtel, chief futurist for Deloitte Consulting LLP
  • Ben Colman, CEO of Reality Defender

Deepfake technology has advanced rapidly, and bad actors have taken note. In February, a finance worker in Hong Kong was tricked into transferring approximately US$25 million to a fraudulent account after a video conference with his CFO and other coworkers he recognized. He later discovered that everyone on the call—except him—was a deepfake.1

Mike Bechtel, chief futurist of Deloitte Consulting LLP, said events like this have galvanized business leaders into considering their risk. He and his team had covered the risks of synthetic media just a few months before. Clients were interested, but not overly concerned, he said. But when the news from Hong Kong hit, “In short, this really turned from ‘Yikes; that would be a bummer if it happened,’ to ‘Yeek! This is happening and it better not happen to me!’”

Detecting and dealing with deepfakes is not a simple problem. We spoke with Bechtel and Ben Colman, CEO of Reality Defender, a company that monitors and detects AI-enabled fraud, about why we’re seeing this rise of deepfakes, how businesses can protect themselves, and why current protection measures may not be enough.

Mike Bechtel: Fears of deepfakes are overblown. We can always tell when a voice is generated by AI.

Ben Colman: But can you tell a deepfake from an authentic voice?

Tanya Ott: Did you identify those as AI-generated voices? Tools are getting better—and more accessible—every day. But that only tells half the story. I’m Tanya Ott, and today, we’re going to dig deep into what’s already happening with deepfakes and what could happen over the next year or two.

I recently sat down with Deloitte Consulting’s Chief Futurist Mike Bechtel.

Bechtel: My team and I work to make sense of what’s new and next in tech, by looking around the world to get a sense of those new-fangled, emergent technologies that figure, not just to help us do more in terms of the art of the possible, but, you know, move us towards the art of the profitable.

Ott: And Ben Colman, co-founder and CEO of Reality Defender, which provides detection tools to large enterprises so they can stop deepfake and generative AI fraud.

Colman: And why did I found Reality Defender? You know, my career at Google and Goldman Sachs and government research has always been about the intersection of cybersecurity and data science. And so I’m a super nerd at this. This is all I’ve ever done. So it really came natural to me.

Ott: I started our conversation by asking Mike about the 2024 Tech Trends report, which included a chapter about truth in the age of synthetic media. I asked what has changed since the report came out late last year.

Bechtel: Oh, gosh. You know, Tanya, when we dropped Deloitte Tech Trends in December of 2023, we mentioned synthetic media, aka deepfakes, not just in video, but in photography, text, audio—putting the multi in multimedia, to use an old term. That concept felt provocative and a little futuristic. Our clients circa January were intrigued, if not all-in on the recognition that folks could do material harm with synthetic media.

What’s been interesting over the last five months is we’ve seen a spate of headlines about senior executives at household-name organizations who, when asked by their boss to bum a password for a presentation coming up with the board, do so because, to use casual language, [they’re] like, “Yeah, sure, man. You know, happy to hook a brother up,” only to find that that’s not your brother. And in some reasonably well-publicized cases, folks have accidentally given their password to folks who’ve shipped US$10, US$20, US$50 million to an offshore account because that boss on the webcam was, in fact, a bad actor using deepfake tech.

In short, this really turned from “Yikes; that would be a bummer if it happened,” to “Yeek! This is happening and it better not happen to me.”

Ott: Deepfakes can seem like a new and totally unique threat. Is that the case? Or is it just the latest episode in the history of social engineering and fraud?

Bechtel: I remember, Tanya, 20 years ago or so, when my grandma, God bless her, she told me about a “nice man” who had money for her if she just flipped him her bank account info. You know, the “Nigerian prince.” And I remember thinking at the time, oh, grandma, this isn’t what you think it is. And thinking at this time, oh, the follies we used to engage in. But the fact of the matter is, this was a social engineering play, right? The exploitation of human goodwill, human trust, [and] human foibles.

We’re still falling for it, [but] it’s no longer the poorly worded email by the poorly disguised Nigerian prince. It’s crazily believable 4K video, photorealistic pictures, text that’s actually in the tone and style of the person they’re seeking to emulate. In the legal community, they call this a “matter of degree, not kind,” aka this stuff isn’t an altogether new attack so much as that same old effective attack, but now on steroids. At least that’s how I’ve been seeing it.

Colman: It’s a bit of a loaded term because up until a few years ago, we had deepfakes, but we called them virtual humans or AI avatars.

The idea [of a] deepfake goes back at least a decade, maybe more, starting within entertainment with special effects. Think about your favorite movie—the type of work that needed really high-powered computers and also a lot of time. You’d pick something you want to make, whether it’s a face swap or a fake environment, and then you’d go to sleep and wait a night, maybe a few nights, for it to render, and it might be good or you might have to redo it.

What’s really happened over the last few years has been the democratization of both the tools—they’re now available to anybody with a search, whether it’s online or through the App Store on your phone—but also the democratization of the technology needed to run the software. Anybody with a credit card can get access to cloud compute. A lot of times the tools don’t even need you to spin up your own instance on cloud compute because the software itself is available online with just a quick search and your credit card.

There are deepfakes for audio where you could make really entertaining audio—maybe me sounding like the Rock or another actor—or something incredibly dangerous like the deepfake audio of President Biden telling folks not to vote in the primary. So this is a trend that’s going to continue with other modalities beyond audio.

What’s not yet dangerous and what’s not yet on the forefront of deepfakes [is] real-time video. We forecast that the idea of deepfakes in real-time video will start hitting prime time next year.

Ott: Wow, that sounds like a game changer. Is it?

Colman: It’s a game changer for a lot of great qualities and opportunities. We think generative AI is going to change the world. It’ll increase efficiency, increase productivity, increase creativity. But in a very small minority of use cases, it’s incredibly dangerous.

Take this podcast, or our typical conversations: instead of using a gigabit per second of bandwidth, we’ll all have deepfakes of ourselves. That’ll be a permissioned avatar of me and you. And we’ll be able to take calls or video conversations from our car or lying in bed and have a perfect version of us looking right at the camera, always. That’s going to open up a whole new world of challenges, which is: how do we prove that’s Ben using Ben’s deepfake avatar of himself?

Ott: Wow, what are the particular risks for businesses in this environment?

Colman: The challenge is that anything that uses media communications can be faked and will be faked. Certain platforms and banks and brokerages still believe that your voice is your password, and all of that can be faked. We’re seeing that right now. Just the fact that it sounds like me and matches my voiceprint, and just because it has my birthday, my social security number, my address, all the things that could be found or stolen or hacked online—that’s just not enough. The idea of you have to see it to believe it, or you have to hear it to believe it—it doesn’t work anymore.

Ott: I just reloaded apps on a new phone, and I had one of my banks say do you want to do voice recognition? And I’m like, I’ve got 30-plus years of my voice out on the internet as a radio person. No, I don’t want to do that. Easily, easily found and duplicated.

Bechtel: Ben, as you were explaining voice as identity, it got me thinking of something that happened to me just two weeks ago. I got a phone call from an 800 number wherein the person said, hey, there’s some unusual charges on your credit card, and did you buy a US$500 waterslide at a big box store in Tampa, Florida, three hours ago? I said, oh, goodness, no. [He] says, well, how about so-and-so? I go no. And they keep asking me about these charges that are similar insofar as they’re uniformly ridiculous. And then they hung up [after saying] thank you, we’ll make sure these charges don’t go through. And I’m thinking to myself, that just felt different than the usual fraud-prevention interaction. And then it occurred to me, wait a minute, these cats were probably trying to harvest my voice.

Well, sure enough, a couple days later, I start getting actual questions from the actual credit card entities asking me if I’d actually called in. And I thought to myself, oh, Lordy. It’s a brave new world out here. I’m not one to hide my head in the sand, but to your point Tanya, my voice is out there; that cream is in the coffee.

Ott: Yeah, yeah.

Bechtel: But goodness gracious, the vectors that are in play that I didn’t even realize were vectors until I lived it as a potential victim.

Colman: In certain situations, for example, a fake voice, even if you know it's fake, are we really going to react the way that we should? I have two kids—two boys, four and seven [years old]. They’re a lot of energy. They’re killing us. We’re not getting any sleep. They go to sleep late, and they wake up at like five in the morning. That's a whole different podcast.

Bechtel: Yeah. Congratulations! You'll be out of that phase in about ten years.

Colman: And then I’ll miss it.

We work with telecoms trying to help protect their users from scary situations—I get a phone call, that is obviously fake, from what may appear to be one of my kids: either they’re in trouble or somebody is with them and they’re in trouble. It could be they’re in a car accident or with some people at a mall, or some other kids were just being silly, and I’m told, oh, my son needs money. I need to Venmo or Zelle, or wire US$100, US$1,000 to somebody.

This is something that people are facing every day. Could be their kids, could be their parents or grandparents. Could be the middle of the night. Am I going to take the chance, or am I just going to send the money? Because even if I’m 95% confident that it’s fake, is that 5% really worth it? And that’s a real challenge.

We spoke to somebody just yesterday who got a call in the middle of the night from one of their kids who’s in college, who is known to be on a trip, which probably was already known and shown on their social media. They got a [call] saying, hey, I’m in a foreign country, I got separated from [my friends], I got in a car with some people, [and] they won’t let me out of the car. And the phone’s taken from them [by someone] saying, hey, we’re with your son or daughter. They’re okay now, but they might not be soon. You need to send this amount of money to this address.

And again, you know, we all [have in the] back our mind, wow, this is so obviously fake. But if it’s someone we care about, even myself, as a CEO of a company literally focused on deepfake detection, I don’t know how I’d behave in that situation. I might send the money just for that small chance of it potentially being dangerous.

Ott: I actually had a neighbor that fell victim to a fraud exactly like that when the voice of her granddaughter, who seemed to be in distress, was on the other end of the call, and the neighbor ended up losing US$11,000 over the deal.

Colman: Oof!

Ott: Oof is right.

Colman: The challenge with this is that the tools to do everything we’re describing are available to anybody. On one hand, hackers and bad actors and fraudsters can now do it in real time to many more people at the same time, but also average people who might not think about committing fraud might just do it because it’s just so easy. And wow, you can get away with it.

[Generative AI] can be used for a lot of great reasons. You could create presentations for our companies, help support communications as different languages are being translated in real time. But if anyone can use these tools without any verification of who you are and what you’re doing, it’s really a case where the technology is moving faster than the regulations required to protect average people, let alone companies and countries.

Ott: I want to talk about some of the ways that we can deal with it. But first I want to ask: what sort of AI fraud is not making the headlines but is going on nonetheless? As the general populace, we might not even be aware of this type of fraud.

Colman: We’ve been thinking about a lot of things in the lab, putting ourselves in the mind of the hacker or bad-actor fraudster: things we postulated but thought, wow, we’ll never see, and yet now we are seeing some of them. One thing we’re calling “AI autocorrect.”

We spoke to the head of cybersecurity for one bank who explained a situation where somebody had called the bank [and] had already confirmed their voiceprint. They confirmed all their information—birthday, social security number, tax ID number—because they worked for a company. And this is all real. It was this person. It was the person who was confirming who they were at the company. And they were executing a wire transfer, which again was still really them. When they said the account number they wanted to wire it to, that part was not them. It was actually a fake voice. And they only caught this because this bank and this client required multi-factor authentication with a code generator, which this person forgot. The real person was trying a real transaction, but they didn’t have the code generator.

So, imagine a three- or four-minute phone call, which is all real, all confirmed, all verified, all honest, and then only the five seconds of the call that mattered the most, giving the account number where you want to wire the money, changed.

As they try to do some forensics to understand what happened, what they believe happened is that this person’s phone was compromised, or the connection was compromised. Something in the middle was compromised, and somebody or some organization or platform was listening in and waiting until just the right moment to inject an AI-generated voice of the person they’re trying to impersonate, to try and get a real live transfer sent to a wrong recipient.

Obviously, there’s a number of things going on in this case. It boggles the mind: the different levels of safety that had to be breached to do this; the amount of time that the bad actors were waiting to do this. But it just shows how, with AI and some creativity, you could do a lot of fake things.
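What saved the bank in that story was a second factor the attacker could not synthesize: a one-time code from a generator tied to a shared secret. As a rough illustration of why that works, here is a minimal time-based one-time-password (TOTP) check in the spirit of RFC 6238; the secret and parameter values are illustrative assumptions, and a real deployment would rely on a vetted library.

```python
# Illustrative TOTP check (in the spirit of RFC 6238). A code generator
# defeats the voice-injection attack above because the code derives from
# a shared secret the attacker never hears on the call.
# Secret and parameters are hypothetical; use a vetted library in practice.
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, at: float, step: int = 30, digits: int = 6) -> str:
    counter = int(at // step)                      # 30-second time window
    msg = struct.pack(">Q", counter)               # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                     # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)

secret = b"seed-shared-at-enrollment"              # hypothetical shared secret
now = time.time()
code_from_device = totp(secret, now)               # what the real user's generator shows
print(totp(secret, now) == code_from_device)       # True only if the caller holds the secret
```

Because the code changes every 30 seconds and never travels over the compromised voice channel, an attacker injecting five seconds of fake audio has nothing to replay.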

The other example I’ll give [is] a little […] closer to home. We were interviewing a senior researcher. And after 20 minutes of great Q&A, he changed. He changed race. He changed from having glasses to no glasses. He changed accent. His hair changed. And it shocked us like, wow, that was amazing! What just happened? And he said, oh, I thought this would be fun, given that I’m interviewing for a job to detect fake faces and voices and video and audio and everything, I thought it’d be fun to also deepfake myself for most of the interview.

We asked him how [he did this]. He said, much like most startups, he signed up for Amazon cloud compute. Anyone who signs up, typically after going through a background check, can get anywhere from US$10,000 to US$100,000 of free compute on Amazon. He had about US$20,000 of compute left from a project he’d previously been working on, and he used it right there, during the interview. He burned US$10,000 to US$15,000 of compute credits on AWS.

This is why we’re seeing most of the fraud with real-time audio: it’s a lot cheaper to do. The platforms that exist to do it will cover the cost of compute. With real-time video, it’s still too expensive. [That’s] why we haven’t seen this type of fraud happen yet. But as compute costs come down, we expect, [in] the next 18 to 24 months, to see the same degree of deepfake financial fraud happening [in] real-time video as well.

Ott: So what sort of solutions are there for detecting and mitigating these deepfakes and the AI-generated fraud like this?

Colman: We take a pretty firm view that consumers should not have to become experts to detect AI-generated fraud, the same way consumers aren’t required to be experts to identify a computer virus. Your email does it for you, because average people, let alone experts, can’t do it themselves. And these platforms don’t do this just because they want to be good corporate citizens. They do it because there are either laws or regulations or requirements from the FCC (Federal Communications Commission) or FTC (Federal Trade Commission) to do this.

Bechtel: You know what’s so interesting to me when I hear you lay that out, Ben, is that the analogy that pops into my mind is it’s sort of like a digital truth serum. Absent this tech, there’s no means of understanding if content is, maybe you could call it, certified organic. But here you are, as I understand it, using a mixed-model approach that says, okay, given the fingerprint that seems to be coming through, we might not be able to tell you with 100% certainty if it’s a carbon- or silicon-based life form. But we can tell you if it’s feeling fishy enough to escalate to additional checks and balances. Am I capturing it correctly?

Colman: I think that’s a good approach. What we’re doing is inference, which is probabilistic. We are not doing provenance, which is deterministic. One of the reasons we don’t do provenance is because perhaps sometimes it’s a false sense of security: if you don’t say something is fake, then you’re saying it’s real. But you might be wrong. With inference, it’s probabilistic. And so we might have lower fidelity, lower quality of a phone call or a video or image or document, but we can always provide a confidence level and a score that can then be used in a perimeter strategy with other signals, so a client can make a determination on next steps for a phone call or a video or a piece of media.
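To make that perimeter idea concrete, here is a minimal sketch of how a probabilistic deepfake-confidence score might be combined with other signals to route a call to next steps. The signal names, weights, and thresholds are hypothetical illustrations, not Reality Defender’s actual method.

```python
# Illustrative only: fusing a probabilistic deepfake score with other
# perimeter signals to pick next steps. Signal names, weights, and
# thresholds are hypothetical assumptions.
from dataclasses import dataclass

@dataclass
class Signals:
    deepfake_confidence: float  # 0.0-1.0 from an inference model
    device_mismatch: bool       # caller is on an unrecognized device
    high_value_request: bool    # e.g., wire transfer above a set limit

def next_step(s: Signals) -> str:
    """Return a routing decision, not a hard real/fake verdict."""
    score = s.deepfake_confidence
    if s.device_mismatch:
        score += 0.15           # hypothetical weight for a weak corroborating signal
    if s.high_value_request:
        score += 0.10
    if score >= 0.80:
        return "block_and_review"        # likely synthetic: stop and escalate
    if score >= 0.50:
        return "step_up_authentication"  # e.g., require a one-time code
    return "allow"                       # low risk: proceed normally

print(next_step(Signals(0.62, device_mismatch=True, high_value_request=False)))
# -> step_up_authentication
```

The point of the sketch is that an inference score never has to be a verdict; it is one input that pushes a transaction toward more or fewer checks.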

The other bonus for us of doing inference [and] not provenance is we don’t need any ground truth. Choose your favorite voice-authentication solution—that platform fundamentally has to retain personal information of users and employees. It needs their voice. It needs their voiceprint. And what we’ve seen time and time again is if something can be hacked, it will be hacked. And unlike a password, if you lose your voiceprint or your faceprint, you can’t just press reset. It’s lost forever.

The ultimate password, ultimate private key, for people is our DNA, and 23andMe admitted they were hacked a year or two ago.2 Just like your face or your voice, your DNA is not something you can reset. You can’t get new versions of it. That is out there forever.

Ott: Ben, you’ve been talking about some of the technical remedies, but you also talked about non-technical remedies like legislation that would then compel companies to do something. I’m wondering about the efficacy of those kinds of things or media literacy campaigns. What’s your assessment of how effective they would be in helping people and companies protect themselves from these dangers?

Colman: I think that any education is good and very, very important. In our space, education only gets you so far.

I’ll give an example of phishing campaigns. A lot of companies have started to test their employees by sending them automated, potentially fraudulent phishing campaigns. And they’ve seen time and time again people still click the button. And then, even if they're told, hey, you did it last month so [we’re] testing you again, [and] they'll still click it again. But at least […] you know, you’re trying. You're hoping to give better information and muscle memory so that you can hopefully reduce fraud.

An example I’ll give is, every company and every single organization will tell consumers you need better passwords. Now they require it. You need lowercase. You need uppercase. You need a symbol. You need a number. You can’t have the same character twice in a row. It’s really painful. And people say, okay, they’ll do it. And they’ll also use the same password in multiple places, which, again, we all think is obviously something you shouldn’t do, but people still do it anyway.
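Those layered rules amount to a small spec. For illustration only, here is a toy checker that encodes them; the minimum length and the exact rule set are assumptions, not any particular institution’s policy.

```python
# Toy encoding of the password rules described above. The minimum length
# and the exact rule set are assumptions for illustration only.
import re

def meets_policy(pw: str) -> bool:
    return (
        len(pw) >= 12                                # assumed minimum length
        and re.search(r"[a-z]", pw) is not None      # a lowercase letter
        and re.search(r"[A-Z]", pw) is not None      # an uppercase letter
        and re.search(r"\d", pw) is not None         # a number
        and re.search(r"[^\w\s]", pw) is not None    # a symbol
        and re.search(r"(.)\1", pw) is None          # no character repeated back-to-back
    )

print(meets_policy("password123"))    # False: no uppercase, no symbol
print(meets_policy("Un1que!Phrase"))  # True: satisfies every rule above
```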

But the moment that our email or our social media is hacked, suddenly we have that really personal feeling of, oh my, all of my information is now online. That means all of your emails, all your personal images of your family, all your medical records. And only then do people realize they need to really be smarter and be more careful with their passwords. Typically, that’s when people actually go sign up for a password manager and start generating completely randomized passwords.

There are tens of thousands of completely off-the-shelf tools to do this, [but] until you see it done to you or to somebody you work with or someone you care about, you just don’t realize how dangerous and how universal this problem is.

Bechtel: You know, Ben, one of the things we’ve seen in our emerging technology research here at Deloitte is how often seemingly unprecedented or futuristic technologies echo the patterns of past precursors. Kind of that old idea that history doesn’t repeat itself, but it certainly rhymes. Your point about the motivator for engaging in this sort of defense really has me thinking about home or residential security systems. I remember when we had moved into our new house a few years back, the fellow who was setting up the system said, “Yeah, you’re one of the scarce few who does this proactively and not in response to a recent burglary.” And I thought to myself, well, thanks for providing me the leverage to think in a sober way about the value prop here. But it really did get me thinking that you’re right. [For] most of us, be it passwords or even our homes—it’s not a thing until it’s a thing, and then it’s everything.

Ott: Well, the threat seems dire, but why is it important that we not write off AI?

Colman: AI has so many phenomenal opportunities for business, for humanity. As we think about the regulatory side, we absolutely want to ensure that there are no limitations on AI innovation. But there is a balance needed given that in a subset of use cases, there are these transformational, nonlinear, exponential risks that are created.

Ott: So if you were going to summarize, what do companies need to be thinking about? How should they be preparing?

Colman: Companies need to be thinking about a lot of the things that previously they thought they had solved more holistically. There is no silver bullet here. Deepfake detection is just one of the tools that organizations should be thinking about, but it’s a tool that all organizations should use to complement other, more traditional checks they’re doing on users or actions or requests, [for] both internal risks, but also external risks, whether it’s to the company or to their customers.

Bechtel: With our Deloitte Tech Trends [from] over the last 15 years, you tend to see patterns. And one of the macro patterns we’ve seen as regards cybersecurity and trust writ large is this move towards a “zero-trust posture.”

In the olden times—and by olden times I mean five, ten years ago—cyber defense felt like a moat around a castle, right? The idea [was] that our castle is our professional home. It’s protected by a VPN (virtual private network). It’s protected by firewalls, aka the moat. And woe to thee who thinks they’re going to get through that barrier. Well, the trick is, while the vast majority of people don’t get through, those that do have free run of the castle and havoc ensues.

The zero-trust posture basically says, no, no, no, no, no. No more moats. We’re going to lock every square meter of the interior of the castle, and everywhere you go is going to require proof of identity. What’s so interesting about that is it replaces the idea of “trust but verify” with the idea of, well, “You ain’t going to trust nobody, right? You’re going to prove it every gosh darn time.”
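As a sketch of what “proof at every door” can look like in software, the snippet below scopes each credential to a single resource, so presenting it anywhere else fails. The signing scheme and the names are hypothetical stand-ins for a real identity provider, not a specific product.

```python
# Sketch of "prove it at every door": each credential is scoped to one
# resource and re-verified on every access. The HMAC scheme and names are
# hypothetical stand-ins for a real identity provider or service mesh.
import hashlib
import hmac

SECRET = b"demo-signing-key"  # in practice, keys live in a KMS/HSM, not in code

def issue_proof(user: str, resource: str) -> str:
    """Issue a proof of identity valid for exactly one resource."""
    return hmac.new(SECRET, f"{user}:{resource}".encode(), hashlib.sha256).hexdigest()

def authorize(user: str, resource: str, proof: str) -> bool:
    """No moat: every request must carry a valid, resource-scoped proof."""
    return hmac.compare_digest(issue_proof(user, resource), proof)

# Being "inside the castle" grants nothing; each room re-checks identity.
token = issue_proof("alice", "wire-transfer-api")
print(authorize("alice", "wire-transfer-api", token))  # True
print(authorize("alice", "hr-records", token))         # False: proof doesn't transfer
```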

And I think what we’re seeing with respect to cyber and AI, is the recognition that in a world where you can’t trust your eyes and ears anymore, you’re going to have to fight math with math. You’re going to have to have that sort of digital truth serum, as it were. Inference, not provenance. I hear you loud and clear, but you’re going to need to have that running in the background as the recognition that nobody can be “innocent until proven guilty,” at least with regard to cybersecurity.

Colman: Absolutely. If this [were] a year ago, I’d have given you guys all kinds of examples of how to identify different kinds of anomalies with the naked eye. But the truth is that with the majority of our team across research and engineering, which is two-thirds of our team, many with PhDs—if they can't see anything, how do my kids, how do my parents stand a chance?

Ott: How do you guys feel as you’re moving forward in this, and the technology evolves?

Bechtel: Well, one of the things I think that's important here, Tanya, is to resist the temptation to throw the proverbial baby out with the bathwater. One of the other themes we’ve seen is that there’s a tendency to characterize anything new as alternately a hero or a villain. Right? Ooop! It’s new-fangled. Do we fear it, or do we revere it? And I would tell you, don’t do either.

The truth is that technology is a puffy-chested, four-syllable synonym for tool. And humans have been tool makers for the last 2.5 million years. Tools, at the end of the day, can be used for mindful good or malicious bad. When you think of a tool like fire, I can cook my dinner—that’s pretty good—or I can burn my neighbor’s house down—that’s pretty bad. When you substitute fire with fission […] I can [cook] all the dinners in town with thoughtfully deployed fission. I can burn down the town with recklessly or maliciously deployed fission.

And I think [if] you extend this to the latest and greatest, in this case, AI and generative AI, what you realize is, okay yeah, we can unleash a new era of creativity, a new era of productivity, and those are all true. But, the cartoon, mustache-twisting villains out there are trying to figure out how to sow havoc, and we need to get out in front of that, too.

And so, our team at Deloitte, we write a series we call Dichotomies: speculative fiction as a vehicle for business value, helping leaders understand what could go wrong, but honestly, what could go right. It’s about saying, okay, this isn’t a big dumb binary. This isn’t a tech to be gotten rid of or to be embraced whole hog everywhere all the time. It’s about thoughtful, strategic choices and adulting. And, I think, nowhere more so than in cyber defense.

Ott: I love adulting. Hello, adulting. Ben, final thoughts from you?

Colman: I think we’ve had a lot of doom and gloom, but I want to end this with, there is hope. There are solutions that exist here. We can use AI to detect or to fight AI. Once we accept that, we’ll get past a lot of the challenges and false sense of security we’re seeing online today, which is, oh, you can trust community notes, or you can trust content moderation. Oh, we’ve asked consumers to flag a piece of media for being fake. It’ll all be automated. It’ll be just like looking at a computer virus. You just know it’s there because the platforms flag it for you. And you can really trust when you click on a file that you're not downloading a virus on your computer. Now, with immediate communications, it can take a little more time, but I'm extremely optimistic that in the next two years—potentially before Q4 of this year—we'll start seeing a lot more protections for average people, but also for governments and companies in the space.

Ott: Well, you guys have given me, and I’m sure our listeners, a lot to think about. Thank you so much for joining us today.

Bechtel: Thank you.

Colman: Thank you.

Ott: Ben Colman is co-founder and CEO of Reality Defender. His company provides detection tools to large enterprises so they can stop deepfake and generative AI fraud. Mike Bechtel is the chief futurist for Deloitte Consulting. You can find his team’s 2024 Tech Trends report at DeloitteInsights.com.

Don’t forget to subscribe or follow the show so when we release a new episode it’ll drop to your device automatically, so you don’t miss a single thing.

I’m Tanya Ott. Thanks for listening and have a great day!

This podcast is produced by Deloitte. The views and opinions expressed by podcast speakers and guests are solely their own and do not reflect the opinions of Deloitte. This podcast provides general information only and is not intended to constitute advice or services of any kind. For additional information about Deloitte, go to Deloitte.com/about.
