Safeguarding data is our best hope to control AI

Congress shouldn't let myopic anticipation of the future be the enemy of the opportunities to build critical data policy now.
Samuel Altman, CEO of OpenAI, is sworn in during a Senate Judiciary Subcommittee on Privacy, Technology, and the Law oversight hearing to examine artificial intelligence, on Capitol Hill on May 16, 2023. (Photo by Andrew Caballero-Reynolds/AFP via Getty Images)

Artificial intelligence is coming — this much we know. Some iterations of it are already apparent in our daily lives: Google completes questions as you type; Instagram suggests a “new connection”; and Alexa responds to your commands. But the introduction of generative AI applications such as ChatGPT and a growing number of competitors has now left us with more questions than answers regarding the impact of this technology and its implications for our lives and society at large.

Lately, frantic reports raising concerns about AI have lit a fire under the public and a bipartisan group of lawmakers. This year, Congress held multiple AI hearings to scratch the surface, including one Tuesday with the CEO of the AI startup Anthropic. And bipartisan bills are pouring in, too. We are already discussing remedies to address AI’s impact on consumer privacy, cybersecurity and education. But what exactly are we trying to legislate?

While it’s responsible for Congress to raise questions about AI, creating policies on its prospective application risks deterring future innovations or, worse, exacerbating the dangers it may present.

At this point, AI is in a state of superposition, like Schrödinger’s cat. Just as Schrödinger couldn’t know whether the cat in the box was dead or alive without physically opening the box, we can’t decide whether AI is a gift to mankind or a curse. For now, we must think of it as both. So how does Congress open the box?


Well, when engineers try to resolve a Schrödinger’s cat problem, they don’t speculate on the unknown. Instead, they solve for the known knowns. In this case, we may not know what AI will become, but we do know what fuels AI — data. To quote the Federal Trade Commission, “The foundation of any generative AI model is the underlying data . . . exceptionally large datasets.” So it makes sense that we address the current issues related to data management first, before we speculate on how it will be used by AI.

One serious issue concerning data is that a handful of Big Tech companies — Google, Meta, and Apple, most prominently — unilaterally control the access, aggregation and distribution of data. As a result, they are in the best position to shape the future of AI. They also have demonstrated a penchant for shaping “the future” in their own best interests. Their outsize power means that they will either consume or destroy any disruptors threatening that position. Absent checks on their control, they possess an extraordinary amount of leverage over what AI becomes.

We as a society have always been wary of consolidated power in the hands of a few. If we want AI to best benefit its users and ensure our cybersecurity, addressing control of the data marketplace is an obvious starting point. Transparency and oversight are critical to the best long-term outcome for everyone. But the few companies that control the vast majority of data want to maintain their comfortable status quo. Diverting lawmakers’ attention away from antitrust policies and the bread-and-butter work of data regulation to the dazzling promises or threats of AI only helps them maintain it.

While we can appreciate that the biggest players in this field are trying to get out ahead of the problem with their recent “voluntary commitments” to principles of “safety, security, and trust,” voluntary commitments tend to be vaguely defined and difficult to enforce. Above all, these AI commitments don’t change the preliminary need to address data concentration in this market overall if they are going to have any real meaning.

Another critical issue for us to figure out now concerning data is online child safety. We are becoming even more keenly aware of the effect online services have on our kids. Social media and other tech services, for example, have been linked to elevated rates of depression, anxiety, isolation and suicide among children. TikTok challenges have even led to a slew of teenage deaths. Sadly, we have done very little to curtail the harm tech companies cause to our children. Worse, none of the policies surrounding AI help quell the concern.


Fortunately, Congress can continue to evaluate the future of AI while still attending to current market imbalances and harms to children. For example, Senators Lee and Klobuchar are leading the bipartisan AMERICA Act to tackle Big Tech companies’ anticompetitive behavior and consolidation of the ad-tech market while also increasing transparency on their data management practices. Concurrently, the DOJ has filed suit against Google to address its specific monopolization of ad tech.

Meanwhile, two other bipartisan bills, the Kids Online Safety Act and the Protecting Kids on Social Media Act, curtail use of children’s online data that can risk their mental or physical wellbeing. These and other discrete data-centric policies have clear potential, but they risk being overshadowed if leadership only has eyes for AI.

In sum, a solid foundation in data policy is the best way to ensure AI is optimized over the long term. Congress must not let myopic anticipation of the future be the enemy of the opportunities to build critical data policy now.

Kate Forscey is a contributing fellow for the Digital Progress Institute and principal and founder of KRF Strategies LLC. She has served as senior technology policy advisor for Congresswoman Anna G. Eshoo and policy counsel at Public Knowledge. 
