4.4. Legislation

To analyze the legal aspects of generative AI, I had a conversation with my friend Maciej Mańturz about the upcoming European Union legislation aimed at regulating AI. We discussed how the new law will impact startups looking to incorporate AI, and what entrepreneurs need to know to stay ahead.

Maciej is a lawyer and a specialist in privacy. The common branches of law never truly resonated with him, and he never envisioned himself in a courtroom. Eventually, he joined a major corporation, which opened his eyes to the intersection of technology and the business world. He came to believe that a lawyer should not be an obstacle but a facilitator of business initiatives.

Since then, he has pursued further education and earned certifications in privacy and broader tech law, covering areas like contracts, intellectual property, and a touch of cybersecurity. He has also worked on AI, culminating in a postgraduate thesis on the EU’s proposed AI framework, which I read while preparing for this discussion.

Kamil Nicieja: We studied together, so you know I left law school behind. I’ve often reflected on what initially attracted me to law. Once I shifted from law to coding and then to business—which isn’t that different from law—it became clear to me: lawyers are like coders, but they deal with incredibly complex “syntax” and “run” their code in a slow and unpredictable system: the courts.

Now, with the emergence of advanced language models, coding feels more and more like drafting laws. You type in a command, and really, it’s anyone’s guess what the outcome will be. Do you see the parallels?

Maciej Mańturz: It’s interesting that you draw those connections; the parallels are both apt and increasingly acknowledged. This year, I attended a series of talks with lawyers well versed in the intersection of tech and law, covering fields like intellectual property, cybersecurity, and AI. One specialist had launched a postgraduate course that teaches tech-centric law alongside coding skills: understanding how an app works, some programming basics, and the software development lifecycle. It’s said to draw inspiration from Western European models and others worldwide.

I read an article proposing that lawyers are a natural fit for coding classes too, aside from the math-heavy parts. The logical nature of both professions likely supports this view. It’s also becoming evident that modern lawyers should stay current with tech trends, which highlights the value of cross-disciplinary expertise. I only wish such a mindset had been mainstream when we started our studies.

We’re meeting at a significant time. It’s been five years since the EU introduced GDPR. Now, they’re working on another major tech regulation: the AI Act, or AIA. For those not closely following European legislation, could you explain the main goals of this new bill? Do you know when it might be implemented?

The final EU Artificial Intelligence Act is expected to be adopted near the end of 2023. It’s clear that AI is no longer just a theoretical idea. Nowadays, you can’t browse social media without coming across news about AI, whether it’s a new advancement or a tool set to change our lives.

There are genuine concerns about the potential negative effects of this technology on the average person. Deepfakes are a prime example of potential misuse. Training AI models, especially using deep learning methods, requires vast amounts of data. Personal information can be as valuable as gold, which is why many social media platforms can feel so invasive of our privacy.

There’s a tug-of-war between laws protecting our privacy and businesses looking to profit while offering us these innovative services, and our rights are the trade-off. Regulations that are too strict might drive businesses away, while ones that are too lax could jeopardize our rights and security.

Just like with GDPR, this new regulation isn’t just for companies based in the EU—if you’ve got users in Europe, you’ve got to comply. It’s pretty clear the target is largely U.S. and Chinese tech giants. I’ve come across some in tech circles saying the EU is essentially in a cold war with other major powers, using regulation as their weapon of choice because they can’t go toe-to-toe on tech innovation. Can you break down why, when the EU rolls out a big tech regulation, it becomes a global must-follow? What’s stopping U.S. and Chinese companies from just giving it the cold shoulder?

GDPR surprisingly set a global precedent, and, you know, it kinda worked, right? Many new privacy regulations are modeled after it, even if it’s a challenge for businesses. In the realm of data privacy, there’s ongoing tension about transferring personal data to the US. In essence, every few years, NGOs, led by Max Schrems, prompt the ECJ to declare the US framework incompatible with GDPR. Then, governments negotiate until the EU bodies approve a new adequacy decision. This back-and-forth often revolves around US laws related to accessing data for national security reasons.

The same is seen with companies like Meta, which continue to operate despite GDPR-related fines. Perhaps the EU’s global influence and potential profits are compelling reasons to keep pushing boundaries. It seems the EU is willing to compromise on GDPR to ensure data transfers, which is also a strategic decision in the global tech race. From a company’s standpoint, it’s simpler to comply with these standards if they aim to operate within the EU. And the EU is a significant market, so nobody can just drop it. But then again, I’m no business guru.

The bill sorts AI systems into two major camps: those that pose a “high risk” and everything else. So what does the EU mean by significant risk when it comes to AI? Does this mean you can basically snooze on this bill if you’re developing just another text-summarization app, but you’d better pay attention if your AI could potentially harm people or be used to discriminate against them on a large scale? Like in HR or finance?

The AI Act adopts a risk-based approach, considering various points in the product chain, from creators to deployers. While it’s up to you to categorize your product’s risk level, disagreements with regulators could result in hefty penalties. Given these potential consequences, as a lawyer, I would advise against dismissing the AIA’s requirements.

Even products deemed lower risk must adhere to the AI Act’s foundational principles, like explainability, privacy, and security. As you mentioned, there are certain activities classified as “high risk” or outright prohibited in the Act. These should be immediate red flags for any company. Businesses should consult legal experts to ensure they aren’t inadvertently falling into these high-risk categories.

Let’s talk about some tangible scenarios. Suppose I aim to create a foundational model on par with GPT-4. For instance, consider Europe’s prominent AI enterprise, Mistral, and its new LLM. How does the AIA assess the risk associated with such a venture?

When evaluating the requirements of such a system, a layered approach is essential. The initial step involves determining its placement within the risk classification spectrum. The “high risk” category relates to systems that could significantly jeopardize a person’s health, safety, or fundamental rights.

Perhaps that’s not a bad call, given that researchers found Mistral’s model can provide information on topics like bomb construction, suggesting a potential shortfall in its safety measures.

Yup. And similar to GDPR, the penalties under the AIA are significant. I think they can reach up to 40 million euros or 7% of global annual revenue. This makes the cost of non-compliance even steeper than in the realm of personal data.

OK, next example: a B2B product that uses AI to monitor the daily activities of employees across platforms like Slack, Outlook, Teams, and Jira, then compiles a daily company-wide summary. Would this be classified as high-risk? What are the reasons for or against this classification?

Certainly, this situation could be viewed as high risk since employment is explicitly labeled as such in the AIA annexes. Surveillance also poses additional challenges from a GDPR standpoint.

Wow, color me surprised. I personally didn’t think this would be a huge deal. Let’s consider one final scenario: a chat application where 500 million users can converse with virtual representations of celebrities like, say, LeBron James about basketball. Would this be deemed high-risk or low-risk?

I’d argue that this sounds like a low-risk situation. Generally, chatbots don’t fall under the high-risk category. The intent behind these regulations is to prevent misuse or exploitation in areas of public interest, such as welfare, employment, and safety.

However, it’s important to note that creating a virtual likeness of someone must respect intellectual property rights. It’s also now widely understood that users should be informed when they’re interacting with a machine rather than a human. The AIA would actually classify such a virtual likeness as a deepfake. Additionally, there will be another EU legal act addressing civil liability for damages, which should also be kept in mind.

What guidance would you offer to a standard AI startup in Europe? Given that such companies are often small, their financial landscape can be challenging. They might have secured some funding, but it’s equally likely they haven’t. On top of engineering expenses, they’re also faced with the potential costs of legal counsel. Is it wiser for them to address legal matters upfront and, if so, how can they do so affordably? Or should they prioritize gaining traction, securing external investment, and then allocating funds for legal guidance?

As someone specialized in privacy, I’d stress that any solution aiming for a European launch should adhere to the privacy-by-design principle, especially if it involves personal data. The financial repercussions for not complying with EU regulations are steep and can be quite daunting.

Currently, it seems plausible that only major entities could significantly impact the AI sector due to these regulatory hurdles. It’s uncertain whether they’d even choose the EU as a base for AI development given these challenges. If startups are to thrive in this environment—and I hope they can—it’s wise to seek at least basic legal counsel early on. While a full legal team might not be necessary initially, gaining a foundational understanding of expected requirements is crucial. A good starting point might be recommendations from regulatory bodies like the UK’s ICO.

Gotcha. I personally believe VC investors can play a significant role here by providing startups in their portfolio with complimentary legal consultations as a value-add. It’s far more efficient for a VC fund to employ lawyers who can assist several startups simultaneously rather than each startup seeking individual legal counsel. Some VCs did that with GDPR.

Oh, and while we’re at it… If I’m already compliant with GDPR, does that mean I’m in the clear with the AIA as well?

No, not really.

If I ask ChatGPT to elaborate on who Kamil Nicieja is, am I making it non-compliant with GDPR? And what if I build an app that uses OpenAI’s API and ChatGPT as the engine—am I straying into dangerous legal territory too?

For a casual request from an ordinary individual, it likely isn’t a major concern, potentially falling under the household exemption for personal data processing. However, if you develop an app, the situation becomes more complex as it enters the realm of business data usage. You’d need to establish a framework between companies, determine the legal basis for processing, and address other details. In essence, while it’s possible to navigate this legally, it would require effort and careful planning.

I’d imagine that if I strictly adhere to the GDPR, I should be compliant when using user data for AI training, especially if I’ve secured processing consent and taken similar measures.

While there are similarities between the AIA and GDPR, the current landscape isn’t as straightforward as businesses might hope. Common principles like transparency and security are present in both regulations, but their interpretations might differ. Some aspects of the two regulations even seem contradictory, even though the GDPR was crafted to be tech-neutral.

Generally speaking, the GDPR’s standards are stricter. So, starting with GDPR compliance can provide a solid foundation for meeting some of AIA’s essential requirements, including privacy. Fundamentally, any solution should be designed with privacy as a central focus, in line with the privacy-by-design and privacy-by-default principles, which are integral to the GDPR, regardless of AI involvement.

Furthermore, the GDPR’s provisions on automated decision-making involving personal data introduce a stricter set of requirements when these decisions might impact an individual’s rights. For example, explicit consent is needed instead of just standard consent.

Where does generative AI fit into this whole equation?

It depends. Generally, the AIA does distinctly define generative AI and imposes additional stipulations. Like all tools, its risk must be assessed, and it must adhere to general principles. However, the AIA also mandates particular transparency measures, such as disclosing AI-generated content or sharing summaries of copyrighted training data.

Isn’t this a bit premature? Generative AI has only been mainstream for less than a year, and the EU is already keen on regulating it. Regulation naturally curtails innovation. While the EU claims it aims to regulate only the large, high-risk models, leaving space for research and startups with smaller models, what if this approach is flawed? The larger models are the epicenters of innovation at this moment.

The issue is undeniably influenced by geopolitical factors, and Europe seems to be trailing. Currently, the US and China lead the AI race. China has already implemented some AI regulations, and its unique standing might enable faster advancements in AI sectors. Given this, Europe’s AIA might already be lagging.

However, I’d be concerned if the regulation hinders technological growth. Scientific research seems better positioned, as the AIA indicates certain exceptions for such work. Startups might face more challenges, but from the EU’s viewpoint, the primary goal is to stay competitive while protecting its customers—and markets.

Japan has decided that using datasets to train AI models does not infringe copyright law. As a result, model trainers can access publicly available data without licenses or permissions from the data owners. Does the AIA provide any direction on handling copyright issues in training data?

As I mentioned before, the AIA explicitly outlines transparency requirements for generative AI and its associated training data. While there are general provisions for intellectual property rights protection within the regulation, it isn’t the primary focus.

In the EU, actions related to this are mainly governed by two exceptions for text and data mining (TDM) in the 2019 Copyright in the Digital Single Market (CDSM) Directive. These exceptions address TDM for scientific research, covered in Article 3, and what’s sometimes termed “commercial” TDM, highlighted in Article 4. For AI models like Midjourney, DALL-E, or Firefly, the relevant regulation to reference is the commercial TDM exception. So if you’re looking for guidance, I’d look there.

Generative AI has really heated things up in the realm of automated agents—agents are AI-driven characters that use large language models to mimic basic autonomous reasoning. I read in your article about “automatic influence on the individual’s situation,” which apparently flags an AI system as high-risk. That sounds a lot like what I described earlier. Does this mean the EU’s gonna put the brakes on developing autonomous agents?

Not necessarily. The term “automatic decision making” originates from the GDPR. While there’s a higher standard for processing personal data in this manner, it’s still achievable with the explicit consent of the individual involved. Typically, companies avoid this approach since maintaining such consent can be challenging. They often introduce human intervention to sidestep fully automated decision-making processes.

Even where implementing a human element in an AI system isn’t feasible, the GDPR doesn’t prohibit autonomous agents, provided there is specific consent or a power of attorney. Based on my understanding of the upcoming regulation, the AIA doesn’t present obstacles either. While there are safety measures and conditions to meet for any AI solution, with some tailored to autonomous agents, I’m unaware of any explicit restrictions imposed by EU regulators on such initiatives.

Your article talks about how the AIA mandates that high-risk systems be “transparent and understandable to users,” but let’s be real—most AI systems are black boxes. Stuff goes in, stuff comes out, and what happens in between is anyone’s guess.

Now, I get the sense that making tech companies clarify their AI’s decision-making process has been a major sticking point. But you’re saying the AIA might not actually require a crystal-clear explanation, just that companies need to give users the lowdown on touchy subjects like how hallucinations work and be upfront about the training data, right?

Certainly, this appears to be a central and somewhat paradoxical issue from a business standpoint. Given AI’s nature, it’s often challenging to pinpoint precisely how a system, based on various inputs, reaches a specific output. Yet, EU regulators advocate for the explainability principle, implying a clearer understanding.

The silver lining is that the AIA acknowledges this dilemma and, as you noted, doesn’t demand the impossible from developers. It primarily necessitates that entities involved in the AI system’s lifecycle can articulate its foundational principles in layman’s terms and describe the kind of data leading to specific outcomes. This might also encompass offering an alternate prediction or providing the context behind a decision. In this sense, it bears similarities to transparency and fairness principles.

Ultimately, the interpretation will hinge on regulatory guidance. History has shown that some interpretations can be more stringent than necessary. However, the latest version of the AIA offers some protection, and over time, a unified approach should emerge across the EU. It’s evident that regulators are actively engaged in the AI evolution, with some already providing guidance on AI’s interplay with the GDPR. Again, I’d specifically point to the insights from the UK’s ICO as particularly valuable.

Thank you for this discussion! I’ve gained a lot from it, and I’m confident our readers will benefit as well.

You’re welcome. Thank you as well!

A few days after my chat with Maciej, President Joe Biden signed an executive order outlining guidelines for generative AI.

The approach this executive order adopts differs from the European Union's. Instead of directly addressing risk, it zeroes in on the computational power of the machines used to build foundational AI models. The underlying belief is that the real threat comes from massive supercomputers with hefty price tags, not small startups operating out of garages.

Many on Twitter have criticized this view as lacking foresight. They argue that before long, the computational capabilities exclusive to these supercomputers will be within reach of garage startups. Critics also point out that regulations are hard to reverse, and this particular one may become outdated more quickly than most.