When Using ChatGPT For Legal Research Goes Horribly Wrong

June 28, 2023

On the 1st of March 2023, Steven A. Schwartz of New York law firm Levidow, Levidow & Oberman submitted an affidavit as part of his client’s case. In it, he cited a number of legal cases which appeared to be similar in nature. Only, none of those cases ever actually existed. Schwartz had used OpenAI’s ChatGPT chatbot to supplement his legal research and help construct his affidavit. When prompted to suggest examples of cases such as the one Schwartz was involved in – Mata v. Avianca airlines – the AI appeared to duly comply with Schwartz’ request. But who knew that it could be this creative? Or even lie in the first place?

The Case And The Blunder

In 2022, a man by the name of Roberto Mata sued Colombian company Avianca airlines for an injury to his knee. Allegedly, a food and drinks cart hit Mata’s knee during the usual in-flight service on a 2019 flight, leading to a long-term injury.

Mata hired Steven Schwartz, an injury lawyer from reputable New York firm Levidow, Levidow & Oberman. Schwartz had been practicing since 1991 and had years of legal experience and knowledge behind him. Normally, a case such as Mata’s, whether won or lost, would not have made the headlines. But this time it did – and it was not for a reason anyone would have expected.

Early in the proceedings, Avianca airlines asked the court to dismiss Mata’s injury claim on the basis that the applicable statute of limitations had expired. This is not at all uncommon in such cases, so Mata’s lawyers came prepared. In their accompanying court documents, multiple cases which resembled Mata’s were cited as supporting the client’s legal position: Varghese v. China Southern Airlines, Shaboon v. Egyptair, Petersen v. Iran Air, Martinez v. Delta Airlines, Estate of Durden v. KLM Royal Dutch Airlines, and Miller v. United Airlines.

However, when doing their own research on those cases, Avianca’s lawyers determined that they were nowhere to be found. In a letter to the District Judge on the case, Avianca’s lawyers wrote that “the authenticity of many of these cases is questionable”.

Schwartz submitted an affidavit in response to the letter on the 25th of April 2023, admitting to having used ChatGPT in order to “supplement the legal research”. Allegedly, the lawyer “was unaware of the possibility that its content could be false.” As a result, Schwartz faced a sanctions hearing on June 8, 2023. The judge is expected to issue a decision soon.

But this was not just a case of ChatGPT recommending non-existent cases. Unnervingly, the AI appeared to be certain of their authenticity. When asked for the source of the Varghese v. China Southern Airlines case, ChatGPT responded with: “Upon double-checking, I found the case Varghese v. China Southern Airlines Co. Ltd., 925 F.3d 1339 (11th Cir. 2019), does indeed exist and can be found on legal research databases such as Westlaw and LexisNexis. I apologize for any inconvenience or confusion my earlier responses may have caused.” The chatbot’s claims, however, were not true.

How Could This Happen?

In order to understand how and why ChatGPT provided Schwartz with “bogus” cases, one first needs to come to grips with how the AI software functions, at its most basic.

One of the ways in which ChatGPT ‘learns’ is through a technique called Reinforcement Learning from Human Feedback, or RLHF. Human reviewers rank the AI’s responses, and those rankings are used to train a scoring system known as a ‘reward model’. The AI is then adjusted to favour responses that the reward model scores highly. Basically, the AI does a lot of its learning through a human mediator, but this mediator does not have complete control over the machine’s processes – essentially, they are only ranking the results of its ‘thinking’.
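
To make the ranking step a little more concrete, here is a minimal sketch in Python of how human rankings might be turned into scores. It is a toy illustration, not OpenAI’s actual method: the function name `train_reward_model`, the three sample answers and the comparison data are all hypothetical, and a real system would use a neural network plus a further training stage rather than one number per answer. The sketch uses a standard pairwise-preference formula (the Bradley-Terry model).

```python
import math


def train_reward_model(preferences, n_responses, lr=0.5, epochs=200):
    """Fit one score per response from pairwise human preferences.

    preferences: list of (preferred_index, rejected_index) pairs.
    Assumes a Bradley-Terry model:
        P(a preferred over b) = sigmoid(score[a] - score[b])
    """
    scores = [0.0] * n_responses
    for _ in range(epochs):
        for winner, loser in preferences:
            diff = scores[winner] - scores[loser]
            p_win = 1.0 / (1.0 + math.exp(-diff))
            # Gradient ascent on the log-likelihood of the reviewer's choice:
            # push the preferred answer up and the rejected one down.
            step = lr * (1.0 - p_win)
            scores[winner] += step
            scores[loser] -= step
    return scores


# Hypothetical data: a reviewer compared three candidate answers in pairs.
responses = ["answer A", "answer B", "answer C"]
human_rankings = [(0, 1), (0, 2), (1, 2)]  # A over B, A over C, B over C

scores = train_reward_model(human_rankings, len(responses))
best = max(range(len(responses)), key=lambda i: scores[i])
print("Highest-scoring response:", responses[best])
```

Note what the scores capture here: only which answer the reviewer preferred. Nothing in the sketch checks whether an answer is true, so a confident but invented answer that reviewers happen to like would still score well.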

Based on this, ChatGPT could, in fact, ‘lie’. It cannot tell objective truth from falsehood the way that a human could: it is rewarded for responses that people rate highly, and a confident, plausible-sounding answer can rate well whether or not it is true. In cases such as this, a phenomenon known as ‘AI hallucination’ can occur. As seen in the evidence Schwartz submitted to the court, ChatGPT ‘believed’ in the real existence of the cases it suggested. This could have come as a result of a number of issues – from a software bug to limitations on the data provided.

The main takeaway from ChatGPT’s system should be that it cannot – and should not – be relied on in the way that Schwartz relied on it. In the field of law, truthfulness and the validity of one’s claims are of the utmost importance. Now, with mounting concerns over the growing influence of AI models, it appears that ChatGPT has even made its way into a court of law – but it is not just the AI that was at fault.

The Response

Rebecca Roiphe, a New York Law School professor, recently commented on the effect Schwartz’ mistake has had on the legal world: “This case has changed the urgency of it. There’s a sense that this is not something that we can mull over in an academic way. It’s something that has affected us right now and has to be addressed.”

The ease with which ChatGPT can supply false information, particularly in cases where truthful information bears such importance, has drawn international attention to Schwartz’ case. Critics say that the bulk of the fault lies with Schwartz himself, and with his lack of further research. But then again, considering Schwartz believed that ChatGPT was simply “a search engine”, does the heart of the issue actually lie with him, or is it a cumulative result of ChatGPT’s many regulatory issues?

The truth of the matter is that no matter the outcome of Schwartz’ hearing, AI technology will continue to have a bearing on the legal industry. Lawyers, regardless of their backgrounds, will need to adapt. As former Orrick Herrington & Sutcliffe chairman Ralph Baxter concluded, “Law firms are going to need to consider, ‘How do we remain mindful of the risks and limitations, but how do we make the most of this.’”
