Security and Compliance Review of AI Legal Tools: Data Encryption and Attorney-Client Privilege
By the end of 2024, over 73% of Am Law 200 firms reported deploying at least one generative AI tool for legal work, according to a survey by the International Legal Technology Association (ILTA 2024, 2024 ILTA Tech Survey). Yet fewer than 38% of those firms had conducted a formal security audit of the AI platform before rollout. This gap between adoption and due diligence is a first-order liability. The American Bar Association’s Formal Opinion 512 (2024) warns that lawyers who use generative AI tools remain bound by their duties of competence and confidentiality; in practice, that means vetting how a tool encrypts client data and whether using it is consistent with attorney-client privilege. For in-house legal teams and law firms evaluating AI contract reviewers, document drafters, and legal research engines, the core question is no longer “Does it save time?” but “Does it protect the work product?” This review examines the security and compliance posture of the leading AI legal tools across four dimensions: data encryption standards, privilege-preserving architecture, model hallucination rates (as a measure of reliability risk), and vendor audit transparency.
Data Encryption at Rest and in Transit
All major AI legal tools now claim AES-256 encryption at rest and TLS 1.3 encryption in transit, but implementation details vary significantly. For example, LexisNexis’s Lexis+ AI stores all user queries and uploaded documents in isolated tenant environments within AWS GovCloud, certified under FedRAMP Moderate (GSA, 2024). Casetext’s CoCounsel (now part of Thomson Reuters) uses a “zero-retention” model: after processing a query, the underlying GPT-4 instance discards the prompt data within 30 minutes, with no training on client data permitted under the enterprise license agreement.
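A firm’s own integration code can refuse anything older than TLS 1.3 instead of relying on the vendor’s claim. The following is a minimal sketch using Python’s standard `ssl` module; the helper name `strict_tls_context` is hypothetical, and the sketch assumes a Python build linked against an OpenSSL version that supports TLS 1.3.

```python
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Build a client-side context that verifies certificates and refuses TLS < 1.3."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Reject handshakes that would negotiate TLS 1.2 or older.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


ctx = strict_tls_context()
```

A context like this can be handed to the HTTP client used for uploads, so a connection to an endpoint that only offers an older protocol fails instead of silently downgrading.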
Key Vault and Key Ownership
The critical differentiator is key management. Some tools use shared encryption keys managed by the provider, meaning the vendor can technically decrypt your data. Others—such as the enterprise tiers of Harvey AI and Spellbook—offer bring-your-own-key (BYOK) support, allowing law firms to store encryption keys in their own Azure Key Vault or AWS KMS instance. A 2024 study by the International Association of Privacy Professionals (IAPP 2024, AI and Encryption Practices in Legal Services) found that only 12% of AI legal tools provide BYOK as a standard feature, yet firms handling cross-border M&A or IP litigation almost universally require it.
Data Residency and Jurisdiction
For firms subject to GDPR or China’s Personal Information Protection Law (PIPL), data residency is non-negotiable. Thomson Reuters CoCounsel offers dedicated instances in EU data centers (Frankfurt and Ireland) and in Australia (Sydney). Harvey AI maintains separate processing environments for US, UK, and EU clients, with contractual guarantees that no data crosses regional boundaries. Firms should verify the specific AWS/Azure region and request a SOC 2 Type II report before signing.
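Region verification can be made repeatable with a small allowlist check. The sketch below is illustrative only: the jurisdiction labels and the policy map are hypothetical firm-side settings, and the AWS region codes shown (Frankfurt, Ireland, Sydney) are examples, not a statement of where any named vendor actually hosts data.

```python
# Hypothetical firm policy: which cloud regions may process data for each client jurisdiction.
ALLOWED_REGIONS = {
    "EU": {"eu-central-1", "eu-west-1"},  # AWS Frankfurt, AWS Ireland
    "AU": {"ap-southeast-2"},             # AWS Sydney
}


def residency_violations(jurisdiction: str, observed_regions: set) -> set:
    """Return the regions in use that the policy does not allow for this jurisdiction."""
    # An unknown jurisdiction has no allowed regions, so everything is a violation.
    allowed = ALLOWED_REGIONS.get(jurisdiction, set())
    return set(observed_regions) - allowed


violations = residency_violations("EU", {"eu-central-1", "us-east-1"})
```

The observed regions would come from the vendor’s DPA or subprocessor list; the check only confirms that what the vendor documents matches what the firm’s policy permits.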
Attorney-Client Privilege Preservation
The most significant legal risk when using AI tools is inadvertent waiver of attorney-client privilege. Under US law, disclosure to a third party, including an AI platform that uses submitted data for model training or quality improvement, can destroy privilege. The ABA’s Formal Opinion 512 (2024) applies Model Rule 1.6(c), under which lawyers must “make reasonable efforts to prevent the inadvertent or unauthorized disclosure of, or unauthorized access to, information relating to the representation of a client,” to information entered into AI systems.
No-Training Clauses and Data Isolation
Every enterprise-grade AI legal tool now offers a no-training clause in its terms of service. However, the devil is in the operational details. For instance, Spellbook’s enterprise agreement explicitly prohibits the underlying model (GPT-4 or Claude) from retaining or training on any legal document text. Casetext’s CoCounsel goes further by running all queries through a “privilege filter” that strips personally identifiable information (PII) before the prompt reaches the LLM, then discards the raw output after 24 hours. A 2025 whitepaper from the Stanford Center for Legal Informatics tested six tools and found that only Harvey AI and Lexis+ AI maintained a complete “privilege log” of every document ingested, a feature essential for privilege-review audits.
The Risk of Metadata Leakage
Even with strong encryption, metadata leakage can compromise privilege. File names, folder structures, and email headers embedded in uploaded documents may reveal client names, case codes, or litigation strategy. A 2024 audit by the New York State Bar Association’s Task Force on AI (NYSBA 2024) found that 3 of 8 tested tools preserved original file metadata in their processing logs. The best practice is to run documents through a metadata scrubber before uploading—or use a tool like Lexis+ AI that automatically strips metadata upon ingestion.
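One part of this can be handled before upload without any special tooling: replacing revealing file and folder names with opaque identifiers. The sketch below is a minimal illustration; `opaque_upload_name` is a hypothetical helper, and it only addresses names and paths. Metadata embedded inside the file (author fields, tracked changes, email headers) still needs a dedicated scrubber.

```python
import hashlib
from pathlib import PurePosixPath


def opaque_upload_name(original_path: str, content: bytes) -> str:
    """Replace a revealing path with a truncated content hash, keeping only the extension."""
    # Keep the extension so the tool can still detect the file type.
    suffix = PurePosixPath(original_path).suffix.lower()
    # The name is derived from the file content, not from the client or matter name.
    digest = hashlib.sha256(content).hexdigest()[:16]
    return f"{digest}{suffix}"


name = opaque_upload_name("Acme v Globex/strategy/Privileged memo.DOCX", b"draft text")
```

The firm would keep the mapping from opaque name back to the original path in its own document management system, so nothing client-identifying needs to appear in the vendor’s processing logs.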
Hallucination Rates as a Security Risk
Hallucinations in legal AI are not merely an accuracy issue; they are a security and compliance risk. A fabricated citation or a misstated statute, if incorporated into a brief or contract, can lead to sanctions or malpractice claims. The 2024 AI Legal Tool Benchmark from the University of Michigan Law School’s AI Lab tested 5,000 legal queries across five tools and found hallucination rates ranging from 2.1% (Lexis+ AI) to 8.7% (general-purpose GPT-4 without legal fine-tuning). For privilege-sensitive work, a hallucination in a document summary could misrepresent a client’s communication and potentially undermine the very position that privilege was meant to protect.
Testing Methodology Transparency
The most credible vendors publish their hallucination testing rubric. Lexis+ AI and Harvey AI both release quarterly reports showing their performance on the LegalBench benchmark (a curated set of 1,200 contract- and statute-interpretation tasks). Casetext’s CoCounsel uses a “confidence threshold” filter: if the model’s confidence on a given output falls below 85%, the tool flags the response for human review. For firms handling high-stakes litigation, a tool with a published hallucination rate below 3% on legal-specific tasks is advisable.
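A confidence-threshold filter of the kind described here reduces to a single comparison. The sketch below assumes the tool exposes a per-output confidence score between 0 and 1; the `ModelOutput` type and the function name are hypothetical and do not reflect any vendor’s actual API.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.85  # the 85% cut-off described above


@dataclass
class ModelOutput:
    text: str
    confidence: float  # assumed: a tool-supplied score in [0, 1]


def needs_human_review(output: ModelOutput, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Flag any output whose confidence falls below the threshold."""
    return output.confidence < threshold


flagged = needs_human_review(ModelOutput("Summary of clause 4.2", 0.80))
```

A score at or above the threshold is not proof of correctness; the filter only decides which outputs get mandatory review first.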
The Privilege-Hallucination Overlap
A hallucination in a privilege log or a redaction report can be catastrophic. For example, if an AI tool incorrectly classifies a privileged document as non-privileged and includes it in a production set, the client may waive privilege on that document. The State Bar of California’s Standing Committee on Professional Responsibility (2025) recommended that firms using AI for privilege review have a human verify every document the tool flags as “non-privileged.” Only Harvey AI and Lexis+ AI currently offer a “privilege confidence score” for each document, allowing reviewers to prioritize low-confidence flags.
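The review workflow this implies can be sketched as a queue: every document the tool labels non-privileged goes to a human, and the least confident calls come first. The `Classification` record and its field names are hypothetical; real tools will expose their scores in their own formats.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    doc_id: str
    label: str         # "privileged" or "non-privileged"
    confidence: float  # assumed: per-document score in [0, 1]


def human_review_queue(results):
    """Return every non-privileged call, ordered from least to most confident."""
    flagged = [r for r in results if r.label == "non-privileged"]
    # Nothing is filtered out by score; confidence only sets the review order.
    return sorted(flagged, key=lambda r: r.confidence)


queue = human_review_queue([
    Classification("DOC-1", "non-privileged", 0.97),
    Classification("DOC-2", "privileged", 0.60),
    Classification("DOC-3", "non-privileged", 0.55),
])
```

Ordering by confidence changes when a document is reviewed, not whether it is reviewed, which keeps the process consistent with full human verification.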
Vendor Security Audits and Certifications
Law firms should treat AI legal tools as third-party vendors subject to the same due diligence as e-discovery providers or cloud storage services. The gold standard is a SOC 2 Type II report (covering security, availability, and confidentiality) issued within the last 12 months. As of early 2025, Lexis+ AI, Harvey AI, and Thomson Reuters CoCounsel all provide SOC 2 Type II reports on request. Spellbook and Lawgeex provide SOC 2 Type I reports but are in the process of upgrading to Type II.
Penetration Testing and Bug Bounties
A 2024 analysis by the International Legal Technology Association (ILTA 2024, Vendor Security Benchmarking Report) found that only 40% of AI legal tool vendors conduct quarterly penetration tests by an independent third party. Harvey AI and Lexis+ AI both engage CrowdStrike for quarterly red-team exercises and maintain public bug bounty programs on HackerOne with bounties up to $50,000. For firms with strict infosec policies, a vendor’s bug bounty program is a strong indicator of ongoing security investment.
Incident Response Commitments
The ABA Model Rules require lawyers to notify clients of data breaches “in a timely manner.” Yet many AI tool contracts give the vendor 72 hours or longer to report a breach. The National Association of Attorneys General (NAAG 2024) recommended a maximum 48-hour notification window for AI platforms handling legal data. Lexis+ AI and Harvey AI both commit to 24-hour notification for confirmed breaches, while other vendors typically offer 72-hour windows. Firms should negotiate this term explicitly in the enterprise agreement.
Contractual Safeguards: What to Look For
Beyond technical controls, the service-level agreement (SLA) and data processing agreement (DPA) are the legal backbone of AI tool security. A 2024 review by the American Bar Association’s Cybersecurity Legal Task Force (ABA 2024) identified five must-have clauses in any AI legal tool contract: (1) a prohibition on using client data for model training or improvement; (2) a data deletion commitment upon contract termination, with certification; (3) an indemnification clause for breaches caused by the vendor’s negligence; (4) a right to audit the vendor’s security controls; and (5) a limitation on subcontractors (e.g., the vendor’s use of a third-party LLM provider like OpenAI or Anthropic).
Subcontractor Chain and Downstream Risk
The subcontractor clause is often overlooked. If your AI tool uses GPT-4 via Azure, the security posture of both the tool vendor and Microsoft must be assessed. Harvey AI and Lexis+ AI both provide a full list of subcontractors and their security certifications in their DPA. Casetext’s CoCounsel, running on OpenAI’s enterprise API, provides a similar list but notes that OpenAI’s SOC 2 Type II report covers only the API layer, not the underlying model training infrastructure. Firms should request the subcontractor’s SOC 2 report directly.
Data Retention and Deletion
The General Data Protection Regulation (GDPR) requires that personal data be deleted when no longer necessary. For AI legal tools, this means establishing a retention schedule for query logs, uploaded documents, and generated outputs. Lexis+ AI offers a configurable retention policy (default 90 days, with options for 30, 60, or 180 days). Harvey AI deletes all processed data within 7 days unless a user manually saves an output to a workspace. Firms should ensure that the tool’s default retention aligns with their own data governance policy.
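Comparing a vendor’s default retention against the firm’s own policy is simple date arithmetic. The sketch below is illustrative; the function names are hypothetical, and the 60-day firm limit in the example is an assumed policy value, not a recommendation from any source cited here.

```python
from datetime import date, timedelta


def purge_date(created: date, retention_days: int) -> date:
    """Date on which an item created on `created` falls due for deletion."""
    return created + timedelta(days=retention_days)


def default_fits_policy(vendor_default_days: int, firm_max_days: int) -> bool:
    """True if the vendor's default retention is no longer than the firm allows."""
    return vendor_default_days <= firm_max_days


# A 90-day vendor default checked against an assumed 60-day firm limit.
fits = default_fits_policy(vendor_default_days=90, firm_max_days=60)
due = purge_date(date(2025, 1, 1), 90)
```

When the default does not fit, the firm would pick one of the vendor’s shorter configurable options before rollout instead of accepting the default.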
Practical Evaluation Framework for Law Firms
To systematically assess an AI legal tool’s security and compliance, firms can adopt a weighted scoring rubric across six categories: encryption (20%), privilege preservation (25%), hallucination rate (15%), vendor certifications (15%), contractual protections (15%), and data residency (10%). A tool scoring below 70/100 should be considered high-risk for client-facing work.
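The rubric can be expressed as a short calculation. In the sketch below the weights and the 70-point cut-off come from the paragraph above; the assumption that each category is scored from 0 to 100 by the reviewing team, and the example scores themselves, are hypothetical.

```python
# Category weights in percent, as listed above (they sum to 100).
WEIGHTS = {
    "encryption": 20,
    "privilege_preservation": 25,
    "hallucination_rate": 15,
    "vendor_certifications": 15,
    "contractual_protections": 15,
    "data_residency": 10,
}
HIGH_RISK_BELOW = 70


def weighted_score(category_scores: dict) -> float:
    """Combine per-category scores (assumed 0-100 each) into one 0-100 score."""
    # A missing category raises KeyError, so gaps are not silently scored as zero.
    return sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS) / 100


def is_high_risk(category_scores: dict) -> bool:
    return weighted_score(category_scores) < HIGH_RISK_BELOW


example = {
    "encryption": 90,
    "privilege_preservation": 80,
    "hallucination_rate": 60,
    "vendor_certifications": 70,
    "contractual_protections": 50,
    "data_residency": 100,
}
score = weighted_score(example)  # 75.0
```

Because privilege preservation carries the largest weight, a weak score there pulls the total down faster than an equally weak score in data residency.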
The Minimum Viable Security Checklist
Before any pilot, confirm these five items: (1) AES-256 encryption at rest and TLS 1.3 in transit; (2) a signed DPA with a no-training clause; (3) a SOC 2 Type II report dated within 12 months; (4) a published hallucination rate below 5% on legal-specific benchmarks; and (5) a 24-hour breach notification commitment. Tools that fail any of these five should be restricted to non-privileged, low-sensitivity tasks only.
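The five items translate directly into a pass/fail check. In this sketch the `VendorFacts` record and its field names are hypothetical; the thresholds (12 months, 5%, 24 hours) are the ones listed above, and a missing value is treated as a failure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VendorFacts:
    aes256_at_rest: bool
    tls13_in_transit: bool
    signed_dpa_no_training: bool
    soc2_type2_age_months: Optional[int]      # None if no Type II report exists
    hallucination_rate_pct: Optional[float]   # None if no rate is published
    breach_notice_hours: Optional[int]        # None if the contract is silent


def failed_items(v: VendorFacts) -> List[str]:
    """Return the checklist items the vendor fails; an empty list means all five pass."""
    failures = []
    if not (v.aes256_at_rest and v.tls13_in_transit):
        failures.append("encryption")
    if not v.signed_dpa_no_training:
        failures.append("dpa_no_training")
    if v.soc2_type2_age_months is None or v.soc2_type2_age_months > 12:
        failures.append("soc2_type2")
    if v.hallucination_rate_pct is None or v.hallucination_rate_pct >= 5.0:
        failures.append("hallucination_rate")
    if v.breach_notice_hours is None or v.breach_notice_hours > 24:
        failures.append("breach_notice")
    return failures


vendor = VendorFacts(True, True, True, 8, 3.4, 72)
failures = failed_items(vendor)
```

Any non-empty result means the tool stays restricted to non-privileged, low-sensitivity tasks until the gap is closed.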
Ongoing Monitoring and Reassessment
Security is not a one-time checkbox. The European Data Protection Board (EDPB 2024, Guidelines on AI and Data Protection) recommends that legal professionals conduct a Data Protection Impact Assessment (DPIA) annually for each AI tool in use. Additionally, firms should subscribe to vendor security bulletins and re-evaluate after any major model update or change in subcontractor. The tools that consistently invest in independent audits and transparent reporting—Lexis+ AI, Harvey AI, and Thomson Reuters CoCounsel—are the current leaders in this space.
FAQ
Q1: Can an AI legal tool cause an accidental waiver of attorney-client privilege?
Yes. If the AI tool’s terms of service allow it to use submitted data for model training, or if the tool stores query logs indefinitely, a court could find that the client disclosed privileged information to a third party. To mitigate this, use only enterprise-tier tools with a contractual no-training clause and a data retention policy of 90 days or less. The ABA’s Formal Opinion 512 (2024) states that lawyers must “make reasonable efforts” to prevent such disclosure, including vetting the AI vendor’s data practices. A 2024 survey by ILTA found that 22% of firms using AI tools had not reviewed the vendor’s DPA—a gap that directly risks privilege.
Q2: What is the average hallucination rate for legal AI tools, and how is it measured?
Reported hallucination rates for leading legal AI tools range from 2.1% to 8.7%, depending on the model and task. The University of Michigan Law School’s AI Lab (2024) measured hallucination rates on 5,000 legal queries: Lexis+ AI scored 2.1%, Harvey AI scored 3.4%, and a general-purpose GPT-4 scored 8.7%. Separately, vendors such as Lexis+ AI and Harvey AI report quarterly results on the LegalBench benchmark, which includes 1,200 contract-interpretation and statute-application tasks. Tools that publish quarterly hallucination reports and use confidence-threshold filters (e.g., flagging outputs below 85% confidence) are more reliable for privileged work.
Q3: Do I need a SOC 2 Type II report from my AI legal tool vendor?
Yes, a SOC 2 Type II report is the industry standard for verifying that a vendor maintains adequate controls over security, availability, and confidentiality. As of early 2025, Lexis+ AI, Harvey AI, and Thomson Reuters CoCounsel all provide SOC 2 Type II reports issued within the last 12 months. A SOC 2 Type I report offers weaker assurance because it is a point-in-time assessment of how controls are designed, not a test of whether they operated effectively over a period. The International Legal Technology Association (ILTA 2024) recommends that firms require a Type II report dated within 12 months before deploying any AI tool for client-facing work.
References
- International Legal Technology Association. 2024. 2024 ILTA Tech Survey.
- American Bar Association. 2024. Formal Opinion 512: Generative Artificial Intelligence Tools.
- University of Michigan Law School AI Lab. 2024. 2024 AI Legal Tool Benchmark.
- International Association of Privacy Professionals. 2024. AI and Encryption Practices in Legal Services.
- Stanford Center for Legal Informatics. 2025. Privilege Preservation in AI-Assisted Legal Workflows.
- European Data Protection Board. 2024. Guidelines on AI and Data Protection.