Data Lineage for AI

by Daniel Mercery
Publication Date: 08/01/2026

  $7.99

As artificial intelligence systems increasingly rely on large, complex, and externally sourced datasets, organizations are under growing pressure to prove where training data originated, how it was transformed, and whether it was used appropriately. Without defensible data lineage, AI systems become difficult to audit, explain, or regulate.


Data Lineage for AI is a technical and practical guide for engineers, compliance teams, and risk professionals responsible for managing and documenting the origins of AI training data. The book explains how lineage enables transparency, accountability, and regulatory compliance across the AI lifecycle.


This volume focuses on operational methods for capturing provenance and maintaining traceability from raw data ingestion through feature engineering and model training. It connects lineage practices directly to audit readiness, investigation support, and regulatory expectations.


Key areas covered include:



  • What data lineage means in AI and machine learning contexts

  • Capturing provenance across data pipelines and transformations

  • Tooling and architectures for lineage logging and storage

  • Linking datasets to specific models and training runs

  • Using lineage as audit evidence for compliance reviews

  • Supporting regulatory inquiries and incident investigations


Written for practitioners operating in regulated or high-risk environments, this book provides concrete techniques to make AI data usage transparent, defensible, and verifiable without slowing delivery teams.

ISBN:
9798232905057
Category:
Internet: general works
Publication Date:
08-01-2026
Language:
English
Publisher:
Daniel Mercery

This item is delivered digitally
