A Python package for removing duplicate text in clinical notes or other documents
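
To make "removing duplicate text" concrete, here is a minimal sketch of sentence-level deduplication across a sequence of notes. It is not this package's API: the function name `dedupe_sentences`, the naive regex sentence splitter, and the exact-match-after-normalization rule are all assumptions chosen for illustration, and a real implementation would likely use fuzzier matching.

```python
import re


def dedupe_sentences(notes):
    """Drop sentences already seen in an earlier note (or earlier in the same note).

    Hypothetical helper for illustration only; not the package's actual API.
    """
    seen = set()
    result = []
    for note in notes:
        kept = []
        # Naive sentence split: break on whitespace that follows ., ! or ?
        for sentence in re.split(r"(?<=[.!?])\s+", note.strip()):
            # Normalize case and whitespace so trivially different copies match.
            key = " ".join(sentence.lower().split())
            if key and key not in seen:
                seen.add(key)
                kept.append(sentence)
        result.append(" ".join(kept))
    return result


notes = [
    "Patient reports chest pain. No fever.",
    "Patient reports chest pain. Started aspirin.",
]
cleaned = dedupe_sentences(notes)
```

In this sketch the second note keeps only the sentence that did not already appear in the first, which mirrors the copy-forward pattern common in clinical notes.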
I/O, HDF5 storage, lazy loading, and processing pipelines for EHRs and generic PyTrees, aimed at JAX-friendly tasks (mainly ML)