
Two parallel projects under one team: (1) text-based classification of historical drilling-loss events from 27 years of well reports, and (2) a CNN autoencoder for visually lossless compression of Acoustic Impedance graphs. A Responsible-AI POC layered LIME and SHAP over both models so geoscientists could trust the output.
Python developer and data scientist. Built the supervised text classifier, the CNN autoencoder, and the Responsible-AI framework. Worked directly with geoscientist labelers and onboarded two interns.
Worked with subject-matter experts to weight-label ambiguous loss events. A supervised classifier trained on the labeled set raised F1 from 0.62 (the unsupervised baseline) to 0.84.
Trained a visually lossless CNN autoencoder on years of Acoustic Impedance traces. It cut storage by ~85%, and geoscientists could not distinguish the reconstructed output from the original.
Built a Responsible-AI POC that surfaces feature attribution for every prediction. Identified two latent biases in the training data, both corrected in v2.
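To give a feel for what "feature attribution for every prediction" means on a text classifier, here is a minimal sketch of leave-one-token-out (occlusion) attribution. It is a simplified stand-in for the perturbation idea behind tools like LIME, not LIME or SHAP themselves and not the project's code. The keyword weights, the `toy_score` function, and the sample report text are all hypothetical and invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical keyword weights standing in for a trained classifier.
TOY_WEIGHTS: Dict[str, float] = {"losses": 2.0, "fractured": 1.5, "mud": 1.0}


def toy_score(tokens: List[str]) -> float:
    """Toy 'loss event' score: sum of keyword weights (not a real model)."""
    return sum(TOY_WEIGHTS.get(tok, 0.0) for tok in tokens)


def occlusion_attribution(
    tokens: List[str], score: Callable[[List[str]], float]
) -> List[Tuple[str, float]]:
    """Attribute the score to each token as the drop in score when it is removed."""
    base = score(tokens)
    return [
        (tok, base - score(tokens[:i] + tokens[i + 1:]))
        for i, tok in enumerate(tokens)
    ]


# Invented example sentence, not taken from any real well report.
report = "total mud losses in fractured limestone".split()
attributions = occlusion_attribution(report, toy_score)

# Print tokens from most to least influential.
for tok, contrib in sorted(attributions, key=lambda pair: -pair[1]):
    print(f"{tok:>10s} {contrib:+.2f}")
```

Reading the ranked tokens next to each prediction is the kind of check that can expose a model leaning on an irrelevant word, which is one way a data bias might show up.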
F1 of 0.84, up from 0.62 unsupervised. Now flagged as the production model on three active rigs.
CNN-autoencoded impedance graphs replaced raw PNG storage with no visual loss.
LIME / SHAP audit surfaced two systematic biases in the training data, both corrected before deployment.
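For readers less familiar with the F1 figures quoted above, the sketch below shows how an F1 score is computed from predictions. It assumes a binary simplification (loss event vs. no loss event); the labels are toy values made up for illustration and are not the project's data or results.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): 1 = loss event, 0 = no loss event.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 balances precision against recall, it penalizes both missed loss events and false alarms, which makes it a reasonable headline metric when positive events are rare.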