[URTeC 2021 Paper] Use of Machine Learning Production Driver Cross-Sections for Regional Geologic Insights in the Bakken-Three Forks Play

Machine learning explainability tools like SHAP values offer a powerful way to understand how geologic performance drivers vary across a play.

Talk Details:
- Tuesday, July 27th at 11:15 AM | Room 361
- Theme 3: Emerging Geological Evaluations, Tools & Workflows: Data Driven Methods
- AUTHORS: T. Cross, K. Sathaye, J. Chaplin (Novi Labs)

Production driver cross-sections covering the Bakken play in North Dakota. In the north, structural depth is a large negative for production. In the play core, low water saturations and low clay volumes provide a strong positive contribution. In the southwest, lower TOC values for the Lower and Upper Bakken source rock shales negatively impact production.


Objectives/Scope: Petroleum geologists have long employed cross-sections as tools to understand subsurface structures, show variation in log properties, and provide regional context for local analysis. In this study we use cross-sections as our technique for visualizing a novel dataset generated with machine learning models: geologic production drivers. These “machine learning cross-sections” show not just regional variation in geologic properties but what the model has learned about the changing impact of those properties on hydrocarbon production along the cross-section.

Methods/Procedures/Process: We trained a decision-tree-based model that uses completions, geology, and spacing data to predict oil, gas, and water production in the Bakken-Three Forks play of North Dakota. The subsurface training variables included oil in place, net to gross, porosity, water saturation, and clay volume. We trained a surrogate model to produce an explanation dataset for the predictions, using Shapley values, a method becoming increasingly widespread within the machine learning community. Finally, we generated synthetic multiwell unit developments along cross-sections stretching from the southwestern to the northeastern edges of the play, in order to extract these Shapley values at evenly spaced points along each cross-section.
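To make the Shapley step concrete, here is a minimal sketch that computes exact Shapley values for a toy model at evenly spaced points along a transect. It is not the model or surrogate used in the study: `toy_predict`, its coefficients, the endpoint property values, and the linear interpolation between endpoints are all hypothetical stand-ins, and exact enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to a baseline point."""
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for a coalition of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Coalition features take the point's value; the rest stay at baseline.
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_feature = dict(without)
                with_feature[name] = x[name]
                total += weight * (predict(with_feature) - predict(without))
        phi[name] = total
    return phi


def toy_predict(f):
    """Hypothetical production response; coefficients are invented for illustration."""
    score = 100.0
    score -= 200.0 * max(0.0, f["water_saturation"] - 0.45)
    score += 1000.0 * max(0.0, 0.145 - f["clay_volume"])
    score += 50.0 * f["porosity"] * (1.0 - f["water_saturation"])
    return score


def transect(start, end, n):
    """n evenly spaced points, linearly interpolating properties between two ends."""
    return [
        {k: start[k] + (end[k] - start[k]) * i / (n - 1) for k in start}
        for i in range(n)
    ]


# Hypothetical endpoint properties (fractions), not values from the study.
start = {"water_saturation": 0.55, "clay_volume": 0.17, "porosity": 0.05}
end = {"water_saturation": 0.35, "clay_volume": 0.13, "porosity": 0.08}
baseline = start

points = transect(start, end, 5)
rows = [shapley_values(toy_predict, p, baseline) for p in points]
for i, phi in enumerate(rows):
    print(i, {k: round(v, 3) for k, v in phi.items()})
```

At every point the contributions sum to the difference between the prediction there and the prediction at the baseline, which is the property that lets per-feature contributions be plotted along a cross-section.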

Results/Observations/Conclusions: The cross-sections identify differing production drivers for the Bakken and Three Forks sweet spots. For the Three Forks, the key driver for the outperformance in the Nesson area is water saturation, with the critical threshold at 45% water saturation. For the Bakken, we see a more diverse set of production drivers, but clay volume ranks the highest both on the Nesson and in the Parshall-Sanish field. Though clay volume varies only between 13% and 17% along the cross-section, the Shapley contribution of clay volume increases sharply when clay drops below 14.5%. We speculate that this reflects not only greater frackability but potentially paleobathymetric highs that have persisted and now sit as foci of migration, contributing to overpressure.
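One simple way to locate a threshold of this kind in a Shapley dataset is to fit a single-step function: try every split of the property values and keep the one that minimizes the squared error of the contributions on each side. The sketch below assumes that approach; the study does not state how its thresholds were identified, `best_step_threshold` is a hypothetical helper, and the numbers are made up so that the step falls near the clay value quoted above.

```python
from statistics import fmean


def best_step_threshold(values, contributions):
    """Return the split on `values` that best separates `contributions` into two levels."""
    pairs = sorted(zip(values, contributions))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no valid split between identical property values
        left = [c for _, c in pairs[:i]]
        right = [c for _, c in pairs[i:]]
        left_mean, right_mean = fmean(left), fmean(right)
        sse = sum((c - left_mean) ** 2 for c in left)
        sse += sum((c - right_mean) ** 2 for c in right)
        midpoint = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, midpoint)
    return None if best is None else best[1]


# Invented example data: clay volume (fraction) and its Shapley contribution.
clay = [0.130, 0.135, 0.140, 0.144, 0.146, 0.150, 0.160, 0.170]
contribution = [12.0, 11.5, 12.5, 11.0, 2.0, 1.5, 1.0, 0.5]

threshold = best_step_threshold(clay, contribution)
print(f"estimated clay threshold: {threshold:.3f}")
```

A single step is a deliberately crude model; a gradual ramp in the contributions would call for a different fit.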

Applications/Significance/Novelty: While the focus of this study is the Bakken, the workflow presented here generalizes to other basins and other purposes. Shapley values can provide a machine learning-based perspective on the importance of geoscientific data to the “target” of interest. Incorporation of explainability datasets with traditional methods like cross-sections can provide powerful context for understanding subsurface variability and its impact.

Interdisciplinary Components: This study incorporates work from geoscience and data science.

You can find an early version of this study on our ML in Oil & Gas Blog here.

