“Upsizing well completions increases production 37.5%*!!!” Sure, but when? Where that asterisk lands has a huge impact on the returns of your well completions design. If it’s peak rate, that might shorten payback. If it’s EUR, now you might have a really valuable well.
Oil and gas machine learning models can be trained to predict a time series of production. This means you can evaluate the impact of any oil well completion through time, to see when a completions design has the biggest impact on well performance. Long story short, using a single scaling factor for something that should be a time series can ruin your oil & gas forecasts.
This is the subject of our URTeC 2020 paper: Deriving Time-Dependent Scaling Factors for Completions Parameters in the Williston Basin using a Multi-Target Machine Learning Model and Shapley Values.
The Well Completions Optimization process: time-series predictions and Shapley Values
How can our models estimate the impact of completions variables through time? What the machine learning model tries to predict is known as the “target.” In oil and gas, many machine learning models are single-target. That is, they predict a single value for each well, often peak rate, 365-day cum, or EUR. We caution against using EUR as a target, as it already has interpretation and assumptions built into it. It is an Estimate after all!

By contrast, a multi-target model predicts multiple production values. Our standard approach had been to predict cumulative oil at 30-day increments out to IP day 720. However, we can now forecast at 5-day increments and out to later IP days. Multi-target models provide several advantages: they capture not just a single estimate but the shape of the curve, which is critical to well planning and economics.
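As a minimal sketch of what "multi-target" means for the training data, the snippet below turns one well's daily oil volumes into a vector of cumulative-oil targets at 30-day increments out to IP day 720. The function name `cumulative_targets` and the synthetic decline curve are illustrative assumptions, not part of our actual pipeline.

```python
from itertools import accumulate

def cumulative_targets(daily_rates, step=30, last_day=720):
    """Return cumulative production at IP day step, 2*step, ..., last_day.

    daily_rates[i] is the volume produced on IP day i + 1 (hypothetical input).
    """
    if len(daily_rates) < last_day:
        raise ValueError("need at least last_day days of production")
    cum = list(accumulate(daily_rates))  # cum[i] = total through IP day i + 1
    return [cum[day - 1] for day in range(step, last_day + 1, step)]

# Synthetic well: 1,000 bbl/d declining 0.3% per day (made-up numbers).
rates = [1000.0 * (0.997 ** d) for d in range(720)]
targets = cumulative_targets(rates)
print(len(targets))  # one target per 30-day increment
```

Calling `cumulative_targets(rates, step=5)` on the same series would produce the finer 5-day grid; each element of the returned list is one target the model learns to predict.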
Oil well design matrix with Multi-Target Time-Series machine learning model

Additionally, multi-target models enable investigation of the time-series impact of input parameters. This means that you can start to answer not just whether an oil well design choice is impactful but when it is.
To estimate feature impact, we employ Shapley values. Shapley values are generated by training a model on top of our standard tree-based model. This “surrogate” model estimates how much each feature (proppant loading, fluid loading, closest-neighbor spacing, etc.) contributed to the model prediction for that IP day. Shapley values are expressed in units of barrels or mcf (in the case of a gas model).
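To make the idea of a per-feature contribution in barrels concrete, here is a small sketch that computes exact Shapley values for a single prediction by enumerating every feature ordering. This is a swapped-in brute-force technique that is only practical for a handful of features, not the surrogate-model approach described above. The toy formula, feature names, and numbers are all hypothetical.

```python
from itertools import permutations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction via all feature orderings.

    predict  : function mapping a dict of feature values to a number
    x        : feature values of the well being explained
    baseline : reference feature values (e.g. an average design)
    """
    names = list(x)
    phi = {n: 0.0 for n in names}
    for order in permutations(names):
        current = dict(baseline)
        prev = predict(current)
        for n in order:
            current[n] = x[n]           # switch feature n from baseline to actual
            new = predict(current)
            phi[n] += new - prev        # marginal contribution in this ordering
            prev = new
    n_orders = factorial(len(names))
    return {n: total / n_orders for n, total in phi.items()}

def toy_model(f):
    """Made-up cumulative-oil formula (bbl) standing in for a trained model."""
    return (40_000 + 30 * f["proppant_lbs_ft"] + 500 * f["fluid_bbl_ft"]
            + 10 * f["spacing_ft"]
            + 0.2 * f["proppant_lbs_ft"] * f["fluid_bbl_ft"])

baseline = {"proppant_lbs_ft": 1000, "fluid_bbl_ft": 20, "spacing_ft": 800}
well = {"proppant_lbs_ft": 1400, "fluid_bbl_ft": 30, "spacing_ft": 600}

phi = shapley_values(toy_model, well, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.0f} bbl")
```

The contributions sum to the difference between the well's prediction and the baseline prediction, which is what lets them be read directly in barrels.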
Our customers have found tremendous value in using Shapley values for explainability and transparency, whether downloaded from Novi Cloud or accessed through Production Modeler. Shapley values aren’t a substitute for running hypothetical designs through the model (what our Forecast Engine software is designed for), but they are a great tool for examining the model and investigation/research purposes — like this URTeC paper!
Well Completions Example: changing impact of proppant through time
Our model-build pipeline produces Shapley values in terms of cumulative oil. For this study, we transformed them into rate values. Bucketing the wells by proppant loading, we can see that the largest designs have, on average, a huge uplift over small designs in early days (48%), but a noticeably smaller one in later days (27%).

The difference is even more acute for the >1400 lbs/ft designs compared to the 800-1000 lbs/ft designs. At IP120, the Shapley values show +15%, but by IP720, it’s only ~6%. This is the path towards a solid well design optimization model.
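One plausible way to turn cumulative Shapley values into rate values is to difference them between consecutive IP days, relying on Shapley values being additive across targets. The sketch below does that and expresses the result as a percentage of the average predicted rate. The helper name `cumulative_to_rate` and every number are hypothetical, and the exact transformation used in the paper may differ.

```python
def cumulative_to_rate(ip_days, cum_values):
    """Difference cumulative values into average daily rates per interval.

    Returns (interval_end_day, rate) pairs; the first interval starts at day 0.
    """
    rates = []
    prev_day, prev_cum = 0, 0.0
    for day, cum in zip(ip_days, cum_values):
        rates.append((day, (cum - prev_cum) / (day - prev_day)))
        prev_day, prev_cum = day, cum
    return rates

ip_days = [30, 60, 90, 120]
# Hypothetical cumulative-oil Shapley values for proppant (bbl).
cum_shap = [1500.0, 2700.0, 3600.0, 4200.0]
# Hypothetical average predicted cumulative oil across the model (bbl).
cum_avg = [15000.0, 27000.0, 36000.0, 43500.0]

shap_rate = cumulative_to_rate(ip_days, cum_shap)
avg_rate = cumulative_to_rate(ip_days, cum_avg)
for (day, s), (_, a) in zip(shap_rate, avg_rate):
    print(f"IP{day}: {s:.0f} bbl/d, {100 * s / a:.1f}% of average rate")
```

Averaging these percentages within each proppant bucket, at each IP day, is one way to build uplift-through-time comparisons like the ones quoted above.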
How does proppant compare to spacing and geology through time?
Proppant, of course, isn’t the only predictive feature in our model. You can use Novi outputs to make these charts for any variable, and can dive deeper by filtering to specific areas or well design groups. We find it useful to sum the Shapley values over our spacing and geology features; because we often train on 10+ spacing features and 10+ geology features, it is necessary to lump their Shapley values for comparison to completions parameters.
Below, I have filtered the dataset to just wells with >1000 lbs/ft proppant drilled at >6 wells per section spacing. I also took the absolute value of the Shapley values to show relative impact. Proppant quickly ramps up to become the feature with the highest contribution to the model prediction, but it falls below geology at IP330. By IP720, spacing has passed fluid/ft and is about to pass proppant/ft. Stage length Shapley values are only ~2% on average relative to model average production rate, but I did not filter to short stages, which would show higher values here.
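The lumping step can be sketched as follows: Shapley values for individual spacing and geology features are summed into one value per group, then the absolute value is taken so groups can be ranked by relative impact. The feature names, group mapping, and values are hypothetical, and summing before taking the absolute value is an assumption about the order of operations.

```python
from collections import defaultdict

# Hypothetical mapping from individual features to reporting groups.
FEATURE_GROUPS = {
    "neighbor_distance_ft": "spacing",
    "wells_per_section": "spacing",
    "porosity": "geology",
    "thickness_ft": "geology",
    "water_saturation": "geology",
}

def grouped_abs_impact(shap_by_feature, groups):
    """Sum Shapley values within each group, then take absolute values.

    Features missing from `groups` are reported under their own name.
    """
    totals = defaultdict(float)
    for feature, value in shap_by_feature.items():
        totals[groups.get(feature, feature)] += value
    return {group: abs(total) for group, total in totals.items()}

# Made-up rate Shapley values (bbl/d) for one well at one IP day.
shap = {
    "proppant_lbs_ft": 35.0, "fluid_bbl_ft": 12.0, "stage_length_ft": -3.0,
    "neighbor_distance_ft": -20.0, "wells_per_section": -15.0,
    "porosity": 18.0, "thickness_ft": 9.0, "water_saturation": -4.0,
}

impact = grouped_abs_impact(shap, FEATURE_GROUPS)
for group, value in sorted(impact.items(), key=lambda kv: -kv[1]):
    print(f"{group}: {value:.0f} bbl/d")
```

Repeating this for every IP day and averaging across the filtered wells gives one curve per group, which is the kind of comparison described above.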

Well Completion Design Conclusions & Paper Information
- Multi-target models provide a powerful tool to investigate the impact of oil well drilling design choices on oil and gas production through time.
- Wells with a large completions design show the largest uplift over small ones at IP120; the uplift steadily diminishes through IP720.
- Geology and oil well spacing increase in importance relative to completions designs with time.
- This workflow is repeatable for any Novi model across a wide range of plays.
Deriving Time-Dependent Scaling Factors for Completions Parameters in the Williston Basin using a Multi-Target Machine Learning Model and Shapley Values. Control ID: 3103
URTeC 2020, Wednesday, July 22, Morning Session, Theme 8: EUR and Performance Prediction: Decline Curve Analysis and Beyond III