“Upsizing completions increases production 37.5%*!!!” Sure, but when?
Where that asterisk lands will have a huge impact on the returns of any completions decision. If it’s peak rate, the uplift might shorten payback. If it’s EUR, you might have a really valuable well.
Machine learning models can be trained to predict a time series of production. This means you can evaluate the impact of completions through time, to see when completions have the biggest impact on well performance. Long story short, using a single scaling factor for something that should be a time series can ruin your forecasts. This is the subject of our upcoming URTeC 2020 paper, Deriving Time-Dependent Scaling Factors for Completions Parameters in the Williston Basin using a Multi-Target Machine Learning Model and Shapley Values.
The process: time-series predictions and Shapley Values
How can our models estimate the impact of completions variables through time? What the machine learning model tries to predict is known as the “target.” In many instances within oil and gas, machine learning models will be single-target: they will predict a single value for each well, often peak rate, 365-day cum, or EUR. We caution against using EUR as a target, as it already has interpretation and assumptions built into it (it is an Estimate after all!).
By contrast, a multi-target model predicts multiple production values. Our standard approach is to predict cumulative oil at 30-day increments out to IP day 720, but we have been experimenting with 5-day increments and predictions out to later days. Multi-target models provide several advantages: they capture not just a single estimate but the shape of the curve, which is critical to well planning and economics.
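To make the target layout concrete, here is a minimal sketch of how one well's cumulative-oil history could be turned into a row of 24 targets (IP day 30 through 720). The helper names, the linear interpolation, and the choice to leave targets empty when a well lacks enough history are all assumptions for illustration, not a description of our production pipeline.

```python
from bisect import bisect_left

# Target IP days: every 30 days out to day 720, which gives 24 targets.
TARGET_DAYS = list(range(30, 721, 30))

def cumulative_at(days, cum_oil, day):
    """Cumulative oil (bbl) at `day`, linearly interpolated from samples.

    `days` must be sorted ascending; cumulative oil is assumed to be 0 at day 0.
    Returns None when the well has not produced long enough to reach `day`.
    """
    if day > days[-1]:
        return None  # not enough production history for this target
    i = bisect_left(days, day)
    if days[i] == day:
        return cum_oil[i]
    # Interpolate between the previous sample (or the origin) and sample i.
    d0, c0 = (days[i - 1], cum_oil[i - 1]) if i > 0 else (0, 0.0)
    d1, c1 = days[i], cum_oil[i]
    return c0 + (c1 - c0) * (day - d0) / (d1 - d0)

def target_row(days, cum_oil):
    """One multi-target row: cumulative oil at every day in TARGET_DAYS."""
    return [cumulative_at(days, cum_oil, d) for d in TARGET_DAYS]

# Hypothetical well with only 90 days of history.
row = target_row([15, 45, 90], [1500.0, 4500.0, 9000.0])
print(row[:4])
```

A single-target model would keep only one entry of this row; the multi-target model learns all of them, which is what preserves the shape of the curve.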
Additionally, multi-target models enable investigation of the time-series impact of input parameters. This means that you can start to answer not just whether a design choice is impactful but when it is.
To estimate feature impact, we employ Shapley values. Shapley values are generated by training a model on top of our standard tree-based model. This “surrogate” model estimates how much each feature (proppant loading, fluid loading, closest-neighbor spacing, etc.) contributed to the model prediction for that IP day. Shapley values are expressed in units of barrels or mcf (in the case of a gas model). Our customers have found tremendous value in using Shapley values for explainability and transparency, whether downloaded from Novi Cloud or accessed through Production Modeler. Shapley values aren’t a substitute for running hypothetical designs through the model (what our Prediction Engine software is designed for), but they are a great tool for examining the model and investigation/research purposes — like this URTeC paper!
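For readers who have not worked with Shapley values before, the sketch below computes them exactly from the textbook definition for a tiny, made-up predictor. It is not how our pipeline computes them: the toy model, its coefficients, the feature names, and the "baseline design" used to stand in for absent features are all hypothetical. It is only meant to show that each feature gets a share of the prediction, in barrels, and that the shares add up to the difference between the prediction and the baseline.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction by enumerating coalitions.

    Features outside a coalition are set to their `baseline` value.
    Cost grows exponentially with the number of features, so this is
    only practical for toy examples.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        inputs = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(inputs)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {name}) - value(members))
        phi[name] = total
    return phi

def toy_cum_oil(f):
    """Hypothetical cumulative oil (bbl) at a single IP day."""
    return (40000.0 + 10.0 * f["proppant_lbs_ft"] + 200.0 * f["fluid_bbl_ft"]
            + 0.1 * f["proppant_lbs_ft"] * f["fluid_bbl_ft"]
            - 1500.0 * f["wells_per_section"])

well = {"proppant_lbs_ft": 1400.0, "fluid_bbl_ft": 30.0, "wells_per_section": 8.0}
base = {"proppant_lbs_ft": 1000.0, "fluid_bbl_ft": 20.0, "wells_per_section": 6.0}

phi = shapley_values(toy_cum_oil, well, base)
for feature, barrels in phi.items():
    print(f"{feature}: {barrels:+.0f} bbl")
```

With a multi-target model, the same calculation is repeated for each IP day, which yields one Shapley value per feature per day.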
Example: changing impact of proppant through time
Our model-build pipeline produces Shapley values in terms of cumulative oil, but we transformed them into rate values for this study. Bucketing the wells by proppant loading, we can see that the largest designs have, on average, a huge uplift over small designs in early days (48%), but a much smaller one in later days (27%).
The difference is even more acute for the >1400 lbs/ft designs compared to the 800-1000 lbs/ft designs: at IP120, the Shapley values show +15%, but by IP720, it’s only ~6%.
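The cumulative-to-rate transformation mentioned above can be sketched roughly as follows. This is an assumption about one reasonable way to do it, not the exact procedure from the paper: difference the cumulative Shapley values between consecutive IP days, divide by the interval length to get an average daily rate contribution, then express that as a percentage of the average rate over the same interval. The function names and all numbers are made up for illustration.

```python
def cumulative_to_rate(days, cum_shap):
    """Convert cumulative Shapley values (bbl) at each IP day into
    average daily rate contributions (bbl/day) over each interval.

    The contribution at day 0 is assumed to be zero.
    """
    rates = []
    prev_day, prev_val = 0, 0.0
    for day, val in zip(days, cum_shap):
        rates.append((val - prev_val) / (day - prev_day))
        prev_day, prev_val = day, val
    return rates

def percent_of_average(rate_shap, avg_rate):
    """Express rate contributions as a percentage of the average rate."""
    return [100.0 * s / r for s, r in zip(rate_shap, avg_rate)]

# Hypothetical proppant Shapley values for one well (cumulative bbl).
ip_days = [30, 60, 90]
proppant_cum_shap = [900.0, 1500.0, 1800.0]
# Hypothetical model-average rate over the same intervals (bbl/day).
average_rate = [300.0, 250.0, 200.0]

proppant_rate_shap = cumulative_to_rate(ip_days, proppant_cum_shap)
uplift_pct = percent_of_average(proppant_rate_shap, average_rate)
print(proppant_rate_shap)
print(uplift_pct)
```

In this made-up case the cumulative contribution keeps growing while the rate contribution shrinks in every interval, which is exactly the kind of pattern a single scaling factor hides.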
How does proppant compare to spacing and geology through time?
Proppant, of course, isn’t the only predictive feature in our model. You can use Novi outputs to make these charts for any variable, and can dive deeper by filtering to specific areas or design groups. We find it useful to sum the Shapley values over our spacing and geology features; because we often train on 10+ spacing features and 10+ geology features, it is necessary to lump their Shapley values for comparison to completions parameters.
Below, I have filtered the dataset to just wells with >1000 lbs/ft proppant drilled at >6 wells per section spacing. I also took the absolute value of the Shapley values to show relative impact. Proppant quickly ramps up to become the feature with the highest contribution to the model prediction, but it falls below geology at IP330. By IP720, spacing has passed fluid/ft and is about to pass proppant/ft. Stage length Shapley values are only ~2% on average relative to model average production rate, but I did not filter to short stages, which would show higher values here.
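The lumping and absolute-value steps described above might look something like the sketch below. The feature names and group definitions are hypothetical, and the order of operations (sum within a group for each well, take the absolute value, then average across wells) is an assumption; taking absolute values before summing would give different numbers.

```python
# Hypothetical mapping from comparison group to model feature names.
FEATURE_GROUPS = {
    "proppant": ["proppant_lbs_ft"],
    "spacing": ["spacing_closest_neighbor", "spacing_wells_per_section"],
    "geology": ["geo_thickness", "geo_porosity"],
}

def grouped_abs_impact(shap_rows, groups):
    """Mean absolute group-level Shapley value at one IP day.

    `shap_rows` holds one dict per well, mapping feature name to its
    Shapley value. For each well the values in a group are summed first,
    then the absolute value is taken, then the result is averaged over wells.
    """
    totals = {group: 0.0 for group in groups}
    for row in shap_rows:
        for group, features in groups.items():
            totals[group] += abs(sum(row[f] for f in features))
    return {group: total / len(shap_rows) for group, total in totals.items()}

# Two made-up wells at a single IP day (bbl/day).
wells = [
    {"proppant_lbs_ft": 40.0,
     "spacing_closest_neighbor": -10.0, "spacing_wells_per_section": -5.0,
     "geo_thickness": 12.0, "geo_porosity": 8.0},
    {"proppant_lbs_ft": 20.0,
     "spacing_closest_neighbor": -20.0, "spacing_wells_per_section": 5.0,
     "geo_thickness": -6.0, "geo_porosity": -4.0},
]

impact = grouped_abs_impact(wells, FEATURE_GROUPS)
print(impact)
```

Running this once per IP day gives the curves you would plot to compare proppant, spacing, and geology through time.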
Conclusions & Paper Information
- Multi-target models provide a powerful tool to investigate the impact of design choices on oil and gas production through time.
- Wells with large completions designs show the largest uplift over small ones at IP120; the uplift steadily diminishes through IP720.
- Geology and spacing increase in importance relative to completions designs with time.
- This workflow is repeatable for any Novi model across a wide range of plays.
Paper Details
Deriving Time-Dependent Scaling Factors for Completions Parameters in the Williston Basin using a Multi-Target Machine Learning Model and Shapley Values
Control ID: 3103
URTeC 2020, Wednesday, July 22, Morning Session, Theme 8: EUR and Performance Prediction: Decline Curve Analysis and Beyond III