How often does an engineer dash off a simple oil well water analysis, doing something like applying a flat WOR to their oil prediction? It’s easy to ignore water, but unexpectedly high production, leading to more produced water, can damage well economics. In the worst cases, it can force shut-in if disposal capacity is full.
Fortunately, advanced machine learning methods developed for oil can be applied to water. These techniques help disentangle the complex interactions of completions, geology, and spacing. This is the subject of our URTeC 2020 paper: Predicting Water Production in the Williston Basin using a Machine Learning Model.
Editor’s note: after URTeC, this paper was featured in JPT. Click through to learn more.
Oil Well Water Analysis Breakdown:
- Why care about produced water in oil and gas?
- Oil Well Water Analysis: What drives water production?
- Interactions between oil well completions and geology
- What has driven the average basin water increase?
- How accurate are the oil well water models?
- Bringing it together with Forecast Engine
- Conclusions
Why care about produced water in oil and gas?
In the Bakken-Three Forks play, operators have gone from drilling oil wells that produce some water to water wells that also produce oil. This is a serious threat for the unconventional oil & gas industry.

Something similar has been happening in other basins. Several factors contribute in complex, interacting ways:
- Stepping out beyond the core where water saturations are higher
- Upsized completions and tighter spacing
- Dealing with greater parent-child effects (and potential changes in relative perm)
Water production (and disposal) carries environmental risks, including induced seismicity and toxic spills. Reducing costs and mitigating risks requires an understanding of how design choices impact water production. Novi’s standard model pipeline builds a model to forecast water production. Our customers can also use Novi’s powerful explainability tools to understand water production.
Oil Well Water Analysis: What drives water production?
We’ll start with geology (your author is a geologist, even if he forgets sometimes!). For this model, we built a proprietary subsurface dataset starting with publicly available logs and ending with principal components.
We don’t just produce geoSHAP (total rock quality index) for oil; it’s available for water and gas, too, which makes it a dynamic oil well water analysis tool. The model identifies that the least water production occurs along the Nesson Anticline and in the Parshall-Sanish Field, both areas where oil migration likely played a significant role.

We employ Shapley Additive exPlanations (SHAP) to estimate how much each feature (aka variable) impacted the model prediction. We rolled out SHAP values last year, and they quickly became one of our powerhouse products. SHAP values are useful on the well level (how did the model come up with its forecast for this well?) and at regional scale. (E.g., how does the impact of fluid loading on every well compare to proppant, or a geology feature, or well spacing?)
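To make the idea concrete, here is a minimal sketch of how a Shapley attribution splits one prediction across features. Everything in it is an assumption for illustration: the toy model, the feature names, and the numbers are hypothetical, and the brute-force calculation against a single baseline well is not how production SHAP tooling for tree models works (those tools use much faster algorithms and average over the data).

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features missing from a coalition are held at their baseline value.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_feature = dict(without)
                with_feature[name] = x[name]
                total += weight * (predict(with_feature) - predict(without))
        phi[name] = total
    return phi

# Hypothetical toy model: water (bbl) from completions inputs and a geology score
def toy_water_model(w):
    return (1000.0 * w["fluid_bbl_per_ft"]
            + 5.0 * w["proppant_lb_per_ft"]
            + 200.0 * w["fluid_bbl_per_ft"] * w["water_proneness"])

# Hypothetical baseline ("average") well and the well being explained
baseline = {"fluid_bbl_per_ft": 12.0, "proppant_lb_per_ft": 600.0, "water_proneness": 0.5}
well = {"fluid_bbl_per_ft": 20.0, "proppant_lb_per_ft": 900.0, "water_proneness": 0.8}

phi = shapley_values(toy_water_model, well, baseline)
for feature, value in phi.items():
    print(f"{feature}: {value:+.0f} bbl")
```

Each value is in barrels of water, and the fluid-geology interaction term in the toy model is shared between the two features involved.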
SHAPnado to analyze produced water drivers
A common way to display SHAP values is what we call the SHAPnado chart. It’s similar to a standard tornado chart, but with more information. With our tree-based models, the simplest prediction you could make would be the average of the input data. With additional information, you could then move the prediction away from the average by some positive or negative barrels of water (or oil, or mcf). SHAP values, then, are expressed in units of production and represent a delta away from the model set average. Here are the key details to remember when looking at a SHAPnado:
- Each dot represents a well
- The color of the dot shows the feature value (e.g., the wells in the very top right of the chart have high fluid per ft and have high SHAP values for fluid per ft).
- The spread of the cloud of wells shows data density.
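The "delta away from the average" idea can be sketched for a single well as follows. The average, the feature names, and the SHAP values are made-up numbers for illustration, not output from the Williston model.

```python
# Hypothetical numbers for one well, in bbl of water
model_average = 60000.0
shap_deltas = {
    "fluid_per_ft": 14000.0,
    "proppant_per_ft": 3500.0,
    "geology": -9000.0,
    "spacing": 2500.0,
}

def prediction_from_shap(average, deltas):
    """Rebuild a prediction as the model average plus every feature's delta."""
    return average + sum(deltas.values())

def tornado_order(deltas):
    """Sort features by absolute impact, largest first, as a tornado chart would."""
    return sorted(deltas, key=lambda name: abs(deltas[name]), reverse=True)

prediction = prediction_from_shap(model_average, shap_deltas)
ranked = tornado_order(shap_deltas)
print(prediction, ranked)
```

A SHAPnado repeats this for every well, so each feature gets a cloud of dots instead of a single bar.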

Completions fluid volume has a larger impact than proppant volume, as estimated by the SHAP values. This should not be a surprise. Completions fluid increases the total production of the well by promoting fracture growth and placing proppant. Additionally, what goes down must come up! At least in the oilfield.
Interactions between oil well completions and geology
How do completions and geology interact? Below, we crossplot completions fluid volume and proppant volume with their SHAP values. On the left, the fluid SHAP values show a larger range than do those for proppant. This reflects what we see in the SHAPnado (see previous section). An interesting interaction with geology happens as the completions are pushed out to larger sizes (beyond ~18 bbl/ft fluid and ~800 pounds/ft proppant). The wells are colored by the water geoSHAP values, with dark blue most water prone and green least water prone. The model has learned that large completions in water-prone rock will see more water production than large completions in oil-prone rock. That makes intuitive sense, but operators must keep that in mind as they step out beyond the play core.

What has driven the average basin water increase?
Let’s return to the plot from the start of the post: per-well average water production has been dramatically increasing in the basin — but why?
We can return to SHAP values for this question. Average SHAP values through time show how evolving trends in completions, geology, and spacing have impacted average production. From 2012 to 2017, the main driver of increased water production was upsized completions. Completions SHAP has since stabilized (and even shrunk), but geology and spacing have moved strongly into the positive, likely the result of operators moving out of the play core and going to tighter configurations.
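One way to build this kind of view is to group per-well SHAP values by year and average each feature group. The sketch below assumes each well record carries a year and one summed SHAP value per group; the records and field names are hypothetical and only illustrate the aggregation.

```python
from collections import defaultdict

# Hypothetical per-well records: year plus SHAP (bbl of water) by feature group
wells = [
    {"year": 2012, "completions": -8000.0, "geology": -1000.0, "spacing": -500.0},
    {"year": 2012, "completions": -6000.0, "geology": 1000.0, "spacing": -1500.0},
    {"year": 2017, "completions": 5000.0, "geology": 0.0, "spacing": 500.0},
    {"year": 2017, "completions": 7000.0, "geology": 2000.0, "spacing": 1500.0},
    {"year": 2019, "completions": 4000.0, "geology": 3000.0, "spacing": 2000.0},
    {"year": 2019, "completions": 5000.0, "geology": 5000.0, "spacing": 4000.0},
]

def mean_shap_by_year(records, groups=("completions", "geology", "spacing")):
    """Average each feature group's SHAP value across the wells in each year."""
    sums = defaultdict(lambda: dict.fromkeys(groups, 0.0))
    counts = defaultdict(int)
    for rec in records:
        counts[rec["year"]] += 1
        for group in groups:
            sums[rec["year"]][group] += rec[group]
    return {
        year: {group: sums[year][group] / counts[year] for group in groups}
        for year in sorted(sums)
    }

trend = mean_shap_by_year(wells)
for year, means in trend.items():
    print(year, means)
```

Because SHAP values are deltas from the model average, a group whose yearly mean crosses from negative to positive has gone from pulling water production below average to pushing it above.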

How accurate are the oil well water models?
The answer is — almost as accurate as oil models! For this Bakken-Three Forks model, our oil model stabilizes around 19% MAPE, and water around 24% MAPE. We calculate error for the test set, which is a random 20% split out of the data. Error is slightly higher at early days for both models, due to extra noise in the early life of the well. It’s pretty common for us to see higher error in the water model — something about water production is inherently noisier than oil production (fracking through to wet zones? completions interference?).
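For readers who want to reproduce this style of check on their own data, here is a minimal sketch of a random 20% holdout and a MAPE calculation. It assumes MAPE means mean absolute percentage error; the function names and sample values are hypothetical, and the exact error definition used in the paper may differ.

```python
import random

def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle the wells and hold out a random fraction as the test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical well IDs and cumulative water volumes (bbl)
well_ids = list(range(10))
train_ids, test_ids = train_test_split(well_ids)

actual = [100.0, 200.0, 400.0]
predicted = [120.0, 150.0, 400.0]
error_pct = mape(actual, predicted)
print(len(train_ids), len(test_ids), error_pct)
```

Scoring only the held-out wells keeps the reported error from being flattered by wells the model has already seen.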

Bringing it together with Forecast Engine
Everything that I’ve shown so far is based on SHAP values, a standard Novi model build output. But we have an actual model that we can use for forecasts! We have this capability because water is an important cost stream, and thus important for well planning. So, we’ll do a quick exercise in Novi’s Forecast Engine well planning software to look at how completions designs and spacing interact across a pair of hypothetical developments, one in the core of the basin and one on the fringe. Click the video below to see how I configured the setup:
Looking at the model’s forecasts, we can see that standalone parent wells with small completions will have similar water production in the core and on the fringe. That similarity breaks down with tighter spacing and modern completions: if you simply applied the increase in water that the core saw with 660′ spacing and modern completions to the fringe, you’d underestimate per-well water production by ~25,000 bbl of water in the first 90 days (gulp).

Conclusions
- Machine learning models are a powerful tool for understanding and forecasting water production (these are a standard Novi product, ask us if you would like help interpreting your water models)
- Completions fluid is more important than proppant volume for water production in the Williston, though using large proppant volumes in water-prone rocks is associated with bigtime increases in water production
- The increase in water production in the Williston since the start of the play has been largely driven by upsizing completions, though step-outs and tighter spacing have started to be more important over the last year.
- Forecast Engine generates produced water forecasts for hypothetical wells — this can be used to more accurately calculate well economics, plan facilities & disposal, or just research your designs.
Do you know where your play is going to produce more water?
Geology, completions, and spacing decisions affect water production. If you are an investor or operator, you can use Novi’s tools to predict total water production in different plays, plan offtake capacity, and understand water production drivers to make better business decisions.
Use our geoSHAP feature for this and many other analyses. You can request a free demo here: novilabs.com/demo
Paper Details
Predicting Water Production in the Williston Basin using a Machine Learning Model. Control ID: 2756
URTeC 2020, Wednesday, July 22, Morning Session, Theme 12: Resources, Effective Communication, and Social License to Operate