Whether exploring for oil offshore Brazil or scaling type curves in Lea County, engineers and geologists rely upon the power of analogy to estimate the productivity of a given area or engineering design choice. Wells are grouped according to geologic and engineering similarity, and an average of that group is used for a prediction. Decision-tree based machine learning algorithms follow a similar process, but at computerized speed, looking for the most predictive ways to group “similar” wells based on all the data those wells present.
We will explain the intuition behind this algorithm, explore the underlying computation with a simple example, and end with a case study showing how it can aid decision-making when developing new acreage.
What are Tree-Based Models?
Tree-based models start with all of the training data in a single group. The algorithm then groups wells with similar production by splitting according to a given feature (in this example, whether water saturation is above or below 25%). Then, the wells are further grouped according to another feature, and another, and so on. In the example tree below, the model has split the wells according to water saturation, proppant loading, and spacing. The group of wells in the bottom right has an average production of 108,000 bbl (1-year cumulative), quite a large difference relative to the other groups. Novi has employed tree-based models since its founding in 2014 and authored a patent on applying these types of models to oil & gas data, which the USPTO granted in November 2018.
As our algorithms build thousands of trees, they face several choices, each guided by what is most predictive:
- What features (variables) should be used?
- How should it split on those features?
- How many levels deep should the trees be?
- How many trees should it build?
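To make the "how should it split" question concrete, here is a minimal sketch of how a single split threshold could be chosen. It assumes the textbook regression-tree criterion (pick the threshold that minimizes the summed squared error of the two resulting groups); the source does not state which criterion Novi's models use. The function names and the six wells are hypothetical and exist only for illustration.

```python
def sse(values):
    """Sum of squared errors around the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(feature, target):
    """Return (threshold, total_sse) for the split that best separates target."""
    best = (None, sse(target))  # baseline: no split at all
    distinct = sorted(set(feature))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [t for f, t in zip(feature, target) if f <= threshold]
        right = [t for f, t in zip(feature, target) if f > threshold]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical wells: water saturation (%) and 1-year cumulative oil (bbl).
water_saturation = [15, 18, 20, 30, 35, 40]
cum_oil = [60_000, 62_000, 58_000, 100_000, 104_000, 96_000]

threshold, error = best_split(water_saturation, cum_oil)
print(f"Best split: water saturation <= {threshold}%")
```

A real model would repeat this search over every candidate feature, keep the best one, and then recurse into each resulting group until it reaches the chosen tree depth.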
How Do Tree-Based Models Make Predictions?
Let’s stay with the single-tree example from above. To use it for a prediction, the algorithm simply runs the hypothetical well (shown in red) down the tree. If this well has 35% water saturation, 2500 pounds per foot proppant, and 1320′ spacing, it would end up in the leaf at the bottom grouped with a set of wells with average 1-year production of 104,000 bbl.
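Running a well down a tree can be sketched in a few lines. In the hypothetical tree below, only the 25% water-saturation split, the planned well's inputs, and the 104,000 bbl leaf come from the example above; the proppant and spacing thresholds, the other leaf averages, and all names are assumptions made up for illustration.

```python
# Hypothetical single tree stored as nested dicts.
# Internal nodes hold a feature and threshold; leaves hold the average
# 1-year production (bbl) of the training wells grouped there.
tree = {
    "feature": "water_saturation_pct", "threshold": 25,
    "below": {"leaf_avg_bbl": 70_000},           # assumed value
    "above": {
        "feature": "proppant_lb_per_ft", "threshold": 2000,  # assumed threshold
        "below": {"leaf_avg_bbl": 80_000},       # assumed value
        "above": {
            "feature": "spacing_ft", "threshold": 1000,      # assumed threshold
            "below": {"leaf_avg_bbl": 90_000},   # assumed value
            "above": {"leaf_avg_bbl": 104_000},
        },
    },
}


def predict(node, well):
    """Follow the splits from the root until a leaf is reached."""
    while "leaf_avg_bbl" not in node:
        goes_above = well[node["feature"]] > node["threshold"]
        node = node["above"] if goes_above else node["below"]
    return node["leaf_avg_bbl"]


planned_well = {
    "water_saturation_pct": 35,
    "proppant_lb_per_ft": 2500,
    "spacing_ft": 1320,
}
print(predict(tree, planned_well))
```

The prediction from a single tree is simply the average production of the training wells that share the planned well's leaf.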
Let’s take this a step further and add in additional trees. Typical Novi models have hundreds or thousands of trees, but let’s stick with a simple set of four trees that split three levels deep. Each of these trees would be different, splitting on different variables or at different thresholds.
Let’s imagine we are planning to drill a new well and want to know in advance what its production curves will look like. We can specify the location, length, azimuth, completion size, etc., and then put it into the model. As it traverses each tree, it ends up in a final (leaf) node with some wells from the training set. These are wells that the model considers analogous. Just as a group of geologists or engineers may have multiple ways of choosing analogs, so too does the model.
The model produces a set of training wells that were located in the same leaf as the planned well, along with the number of trees where this occurred. Novi refers to these as “contributing wells”, because they contribute to the prediction via a weighted average. The percentage of trees where the contributing well and planned well share a leaf forms the similarity score, which we then convert into weights by dividing each score by the sum of all similarity scores, which ensures the weights sum to one. For a simple set of sixteen training wells and four trees (like above), the “contributing wells” table might look like this:
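The similarity score, weight, and weighted-average steps described above can be sketched as follows. This is an illustration under stated assumptions, not Novi's implementation: the well names, leaf memberships, and production values are invented, and the function name is hypothetical.

```python
from collections import Counter


def contributing_wells(leaf_members):
    """leaf_members: one list per tree, naming the training wells that
    share the planned well's leaf in that tree."""
    n_trees = len(leaf_members)
    # Count the number of trees in which each training well shares a leaf.
    counts = Counter(w for members in leaf_members for w in set(members))
    # Similarity score: fraction of trees where the leaf is shared.
    scores = {w: c / n_trees for w, c in counts.items()}
    # Weights: scores normalized so that they sum to one.
    total = sum(scores.values())
    weights = {w: s / total for w, s in scores.items()}
    return scores, weights


# Hypothetical leaf memberships for a planned well across four trees.
leaf_members = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "C", "E"],
    ["A", "B", "F"],
]
# Hypothetical 1-year cumulative production (bbl) of the training wells.
production = {"A": 100_000, "B": 90_000, "C": 110_000,
              "D": 80_000, "E": 120_000, "F": 95_000}

scores, weights = contributing_wells(leaf_members)
prediction = sum(weights[w] * production[w] for w in weights)

for well in sorted(scores, key=scores.get, reverse=True):
    print(f"{well}  similarity={scores[well]:.2f}  weight={weights[well]:.3f}")
print(f"Predicted 1-year production: {prediction:,.0f} bbl")
```

Well A shares a leaf with the planned well in all four trees, so it carries the largest weight, while wells that appear in only one tree contribute the least.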
Novi’s Prediction Engine software generates a table of analog wells with every prediction (in Novi Cloud, just look for the “contributing_wells” table). Contributing weights from one of Novi’s predictive models will be much lower than those in the table above, because the models typically consider one thousand or more analog wells for each prediction made.
These similarity scores and weights form a powerful tool for not just understanding how the model came up with its predictions but also for augmenting the traditional analog selection process. The algorithm may find similar wells that would likely have slipped past an engineer or geologist. For example, another operator may have already tried that high fluid loading design in rock with similar pressure and brittleness, even though it was two counties over.
To leverage this powerful dataset, we developed Novi Production Modeler software, which allows for interrogation of contributing wells and weighting derived from polling of the underlying model tree structure. This video shows an example use case in the Williston Basin, looking at analogs for a barely-developed part of the basin:
Case Study for Contributing Wells: Parsley-Jagged Peak Deal
In October, Parsley announced their $2.27B acquisition of Jagged Peak, a huge move into the Delaware Basin for a producer focused on the Midland. We used Novi Prediction Engine and Novi Cloud software to analyze the deal a few weeks ago. In that analysis, we charted economic returns at reduced WTI strip prices. A critical part of that development plan is the Bone Springs 3rd Sand, which has shown very promising results in relatively limited development across their position.
How can Novi Contributing Well Data help provide context to interpret these well results and guide future development plans?
We used Novi’s Prediction Engine software to make predictions and generate Contributing Well Data for a set of planned wells across the acreage, including Bone Springs Third Sand, Wolfcamp A-Lower, and Wolfcamp C zones. Figure 7 below shows maps of contributing wells, built using the Novi Contributing Well Data, for a 3rd Bone Springs Sand well in the Big Tex area (left) and a Wolfcamp A-Lower well in the Cochise area (right).
Note that much of the prediction for the 3rd Bone Springs well comes from analogs in the far northern and northwestern parts of the Delaware, whereas the Wolfcamp A-Lower prediction is dominated by nearby wells.
Why might this be the case? For the Bone Springs, much of the development is concentrated in the northern part of the basin, just off the paleo-shelf edge (see shelf edge map from Montgomery’s classic 1997 work). The model is finding similar geologic properties at Big Tex and the northern Delaware because the two locations have similar depositional environments. Of course, differences remain, such as steepness of the margin, siliciclastic source material, etc., but those geologic factors can be turned into interpretation products and run through the model to quantify their impacts.
- Tree-based models follow traditional analogy-based workflows to estimate production for a given well, similar to the type curve process or exploration analogs, but evaluated at computerized speed and with basin wide datasets to learn from.
- Novi Contributing Well Data, which is automatically generated as an artifact of Novi’s Prediction Engine software workflow, provides a powerful tool to understand how a model came up with its prediction AND a rich dataset to supplement and accelerate existing engineering and geoscience workflows.
- Our model finds geologic similarity for the 3rd Bone Springs Sand between Parsley’s Delaware Basin acreage and northern Eddy & Lea Counties. Parsley should consider studying 3rd Bone Springs development in those areas for ideas on completion designs, well spacing, and targeting strategies.