Data Pipeline & System Capabilities
How Overland's well data and mineral tract positions were ingested, matched, and enriched against our proprietary intelligence database.
Two files provided — a 195-well Enverus/DI export with header data, production, EUR estimates, and lat/long; and a 237-row mineral tract ownership CSV with Section-Township-Range positions across 5 ND counties.
Raw files archived as-is before any transformation. This creates a permanent record of the source data and enables re-ingestion if formats change.
Enverus column names mapped to normalized schema fields. API14 stored as primary identifier. Both surface and bottom hole coordinates retained. Production and EUR figures preserved from source.
STR strings (e.g. "160N-94W-27") parsed into discrete township, range, and section fields for direct SQL joins. Acreage (NMA/NRA) and entity ownership preserved per tract.
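The parsing step above can be sketched as follows. This is a minimal illustration, not the production parser: it assumes every string follows the township-range-section order of the "160N-94W-27" example, and the `STR` type and `parse_str` name are hypothetical.

```python
import re
from typing import NamedTuple


class STR(NamedTuple):
    """Discrete fields parsed from a string such as "160N-94W-27" (hypothetical type)."""
    township: int
    township_dir: str
    range_: int
    range_dir: str
    section: int


# Assumed layout: <township><N|S>-<range><E|W>-<section>, whitespace tolerated.
_STR_RE = re.compile(
    r"^\s*(\d{1,3})\s*([NS])\s*-\s*(\d{1,3})\s*([EW])\s*-\s*(\d{1,2})\s*$",
    re.IGNORECASE,
)


def parse_str(value: str) -> STR:
    """Split a township-range-section string into typed fields."""
    match = _STR_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized STR string: {value!r}")
    twp, twp_dir, rng, rng_dir, sec = match.groups()
    section = int(sec)
    # PLSS townships contain sections numbered 1 through 36.
    if not 1 <= section <= 36:
        raise ValueError(f"section out of range in {value!r}")
    return STR(int(twp), twp_dir.upper(), int(rng), rng_dir.upper(), section)


print(parse_str("160N-94W-27"))
```

Storing the numeric parts as integers keeps the join keys free of formatting differences; a real loader would also need a policy for rows that fail to parse.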
Mineral tract sections joined against our 42,609-well NDIC index. Every well physically located on an owned section is captured — including wells the customer didn't know to ask about. Township/range format normalized (NDIC uses spaces: "159 N" vs "159N").
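As a rough sketch of the section match described above, the snippet below normalizes the two township/range spellings ("159 N" and "159N") to one form and keeps the wells that fall on an owned section. The dictionary keys and sample rows are hypothetical; the actual join runs in SQL against the NDIC index.

```python
def normalize_twp_rng(value: str) -> str:
    """Remove all whitespace and uppercase, so "159 N" and "159n" both become "159N"."""
    return "".join(value.split()).upper()


def section_key(row: dict) -> tuple:
    """Build a comparable (township, range, section) key from a row."""
    return (
        normalize_twp_rng(row["township"]),
        normalize_twp_rng(row["range"]),
        int(row["section"]),
    )


def wells_on_owned_sections(tracts: list, wells: list) -> list:
    """Return the wells whose section key matches any tract's section key."""
    owned = {section_key(t) for t in tracts}
    return [w for w in wells if section_key(w) in owned]


# Hypothetical sample rows, for illustration only.
tracts = [{"township": "159N", "range": "97W", "section": 4}]
wells = [
    {"file_no": "W-1", "township": "159 N", "range": "97 W", "section": 4},
    {"file_no": "W-2", "township": "159 N", "range": "97 W", "section": 5},
]
print([w["file_no"] for w in wells_on_owned_sections(tracts, wells)])
```

Normalizing both sides before comparing means neither source has to be rewritten in place.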
Using tract centroids (derived from matched well coordinates) and our MILES_BETWEEN spatial UDF, every NDIC well within 5 miles of any holding is captured. This surfaces nearby DUCs, permitted wells, and producing wells that affect mineral value without sitting on the exact section.
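The radius step can be illustrated with a small sketch. The internals of MILES_BETWEEN are not described here, so this assumes a great-circle (haversine) distance. The centroid is assumed to be a simple mean of matched well coordinates, and all function names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def miles_between(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles (haversine); stand-in for the spatial UDF."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def centroid(points: list) -> tuple:
    """Mean latitude/longitude of the wells matched to one tract (assumed method)."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    return sum(lats) / len(lats), sum(lons) / len(lons)


def wells_in_radius(centroids: list, wells: list, radius_miles: float = 5.0) -> dict:
    """Map file number to distance from the nearest centroid, for wells inside the radius."""
    hits = {}
    for file_no, lat, lon in wells:
        nearest = min(miles_between(c_lat, c_lon, lat, lon) for c_lat, c_lon in centroids)
        if nearest <= radius_miles:
            hits[file_no] = nearest
    return hits


# Hypothetical coordinates, for illustration only.
tract_centroids = [centroid([(48.00, -103.00), (48.02, -103.02)])]
candidates = [("W-1", 48.05, -103.01), ("W-2", 48.30, -103.01)]
print(wells_in_radius(tract_centroids, candidates))
```

Keeping the distance to the nearest holding, rather than a yes/no flag, is what lets later queries tighten the radius without recomputing.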
Data lives in two places: S3 for raw file archival and Snowflake for queryable analytics. The Snowflake layer is fully multi-tenant — adding a second customer is a single command.
| Table / View | Rows | Purpose |
|---|---|---|
| CUSTOMER_REGISTRY | 1 | Customer identity and contact. One row per account. |
| WELL_SUBMISSIONS | 195 | Wells as submitted by customer. Full Enverus field mapping preserved. |
| MINERAL_TRACTS | 237 | Mineral ownership positions. STR parsed to township/range/section. Centroids derived from matched wells. |
| TRACT_WELLS | 762 | All wells in scope. Keyed by NDIC file number. Carries match_type, source, distance_miles. |
| ENRICHED_WELLS | view | Full enrichment surface. Joins TRACT_WELLS to signals, production, intelligence, well master. |
| Column | Type | Description |
|---|---|---|
| customer_id | VARCHAR | Multi-tenant key. All queries filter on this. |
| ndic_file_no | VARCHAR | NDIC permit/file number. Primary join key to all internal analytics. |
| api14 | VARCHAR | 14-digit API number from RAW_WELL_INDEX or submission. |
| match_type | VARCHAR | exact_str — well sits on a mineral section. radius_5mi — within 5 miles of a holding. |
| source | VARCHAR | submitted — customer included. discovered — found by the system. |
| distance_miles | FLOAT | 0.0 for exact, computed miles for radius matches. Foundation for "within X miles" queries. |
| surface_lat / surface_lon | FLOAT | WGS84 coordinates from NDIC. Ready for future map rendering. |
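To make the column semantics concrete, here is a hedged sketch of one TRACT_WELLS row as a Python record, with a "within X miles" filter built on `distance_miles`. Field names follow the table above; the `TractWell` class, the `within_miles` helper, and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TractWell:
    customer_id: str       # multi-tenant key
    ndic_file_no: str      # join key to internal analytics
    api14: str
    match_type: str        # "exact_str" or "radius_5mi"
    source: str            # "submitted" or "discovered"
    distance_miles: float  # 0.0 for exact matches
    surface_lat: float
    surface_lon: float


def within_miles(rows: list, customer_id: str, max_miles: float) -> list:
    """Rows for one customer within max_miles of a holding, nearest first."""
    hits = [r for r in rows if r.customer_id == customer_id and r.distance_miles <= max_miles]
    return sorted(hits, key=lambda r: r.distance_miles)


# Hypothetical rows, for illustration only.
rows = [
    TractWell("cust_a", "F1", "00000000000001", "exact_str", "submitted", 0.0, 48.0, -103.0),
    TractWell("cust_a", "F2", "00000000000002", "radius_5mi", "discovered", 2.3, 48.1, -103.1),
    TractWell("cust_a", "F3", "00000000000003", "radius_5mi", "discovered", 4.8, 48.2, -103.2),
    TractWell("cust_b", "F4", "00000000000004", "exact_str", "submitted", 0.0, 47.9, -102.9),
]
print([r.ndic_file_no for r in within_miles(rows, "cust_a", 3.0)])
```

Because exact matches carry a distance of 0.0, a single distance filter covers both match types.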
| Layer | Source | Columns Added |
|---|---|---|
| Well Master | V_WELL_MASTER | well_name, operator, well_status, spud_date, formation, lat/long — 42,609 wells |
| Signals | CANONICAL_WELL_SIGNALS | tier (1/2/3), activity_score, has_h2s_flag, has_flare_flag, hazard_flags — 618 wells |
| Production | V_PRODUCTION_ENRICHED | latest_oil_bbls, latest_gas_mcf, cum_oil_bbls, cum_gas_mcf, months_produced |
| Intelligence | V_WELLFILE_INTELLIGENCE | wellfile_summary (LLM), upcoming_hearings_60d, signed_orders_90d |
| Coverage Flags | Derived | has_signals, has_intelligence, has_production — TRUE/FALSE per well |
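One plausible reading of the derived coverage flags is a membership test per enrichment layer. The sketch below assumes that interpretation; the function name and the idea of passing each layer as a set of file numbers are illustrative, not the view's actual definition.

```python
def coverage_flags(
    ndic_file_no: str,
    signal_ids: set,
    intelligence_ids: set,
    production_ids: set,
) -> dict:
    """TRUE/FALSE flag per layer: does the well appear in that layer at all?"""
    return {
        "has_signals": ndic_file_no in signal_ids,
        "has_intelligence": ndic_file_no in intelligence_ids,
        "has_production": ndic_file_no in production_ids,
    }


# Hypothetical file numbers, for illustration only.
print(coverage_flags("F2", signal_ids={"F1"}, intelligence_ids={"F1"}, production_ids={"F1", "F2"}))
```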
Run queue_pipeline_gaps.py to identify wells with no signals. Outputs a prioritized list (submitted wells first, then discovered) with operator and status context.
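The ordering rule described above (wells with no signals, submitted before discovered) can be sketched as below. The internals of queue_pipeline_gaps.py are not shown in this document, so the field names, the file-number tie-breaker, and the sample rows are assumptions.

```python
# Lower number sorts first; unknown sources fall to the end.
SOURCE_PRIORITY = {"submitted": 0, "discovered": 1}


def prioritize_gaps(wells: list) -> list:
    """Return wells lacking signals, submitted first, then discovered."""
    gaps = [w for w in wells if not w["has_signals"]]
    return sorted(
        gaps,
        # Tie-break on file number (an assumption) to keep output deterministic.
        key=lambda w: (SOURCE_PRIORITY.get(w["source"], 2), w["ndic_file_no"]),
    )


# Hypothetical rows, for illustration only.
wells = [
    {"ndic_file_no": "F3", "source": "discovered", "has_signals": False, "operator": "Op A", "well_status": "A"},
    {"ndic_file_no": "F1", "source": "submitted", "has_signals": True, "operator": "Op B", "well_status": "A"},
    {"ndic_file_no": "F2", "source": "submitted", "has_signals": False, "operator": "Op B", "well_status": "DRL"},
]
for w in prioritize_gaps(wells):
    print(w["ndic_file_no"], w["source"], w["operator"], w["well_status"])
```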
Fetch PDF wellfiles from NDIC for queued file numbers. PDFs contain formation data, completion reports, scout tickets, and regulatory correspondence.
PDFs processed through ocrmypdf + pdftotext. Clean text written to staging directory.
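A minimal sketch of the OCR step, assuming the two tools are installed and invoked as command-line programs. The flag choices (`--skip-text` for ocrmypdf, `-layout` for pdftotext), the output file naming, and the function names are assumptions; the production pipeline may use different options.

```python
import subprocess
from pathlib import Path


def build_commands(pdf_path: Path, staging_dir: Path) -> list:
    """Build the two command lines: OCR the PDF, then extract its text."""
    ocr_pdf = staging_dir / f"{pdf_path.stem}.ocr.pdf"
    text_file = staging_dir / f"{pdf_path.stem}.txt"
    return [
        # --skip-text leaves pages that already carry a text layer untouched.
        ["ocrmypdf", "--skip-text", str(pdf_path), str(ocr_pdf)],
        # -layout asks pdftotext to preserve the physical page layout.
        ["pdftotext", "-layout", str(ocr_pdf), str(text_file)],
    ]


def run_ocr(pdf_path: Path, staging_dir: Path) -> Path:
    """Run both tools in order and return the path of the clean text file."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(pdf_path, staging_dir):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return staging_dir / f"{pdf_path.stem}.txt"
```

Separating command construction from execution makes the step easy to dry-run or log before any file is touched.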
DeepSeek model extracts structured intelligence from clean text: formation data, H2S/CO2 presence, completion details, service contractor signals, hazard flags. Results written to JSON and loaded to Snowflake.
| Well | Cum Oil (bbl) | Months | Latest Oil/mo | Note |
|---|---|---|---|---|
| Kenneth 6-17H | 369,695 | 7 | 24,721 | ~640K EUR pace. On Overland's mineral section. |
| Michael State Federal 7-16H | 305,330 | 7 | 19,776 | Adjacent section. Part of same Continental development. |
| Helen 7-8H | 294,455 | 7 | 24,319 | Pad partner. Same township/range as Kenneth wells. |
| Kenneth 5-17H1 | 277,120 | 7 | 18,364 | Multiple Kenneth wellbores suggest systematic development. |
| Entity | County | NMA | NRA | Tracts |
|---|---|---|---|---|
| Fund 3 | Williams | 1,478 | 1,900 | 82 |
| Fund 2 | Williams | 729 | 992 | 22 |
| Fund 5 | Williams | 615 | 714 | 38 |
| OMH | Williams | 408 | 511 | 82 |
| Fund 4 | Williams | 263 | 326 | 36 |
The Kenneth/Helen/Michael State Federal wells sit in 3 contiguous sections. Continental has 101 wells across those sections — an entirely Continental-controlled block. XTO Energy has 2 plugged/abandoned wells from a prior era; no competitive operator activity. This is a systematic multi-pad development in Continental's core Williams County position.
Continental currently has 8 DRL, 8 LOC, and 30 Confidential wells in Williams County: 38 in the active pipeline (DRL plus Confidential), or 46 including LOC permits. Two active pads identified: the Brooklyn-Bakken pad (T.155N., R.98W., Sections 15 & 22; 6+ wells spudded Sept–Oct 2025, Charleston/Addyson names) and the Rose Federal pad (T.154N., R.97W., Sections 3 & 34; 4 DRL wells, June–Sept 2025). Also: 6 "Plano" wells permitted in T.154N., R.101W., Section 28 with no spud dates yet.
Case C32803 (March 2026): Continental filed for a new 2,560-acre overlapping spacing unit in the Wildrose/Corinth-Bakken Pool, Divide AND Williams counties. Hearing April 1, 2026 — a cross-county infill/expansion push touching Williams directly. A second batch of 18 Continental docket cases is scheduled for the April 23, 2026 NDIC hearing, including spacing and pooling for T.160N./T.159N., R.97W. — adjacent to Overland's Williams County position. Earlier regulatory actions: two field rule exceptions granted Feb 2026 for the Charleston 8-22HSL well (drilled within 150-ft north boundary setback on the active Brooklyn-Bakken pad).
Monthly production confirms a standard Bakken decline. Kenneth 6-17H and Helen 7-8H peaked at 83,360 and 77,686 bbl/month respectively in August 2025 (first full production month). By January 2026 (month 6) both are producing ~24,000–25,000 bbl/month — a ~70% decline from peak, which is exactly on-curve for this formation and lateral length. At this rate, EUR projection is 300–400K bbl per well. Kenneth 5-17H1, an older vintage, is mid-decline at ~18K/month after 7+ months. The curve shape tells us Overland's royalty income from this pad is stabilizing, not collapsing.
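The decline figure can be checked with a few lines of arithmetic. This sketch assumes the "Latest Oil/mo" values from the production table (24,721 and 24,319 bbl) are the January 2026 volumes referred to above.

```python
def pct_decline(peak_bbl: float, current_bbl: float) -> float:
    """Percentage drop from peak monthly volume to current monthly volume."""
    return (peak_bbl - current_bbl) / peak_bbl * 100


# Peak (August 2025) and latest monthly oil volumes, in barrels.
wells = {
    "Kenneth 6-17H": (83_360, 24_721),
    "Helen 7-8H": (77_686, 24_319),
}

for name, (peak, latest) in wells.items():
    print(f"{name}: {pct_decline(peak, latest):.1f}% below peak")
```

Both wells come out near 70% below peak (about 70.3% and 68.7%), in line with the rounded figure quoted above.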
customers/continental_investigation.json (7 queries, 101+ rows · 2026-04-08)
Hess Bakken Investments II has 26 signals across 130 wells (20.0%) versus Continental at 10.5% (24/228). Signals are concentrated in four sections that overlap Overland's mineral tracts directly, including T.156N. R.95W. Sec.15 (5 wells, all signaled, all on tracts), T.157N. R.94W. Sec.12 (4 wells, all signaled, all on tracts), and T.156N. R.94W. Sec.3 (4 of 5 wells on tracts). Hess signals span all three tiers (Event-Based, Structural, and Time-Sensitive) across Williams and Mountrail counties.
76 confidential wells in scope. 61 sit on Overland's mineral sections (not in the radius ring). Most recent activity: Oasis Petroleum spudded 6 wells in Williams T.156N. Sec.23 between Feb 24–28, 2026 (Troon 5602 and Prestwick Federal 5602 pads). Kraken Operating added 2 in T.158N. Sec.4 in December 2025. Enerplus has 9 confidential wells in scope. These wells will not appear in Enverus data until confidentiality expires (~6 months post-spud).
22 H2S-flagged wells in scope. 20 are on Overland's mineral sections. Williams County leads with 11 (9 on tracts), McKenzie 7 (all on tracts), Mountrail 4 (all on tracts). Densest section clusters: Whiting in McKenzie T.152N. R.103W. Secs.31/32 (7 H2S wells on tracts) and Oasis in Williams T.156N. R.102W. Sec.17 (4 H2S wells). H2S is derived from OCR text analysis of NDIC wellfiles — presence drives specialized equipment requirements and affects spacing unit economics.
Joined MINERAL_TRACTS to ENRICHED_WELLS using REPLACE-normalized township/range (MINERAL_TRACTS stores "156N", TRACT_WELLS stores "156 N"). Fund 3's Williams County sections contain 81 wells with 5.6M barrels cumulative production — the Kenneth/Helen/Arley/Christiana pad wells are all in Fund 3 coverage. Fund 5 (Williams) follows at 614.8 NMA / 2.1M bbl. Full entity breakdown: Fund 3 Williams, Fund 2 Williams, Fund 5 Williams, OEP II Dunn, OMH across Burke/Mountrail/Williams.
FCT_DOCKET_CASES is operator/location-based (no NDIC file number), so the regulatory pipeline query matches by county and hearing date rather than well join. 73 upcoming hearings in Overland's 5 counties for non-Continental operators. Burlington Resources has a large pooling batch in McKenzie on April 22. Hess has 26 signaled wells (21 on Overland tracts) across all tier levels. Oasis has 7 signaled in Williams (all on tracts, all H2S). Whiting has 7 in McKenzie (all on tracts, all H2S).
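Since docket cases carry no file number, the match falls back to county and hearing date. The sketch below illustrates that filter under stated assumptions: the record fields (`operator`, `counties`, `hearing_date`), the function name, and the sample cases are hypothetical and do not reflect the actual FCT_DOCKET_CASES schema.

```python
from datetime import date
from typing import Optional


def upcoming_hearings(
    cases: list,
    customer_counties: set,
    as_of: date,
    exclude_operator: Optional[str] = None,
) -> list:
    """Cases on or after as_of that touch any customer county, earliest first."""
    hits = []
    for case in cases:
        if case["hearing_date"] < as_of:
            continue
        # A case may span several counties; any overlap counts as a match.
        if not (set(case["counties"]) & customer_counties):
            continue
        if exclude_operator and exclude_operator.lower() in case["operator"].lower():
            continue
        hits.append(case)
    return sorted(hits, key=lambda c: c["hearing_date"])


# Hypothetical cases, for illustration only.
cases = [
    {"case_no": "X1", "operator": "Operator A", "counties": {"McKenzie"}, "hearing_date": date(2026, 4, 22)},
    {"case_no": "X2", "operator": "Continental", "counties": {"Divide", "Williams"}, "hearing_date": date(2026, 4, 23)},
    {"case_no": "X3", "operator": "Operator B", "counties": {"Williams"}, "hearing_date": date(2026, 3, 1)},
]
result = upcoming_hearings(cases, {"Williams", "McKenzie"}, date(2026, 4, 8), exclude_operator="continental")
print([c["case_no"] for c in result])
```

A county-level match is coarser than a well join, so results are best read as regulatory activity near the holdings rather than activity on a specific tract.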
439 radius wells (5-mile ring). 16 in active-pipeline status (DUC/DRL/Confidential). Zavanna Energy has 8 confidential wells at 2.74–3.05 miles. Hess has 4 confidential wells at 2.26 miles. Continental has 4 DRL wells at 2.3 miles. Dominant radius ring operators: Whiting (141 wells, 1.1 mi closest), Continental (138, 2.3 mi), Hess (65, 1.5 mi), Zavanna (32, 1.1 mi). Radius formations are overwhelmingly Bakken/Three Forks variants, consistent with the primary zone on Overland's own tracts.
51 of 762 wells have signal coverage (7%). 537 have production data (70%). The top 8 producing wells — all Continental Kenneth/Helen/Arley/Christiana pad, 250K–370K cum bbl each — have zero signal coverage. This is structural: signals derive from regulatory history and wellfile text; recently-completed wells haven't accumulated the case filings and wellfile volume that generate signals. Continental has 207 wells with production but only 24 with signals. The highest-value pipeline target is LLM enrichment of the Kenneth/Helen pad wellfiles.
customers/overland_insights.json (8 query groups, 762 wells · 2026-04-08)
customers/overland_insights_summary.md (narrative summary, 7 insights)