<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Agile Data N’ Info]]></title><description><![CDATA[Simply Magical content about Agile Data Ways of Working]]></description><link>https://agiledata.info</link><image><url>https://substackcdn.com/image/fetch/$s_!ErtR!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8892c64-a0c7-4c7b-9f49-a73be5280f22_1280x1280.png</url><title>Agile Data N’ Info</title><link>https://agiledata.info</link></image><generator>Substack</generator><lastBuildDate>Wed, 08 Apr 2026 12:31:25 GMT</lastBuildDate><atom:link href="https://agiledata.info/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Agile Data Limited]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[DataNInfo@agiledataguides.com]]></webMaster><itunes:owner><itunes:email><![CDATA[DataNInfo@agiledataguides.com]]></itunes:email><itunes:name><![CDATA[Shagility]]></itunes:name></itunes:owner><itunes:author><![CDATA[Shagility]]></itunes:author><googleplay:owner><![CDATA[DataNInfo@agiledataguides.com]]></googleplay:owner><googleplay:email><![CDATA[DataNInfo@agiledataguides.com]]></googleplay:email><googleplay:author><![CDATA[Shagility]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[It's not about your shiny tools, it's about the value you deliver using them]]></title><description><![CDATA[And if you get this wrong, you will probably be called to account for the cost, not the value]]></description><link>https://agiledata.info/p/its-not-about-your-shiny-tools-its</link><guid 
isPermaLink="false">https://agiledata.info/p/its-not-about-your-shiny-tools-its</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Fri, 27 Feb 2026 11:45:47 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/fa9e9189-e8e3-45bd-aaf6-3d3fb4933f77_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I caught up with a data friend for a virtual coffee this week. They were lamenting the whole &#8220;single source of truth&#8221; situation they were experiencing with an org they were working with.</p><p>It was the same pattern <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Joe Reis&quot;,&quot;id&quot;:3531217,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e4716b1-c223-41e3-b943-def0291bf217_1175x783.jpeg&quot;,&quot;uuid&quot;:&quot;beada44e-bfb9-4fb0-877d-0a4cbf60520f&quot;}" data-component-name="MentionToDOM"></span> talks about in this draft book chapter:</p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:188928717,&quot;url&quot;:&quot;https://practicaldatamodeling.substack.com/p/what-data-modeling-is-and-is-not&quot;,&quot;publication_id&quot;:1473069,&quot;publication_name&quot;:&quot;Practical Data Modeling&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Q0I-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6eecf34a-ff04-4526-a4b3-4163469579cd_500x500.png&quot;,&quot;title&quot;:&quot;What Data Modeling Is and Is Not &quot;,&quot;truncated_body_text&quot;:&quot;Here&#8217;s the revision of Chapter Two for Mixed Model Arts, where I discuss various definitions of data modeling and bring it into the present day. 
We are no longer modeling just for humans, but modeling for humans AND machines.&quot;,&quot;date&quot;:&quot;2026-02-23T18:12:02.833Z&quot;,&quot;like_count&quot;:32,&quot;comment_count&quot;:3,&quot;bylines&quot;:[{&quot;id&quot;:3531217,&quot;name&quot;:&quot;Joe Reis&quot;,&quot;handle&quot;:&quot;joereis&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e4716b1-c223-41e3-b943-def0291bf217_1175x783.jpeg&quot;,&quot;bio&quot;:&quot;Best Selling Co-author of Fundamentals of Data Engineering (O'Reilly) | Data Engineer and Architect | Recovering Data Scientist &#8482; | Speaker | Professor | Podcaster &amp; content creator | DJ | Occasional athlete&quot;,&quot;profile_set_up_at&quot;:&quot;2022-03-09T19:34:00.392Z&quot;,&quot;reader_installed_at&quot;:&quot;2022-11-04T04:10:27.874Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:39449,&quot;user_id&quot;:3531217,&quot;publication_id&quot;:47214,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:47214,&quot;name&quot;:&quot;Joe Reis&quot;,&quot;subdomain&quot;:&quot;joereis&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;My rants on data, technology, and business&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5bdde2d6-c6ac-46b5-942a-004438d1fd47_300x300.png&quot;,&quot;author_id&quot;:3531217,&quot;primary_user_id&quot;:3531217,&quot;theme_var_background_pop&quot;:&quot;#0068EF&quot;,&quot;created_at&quot;:&quot;2020-05-18T11:49:05.293Z&quot;,&quot;email_from_name&quot;:&quot;Joe Reis&quot;,&quot;copyright&quot;:&quot;Joe 
Reis&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:null,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:1438665,&quot;user_id&quot;:3531217,&quot;publication_id&quot;:1473069,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:1473069,&quot;name&quot;:&quot;Practical Data Modeling&quot;,&quot;subdomain&quot;:&quot;practicaldatamodeling&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Welcome to Practical Data Modeling! Whether you're a beginner or an experienced data professional interested in leveling up your data modeling, we will help you take your skills to the next level.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6eecf34a-ff04-4526-a4b3-4163469579cd_500x500.png&quot;,&quot;author_id&quot;:3531217,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#2EE240&quot;,&quot;created_at&quot;:&quot;2023-03-07T02:12:42.856Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Joe Reis&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:8187789,&quot;user_id&quot;:3531217,&quot;publication_id&quot;:8003297,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:8003297,&quot;name&quot;:&quot;Practical Data 
Community&quot;,&quot;subdomain&quot;:&quot;practicaldatacommunity&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Home of the Practical Data Community&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e4716b1-c223-41e3-b943-def0291bf217_1175x783.jpeg&quot;,&quot;author_id&quot;:3531217,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2026-02-13T01:50:56.273Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Joe Reis&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:5,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[10845,1501429,35345,817132,4417548],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:false,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://practicaldatamodeling.substack.com/p/what-data-modeling-is-and-is-not?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!Q0I-!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6eecf34a-ff04-4526-a4b3-4163469579cd_500x500.png"><span 
class="embedded-post-publication-name">Practical Data Modeling</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">What Data Modeling Is and Is Not </div></div><div class="embedded-post-body">Here&#8217;s the revision of Chapter Two for Mixed Model Arts, where I discuss various definitions of data modeling and bring it into the present day. We are no longer modeling just for humans, but modeling for humans AND machines&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">a month ago &#183; 32 likes &#183; 3 comments &#183; Joe Reis</div></a></div><p>A quote from that chapter:</p><p>&lt;-o-&gt; </p><p>&#8220;I&#8217;d just been hired as a consultant for a mid-sized e-commerce company that was hemorrhaging money. Their CEO pulled me aside within the first hour: &#8220;Joe, our inventory system says we have 50,000 units in stock. Our warehouse says 12,000. Our website shows customers that they can buy things that don&#8217;t exist. We had to refund $400K last month alone.&#8221;</p><p>&#8220;I spent the next three days spelunking through their systems. What I found was a horror show. They had an &#8216;orders&#8217; table with 500 columns. Customer data lived in six different databases, none of which agreed on what a &#8220;customer&#8221; was. The product catalog was a single enormous spreadsheet that someone manually uploaded to the database every Friday afternoon. Date fields were stored as strings. 
Some prices included tax, some didn&#8217;t, and nobody could tell you which was which.&#8221;</p><p>&lt;-oo-&gt; </p><p>My data friend was telling me about the time and money the org&#8217;s CDO had spent implementing a &#8220;Modern Data Platform&#8221; and how they were now being asked to present what &#8220;outcomes&#8221; had been delivered for that expenditure.</p><p>Unfortunately, when my data friend had to go and get a single number for the equivalent of Joe&#8217;s &#8220;Count of Stock&#8221;, they found multiple systems that had different counts.</p><p>And when they asked various Subject Matter Experts (SMEs) which count could be trusted, they all gave different answers.</p><p>But the SMEs were all aligned when they said the one count they didn&#8217;t trust was the count in the new data platform.</p><p>I joked that the CDO should probably google (or perplexity) the three envelopes joke about now.</p><p>But realistically it&#8217;s not a joke. That is shareholders&#8217; money that has been spent, and it&#8217;s people&#8217;s jobs that will probably be impacted as a result of cost that seemed to deliver no value.</p><p>So let&#8217;s say it again...</p><div class="pullquote"><p><strong>It&#8217;s not about the tools you use.</strong></p><p><strong>It&#8217;s about the value you deliver using those tools.</strong></p></div><p>End of rant, and here is a suggestion.</p><p>If you are investing in new tools, or a shiny new data platform, take three simple extra steps.</p><ol><li><p>Define an Information Product using the Information Product Canvas<br><br>You can learn about it for free here:  </p><div class="embedded-publication-wrap" data-attrs="{&quot;id&quot;:2810971,&quot;name&quot;:&quot;Information Product 
Canvas&quot;,&quot;logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!UH3F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ded0e0-62cf-497b-812e-8be9bbbe0629_855x855.png&quot;,&quot;base_url&quot;:&quot;https://informationproductcanvas.agiledataguides.com&quot;,&quot;hero_text&quot;:&quot;Information Product Canvas\na pattern template, to quickly discover and capture, data and information requirements, \nin a repeatable way, so stakeholders love them and data teams can build from them&quot;,&quot;author_name&quot;:&quot;Shagility&quot;,&quot;show_subscribe&quot;:true,&quot;logo_bg_color&quot;:&quot;#ffffff&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPublicationToDOMWithSubscribe"><div class="embedded-publication show-subscribe"><a class="embedded-publication-link-part" native="true" href="https://informationproductcanvas.agiledataguides.com?utm_source=substack&amp;utm_campaign=publication_embed&amp;utm_medium=web"><img class="embedded-publication-logo" src="https://substackcdn.com/image/fetch/$s_!UH3F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ded0e0-62cf-497b-812e-8be9bbbe0629_855x855.png" width="56" height="56" style="background-color: rgb(255, 255, 255);"><span class="embedded-publication-name">Information Product Canvas</span><div class="embedded-publication-hero-text">Information Product Canvas
a pattern template, to quickly discover and capture, data and information requirements, 
in a repeatable way, so stakeholders love them and data teams can build from them</div><div class="embedded-publication-author-name">By Shagility</div></a></div></div></li><li><p>Take <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Nick Zervoudis&quot;,&quot;id&quot;:6245781,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/2c89fc3f-12ff-4f16-b1a1-502c70441381_1332x1810.png&quot;,&quot;uuid&quot;:&quot;88ba922f-742d-4cdc-9c20-7d7fa5d5a7a4&quot;}" data-component-name="MentionToDOM"></span>&#8217;s course on how to easily and quickly identify the value of the information that the Information Product will deliver. <br><br>You can find his course here: <a href="https://maven.com/nick-zervoudis/dpm-value-course">https://maven.com/nick-zervoudis/dpm-value-course</a><br></p></li><li><p>Have your data team build the identified Information Product at the same time they build your shiny new Data Platform.  
<br><br>And then use the number from #2 above to start justifying the value the new data platform is delivering.<br></p></li></ol>]]></content:encoded></item><item><title><![CDATA[What is the new Moat in the new "AI Vibe Coding" world]]></title><description><![CDATA[It's no longer effort, and I'm not sure it's expertise either; it might still be experience.]]></description><link>https://agiledata.info/p/what-is-the-new-moat-in-the-new-ai</link><guid isPermaLink="false">https://agiledata.info/p/what-is-the-new-moat-in-the-new-ai</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Wed, 25 Feb 2026 11:08:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!GS0_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The moats of old are disappearing and I&#8217;m OK with that.</p><p>The thing I love about experimentation is that it helps me coalesce some divergent ideas that have been floating in my head for a while into some semblance of order.</p><p>A bit like writing does.</p><p>I wanted to get a handle on the latest state of &#8220;vibe coding&#8221;, so I decided to experiment with building an app using Claude Code and Opus 4.6.</p><p>I picked vibe coding an app for the Information Product Canvas. It&#8217;s an app I have wanted built for a while, but the cost to build it the old way never matched what I found people were willing to pay for it.</p><p>In theory, vibe coding reduces the cost; the experiment was to see whether it would.</p><p>You can see the results of a few iterations of the canvas:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!GS0_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GS0_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png 424w, https://substackcdn.com/image/fetch/$s_!GS0_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png 848w, https://substackcdn.com/image/fetch/$s_!GS0_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png 1272w, https://substackcdn.com/image/fetch/$s_!GS0_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GS0_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png" width="1456" height="1055" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1055,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:315781,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/189123057?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GS0_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png 424w, https://substackcdn.com/image/fetch/$s_!GS0_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png 848w, https://substackcdn.com/image/fetch/$s_!GS0_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png 1272w, https://substackcdn.com/image/fetch/$s_!GS0_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd73633a7-eb63-4201-aabe-24d5b93d135d_1790x1297.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>Plus a side experiment into the world of the Context Plane (couldn&#8217;t help myself).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oyWY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oyWY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png 424w, 
https://substackcdn.com/image/fetch/$s_!oyWY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png 848w, https://substackcdn.com/image/fetch/$s_!oyWY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png 1272w, https://substackcdn.com/image/fetch/$s_!oyWY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oyWY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png" width="1456" height="1055" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1055,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:477155,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/189123057?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!oyWY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png 424w, https://substackcdn.com/image/fetch/$s_!oyWY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png 848w, https://substackcdn.com/image/fetch/$s_!oyWY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png 1272w, https://substackcdn.com/image/fetch/$s_!oyWY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ff1ac2-2ecd-4f63-a516-07a53f60a860_1790x1297.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>But that&#8217;s not the point of this post.</p><p><strong><a href="https://www.linkedin.com/feed/#">Nick Pinfold</a></strong> is experimenting with his team&#8217;s Agile Data Ways of Working and freely sharing his journey via LinkedIn comments.</p><p>In this comment:</p><p><a href="https://www.linkedin.com/feed/update/urn:li:ugcPost:7432018978505617408?commentUrn=urn%3Ali%3Acomment%3A%28ugcPost%3A7432018978505617408%2C7432267116197765120%29&amp;dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287432267116197765120%2Curn%3Ali%3AugcPost%3A7432018978505617408%29">https://www.linkedin.com/feed/update/urn:li:ugcPost:7432018978505617408?commentUrn=urn%3Ali%3Acomment%3A%28ugcPost%3A7432018978505617408%2C7432267116197765120%29&amp;dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287432267116197765120%2Curn%3Ali%3AugcPost%3A7432018978505617408%29</a></p><p>He talks about how he is creating a Streamlit app that allows him to capture the IPC content as Context and use it to assist with the next steps in their Information Factory.</p><p><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Anna Bergevin&quot;,&quot;id&quot;:61243663,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ffdf50fb-64e3-4a0b-8806-3ea6c47d3d66_1537x2046.jpeg&quot;,&quot;uuid&quot;:&quot;502bb079-2225-4622-84e8-007ac70d91e1&quot;}" data-component-name="MentionToDOM"></span> posted a comment on this Substack post:</p><div class="comment" 
data-attrs="{&quot;url&quot;:&quot;https://open.substack.com/&quot;,&quot;commentId&quot;:218173031,&quot;comment&quot;:{&quot;id&quot;:218173031,&quot;date&quot;:&quot;2026-02-22T18:16:36.126Z&quot;,&quot;edited_at&quot;:null,&quot;body&quot;:&quot;Great comment Andrew, this idea of buying datapacks you can converse with is exactly what I&#8217;m thinking. \n\nI&#8217;m currently reading &#8220;Your Best Meeting Ever&#8221; on audio written by Rebecca Hinds. Fantastic book. But I can&#8217;t take notes easily when I drive or pull a quote to share with my leadership team to talk about applying the principles. Or build a could slides to raise in our leadership meeting about having our own Meeting Doomsday. \n\nI want authors like Rebecca to get paid for her work (I bought it and would pay extra for access to a data pack I could converse with.) - if we can figure out how to protect the IP and keep authors pay I think there&#8217;s an interesting path forward here for readers and to get even more value from what authors create. \n\nSome may skip the traditional end to end reading, some may do both like I am. But if authors are getting paid and the ideas are circulating that feels like an interesting idea to me.&quot;,&quot;body_json&quot;:{&quot;type&quot;:&quot;doc&quot;,&quot;attrs&quot;:{&quot;schemaVersion&quot;:&quot;v1&quot;},&quot;content&quot;:[{&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Great comment Andrew, this idea of buying datapacks you can converse with is exactly what I&#8217;m thinking. &quot;}],&quot;type&quot;:&quot;paragraph&quot;},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;I&#8217;m currently reading &#8220;Your Best Meeting Ever&#8221; on audio written by Rebecca Hinds. Fantastic book. But I can&#8217;t take notes easily when I drive or pull a quote to share with my leadership team to talk about applying the principles. 
Or build a could slides to raise in our leadership meeting about having our own Meeting Doomsday. &quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;I want authors like Rebecca to get paid for her work (I bought it and would pay extra for access to a data pack I could converse with.) - if we can figure out how to protect the IP and keep authors pay I think there&#8217;s an interesting path forward here for readers and to get even more value from what authors create. &quot;}]},{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;type&quot;:&quot;text&quot;,&quot;text&quot;:&quot;Some may skip the traditional end to end reading, some may do both like I am. But if authors are getting paid and the ideas are circulating that feels like an interesting idea to me.&quot;}]}]},&quot;restacks&quot;:0,&quot;reaction_count&quot;:3,&quot;attachments&quot;:[{&quot;id&quot;:&quot;d6381ea9-4de0-42b9-83b8-0df058b34293&quot;,&quot;type&quot;:&quot;comment&quot;,&quot;publication&quot;:null,&quot;post&quot;:null,&quot;comment&quot;:{&quot;id&quot;:218005616,&quot;body&quot;:&quot;Interesting, reminds me of the ideas here: https://lethain.com/competitive-advantage-author-llms/&quot;,&quot;body_json&quot;:{&quot;content&quot;:[{&quot;type&quot;:&quot;paragraph&quot;,&quot;content&quot;:[{&quot;text&quot;:&quot;Interesting, reminds me of the ideas here: &quot;,&quot;type&quot;:&quot;text&quot;},{&quot;text&quot;:&quot;https://lethain.com/competitive-advantage-author-llms/&quot;,&quot;type&quot;:&quot;text&quot;,&quot;marks&quot;:[{&quot;type&quot;:&quot;link&quot;,&quot;attrs&quot;:{&quot;target&quot;:&quot;_blank&quot;,&quot;href&quot;:&quot;https://lethain.com/competitive-advantage-author-llms/&quot;,&quot;rel&quot;:&quot;nofollow ugc 
noopener&quot;,&quot;class&quot;:&quot;note-link&quot;}}]}]}],&quot;attrs&quot;:{&quot;schemaVersion&quot;:&quot;v1&quot;},&quot;type&quot;:&quot;doc&quot;},&quot;publication_id&quot;:null,&quot;post_id&quot;:null,&quot;user_id&quot;:12301499,&quot;type&quot;:&quot;feed&quot;,&quot;date&quot;:&quot;2026-02-22T09:54:04.023Z&quot;,&quot;edited_at&quot;:null,&quot;ancestor_path&quot;:&quot;217895979&quot;,&quot;reply_minimum_role&quot;:&quot;everyone&quot;,&quot;media_clip_id&quot;:null,&quot;user&quot;:{&quot;id&quot;:12301499,&quot;name&quot;:&quot;Andrew Jones&quot;,&quot;handle&quot;:&quot;andrewrjones&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1487daae-ccfb-4061-9206-b6b0653a3003_3024x3024.jpeg&quot;,&quot;bio&quot;:&quot;Principal (Data) Engineer. Coined Data Contracts. Father of 2. Brewer of beer.&quot;,&quot;profile_set_up_at&quot;:&quot;2023-02-07T16:59:01.478Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-09-23T08:51:51.077Z&quot;,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null},&quot;primary_publication&quot;:{&quot;id&quot;:3078050,&quot;subdomain&quot;:&quot;andrewrjones&quot;,&quot;custom_domain_optional&quot;:false,&quot;name&quot;:&quot;Andrew 
Jones&quot;,&quot;author_id&quot;:12301499,&quot;user_id&quot;:12301499,&quot;handles_enabled&quot;:false,&quot;explicit&quot;:false,&quot;is_personal_mode&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;pledges_enabled&quot;:true}},&quot;reaction_count&quot;:0,&quot;reactions&quot;:{&quot;&#10084;&quot;:0},&quot;restacks&quot;:1,&quot;restacked&quot;:false,&quot;children_count&quot;:0,&quot;user_bestseller_tier&quot;:null,&quot;userStatus&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null},&quot;user_primary_publication&quot;:{&quot;id&quot;:3078050,&quot;subdomain&quot;:&quot;andrewrjones&quot;,&quot;custom_domain_optional&quot;:false,&quot;name&quot;:&quot;Andrew Jones&quot;,&quot;author_id&quot;:12301499,&quot;user_id&quot;:12301499,&quot;handles_enabled&quot;:false,&quot;explicit&quot;:false,&quot;is_personal_mode&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;pledges_enabled&quot;:true},&quot;attachments&quot;:[{&quot;id&quot;:&quot;462e6143-2317-4ff1-9d9f-e7aedaa4b2e0&quot;,&quot;type&quot;:&quot;link&quot;,&quot;linkMetadata&quot;:{&quot;url&quot;:&quot;https://lethain.com/competitive-advantage-author-llms/&quot;,&quot;host&quot;:&quot;lethain.com&quot;,&quot;title&quot;:&quot;What is the competitive advantage of authors in the age of LLMs?&quot;,&quot;description&quot;:&quot;Over the past 19 months, I&#8217;ve written Crafting Engineering Strategy,\na book on creating engineering strategy. I&#8217;ve also been working increasingly with\nlarge language models at work.\nUnsurprisingly, the intersection of those two ideas is a topic that I&#8217;ve been thinking\nabout a lot. 
What, I&#8217;ve wondere&#8230;&quot;,&quot;image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7ff68084-baae-4aa4-b837-6a29d58595f1_400x400.png&quot;,&quot;original_image&quot;:&quot;https://lethain.com/static/author.png&quot;},&quot;explicit&quot;:false}]},&quot;trackingParameters&quot;:{&quot;item_primary_entity_key&quot;:&quot;c-218005616&quot;,&quot;item_entity_key&quot;:&quot;c-218005616&quot;,&quot;item_type&quot;:&quot;comment&quot;,&quot;item_comment_id&quot;:218005616,&quot;item_content_user_id&quot;:12301499,&quot;item_content_timestamp&quot;:&quot;2026-02-22T09:54:04.023Z&quot;,&quot;item_context_type&quot;:&quot;comment&quot;,&quot;item_context_type_bucket&quot;:&quot;&quot;,&quot;item_context_timestamp&quot;:&quot;2026-02-22T09:54:04.023Z&quot;,&quot;item_context_user_id&quot;:12301499,&quot;item_context_user_ids&quot;:[],&quot;item_can_reply&quot;:false,&quot;item_last_impression_at&quot;:null,&quot;impression_id&quot;:&quot;f93ce6b0-ac7f-45aa-8462-2ffd78b82e16&quot;,&quot;followed_user_count&quot;:172,&quot;subscribed_publication_count&quot;:140,&quot;is_following&quot;:true,&quot;is_explicitly_subscribed&quot;:false,&quot;note_velocity_factor&quot;:1.00489453674,&quot;note_delay_seconds&quot;:93,&quot;note_notes_per_hour&quot;:3242.770362,&quot;item_current_reaction_count&quot;:0,&quot;item_current_restack_count&quot;:1,&quot;item_current_reply_count&quot;:0}}],&quot;name&quot;:&quot;Anna 
Bergevin&quot;,&quot;user_id&quot;:61243663,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ffdf50fb-64e3-4a0b-8806-3ea6c47d3d66_1537x2046.jpeg&quot;,&quot;user_bestseller_tier&quot;:null,&quot;userStatus&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;subscriber&quot;,&quot;tier&quot;:1,&quot;accent_colors&quot;:null},&quot;paidPublicationIds&quot;:[10845,1473069],&quot;subscriber&quot;:null}}}" data-component-name="CommentPlaceholder"></div><p>On how she is seeing &#8220;datapacks&#8221;, which provide the content of a book in a form that can be easily used in an LLM, as having some value.</p><p>Both of these show that there is value in writing books and creating apps, but that the typical moat for both of those things has changed.</p><p>Nick can vibe code an IPC app as fast as I can, if not faster.</p><p>Anna can take the ePUB version of my book and use it in an LLM as fast as I can. 
If she pays more for tokens and the latest models than I do, she can do it faster and better than I can.</p><p>So effort and expertise are no longer the moat.</p><p>Given I have always said I &#8220;Can&#8217;t Code, Don&#8217;t Code, Won&#8217;t Code&#8221; but can now create an app just by asking questions, I&#8217;m pretty sure effort and expertise are not the moat they once were.<br><br>But maybe experience is.<br><br>To create my app in a way that meant it was actually useful, I had to have experience:</p><ul><li><p>experience using the canvas</p></li><li><p>experience working with multiple data teams on the problem the IPC solves</p></li><li><p>experience using apps to know what UX features were needed</p></li><li><p>experience to know that I needed to add Google Auth to provide an easy login </p></li><li><p>experience to know not to store any API keys in a way that made them public</p></li><li><p>experience to know that I wanted to use Google Spanner as the backend data repository and Svelte as the front end framework</p></li><li><p>experience to know I wanted an API layer between the backend data repository and the front end app</p></li><li><p>experience to know &#8230;.. 
</p></li></ul><p>I don&#8217;t have an answer to what the new moat is, but I have more clarity on what it isn&#8217;t after these experiments.</p>]]></content:encoded></item><item><title><![CDATA[Fact-based modelling patterns with Marco Wobben]]></title><description><![CDATA[AgileData Podcast #81]]></description><link>https://agiledata.info/p/fact-based-modelling-patterns-with</link><guid isPermaLink="false">https://agiledata.info/p/fact-based-modelling-patterns-with</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Sat, 14 Feb 2026 10:46:07 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6954060b-988e-4da4-9ab9-379b975be344_800x800.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Join Shane Gibson as he chats with Marco Wobben about the patterns within Fact-based modeling.</p><blockquote><p><strong><a href="https://agiledata.substack.com/i/187938526/listen">Listen</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/187938526/google-notebooklm-mindmap">View MindMap</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/187938526/google-notebooklm-briefing">Read AI Summary</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/187938526/transcript">Read Transcript</a></strong></p></blockquote><p></p><h2>Listen</h2><p>Listen on all good podcast hosts or over at:</p><p><a href="https://podcast.agiledata.io/e/fact-based-modelling-patterns-with-marco-wobben-episode-81/">https://podcast.agiledata.io/e/fact-based-modelling-patterns-with-marco-wobben-episode-81/</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://podcast.agiledata.io/e/fact-based-modelling-patterns-with-marco-wobben-episode-81/&quot;,&quot;text&quot;:&quot;Listen to the Podcast Episode on Podbean&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" 
href="https://podcast.agiledata.io/e/fact-based-modelling-patterns-with-marco-wobben-episode-81/"><span>Listen to the Podcast Episode on Podbean</span></a></p><p></p><p></p><blockquote><p><strong>Subscribe:</strong> <a href="https://podcasts.apple.com/nz/podcast/agiledata/id1456820781">Apple Podcast</a> | <a href="https://open.spotify.com/show/4wiQWj055HchKMxmYSKRIj">Spotify</a> | <a href="https://www.google.com/podcasts?feed=aHR0cHM6Ly9wb2RjYXN0LmFnaWxlZGF0YS5pby9mZWVkLnhtbA%3D%3D">Google Podcast </a>| <a href="https://music.amazon.com/podcasts/add0fc3f-ee5c-4227-bd28-35144d1bd9a6">Amazon Audible</a> | <a href="https://tunein.com/podcasts/Technology-Podcasts/AgileBI-p1214546/">TuneIn</a> | <a href="https://iheart.com/podcast/96630976">iHeartRadio</a> | <a href="https://player.fm/series/3347067">PlayerFM</a> | <a href="https://www.listennotes.com/podcasts/agiledata-agiledata-8ADKjli_fGx/">Listen Notes</a> | <a href="https://www.podchaser.com/podcasts/agiledata-822089">Podchaser</a> | <a href="https://www.deezer.com/en/show/5294327">Deezer</a> | <a href="https://podcastaddict.com/podcast/agiledata/4554760">Podcast Addict</a> |</p></blockquote><div id="youtube2-A8NNwWWsMro" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;A8NNwWWsMro&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/A8NNwWWsMro?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>You can get in touch with Marco via <a href="https://www.linkedin.com/in/wobben/">LinkedIn</a> or over at <a href="https://casetalk.com">https://casetalk.com</a></p><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? 
The Information Product Canvas helps you get clarity in 30 minutes or less.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><h2>Google NotebookLM Mindmap </h2><p></p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SSeF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SSeF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png 424w, https://substackcdn.com/image/fetch/$s_!SSeF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png 848w, https://substackcdn.com/image/fetch/$s_!SSeF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png 1272w, https://substackcdn.com/image/fetch/$s_!SSeF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!SSeF!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png" width="1200" height="1596.4285714285713" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1937,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:1079335,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/187938526?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SSeF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png 424w, https://substackcdn.com/image/fetch/$s_!SSeF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png 848w, https://substackcdn.com/image/fetch/$s_!SSeF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png 1272w, 
https://substackcdn.com/image/fetch/$s_!SSeF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc06da155-bf0f-4e7c-99ce-adf7fd87e932_4012x5337.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><h2>Google NotebookLM Briefing</h2><h2><strong>Executive Summary</strong></h2><p>This document synthesizes key insights from a discussion between Shane Gibson and Marco Wobben regarding <strong>Fact-Based Modeling</strong>, also known as Fact-Oriented Modeling. 
The central premise is that modern data modeling has become a &#8220;lost art,&#8221; leading to significant &#8220;business debt&#8221; where organizations lose the context and meaning behind their data due to silos and rapid staff turnover.</p><p>The core solution presented is Fact-Based Modeling, a methodology that grounds abstract technical requirements in &#8220;administrative reality&#8221; by combining linguistic terms with actual data examples. By focusing on how stakeholders communicate (Information Modeling) rather than just how systems store data (Data Modeling), Fact-Based Modeling allows organizations to bridge the gap between business subject matter experts (SMEs) and technical implementations. This approach not only ensures more accurate system design but also provides the necessary semantic grounding for emerging technologies like Large Language Models (LLMs).</p><p>--------------------------------------------------------------------------------</p><h3><strong>The Crisis of Lost Context: Technical and Business Debt</strong></h3><p>The current state of data management is characterized by a widening gap between what is stored in systems and what those records mean to the business.</p><ul><li><p><strong>Evaporation of Knowledge:</strong> Senior experts with decades of organizational history are retiring or leaving, and the average job tenure (four to six years) is too short to maintain deep context.</p></li><li><p><strong>Business Debt:</strong> This is the cumulative loss of meaning within an organization. When systems are built without documenting the &#8220;story&#8221; behind the data, the original business intent is lost, leaving IT to guess the context of legacy records.</p></li><li><p><strong>The Context Gap:</strong> Technical optimization (how data is stored) often overrides business representation (how data is used). 
This leads to &#8220;tribal wars&#8221; where different departments use the same terms (e.g., &#8220;inventory&#8221; or &#8220;customer&#8221;) to mean entirely different things based on their specific departmental needs.</p></li></ul><p>--------------------------------------------------------------------------------</p><h3><strong>Defining Fact-Based Modeling </strong></h3><p>Fact-Based Modeling is a methodology developed to capture domain knowledge by focusing on &#8220;facts&#8221;&#8212;statements about the business that are agreed upon as true within a specific context.</p><p><strong>Core Components of the Fact-Based Modeling Approach</strong></p><ul><li><p><strong>Grounding in Examples:</strong> Unlike traditional modeling that looks at abstract entities and attributes, Fact-Based Modeling uses &#8220;data by example.&#8221; Instead of discussing a &#8220;Customer&#8221; entity, a modeler uses a statement like: <em>&#8220;Customer 123 buys Product XYZ.&#8221;</em></p></li><li><p><strong>Binding Term and Fact:</strong> By combining the linguistic term with a concrete fact, modelers can identify misalignments quickly. For instance, seeing that one system identifies a customer by an email and another by a numeric ID reveals a transformation problem that abstract modeling might miss.</p></li><li><p><strong>Information Modeling vs. 
Data Modeling:</strong></p><ul><li><p> <strong>Information Modeling:</strong> Focuses on how humans communicate about data to reach alignment.</p></li><li><p> <strong>Data Modeling:</strong> Focuses on technical storage, structures, and optimization.</p></li><li><p> <strong>The Distinction:</strong> Information modeling is the &#8220;primary citizen,&#8221; while technical artifacts (SQL, schemas) are secondary outputs derived from it.</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WSQf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WSQf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png 424w, https://substackcdn.com/image/fetch/$s_!WSQf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png 848w, https://substackcdn.com/image/fetch/$s_!WSQf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png 1272w, https://substackcdn.com/image/fetch/$s_!WSQf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!WSQf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png" width="676" height="299.05349794238685" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:215,&quot;width&quot;:486,&quot;resizeWidth&quot;:676,&quot;bytes&quot;:27542,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/187938526?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!WSQf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png 424w, https://substackcdn.com/image/fetch/$s_!WSQf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png 848w, https://substackcdn.com/image/fetch/$s_!WSQf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png 1272w, 
https://substackcdn.com/image/fetch/$s_!WSQf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57420bb3-698b-43ec-a661-0de5e37e561d_486x215.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>--------------------------------------------------------------------------------</p><h3><strong>Methodology: The Process of Fact-Based Modeling</strong></h3><p>Fact-Based Modeling follows a specific logical flow to ensure that complexity is simplified without losing essential nuances.</p><ol><li><p><strong>Scope the Domain:</strong> Identify the specific problem area (e.g., Sales, Emergency Room, Tax).</p></li><li><p><strong>Gather Data Stories:</strong> Interview SMEs to collect verbalizations of how they describe their work.</p></li><li><p><strong>Identify Business Constraints:</strong> Use interactive questioning to find the &#8220;rules&#8221; of the data. (e.g., &#8220;Can a citizen be registered in more than one municipality at once?&#8221;)</p></li><li><p><strong>Identify Exceptions:</strong> Use the data examples to flush out the &#8220;edge cases&#8221; that SMEs often forget until they see a specific record.</p></li><li><p><strong>Alignment through Generalized Objects:</strong> When different departments use different identifiers for the same concept (e.g., Name vs. Email), Fact-Based Modeling uses &#8220;generalized object types&#8221; to link these different views into a unified communication framework.</p></li></ol><p>--------------------------------------------------------------------------------</p><h3><strong>Strategic Value and Modern Application</strong></h3><p><strong>1. Automation and Efficiency</strong></p><p>Fact-Based Modeling allows for a &#8220;context-first&#8221; implementation. 
Because the model is rich in semantics and constraints, tools can automatically generate:</p><ul><li><p>SQL for database creation.</p></li><li><p>Data Vault or normalized models.</p></li><li><p>Database views that represent the original user stories.</p></li><li><p>Test data derived directly from the interviews.</p></li></ul><p><strong>2. Grounding AI and LLMs</strong></p><p>LLMs are proficient at generating &#8220;fabricated stories&#8221; but lack organizational context. Fact-Based Modeling provides the &#8220;ground truth&#8221; needed to keep AI outputs accurate. When an LLM is fed the terms, definitions, facts, and business constraints from a fact-based model, it can perform tasks with a much higher degree of reliability.</p><p><strong>3. Avoiding the &#8220;Generic Model&#8221; Trap</strong></p><p>The discussion highlights the failure of massive, pre-built industry models (e.g., the IBM Banking Model). These models often fail because organizations do not know their own &#8220;edge&#8221; or specific context. Fact-Based Modeling allows a company to capture its unique business logic rather than trying to fit into a generic template that ignores its specific reality.</p><p>--------------------------------------------------------------------------------</p><h3><strong>Notable Insights and Quotes</strong></h3><p><strong>On Complexity and Simplicity:</strong> &#8220;If the end product is presented and everybody goes: &#8216;Wow, is this it? Did it really take you that long... I could have done this,&#8217; then I succeeded in making something very complicated very simple to understand.&#8221; &#8212; <em>Marco Wobben.</em></p><p><strong>On the collaborative nature of solving data ambiguity:</strong> &#8220;It is somehow a team effort to slay this beast of miscommunication until everybody agrees and understands each other. ... It&#8217;s all about working together and trying to fight it. What are we not seeing? What are we missing? How do we tackle this? 
And a lot of that is just human interaction.&#8221; &#8212; <em>Marco Wobben.</em></p><p><strong>On the Definition of a Fact:</strong> &#8220;A fact is a piece of data that&#8217;s physically represented somewhere... I can point to it. It has been created. I&#8217;m not inferring it. It is something that is factually there.&#8221; &#8212; <em>Shane Gibson.</em></p><p><strong>On Party Entity Data Models:</strong> &#8220;The most expensive part of our systems is the humans and [understanding] that context... as soon as you design a system with &#8216;thing as a thing&#8217; and that context lives nowhere else, I now have to spend a massive amount of expensive time trying to understand what the hell [it is].&#8221; &#8212; <em>Shane Gibson.</em></p><p><strong>On the concept of &#8220;business debt&#8221; created by rapid technological change:</strong> &#8220;This is the paradox where business wants to have change faster. And IT ruined the party by saying, we can deliver faster with this new latest tech, but neither party realized what they were losing along the way. So it&#8217;s technical debt, it&#8217;s business debt.&#8221; &#8212; <em>Marco Wobben.</em></p><p>--------------------------------------------------------------------------------</p><p><strong>Conclusion</strong></p><p>Fact-Based Modeling serves as a bridge between the human understanding of business processes and the technical requirements of data storage. By prioritizing the &#8220;authentic story&#8221; of the business and grounding it in real-world data examples, organizations can mitigate the risks of technical and business debt, ensuring that their data remains a usable, understood asset even as technology and personnel change.</p><p></p><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? 
The Information Product Canvas helps you get clarity in 30 minutes or less.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><p></p><h2>Transcript</h2><p><strong>Shane</strong>: Welcome to the Agile Data Podcast. I&#8217;m Shane Gibson.</p><p><strong>Marco</strong>: And I&#8217;m Marco Wobben.</p><p><strong>Shane</strong>: Hey, Marco. Thank you for coming on the show. Today we&#8217;re gonna talk about a thing called fact-based modeling, but before we do that, why don&#8217;t you give the audience a bit of background about yourself?</p><p><strong>Marco</strong>: Ah, yes. Thank you. Thanks for having me. Yeah, background, a lot of background there. I&#8217;ve been around a few decades. I first fell in love with computers when I was still in secondary school. I got hooked when somebody showed me the break key on a keyboard and we could stop games mid-play, change the code, and resume.</p><p>And that was like, oh, this is magic. I need to figure out how to make this my toolbox. And I&#8217;ve enjoyed making software, hacking software, and working with computers ever since. After getting a job onboarding people on Microsoft Windows and Office back in the day, I decided to just chase my own career, quit the job, and started doing entrepreneurial work, making custom software from design to end product for a number of startups. And then somewhere in the early 2000s, a professor knocked on our door and said, we have this source code of a modeling tool, and the kids that graduated on it took a different path, and we need somebody to maintain it. 
And ever since the early 2000s, I&#8217;ve been working on fact-based modeling, which I had to learn from the inside out.</p><p>So that&#8217;s where I am now. I&#8217;m considered the expert currently, &#8217;cause a lot of professors retired and the young people haven&#8217;t caught on to it yet. And here I am, talking about fact-based modeling, or fact-oriented modeling, if you will.</p><p><strong>Shane</strong>: Yeah, it&#8217;s interesting. Before we started, we talked briefly about the fact that data modeling&#8217;s become a lost art. And actually, I think it&#8217;s coming back now; with all the AI focus, data modeling is a term I&#8217;m seeing used a lot. But if you think about your career path, there&#8217;s that idea that you could start out with games as a way of introducing yourself to computers in the early days.</p><p>I was like you when I started out. We had a couple of computers at school. I think they weren&#8217;t green; I think they were amber screens back then. And yeah, again, I got hooked by the games. And I wonder whether the thing that&#8217;s been missing is gamifying data modeling.</p><p>Like, actually making it exciting is probably the missing thing, right? If there was a game where you just happened to data model when you played it, maybe that&#8217;s what we need to make what we do sticky with people that are coming up in their career.</p><p><strong>Marco</strong>: That&#8217;s an interesting take on it. Looking at my users, and I&#8217;ve been maintaining this information modeling tool for years now, there seems to be more interest from people that are curious and like doing things right and talking about things. 
And I can see that in a lot of gaming communities, where in the old days you played your game single player with this computer, right?</p><p>But gamification nowadays is so much more. You can team up, and none of that stuff was there in the early years. Even graphics were not there yet. The communication aspect nowadays is more and more prevalent and important. If you look back, what data modeling really is, is getting the technical requirements of what people actually needed the computer to do and store, and how to manage it.</p><p>Seeing that come back a little, I dunno if gamification would help, but there&#8217;s definitely more openness to, let&#8217;s talk about it. And you can see it a little bit in the agile phenomenon, where there are a lot of standups, stakeholders, product owners, and everybody starts talking and communicating with each other.</p><p>So that&#8217;s definitely on the rise. I&#8217;ve seen a few data modeling efforts that actually try to gamify it: they have this data modeling tool and it has flashing text and animated tables. I&#8217;m not sure if that&#8217;s really the end goal, but I can see how it might help. </p><p><strong>Shane</strong>: Yeah. I was thinking more about gamification as in the adrenaline rush when you are successful in solving a problem, not the flashing things. Because earlier in my career, I worked for a software startup in the accounting space called X. One of the reasons that they became successful was they had this gamification of bank reconciliation.</p><p>So pretty much you had bank transactions on the left and your invoices and all that on the right. And whenever a transaction came up on your bank statement, you pretty much dragged it and matched it to the right. 
And then that row disappeared, and it was a very simple gamification process.</p><p>But there&#8217;s this adrenaline rush in going in and seeing your bank rec with 50 transactions you haven&#8217;t reconciled, and just going bing, bang, bosh, and it becomes clear. It didn&#8217;t have confetti. Actually that was one of the big arguments at the time was should it have confetti con? </p><p><strong>Marco</strong>: More like a Tetris row disappearing from your screen.</p><p><strong>Shane</strong>: Yeah. Yeah, exactly. And so you sit there going, maybe that&#8217;s it. So as I think about it more, when I&#8217;m conceptually modeling, the adrenaline rush is when I create a map that I can show to a stakeholder and they just nod and go, yeah, that&#8217;s business reality. I get it. Yeah. That&#8217;s how we work.</p><p>When I logically model, it&#8217;s this idea of, yeah, I can take that conceptual model and I can slam it into the modeling pattern that I use and it makes sense. And then when I take that logical model and I make a physical model, and the cloud database actually takes it, and I can query it fast and it doesn&#8217;t cost me a fortune, and any question that I get asked can be answered with that data.</p><p>There&#8217;s that adrenaline rush, right? That gamification of each of those steps adding value to my life or somebody else&#8217;s. I dunno, I just hadn&#8217;t thought about it that way until you mentioned that you started out hacking games.</p><p><strong>Marco</strong>: That definitely is true for me, because as I add features that support user functionality and all that, there&#8217;s definitely a rush when I see people use it and they go, oh, this is handy. This is practical. This makes my work easier. Then I get the adrenaline rush. Definitely on the modeling part itself. 
It usually goes through long cycles of deep talks about the subject matter at hand, where there&#8217;s definitely an adrenaline rush, but not necessarily always the good kind. As an example that I&#8217;ve used a lot, and some other authors put it in their books too: I had a long interview with a subject matter expert and it took us about two hours to figure out one requirement, and there was a lot of talk on the type level that created confusion because, yeah, what is a customer really? So I had to get through that. And in the end there was a little bit of euphoria, the adrenaline rush, if you will, when we finally nailed it. The subject matter expert replied, somewhat baffled and in disbelief, and he said, I had no idea my work was that complicated. And then being able to write it down in a way that everybody suddenly understands, that&#8217;s definitely a moment of adrenaline rush, if you will. It takes two hours of hard work. It takes a lot of interviews, a lot of digging. And then finally you reach that point. I remember a quote in a book, I forgot which one it was, but the quote really spoke to me. This man was speaking about information modeling on a data modeling level.</p><p>And he said, if the end product is presented and everybody goes, wow, is this it? Did it really take you that long, working so hard, to just present this? I could have done this. Then his reply was, then I succeeded in making something very complicated very simple to understand.</p><p><strong>Shane</strong>: And that&#8217;s the key: to take that complexity and describe it in a simple way where everybody nods and either agrees or disagrees. I remember one where we literally spent three months with an organization trying to get the definition of active subscription. And the problem was we had three business units.</p><p>Effectively three domains that all had different definitions, but couldn&#8217;t agree. 
They actually had different definitions. It wasn&#8217;t about the model itself, it was about the plain language description of that term, where either we added a term in front of it, marketing active subscription, finance active subscription.</p><p>So we were clear they were two different definitions, or they actually agreed active subscription was described in this way. And that human engagement was where the time was spent, not creating a map with nodes and links on it, not creating the database. But without getting to a stage where we either agreed or disagreed on that term, there was minimal value in carrying on, because we would present a number that people don&#8217;t agree with because the definition&#8217;s wrong, not the number. </p><p><strong>Marco</strong>: In the years of information modeling, I call it information modeling, but it&#8217;s really just fact-based modeling underneath as well, I&#8217;ve encountered so many synonyms or homonyms, and people just completely get confused and it becomes almost like a battle.</p><p>But here&#8217;s the thing, and I think you explained that as well. It is somehow a team effort to slay this beast of miscommunication until everybody agrees and understands each other. I just recently started watching the latest season of Stranger Things, where it&#8217;s, we have to team up to fight this monster from the underworld, and we can&#8217;t see it. We know it&#8217;s there. And it&#8217;s like, how do you fight it? And it&#8217;s all about working together and trying to figure it out. What are we not seeing? What are we missing? How do we tackle this? And a lot of that is just human interaction.</p><p><strong>Shane</strong>: And finding ways of taking that expertise that we have, that ability to take complexity in a business organization and try and create a map that has simplicity that we can all share. 
That is a skill. And it&#8217;s how do we take other people on the journey without going into a room for six months on our own, or creating really complex ERD diagrams with many-to-many crow&#8217;s feet that few people understand. And terminology is really important. And so it&#8217;s interesting that you talk about information modeling and then you also talk about fact-based modeling, &#8216;cause as soon as I heard of fact-based modeling, I naturally went to dimensional modeling and star schemas, because that is where I first heard a definition of the term fact.</p><p>And my understanding is you are talking about information modeling rather than anything to do with dimensional or star schemas. Is that correct?</p><p><strong>Marco</strong>: Yes. That&#8217;s funny, &#8216;cause your first take on the word fact is how I met my wife, at a data modeling conference in Portland in the USA. She was like, oh, fact-based modeling, I&#8217;m doing something with data warehousing. I should go to that class to listen to what this man has to say.</p><p>And it was just not the same fact. So even there, even in IT, we have confusion of words. But the word fact really boils down to something that is maybe a little older even, where database records really store facts as they happen in our administrative reality, as I call it nowadays.</p><p>Because there&#8217;s a lot of talk about the single point of truth: we need to get the truth out there, the reality, and all of those things. But there&#8217;s something seriously flawed with that, which is that we all perceive the world out there from our own bias and subjective reality. So there is no such thing as truth. But when you store data and you consider that to be true in your world, then you can state that as a fact. A fact as in, I&#8217;m writing this down and me and my colleagues agree on it. And so that&#8217;s where the terminology came from. 
Nowadays we&#8217;re trying to figure out whether maybe we should call those claims instead, because we all say something and we all think it&#8217;s true.</p><p>And sometimes, especially in data warehousing, you collect data from different source systems. Yeah, but you can&#8217;t just say that they all state facts, because some might be alternate facts. So let&#8217;s put it as: all the source systems claim a certain statement about what happened. </p><p><strong>Shane</strong>: It&#8217;s interesting, &#8216;cause we were talking again beforehand about how we&#8217;ve both been in the domain for quite a while, but we&#8217;ve never really crossed paths. And I&#8217;ve heard your name a lot, but I&#8217;ve never really read a lot of your stuff. And one of the things I did do, I had to train a new team moving to the data space.</p><p>And so I was trying to describe the difference between facts, measures, and metrics. And what I ended up coming up with was a set of definitions, and the definition I used was: a fact is a piece of data that&#8217;s physically represented somewhere. So if I go into a database and I see a number, that is a fact. If I go into a spreadsheet and I see a number or a piece of text, that is a fact, because I can prove it existed. I can say that factually it&#8217;s there. I can point to it. It has been created. I&#8217;m not inferring it. It is something that is factually there for me. And that&#8217;s kinda why I use fact.</p><p>And the reason I raised information modeling versus dimensional modeling is &#8216;cause as soon as I use that term fact, anybody in the data domain goes, oh, you&#8217;re talking about a fact table. And I&#8217;m like, no. And then for me, I defined a measure as a formula: sum of this, that kind of thing.</p><p>And then a metric is a complex formula. So this over this, based on that, at this point in time. 
And so for me, I didn&#8217;t mind whether people disagreed with my definitions, as long as they gave me a different definition. But those are the three terms that I used that seemed to get clarity and understanding when I was talking to people who weren&#8217;t data experts.</p><p>So yeah, I go back to the true definition of that term. A fact is not a fact table in a dimensional model. And maybe, yeah, should we move to claim, or should we just bring back the true definition of that term? That&#8217;s one of the problems we have in our domain: we have what we call pedantic semantic arguments about the most non-interesting things.</p><p><strong>Marco</strong>: We are not gonna solve that, because there&#8217;s so much of it. When I actually do information modeling, and we can take the word, I&#8217;ve used it in different environments, we can all agree upon what the definition is for the word inventory. It&#8217;s the amount of things that we have of a certain article. But then you go into different departments.</p><p>Sales comes up with three, because they already sold a few. Purchasing says we have eight, because they already ordered a few. And then you go into the warehouse and the guy goes, but I only have one on the shelf. What the hell is going on? Even though everybody agrees on the definition, they have different data. And as soon as you and I speak about facts, I could say that the customer buys a product, and we agree and we call that a fact, but it has nothing to do with the data at all. So the word fact itself is like inventory, like location, like customer: you can apply it to anything and it doesn&#8217;t mean anything. So getting too hung up on it is very tricky, because then you will start tribal wars: no, this is what the definition of a fact is. But the reality is that the word fact, the linguistic part of it, the term, is used in different contexts. 
So if you wanna use it within your dimensional world, it&#8217;s fine. I&#8217;m not gonna argue with that. It is similar to calling something red: we will find it in different environments. I&#8217;m looking at a book that has a red cover. You look at the fire truck, which is also red. So, oh, we&#8217;re all good.</p><p><strong>Shane</strong>: Unless you&#8217;re in a country where the fire trucks are green. And yeah, it&#8217;s interesting, &#8216;cause Remco Broekmans talks about the example he has of the definition of a bed in a hospital, where one group basically said it is the metal frame that the patient sleeps on. And another part of the organization said, nah, hold on.</p><p>It&#8217;s the room where the patient&#8217;s located. And so if I looked at the data, I would see one was probably two meters by two and a half meters and the other one was probably five meters by five meters. And then I could say, the fact that this bed is five meters by five meters and has no wheels confuses me, because I&#8217;d expect it to be two by three with wheels.</p><p>Maybe that data&#8217;s gonna gimme a hint that the definitions are different. So getting into that, can you just gimme a helicopter view of how fact-based modeling works?</p><p><strong>Marco</strong>: You already gave me some beautiful examples: to distinguish the things, you also have to look at the data. And what I see happening in the data modeling space is that usually the data is not that relevant. We&#8217;re all looking at tables, entities, classes, and what kind of attributes they have, what columns need to go in, and what the relationships or foreign keys are, and all.</p><p>So there&#8217;s a lot of technical views on it, and some of it may be guided by the data at hand. But the data itself is a second-class citizen. 
And as I just stated with the example of inventory, it&#8217;s only by looking at the data that you start realizing, wait a minute, we&#8217;re all calling this inventory, but we&#8217;re seeing different things.</p><p>So there must be a distinction. And instead of having these, I call them tribal wars, these linguistic fights about, no, that&#8217;s not what I said, this is what it means, and all of that. I call those type-level arguments, type-level discussions. It&#8217;s abstract in a way. So what fact-oriented modeling does is two things. First, how do I talk about my data? And I use the data in the expressions, so I&#8217;m grounding it. I&#8217;m not talking about customer buys product. I&#8217;m saying customer with customer number 1, 2, 3 buys the product X, Y, Z, where both 1, 2, 3 and X, Y, Z are actual data, real examples to ground the discussion.</p><p>Because if I use inventory and I don&#8217;t specify that I look at it in a certain way by giving the data, we will not discover that we might have a difference of perception there. So the why of fact-oriented modeling, or fact-based modeling, is that really we need to figure out how I talk about it, how you talk about it, and how we can talk about it and ground it with actual facts, this actual data. It&#8217;s not enough to say the customer buys the product, because you and I would agree, but it&#8217;s only by giving the actual data examples that we might start realizing, wait a minute, I have numbers to identify my customers, and you might have email addresses to identify your customers, so there&#8217;s something else going on. So it is really digging that one level deeper into it, and not letting that go, because in fact-based modeling we keep tying the example data to the language, to the fact types. 
So it&#8217;s always that package, so that we can at any point transform our information models into any kind of artifact, but still show the example data to accompany the definitions, to accompany the structures, and to illustrate that this comes with a specific data use. And I think that&#8217;s the biggest difference: when you look at data modeling or dimensional modeling or data vault modeling, it&#8217;s all on a type level, and it&#8217;s usually geared towards how do we structure the storage of data. And it&#8217;s less about how do you and I figure out how we talk about the data, and how do we quantify with data that we&#8217;re talking about the same thing? </p><p><strong>Shane</strong>: What I found interesting was, I&#8217;ve used the who-does-what pattern a lot, and I think I probably got it from Lawrence Corr&#8217;s BEAM stuff, &#8216;cause that was some of the stuff I read earlier in my career. And I&#8217;ll often talk about customer orders product from employee in store.</p><p>And then I&#8217;ve also used data by example a lot. And again, I can&#8217;t remember where I found that pattern, but I found it valuable. So: Bob purchased three scoops of chocolate ice cream from Jane at 106 High Street in Swindon. And what interests me about fact-based modeling, when I had a quick look at some of your presentations, is it seems like you are combining both those patterns.</p><p>So you are combining both the term and the fact into that data story. So you are saying, customer Bob purchased product chocolate ice cream from employee Jane at store 106 High Street.</p><p>Is that what you&#8217;re doing? You are combining the term and the fact together to give much more context around that business reality, to then help you in the rest of the work?</p><p><strong>Marco</strong>: Yeah, definitely. It&#8217;s very much that approach, and of course tools give you different functions to do it in a different way. 
But the theory, which was developed from the mid seventies and continued to develop all the way up to the end of the nineties, was really about how can we write it down in a way that non-technical people can actually read what it means.</p><p>And in the interviewing phase, in the workshops, in the explorative areas, as in the examples I gave earlier, it&#8217;s not enough to just say, okay, we have a customer and we need a data system for that. We also need to figure out how you talk about it.</p><p>And I think an increasingly difficult problem is that in the seventies, eighties, even nineties, a lot of systems were built for a specific purpose, for a specific department, for a specific system that everybody knew the context of. So if I&#8217;m doing a customer registration, then I&#8217;m doing just that, for my department, within my group of employees.</p><p>And we all know the context. So everybody understands that if I put customer there, they all know what customer is. Now with the increase of IT, it added a computer in every department, in every system, but they were never integrated. So this is where data warehousing came in. But because the context was highly implied, the data warehousing team, the BI team, now had to figure out, okay, what if we pull the data?</p><p>What does it even mean? So we&#8217;re really trying to rediscover the context of where it is used. The human aspects of understanding need to be reverse engineered on top of the data structure. And this especially goes for what we&#8217;re currently seeing with all the LLMs. Yes, they can give great semantics or storylines, but do they really understand the context?</p><p>So this is another gap. And what the fact-oriented modeling approach does is that we try to interview and capture the knowledge of the domain and subject matter experts. 
With all the context, with their language, with their examples, and we try to line that up with the data, the language, and the meaning of some other department that uses a different system. And not just to be able to talk about that specific system, but also to align the communication across departments, across contexts. That&#8217;s really what got lost. If you have individual silos that you want to bring together into a data warehouse solution, the missing piece is the original and authentic communication about their systems in the first place. And then the fact-oriented approach allows those subject matter experts to talk about it in a way that they understand, that can be verified by colleagues, and that can be transferred across contexts. And I think that&#8217;s the real added value. When systems were built within a specific context, the value of that was not really seen, because it served the context.</p><p>Everybody knew that. Whereas you see now an increasing interest in, okay, we need to get back to our ontology, we need to get back to our semantics, we need to get back to meaning, we have to get back to information. So all of that was there, and it&#8217;s still there if you don&#8217;t throw away the baby with the bath water.</p><p><strong>Shane</strong>: I agree. We go through waves. So we see a wave right now, outside the AI wave, of BI metric layers. This idea of having a layer where you can define a metric, and then regardless of which system the data&#8217;s coming from, or which one you are hitting, that same metric&#8217;s been applied.</p><p>And I go, well, you know, years ago in the old BI tools we had an end user layer or a universe. We&#8217;ve had that pattern, and then we lost it, and then we get it back. I&#8217;m also with you in terms of the way systems have evolved. 
We&#8217;ve gone from a mainframe, where we had one system where there was a term called customer and there was one fact and that was it.</p><p>And then we went to client-server and n-tier, and we ended up with seven. And then software as a service, 50 to a hundred. And with the AI wave now and GenAI, we&#8217;re gonna see these one-shot apps. We&#8217;re gonna see thousands of things that have been created for one use case get stood up really quickly, and then probably disposed of.</p><p>And so this ability to have a language where, regardless of the system and how many we have, we get shared context is really important.</p><p>And that example that you used, I just wanna come back to that, because I hadn&#8217;t thought about it that way. So if I said customer orders product, I have a thousand systems that involve a customer, an order, and a product.</p><p>I have absolutely no context whether they&#8217;re defining the same things the same way.</p><p>But if I use this data by example, if I say customer Bob ordered product, and then in another system I see customer bob@gmail.com ordered product, and in another system I see customer 1, 2, 3 ordered product, and in another system customer A, B, C ordered product.</p><p>What I know now, based on pattern recognition, is that I have a problem with the unique identifier of customer.</p><p>Like, I can tell that in 10 seconds by seeing those four data examples, and it&#8217;s something I know I need to solve.</p><p>Because to mash that all up, I need some form of conformity, right?</p><p>Some form of either shared identifier or a way of mapping it. 
And so I can see that combining this term and this fact together gives me the richness to understand the problems I need to solve much quicker than any of the other techniques where they&#8217;re separated.</p><p><strong>Marco</strong>: That&#8217;s true, and you&#8217;re still somewhat catching an optimal path, because you already assumed in this example that all those systems had something called customer. In one system it&#8217;s probably CST. Then you have to combine it with SAP, where it has X 3 0 4 as a table name, and then you have a very abstract table called persons. So you can see where this is going, because in all of those systems, the storage and the structuring of the data served only one purpose: optimizing the IT part of it. The IT end product presented the data in a way that the business wants to work with it and wants to see it, but that doesn&#8217;t represent how it is stored. The storage and the management of the physical data is technically optimized. It is not necessarily done with the business representation in mind. And this is what a lot of people in the data space have encountered too. It&#8217;s, okay, now I have 200 source systems and I need to figure out what is what.</p><p>So this email address, does that indeed correspond with my customer in the other system, where it&#8217;s identified with 1, 2, 3? Is that even the same customer? Is it a customer in the first place? So there is so much, not just technical debt for systems that are undocumented or not well documented or behind in documentation, but also what I increasingly call business debt: nobody knows what that meant to the business anymore. And this is the paradox, where business wants to have change faster, and IT ruined the party by saying, we can deliver faster with this new latest tech, but neither party realized what they were losing along the way. So it&#8217;s technical debt, it&#8217;s business debt. 
In governments it&#8217;s even worse, because there&#8217;s a massive gap between the legal articles made up by politicians, full of compromises and loopholes, and the actual systems used by government bodies.</p><p>And now we changed the law: which system did we need to change? Or vice versa: we&#8217;re looking at data here, but we have no idea if we&#8217;re even allowed to have this data or even be able to look at it, because we don&#8217;t know the legal articles that go with it. So everything in IT scaled up so quickly in the past 30 years. It totally got outta control, in a way where the next tool&#8217;s not gonna solve it. An LLM is wonderful, magical stuff, but it&#8217;s not gonna solve the real thinking issues. We can do data profiling, but we are still not sure if we got the context right. We can do LLM, but we are still not sure if the relationships are correct. So there&#8217;s still that gap of knowledge that we lost and somehow need to reintroduce to make things really work.</p><p><strong>Shane</strong>: And actually that&#8217;s interesting, around that organizational context and losing that knowledge. Because if I think about it, if I was walking into a new organization and I wanted to understand the context of that organization, my natural technique was to find somebody who&#8217;d been there for a long time as a subject matter expert, that person that&#8217;d been there for 20 years, because they&#8217;ve got the stories of how that context happened and why it happened.</p><p>And so moving to the UK, I had to set up bank accounts. And there&#8217;s a whole reason that I needed to use a UK-domiciled bank rather than one of the newer, easier-to-deal-with banks. And so I went to one of the main banks and I created a personal bank account.</p><p>Took a while. All good. And then I went back to them, &#8216;cause now I&#8217;m a customer, I&#8217;ve been identified. In theory, everything is easy. 
And I tried to create a business bank account and it forced me to create a new identity. I had to go through the whole identity process, even though I used the same email address, which was my form of identification, my identifier. I had to go and revalidate myself, that I have the same address, same passport number.</p><p>And you sit back as a customer and you go, that&#8217;s just crazy bollocks. And then I talked to somebody that had worked for that bank, that subject matter expert who&#8217;d been there for a while. And they said, yeah, but you gotta understand that was two different banks, two different systems, and they haven&#8217;t been merged.</p><p>And therefore you may think you&#8217;re dealing with the same organization, but you&#8217;re not really. And I&#8217;m like, yeah, actually. Okay, that makes sense. Now why did I not understand that, working in data so often? But if we go back to that example of term and fact. So yes, the term changes, so now I see customer and I see person and I see prospect and I see X 2 0 3.</p><p>If the fact is the same, if every one of those is customer Bob at Gmail, X 2 0 3 Bob at Gmail, prospect Bob at Gmail, again, I&#8217;m getting more context. I&#8217;m not getting an answer, but I&#8217;m getting more information that can help me understand a problem to be solved. And as we know with data, there&#8217;s so many problems to be solved.</p><p>It&#8217;s such a complex space when we hit the reality of organizations: the way they work, the terms they use, the systems they use, the way they create and store data. But this idea of binding term and fact gives me some more hints, because now I can say, they&#8217;re all the same email address, are they the same term?</p><p>And then somebody will say, actually no, when you see prospect and customer, it is a different rule. And then somebody else will go, but you do know that they can change their email address whenever they feel like it. 
And you&#8217;re like, yep, seen that pattern before. Okay, so it&#8217;s an identifier, it&#8217;s a unique identifier, but it&#8217;s not a consistent or persistent identifier.</p><p>There are all these patterns in data that we know are gonna hit us, but by binding that term and that fact, I get some hints at the beginning across multiple subject matter experts. So I can see real value in that part of the process early. And so talk me through that. You drop into a new organization.</p><p>You wanna start off with fact-based modeling. How do you do it? What do you actually do?</p><p><strong>Marco</strong>: It&#8217;s a funny question. A lot of people ask me that, how do you start? If you open up the books about this topic, which are written as anything from a university proof of concept all the way to self study, it always starts with gather your sources, figure out what the domain is about in the first place.</p><p>So there&#8217;s very much an almost top-down approach where you go, okay, we&#8217;ve got sales, we&#8217;ve got production. So it&#8217;s the general area. You can&#8217;t just jump into the jungle and start describing all the little insects on the jungle floor.</p><p>That&#8217;s not how it works. But usually there is a problem domain, there is an integration problem. So what needs to happen is that you need to be able to carve out some time with at least some business domain expert, subject matter expert, to sit down and say, okay, what&#8217;s the issue here?</p><p>What are you doing? What does it do? And start writing it down. So far, not that much different from actual data modeling, if you will. But the distinction starts with the nitty gritty, where you need to get things right. And it&#8217;s usually getting things right not just to verify that you understood correctly, because that&#8217;s already a hard part, as my two-hour example earlier showed, to just get one line of requirements correct. 
But that back and forth with language and examples is of great help in getting to actual understanding. But the major part is that usually you need to work with a colleague who also needs to understand it. And with the current short-lived career jumps, if you will, you may find an expert, but he might be gone in four hours. Or, as is happening currently as well, a lot of the seniors who actually know the organization are getting into pension range and they&#8217;re just leaving the company. And what is left behind is usually short-lived career steps, short-lived managers that move on. There&#8217;s a lot of tempo, where the average job tenure is four to six years. So the knowledge actually evaporates while we&#8217;re looking at it, while we&#8217;re trying to document it. And that&#8217;s where the difference starts to arise between traditional data modeling, which creates diagrams and type-level schemas, and fact-oriented modeling, where you have real user stories that you can pull up with a click of a mouse, and you can read how it&#8217;s being used in language in the organization and how it&#8217;s being transferred to other departments. And I think that securing that knowledge, that semantically rich document, if you will, is something that I see getting lost with the more traditional, more technical data modeling. Yes, you have the traditional layers of conceptual, logical, physical. But still, there&#8217;s not really a story there. And the people that I talk with and explain this kind of stuff to all recognize what I&#8217;m describing, and a lot of the architects and a lot of the data modelers will reply with, yeah, that&#8217;s what I do in my head. And then my obvious question is, but does it leave your head? Does it get written down somewhere so that if you leave, your colleague can take over? And then the answer is usually no. 
There might be a document somewhere that describes the use case, but that very quickly evaporates as well, because everybody starts looking at the artifacts and the technical schemas. So by having an environment where you tie it all together, you cannot do one without the other. That was really the grounding for the discussions, the proof of the pudding, if you will. But it also gives you the anchors going forward to technical artifacts. So by capturing that from the domain experts, putting that in an information model, and being able to generate technical artifacts, even SQL to generate a database, it will still comment all that SQL with all the written language and examples. It will generate a database with test data that came from the interview in the first place. It will add database views representing the full user stories on top of the production data, representing the interview. So it really is: don&#8217;t throw it away, don&#8217;t throw it away. Keep it as long as possible so that everybody is able to understand and read what the actual data is, what it means, and how it&#8217;s communicated.</p><p><strong>Shane</strong>: That&#8217;s interesting because that is a form of context-first implementation. And what I mean by that is with our product, we define the context first and then the technical implementation is hydrated. So I&#8217;ll create the context of a business rule, a change rule, and then our system will hydrate that into physical tables and SQL transformations.</p><p>But that context that I create is the key thing we care about, right? That is our pet. The way we deploy it and run it is our cattle. And what we&#8217;re seeing in the new GenAI, LLM world is that context is actually far more important for the LLMs than the physical implementation of that context.</p><p>And then as I said, I&#8217;m working with Juha Korpela at the moment trying to write a book around how to concept model. 
And one of the things we came up with is that one of the steps is define your domain scope. So like you said, find a subject matter expert who understands it.</p><p>The next one is get data stories. And then after that, identify events, then concepts, and then connections or relationships. And the key thing, as you said, is when you talk to experts in this process, if they can articulate the steps they take, those steps are often common. They might use different terms, and they might do them in a slightly different order in slightly different ways, but we all do it the same.</p><p>If we think about it consciously, it&#8217;s where we bring in pattern templates, where we bring in artifacts that we use repeatably, that we get that repeatability, that ability for that context to be stored in a way that another person can use it.</p><p>So in my view, yes, it&#8217;s great if we have a system that does all that for us, but we don&#8217;t have to. If we just have templates that are reusable, that&#8217;s valuable on its own.</p><p>If we have a repeatable process, that&#8217;s not a methodology, right? It&#8217;s not fixed, but it&#8217;s just a way of working that is valuable because it&#8217;s bringing that knowledge back. So I just wanna take you back to something that you said. So if I take this idea of term and fact having massive value, and that we get that with a subject matter expert, and by documenting it in that format,</p><p>that context is able to be seen and understood by many people other than us. You then said that it&#8217;s also a way of getting alignment. So if I get that term and fact, if I get that fact-based model for inventory from three different domains,</p><p>how do you deal with the alignment problem?</p><p><strong>Marco</strong>: I can illustrate it with an example. And let&#8217;s stay close to the examples that we already mentioned, but it&#8217;s more powerful, more generic, and more diverse than that. 
But I think we would need a podcast of another two days to explain all of the ins and outs, so this is really just to show you how that would work. And for the listeners, since we&#8217;re doing this in an audio-only podcast, I&#8217;ll try to make it as visual as I can. So when I say Marco Wobben lives in Utrecht, which is the city where I live, that would be a fact that you and I can agree on, and me and my wife, we definitely agree upon that.</p><p>So let&#8217;s consider that a fact, and the statement is to hold some truth value there. But there&#8217;s something else going on, because Marco Wobben lives in Utrecht is a statement, my wife lives in Utrecht, my kids live in Utrecht, so there&#8217;s a bunch of statements.</p><p>They all express something that we can classify as the city of residence. So by stating multiple examples like that, we can type those kinds of statements as city of residence, as the fact type. Now within me saying Marco Wobben lives in Utrecht, I actually embed knowledge in there, even though you and I can agree upon the actual fact. I&#8217;m also saying, wait, Marco Wobben is the citizen, Utrecht is the city. But it goes a little deeper because my citizen is not really a citizen. It is just how I identify a citizen. And in that identification I can see, wait a minute, there&#8217;s a first name and there&#8217;s a surname. Similar with the city of Utrecht. It&#8217;s not really a city, we&#8217;re representing the city by storing the name of the city. So you can see, if you can visualize that, there&#8217;s knowledge, almost like a graph, in there, where I started with the city of residence. I gave it the semantics lives in, but it also has structure. It has the citizen, it has the city, it has the first name, the surname, the city name, and all of that ties together into this single fact statement. Now I can populate that with different examples. I can give my wife a position in there and my kids. 
And so it is populated with all kinds of example data, and the subject matter expert is then posed a series of questions. Could it happen that at the same time Marco lives in Utrecht, Marco lives in New York?</p><p>And then he would probably say, no, there&#8217;s something wrong with that. By going through these interactive sessions, which is almost like a gamification of the interview, if you will, you discover the business constraints, and then by discovering the business constraints, that will lead to a certain structure.</p><p>When it comes to data modeling, if I can live in only one city, then the city is probably an attribute in the citizen table. If I can have multiple cities where I can live, because that&#8217;s allowed in our register, then I would probably need a linking table in the end in the physical model. So these constraints steer how the data is structured. And this is the interesting byproduct, and I&#8217;m not sure if I&#8217;m still on track with your question: in the interviews with subject matter experts, we can find ourselves very easily in hours-long sessions about what a citizen is. But as soon as they cannot give me a proper example to illustrate what they&#8217;re talking about, I know I&#8217;m talking to people who are working outside of their scope of expertise. So that helps steer that. So there&#8217;s an organizational alignment in my efforts to find the data illustrating the information. Now, the alignment on the other hand is: now I start identifying Marco at some Gmail address lives in Utrecht. So now I&#8217;m identifying the same citizen, but I&#8217;m using an email address. Now, obviously, in official organizations and registers that would never do, but let&#8217;s suppose we have a small tennis club and it&#8217;s fine. So what I find now is that I still speak of city of residence. 
I still speak of citizen and city name, but I don&#8217;t have first name and surname.</p><p>So suddenly that citizen, where it used to be first name and surname, now is an email address, but it still identifies a citizen. So what happens in the information grammar, if you will, is that there is something introduced called a generalized object type. My citizen can be identified either by first name and surname or by email address. So in the modeling part, it&#8217;s very easy to find statements that generalize, which means allowing different ways of identifying it, or different ways of talking about it. Now I am presented with a data challenge, because which combination of first name and surname goes with which email address? So again, I need a different fact statement that now would introduce: Marco Wobben has an email address called Marco such and such at Gmail, which links the two keys. So nothing in my communication has to be altered to support alignment of different ways of identifying it; the data mapping is done just as part of the verbalization. I have a department here that only works with names. I have a department there that only works with Gmail addresses. I can make them talk to each other, &#8217;cause that needs to happen sooner or later. And the way that they talk to each other is to say, yeah, you&#8217;re right, this Marco Wobben corresponds with that email address because I have proof of that.</p><p>So then suddenly you have a mechanism introduced in the communication on how to identify certain entities in your data administration in a unified way. So these are different examples of how these alignments work on a semantic level, on a data level, on an identification level. In short.</p><p><strong>Shane</strong>: I was thinking about it slightly differently, but it&#8217;s exactly the same process, I think. So let me play it back to you where I think I got to. 
You are talking there about, okay, we have this term and the term has the same context definition, but the facts that identify that term are slightly different, right?</p><p>And we can then do some alignment where we see these different identifiers.</p><p><strong>Marco</strong>: Yep.</p><p><strong>Shane</strong>: The one that I was thinking about is where a domain or a business unit has a completely different definition for that term than another one does. And we identify that. So let me give you an example that&#8217;s real for me right at the moment, around citizenship or residency.</p><p><strong>Marco</strong>: Oh.</p><p><strong>Shane</strong>: So I think I know what made me a resident in the UK. It&#8217;s when I got a certain kind of visa and I entered the country. And as soon as I did that, that&#8217;s the rule that triggered the tick in the box that I am a resident of the UK. But when I deal with tax residency, it&#8217;s a different set of rules.</p><p>And so they&#8217;re slightly different. I have to live here, but then also I have to make sure that there are certain things I don&#8217;t do anymore back in my original country of tax residency. And so they&#8217;re both residencies, but they&#8217;re slightly different. And then how would I articulate that using term and fact? How would I model it with fact-based modeling? And the thing that you talked about is this idea of business constraints. 
If I can describe the business constraint of what a residency is versus the business constraint of a tax residency, again, now I&#8217;ve got two patterns, and I can look at those two patterns, those two constraints, and say they&#8217;re the same or similar, or they&#8217;re very different.</p><p>And in this case I&#8217;d say they&#8217;re different. One is around how my financial transactions work and one is where my ass sits when I have breakfast for the majority of the year, &#8217;cause I can spend a certain amount of time outside the country and I am still a resident of this country.</p><p>And then I think the other thing that this pattern gives us, which we know happens a lot because we see humans do it, is as soon as I give somebody those terms and facts and a set of business constraints, a set of descriptions of how they behave, humans love to point out exceptions, especially subject matter experts.</p><p>It&#8217;s, yeah, you could be a tax resident in the UK if you do that, but actually if you do this one other thing that nobody ever tells you about, actually you are not. That&#8217;s the exception. Oh, and by the way, System A knows about that, but System B doesn&#8217;t. And humans are really good at pulling out that.</p><p>I dunno, what would you call it? The knowledge that only they have. And like you said, when we used to work for organizations for 25 years, that knowledge stayed in the organization. Now you&#8217;re lucky if it&#8217;s five. So that context, that knowledge of those exceptions, disappears. I can see how, with this idea of defining or articulating business constraints, identifying exceptions, and binding them to that term-and-fact pattern in an artifact that&#8217;s repeatable,</p><p>we can now shortcut, like you said, the need for a data expert and a subject matter expert to spend a week in a room going through this magical process that only two people can do. 
And make it a little bit more, not democratized, but a little bit more accessible and repeatable.</p><p><strong>Marco</strong>: It&#8217;s an interesting example, but again, it touches the exceptions. Your example in itself is an exception. And there is too much to say about exceptions. But even talking about it as you just did, from a tax context, Shane the citizen, dot dot dot. So even in our semantics and in our language use, we already start distinguishing that. And then it comes back to: do we identify the citizen in a tax context the same as the citizen in a legal context? And to tie it back to my previous example, are those two the same citizen? And all three of them are different domains. There is the tax domain, there&#8217;s the legal domain, and there is the domain where we might want to match if this is the same person, for whatever anti-terrorism rule, organization, or whatever. So there are different domains and every domain has a different rule set. So we blankly call them all citizen because mentally we can say, yeah, it&#8217;s the same person, it&#8217;s the same citizen. But data administration wise, the administration of identifiers for a citizen in one context is not the same as the administration of identifiers for a citizen in a different context. And I think, again, that&#8217;s the big separator, and it causes a lot of confusion in interviews and workshops: we humans unify that because we see that in reality it is about one person. But what distinguishes that is that we need to talk about how we talk about the data. And that is not the same thing as the reality. And by separating those out, in a way it makes it easier to say, okay, but are we talking about the same data administration?</p><p>No? Then we&#8217;re separating those out, and if we are talking about three different domains that need to come together in one data administration, then we need more semantics. 
Distinguishing the three.</p><p><strong>Shane</strong>: A couple of things I want to loop around there. So again, if we go back to the term, actually there is a citizenship, a physical residency, and a tax residency, because I&#8217;m actually a citizen of New Zealand still, but I&#8217;m resident physically in the UK and my tax residency is coming with me. But again, it&#8217;s not until we start using terms and facts together that I can articulate that those are three different things, and it&#8217;s the business constraints and the exceptions that tell me they are different things. One of the things that I find interesting is this idea of focusing on the complexity and modeling that, because that is the area that&#8217;s gonna cause you a problem.</p><p>So if you ever do Data Vault training with Hans and Remco Broekmans, they&#8217;ll, and I dunno if they do it anymore, but they used to talk about Peter the fly. And so the example would be: if I am sitting in this ice cream store and I see Bob come in and order that ice cream, and there&#8217;s an order ID. So I can say, when I put term and fact, there&#8217;s an ID in there of 1234, which is the order number. And whether they actually got the ice cream is a separate part of that process. As Peter the fly, I can see the ice cream being handed to Bob. So I know it happens if I&#8217;m in the room, but if I&#8217;m looking purely at the data, at the facts, I see nothing, because there is no handover,</p><p>there&#8217;s no delivery ID showing that they got the ice cream. So I&#8217;m gonna have to infer that if there was an order and no refund turned up, then the handover happened, but it&#8217;s not a fact. 
And so I think, again, those terms and facts tell our stories, and then as data modelers we can look at the exceptions that we know we need to worry about.</p><p>But I wanna go back to this idea of domain, because it&#8217;s one that I struggled with a lot and I still do. A domain boundary can be anything. It can be a business unit, it can be a series of core business events. It can be a process, it can be a use case, it can be a team topology. A domain is just a boundary where you say it&#8217;s in that boundary or it&#8217;s not, right?</p><p>It&#8217;s in this domain. It&#8217;s in that domain. And I hadn&#8217;t thought about using term, fact, and business constraints as a way of defining boundaries for domains. So if I see a term and facts that are all around my citizenship, and I see another term and series of facts and business constraints around my residency, and I see another one around tax residency, and they are different, then I can use those as domain boundaries.</p><p>They might be too granular for the intent that I want to use it for. But what we&#8217;re saying is they are different. When you write them down as words and numbers, you can tell they&#8217;re different. So that gives us a form of boundary. And so that&#8217;s really interesting because it gives us a pattern,</p><p>it gives us a formula for how we can say that this sits in boundary A, domain A, and this sits in domain B, because of these rules. And I find that really intriguing and valuable &#8217;cause it&#8217;s solving a problem that I&#8217;ve been trying to solve for many years, &#8217;cause it&#8217;s annoying me.</p><p><strong>Marco</strong>: Yeah, it is. You&#8217;re right, there are two big areas where you can see there&#8217;s a separation of domain. And so rules is one of them. 
In hospital systems, for example, you might need somebody&#8217;s birthdate to be able to put it into the computer.</p><p>As we had this kind of operation, these treatments, it needs to go to his insurance company and all of that. But if you&#8217;re brought into the emergency room unconscious, with no ID on you, you&#8217;re still being registered as a patient. So the rule is entirely different, and yet we have a patient number somehow. And the definition of the patient is clear across the whole hospital. So rules determine some sort of context. Or, if you come in with an appointment, they probably know your birthdate. If you wanna send it to the insurance company, they have to have your birthdate. The emergency room doesn&#8217;t really care. So how do you align all of that? So then you see again, we&#8217;re unified in the language, we&#8217;re differentiating on the rules, and somehow we have to integrate systems. The most important part is that independent of which system is being built, we need to figure out how we communicate first.</p><p>So that language always goes number one. And that&#8217;s basically what fact-based modeling does. It allows people to talk about it from their context, and then given that specific context, you add specific rules for that context. Take the example where, can I live in multiple cities? Municipality wise, no, you cannot. You have to be registered in one municipality. Now, in a more generic way, you can have people registered to different places. Sure. I have an office here, I live there, I have a vacation home there. And it&#8217;s, yeah, different places. But on a municipality level, where it becomes more context specific, there is a rule that says you can only register in one municipality. So there&#8217;s that too. There is a way to generalize in a more abstract way. 
IT has been doing that for years, where they introduced the party model: is it a person, is it an organization? We don&#8217;t know, so let&#8217;s call it party. We&#8217;ll give it an artificial key and we&#8217;re good. That&#8217;s just the way to keep the IT system running. It has nothing to do with semantics or integration or alignment whatsoever. It&#8217;s just a technical solution because we didn&#8217;t get the business meaning in the first place, or we are not specific enough to a specific context. So these problems will always occur. And I think that separates data modeling from information modeling: data modeling is, do we find alignment in how we need to build the system? Whereas information modeling is, do we find alignment on how we communicate about the data?</p><p><strong>Shane</strong>: So again, you are making a context differentiation</p><p>between the audiences, the consumers of what we are creating. And you are saying if it&#8217;s stakeholders who aren&#8217;t data experts or IT experts, then we&#8217;re information modeling. If it&#8217;s data people or IT people, then we are data modeling.</p><p>And the ability to have both languages, but then a mapping and sharing across those languages, is where the real value is. And so like you, I&#8217;m not a fan of thing as a thing. I don&#8217;t agree we should ever use that term in our information models. We should never use party, entity, thing as a thing,</p><p>a thing is associated with a thing, as a generic way of describing context to a stakeholder. I also don&#8217;t think we should put that in our technical systems. Years ago, when we had mainframes and we had no memory and we had no disk and the infrastructure was expensive, yes, it had value to us. 
Right now, the most expensive part of our systems is the humans, and understanding that context.</p><p>And as soon as you design a system with thing as a thing, and that context lives nowhere else, I now have to spend a massive amount of expensive time trying to understand how many things you have, how many things they are related to, and how many things those relationships are.</p><p>And that is an expensive piece of work. And I just can&#8217;t justify doing that anymore. But as you can tell, I&#8217;m slightly opinionated on that one. And also, if I come back to the way we do that information modeling, the grain of it, the detail we go to, is interesting,</p><p>because if I do it just based on terms, customer orders product, it&#8217;s very different from doing it based on terms and facts, customer Bob orders ice cream.</p><p><strong>Marco</strong>: Yes.</p><p><strong>Shane</strong>: And so what you are saying is actually you are bringing more detail, a higher level of grain, into that information modeling process earlier because it has value. So you&#8217;re doing more work upfront because you&#8217;ve found ways of taking that work that&#8217;s done early and automating some of the downstream work in terms of the technical implementations.</p><p>And so, from an agile point of view, you are doing work upfront, work in advance, which could be waste, but you&#8217;ve found a way of taking that work and reducing the waste further down the value stream.</p><p><strong>Marco</strong>: Yeah. Correct. It&#8217;s really that, and for those who are interested in this, a lot of the information can be found on the website casetalk.com. What it shows, and this was developed, like I said, from the seventies all the way up to 2000, is it&#8217;s not just, oh, how can we capture the language?</p><p>But it was really, how do we talk to the domain expert to come up with the appropriate IT system? 
And I think what happened really is that it is such a rich and detailed environment, where people were traditionally modeling with the business knowledge in their heads, doing the IT themselves, which was very closely bound to the organization. When the first computers came in, it was usually the domain expert that got IT training. They knew the context. But nowadays it has scaled up so much that you are an IT professional. You have no idea about any business. Your business is IT. So getting back to that and trying to find that alignment with the business is of increasing interest, but it&#8217;s not the majority.</p><p>A lot of new young professionals base their efforts on the tools and the abilities of the tools, and not so much on whether we are doing the right thing for the business, because they don&#8217;t really care. They were not educated as such. There&#8217;s a massive gap there, where the business looks at IT like, you are the expert, so you tell me, and then there&#8217;s this rift. And you mentioned Data Vault a couple of times, where traditionally the hub is the natural business key, right? It&#8217;s not a technical key in the source system. So what is the business key? Then you have to talk to the business. And what rich information modeling does with all the fact base is that it has already mentioned Marco Wobben and Utrecht.</p><p>So I already have the natural business keys right there. And with a push of a button, it can then be generated into a citizen table and a city table and even have artificial keys. With minor annotations, like on city of residence, we might want to keep a log in time, we might wanna have history in there, but also when he is planning to move to the next city, so it becomes bitemporal. And these are very simple flags in the information model, which can be generated to a data model: did you want a Data Vault model or a normalized model, or did you want a JSON schema? So the physical parts become automatable. 
The data model becomes automatable, and having all the semantics and verbiage and examples makes it verifiable and readable by the business. And so that is really what it ties into, and some experiments show that precisely that combination is the real power to keep LLMs grounded as well.</p><p><strong>Shane</strong>: I definitely agree on the LLM front. We know that if we pass the terms, the definitions of the terms, the facts, so data examples around those terms, and their relationships and natural language business constraints into those LLMs, we get a much better response when we want to do a task.</p><p><strong>Marco</strong>: Yep.</p><p><strong>Shane</strong>: So that context is really important.</p><p>And that&#8217;s why it&#8217;s interesting watching all the BI semantic layer vendors try to say they&#8217;re context engines, and you&#8217;re sitting there going, but you&#8217;re just at the end of the chain. You&#8217;ve just got metrics and maybe some definitions, but the richness is all sitting on the left of our value stream.</p><p>It&#8217;s all around problem ideation, discovery, design. It&#8217;s not the physical implementation of our consume layer for your cube. Yes, it&#8217;s got some value. And so I&#8217;m really interested to watch the reinvention of the market yet again, &#8217;cause as you said, technology is often what people get taught.</p><p>A number of times I&#8217;ll talk to a data science student who&#8217;s been taught Python,</p><p>and you go back and you go, but how do you understand the business problem? How do you understand the value you&#8217;re gonna deliver if you build that ML model? And as you said, we&#8217;ve seen over time that data teams that don&#8217;t add value don&#8217;t survive.</p><p><strong>Marco</strong>: True. And I remember an interview where, and this is not to talk them down, but it really points at the lack of education as well, or the business debt and the technical debt. 
It&#8217;s just that people are being trained in tools and in technical approaches, where some of the analysts, asked what is the highlight of your day, really said</p><p>it is when they discover insight. It&#8217;s like, insight? Should it not have started with the insight as documentation, or what we are really doing, instead of let&#8217;s discover it? That is, like you said, the end of the chain, while everybody shouts shift left, shift left. It&#8217;s, yeah, why?</p><p>Because that&#8217;s where it started and that&#8217;s what we lost. From my perspective, it just starts with: can we talk to each other, and are we writing down how we do that? Because that&#8217;s what we need to do in the end. And whatever IT system we build, whatever dashboard is being built, it actually serves to better communicate about what we are really doing. So it all ties back to that. And yeah, it&#8217;s a never-ending story, because the LLM is the next silver bullet. It&#8217;s gonna solve all our problems. And it&#8217;s not. It&#8217;s helpful, but it&#8217;s not the silver bullet, &#8217;cause we still need to get the authentic story, not the fabricated story.</p><p><strong>Shane</strong>: Yeah, it&#8217;ll help us understand where the context differs, but it won&#8217;t help us actually define what the context is.</p><p><strong>Marco</strong>: Yep.</p><p><strong>Shane</strong>: At the moment, from the stuff I&#8217;ve done with it. Oh, maybe the next generation AGI version will, or maybe we all end up with standardized context that every business follows.</p><p>But we know that&#8217;s not true either, right? We saw that with tools like SAP, where it&#8217;s implemented vanilla and then $50 million and five years later there&#8217;s a bastardized, customized version because that organization is different, with air quotes.</p><p><strong>Marco</strong>: And I can illustrate that with fairly simple examples. I mentioned that somewhere else. 
It&#8217;s like, there&#8217;s not a bank in the world that didn&#8217;t purchase the IBM banking model; at the same time, I dare any bank in the world to have actually implemented it. And that points to a massive gap: to be able to implement it, you first have to know what you already do. Which is a massive dilemma, &#8217;cause nobody has the information models, and then you are presented with a technical model, like the banking model, that says everything will fit in here.</p><p>It&#8217;s, okay, but what do we have that actually fits? So there&#8217;s a massive dilemma there. And then in the end every bank will try to do it a little bit differently, so they have an advantage over the competition. So they don&#8217;t really want to conform to one standard, which points back at: there is no real universal pattern, because everybody tries to give it their own little edge, which means you have to capture that edge, not just build something that you think will fit.</p><p>You have to be very specific. And in that, I&#8217;ve seen systems where the database was designed with rigor and the software was developed three times over, but the database didn&#8217;t change. What happened is a new wave of tech came. It used to be mainframe, then it became Windows, then it became internet, then it became mobile. The data did not change. The way the technology works changed, the organization changed, the business processes changed, because we&#8217;re not having a physical storefront for the bank, but now we have a mobile app. So the process of working with the data changed, but the data itself didn&#8217;t change. So there&#8217;s a lot of dynamics going on, and it really shows the importance of getting the data right and not throwing away the story that came with it, so that you can actually reinvent all the processes and all the software on top of it without losing meaning. But it also points at the fast-changing world where we perceive everything changes all the time. 
So we cannot sit down to do a proper information model and have a data model set up, et cetera, et cetera, because we&#8217;re in a hurry. We need to deliver. What nobody realizes is that if you sit down long enough to have that information correct, you have the data model correct.</p><p>And if you don&#8217;t throw away the story, it will definitely give you a return on investment.</p><p><strong>Shane</strong>: I&#8217;ve gotta say that the IBM data model was a masterclass in sales. The fact that you could sell a diagram, a picture, for millions of dollars that nobody ever used, apart from putting it on their wall: that is a masterclass. But just to close it out, I go back to that idea of terms and facts.</p><p>So if I have a retail bank that had a store, the terms and facts might have been: customer Shane has an account 1, 2, 3.</p><p>If I became a mobile-only customer when we changed technology, I might see the term and fact change slightly, where it&#8217;s: customer oh seven five, yada yada, has an account 1, 2, 3.</p><p>Now I can look at those two lines and I can see something&#8217;s changed.</p><p>The fact that relates to that term has changed, and now I can have a conversation about: is that just what you are showing me, and in the background it still says Shane? Or did it never say Shane in the real data, and it was always customer ID 1, 2, 3, 9, 2, 4? There&#8217;s a whole lot of conversations I can have, but I know that something&#8217;s different, and now I know what to have a conversation about with that subject matter expert.</p><p>As you said, in the past the subject matter expert, the data person and the IT person were the same person.</p><p>Then it became the same team. Then we became a business and IT team, and now we are a business subject matter expert team, a data team and an IT team. 
Our team topologies have got more matrix-y, bigger, and that separation causes some problems.</p><p>So by using patterns and pattern templates and artifacts and shared language, we can close that gap again. And for me, this idea of fact-based modeling, this idea of binding terms and facts with a set of rules, with a set of exceptions as context, is really valuable. We can use it in so many ways.</p><p>So just to close it out, if people wanted to hear more about this, read more about this, find out more about fact-based modeling, where do they go?</p><p><strong>Marco</strong>: I think the quickest way into it is the little book I personally wrote, published by Technics Publications, called Just the Facts. It tells the story and names a few examples that I mentioned in this podcast too. It gives you an overview from management to architecture, modelers and developers, and how communication and facts weave through all the disciplines. Obviously CaseTalk is the software tool to support all of that. But you will find links to actual books. If you happen to have a copy of the DMBOK by DAMA, the older edition has a crippled article about it; the latest edition, version two, has a slightly improved article about it. It&#8217;s on Wikipedia, there are sources enough, but the good starting point would be casetalk.com.</p><p><strong>Shane</strong>: Excellent. Alright, thank you for that. I&#8217;ve got a new set of patterns and templates I need to go and read a lot more about. It&#8217;s gonna be me over the next few months again. But thank you for that.</p><p><strong>Marco</strong>: Good. Thanks for having me, and talk to you soon.</p><p><strong>Shane</strong>: I hope everybody has a simply magical day. 
</p><h2>&#171;oo&#187;</h2><div class="pullquote"><p><em>Stakeholder - &#8220;Thats not what I wanted!&#8221; <br>Data Team - &#8220;But thats what you asked for!&#8221;</em></p></div><p>Struggling to gather data requirements and constantly hearing the conversation above?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Bu2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" width="387" height="342" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:342,&quot;width&quot;:387,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19725,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/160520537?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Want to learn how to capture data and information requirements in a repeatable way so stakeholders love them and data teams can build from them, by using the Information Product Canvas.</p><p>Have I got the book for you!</p><p>Start your journey to a new Agile Data Way of Working.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://adiwow.com/168&quot;,&quot;text&quot;:&quot;Buy the Agile Data Guide now!&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://adiwow.com/168"><span>Buy the Agile Data Guide now!</span></a></p><h2>&#171;oo&#187;</h2>]]></content:encoded></item><item><title><![CDATA[Google Cloud just added Shopify and Mailchimp to their list of data collection 
services]]></title><description><![CDATA[deffo a case of slowly slowly catchy monkey]]></description><link>https://agiledata.info/p/google-cloud-just-added-shopify-and</link><guid isPermaLink="false">https://agiledata.info/p/google-cloud-just-added-shopify-and</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Wed, 04 Feb 2026 08:27:24 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/e3510903-552f-4f7f-a15b-405d9f1e999b_1024x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Google Cloud just added Shopify and Mailchimp to their list of data collection services, deffo a case of slowly slowly catchy monkey.</p><p>Just under seven years ago Nigel Vining and I onboarded our first customer, The Good Registry, to give our MVP AgileData.Cloud platform a run for its money, to test what survived and what didn&#8217;t.</p><p>One of the first requirements was to collect data from Shopify.</p><p>We had a principle back then (a principle we still have today) to use the Google Cloud Data Transfer Services as our first cab off the rank for data collection.</p><p>Alas, at that time there was no Google Cloud data collection service for Shopify, so Nigel crafted a set of patterns using the Meltano open source framework to automate the collection of this data.</p><p>Part of this work was to refactor Meltano so we could operate it as a &#8220;serverless&#8221; pattern on Google Cloud, rather than under a 24/7 always-on container pattern that would cost more money.</p><p>We have reused this data collection pattern many, many times over the last 7 years, another great example of our DORO (Define Once, Reuse Often) principle.</p><p>I saw an announcement last month that Google Cloud have added Shopify and Mailchimp to their ever-expanding list of data collection services, or what they call &#8220;Data Transfer Services&#8221;.</p><p>If you haven&#8217;t noticed, Google have been quietly expanding the list of 
systems of capture that this service can collect data from.</p><p>To me it looks like they are starting to ramp up in this space and will become a serious competitor to tools like Fivetran.</p><p>Time will tell.</p><p>One of the many things I have learnt about Google Cloud over the last 7 years is that they may not be first to market, but when they go after a market, they have the firepower to do it at scale.</p><p>Google Gemini showed us that yet again.</p><p>You can check out the doco for the new Data Transfer Services here:</p><p><a href="https://docs.cloud.google.com/bigquery/docs/shopify-transfer">https://docs.cloud.google.com/bigquery/docs/shopify-transfer</a></p><p><a href="https://docs.cloud.google.com/bigquery/docs/mailchimp-transfer">https://docs.cloud.google.com/bigquery/docs/mailchimp-transfer</a></p>]]></content:encoded></item><item><title><![CDATA[Redesigning traditional data systems to support Large Language Models and AI agents with Mayowa Oludoyi]]></title><description><![CDATA[AgileData Podcast #80

Join Shane Gibson as he chats with Mayowa Oludoyi about redesigning traditional data systems to better support LLM and AI Agents]]></description><link>https://agiledata.info/p/redesigning-traditional-data-systems</link><guid isPermaLink="false">https://agiledata.info/p/redesigning-traditional-data-systems</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Sun, 18 Jan 2026 10:46:30 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/d2fa4972-e9fb-4616-915e-b9b7aa2cf9c0_800x800.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Join Shane Gibson as he chats with Mayowa Oludoyi about redesigning traditional data systems to better support LLM and AI Agents</p><blockquote><p><strong><a href="https://agiledata.substack.com/i/184771911/listen">Listen</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/184771911/google-notebooklm-mindmap">View MindMap</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/184771911/google-notebooklm-briefing">Read AI Summary</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/184771911/transcript">Read Transcript</a></strong></p></blockquote><p></p><h2>Listen</h2><p>Listen on all good podcast hosts or over at:</p><p><a href="https://podcast.agiledata.io/e/redesigning-traditional-data-systems-to-support-large-language-models-and-ai-agents-with-mayowa-oludoyi-episode-80/">https://podcast.agiledata.io/e/redesigning-traditional-data-systems-to-support-large-language-models-and-ai-agents-with-mayowa-oludoyi-episode-80/</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://podcast.agiledata.io/e/redesigning-traditional-data-systems-to-support-large-language-models-and-ai-agents-with-mayowa-oludoyi-episode-80/&quot;,&quot;text&quot;:&quot;Listen to the Podcast Episode on Podbean&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" 
href="https://podcast.agiledata.io/e/redesigning-traditional-data-systems-to-support-large-language-models-and-ai-agents-with-mayowa-oludoyi-episode-80/"><span>Listen to the Podcast Episode on Podbean</span></a></p><p></p><blockquote><p><strong>Subscribe:</strong> <a href="https://podcasts.apple.com/nz/podcast/agiledata/id1456820781">Apple Podcast</a> | <a href="https://open.spotify.com/show/4wiQWj055HchKMxmYSKRIj">Spotify</a> | <a href="https://www.google.com/podcasts?feed=aHR0cHM6Ly9wb2RjYXN0LmFnaWxlZGF0YS5pby9mZWVkLnhtbA%3D%3D">Google Podcast </a>| <a href="https://music.amazon.com/podcasts/add0fc3f-ee5c-4227-bd28-35144d1bd9a6">Amazon Audible</a> | <a href="https://tunein.com/podcasts/Technology-Podcasts/AgileBI-p1214546/">TuneIn</a> | <a href="https://iheart.com/podcast/96630976">iHeartRadio</a> | <a href="https://player.fm/series/3347067">PlayerFM</a> | <a href="https://www.listennotes.com/podcasts/agiledata-agiledata-8ADKjli_fGx/">Listen Notes</a> | <a href="https://www.podchaser.com/podcasts/agiledata-822089">Podchaser</a> | <a href="https://www.deezer.com/en/show/5294327">Deezer</a> | <a href="https://podcastaddict.com/podcast/agiledata/4554760">Podcast Addict</a> |</p></blockquote><div id="youtube2-R-Zc3D5GWvk" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;R-Zc3D5GWvk&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/R-Zc3D5GWvk?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>You can get in touch with Mayowa via <a href="https://www.linkedin.com/in/oludoyi-mayowa/">LinkedIn</a></p><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? 
The Information Product Canvas helps you get clarity in 30 minutes or less?</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><h2>Google NotebookLM Mindmap </h2><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7AEc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7AEc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png 424w, https://substackcdn.com/image/fetch/$s_!7AEc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png 848w, https://substackcdn.com/image/fetch/$s_!7AEc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png 1272w, https://substackcdn.com/image/fetch/$s_!7AEc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!7AEc!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png" width="1200" height="1950.8241758241759" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:2367,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:1513523,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/184771911?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7AEc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png 424w, https://substackcdn.com/image/fetch/$s_!7AEc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png 848w, https://substackcdn.com/image/fetch/$s_!7AEc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png 1272w, 
https://substackcdn.com/image/fetch/$s_!7AEc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa08baf5e-9d9e-49e6-91aa-2315c0e1c868_4391x7139.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Google NoteBookLM Briefing</h2><h2><strong>Executive Summary</strong></h2><p>To effectively harness the business value of Large Language Models (LLMs) and AI agents, enterprises must fundamentally redesign their data systems. 
The current paradigm, built for direct human interaction via structured queries, is ill-suited for the speculative and exploratory nature of AI agents, leading to significant inefficiency, waste, and prohibitive costs. Agents do not simply query data; they &#8220;probe&#8221; and investigate, a process that generates redundant queries and consumes vast resources if left unmanaged.</p><p>A paradigm shift is required, moving toward an architecture that is inherently AI-friendly. This involves three core transformations:</p><ol><li><p><strong>Developing Multimodal Query Interfaces:</strong> Systems must evolve beyond SQL-only interaction to support both structured queries and natural language, providing the rich, multimodal communication channel that agents require.</p></li><li><p><strong>Integrating Context as a First-Class Citizen:</strong> The historical separation of data and its descriptive context must end. Future systems need to store and surface rich metadata, business definitions, and operational knowledge directly alongside the data, providing the essential &#8220;grounding&#8221; that agents need to perform accurately and efficiently.</p></li><li><p><strong>Implementing Intelligent Query Management:</strong> Data platforms must become active participants in the query process, capable of determining which agent-generated probes are necessary and which are wasteful, thereby preventing redundant execution and controlling costs.</p></li></ol><p>This briefing document synthesizes these critical insights, outlining the limitations of traditional systems and presenting a blueprint for the next generation of data architecture designed to convert AI&#8217;s speculative power into tangible business speed and value.</p><p><strong>1. The Imperative for Change: From BI to Foundational Data Engineering</strong></p><p>The evolution of data roles provides a crucial lens for understanding the need for architectural change. 
The journey of Mayowa Oludoyi from a Business Intelligence (BI) Analyst to a Data Engineer highlights a critical realization: the ultimate value of any data product, be it a dashboard or a machine learning model, is contingent upon the quality and structure of the underlying data.</p><ul><li><p><strong>Motivation for the Shift:</strong> The transition was prompted by the understanding that &#8220;everything still come back to the data.&#8221; A significant portion of time in analytics and machine learning is spent on transforming and cleaning data, suggesting that more focus should be placed on the foundational data processes rather than solely on the end product.</p></li><li><p><strong>Challenges in Skill Acquisition:</strong> This career transition was not a straightforward path. The primary difficulties encountered were:</p><ul><li><p><strong>Lack of Mentorship:</strong> Finding a mentor to provide structured guidance, explaining the &#8220;why&#8221; behind learning paths and the nature of real-world problems, proved difficult. Most learning had to be self-directed through courses and books.</p></li><li><p><strong>Absence of a Structured Curriculum:</strong> Initial learning was described as &#8220;a little bit scattered&#8221; due to the lack of clear, process-oriented roadmaps. While tool-focused roadmaps (Spark, dbt, etc.) exist, the more valuable curricula focus on fundamental processes and principles, as &#8220;tools we always change... it is a means to an end.&#8221;</p></li></ul></li><li><p><strong>The Value of Mentorship:</strong> A valuable mentor is not one who teaches coding, but one who provides structure and context. This includes explaining the types of problems data engineering solves, the reasons for solving them, and the logical sequence of learning (&#8220;you need to learn A, before you learn B&#8221;). 
This contextual understanding is often missing from online courses and is essential for career progression.</p></li></ul><p><strong>2. The Core Problem: Why Traditional Data Systems Fail AI Agents</strong></p><p>The central argument for redesigning data systems is that they were built for a different user: a human executing precise, structured commands. AI agents interact with data in a fundamentally different manner, and this mismatch creates significant friction and waste.</p><p><strong>Defining the &#8220;Traditional Data System&#8221;</strong></p><p>For this analysis, a traditional data system is one where the primary interaction model involves a user leveraging a structured language (e.g., SQL) to execute a query against a database (e.g., Postgres) and receive a direct result.</p><p><strong>The Nature of AI Agents vs. Humans</strong></p><p>The key distinction lies in the method of inquiry. Humans query, but agents investigate.</p><ul><li><p><strong>Speculation and Exploration:</strong> Unlike a human who formulates a specific SQL statement, an agent must often &#8220;speculate&#8221; to find an answer. It performs exploratory work, probing the data system with a series of queries to build understanding. This process is inherently iterative and less direct.</p></li><li><p><strong>Probing vs. Querying:</strong> The interaction is better described as &#8220;probing&#8221; rather than querying. The agent is investigating the data landscape, which is fundamentally different from a human retrieving a known piece of information. 
As stated in the discussion, &#8220;agents don&#8217;t just query, they probe.&#8221;</p></li></ul><p><strong>The Consequence of Misalignment: Waste and Inefficiency</strong></p><p>Simply layering an AI agent on top of a traditional data system introduces massive inefficiencies that businesses cannot afford.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0LhY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0LhY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png 424w, https://substackcdn.com/image/fetch/$s_!0LhY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png 848w, https://substackcdn.com/image/fetch/$s_!0LhY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png 1272w, https://substackcdn.com/image/fetch/$s_!0LhY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0LhY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png" width="728" height="199.74007220216606" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:228,&quot;width&quot;:831,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:45681,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/184771911?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0LhY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png 424w, https://substackcdn.com/image/fetch/$s_!0LhY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png 848w, https://substackcdn.com/image/fetch/$s_!0LhY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png 1272w, https://substackcdn.com/image/fetch/$s_!0LhY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b020eae-a97b-4b20-b3bb-7a2d5bed8b25_831x228.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>The Context Gap</strong></p><p>Traditional systems are designed to store data, not the rich context surrounding it. 
Agents, however, are critically dependent on this context for &#8220;grounding&#8221;&#8212;the ability to understand the data&#8217;s meaning, relevance, and structure. While data catalogs have historically attempted to solve this, they often failed due to the high manual effort required to populate them. With AI, this descriptive, natural language context is no longer a &#8220;nice-to-have&#8221; but a core requirement for system performance and accuracy.</p><p><strong>3. Architecting the Future: A Blueprint for AI-Ready Data Systems</strong></p><p>To address these shortcomings, a new architectural approach is necessary. This approach redefines the interface, integrates context as a core component, and introduces intelligent oversight of the query process.</p><p><strong>3.1. A New Multimodal Query Interface</strong></p><p>The first step is to create a different query interface that serves both humans and machines effectively. This interface must be multimodal, accommodating both:</p><ul><li><p><strong>Structured Query Language (e.g., SQL):</strong> For precise, efficient data retrieval.</p></li><li><p><strong>Natural Language:</strong> To allow agents to process and leverage the rich descriptive context needed for grounding and reasoning.</p></li></ul><p>This dual capability allows the system to support traditional analytics while simultaneously providing the necessary foundation for advanced agent-based interactions.</p><p><strong>3.2. Integrated Context as a First-Class Citizen</strong></p><p>Context must be elevated from an afterthought in a separate catalog to an integrated component of the data platform. This can be achieved by providing agents with access to various forms of grounding material:</p><ul><li><p><strong>Past Queries and Code:</strong> Providing an LLM with a repository of previously written, successful SQL queries serves as powerful, practical context. 
Mayowa noted this technique works &#8220;perfectly&#8221; for his personal use, allowing the LLM to generate new queries based on established patterns.</p></li><li><p><strong>Code with Embedded Explanations:</strong> The most effective context combines structured code with natural language explanations. In one example, an agent&#8217;s performance improved dramatically when it was given access not only to a repository of transformation rules (code) but also to comments within that code explaining <em>why</em> each rule was created and what business purpose it served. This provides both the &#8220;how&#8221; (the code) and the &#8220;why&#8221; (the context).</p></li><li><p><strong>Centralized Context Stores:</strong> The system architecture must include a mechanism, potentially a &#8220;meta store&#8221; or a new function within the database itself, to store and serve this context universally. This ensures that any agent interacting with the system has access to the same grounding information, promoting consistency and accuracy.</p></li></ul><p><strong>3.3. Intelligent Query Management</strong></p><p>To combat waste, the data system must evolve from a passive recipient of queries to an active manager of them. The system itself should have the intelligence to &#8220;determine what query or probe needs to be executed.&#8221; By leveraging its own metadata, the system can identify and prevent the execution of irrelevant or redundant queries generated by an agent&#8217;s speculative process, thereby preserving resources and controlling costs.</p><p><strong>3.4. The Rise of Specialised Agents</strong></p><p>An effective strategy for managing complexity and improving results is to move away from a single, generalist agent. Instead, a team of specialized agents, each with a &#8220;bounded context,&#8221; can be deployed. 
For instance:</p><ul><li><p>An <strong>&#8220;ADI the data modeler&#8221;</strong> agent focuses solely on data modeling tasks and is given context specific to that domain.</p></li><li><p>An <strong>&#8220;ADI the engineer&#8221;</strong> agent handles data transformation rules.</p></li><li><p>An <strong>&#8220;ADI de boss&#8221;</strong> agent orchestrates the workflow, passing tasks to the appropriate specialist agent.</p></li></ul><p>This approach mirrors how human expert teams function and has been shown to produce significantly better and more reliable results by ensuring each agent operates within a well-defined and deeply contextualized skill set.</p><p></p><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? The Information Product Canvas helps you get clarity in 30 minutes or less.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><p></p><h2>Transcript</h2><p><strong>Shane</strong>: Welcome to the Agile Data Podcast. I&#8217;m Shane Gibson.</p><p><strong>Mayowa</strong>: Hi, this is Mayowa joining from Nigeria.</p><p><strong>Shane</strong>: Thanks for coming on the show. Today we&#8217;re gonna talk about redesigning current data systems to power LLMs and agents. But before we do that, why don&#8217;t you give the audience a bit of background about yourself?</p><p><strong>Mayowa</strong>: Thank you once again. Let me just say thank you for bringing me to the podcast and, like I mentioned earlier, my name is Mayowa. I started off my career working as a business intelligence analyst. 
And honestly, I still feel it&#8217;s one of the most interesting roles anybody can get involved in. It&#8217;s very interesting. And working as a business intelligence analyst gave me the opportunity to work on several projects. I started off working in consulting, and so that gave me the opportunity to work on several projects.</p><p>We had clients working in telecommunications, in banking, in health tech. And so starting off that way gave me enough opportunity to experience different projects. And while working, I had the opportunity to work on several projects, like data warehousing, building data reporting, Power BI, Tableau and all of those interesting data projects. But then as time went on, I moved on to a new company after two years. And this new company was basically a payment company in FinTech. And then, while I was there, I did a lot of work around analytics and data engineering, and that was where my interest in data engineering started evolving. From there I left the company after another two years, and then I moved into a bank. And this time around I started working as a data engineer. And while working as a data engineer, a bank has a lot of legacy systems. There&#8217;s a lot of things, a lot of moving parts when it comes to data engineering in a bank. You talk about privacy, you talk about building systems that are highly protected, all of those things.</p><p>So while in the bank I started having this interest in how to actually build robust pipelines. So I wanted to move from the normal, building ETL. And so I started learning, and that was how my career has spiraled all through the years. And yeah.</p><p>Here we are today.</p><p><strong>Shane</strong>: That move from being a business intelligence analyst focusing on the reporting, 
Tableau, Power BI, I&#8217;m assuming the data was served to you, so you grabbed the data you needed and you focused on visualization and user experience.</p><p>And then moving back into the engineering side, into the code that transforms the data, collects it, all that kind of gnarly stuff, that&#8217;s a common path for people.</p><p>How did you find it? How did you find changing from a set of tools and a certain set of skills to then expanding them out into new tools and new skills? Was that an easy journey or did you find that step change quite difficult?</p><p><strong>Mayowa</strong>: I&#8217;ll say for me it was not that easy a journey. The reason being that the resources are there online, but most times the resources are not what translates to business value, right? So sometimes you have to struggle to get things done. So it was not an easy journey. But one thing that I feel prompted this shift in me was that at the end of the day I discovered that one of the most important things still remains the data. You can have beautiful dashboards, you can have machine learning models that are doing fantastic. But everything still comes back to the data. So I started asking myself the question. I think I spend a lot of time transforming data, cleaning data, whether to produce reports or to do machine learning. And so I feel like it&#8217;s important to spend more time around the data itself than even the product. I mean, the end product is really important, because that is the whole point. But then I feel like it&#8217;s important to spend more time on the data. So that was what prompted it, but then I had to look for a way to attend conferences, take courses online, and that shift was not easy.</p><p>But gradually I started building all the necessary experience that I need. Yeah.</p><p><strong>Shane</strong>: And so often I&#8217;ll see people that make that change from a BI-centric set of skills to that data engineering set of skills. 
Often they&#8217;re in an organization that actually encourages that change. So they&#8217;re in an organization, they&#8217;ve got a bunch of mentors, a bunch of people who&#8217;ve done it before and they can get a lot of help and mentoring about the things they need to learn and a lot of feedback.</p><p>In some organizations that&#8217;s not possible. So people will go and study on the side, they&#8217;ll go and find courses and they&#8217;ll try and learn it to help them then effectively change organizations to change roles. Which way did you go? Did you find that you had mentoring and support within an organization to do the step change?</p><p>Or did you have to learn outside the organization you&#8217;re in and then change jobs to go into the new role?</p><p><strong>Mayowa</strong>: Honestly, I couldn&#8217;t find a mentor. I had to take the other route. I went all out to get information. I had a couple of people who would tell me exactly, oh, this is what you need to do. You can take this course. I had those people, I had people tell me that, but I didn&#8217;t have the opportunity of having someone who would hold my hand through the ropes and show me everything I need to know. So most of the time I had to go out there myself, pick up courses, pick up books to read and that was it. So that was the path that I took. But I understand what you&#8217;re saying. 
There are a lot of people who have the opportunity of having a senior data engineer on the team who can tell them what to do, and then they learn through working on different tasks, different projects along the way, but I didn&#8217;t have that opportunity.</p><p><strong>Shane</strong>: And if we think about the idea of a curriculum, an idea of these types of courses, these types of tools, these types of skills and learning them potentially in this order makes sense, so almost like a manifest or a curriculum of what you might want to do, rather than having to try and find it all yourself.</p><p>Did you find that, did you find that there were resources out there that gave you the idea of a bootcamp? Or did you have to go and talk to lots of people and cobble it together yourself to figure out how you went through that path?</p><p><strong>Mayowa</strong>: Basically, my knowledge was a little bit scattered at the very start because there was no tailored curriculum to give me a roadmap. These days I see a lot of roadmaps here and there, and I wish I had this when I was starting, right? But then when I started, I didn&#8217;t have this roadmap.</p><p>I had to learn. So at the end of the day, I have a lot of knowledge, and it took a lot of time, a lot of learning to put everything together to make sense actually. But I think that problem has been solved. I think I&#8217;ve seen a lot of resources online, a lot of roadmaps that I think are doing justice to that now.</p><p><strong>Shane</strong>: Yeah. And those roadmaps, are they primarily technical and tool-based, or are they about the process and the ways of working as much as they are about the technologies? 
Because again, when I see, if I look at job ads or I look at what people say they do, I&#8217;ll hear dbt, Spark, Databricks. I hear tools and technology versus the skills and the tasks that, you need technology to deliver them, but they&#8217;re just as important as the technology itself.</p><p><strong>Shane</strong>: So what did you find?</p><p><strong>Mayowa</strong>: I can share this with you at the end of the session because I don&#8217;t wanna mention names. There&#8217;s this particular roadmap that I&#8217;ve seen, which I think is very good. When you look at the roadmap, they didn&#8217;t spend a lot of time talking about tools.</p><p>Tools can change, but they spend a lot of time around processes, which I think is really great. Things, fundamentals that I think everybody should know when it comes to data engineering. And even though that domain I think is really good. And I share the same opinion. I see a lot of roadmaps where they talk about Spark, dbt. I don&#8217;t think that is the best. I think it&#8217;s because tools will always change. Tools are a means to an end, not an end in themselves. So I feel like roadmaps that talk about the real process, the real principles,</p><p>I think they&#8217;re very great and I think that is what will help a lot of people to move up quickly.</p><p><strong>Shane</strong>: We met each other on the Practical Data community and it&#8217;s still one of my favorite, most active communities I&#8217;m in. And yeah, I think something we should look to do is creating that roadmap or that curriculum as an open source roadmap and curriculum as part of that community, because lots of people have that struggle when they&#8217;re trying to get into the domain or jump across the roles.</p><p>Where do you start? And so last question before we get onto the current data stuff because this whole onboarding of a person into a new career really intrigues me. 
So when you talk about, if you had found a mentor who would&#8217;ve helped you through that process.</p><p>What would you have expected them to do with you? What would you want them to help with?</p><p><strong>Mayowa</strong>: Sorry, can you repeat that again?</p><p><strong>Shane</strong>: So you talked about the fact that you were looking for a mentor and you struggled to find one.</p><p>Say you had found one, let&#8217;s say somebody had said, yep, happy to help you take that next step in your journey. What would&#8217;ve been valuable for that mentor to do? What would you expect them to actually do to help you progress your career?</p><p><strong>Mayowa</strong>: I think one major thing I look out for in a mentor is not to teach me maybe how to write code, but rather just to show me exactly what I need to succeed, and I&#8217;ll explain what I mean. So for example, when I was trying to break into data engineering, it was difficult to actually find somebody who had the experience of what happens in data engineering. There are a lot of resources on how to write SQL, Python and all those things, right? But I need somebody who works in data engineering who tells me, this is the kind of problem we solve. This is why we need to solve this problem. This is what you need to learn, that kind of structure. That is what I&#8217;m looking for, and I don&#8217;t think it&#8217;s something that is readily available in the market. I&#8217;m not even sure most courses offer that, even when they offer these courses online. So for me personally, I&#8217;m looking for a mentor who is not just interested in, learn these skills, but tells you exactly the structure of learning. All right? This is important. For example, you need to learn A before you learn B, you need to learn B before you learn C, something like that. That is what I&#8217;m looking for in a mentor.</p><p><strong>Shane</strong>: I think that&#8217;s the important part. 
I mentor people every now and again, and I know a bunch of people that want to mentor but aren&#8217;t sure what the process is. And they often think it is more teaching. They think they&#8217;re gonna have to spend hours teaching somebody how to do something. And if you&#8217;re in an organization where you are mentoring somebody junior in your team, then yes, you probably will do that.</p><p>But if you are external to that organization, to that person, to me it&#8217;s about connecting with somebody, listening to where they&#8217;re at, potentially making a suggestion about what they might want to try next and why. It&#8217;s about storytelling. One of the things I find is if you can tell a story from your career, that I did this and this is what happened and this is why it happened, that&#8217;s valuable.</p><p>And then the last one is context. When somebody says, okay, why does everybody say you shouldn&#8217;t do real time streaming? The context is, it&#8217;s still expensive to do that versus batch. Nine times outta 10, they don&#8217;t need the data straight away. They can live with 15 minutes or an hour. So everybody says they want it, but actually when they see the cost of it, they say they don&#8217;t.</p><p>That&#8217;s the context. And you tend not to find that story, the context of why everybody says don&#8217;t do near real time unless you have to. It is not well written down anywhere. And I definitely don&#8217;t often see it as part of a course. So yeah, I&#8217;m with you.</p><p>It really is an hour, maybe two, a week having a chat with somebody and just helping them with their career. And so for me, I think about all the people that helped me, all the people that helped mentor me in my career. I think people who have been doing it for a while need to pay it back.</p><p>So anyway, end of pitch for people to mentor more people in the world. Okay. 
So let&#8217;s talk about this idea of redesigning the current data systems to make them better for LLMs and AI. So when you proposed that as a subject, talk me through it. What do you mean by that?</p><p><strong>Mayowa</strong>: I started thinking about the whole LLM thing a few months back. As a matter of fact, just like I told you, I have an article I want to release, where I wanted to talk about the traditional data systems and what we need to get to that point where the enterprise can take advantage of what LLMs are offering right now. And this started off from working, and everybody&#8217;s talking about LLMs. Since 2023 until now everybody has been talking about LLMs and the whole buzz, and there&#8217;s so much happening with LLMs these days. So I ask myself, on the job right now, I use ChatGPT, I use Claude for a lot of things, but I ask myself, how does the enterprise, how does the business still get value from these LLMs at the end of the day? So that was what led to that thinking. And then I started exploring and I&#8217;ve seen very great articles and research papers around that. So for me, the management team here where I work, they&#8217;re not just interested in hype, they&#8217;re interested in value, so before they put money into any of these things, they want to know how much value they can get from it.</p><p>So sometimes it&#8217;s difficult to get buy-in from management about some of these things unless they see what they can get from it. So that was why I started thinking about how we can bring value to the enterprise through LLMs. But then I discovered that with the current system, I don&#8217;t think that value can be created immediately with the current systems that we have, if at all there&#8217;s gonna be a value. 
When I say value, I&#8217;m not saying, we&#8217;ve seen agents doing interesting things, like, a lot of people integrate agents into their GitHub repository.</p><p>That is good for the developer, but when it comes to the business, I don&#8217;t think there&#8217;s so much value that they&#8217;re getting from that. So I think for us to get to that point, there&#8217;s gonna be a need for us to rethink some of the processes and even the data system itself. So that was how I got to that topic, to that theme that I gave you.</p><p>Yeah.</p><p><strong>Shane</strong>: Okay, and so there&#8217;s a couple of ways we can approach this. We can look at what you would see as a traditional data system and describe that, and then figure out what you think should change, or we could just identify some things, people, processes, technology, design, that you think, when you look at that, need to change for the use of LLMs and to add value back to stakeholders.</p><p>So I&#8217;m with you. We see lots of use cases of developers using copilots and LLMs to automate or make their jobs easier, remove that drossy work. We see some interesting use cases using those tools to automate business processes with data.</p><p>And then there&#8217;s this third kind of dimension, which is agents using data for stakeholders.</p><p>It&#8217;s not business process automation, and it&#8217;s not copilots for developers.</p><p>And now everybody goes, oh, it&#8217;s text to SQL, right? Ask a question, get an answer. That&#8217;s the obvious use case. So which way do you wanna do it? Do you wanna describe a legacy system and then what you&#8217;d change, or do you want to talk about specific use cases that you think are valuable to a stakeholder?</p><p>And then we talk about it from that lens.</p><p><strong>Mayowa</strong>: I think we can talk about a bit of the two. When I was trying to just put some points together before joining, I actually had three questions. 
I feel like it&#8217;s important for us to answer.</p><p>One of the most important questions is why do we need to redesign the data system? Why is it that we can&#8217;t take advantage of the current system? I&#8217;m talking about the traditional systems, right? I think that is very important. And then we need to answer the question, why can&#8217;t the current system be leveraged as it is? Because they are two different things. One thing is that we need to answer the question of why we need to redesign the data system. We also need to talk about why the current system can&#8217;t be used as it is. Maybe we can even add on top of that, what does the traditional data system look like?</p><p>Maybe somebody&#8217;s listening to this conversation and does not even know what the traditional data system looks like. So we might even want to touch on that. And I think the final question that we should try to provide an answer to is how then do we redesign the system that we think will be able to get us to that level where we can get massive gains from LLMs and agents.</p><p><strong>Shane</strong>: Let&#8217;s do that. One of the things that we need to always recognize is we need to anchor our language in a way that everybody else gets a shared language that we are using. So when you say traditional data system, I have in my head what I think a traditional data system is, and it&#8217;s probably not the same as yours.</p><p>Because mine&#8217;s based on, 30 years ago, 20 years ago, 10 years ago, there&#8217;s been broad iterations of traditional data systems. What I always suggest people do, right, is when they&#8217;ve got a mental picture in their head, they&#8217;ve got a map, it&#8217;s always useful to describe the map</p><p>in a way that everybody goes, oh, that&#8217;s what you mean. But let&#8217;s start off, what do you wanna do? Do you wanna do, why do we need to redesign traditional data systems? 
Or why can&#8217;t we leverage them for LLMs? Take it away.</p><p><strong>Mayowa</strong>: I think we should start with why we redesign the data system. I think one thing that is really important is that we need to know that LLMs or agents, whichever way you want to call it, they&#8217;re different from humans, right? If you have used agents or LLMs, whether it&#8217;s ChatGPT or Claude, whichever one, and you use it around data, you will know that they don&#8217;t ask once and then analyze. There might be a need for you to ask again, change some things. And so because of that, in an attempt for these agents or LLMs to provide an answer, they speculate. That means they try to do some form of exploratory work, so that makes them different from humans.</p><p>So because they&#8217;re depending on whatever it is that you give to them, for them to now think, that in itself tells you immediately that they behave differently from humans. And so the natural way we get information from a data system cannot be the same way that these agents get information for us.</p><p>I don&#8217;t know if that makes sense.</p><p><strong>Shane</strong>: Yeah, so let me just play it back and see if I understand</p><p>what you&#8217;re saying. So if we look at the LLMs, the foundational models, they&#8217;ve been trained on large knowledge bases of text.</p><p>And so in the early days, we&#8217;d ask &#8216;em a question, they would search that knowledge base and they&#8217;d come back with an answer.</p><p>And all we did was we&#8217;d ask a question, it would look into itself like a human would if it could only look at its memory. So it would go, what do I know about that? Here&#8217;s an answer. And then what we found was actually we wanna provide context to it. 
We wanna provide some additional information that foundational model may not have or may not have brought to the front of its memory.</p><p>And so we started seeing techniques like RAG and prompt engineering, ways of giving it additional instructions on how we want it to behave. Because the way my dentist behaves is different to the way a data analyst behaves. The prompts were, behave like a dentist, behave like a data analyst.</p><p>And then the second thing is, here&#8217;s some other information you probably don&#8217;t have that, again, a human would ask. If you said to me, how many customers have we got? I&#8217;m probably gonna say to you, what&#8217;s your definition of a customer and where do we hold the data about a customer so I can go look at it and count it for you?</p><p>And you want the count today, right? Not last week or next month. And it&#8217;s okay to count it to one, so 1,001 customers is okay. Yeah. So there&#8217;s a bunch of questions as a human you would ask if you&#8217;re an expert in your domain. And so this idea of context and reinforcement is passing that back.</p><p>So we pass the LLM prompts to tell it how to behave, the data we want it to look at if it doesn&#8217;t have access to it already, so we give it access, and then the context, additional information we know is valuable to it. So is that what you mean by the change in what we have to provide versus a SQL statement, which is select this from there, and all we gotta do is give it the code and the data and the machine runs that code on that data and we&#8217;re done?</p><p>Nothing else needs to happen. We don&#8217;t need to tell it anything else. Just run this code.</p><p><strong>Mayowa</strong>: You rightly explained it, but I think the keyword there is that because LLMs are going to be, or let me say agents, let&#8217;s just use agents, all right? 
Because agents are gonna be one of the ways we interact with data. When it comes to LLMs, agents are gonna be like the main use case, right? So I think the key word is that since agents are not human, there&#8217;s a higher chance that in an attempt to provide an accurate response to your queries they speculate. That means they spend a lot of time doing some exploratory work, and that in itself has an impact on the data system. There&#8217;s gonna be a lot of redundant queries, things that are not necessary, that you will need to fine tune and all of those things. And so because of that, I think, just like you said, they have been trained on this massive text that might not even be relevant to what you&#8217;re saying.</p><p>But then they have to return something back for you. So I think the key word here is that they speculate, and in all of these, it&#8217;s an attempt to just give you a response.</p><p><strong>Shane</strong>: Okay, so again if I play it back, if we took a human behavior, it&#8217;s like me saying to you, here&#8217;s my one terabyte data warehouse. Go and tell me how many customers I&#8217;ve got. But by the way, there&#8217;s no table called customer.</p><p><strong>Mayowa</strong>: Yeah.</p><p><strong>Shane</strong>: And now you&#8217;ve gotta do a whole lot of exploration, right?</p><p>You&#8217;ve gotta go and try and figure out what&#8217;s a customer called? What table does it live in? How&#8217;s it defined? And you&#8217;re gonna go and do all this work. And that work takes human time and it takes compute from the system because I&#8217;m gonna be writing queries. I think what you are saying is with the LLMs, it&#8217;s the same. If we say to them, here&#8217;s a big blob of data with no context and no constraints, go answer this question. Because with the reasoning models, and reasoning in quotes, it&#8217;s</p><p>going through and building itself a curriculum. It&#8217;s building a manifest. 
It&#8217;s saying, I&#8217;m gonna go try that.</p><p>That didn&#8217;t work. I&#8217;m gonna go try that. That didn&#8217;t work. And it&#8217;s trying lots of things. And because it&#8217;s a machine, yes, we sometimes see it tell us what it&#8217;s doing. But it&#8217;s doing a whole lot of work under the covers. And one of the other things about that is token cost. Because every time it does a task, it&#8217;s using tokens that cost us money.</p><p>If we change the way our data platforms are structured, we can make the LLMs more efficient, more effective, and require them to do less thinking because we&#8217;re giving them hints of what they should use. Is that the angle you&#8217;re taking?</p><p><strong>Mayowa</strong>: Yeah, correct. The fact that you mentioned the cost thing makes me remember some of the reasons. So I wrote here, you pay for waste. And this is one of the reasons why I feel like we need to redesign this, because enterprises, businesses don&#8217;t want to pay for waste. Redesigning this system is how we convert this speculation to speed. Because it&#8217;s important for us to know that businesses, whether small or large, will not in any way allow waste. So it&#8217;s important for us to know that agents, they don&#8217;t just query, they probe and they continue to, and there is a lot of redundancy at the end of the day if we don&#8217;t get this system right, because it is in the nature of agents to probe instead of just getting that.</p><p>Yeah. So yeah, you&#8217;re absolutely correct. Cost is a very important </p><p><strong>Shane</strong>: And one of the things is at the moment we&#8217;re not paying the true cost of those tokens. We pay $20. Okay, now we have to pay 200 a month. Okay, now we&#8217;re getting some constraints from Claude and those tools where we can&#8217;t just run everything forever before we run out of our limit of usage.</p><p>But we&#8217;re still not paying the true cost. 
And so while we can do things that are lazy right now and they work and don&#8217;t cost us a lot, it&#8217;s gonna change, right? And so, you&#8217;re gonna start getting the $30,000 bill and we might as well start designing our systems now to be cost effective and efficient.</p><p>And I think the other one is speed. If you think back to your Power BI and Tableau days, there was a rule of how long you could let a user wait before that report rendered. It was a second or two, and then they start going, this is too slow. And then you think about how much engineering we put in to summarize the data or to give us that performance.</p><p>And now you go into an agent, an LLM, and you ask it a question, and then when it&#8217;s doing its reasoning, it&#8217;s really interesting to watch how it just sits there and gives you some feedback. It&#8217;s working and it&#8217;s not dead, but it&#8217;s taking way longer to bring back a count of customers, which I could have had on a Tableau dashboard in less than a second with a blob of code.</p><p>So we&#8217;re still experimenting where these agents are best fitting. Okay. So what we&#8217;re saying is things have to be changed, because just throwing an LLM on top of your current data platform is not gonna be the most cost efficient, not gonna be the fastest.</p><p>And that waste will hurt us. Okay.</p><p>What&#8217;s next?</p><p><strong>Mayowa</strong>: I think we can now talk about what the current data system looks like, right? What I term traditional, and like you said, traditional in this case maybe means several things to several people, right? But for me, traditional just means the current system where we are able to use languages like Structured Query Language, SQL. So right now the way we work is we leverage platforms, databases, Postgres, name it. All right. And then we just run a query and then we get a result, and that&#8217;s what the current system looks like, right? 
But then when you look at the way agents work, I think the right way to put it is that agents don&#8217;t just use queries. What they do is not just querying. What they do is more like investigating, probing. They want to probe. So we need systems that go beyond that. Of course, SQL is very important. Many times people say SQL is always gonna be with us, and I see people learning SQL every day.</p><p>I still see a lot of universities teaching SQL in their curriculum. So I&#8217;m not saying SQL is going anywhere, but I&#8217;m saying we need systems that will accommodate other things like natural language. Because these agents will need much more than SQL. Like we said, they would need context.</p><p>We need to provide some kind of context. So maybe there&#8217;s gonna be a need for us to have interfaces that allow for not just Structured Query Language, but also natural language. The current data system that we have does not have that yet. So there might be a need for us to incorporate these kinds of interfaces to accommodate these agents so they perform better.</p><p><strong>Shane</strong>: In the past we&#8217;ve always had data catalogs.</p><p>So we&#8217;ve had this idea of a catalog that sat across the technical metadata, the technical context, the technical structure. So I have these tables, I have these columns, I have these values.</p><p>And then from a governance and stewardship point of view, we always knew that adding context into that catalog had value.</p><p>So that table holds a record for customer. That table holds names, therefore it&#8217;s PII. That is the table that you query if you want a single list of customers. These other ones, they don&#8217;t hold everything or they&#8217;ve got duplicates. There&#8217;s all this language, this descriptive stuff that was useful.</p><p>But what we learned was nobody ever does it. The cost of creating that context was way higher than the value in using it. 
And yes, we had big organizations where it got mandated, and there were certain industries where you had to do it. And yes, we got big tools that automated some of it. But in my experience, a data catalog was the tool you bought and two years later you turned it off, or nobody used it.</p><p>With LLMs and agents, I think that those types of capabilities, where we bring this natural language context in as text, as a description of information, are highly valuable. But I&#8217;m not convinced that data catalogs of old are the right place to do it. And the reason I say that is I think they&#8217;re disconnected.</p><p>They hoover up exhaust. So we hoover up some technical metadata and then we ask somebody, as an extra task, to go and add the business context, the operational context of how many rows are in there, what&#8217;s the data quality like. And they don&#8217;t even have this idea of agent context. You can&#8217;t really, at the moment, store prompts or hints for an LLM, for an agent. I don&#8217;t think they were fit for purpose in the original days, and I don&#8217;t think they&#8217;re fit for purpose now. So I&#8217;m with you. I think that there will be a new paradigm coming out about how we hold this language, this description, this context against everything to do with data.</p><p>And then that is highly valuable to the agents. And then the way we capture it, to me, it&#8217;s gotta be done as close to the creation of the code or the data as possible. So data engineers who are doing the work have it in their brain. They know how they&#8217;re counting customers: I&#8217;m gonna create a dim, I&#8217;m gonna create a hub and sat.</p><p>I know the logic because I&#8217;ve asked, and I&#8217;ve done that work. They&#8217;re the ones we have to make it really easy for, so they can capture that tribal knowledge into a place that the agent can go and find it. 
Or maybe we just have probes in our head and then the LLM could ask us. &#8216;cause when you work in an organization, you probably do it.</p><p>You go, oh, I don&#8217;t know how we calculated that. And you go and ask Bob. You know somebody in your team that you can go and ask, and you get that tribal knowledge outta their head. I&#8217;m being facetious here, but maybe we&#8217;ll have probes in our head where the LLMs ask us.</p><p>Yeah. So is that what you mean, that idea of that really rich, descriptive stuff? Yeah.</p><p><strong>Mayowa</strong>: So like I said, in summary, I think for us to take advantage of these agents there&#8217;s gonna be a need for us to have a paradigm shift from the normal SQL against databases. There&#8217;s gonna be more things involved. Like you mentioned, maybe a catalog, or maybe the data system itself needs to have it.</p><p>Maybe the databases might have something that stores a lot of context in memory, just to provide more context. But I think one thing that is important for these agents is that they need grounding. They need to have a lot of information for them to act. So the current data system has to evolve to the point where we have some of these components integrated into the system for us to take advantage of the agents.</p><p><strong>Shane</strong>: Let&#8217;s just take that comment around probing from the way that you meant it, not me putting a probe in my head for the LLM to get my knowledge. 
And the traditional way we&#8217;ve always done it is we&#8217;ve taken data from source systems and we&#8217;ve brought it into one place, and we&#8217;ve typically put it in a database of sorts where, like you said, we can write SQL and we can get an answer.</p><p>And yes, we played around with NoSQL databases and Hadoop and a whole lot of other stuff, but we keep coming back to a database column that stores data in a certain way and allows us to use this standard language to get the data back out. It seems to be a valuable way of working. And then we&#8217;ve now got this introduction of MCP, where we can effectively create almost an API, or a network port, or an HTTP kind of endpoint for the LLMs to talk to a system and get a response. So: I need this, give it back to me.</p><p>And it&#8217;s not writing SQL. It might talk to the MCP server, and the MCP server might write SQL to talk to the database, get the result back, and then hand it over.</p><p>So I look at it and I go, we know that there&#8217;s a bunch of use cases where probing won&#8217;t work, and I can come to those in a minute. But why wouldn&#8217;t we just fundamentally change the way we work? So rather than grabbing the data from the source systems and putting it in one place, why wouldn&#8217;t we just expose these MCP services and allow the LLM agents to always just do a one and done query?</p><p>Have you looked at also actually removing the data warehouse? We&#8217;ll come to the use cases why that probably won&#8217;t work at the moment, but I&#8217;m just intrigued by that. It&#8217;s just: go away, ask the question, give the answer. If the context is bound around it, if the context is the thing we care about, that&#8217;s the pet.</p><p>And where the data lives really is the cattle. It&#8217;s an interesting architecture change to what&#8217;s been 40 years of my life. 
</p><p><strong>Mayowa</strong>: So I think it is an interesting one actually. Of course, we can talk at length about the reasons why that is not gonna be possible. But I think that doesn&#8217;t stop the fact that it makes us start looking in a different way at the way we used to work.</p><p>Like you said, I don&#8217;t think there&#8217;s anything stopping us from doing that. Even if, in hindsight, you start thinking about the cost effect and a lot of things that might come up, I think that there&#8217;s absolutely nothing stopping us from exploring that. And this is me just digressing, but in my experience, the whole idea behind the data warehouse is consolidating data from different sources and putting it in one place. When you look at it, if we ask ourselves this question: has it really solved the problem? Because I&#8217;ve seen places where there is still information that you can&#8217;t find in the data warehouse.</p><p>That warehouse is supposed to be the source of truth, but there is still information that you can&#8217;t find in it. So if that is still a problem, what if, like you described, we have a sea of data that we can just throw these MCPs at to get information?</p><p>I think it&#8217;ll be possible, but then there will be a lot of things that go into making that happen.</p><p><strong>Shane</strong>: And a high chance of waste. Yeah. A high chance of wasted cost, of being slow. All those things that you talked about at the beginning that actually we need to focus on at the moment. 
I think whatever we designed would give us those problems.</p><p>But that&#8217;s at the moment, because we&#8217;re at the beginning of this whole new wave and so we haven&#8217;t really done the rethinking, the re-engineering of what it would look like.</p><p>So where&#8217;d you get to in terms of, if you started with a blank piece of paper, what would you do right now to build a data platform capability that is LLM and agent friendly?</p><p><strong>Mayowa</strong>: The first thing is that we need a different query interface. Like I explained, we need a different query interface, and this interface should accommodate both natural language and Structured Query Language, SQL, and the rest. But at the same time, to avoid some of the things that I&#8217;ve talked about, like redundancy, like waste, and all of those things, I think there&#8217;s gonna be a need for the data system to be able to determine what query or probe needs to be executed, just to prevent that waste. Because even when we have this interface that accommodates several MCPs that allow agents to run different queries against the data system, it is important for us to also put some measures in place to know what query to execute. Because at the end of the day, whatever you get back, it&#8217;s not all gonna be useful.</p><p>So I&#8217;ll give an example. If you say something like, oh, give me the sales trend of, maybe, batteries in the United States, an agent is probably just gonna run a query against different web servers, different websites, and just print everything. Not everything is gonna be useful for you.</p><p>But when you take that into an enterprise environment where you need to manage costs, where you need to ensure that you maximize and optimize your queries, you&#8217;ll see that is gonna be a big problem. 
So for me, I think the first thing I will look at is that we need a different query interface.</p><p>Then we need to find a way for the data system to determine what query to execute. So for example, the database itself would know. Of course there&#8217;s gonna be a need to implement maybe metadata storage and all of those things to give an idea of what currently exists within the data system, so that it determines immediately: this is what is available. So if you run a query that is not relevant to what is in the database, you just don&#8217;t execute it. So that is the way it is in my head right now. We need a new interface. We need something within the data system to determine what query needs to be executed. That is the way I would start designing.</p><p><strong>Shane</strong>: What&#8217;s interesting is this idea that when we move from humans to machines, we get infinite scale. So take that example you used of, what is the sales trend of batteries in the US?</p><p>As a human, I have some natural constraints. I have natural constraints of time. I can&#8217;t spend two years going and finding that number.</p><p>I have a natural constraint of knowledge, right? I&#8217;m gonna Google it. I&#8217;m gonna find some websites, and then I&#8217;m probably gonna run outta time or get bored, and I&#8217;m gonna stop. Whereas when we talk about the agents, they&#8217;ll scale. If you let it, right, it&#8217;ll search every website in the world.</p><p>If you let it, if you tell it to, it can run a thousand human... I don&#8217;t know, what is it? CPUs for a computer, miles per hour for a car. What&#8217;s the agent cycle for number of human hours? So it can run a lot of human hours in a short amount of time, for a cost.</p><p>Anything we do that is a pattern based on a human constraint of I don&#8217;t have time, I can&#8217;t scale myself, that disappears. 
And we&#8217;ve gotta be really cognizant of that, or we&#8217;re gonna get massive waste again, like you say. I think the other thing is this idea of, if you look at human behavior, if you&#8217;re in a data team and you&#8217;re in a team of five to nine, you&#8217;re gonna find that normally you&#8217;ll have a data modeling expert, you&#8217;ll have a data collection, source system expert, you&#8217;ll have a data engineering expert. You have these experts, and you are naturally gonna go and ask the expert for some help. Hey, I am looking at this and I need to model it. And I know Data Vault is our standard. So can you just gimme some help, because I&#8217;ve never done it, or I don&#8217;t do it very often? Or can you peer review it? And one of the things we&#8217;ve found as we&#8217;re building this out relates to when we had a single agent. So our agent&#8217;s called ADI. When she was the only agent, she used to flutter. We would say, go and do this work, and she&#8217;s just time slicing her skills all over the place. And we always got back an okay response. Then we broke her out into other agents.</p><p>So we have ADI the data modeler, ADI the engineer, who goes and figures out the transformation rules, and then ADI the boss. And so we talk to ADI the boss, and she knows about these other versions of herself. Then she goes, oh, okay, the next thing I need to do is model that data. And she talks to ADI the data modeler.</p><p>And what that means is we give ADI the data modeler a bounded context. We give her very specific prompts around: you are a data modeler. You&#8217;re not an engineer, you&#8217;re not the boss, you&#8217;re not the BI person. This is your job. These are your skills. We give her a bounded subset of context. You can look at the data structures, but you can&#8217;t go and do data quality tests.</p><p>Because we&#8217;re talking about, in this case, conceptual modeling. That&#8217;s not your job. 
Another ADI will then take what you&#8217;ve done and make it better, or do her own task. And what we find is that specialization of skills and then that handoff, which happens in a team but you don&#8217;t really see it, if you then programmatically do that with the agents, we seem to get better responses.</p><p>So I&#8217;m with you. I think this idea of what interface, what language, what set of skills, what persona can be bounded and then handed off all over the place, that&#8217;s, I think, where we&#8217;re gonna need to go. I still think SQL to query data, if we need to do that, is gonna be done.</p><p>However, we&#8217;ve had great success giving images to the LLMs, not data, but it&#8217;s expensive. It&#8217;s wasteful. So if I give it a screen capture of an Excel spreadsheet, it&#8217;s gonna do really well with it. But it costs me a lot more tokens than if I give it the CSV.</p><p>So I think it&#8217;s, again, that trade off between the art of the possible right now and the cost and waste of doing cool stuff.</p><p><strong>Mayowa</strong>: So I think part of the reason why I also mentioned that it&#8217;s important for us to have this interface that accommodates both natural language and SQL is that, right now, part of the way I work is I actually have a bunch of queries that I&#8217;ve written in the past.</p><p>And I just dumped a lot of them into my LLM. And right now I just ask questions, because this LLM already has a history of the queries that I&#8217;ve used: TPV, how to calculate TPV, how to calculate TPC, and all of those things. So I just say, hey, can you develop a query to do this, and it does that and then gives me the output.</p><p>So I think part of what you&#8217;re saying is about providing context. There&#8217;s still gonna be a lot of work to be done around context, so giving it queries that we&#8217;ve used in the past, or maybe anything that can provide more grounding. 
That will actually help resolve some of these issues.</p><p>I&#8217;ve seen it work firsthand. It&#8217;s working perfectly for me. There are a lot of requests where I don&#8217;t spend my time developing the query anymore. My LLM just does that for me.</p><p><strong>Shane</strong>: And that&#8217;s a really good point, this multimodal thing. Joe Reis talks about Mixed Model Arts. And so this idea that no organization actually uses one data modeling technique for their warehouse. They may say they&#8217;re dimensional star schemas, but there&#8217;s some relational stuff in there.</p><p>There&#8217;s some sorts. Yeah. We always have more than one modeling technique, and I think we&#8217;re gonna end up with more than one language. And that multimodal language is gonna be important. And so I&#8217;ll give you an example. We presented all the documentation for our product to ADI.</p><p>And so we describe what we call change rules, which are transformation code. And we have a set of language you have to use for it. And so we described how the rules work, what the structure is, what the structure of the language is, and there&#8217;s some examples: if you want to go and pivot, do this; if you wanna un-pivot, do that, right? But it was all in text. And so when somebody comes in to our platform and they just write in plain text, I need to transform this data, how would I do it, she came back with an okay answer.</p><p>But it was just okay. And sometimes it was just wrong. She would hallucinate and we&#8217;re like, yeah, that&#8217;s not how our product works. And then what we found was, effectively, we run a multi-tenancy architecture. So we stood up another tenancy, or one we already had for our partners called Alliance, and we pushed example transformation code, our example change rules, across every customer we had.</p><p>We took the rule itself, not their data, and we said, this is a rule we&#8217;ve applied before. 
So it&#8217;s the exact same example as yours: here&#8217;s a blob of code, this is a rule that we&#8217;ve used before. And we then created an MCP server that ADI could use to see those rules.</p><p>That&#8217;s when we broke off a separate ADI for change rules.</p><p>And so all she does is, if you ask her in natural language how you can transform this data, she will go and search those rules. But she&#8217;s effectively searching a code repository that is highly opinionated and highly structured. The language is the same. What&#8217;s in the language is different. And the response we get back is so much better now.</p><p>And so the thing we found, though, was that while we had the rule, the logic, the language of the rule, we had no context. We never wrote down why we were doing it. So then we added that as natural language: this rule is to take Stats New Zealand data, which is summarized as columns, and unpivot it to rows so that we can use it later when we need a row per record.</p><p>Then she&#8217;s, ah, now she&#8217;s getting both things. It&#8217;s the same as what you are saying. If you take your blobs of code that you use on a regular basis, and I think when you talk about TPV and TPC you are saying there&#8217;s some metrics in there, if you then put the definition of the metric in that code, TPV equals and then a bunch of text, as a comment in that code, then the LLMs are gonna get both a structured piece of code and context.</p><p>They&#8217;re gonna get SQL and they&#8217;re gonna get natural language. And I think that&#8217;s where we&#8217;ll end up.</p><p><strong>Mayowa</strong>: Going back to your question, if you ask me what are the things I need to think about if I&#8217;m going to design what this system should be, I think the interface is gonna be very important. And also how to manage what query needs to be executed, to avoid waste and redundancy. 
And I think that is the way I&#8217;m gonna think about it.</p><p><strong>Shane</strong>: So let&#8217;s take that example where you&#8217;ve got your blobs of code that you have found highly valuable, and you&#8217;ve already tested putting that code in as reinforcement, giving that context of the code to an LLM, and been able to ask questions and get back the help that you needed.</p><p>And let&#8217;s say that you do extend it out, where you actually define what those metrics are in natural language, so it has a richer context. And then you&#8217;ve got three other people in your team who need to do the same thing. Like right now, what would you do? Where would you store it?</p><p>Where would you surface it? What interface would you use to create it, to share it? Like how would that work for you right now?</p><p><strong>Mayowa</strong>: Right now the way it works, this is just for my personal use. I&#8217;ve not had any reason to share it with anybody. But I think that also points to what I was talking about when it comes to conceptualizing what this data system should look like. I think there&#8217;s gonna be a need for us to have part of the data system, maybe the database, a part of it that stores this. I don&#8217;t know whether it&#8217;s gonna be a meta store or whatever, but something that stores this information. And the reason being that regardless of who is running this query or who is submitting a probe, they have access to the same information.</p><p>And that information can help provide more grounding to the LLM or the agent or the MCP, whichever one you&#8217;re thinking of. I think that part of what we need to think about is a part of the data system. Maybe this can be the database. Maybe this opens a new opportunity for database research, to see how part of it can store information that provides more grounding to the LLM.</p><p>So that is the way I&#8217;m gonna think about it, but right now I&#8217;ve not had any reason to actually explore. 
</p><p><strong>Shane</strong>: It&#8217;s an interesting one, because if we think about the fact that we&#8217;ve always stored our operational data, OLTP data, separately from our analytical data, our OLAP data, that&#8217;s because in the past we never really had a technology that allowed both to be stored efficiently in the same place and queried. They were two different query paths, two different storage patterns, and that&#8217;s not true anymore. We&#8217;ve got tools out there like SingleStore that say they do that. I don&#8217;t work with &#8216;em, so I dunno if it&#8217;s true, but they say they do that, yet nobody&#8217;s really adopting it that I can see. We still keep it separate.</p><p>And so when we start bringing in this idea of a context store, vector databases seem to be the thing, but actually a lot of the time you don&#8217;t have to use those. You can use anything. So I&#8217;m intrigued, like you, to see whether we end up with yet another database, or whether we now end up with databases that store the data and the context side by side. It is an intriguing place to be. And then do we end up with a thing that I call the context plane, which is the idea of a shared centralized layer of context? Or do we end up with a context grid, the idea that context is stored next to the data and then something else federates it, provides a grid type architecture? And so again, we&#8217;re in the early days where lots of people are exploring and seeing what works and what doesn&#8217;t. So just on that, just to close it out, this is something you&#8217;ve been thinking about in your spare time, right? You&#8217;re not part of a software company building this. In your organization, you&#8217;re not part of a team building it out for them. This is just a natural, inquisitive nature going, yeah, that&#8217;s cool.</p><p><strong>Mayowa</strong>: Yeah. Pure, natural, inquisitive. I&#8217;ve read a lot of papers. 
There&#8217;s this interesting journal that came out of Berkeley. I can&#8217;t remember what the title is now, but it&#8217;s very interesting. They did a fantastic job around this conversation too.</p><p>So yeah, for me, I&#8217;m not part of any team actively developing something in that space. It is just something that I&#8217;m interested in.</p><p><strong>Shane</strong>: And the next step is you&#8217;re gonna start writing. You&#8217;re gonna start writing your thoughts, start writing your explorations.</p><p><strong>Mayowa</strong>: That&#8217;s part of what I&#8217;m doing. By the time you release this I want to listen to this conversation again, because some of the things that you said I think are really interesting. I want to look into them more deeply. So yeah, I&#8217;m just documenting my thoughts right now.</p><p><strong>Shane</strong>: And my advice to people is document lightly and document early. So share lightly, share early. Don&#8217;t wait for this podcast to come out. Document what you think right now and then write it up, share it, and then think about it a bit more and then write it up and share it. Because people often find the journey of the way you think as interesting as the answer. And what I say is, writing forces you to think with clarity. You think you know something, you&#8217;re like, oh yeah, I think I know how that&#8217;s gonna work. And then you write it down and you&#8217;re like, yeah, no, that&#8217;s bollocks, that&#8217;s not gonna work.</p><p>So that art of writing just makes you think, &#8216;cause writing is structured. It has a beginning, middle, and end. It&#8217;s gonna force you to storytell to yourself and validate what you think is gonna happen. So yeah, my recommendation to everybody is, write small bits.</p><p>Push it out early. It helps you think. It&#8217;ll change, and that&#8217;s okay. It&#8217;s not a book. It&#8217;s not something that you can&#8217;t change. 
It&#8217;s, hey, I thought this and now I think this, and I think that&#8217;s better than what I thought before. But I had to jump from A to B to C to D to get to E.</p><p>Alright, when this does come out, how do people find what you are writing? Have you worked out where you&#8217;re gonna publish it?</p><p><strong>Mayowa</strong>: If I&#8217;m gonna post, it&#8217;s gonna be on my LinkedIn. If I&#8217;m gonna leverage any other platform, I&#8217;ll also put it on my LinkedIn. So by the time I&#8217;m done with this, I think everybody can find it on my LinkedIn.</p><p><strong>Shane</strong>: Most of us are now writing on Substack &#8216;cause LinkedIn sucks for long form content. So I&#8217;ll encourage you to create a Substack, and then tell everybody on LinkedIn that&#8217;s where the long form content is. &#8216;cause LinkedIn&#8217;s kind of killed the ability, which is really sad, &#8216;cause actually I&#8217;d rather just write in one place.</p><p>Excellent. People can see how you&#8217;re exploring and what you&#8217;re learning and what you&#8217;re sharing. I look forward to it.</p><p><strong>Mayowa</strong>: Yeah. 
Thank you.</p><p><strong>Shane</strong>: I hope everybody has a simply magical day.</p><h2>&#171;oo&#187;</h2><div class="pullquote"><p><em>Stakeholder - &#8220;Thats not what I wanted!&#8221; <br>Data Team - &#8220;But thats what you asked for!&#8221;</em></p></div><p>Struggling to gather data requirements and constantly hearing the conversation above?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Bu2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" width="387" height="342" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:342,&quot;width&quot;:387,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19725,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/160520537?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Want to learn how to capture data and information requirements in a repeatable way so stakeholders love them and data teams can build from them, by using the Information Product Canvas.</p><p>Have I got the book for you!</p><p>Start your journey to a new Agile Data Way of Working.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://adiwow.com/168&quot;,&quot;text&quot;:&quot;Buy the Agile Data Guide now!&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://adiwow.com/168"><span>Buy the Agile Data Guide now!</span></a></p><h2>&#171;oo&#187;</h2>]]></content:encoded></item><item><title><![CDATA[Building Data Services with AI with Jason Taylor]]></title><description><![CDATA[AgileData 
Podcast #79]]></description><link>https://agiledata.info/p/building-data-services-with-ai-with</link><guid isPermaLink="false">https://agiledata.info/p/building-data-services-with-ai-with</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Sat, 10 Jan 2026 08:27:16 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1482968f-f2c6-49b1-ba24-8a4cfdd4e5d4_800x800.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Join Shane Gibson as he chats with Jason Taylor, a former quant researcher who turned towards the light (or dark) side of data, to explore the practicalities and pitfalls of building data services using AI.</p><blockquote><p><strong><a href="https://agiledata.substack.com/i/183787179/listen">Listen</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/183787179/google-notebooklm-mindmap">View MindMap</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/183787179/google-notebooklm-briefing">Read AI Summary</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/183787179/transcript">Read Transcript</a></strong></p></blockquote><p></p><h2>Listen</h2><p>Listen on all good podcast hosts or over at:</p><p><a href="https://podcast.agiledata.io/e/building-data-services-with-ai-with-jason-taylor-episode-79/">https://podcast.agiledata.io/e/building-data-services-with-ai-with-jason-taylor-episode-79/</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://podcast.agiledata.io/e/building-data-services-with-ai-with-jason-taylor-episode-79/&quot;,&quot;text&quot;:&quot;Listen to the Podcast Episode on Podbean&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://podcast.agiledata.io/e/building-data-services-with-ai-with-jason-taylor-episode-79/"><span>Listen to the Podcast Episode on Podbean</span></a></p><p></p><p></p><blockquote><p><strong>Subscribe:</strong> <a 
href="https://podcasts.apple.com/nz/podcast/agiledata/id1456820781">Apple Podcast</a> | <a href="https://open.spotify.com/show/4wiQWj055HchKMxmYSKRIj">Spotify</a> | <a href="https://www.google.com/podcasts?feed=aHR0cHM6Ly9wb2RjYXN0LmFnaWxlZGF0YS5pby9mZWVkLnhtbA%3D%3D">Google Podcast</a> | <a href="https://music.amazon.com/podcasts/add0fc3f-ee5c-4227-bd28-35144d1bd9a6">Amazon Audible</a> | <a href="https://tunein.com/podcasts/Technology-Podcasts/AgileBI-p1214546/">TuneIn</a> | <a href="https://iheart.com/podcast/96630976">iHeartRadio</a> | <a href="https://player.fm/series/3347067">PlayerFM</a> | <a href="https://www.listennotes.com/podcasts/agiledata-agiledata-8ADKjli_fGx/">Listen Notes</a> | <a href="https://www.podchaser.com/podcasts/agiledata-822089">Podchaser</a> | <a href="https://www.deezer.com/en/show/5294327">Deezer</a> | <a href="https://podcastaddict.com/podcast/agiledata/4554760">Podcast Addict</a></p></blockquote><div id="youtube2-NdE5zcDbICo" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;NdE5zcDbICo&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/NdE5zcDbICo?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>You can get in touch with Jason via <a href="https://www.linkedin.com/in/jasonbennetttaylor/">LinkedIn</a>.</p><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? 
The Information Product Canvas helps you get clarity in 30 minutes or less.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><h2>Google NotebookLM Mindmap</h2><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q8Eg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q8Eg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png 424w, https://substackcdn.com/image/fetch/$s_!Q8Eg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png 848w, https://substackcdn.com/image/fetch/$s_!Q8Eg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png 1272w, https://substackcdn.com/image/fetch/$s_!Q8Eg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Q8Eg!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png" width="1200" height="2111.5384615384614" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:2562,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:1395075,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/183787179?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q8Eg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png 424w, https://substackcdn.com/image/fetch/$s_!Q8Eg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png 848w, https://substackcdn.com/image/fetch/$s_!Q8Eg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Q8Eg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff724a3fa-cd60-4392-8192-a305bb5e28fd_3944x6939.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><h2>Google NotebookLM Briefing</h2><h2><strong>Executive Summary</strong></h2><p>This document synthesizes a discussion on the intersection of Artificial Intelligence and data services, drawing from a conversation between Shane Gibson and Jason Taylor (JT). 
The core thesis is that while AI, particularly Large Language Models (LLMs), has dramatically lowered the barrier to entry for building sophisticated data services&#8212;especially in parsing unstructured data&#8212;it simultaneously demands a renewed focus on rigorous process engineering, human oversight, and robust evaluation.</p><p>Key takeaways include the shift in the data profession from role-based identities to a more fluid, skills-based approach, where transferable skills are paramount. The conversation categorizes data services into three primary use cases: internal business health, customer-facing data access, and external monetization, with AI impacting all three. A central argument is that unstructured data parsing is now a &#8220;largely solved problem,&#8221; thanks to models like Gemini that can interpret complex documents and even images with remarkable accuracy.</p><p>However, this technological advancement introduces significant risks. The concept of &#8220;blast radius&#8221;&#8212;the potential negative impact of an error&#8212;is critical in determining the appropriate level of AI automation, from human-in-the-loop &#8220;assisted AI&#8221; to fully autonomous systems. The speakers warn against &#8220;vibe coding&#8221; and the tendency to treat AI as infallible magic, citing high-profile failures (e.g., Deloitte, a lawyer using ChatGPT) as cautionary tales. The &#8220;maker-checker&#8221; paradigm is presented as a crucial process framework for ensuring quality and accountability. The discussion concludes that data professionals must apply their foundational principles of logging, testing, and healthy paranoia to the AI domain, continuously evaluating models and cross-validating outputs to build trust in these non-deterministic systems.</p><p>--------------------------------------------------------------------------------</p><p><strong>1. 
The Evolving Data Career: From Roles to Skills</strong></p><p>The dialogue begins by examining the career trajectory within the data field, highlighting a fundamental shift away from rigid job titles toward a focus on underlying skills and attributes.</p><p><strong>1.1. Breadth vs. Depth and The PhD Dilemma</strong></p><p>The transition from a specialized quantitative (&#8220;quant&#8221;) researcher to a broader data professional serves as a key example. This move is framed as a strategic choice between depth (e.g., heavy statistics, requiring a PhD to compete at the highest levels) and breadth (a wider data skillset).</p><ul><li><p><strong>Market Dynamics:</strong> The market often favors broader skillsets, enabling professionals to handle more of a project&#8217;s lifecycle end-to-end. As Shane notes, &#8220;...it&#8217;s easier if you&#8217;ve got a broad set of skills to be able to pick up a gig or do a role.&#8221;</p></li><li><p><strong>The &#8220;PhD Barrier&#8221;:</strong> In highly specialized fields like quantitative finance, a PhD can be a de facto requirement. JT comments on this pragmatically: &#8220;I don&#8217;t have a PhD and competing against PhDs sucks.&#8221; This has historical roots in the 1980s and 90s when finance began recruiting physics PhDs for their expertise in signal processing, which was analogous to financial market analysis.</p></li><li><p><strong>Stereotypes vs. Reality:</strong> While the market may have stereotypes about needing a PhD for certain roles, the speakers question the universal necessity, pointing out that &#8220;not all PhDs are the same.&#8221;</p></li></ul><p><strong>1.2. 
Attribute-Based Career Planning</strong></p><p>A core argument is that professionals should focus on their inherent attributes and preferred activities rather than chasing job titles, which can be defined differently across organizations.</p><ul><li><p><strong>Focus on Skills, Not Roles:</strong> JT strongly advocates for this approach: &#8220;I hate that the role-based mentality... is for somehow perpetuated.&#8221; This is reinforced by Shane&#8217;s example of survival analysis skills from genetics being applied to supermarket product placement.</p></li><li><p><strong>Data Persona Templates:</strong> Shane is developing a book on &#8220;data persona templates,&#8221; a skills-based framework. By analyzing job ads with a custom GPT agent, he has found that despite varied job descriptions, the underlying skill requirements often distill down to just three core personas.</p></li></ul><p><strong>Key Quote:</strong> <em>&#8220;data scientists are just quants, or quants are just data scientists with more subject matter expertise. Like it&#8217;s all kind of the same thing.&#8221;</em> - Jason Taylor</p><p><strong>2. Defining and Monetizing AI-Powered Data Services</strong></p><p>The conversation defines a &#8220;data service&#8221; primarily as a data-centric offering that generates revenue, distinguishing it from internal data teams that support a non-data primary business (e.g., selling ice cream).</p><p><strong>2.1. A Taxonomy of Data Use</strong></p><p>Shane proposes a three-category framework for the use of data in an organization:</p><p>1. <strong>Internal Use:</strong> Understanding and growing the business.</p><p>2. <strong>Customer Support:</strong> Enabling customers to access their own data (e.g., in a SaaS platform or bank).</p><p>3. 
<strong>External Monetization:</strong> Exposing data externally to generate revenue, which can include direct data sales or enabling partners.</p><p>The focus of &#8220;data services with AI&#8221; is primarily on the second and third categories, particularly where data is enriched or processed for monetization.</p><p><strong>2.2. Models of Data Services</strong></p><p>JT outlines several models for companies that sell data:</p><ul><li><p><strong>Pure Enrichment:</strong> A customer sends their data, the service does &#8220;something fancy&#8221; to it, and sends it back. The process is monetized.</p></li><li><p><strong>Raw Material Sales:</strong> Collecting and selling data, often via methods like web scraping.</p></li><li><p><strong>Integration:</strong> Providing specialty knowledge on how to integrate and organize disparate datasets.</p></li></ul><p>Companies like Bloomberg are cited as examples that successfully combine all three models.</p><p><strong>2.3. The Impact of Generative AI</strong></p><p>Generative AI introduces a new dynamic: non-determinism. Unlike traditional services that sell a predetermined, consistent product, AI-based services sell something that &#8220;may be variable at times.&#8221; This fundamentally changes the nature of the product and the processes required to ensure its quality.</p><p><strong>3. Unstructured Data Processing: A &#8220;Solved Problem&#8221;</strong></p><p>A significant portion of the discussion centers on the claim that AI has made the parsing of unstructured and semi-structured data a &#8220;solved problem.&#8221;</p><p><strong>Key Quote:</strong> <em>&#8220;the one that&#8217;s exploded the most by a massive amount has been unstructured or structured data parsing... I feel like that&#8217;s a solved problem. Now do, maybe that&#8217;s extreme.&#8221;</em> - Jason Taylor</p><p><strong>3.1. From Tesseract to Gemini</strong></p><p>The progress in this area has been substantial. 
In the past, extracting text from a PDF with tools like Tesseract was challenging, and even training specialized models like Google&#8217;s Doc AI yielded good but not &#8220;that good&#8221; results.</p><p>Now, modern models like Gemini Pro can process complex documents&#8212;including financial statements, org charts, and diagrams within PDFs&#8212;with &#8220;remarkable accuracy.&#8221; JT notes his surprise when he drops a document in and says, &#8220;give me everything,&#8221; and the model understands the content and structure exceptionally well. This has massively lowered the operational barrier to accessing this data.</p><p><strong>3.2. The Diminishing Moat of Domain Expertise</strong></p><p>Historically, the competitive advantage (or &#8220;moat&#8221;) for data service companies like Bloomberg or LexisNexis wasn&#8217;t just providing the raw data (which is often public), but the &#8220;many years of highly skilled and trained... professionals augmenting that raw data... with context.&#8221; This organization and curation is what created value.</p><p>LLMs are now diminishing this moat. They have &#8220;come a long way&#8221; and can infer much of the context that previously required thousands of human experts. While there is still value in tribal knowledge&#8212;&#8220;not everything we know is written down&#8221;&#8212;the gap has narrowed significantly.</p><p><strong>3.3. The Power of Visual Interpretation</strong></p><p>A key advancement is the ability of LLMs to interpret documents visually, not just as raw text.</p><ul><li><p><strong>Image-Based RAG:</strong> Processing image data (e.g., a screenshot of a report) instead of just the text can be &#8220;wildly more beneficial&#8221; because the model picks up on subconscious cues like layout, organization, and what else is on the page.</p></li><li><p><strong>Use Case: Report vs. 
Dashboard:</strong> Shane describes a project where an LLM successfully categorized 8,000 legacy reports by analyzing screenshots. The model learned the human-like heuristic: &#8220;If I see a single table of data, it&#8217;s a report. If I see multiple Widgety objects, it&#8217;s a dashboard.&#8221;</p></li></ul><p><strong>4. The Imperative of Human Oversight and Process Engineering</strong></p><p>Despite the power of AI, the speakers stress that its non-deterministic and fallible nature makes human oversight and robust processes more critical than ever.</p><p><strong>4.1. Blast Radius and Appropriate Automation</strong></p><p>The concept of <strong>&#8220;blast radius&#8221;</strong> dictates the level of risk and, therefore, the necessary level of human involvement.</p><ul><li><p><strong>Low Blast Radius:</strong> A mistake in a marketing campaign might result in spam.</p></li><li><p><strong>High Blast Radius:</strong> A mistake in pharmaceutical trial data could lead to a death.</p></li></ul><p>This leads to a hierarchy of AI implementation:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1yP3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1yP3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png 424w, https://substackcdn.com/image/fetch/$s_!1yP3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png 848w, 
https://substackcdn.com/image/fetch/$s_!1yP3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png 1272w, https://substackcdn.com/image/fetch/$s_!1yP3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1yP3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png" width="843" height="157" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:157,&quot;width&quot;:843,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:27112,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/183787179?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1yP3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png 424w, https://substackcdn.com/image/fetch/$s_!1yP3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png 
848w, https://substackcdn.com/image/fetch/$s_!1yP3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png 1272w, https://substackcdn.com/image/fetch/$s_!1yP3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3a3aa1b-0047-4540-8ca3-8cf86b6a6e97_843x157.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>4.2. Failures of Blind Trust: &#8220;Vibe Coding&#8221;</strong></p><p>The discussion warns against the dangerous trend of &#8220;vibe coding&#8221;&#8212;uncritically accepting and deploying AI-generated output. High-profile failures serve as cautionary tales:</p><ul><li><p>A lawyer who used ChatGPT for a legal filing, which included fabricated case citations.</p></li><li><p>Deloitte being forced to repay a government agency half a million dollars after using AI to generate a report that &#8220;hallucinated a whole lot of case studies.&#8221;</p></li></ul><p><strong>Key Quote:</strong> <em>&#8220;If you hired a genius level person... Would you read their work after they generated it...? I don&#8217;t give a fuck how smart you are. I&#8217;m reading what you put together... I&#8217;m accountable. So why in any of these circumstances would you not check this stuff?&#8221;</em> - Jason Taylor</p><p><strong>4.3. The &#8220;Maker-Checker&#8221; Paradigm</strong></p><p>The solution to managing AI&#8217;s fallibility lies in process engineering. The <strong>&#8220;maker-checker&#8221; paradigm</strong>, a common process in manufacturing and finance, is proposed as an essential model for AI workflows. One agent (human or machine) creates the output (the &#8220;maker&#8221;), and a separate agent reviews and validates it (the &#8220;checker&#8221;). 
This builds in accountability and a review system, much like code reviews (PRs) in software engineering.</p><p><strong>5. Evaluation, Testing, and Trust in Non-Deterministic Systems</strong></p><p>The conversation highlights a cognitive dissonance where seasoned data professionals often forget their core principles of testing and validation when working with AI.</p><p><strong>5.1. The Underinvestment in &#8220;Evals&#8221;</strong></p><p>&#8220;Evals&#8221; (evaluations) are the AI equivalent of software testing. This is seen as a &#8220;massively important&#8221; but &#8220;under invested area.&#8221;</p><ul><li><p><strong>Complexity of Testing AI:</strong> Testing an AI system is more complex than traditional code because there are more moving parts that can change: the underlying LLM (which vendors can update), the prompt, the RAG context documents, and subtle variations in the input data.</p></li><li><p><strong>Methods for Evaluation:</strong></p><ul><li><p><strong>LLM as a Judge:</strong> Using one LLM (e.g., Claude) to evaluate the output of another (e.g., Gemini).</p></li><li><p><strong>Testing at Scale:</strong> Running a large number of tests, including edge cases and &#8220;chaos engineering&#8221; style random inputs, to understand the model&#8217;s boundaries.</p></li><li><p><strong>Ad Hoc Testing:</strong> Even a simple measure, like asking the same question multiple times to check the answers for consistency, is &#8220;better than nothing.&#8221;</p></li></ul></li></ul><p><strong>5.2. Logging and Healthy Paranoia</strong></p><p>Data professionals are trained to &#8220;log the shit out of everything,&#8221; yet often fail to apply this discipline to AI systems. Logging the reasoning path of an LLM is crucial for debugging and understanding its behavior, especially when an unexpected answer is produced.</p><p>A <strong>&#8220;healthy degree of paranoia&#8221;</strong> is described as a beneficial trait for data professionals. 
This involves an inherent distrust of outputs and a commitment to cross-validation. JT states, &#8220;I still crosscheck things. When I write code with LLMs, I read all of it, like all of it, I see my role as I am the reviewer.&#8221;</p><p><strong>6. The AI Toolkit and Professional Practices</strong></p><p>The speakers discuss their personal toolkits and workflows, revealing practical strategies for leveraging AI effectively.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8UfS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8UfS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png 424w, https://substackcdn.com/image/fetch/$s_!8UfS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png 848w, https://substackcdn.com/image/fetch/$s_!8UfS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png 1272w, https://substackcdn.com/image/fetch/$s_!8UfS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8UfS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png" 
width="817" height="218" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:218,&quot;width&quot;:817,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:31147,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/183787179?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8UfS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png 424w, https://substackcdn.com/image/fetch/$s_!8UfS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png 848w, https://substackcdn.com/image/fetch/$s_!8UfS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png 1272w, https://substackcdn.com/image/fetch/$s_!8UfS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb721d8e1-7619-41f8-9b15-583fdfd7db61_817x218.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>A key professional practice is to <strong>always review AI-generated code</strong>. 
A common red flag is when the code references an outdated model version (e.g., Gemini 1.5 when 2.5 Pro is current), which is a &#8220;dead giveaway&#8221; that the user did not read the code.</p><p><strong>7. Future Outlook: Tribal Knowledge and Creativity</strong></p><p>The dialogue concludes by speculating on the future of knowledge and human creativity in an AI-dominated landscape.</p><ul><li><p><strong>Gravitation to the Mean:</strong> LLMs, by their nature, gravitate toward the mean or average of their training data. This could create a problematic feedback loop where human thought becomes less diverse.</p></li><li><p><strong>The Pendulum Swing to the Arts:</strong> As AI automates more rote, scientific, and predictable tasks (&#8220;sciences&#8221;), there will be a &#8220;pendulum swing&#8221; where society places a higher value on uniquely human traits like creativity, randomness, and artistic expression (&#8220;the arts&#8221;). JT states, &#8220;I am very bullish on the arts.&#8221;</p></li><li><p><strong>The Future of Tribal Knowledge:</strong> While some may try to hoard their proprietary knowledge behind paywalls, the speakers hope that technology will continue to lower the barrier to recording and sharing information. This could accelerate the advancement of human knowledge, as more ideas are documented and built upon. The belief is that we have not yet reached a point where &#8220;all thought has been explored.&#8221;</p></li></ul><p></p><h2>Transcript</h2><p><strong>Shane</strong>: Welcome to the Agile Data Podcast. I&#8217;m Shane Gibson.</p><p><strong>JT</strong>: I am Jason Taylor, or you can call me JT. Either one&#8217;s fine.</p><p><strong>Shane</strong>: Hey, JT. Thanks for coming on the show. I think today we&#8217;re gonna have a bit of a chat around building data services with AI. But before we do that, why don&#8217;t you give the audience a bit of background about yourself?</p><p><strong>JT</strong>: Yeah, sure. I guess the simplest way that I usually explain it is the beginning of my career was more quant research and data, and then I just gradually went towards the data. I don&#8217;t know if that&#8217;s towards the light or towards the dark, but um, let&#8217;s see. I worked facts at Palantir. Usually everybody wants to talk about buy side, whole bunch of different places.</p><p>Now I play in startup land because I have a death wish. I don&#8217;t know. Yeah, working on a bunch of fun stuff.</p><p><strong>Shane</strong>: What made you come from quant to pure data?</p><p><strong>JT</strong>: That&#8217;s a good question. And, and quant is a data gig in a lot of ways, right? And Joe Reis started in this as well, doing more quant research at one point in his career. 
I think there are different types of quants, but I think a good majority of them, as people have become more technical over time, have become more data oriented.</p><p>Sometimes I joke that data scientists are just quants, or quants are just data scientists with more subject matter expertise. It&#8217;s all kind of the same thing. But why? I think the very practical answer is I don&#8217;t have a PhD, and competing against PhDs sucks.</p><p><strong>Shane</strong>: I think it&#8217;s interesting. It&#8217;s this idea of breadth versus depth for me. I see people start out with heavy statistics, right? That&#8217;s what they love and that&#8217;s what they get into, and that&#8217;s a very specific set of skills. And then often they&#8217;ll go into breadth. They&#8217;ll extend their skillset out and become more data and a little bit less stats-y.</p><p>And my perception is that&#8217;s because that&#8217;s where the market is. It&#8217;s easier if you&#8217;ve got a broad set of skills to be able to pick up a gig or do a role and do more of the work end to end yourself than if you are a specialist with a really deep set of skills. I hadn&#8217;t quite thought about the PhD side. I always talk about, are you gonna be in the top 5% of your skillset? And I suppose in that area, to do that, you have to be a PhD.</p><p><strong>JT</strong>: So first of all, let me start by saying I very much subscribe to the thing you said, where I want to capitalize on my strengths and there are certain things I know I&#8217;m good at. And actually Google came out with this cool tool where you can talk to it about a job. It&#8217;s all AI oriented, but it will actually tell you attributes and things. I think you self-describe your attributes and it helps pinpoint potential jobs that it believes fit you. 
And it&#8217;s actually really smart, and I very much subscribe to this idea of attribute-based or characteristic-based job placement, if you will, as opposed to, you know, there&#8217;s a ton of people that are like, oh, I wanna be a data engineer or a data scientist.</p><p><strong>JT</strong>: And it&#8217;s great. Those roles are wildly different at different places. What things do you like to do? What attributes? So long-winded way of saying I very much subscribe to this. I think that there are certain things that I gravitated towards in terms of the attributes I like and the activities I like. I like being analytical, sometimes to a fault, which I think we all do in the data space to some degree. But then on the PhD thing, maybe a little bit of terrible history. At some point in time, from a finance trading perspective, people started leaning into math, go figure. And there are portions of finance that have always been in that space.</p><p>But at one point, I think it was in the eighties, maybe early nineties, they started pulling in physics professors and the like, because people started looking across the aisle effectively and saying, hey, you&#8217;re doing signal processing and studying these things, and hey, that&#8217;s the same as finance.</p><p>It&#8217;s the same thing. So people started looking at that type of math and that type of process that they were doing, and I think people started leaning into that, right? There&#8217;s a piece of this which, and I&#8217;m sure this will relate to AI really fast, is just stereotypes, right?</p><p>People are like, oh, I need a PhD to do this. And I&#8217;m like, do you really? Do you know what you&#8217;re doing? Do you think all PhDs are the same? But ultimately that&#8217;s a good chunk of the market. I&#8217;m not gonna discredit people having a PhD. 
I think that&#8217;s awesome. I definitely feel like I would&#8217;ve wanted to do that at some point in my life had I not made certain decisions.</p><p>But yeah, it&#8217;s hard. Those people are smart, and especially on paper from a recruiting perspective, you know how it is. Some recruiters are just like, oh, here are the qualifications I was given, yay or nay. And that&#8217;s like</p><p><strong>Shane</strong>: I think the recruitment process is broken. I think that whole industry is gonna be disrupted. I think we&#8217;re gonna move to more community or closed network based recruitment. Very much </p><p><strong>JT</strong>: Aren&#8217;t we there though? Aren&#8217;t we already there? </p><p><strong>Shane</strong>: Yes, we are for some, but yeah. If I look at the way the government over here recruits, my standard joke is somebody in a government agency generates a job description with AI, then the candidate uses AI to generate the CV to match the job ad, then the recruitment agent uses AI to see whether the CV matches the job ad. I was talking to somebody and they were advertising for an administrator, an office kind of administrator, as part of their process. And they were saying they had five perfect CVs, perfect matches, all the experience, all the skills. And when they went to interview them, none of them had done anything near what the CV said they had.</p><p>So I think this personal recommendation, this personal network, is gonna become more and more important. And the idea of skills being transportable. I remember many years ago when I was working at SAS and I was watching one of the projects one of the consulting teams did for a supermarket.</p><p>And they ended up using survival analysis to figure out placement of takeaway hot meals with something else. And I didn&#8217;t do well at stats at school. I didn&#8217;t enjoy it. 
So it was always funny when I went to work for a company that was pure statistics. And I was like, how does this work?</p><p>And they said, imagine you&#8217;ve got a tall man and a short woman, and they have a child. There&#8217;s survival statistics on which genes are gonna survive and will the child be tall or short. That&#8217;s what we use to decide whether we should put beer or Coca-Cola next to those meals.</p><p>And I&#8217;m like, I still don&#8217;t understand it, but it makes sense. And so it&#8217;s this idea that actually, like you said with the finance stuff, these skills you get in one domain are really useful in an alternate domain. So focus on the skills, not the role.</p><p>And that&#8217;s where the value is.</p><p><strong>JT</strong>: Yeah, no, a hundred percent. I hate that that&#8217;s perpetuated. I hate that the role-based mentality is somehow perpetuated for a lot of people, that that&#8217;s just how people think about careers.</p><p><strong>Shane</strong>: I&#8217;m currently perspiring on my second book, which is around data persona templates. And so one of the things I&#8217;m doing is I need examples for the book. So what I&#8217;ve started doing is downloading job ads for data professionals. And then I&#8217;ve created basically a ChatGPT agent based on the book, with a whole bunch of prompting.</p><p>And I&#8217;m telling it to take those job ads and create the persona template for me.</p><p>And the persona template is skill-based, it&#8217;s all about skills. It&#8217;s really intriguing to take all these different jobs that look different, and then when you boil them down to what the persona is, it typically comes up with only three.</p><p>Three core ones. There&#8217;s variations of it, but I&#8217;m intrigued by that. But this one&#8217;s not about data personas. This one&#8217;s about building data services with AI. Why don&#8217;t you explain what the hell you mean by that? 
And we&#8217;ll go from there.</p><p><strong>JT</strong>: What do I mean by that? There&#8217;s a very easy rabbit hole here, which is like, what is a data product, which I&#8217;m going to avoid. We are not going to go down that hole.</p><p><strong>Shane</strong>: Ah, come on. Pedantic semantics, that&#8217;s 45 minutes of us arguing a definition of a definition.</p><p><strong>JT</strong>: Terrible. No, I&#8217;m not playing that game. What are definitions, Shane? Define definition. If you ever see Joe Reis, make sure you ask him to define definition. Joe loves debating semantics, if you didn&#8217;t know that already. So if you see him or hear him, please ask him these questions.</p><p>He loves them. Anyways, no. I&#8217;ll start with an interesting divide that I was actually talking to someone earlier about. I always find it interesting because in the vast majority of the data space, usually when you talk to other people in data professions, they&#8217;re in some sort of supporting role, I&#8217;m gonna call it. And when I say supporting role, I mean that the data itself is not the revenue-generating aspect of the company. Very basic example: I work at a company that sells ice cream, so I help &#8217;em figure out how to best sell ice cream. But still the main thing that you&#8217;re selling there is ice cream, right? There is an entire world of people like us that sell data. And I&#8217;ve lived in that world for a very long time. It largely focuses around things like finance, because finance is an industry that has to consume data in order to operate. There are other industries, I think marketing and advertising and things like that are also in this scope.</p><p>But there&#8217;s an entire industry of people that just sell data. So, especially for data services with AI, I think of a couple different things. There are the people that are pure enrichment based. Send me your data. I will do something fancy and send it back to you. 
There are the people just selling the raw material. Here&#8217;s the data I collected in some capacity. A lot of those people are doing web scraping, not all of them. And then there&#8217;s other people that do integration-style work, and if you think about a Bloomberg, which is relatively a household name at this point, they do all of them, which is interesting, right?</p><p>They&#8217;re consuming from the public domain. They also have specialty knowledge around how to integrate the data, and then also how to enhance it if you give it to them. But now we&#8217;re in a new world, not to say generative AI is brand new. I think it&#8217;s relatively common knowledge.</p><p>Maybe not for everybody, that generative AI&#8217;s been around for a little while. It&#8217;s just now very mainstream, very accessible. There&#8217;s 10 cajillion startups around it. But it&#8217;s very interesting because of the key aspect of it, which is it&#8217;s non-deterministic. So now I&#8217;m not selling a predetermined thing. To some degree, I&#8217;m selling something that may be variable at times.</p><p>I think that, especially with services around AI today, there&#8217;s definitely no shortage of web scraping companies. I think the one that&#8217;s exploded the most by a massive amount has been unstructured or structured data parsing. And I feel, and I&#8217;m curious of your opinion, and I&#8217;ll pause here, I feel like that&#8217;s a solved problem now. Maybe that&#8217;s extreme. Do you agree?</p><p><strong>Shane</strong>: I think it&#8217;s like when people tell me data collection&#8217;s a solved problem. Every time we onboard a new customer, two or three of their source systems are ones we know well, and then there is always an outlier system where there&#8217;s a hundred customers in the world and the APIs are really badly documented.</p><p>And the data structures are a nightmare for us to try and understand. 
And I sit there and go, solved problem my ass. I just wanna go back though. There&#8217;s an interesting one. One of the things I do when I work with companies and I&#8217;m helping teams out is we start this idea of a playbook.</p><p>And the idea of a playbook is basically a bunch of slides, and they have two purposes. The primary purpose is to explain how the data team works for any new data team member. So when you onboard, you read this and you get a feeling of how they&#8217;re structured, how their work flows, what the culture of the team is.</p><p>And the second one is if you&#8217;re a stakeholder in the organization, it tells you how the team works so you understand how to engage with them, how long you&#8217;re gonna pretty much wait, and what your role is in that data work. And one of the slides that has become really common for me now, right at the beginning, is saying that I can categorize use of data in three ways: internal use to understand the health of the business and grow the business; data to support the customers, where the customer&#8217;s actually accessing their own data.</p><p>So typically a software as a service or a bank or insurance company. And the third one is where data is used externally for monetization. That might be selling data, or that might be enabling partners to use whatever you have. If you&#8217;re an insurance company and you&#8217;ve got insurance agents out there, I treat that as external data. You&#8217;re giving them access to that data outside your organization, based on your customers, to make more money.</p><p>So I kind of like that framing. So when you talk about data services with AI, you are talking about that last one, data being exposed externally, securely, to make money somehow. 
And it may not be selling data, but we are definitely exposing that data and sharing it to monetize it.</p><p>Is that what you&#8217;re talking about?</p><p><strong>JT</strong>: I think that is definitely correct. I also think it&#8217;s the middle one too, which is exposing their own data, if I heard that correctly.</p><p>The exposing their own data, I think, is another one. A very common one is just enrichment. It&#8217;s already my data. You may be adding or doing something to it, and then I&#8217;m monetizing the process as opposed to the data itself.</p><p><strong>Shane</strong>: Okay. I heard a podcast a while ago that intrigued me, and it was around a massive US company that had digitized and augmented all the lawyer case history thingies. I can&#8217;t remember which one it was. I seem to think LexisNexis, but I&#8217;ve probably got that wrong. And what was interesting, and it&#8217;s coming back to this idea of semi-structured or structured, what they were saying is,</p><p><strong>JT</strong>: too.</p><p><strong>Shane</strong>: Yeah. You define definition. Actually, I have a definition of a, and then I was doing a group thing with Ramona and Chris Gamble.</p><p>And Ramona and I have a disagreement on the definition of structured versus unstructured. And it&#8217;s all around CSV, right?</p><p>Anyway, so what&#8217;s interesting about this is you think, okay, we&#8217;ve got all this case history stuff, and it&#8217;s in books, and there&#8217;s probably digital versions of those books. And so it&#8217;s a solved problem. Now I could go and grab all that content and digitize it and create a service that competed with them.</p><p>And yes, they&#8217;ve got market share. How do I find the market in that, potentially? But I get the impression it&#8217;s not cheap. It&#8217;s for lawyers. Nothing with lawyers is cheap. 
So I could probably disrupt them and Uber them, or Netflix them, right?</p><p><strong>JT</strong>: Yeah.</p><p><strong>Shane</strong>: But the key was the augmentation.</p><p>It was the many years of highly skilled and trained legal professionals augmenting that raw data, even though it was unstructured, with context, and that context is where their moat was. So before we get onto that idea of, is the problem solved? Is that what you are seeing though, that once you get this data and then you add additional context to it, that creates the value, that creates the moat that&#8217;s harder for anybody else to actually go and breach?</p><p><strong>JT</strong>: Yeah, I&#8217;m just gonna start outright by saying yes, I agree with that. And again I&#8217;ll refer to Bloomberg, FactSet, S&amp;P, et cetera, because that&#8217;s the world I know, which is, public company filings are public, right? You can go and you can download Apple&#8217;s or anybody else&#8217;s 10-K or 10-Q or whatever other filing. Cool, neat. There is some value there in being able to access that and make it easy to consume and blah, blah, blah. No discount to that. And there&#8217;s actually cool open source people doing exactly that now, which I&#8217;ll come back to. But yes, it&#8217;s the work that they do to organize it, right? To your point on semantics and governance and all this stuff, it&#8217;s that organization that actually gives value to the data.</p><p>It&#8217;s not about just serving it up, it&#8217;s about making it usable with other stuff, or potentially integrating it, so on and so forth. And is that all hardcore domain expertise? Some of it, not all of it. I imagine in the legal space it is far more. But the part I would want to come back to is that getting access to the data, and the value provided in that, is just operationally dramatically lower, massively lower.</p><p>I don&#8217;t think anybody disagrees with that. 
I think that integration, that domain expertise, has also become more accessible. Even though, in that particular case, and maybe it was LexisNexis or something, they have all these domain experts there, I do also think that LLMs have come a long way. They can infer a lot of these things. It doesn&#8217;t mean they&#8217;re perfect, but it does mean that we&#8217;ve accelerated what usually took hundreds or thousands of individual people to fill those gaps. So the moat has diminished significantly.</p><p>Do I think that there&#8217;s still differentiation or IP and domain expertise? I absolutely do. From things I&#8217;ve been talking to people about, I&#8217;m very comfortable with that idea. There&#8217;s a very basic idea, right? Not everything we know is written down, period. It&#8217;s just not written down, or it&#8217;s a little bit more out there. And at the end of the day, large language models, for the most part, gravitate towards a mean. That&#8217;s by definition what they do. Do I think that means they can&#8217;t learn these things? Not really. And I&#8217;m using learn very loosely in that context. Do I think it means they can&#8217;t learn these things?</p><p>Not necessarily, but I do think the tribal knowledge, et cetera, this is what people are trying to do with RAG and everything else, right? It&#8217;s just, how do I shove my tribal knowledge into the thing so that it has the things it needs to do this. But yeah, that is definitely what we&#8217;re talking about.</p><p>I think that the expectations and obviously what you can accomplish today are wildly different. And I think that especially from what you can do, the bar is tremendously lower. But I think the expectations are wildly off the chart. I think to the earlier point, unstructured versus structured. There was a period of time when shoving a PDF into pick-your-favorite-model, whatever that was, sucked a lot. 
Even if the PDF was pretty modern and you could extract the text, great, cool, it was still not great. And we&#8217;ve come a substantial distance from that, where I, on a regular basis, am processing PDFs just as things I&#8217;m doing as part of my day to day. And I really like Gemini. I&#8217;ll plug them. I don&#8217;t mind. I really like Gemini&#8217;s Pro model for the vast majority of things that I plug in there, and bear in mind, these are all kinds of interesting financial statements from a variety of different providers. It could be org charts.</p><p>I&#8217;ve found org charts in there, which has been cool, diagrams, all kinds of weird shit. And they, with remarkable accuracy, just pull it off. And my favorite part is that, and not that I&#8217;m doing this professionally, but the first thing I always try and do, just for absolute giggles, is I drop the document and I say, give me everything. And then I&#8217;m just like, let&#8217;s see what happens. Fuck it. Let&#8217;s see how well it understands things. And I&#8217;m continuously surprised because it wasn&#8217;t that long ago that you&#8217;d have Tesseract or one of these other platforms that&#8217;s very widely used and widely accepted.</p><p>And even if you trained a model like Doc AI or any of these things, they were good, but they weren&#8217;t that good. I couldn&#8217;t just drop random shit in and be like, hey, gimme the stuff. And that&#8217;s awesome.</p><p><strong>Shane</strong>: I remember zero days when we talked about dropping an invoice or a receipt and having it just turn up in your accounting system with accuracy. And this is 10, 15 years ago from memory. Back then it was a hard, complex problem.</p><p>Now it&#8217;s not. One of the things I think we&#8217;ve gotta be really conscious of is this idea of blast radius.</p><p>So what I used to always say in the data space, ghost of data past, is I&#8217;ll work on a marketing campaign. 
Because the data we&#8217;re gonna get is gonna be crap, and therefore the results we&#8217;re gonna get are gonna be okay for moving a lever, but they&#8217;re not gonna be accurate. But if I was working on pharmaceutical data for a drug trial, then it has to be different, because the blast radius of getting that wrong is somebody dies.</p><p>The blast radius of an incorrect marketing campaign is you spam somebody. And I think this is gonna be the same with using AI against data services. You&#8217;ve probably heard about the one where a lawyer used AI for the thing to the judge.</p><p>And there&#8217;s a case in Australia where Deloitte had to pay back half a million dollars to a government agency because they used AI to generate their very expensive review document, and it hallucinated a whole lot of case studies. So I think we&#8217;ve gotta be careful about where we use it.</p><p>But I can see that the domain knowledge it has now from all the data and tribal knowledge it stole is useful, right? If the blast radius is acceptable, then actually it&#8217;s good enough to look at that, apply some context, and it&#8217;s cheaper and faster than a human doing it, and the impact of it being non-deterministic and getting it wrong.</p><p><strong>JT</strong>: But on that note, and we can tie this easily back to governance among other things, in the world that I&#8217;ve come from it is very easy, in a lot of contexts, to generate, I&#8217;ll use the Deloitte example because it&#8217;s fun to pick on them, a report and just shove it out the door. Why do you do that? Fuck if I know, but it&#8217;s very easy to do that. And this is my current complaint with vibe coding as well, right? It&#8217;s just that people hit the button, they say, ah, it&#8217;s fucking magic. 
And then they shove it out the door, and it&#8217;s, for the love of God, do you read your PRs?</p><p>Do you edit your own writing? Please go back and read what came out of the random black box. Places I&#8217;ve worked, especially when we were at Palantir, were amazing at shipping things fast, right? But I was with a bunch of people, I won&#8217;t say where.</p><p>I&#8217;d have to kill you if I told you where. And they were phenomenal engineers, very good at understanding the problem and writing code very fast. And I caught a couple of times where I&#8217;m like, guys, just read each other&#8217;s stuff. I promise it doesn&#8217;t take that long. And then we got there really quickly. It wasn&#8217;t a big deal, and they&#8217;re, again, all exceptional. So it was a fairly easy thing to do. But I know tons of people that are engineers that don&#8217;t read their PRs or don&#8217;t have automated checks in place, or no linting, all this.</p><p>Do you use Grammarly? If you write, do you spell check shit? For the love of God, if you&#8217;re writing a paper that has citations, fucking check the citations. This is basic stuff, and I think that people are getting so jazzed about it, thinking it&#8217;s literal magic, and just forgetting everything.</p><p>They&#8217;re like, what planet am I on? Hit the send button. Go for it. And it&#8217;s just, yeah, you saved 90% of the time that you would&#8217;ve otherwise spent writing it. You can spend another couple minutes just making sure it didn&#8217;t spit out garbage. And I&#8217;ll rant on this for two more seconds.</p><p>If it was a person, if you hired a genius-level person, let&#8217;s assume that you hired somebody tomorrow to help you with your job, that is a certified genius. 
Would you read their work after they generated it, or would you just say, okay, cool, and just submit it to your boss or your customer? I don&#8217;t give a fuck how smart you are. I&#8217;m reading what you put together, right? I need to know what it says. I need to know what I&#8217;m representing. I&#8217;m accountable. So why in any of these circumstances would you not check this stuff? It&#8217;s mind blowing to me.</p><p><strong>Shane</strong>: It&#8217;s an interesting one because, as we know, LLMs are non-deterministic. And so people go, how do you trust that it&#8217;s doing the right analysis? And Juan Sequeda had a great comment many years ago where he said, how do you trust the human?</p><p>And I sat there and I was thinking, yeah, the number of times I&#8217;ve seen a data analyst come up with a number and nobody peer reviewed it, and we trust it because a human wrote some code. And I suppose the code&#8217;s deterministic, you can go and see the code and run it time and time again and get the same response, right or wrong.</p><p>But that&#8217;s not the point. The point was you trusted that number, made a business decision, and nobody peer reviewed the process or the code. And I&#8217;ll go back to one thing with AI vibe coding at the moment, if you wanna see how dangerous it is. We&#8217;re a Google platform, so I love Google.</p><p>Actually I love their platform. I love their technology. I hate their partnering and I hate their marketing &#8217;cause it&#8217;s the worst in the world. 
But anyway, just go on to Reddit, onto the Google Cloud subreddit, and watch how many students are vibe coding with the Gemini API and pushing their code to a public Git repository and then getting whacked with a $3,000 to $30,000 bill within two days because their API key is publicly available and people are just scanning Git repos and grabbing those keys and slamming them.</p><p>Somebody should run their eye over it. You just watch the unintended consequences of this democratization. But let&#8217;s go back to that image one, &#8217;cause it&#8217;s interesting for me. So one of the things we did with one of our customers a while ago, they were moving from a legacy platform to a new data platform.</p><p>And they had, oh, I can&#8217;t remember what it was, but something like 8,000 Cognos dashboard reports. So they were built over 20 years and </p><p><strong>JT</strong>: People watch every single one though too.</p><p><strong>Shane</strong>: They&#8217;re all active. When they asked which ones could we get rid of, the answer was none. And they&#8217;d spent a couple of months with a small team of really good data analysts and BAs trying to document them.</p><p>So all they really wanted to understand was how big&#8217;s the data estate, right? How many of these do we have? What do they look like? Which ones do we migrate, rebuild, or migrate first? Which ones don&#8217;t? And what we ended up doing is we ended up building a tool called Disco. Effectively they exported out all the Cognos reports as XML.</p><p>Yep. That was definitely a disco </p><p><strong>JT</strong>: I&#8217;m dancing for anybody </p><p><strong>Shane</strong>: Uh, that&#8217;s right. Yeah. I have a habit of making t-shirts. So we have a T-shirt for 80 80. The disco people can buy it online if they really can. 
And so what we did was they basically pushed the XML files to us and then Disco went and scanned them.</p><p>And we did a whole bunch of prompt engineering based on some patterns to say, what&#8217;s the data model underneath them? What&#8217;s the information product canvas? What action and outcome do we think&#8217;s been taken, right? So we generated all this context and then pushed that back into a database so they could query it.</p><p>And that worked, right? There was some engineering we had to do, because doing it for one XML file manually works, but to do it for a thousand repeatedly, you have to loop through. But the blast radius was small, right? Because really what they wanted was insight. And then what happened was they came back and they said, we documented the reports that were copied,</p><p>so this report looked like that report, but it had a new filter. And this report looked like that report, and it had one more column. So where people had just cloned the report, and that helped them dedupe. But they came back with an interesting question and they said, can you tell us which are reports and which are dashboards?</p><p><strong>JT</strong>: Oh, </p><p><strong>Shane</strong>: So, hmm. Okay. And there were some business reasons why they wanted to do that. So what we tested was them uploading screenshots of the reports and dashboards. And what&#8217;s interesting is, yeah, Gemini, and I think this was back then, this is pre-2.5 Pro, but even then it was good. It basically</p><p>did what a human did and said, if I see a single table of data, it&#8217;s a report. If I see multiple widgety objects, it&#8217;s a dashboard. And it went through and categorized them and I was like, holy shit, that just makes sense. 
And the other thing I&#8217;ve been doing is uploading the Information Product Canvas as a screenshot.</p><p>So building a canvas with a stakeholder, taking a screenshot of it and putting it into the LLMs and then saying, give me the metrics, give me the business model, give me a physical model. A whole lot of questions around it. In the past, what I&#8217;d do is scrape out the text for that and put the text objects in, whereas now I can just put an image in there and I get almost as good a response.</p><p>Now the key is the blast radius. What I want to do is understand, I want to understand quickly, versus I&#8217;m not gonna go tell it to build an information product and deploy it. But yeah, I cut out a whole lot of effort and it feels magical.</p><p><strong>JT</strong>: So I&#8217;m gonna do two things. One, yeah, image-based LLM use is awesome. I saw someone recently note how we&#8217;ve been talking about RAG, but doing it purely on image data as opposed to on the text itself has been wildly more beneficial. And that&#8217;s because there&#8217;s subconscious cues there, there&#8217;s things that we pick up on when you look at the layout of the text, what else is on the page, how the text is organized.</p><p>And it&#8217;s not just about looking at the text itself. And that&#8217;s been hugely beneficial. So whenever I do data file processing today, extraction, structured, unstructured, that kind of stuff, it&#8217;s all visual oriented. I try to avoid scraping entirely. Now, I&#8217;ll give you a kicker, which I don&#8217;t think is IP at all, but a kicker is Excel.</p><p>I can&#8217;t PDF an Excel document, that&#8217;s just extra stupid, right? You can&#8217;t do images of Excel. But Excel is a wildly interesting thing. This is where we can get into structured versus unstructured, right? Excel under the hood&#8217;s really what, XML or that weird format that it uses, right? 
In all, I would define that as semi-structured.</p><p>Other people may fight me on this, that&#8217;s fine. But I would define that as semi-structured. Because it has structure inherently, but it&#8217;s also variable in nature. So I consider that semi. But those documents are hard to understand because, hell, I&#8217;ve seen too many really shitty financial models that are just like 20 tables in one tab and I&#8217;m like, oh, for the love of God, why did you do this? Who are you and what kind of chaotic person? Organize your shit, like gimme a break. This is insane. No. Scrolling around thousands of rows left or right. This is wild. And you&#8217;ve seen these, like people build financial models in just the most ass-backwards ways on the planet. But visually interpreting them, assuming that you can get away without the pagination or anything like that, is very good. It&#8217;s way better from a visual standpoint. But I think that the big thing, and I&#8217;ll circle it back to the main topic here, is if you think about, now, there are companies out there that sell data, that have a data-oriented process, right? And that is their main revenue driver. And then you think about shoving a large language model into that process in some capacity, mostly probably because it&#8217;s unlocking some new features for you, or you&#8217;re moving faster, you don&#8217;t need humans, blah, blah, blah. In some ways it&#8217;s not really different than it was before. And I know that sounds really weird to say, but the reason I&#8217;m saying that is because if this is your product, you always had, whether you acknowledged it or not, a need to set up proper process to ensure you have a quality product. So for me, when I think about large language models and their use in any of these processes, it&#8217;s all process engineering. Yes, context engineering, blah, blah, blah, blah. But like it&#8217;s process engineering really that we&#8217;re talking about. 
And especially once you start talking about multiple agents, and I&#8217;m air quoting agents because that&#8217;s a different bag of tricks. But once you get multiple autonomous processes interacting, like, it&#8217;s all process architecture, right? I think the most common one that a lot of people talk about is the whole maker-checker paradigm of you make it, I check it. That&#8217;s how it works forever and always.</p><p>We don&#8217;t cross lines. That works reasonably well. There&#8217;s some sort of accountability structure and review system and so on and so forth. PRs have the same thing and people still put in shitloads of additional automation, but it&#8217;s process structure, right? So even if my entire data product, just for hypothetical sake, is me just shoving in a prompt and hitting play repeatedly and then sending it to somebody, you should still have some sort of review system. You should still have some sort of checks to make sure it&#8217;s not garbage. Every major manufacturing, et cetera. Everybody has this, and that&#8217;s why, like, I still think, the lawyer and the Deloitte example, I&#8217;m sure the PowerPoint Deloitte put out was huge and it had a billion references and it was blessed by the Pope and shit.</p><p>Like, I&#8217;m sure it had all this stuff, so it probably made it really hard. But we&#8217;re data people, right? Rip all the fucking things off, go cross-validate them with a deterministic process, flag the ones that don&#8217;t. Like, you can bootstrap your probabilistic process with deterministic shit.</p><p>It doesn&#8217;t have to just be like, everything&#8217;s tossed to the wind where you&#8217;re using a new tool, just Hail Mary and pray. 
It makes money and VCs will pay you. I don&#8217;t understand that mentality, we&#8217;ve been doing this for a long time.</p><p>I am definitely the first one out the door to use AI and LLMs for things, but it doesn&#8217;t mean I&#8217;m gonna let it drive my car or care for my kid. Like, I want some structure and controls around it, no different than a human, right? And I think a lot of people pooh-pooh the idea of treating LLMs as human, making them human-like. But I think that it&#8217;s a very good analogy. It corroborates my feeling towards vibe coding, which is, why in the fuck would you approach an engineer and just say, build me a website, and walk away and think it&#8217;s just gonna work?</p><p>Like they&#8217;re gonna build something. </p><p><strong>Shane</strong>: It&#8217;s even worse now though, because you read people saying, my boss vibe coded over the weekend and gave me the code and told me to push it to production. Like, there&#8217;s a problem. </p><p><strong>JT</strong>: I have a feeling that&#8217;s clickbait though.</p><p><strong>Shane</strong>: Yeah, probably. Although, I&#8217;ve met some managers. One of the questions I&#8217;ve got around that Deloitte thing though was, is the first prompt in their agent always you telling it what the latest shape to use in the document is? Like, we&#8217;re pyramids this week, or we&#8217;re circles, or we&#8217;re matrices, because you gotta change the shape of every document. That, that was dark as.</p><p>And by the way, </p><p><strong>JT</strong>: That&#8217;s where all the money comes </p><p><strong>Shane</strong>: the interesting thing is this idea of maker-checker. Is that what it, maker </p><p><strong>JT</strong>: Maker-checker. Yeah. It&#8217;s a process paradigm.</p><p><strong>Shane</strong>: Yeah. It&#8217;s around factory-ization. It&#8217;s about repeatability. It&#8217;s around deterministic.</p><p>And then we have artists, we have craftspeople that make things that are just one and done. 
They make it once and it is not deterministic.</p><p>It is probabilistic. It is a piece of furniture</p><p><strong>JT</strong>: I&#8217;m gonna fight you on this. I&#8217;m gonna fight you on this. I agree with you that if I am painting a painting, it is easy to think about it as, I&#8217;m painting the painting and I paint it and it&#8217;s over. But I don&#8217;t know, I don&#8217;t know if you do any art, Shane, but I cook, so I&#8217;ll relate this to cooking. Do you cook?</p><p><strong>Shane</strong>: Yes.</p><p><strong>JT:</strong> Do you taste your food while you&#8217;re cooking?</p><p><strong>Shane</strong>: Yes.</p><p><strong>JT</strong>: Good. That is a good thing you should do. It&#8217;s not exactly maker-checker, but if you have a partner or someone you&#8217;re cooking for, sometimes you have them taste it. It doesn&#8217;t mean you have to have them check it just at the end and you get to redo the whole fucking thing. But at least having some process that ensures you are not going off the rails entirely. I make a lot of random stuff. A lot of the stuff I make I&#8217;ve never made before, just because it&#8217;s fun, and I always taste it midway through because you never know what might go wrong, or there&#8217;s little tweaks, more salty, more spicy, too spicy, that kind of stuff. But the maker-checker paradigm, that is a very particular paradigm. There are probably some ways that that pattern is implemented where it&#8217;s, you finish your thing, give it to me, I review, say whether you suck or not, hand it back to you. There might also be versions of that where it&#8217;s midpoints, right? Like</p><p><strong>Shane</strong>: Okay, so let&#8217;s take that maker-checker idea and say that, in the process, even an artist, a craftsperson, checks it themselves, right? They may have another person that&#8217;s as experienced as them, or, but they are checking, right? 
They&#8217;re always checking their work.</p><p><strong>JT</strong>: Hopefully to some degree. </p><p><strong>Shane</strong>: And within gen AI, I have three types.</p><p>I have what I call ask AI, which is where you ask it a question, you get back a response, you ask a question and you&#8217;re chatting with it. And then you go off and make the decision as a human, right? And ideally get another human to check your work. Assisted AI is where it&#8217;s watching what you are doing and it&#8217;s going, based on what I know, you might wanna think about this,</p><p>so it&#8217;s prompting you. You can listen to it, you can ignore it, but you carry on, you finish that task. And automated AI is where the machine does it and no human&#8217;s ever involved. Yeah. It just happens. And so if we take that idea of PDFs, and if we think about code being deterministic and LLMs being probabilistic, and we think about if I wanted to just upload a PDF and get some tribal knowledge back about it. That is a probabilistic problem. </p><p>I can put it up there. It&#8217;ll give me some stuff. I&#8217;m in the loop. I&#8217;m gonna make some decisions. So therefore it&#8217;s an ask kind of feature and the blast radius of me getting it wrong lives with me,</p><p>and I am my maker-checker paradigm. If I was automating that PDF to go into my finance system and put in the number, then maybe I move to an assisted model. I upload it. The machine identifies all the fields for me. It comes back and goes, this is the invoice number, this is the tax amount, this is the total amount, this is the supplier.</p><p>You happy? Click go. So it&#8217;s a system, it&#8217;s automating all that rote shit, but I&#8217;m still making the final call of, yeah, that&#8217;s right.</p><p>Versus if I take it to fully automated. That&#8217;s when I&#8217;m dropping in a thousand invoices. They&#8217;re going into my finance system and a payment is being made, and I am not in that loop.</p><p>At that stage. 
In my head, I go back to code, I go back to deterministic code that is looking at specific places on the layout of that invoice and saying that is the number, and if there&#8217;s no number there, don&#8217;t take it from anywhere else. And so I would say at the moment, I would not use an automated gen AI solution in that use case.</p><p>That&#8217;s only because I haven&#8217;t tried it lately. Like you said, when I uploaded images two years ago, it sucked. I uploaded them a year ago, it got amazing. I haven&#8217;t done it this week. It&#8217;s probably gonna freak me out. So where would you sit, right? When do you jump from assisted, human in the loop,</p><p>maker-checker, to let the bloody thing take this unstructured or semi-structured data and just human out of the loop?</p><p><strong>JT</strong>: I&#8217;ll say a couple things. One, there&#8217;s the very obvious part that I&#8217;m gonna state because everybody&#8217;s gonna say crap about this, but security, obviously there&#8217;s a security element to this. We&#8217;re talking about financial statements, blah, blah, blah. Let&#8217;s remove that just for argument&#8217;s sake. So I&#8217;m gonna repeat again, we are removing the security concern here and the data privacy and all that shit, just to have a hypothetical conversation before people are like, eh, privacy, blah, blah, blah.</p><p><strong>Shane</strong>: But hold on. What&#8217;s your definition of security?</p><p><strong>JT</strong>: Ask you, go call, call. I&#8217;m gonna give you Joe Reis&#8217;s phone number. You can call him. He loves to debate semantics. Anyways, this is gonna sound funny. I don&#8217;t know where that line is, and I am actively and consistently trying to do it the automated way as much as possible, and I often equate a lot of these things to, like, meditation, right? This is all about building good practices. 
You have to set up the conventions in your mind, build the muscle memory to do the thing that you may not have done before. I&#8217;m with you, it&#8217;s very easy for us as engineers to think about, hey, I wanna rip this one cell.</p><p>I know it&#8217;s in the same place every time off this document. Write the code to do it, right? And don&#8217;t get me wrong, there&#8217;s an over-engineering element to this. Throwing an LLM at a problem like that is definitely a bazooka at an anthill, like a hundred percent. There&#8217;s also a time to market, I&#8217;ll call it, component of this, which is how fast can you write that code compared to how fast I can go to a website and upload a file and ask a question.</p><p>I&#8217;ll quick draw with you and I&#8217;m willing to bet that I&#8217;ll win. And I&#8217;ll still get the same answer, right? And then there&#8217;s a middle ground if you really wanna fuck around, which is have the LLM write the code to do the deterministic thing. That&#8217;s a whole nother thing. Like, I don&#8217;t have to have the LLM just do the work. I can have the LLM write the process to do the work. And then you get a little bit of, a little bit of both, because it&#8217;s a deterministic process that was generated very fast. This whole name of the game is speed: how fast can I do X activity?</p><p>That&#8217;s our North Star. If we&#8217;re talking about expense parsing, I literally just did this the other day, right? I dropped a PDF onto a platform and it read in the expense. Cool. Did I validate it? I didn&#8217;t. Actually, that&#8217;s not entirely true. I knew what the number was before I dropped it in and it happened to produce the right number.</p><p>And I was like, cool. So that&#8217;s my pseudo maker-checker. Most of these platforms these days. And you made a comment before about trust. Trust and determinism, trust and code, right? Especially when we think about people. 
A lot of that trust is just based on transparency. Transparency, auditability, the ability to go back. And this is a very common paradigm in data. Like, how do I roll something back? How do I undo something? We were talking about this yesterday or the other day, right? Especially, version control has given us this wonderful sense of security. I can go back, I know what happened, blah, blah, blah, blah, blah, I can yell at Shane &#8216;cause Shane fucked it up. So we have blames. I think that, in this world where an LLM can do the work for you, again, from a cutting down time perspective, it does cut down the time. Maybe I put the document side by side with it, which is very common in the old platforms and the new ones. Great. Look at the document. Here&#8217;s the value, here&#8217;s where the value came from. Now, looking at a form, let&#8217;s just make this more complicated &#8216;cause it&#8217;s fun. If it&#8217;s a financial statement, and there are a bunch of companies out there that just straight do this.</p><p>This is their business, right? Talking about AI data services. Their whole job is to take documents and PowerPoints, I won&#8217;t name names, but documents from investment funds as an example, and pull off the values. Now there&#8217;s definitely an intelligent person out there saying, why the fuck don&#8217;t they just put the values out on an API?</p><p>And that&#8217;s a great question, but that would be logical and God knows none of this shit makes any sense. I know a lot of them put out these documents and they&#8217;ve probably got fucking pictures on them and all kinds of stuff, and hell if they&#8217;re all the same, they&#8217;re definitely all different, because why would they be the same?</p><p>That would make sense. There&#8217;s whole businesses centered around just ripping these documents, doing the OCR, whatever. It&#8217;s one thing if it&#8217;s a table. And this kind of goes to the FactSet, Bloomberg stuff too. 
If it&#8217;s a table and you&#8217;re just like, I always want cell one, column one, row one, give me the number every time. Not super complicated, not a lot of assumptions that need to be made. And I think assumptions is one of the big things I think about when I think about LLMs and probabilistic patterns and things like that. And I can give you my convention there, but the number of assumptions it needs to make also relates to how much context you give it, how clearly you can describe the things it needs to do. So my usual grid of this is, on one side it&#8217;s the number of decisions that you&#8217;d need to make, and then the other side is how much information you&#8217;ve given it. That&#8217;s any process.</p><p>It&#8217;s straight up any process. It&#8217;s for a human or a machine or anything at all. I could say, Shane, go make me a cake. I&#8217;ve given you no information. You have to make shitloads of assumptions. You could make a cake out of mud and theoretically you&#8217;ve delivered, you&#8217;ve given me a cake. Or you could have made me a Lego cake for all I care.</p><p>That will also suffice. But that doesn&#8217;t necessarily meet what I have for expectations. So again, long-winded piece here, but if we did something more complex from a document processing perspective, like, hey, give me the revenue, and the revenue&#8217;s in the cell, but then there&#8217;s four adjustments in footnotes and 12 other little things that you need to take into consideration, that&#8217;s how it gets complicated real fast. And we know in financial statements, this happens all the time. Like, accounting for a lot of people seems like it&#8217;s a very rote thing, and it is actually way more creative than you realize. So yeah, that&#8217;s where this stuff gets creative.</p><p>So to my process in doing things like this, I always start with small tasks. I always try to automate it if I can, if I&#8217;m comfortable with the security, et cetera. I always try. 
I have to start there to understand the bounds of what can be done or not done. And sometimes it&#8217;s process architecture, to repeat that, or sometimes it&#8217;s how much information, am I clearly articulating the instructions, which is prompt engineering for lack of a better term, which is, for humans, just fucking communication, which is always comical to me. I think that there&#8217;s no balance in some respect. You should always be trying to see what it can do. Again, I don&#8217;t think we&#8217;ve fully adapted to understanding I can use this every day. And that&#8217;s why I think it&#8217;s helpful for a lot of us to think of them as humans, because it&#8217;s just, oh, if I had an assistant, you&#8217;d know instantly what you&#8217;d have it do, short of pick up my laundry, which it can&#8217;t do.</p><p><strong>Shane</strong>: I think there&#8217;s an interesting kind of thread in there that I&#8217;ve been thinking about, and I&#8217;ll just unpick it. When we built our data platform, because we&#8217;ve been building data platforms for years as consultants, there were a bunch of core patterns that we knew were valuable.</p><p>And because we pay for the cloud infrastructure and cost, not our customer, we are hyperfocused on the cost of that Google Cloud stuff, because it&#8217;s our money. We log everything. Every piece of code that runs, we log the code that was run, how long it took. We have this basically big piece of logging where we can go back and ask questions.</p><p>Oh, are we seeing a spike on the service? Which customer or customers? Is it one customer? Is it a volume problem? Is it a seasonal thing? Is it across all customers and Google are changing their pricing model, which they do on a regular basis? And so that&#8217;s just baked into our DNA. Yeah. Log the shit out of everything.</p><p>&#8216;cause at some stage we&#8217;re gonna have to go and ask a question of those logs and we need that data. 
When we started moving into AIing with our agent ADI, we logged some stuff, but not everything. And as we saw our partners start to use her for really interesting use cases, they started saying, here&#8217;s some source data.</p><p>What kind of data model should I start modeling out? &#8216;cause we haven&#8217;t seen it. They have a piece of data and they wanna transform it. And so they&#8217;re saying, what we call transformations, we call &#8216;em change rules. What should the change rule look like? Talk me through how to create it.</p><p>And we saw those questions happening &#8216;cause we were logging the questions and, we started doing more context engineering to give her better access to things that gave her better answers. And what we didn&#8217;t do was, when we moved to the, and air quotes here, reasoning models,</p><p>we didn&#8217;t ask her to log the reasoning path. </p><p>Every time we asked her a question ourselves when we were testing something, we would ask a second question straight away of, how did you get that? Because we wanna understand where she&#8217;s grabbing the context, the prompt from. &#8216;cause that&#8217;s what we wanted to tweak.</p><p>And what was interesting is, our principle was log everything up the kazoo, and yet we move into this AI world and we didn&#8217;t do it. And of course, once we saw our partners asking questions, we&#8217;re like, how the hell did she get that answer?</p><p>&#8216;cause it&#8217;s not the one we wanted her to have. And then we ask her the same question and we get a completely different answer. We&#8217;re like, okay. And so then what do we do? We just put, into the prompt or into the framework, effectively, log the path for every question you answer. And now we have a richness.</p><p>So that&#8217;s the first thing, is it&#8217;s really weird how as data professionals, we have these baked-in principles and patterns for our entire life. 
And then as soon as we move into this AI world, somehow we forget what we always do. The second point&#8217;s around complexity. And so one of the things I do when I&#8217;m coaching teams is I will ask them to draw me a map.</p><p>Draw me a map of your architecture, draw me a map of your workflow, draw me a map of your data sources. Like, just draw me maps, because I&#8217;m a visual person. And essentially what you say about giving an LLM a layout, a map, and distance between things, similarity or clustering of things.</p><p>They are visual indicators that, as humans, we&#8217;re really good at using.</p><p><strong>Shane</strong>: And so when I&#8217;m working with teams, the reason I want a map is the number of nodes and links. The number of boxes, the size of the map, will gimme an instant identification or understanding of complexity. You have 115 source systems there that are going through 5,000 dbt pipelines.</p><p>I&#8217;m gonna go, that is a complex problem. You show me your team structure and it&#8217;s got four layers of teams all handing off to each other and they&#8217;re in pods and squads and there&#8217;s 150 of them. You&#8217;ve got a complex business organization and team topology.</p><p>And so what&#8217;s interesting, because I was just thinking about it as you&#8217;re talking about it, is if I take those reasoning paths and I basically do some simple statistics and I cluster two things. How many steps did it take? That will infer the complexity of the context it&#8217;s trying to use in the task it&#8217;s trying to do.</p><p>And the second one is clustering around reusable paths. Where it&#8217;s constantly doing the same thing, that means that is almost a deterministic behavior, because it&#8217;s constantly going through the same path. Where we see an edge case, an outlier, where it&#8217;s gone through a completely different path, we are like, ooh, is that because, different question.</p><p>You just went and hallucinated for some weird ass reason. 
Or, yeah, there&#8217;s something interesting there because it&#8217;s different, and we know different has value, which comes back to one of the things around evals. So there&#8217;s lots of work, and it&#8217;s a new hot thing in the market, on how do I eval my, </p><p>Yeah.</p><p>We used to call it tests, right? And we know what data people are like with testing. Go back to my point of principles and patterns that we apply for our data work every day and then in the AI world we don&#8217;t. Let&#8217;s talk about the ones we never apply. And now you get this whole idea of judging. So the idea of, if I ask the LLM the same question four times, and then I go and determine whether the answers are similar, and if they are the same or very close, then I should have more trust in the deterministic capability of that answer.</p><p>And that&#8217;s presumably, I haven&#8217;t actually bothered to go and see any research papers to say if that&#8217;s true or not.</p><p>But is that what you do? Do you tend to use that as an eval process when you are doing all this work to apply AI to reduce your effort?</p><p><strong>JT</strong>: Yeah. I always try and set up test harnesses of whatever kind, and I put evals in that same bucket. I agree. And in our book club, we&#8217;re reading AI Engineering by Chip, which is a great book, talking all about the different evaluation methods and types, et cetera. I agree with the fact that I think this is an under-invested area. I think it is an extraordinarily important area. And I equate it to testing in code. I think that there&#8217;s a ton of code out there that isn&#8217;t tested for whatever ridiculous reason. And it&#8217;s always, people wanna move fast.</p><p>There&#8217;s businesses, blah, blah, blah. But even if that means ad hoc testing, which is what you&#8217;re noting right now, run it a couple times and see the answer, that&#8217;s better than nothing. I&#8217;ve done a variety of methods now. I do LLM as a judge. 
I am still back and forth on this. So, short version of this, for anybody else who&#8217;s unfamiliar, it&#8217;s effectively kinda adversarial: I have a system that generates something and then I have another LLM that more or less evaluates it.</p><p>I&#8217;ve done it with different LLMs evaluating others. So I might have Gemini as my model and I&#8217;ll use Claude as my evaluator, or multiple evaluators, things like that. Or have it ask different questions from an evaluation standpoint. I&#8217;ve done that a variety of different ways.</p><p>That&#8217;s been very useful. I think anything where you can run your tests at scale. Scale, that&#8217;s a very overloaded term. Anything where you can run a lot of tests, not spot tests. A lot of tests and a lot of extreme tests. I always tell people the QA joke about the bar, you know that joke, right? </p><p><strong>Shane</strong>: I may do, but tell it in </p><p><strong>JT</strong>: I&#8217;ll tell it anyways. I&#8217;m gonna butcher it. But the joke goes something like, somebody builds a bar, QA tester tests the bar: orders one beer, orders 99 beers, orders a million beers, everything works fine. First guy in the door says, where&#8217;s the bathroom? Bar explodes. Great joke. Very appropriate in this circumstance, right? You don&#8217;t know what people are gonna ask. And I understand that large language models obviously are an NLP thing. It works on natural language, blah, blah, blah, blah. There&#8217;s many things in that case that are open-ended, so to say. And I think that the chat paradigm, especially as a UX paradigm, introduces a wild world of open-endedness, and anything can happen. And I talk to people that are building, again, AI products, and I&#8217;m like, do you want me to come to your product and ask you what my favorite color is? Because what&#8217;s your thing gonna do then? 
You may have built this wonderful chat bot for finance or something, and I&#8217;m gonna walk up to it and be like, how do I make a chocolate souffle? And it might give me a wrong answer. I might be like, this AI is terrible. I walked up to my finance AI and asked it where to drive to for lunch and it didn&#8217;t work. I was like, no shit, but there&#8217;s a piece of that. I was just like, it&#8217;s not intended to work. So the open-endedness is pretty wild. But I do appreciate, there&#8217;s a bit of, this is just like almost chaos engineering, which is just, I just wanna, like, just blast it with random stuff. I also think that, having a prebuilt dataset, you should have some tests almost like unit tests to some degree. Ask it a question, you should get an answer. Then the next problem, to your point, is how do I evaluate whether the answer&#8217;s correct or not? And that is hard in and of itself. But there are conventions, ways you look for certain keywords, values, things like that. So I do think that evaluation&#8217;s massively important. It&#8217;s very easy to spot check a couple questions and just be like, yeah, it&#8217;s good. Move on with your life. And then you&#8217;re handing people free cars because they fucked with your chat bot. So it&#8217;s massively important. And I think that the more we can, in a manner of speaking, beat the shit out of the machine and, like, literally test it to the nth degree. Even if you are using a language model, I think that&#8217;s helpful because the language model&#8217;s gonna come up with shit you probably didn&#8217;t even think of. Like, jack the temperature up and just tell it, ask random questions, and make sure it doesn&#8217;t ask the same question twice, for all I care, and then you, and this is our maker-checker a little bit, can go review and say, hey, does this make any fucking sense? Or did it just spit out random garbage? I think all that&#8217;s super important. 
Like, evaluation I think is very understated right now.</p><p>I think people are catching on a little bit, but you said it before, it&#8217;s very comical right now that we&#8217;ve got this new toy and everybody just forgets who the fuck they are and where they are, and they&#8217;re just like, yeah, cool. Let&#8217;s do it. You&#8217;re just, you&#8217;re,</p><p>Yeah. You&#8217;re a seasoned professional. </p><p><strong>Shane</strong>: Ah, and an organization. I was talking to somebody the other day and they&#8217;re in a large organization where security is key. And then they were saying that with their cloud provider, somehow the LLM part of the platform got turned on for the whole organization.</p><p><strong>JT</strong>: Ooh, fun.</p><p><strong>Shane</strong>: Wow. And so it&#8217;s interesting how this AI thing seems to change behavior.</p><p>People, you, and again, </p><p><strong>JT</strong>: That case might be a mistake, just like a human error.</p><p><strong>Shane</strong>: Yeah, maybe. But you do see organizations that care about security and governance and privacy, and then you see people in those organizations grabbing things and putting them in their personal LLM, &#8216;cause it&#8217;s so easy now to take a screenshot. And yes, they&#8217;re doing the wrong thing, but it&#8217;s amazing how human behavior is so different. When we talk about evals, we&#8217;ve gotta go back to complexity. Let&#8217;s just go back to data. If I have a piece of data and I have a piece of code and I have an assumption or an assertion, I can define what the assertion is.</p><p>I can run that code against that data, and the data doesn&#8217;t change, the code doesn&#8217;t change, and it can tell me whether that assertion is right or wrong. That&#8217;s three moving parts in that test suite. Why we don&#8217;t do that a lot really intrigues me. Now let&#8217;s talk about that within gen AI.</p><p>Oh, actually no, hold on. I&#8217;ve got. 
The data or the thing that I&#8217;m using, right? The input PDF, piece of data. I&#8217;ve got the question that I ask it, which is effectively the code to a degree, and I&#8217;ve got my answer, my assertion that I expect. Ah, but actually I&#8217;ve got an LLM model, right? Which may or may not have been updated by the vendor without me knowing about it.</p><p>Ah, I&#8217;ve got a prompt. I&#8217;ve got a piece of text that&#8217;s actually embedded in that process that may or may not have changed or be interpreted differently. Ah, I&#8217;ve got RAG or context, I&#8217;m pushing at some other stuff. That may or may not have changed. Ah, the PDF&#8217;s exactly the same except the dollar value for the invoice moved down 25 pixels.</p><p>Now that&#8217;s not gonna make a difference, or does it? This is where we get to, is that actually now if you do nodes and links for the process, we have lots more of them. And therefore the complexity of what we&#8217;re testing, the things that can change, is massive. And talking about change again, one of the human natures I&#8217;ve found is once I start using a model or a tool, I get stuck.</p><p>I use ChatGPT a lot for writing, because that&#8217;s what I started out with. I use Perplexity for searching now, over Google. So I Perplexity now, I don&#8217;t Google. We use Gemini for our platform because we&#8217;re bound to the Google Cloud. I use Claude a little bit for MCP testing, but I don&#8217;t use it a lot because I don&#8217;t code, and I&#8217;m sitting there going, I can&#8217;t remember the last time I actually tried to work out whether there are better models for what I want to do.</p><p>For the thing that I use ChatGPT for. So how do you deal with that? 
Given the models are changing all the time and that the models tend to be good at specific tasks, how often do you reevaluate the tool or model that you are using in the work that you do?</p><p><strong>JT</strong>: Not as much as I think I should. But I evaluate the new ones as they come out. So when GPT-5 came out, like I reactivated my OpenAI subscription. I really don&#8217;t like OpenAI. I don&#8217;t know what it is. Maybe I just like, feel like they&#8217;re like the evil empire or something.</p><p>I, like everybody else, started using their products when they came to market and even before the chat days. And I&#8217;m not gonna be that guy that&#8217;s just oh, I&#8217;ve been doing this before. But I used that stuff before and it was interesting and I was very curious about it, especially given like the work that I do. When ChatGPT came out, I was definitely all over it and I was very fond of it. So I will tell you that very often, for understanding information, I use Gemini. I appreciate that one. I am a Claude fucking junkie when it comes especially to coding. And I&#8217;m a heavy Cursor user, like very heavy Cursor user. And I will tell you my favorite thing that has come out, period. No offense to you, Shane, but I have definitely done some work while we&#8217;ve been talking, right? And this is my favorite thing. I have my ticketing system wired up to background agents for Cursor.</p><p>So like I write reasonably verbose tickets that explain exactly what I want it to do. And for context, like the scope of these tickets is like, hey, you were taking in parameters one, two, and three. I don&#8217;t want that. I want it as one parameter that looks like this. And then I want you to do this with that. That&#8217;s the scope of tickets that I&#8217;m writing because I want to be in control of it. And it&#8217;s like a micromanagement approach. I write the ticket, I say go and it writes it. And I just keep on with my day. 
So I could be like walking my kid or my dog or somewhere and I&#8217;ll just fire fucking tickets off.</p><p>I&#8217;ll be working all the time. It&#8217;s amazing. That&#8217;s my favorite feature by like an absolute mile and a half. And I love Claude 4.5. I think it&#8217;s amazing. And the code quality&#8217;s phenomenal and I&#8217;m probably paying them a disgusting amount of money. But I think it&#8217;s totally worth it right now because I can move epically fast and multitask.</p><p>So Gemini for information, Claude for code, OpenAI for nothing unless I like, just want an alternative opinion. And I definitely consider them all like different people to some degree. I very much think of them like people, they all have their different quirks.</p><p>Some things are good, some things are bad. Some things are like, especially from a code perspective, some of the models are more overeager than others. Some of them wanna try and cover more edge cases than others. And sometimes I&#8217;m just like, no, just fucking do what I told you to do. I don&#8217;t need you to do 20 other things.</p><p>And I have all kinds of rules set up. So I have it set up in a way that it will do the extra things I told it to do and not the things I don&#8217;t want it to do, to some degree, which is very nice. I&#8217;ll make a separate point. And I&#8217;ve always been this way for better or for worse, but a healthy degree of paranoia I think is always good as data professionals, like healthy paranoia. There is unhealthy paranoia, but healthy paranoia, I am a big believer. And I&#8217;ve met many people in my career and I think a lot of us agree with that level of just distrust, inherent distrust. And it&#8217;s not to say that we don&#8217;t trust everybody. I actually think I&#8217;m a reasonably trusting person, which I&#8217;m sure someone could take advantage of me just hearing that. But I still look up the things that I asked the LLM to do, to put this into context. 
I still crosscheck things. When I write code with LLMs, I read all of it, like all of it. I see my role as: I am the reviewer, I am the checker. I always joke with people, it writes faster than me. That&#8217;s why I use it. It types faster than me, but I am always reviewing what it did. If it does extra things, great, brownie points.</p><p>Take some, extra credits so you can, whatever, I&#8217;m very content with that. But I absolutely read everything as much as humanly possible. And sure, I do random experiments where I just tell it, build me some shit and I don&#8217;t really care. And that&#8217;s a fun experiment. But I&#8217;ve worked on a project recently that I&#8217;ll tell you about where, without going into too much detail, the project wrote something in a language I&#8217;m less familiar with and I read some of it and I definitely spent some time. I used an LLM to do this and to teach me about the other libraries, teach me about the other things that I&#8217;m just not as familiar with. So I spent some time doing that in different LLMs, different context windows. So like hopefully there&#8217;s some, call it arb that&#8217;s happening there where I&#8217;m hopefully getting the right information.</p><p>I also do Google things still. I don&#8217;t Perplexity things, I think I Perplexity things just sometimes &#8216;cause I wanna see what the answer is. But I do still Google shit. &#8216;Cause I&#8217;m not gonna trust it unless it&#8217;s like, if I asked it to write something in Beam, I&#8217;m gonna go to the fucking Beam website and check to see that it did it right.</p><p>And especially from a code perspective. And they may have fixed this, but my favorite like red flag when reading somebody else&#8217;s generated code. One of my favorite red flags. There&#8217;s plenty of them. The comments and all this other shit. 
My favorite one for a long time was that, especially when you had it generate an AI model or something that used the language model, it usually wasn&#8217;t trained on enough information to know what the latest model was.</p><p>So it was always a really big dead giveaway when you generated code to use like Gemini, and it puts 1.5 in there. And I&#8217;m like, you didn&#8217;t read this. I know you didn&#8217;t read this because it&#8217;s using a model that we all know is a year or two old now, but you didn&#8217;t spend any time just looking. Like that&#8217;s a really basic thing to look at. And you didn&#8217;t even look at it. That&#8217;s usually demerits for that person. That&#8217;s a very glaring thing for me because it&#8217;s just, read the code. It&#8217;s not that hard, I promise. Even if you skim it, it&#8217;s not that bad, it&#8217;s better to do that.</p><p>So to the point of do I evaluate the tools, do I do that? Yeah, I absolutely still do. I definitely still spend time learning about the stuff outside of my use of it as well. So that way again, same as a person, even if you meet the smartest person on the planet, I still wanna have my own authority.</p><p>I still want to fact check people. Maybe there&#8217;s a very existential philosophical comment here about fake news and the internet, or the web, sorry, Juan, the web. So I&#8217;m a big proponent of cross validating things. I just believe that, and sure, not everybody has time to do shit like that.</p><p>I definitely don&#8217;t do it for everything. There&#8217;s some things I just take at face value, but I do believe in that. And I have a little kid right now and I&#8217;m in the phase of why, which I love, I really love this phase.</p><p>I&#8217;m sorry, I think my wife hates it, but I love it because he asks why. He&#8217;s not at the annoying why phase, but he asks why for a lot of things. 
And I take it as an opportunity to go look them up and we talk about things. And if he asks a subsequent why, I&#8217;ll look that up too. I don&#8217;t give a fuck. I love looking things up. But you should feel more empowered to do so. And I feel like people just have a short attention span. They&#8217;re like, just don&#8217;t fucking care.</p><p><strong>Shane</strong>: I think you&#8217;re right. I think that idea of trust but verify. And coming back to your Gemini 1.5, they probably don&#8217;t know there is a Gemini 2.5 because they haven&#8217;t done the work. They haven&#8217;t done the reps to become an expert or experienced enough in it to know that actually there&#8217;s a difference between 2.5 Pro and 2.5 Flash.</p><p><strong>JT</strong>: Yep.</p><p>Or Flash-Lite, or there&#8217;s Flash with dates after them.</p><p>Like there&#8217;s all different fucking shit.</p><p><strong>Shane</strong>: Yeah, so just wanna look at time. I just wanna close out with two questions for you. So right at the beginning we were talking about the lawyer stuff and expertise and tribal knowledge. And given that all the LLMs have gone and stolen tribal knowledge that was publicly available and often privately available, do you reckon the world&#8217;s gonna move, that humans who have tribal knowledge are gonna stop documenting it in a way that it can be found?</p><p>Because actually the value now is tribal knowledge.</p><p><strong>JT</strong>: I wanna say no from like a humanity standpoint, but I do think that the world will become a lot more polarized. I&#8217;m gonna explain this in a weird way, and I&#8217;m not intending it. 
I know that there&#8217;s a political backdrop and everything as we&#8217;re recording this, but if the model inherently gravitates towards the mean, always gravitates towards stereotypes, that kind of thing, and we are having a feedback loop to ourselves on this, our own opinions are going to be potentially squeezed, like kurtosis and all that fun stuff, if you wanna get into that. Do I think that people will stop? I think people will use these tools in more automated fashions and then that will create its own probably problematic feedback loop. I do, however, think that the pendulum will inevitably swing the other way, where I think people will still use these tools, still use tech to create content that maybe they didn&#8217;t record before. I don&#8217;t have a Substack or any of this stuff. Yes, I&#8217;m talking to somebody about recording a podcast, which is why the fuck not even though we&#8217;re on or no. It&#8217;s mostly from I wanna help their business kind of thing.</p><p>But I think that hopefully technology will lower the barriers of entry to recording more. I hope that inevitably forces the pendulum to swing the other way at some point in time. And I remember reading an article recently, something about someone saying how there&#8217;s no second internet, there&#8217;s no second web. Really, again, I apologize. I like Juan&#8217;s perspective on this. It is the web, not the internet. And I actually make a mental note, like always, to correct myself. So thank you Juan for that. But there&#8217;s no second web. Like we&#8217;ve trained these things on so much information. At some point in time I do think we&#8217;ll start recording more and then there&#8217;s a very interesting dynamic of, okay, now everything&#8217;s recorded. To your point, tribal knowledge and all. What&#8217;s left? 
I&#8217;d like, and maybe this is overly optimistic and potentially naive, but I&#8217;d like to think that if we continue to use these tools in whatever capacity, we&#8217;ll advance our knowledge faster and therefore there will be more knowledge and more interesting ideas and creative ideas. And I&#8217;ll emphasize something different that is related to this question, which is: I am a huge proponent of the fact that, especially as the pendulum swings back and forth, I am very bullish on the arts. To your point of painting earlier, art is inherently creative. It&#8217;s inherently random. It&#8217;s inherently wild. There&#8217;s no predicting that. You can generate pictures and stuff.</p><p>We all know how shitty those come out. And some of them come out great, but for better or for worse, art is something I don&#8217;t think a lot of us understand. We still don&#8217;t understand how we work. And I think that in comparison to science, and I apologize to anybody that&#8217;s in science. I&#8217;m not trying to demean your field, but I&#8217;m gonna call sciences things that are rote in some nature.</p><p>They have predictive processes, end results, so on and so forth. So I&#8217;m very bullish on art and I think the sciences, quote unquote, are gonna be marginalized away, and that&#8217;s part of this whole cycle in knowledge, et cetera, et cetera, as it learns more. But I do hope that technology encourages more people to maybe just do random shit, write down more, write down random ideas, write down everything. I don&#8217;t think that we&#8217;ve gotten to a point where all thought has been explored, right? That&#8217;s what we&#8217;re talking about to some degree. Or all possibilities of thought are captured. </p><p>Have you seen Jepson stuff?</p><p><strong>Shane</strong>: Nah.</p><p><strong>JT</strong>: Look, okay. 
I will plug Vox right now because it&#8217;s one of those things that you watch it and you&#8217;re like, whoa, like this is cool, because he&#8217;s trying to replicate, like evolution, an evolutionary thought as like part of his process. And it&#8217;s like a, I&#8217;m sorry Jepson if you&#8217;re hearing this, but it&#8217;s like a very sophisticated Monte Carlo in some respects. It&#8217;s try everything. Or random forest, whatever you wanna say. And it&#8217;s way more sophisticated than that. I&#8217;m generalizing here, oversimplifying, but I do think that there&#8217;s something that we have that the machine doesn&#8217;t yet in its creativity and that&#8217;s art. And I hope that there&#8217;s a pendulum swing. And don&#8217;t get me wrong, I think that the vast majority of society is gonna gravitate towards the mean and it&#8217;s gonna be some massive enshittification of fucking everything.</p><p>But I do think that for people that are creative thinkers like that, this is the time.</p><p><strong>Shane</strong>: I agree. I think about data products without defining them or defining the definition of defining. And I&#8217;ve reviewed two or three books that are currently being drafted around that. Some are what I would call bringing product thinking to data, and some are around a thing called data product.</p><p>That&#8217;s how I categorize them after reading the draft content. I know a couple of other people that are starting to write something around product thinking with data or data products as a thing. And my view is good. The fact that there&#8217;ll be five or 10 books in a similar subject space, the way they&#8217;re telling the stories is different.</p><p>The storytelling is different for every book, even though it is relatively in the same domain or subject.</p><p>And that&#8217;s great. 
&#8216;cause what that means is, they can write them &#8216;cause writing is easier now, and they can publish them because publishing is easier now. And then I can read them and I always pick up something new, something I didn&#8217;t know, something that entertains me or educates me.</p><p>And then I assimilate that into how I think, so I&#8217;m with you, push more out. But I think we will see some people try to put paywalls up because that tribal knowledge is how they make money. Interesting times. I think the other one, and we&#8217;re outta time for this one, but I think given the fact that tokens are subsidized so heavily at the moment, we use the AI tools or the gen AI tools to automate stuff for us because it is faster and it is cheap enough that we don&#8217;t have to care. It&#8217;s gonna change. And when we get hit with the true cost of those things, then people who have built good automation, who have built things that are efficient and optimized and aren&#8217;t lazy processes or lazy code, they&#8217;re gonna be better.</p><p>And the people that just put a thousand PDFs in there with no context, no prompts and get an answer, probably their products will be one of the many AI startups that die a thousand deaths,</p><p><strong>JT</strong>: Or maybe gets bought. </p><p><strong>Shane</strong>: Yeah. And then let&#8217;s say, yeah, there is a difference between being bought and being acqui-hired.</p><p><strong>JT</strong>: Valid. Valid.</p><p><strong>Shane</strong>: So yeah. Again, what&#8217;s your definition of buying? Or what&#8217;s your </p><p><strong>JT</strong>: Yeah. </p><p><strong>Shane</strong>: of different lighting? Alright. </p><p><strong>JT</strong>: You put a flame to the, yeah.</p><p><strong>Shane</strong>: Just to close it out, if people wanna find you and find what you are thinking, you&#8217;ve already said you don&#8217;t have a Substack, you might have a podcast</p><p><strong>JT</strong>: Website. 
LinkedIn&#8217;s the easiest way to find me.</p><p><strong>Shane</strong>: LinkedIn and Practical Data Discord. Because you are one of the more active people in the community. That&#8217;s how we met. </p><p><strong>JT</strong>: Correct. </p><p><strong>Shane</strong>: Join us. It&#8217;s free. </p><p><strong>JT</strong>: It&#8217;s a great community. I always advertise it to people. I&#8217;m like, just yeah, who do you talk to about your business? Come talk to us.</p><p><strong>Shane</strong>: Yeah. And we will talk back.</p><p><strong>JT</strong>: Sometimes we might just tell you to fuck off, but that&#8217;s fine.</p><p>It&#8217;s also cool. </p><p>We&#8217;re very unhinged.</p><p><strong>Shane</strong>: You might get a meme, you&#8217;re gonna get a meme </p><p><strong>JT</strong>: We&#8217;re a very unhinged group. </p><p><strong>Shane</strong>: Thanks for that. That&#8217;s been an interesting chat. We kinda went all over the place. It&#8217;s good to talk to somebody that&#8217;s actually using AI to build data services in production and monetizing it as much as possible, &#8216;cause that is an art or a science, one of the two.</p><p><strong>JT</strong>: Art.</p><p><strong>Shane</strong>: Alright, excellent. I hope everybody has a simply magical day. 
I.</p><h2>&#171;oo&#187;</h2><div class="pullquote"><p><em>Stakeholder - &#8220;Thats not what I wanted!&#8221; <br>Data Team - &#8220;But thats what you asked for!&#8221;</em></p></div><p>Struggling to gather data requirements and constantly hearing the conversation above?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Bu2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" width="387" height="342" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:342,&quot;width&quot;:387,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19725,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/160520537?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Want to learn how to capture data and information requirements in a repeatable way so stakeholders love them and data teams can build from them, by using the Information Product Canvas.</p><p>Have I got the book for you!</p><p>Start your journey to a new Agile Data Way of Working.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://adiwow.com/168&quot;,&quot;text&quot;:&quot;Buy the Agile Data Guide now!&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://adiwow.com/168"><span>Buy the Agile Data Guide now!</span></a></p><h2>&#171;oo&#187;</h2>]]></content:encoded></item><item><title><![CDATA[Business Language Driven ]]></title><description><![CDATA[AI is bringing back the art of 
discovering business reality, business language, core business concepts and core business processes]]></description><link>https://agiledata.info/p/business-language-driven</link><guid isPermaLink="false">https://agiledata.info/p/business-language-driven</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Fri, 09 Jan 2026 15:45:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ch6g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Few things are really new; sometimes we just need to remember the valuable things we used to do in our previous Ways of Working.</p><h2>DORO : Define Once, Reuse Often </h2><p>As I prep my presentation for Data Day Texas in a few weeks and also work on content for both the Pink and Blue Books, I find that I am browsing, referring to and reusing patterns and pattern templates that I have found, refined or defined over the last decade or two.</p><p>My method of storing these is ad hoc at best. I think this year I will try to put them all into an Obsidian repository and then hook up something like Claude to see if it makes finding and reusing things a little easier.</p><p>The results of that experiment will be a post for a different day. For now, I find that browsing the slides I have used for the many presentation, training and mentoring sessions I have delivered over the decade is the best way to find things I can potentially reuse.</p><p>And as I do that I realise there are a few foundational slides that I use repeatedly.</p><p>Some of these are the &#8220;maps&#8221; I have documented briefly here:</p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:155308917,&quot;url&quot;:&quot;https://agiledatawow.substack.com/p/intro-agile-data-way-of-working-blueprints&quot;,&quot;publication_id&quot;:798992,&quot;publication_name&quot;:&quot;The 
Agile Data Big Book of Ways of Working&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!QW2I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3f22b18-e014-4ada-b07a-7f76e10704a0_1280x1280.png&quot;,&quot;title&quot;:&quot;Intro - Agile Data Way of Working Blueprints&quot;,&quot;truncated_body_text&quot;:&quot;A big NO to frameworks and methodologies from me!&quot;,&quot;date&quot;:&quot;2025-01-21T04:47:30.647Z&quot;,&quot;like_count&quot;:3,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:2774203,&quot;name&quot;:&quot;Shagility&quot;,&quot;handle&quot;:&quot;shagility&quot;,&quot;previous_name&quot;:&quot;ADI&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f09a2d19-6707-4ef9-a4e3-a5e770fb640f_1406x853.jpeg&quot;,&quot;bio&quot;:&quot;I help data and analytics teams change the Way they Work in a Simply Magical Way&quot;,&quot;profile_set_up_at&quot;:&quot;2022-07-03T07:55:44.645Z&quot;,&quot;reader_installed_at&quot;:&quot;2022-07-03T07:55:25.828Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:736779,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:798992,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:798992,&quot;name&quot;:&quot;The Agile Data Big Book of Ways of Working&quot;,&quot;subdomain&quot;:&quot;agiledatawow&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Combining the best of agile, product and data patterns together to craft a new way of 
working&quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/c3f22b18-e014-4ada-b07a-7f76e10704a0_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#9D6FFF&quot;,&quot;created_at&quot;:&quot;2022-03-13T20:45:36.345Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Shagility&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:null,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:896480,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:952247,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:952247,&quot;name&quot;:&quot;Agile Data N&#8217; Info&quot;,&quot;subdomain&quot;:&quot;agiledata&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Simply Magical content about Agile Data Ways of Working&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b8892c64-a0c7-4c7b-9f49-a73be5280f22_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6B00&quot;,&quot;created_at&quot;:&quot;2022-06-25T05:33:33.273Z&quot;,&quot;email_from_name&quot;:&quot;Data.N.Info@AgileData&quot;,&quot;copyright&quot;:&quot;AgileData.io Limited&quot;,&quot;founding_plan_name&quot;:&quot;Founding 
Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:2855394,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:2810971,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:2810971,&quot;name&quot;:&quot;Information Product Canvas&quot;,&quot;subdomain&quot;:&quot;informationproductcanvas&quot;,&quot;custom_domain&quot;:&quot;informationproductcanvas.agiledataguides.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Information Product Canvas\na pattern template, to quickly discover and capture, data and information requirements, \nin a repeatable way, so stakeholders love them and data teams can build from them&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68ded0e0-62cf-497b-812e-8be9bbbe0629_855x855.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:131335767,&quot;theme_var_background_pop&quot;:&quot;#9A6600&quot;,&quot;created_at&quot;:&quot;2024-07-21T19:30:03.601Z&quot;,&quot;email_from_name&quot;:&quot;Shane Gibson (Shagility) from Agile Data Guides&quot;,&quot;copyright&quot;:&quot;Agile Data Guides&quot;,&quot;founding_plan_name&quot;:&quot;Free Book&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:5552097,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:5443082,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:5443082,&quot;name&quot;:&quot;Data Persona 
Template&quot;,&quot;subdomain&quot;:&quot;datapersonatemplate&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Data Personas&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85a1bb1b-6a61-4c40-b7a3-521a9a924805_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-06-24T22:16:36.075Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Agile Data Guides&quot;,&quot;founding_plan_name&quot;:&quot;Free Book&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:6645834,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:6512167,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:6512167,&quot;name&quot;:&quot;Data Team Design&quot;,&quot;subdomain&quot;:&quot;datateamdesign&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Data Team 
Design&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d02012c5-3c47-4c62-b6eb-872bbbd17238_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-10-09T03:48:21.118Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Shagility&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:7065871,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:6923446,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:6923446,&quot;name&quot;:&quot;Modeling Business Concepts&quot;,&quot;subdomain&quot;:&quot;modelingbusinessconcepts&quot;,&quot;custom_domain&quot;:&quot;modelingbusinessconcepts.agiledataguides.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Modeling Business Concepts&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/91fdf95d-96db-40a1-a2ec-c9f4b1a0060f_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-11-15T12:27:38.041Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Shagility&quot;,&quot;founding_plan_name&quot;:&quot;Founding 
Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;subscriber&quot;,&quot;tier&quot;:1,&quot;accent_colors&quot;:null},&quot;paidPublicationIds&quot;:[10845,1473069],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:false,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://agiledatawow.substack.com/p/intro-agile-data-way-of-working-blueprints?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!QW2I!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3f22b18-e014-4ada-b07a-7f76e10704a0_1280x1280.png"><span class="embedded-post-publication-name">The Agile Data Big Book of Ways of Working</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Intro - Agile Data Way of Working Blueprints</div></div><div class="embedded-post-body">A big NO to frameworks and methodologies from me&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">a year ago &#183; 3 likes &#183; Shagility</div></a></div><div class="embedded-post-wrap" 
data-attrs="{&quot;id&quot;:183656225,&quot;url&quot;:&quot;https://agiledatawow.substack.com/p/problem-people-diagrams&quot;,&quot;publication_id&quot;:798992,&quot;publication_name&quot;:&quot;The Agile Data Big Book of Ways of Working&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!QW2I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3f22b18-e014-4ada-b07a-7f76e10704a0_1280x1280.png&quot;,&quot;title&quot;:&quot;Problem People Diagrams&quot;,&quot;truncated_body_text&quot;:&quot;When I start out developing a new training course, a presentation for a conference, putting together a mentoring session or starting on content for a new &#8220;an Agile Data Guide&#8221; I like to think about the core problem I am trying to help solve.&quot;,&quot;date&quot;:&quot;2026-01-06T11:29:40.001Z&quot;,&quot;like_count&quot;:2,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:2774203,&quot;name&quot;:&quot;Shagility&quot;,&quot;handle&quot;:&quot;shagility&quot;,&quot;previous_name&quot;:&quot;ADI&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f09a2d19-6707-4ef9-a4e3-a5e770fb640f_1406x853.jpeg&quot;,&quot;bio&quot;:&quot;I help data and analytics teams change the Way they Work in a Simply Magical Way&quot;,&quot;profile_set_up_at&quot;:&quot;2022-07-03T07:55:44.645Z&quot;,&quot;reader_installed_at&quot;:&quot;2022-07-03T07:55:25.828Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:736779,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:798992,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:798992,&quot;name&quot;:&quot;The Agile Data Big Book of Ways of 
Working&quot;,&quot;subdomain&quot;:&quot;agiledatawow&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Combining the best of agile, product and data patterns together to craft a new way of working&quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/c3f22b18-e014-4ada-b07a-7f76e10704a0_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#9D6FFF&quot;,&quot;created_at&quot;:&quot;2022-03-13T20:45:36.345Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Shagility&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:null,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:896480,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:952247,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:952247,&quot;name&quot;:&quot;Agile Data N&#8217; Info&quot;,&quot;subdomain&quot;:&quot;agiledata&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Simply Magical content about Agile Data Ways of Working&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b8892c64-a0c7-4c7b-9f49-a73be5280f22_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6B00&quot;,&quot;created_at&quot;:&quot;2022-06-25T05:33:33.273Z&quot;,&quot;email_from_name&quot;:&quot;Data.N.Info@AgileData&quot;,&quot;copyright&quot;:&quot;AgileData.io Limited&quot;,&quot;founding_plan_name&quot;:&quot;Founding 
Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:2855394,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:2810971,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:2810971,&quot;name&quot;:&quot;Information Product Canvas&quot;,&quot;subdomain&quot;:&quot;informationproductcanvas&quot;,&quot;custom_domain&quot;:&quot;informationproductcanvas.agiledataguides.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Information Product Canvas\na pattern template, to quickly discover and capture, data and information requirements, \nin a repeatable way, so stakeholders love them and data teams can build from them&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68ded0e0-62cf-497b-812e-8be9bbbe0629_855x855.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:131335767,&quot;theme_var_background_pop&quot;:&quot;#9A6600&quot;,&quot;created_at&quot;:&quot;2024-07-21T19:30:03.601Z&quot;,&quot;email_from_name&quot;:&quot;Shane Gibson (Shagility) from Agile Data Guides&quot;,&quot;copyright&quot;:&quot;Agile Data Guides&quot;,&quot;founding_plan_name&quot;:&quot;Free Book&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:5552097,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:5443082,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:5443082,&quot;name&quot;:&quot;Data Persona 
Template&quot;,&quot;subdomain&quot;:&quot;datapersonatemplate&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Data Personas&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85a1bb1b-6a61-4c40-b7a3-521a9a924805_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-06-24T22:16:36.075Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Agile Data Guides&quot;,&quot;founding_plan_name&quot;:&quot;Free Book&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:6645834,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:6512167,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:6512167,&quot;name&quot;:&quot;Data Team Design&quot;,&quot;subdomain&quot;:&quot;datateamdesign&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Data Team 
Design&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d02012c5-3c47-4c62-b6eb-872bbbd17238_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-10-09T03:48:21.118Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Shagility&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:7065871,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:6923446,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:6923446,&quot;name&quot;:&quot;Modeling Business Concepts&quot;,&quot;subdomain&quot;:&quot;modelingbusinessconcepts&quot;,&quot;custom_domain&quot;:&quot;modelingbusinessconcepts.agiledataguides.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Modeling Business Concepts&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/91fdf95d-96db-40a1-a2ec-c9f4b1a0060f_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-11-15T12:27:38.041Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Shagility&quot;,&quot;founding_plan_name&quot;:&quot;Founding 
Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;subscriber&quot;,&quot;tier&quot;:1,&quot;accent_colors&quot;:null},&quot;paidPublicationIds&quot;:[10845,1473069],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:false,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://agiledatawow.substack.com/p/problem-people-diagrams?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!QW2I!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3f22b18-e014-4ada-b07a-7f76e10704a0_1280x1280.png"><span class="embedded-post-publication-name">The Agile Data Big Book of Ways of Working</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Problem People Diagrams</div></div><div class="embedded-post-body">When I start out developing a new training course, a presentation for a conference, putting together a mentoring session or starting on content for a new &#8220;an Agile Data Guide&#8221; I like to think about the core problem I am trying to help solve&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">3 months ago &#183; 2 likes &#183; 
Shagility</div></a></div><h2>Business Language Driven</h2><p>Another piece of content I reuse a lot is this slide:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ch6g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ch6g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png 424w, https://substackcdn.com/image/fetch/$s_!ch6g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png 848w, https://substackcdn.com/image/fetch/$s_!ch6g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png 1272w, https://substackcdn.com/image/fetch/$s_!ch6g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ch6g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png" width="1373" height="777" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:777,&quot;width&quot;:1373,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:86555,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/183654985?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ch6g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png 424w, https://substackcdn.com/image/fetch/$s_!ch6g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png 848w, https://substackcdn.com/image/fetch/$s_!ch6g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png 1272w, https://substackcdn.com/image/fetch/$s_!ch6g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7c13045-7e3c-493c-b7cf-b44b19bae361_1373x777.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I use this slide in almost every presentation I do that relates to data work or data modeling.</p><p>From memory, this slide was based on content from Lawrence Corr that we reused as part of the one day &#8220;BEAM&#8221; course we developed and delivered as part of OptimalBI in NZ, the data and analytics consulting company I founded a few decades ago.</p><p>It shows a fundamental principle that we apply as part of the Agile Data Way of Working.</p><p>The principle is that we work with data by first focussing on the Business Language, and only after that focussing on the source system data structures or reporting needs.</p><p>I have talked to this slide many, many times, but never actually created a reusable description for the principle itself.</p><h2>The principle of being Business Language Driven</h2><p>So let&#8217;s have a go at 
creating an initial definition now.</p><h3>Using the Pattern of the Agile Manifesto</h3><p>Here is a definition of the principle using the pattern from the Agile Manifesto:</p><blockquote><p><strong>Business language grounded in business reality</strong></p><p>The highest priority is to define and deliver information using the language of the business as it actually operates, so actions are based on shared understanding rather than system or technical abstractions.</p></blockquote><p></p><h3>Using the Pattern of the Agile Data Blueprint</h3><p>A component of the Agile Data Blueprint I deliver for organisations is a list of Data Management principles. These are documented using a pattern I think I got from TOGAF.</p><p>Here is a definition of the principle using that pattern:</p><blockquote><p><strong>Name<br></strong>Business Language Driven</p></blockquote><blockquote><p><strong>Statement<br></strong>Information, data and actions are expressed in the language of the business as it actually operates, not in the language of systems or technology.</p></blockquote><blockquote><p><strong>Rationale<br></strong>Using business language anchors data work in business reality, reducing misinterpretation, rework, and translation overhead between teams. When data work reflects real business concepts and behaviour, teams align faster, reduce translation errors, and achieve better outcomes.</p></blockquote><blockquote><p><strong>Implications<br></strong>Stakeholders must invest time to assist in clearly identifying, defining and agreeing core business concepts and core business processes, while Data Teams must design, model and name data using those definitions consistently. 
This may require additional upfront collaboration and discipline, but it lowers long-term costs by simplifying communication, improving trust, and increasing reuse across analytics, reporting, and automation.</p></blockquote><p>*Both these definitions were created with the assistance of my ChatGPT friend for the initial version and then edited by me.</p><p>IMHO both these definitions still need a few more iterations to tighten them up.</p><h2>Information Value Stream</h2><p>If we think about the pattern of Business Language Driven from the lens of the Information Value Stream:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x3VZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x3VZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png 424w, https://substackcdn.com/image/fetch/$s_!x3VZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png 848w, https://substackcdn.com/image/fetch/$s_!x3VZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png 1272w, https://substackcdn.com/image/fetch/$s_!x3VZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!x3VZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png" width="1456" height="624" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:624,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:121630,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/183654985?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x3VZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png 424w, https://substackcdn.com/image/fetch/$s_!x3VZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png 848w, https://substackcdn.com/image/fetch/$s_!x3VZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png 1272w, https://substackcdn.com/image/fetch/$s_!x3VZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aafb8a7-42dc-4514-a883-24fe93a1d20f_2806x1203.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The source system data structures and the last mile reporting requirements all sit on the right of this diagram, ideally in the Design step, but often in the Build step.</p><p>We want Data Teams to start by focussing on the left hand side of the Value Stream.  </p><blockquote><p>I keep wanting to use the term &#8220;Data Product Team&#8221; instead of &#8220;Data Team&#8221; as I write this content but in reality it is just Data Teams applying product patterns and pattern templates.  
They are still a Data Team.</p></blockquote><ul><li><p>We want them to understand the Problem that needs to be solved.</p></li><li><p>We want them to Ideate with Stakeholders on the options to potentially solve it.</p></li><li><p>We want them to Discover which of those Ideas is the most Valuable, Usable, Viable, and Feasible.</p></li><li><p>We then want to prioritise the value of solving this problem compared to the value of all the other problems we could solve first.</p></li></ul><p>To do these things we need to use Business Language, not the language of the source system structures or the last mile reporting requirements.</p><p>Stakeholders don&#8217;t know, understand or care about the source system data structure.</p><p>They understand and care about the Business Reality. They speak in the language of the business, not the language of the data structures or the technology.</p><p>With help, Stakeholders might be able to articulate the Core Business Concepts and Core Business Processes that are contained within that Business Reality. Data Teams should count themselves lucky if they do. 
</p><h2>Different data practitioner personas, different language of requirements</h2><p>In my experience each of the data practitioner personas typically has a language to capture requirements based on their preferred way to work.</p><ul><li><p>Personas that align with that of a Data Engineer will often start the process of understanding the requirements by wanting to review the structure of the Source Systems in which the needed data is created.</p></li><li><p>Personas that align with that of a Data Scientist will often want to follow a pattern based on Exploratory Data Analysis (EDA), starting by exploring the data values themselves.</p></li><li><p>Personas that align with that of a BI Developer will often want to start by wire-framing the final Information Product delivery type, for example a Dashboard.</p></li><li><p>Personas that align with that of a Business Analyst will often want to start with documenting and understanding the business processes.</p><p></p></li></ul><p>Stakeholders will often align with the persona that best matches their career path or will default to wanting to start by describing the final Information Product output, such as the Report or Dashboard they want to use.</p><h2>Data Teams have trained Stakeholders wrong</h2><p>One of the mistakes we have made as data practitioners over the last few decades is teaching Stakeholders to talk to us in the language of Reports and Dashboards.</p><p>When Data Teams ask a Stakeholder what their &#8220;requirements&#8221; are, Stakeholders will often describe the Dashboards they think they need delivered.  
They are giving the Data Team a solution as a requirement, not the problem to be solved.</p><p>That is not the Stakeholders&#8217; fault.</p><p>Data Teams have trained them to think in the language of Dashboards, as that has been the hammer they have delivered to hit every little nail, aka every problem the Stakeholders have.</p><p>Data Teams have trained Stakeholders to be fluent in the language of the data platform&#8217;s most common last mile delivery type, rather than the Data Team learning a new language.</p><p>This focus on legacy dashboards to define requirements and deliver solutions will bite Data Teams in 2026, as data platforms move away from legacy BI tools that build multi use Dashboards that are reused to solve multiple business problems, and towards GenAI generated &#8220;One shot&#8221; BI Apps that each solve a single problem well and are not reused for anything else.</p><h2>Missing T skills</h2><p>Another common problem that drives Data Teams to focus on source system structures or reporting requirements, rather than Business Language and Business Reality, is the loss of &#8220;Business Analysis&#8221; skills from the core Data Teams.</p><p>I wrote about Skills vs Roles a while ago:</p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:99217729,&quot;url&quot;:&quot;https://agiledatawow.substack.com/p/team-skills-vs-roles&quot;,&quot;publication_id&quot;:798992,&quot;publication_name&quot;:&quot;The Agile Data Big Book of Ways of Working&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!QW2I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3f22b18-e014-4ada-b07a-7f76e10704a0_1280x1280.png&quot;,&quot;title&quot;:&quot;Team - Skills vs Roles&quot;,&quot;truncated_body_text&quot;:&quot;It is a silo 
problem&quot;,&quot;date&quot;:&quot;2023-01-27T03:48:15.388Z&quot;,&quot;like_count&quot;:0,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:2774203,&quot;name&quot;:&quot;Shagility&quot;,&quot;handle&quot;:&quot;shagility&quot;,&quot;previous_name&quot;:&quot;ADI&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f09a2d19-6707-4ef9-a4e3-a5e770fb640f_1406x853.jpeg&quot;,&quot;bio&quot;:&quot;I help data and analytics teams change the Way they Work in a Simply Magical Way&quot;,&quot;profile_set_up_at&quot;:&quot;2022-07-03T07:55:44.645Z&quot;,&quot;reader_installed_at&quot;:&quot;2022-07-03T07:55:25.828Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:736779,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:798992,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:798992,&quot;name&quot;:&quot;The Agile Data Big Book of Ways of Working&quot;,&quot;subdomain&quot;:&quot;agiledatawow&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Combining the best of agile, product and data patterns together to craft a new way of working&quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/c3f22b18-e014-4ada-b07a-7f76e10704a0_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#9D6FFF&quot;,&quot;created_at&quot;:&quot;2022-03-13T20:45:36.345Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Shagility&quot;,&quot;founding_plan_name&quot;:&quot;Founding 
Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:null,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:896480,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:952247,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:952247,&quot;name&quot;:&quot;Agile Data N&#8217; Info&quot;,&quot;subdomain&quot;:&quot;agiledata&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Simply Magical content about Agile Data Ways of Working&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b8892c64-a0c7-4c7b-9f49-a73be5280f22_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6B00&quot;,&quot;created_at&quot;:&quot;2022-06-25T05:33:33.273Z&quot;,&quot;email_from_name&quot;:&quot;Data.N.Info@AgileData&quot;,&quot;copyright&quot;:&quot;AgileData.io Limited&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:2855394,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:2810971,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:2810971,&quot;name&quot;:&quot;Information Product Canvas&quot;,&quot;subdomain&quot;:&quot;informationproductcanvas&quot;,&quot;custom_domain&quot;:&quot;informationproductcanvas.agiledataguides.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Information Product Canvas\na 
pattern template, to quickly discover and capture, data and information requirements, \nin a repeatable way, so stakeholders love them and data teams can build from them&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68ded0e0-62cf-497b-812e-8be9bbbe0629_855x855.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:131335767,&quot;theme_var_background_pop&quot;:&quot;#9A6600&quot;,&quot;created_at&quot;:&quot;2024-07-21T19:30:03.601Z&quot;,&quot;email_from_name&quot;:&quot;Shane Gibson (Shagility) from Agile Data Guides&quot;,&quot;copyright&quot;:&quot;Agile Data Guides&quot;,&quot;founding_plan_name&quot;:&quot;Free Book&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:5552097,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:5443082,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:5443082,&quot;name&quot;:&quot;Data Persona Template&quot;,&quot;subdomain&quot;:&quot;datapersonatemplate&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Data Personas&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85a1bb1b-6a61-4c40-b7a3-521a9a924805_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-06-24T22:16:36.075Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Agile Data Guides&quot;,&quot;founding_plan_name&quot;:&quot;Free 
Book&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:6645834,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:6512167,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:6512167,&quot;name&quot;:&quot;Data Team Design&quot;,&quot;subdomain&quot;:&quot;datateamdesign&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Data Team Design&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d02012c5-3c47-4c62-b6eb-872bbbd17238_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-10-09T03:48:21.118Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Shagility&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:7065871,&quot;user_id&quot;:2774203,&quot;publication_id&quot;:6923446,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:6923446,&quot;name&quot;:&quot;Modeling Business Concepts&quot;,&quot;subdomain&quot;:&quot;modelingbusinessconcepts&quot;,&quot;custom_domain&quot;:&quot;modelingbusinessconcepts.agiledataguides.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Modeling Business 
Concepts&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/91fdf95d-96db-40a1-a2ec-c9f4b1a0060f_1280x1280.png&quot;,&quot;author_id&quot;:2774203,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-11-15T12:27:38.041Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Shagility&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;subscriber&quot;,&quot;tier&quot;:1,&quot;accent_colors&quot;:null},&quot;paidPublicationIds&quot;:[10845,1473069],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://agiledatawow.substack.com/p/team-skills-vs-roles?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!QW2I!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3f22b18-e014-4ada-b07a-7f76e10704a0_1280x1280.png" loading="lazy"><span class="embedded-post-publication-name">The Agile Data Big Book of Ways of Working</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Team - Skills vs 
Roles</div></div><div class="embedded-post-body">It is a silo problem&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">3 years ago &#183; Shagility</div></a></div><p>And as part of that I talked about the shape of skills in a data team.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zLkP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aae3ad-e8b2-4935-86d0-0d0d955033f5_1060x890.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zLkP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aae3ad-e8b2-4935-86d0-0d0d955033f5_1060x890.png 424w, https://substackcdn.com/image/fetch/$s_!zLkP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aae3ad-e8b2-4935-86d0-0d0d955033f5_1060x890.png 848w, https://substackcdn.com/image/fetch/$s_!zLkP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aae3ad-e8b2-4935-86d0-0d0d955033f5_1060x890.png 1272w, https://substackcdn.com/image/fetch/$s_!zLkP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aae3ad-e8b2-4935-86d0-0d0d955033f5_1060x890.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zLkP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aae3ad-e8b2-4935-86d0-0d0d955033f5_1060x890.png" width="1060" height="890" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d7aae3ad-e8b2-4935-86d0-0d0d955033f5_1060x890.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:890,&quot;width&quot;:1060,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zLkP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aae3ad-e8b2-4935-86d0-0d0d955033f5_1060x890.png 424w, https://substackcdn.com/image/fetch/$s_!zLkP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aae3ad-e8b2-4935-86d0-0d0d955033f5_1060x890.png 848w, https://substackcdn.com/image/fetch/$s_!zLkP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aae3ad-e8b2-4935-86d0-0d0d955033f5_1060x890.png 1272w, https://substackcdn.com/image/fetch/$s_!zLkP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aae3ad-e8b2-4935-86d0-0d0d955033f5_1060x890.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>*I think it&#8217;s probably about time to review this list of skills and see if it needs to be iterated.</p><p>If you look at Data Team members who have traditionally had a role of Data Engineer or BI Developer, you will see skills similar to the diagram below.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nlDK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb424ac8-87bf-45dd-bd83-7ba6250711a6_1060x890.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nlDK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb424ac8-87bf-45dd-bd83-7ba6250711a6_1060x890.png 424w, 
https://substackcdn.com/image/fetch/$s_!nlDK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb424ac8-87bf-45dd-bd83-7ba6250711a6_1060x890.png 848w, https://substackcdn.com/image/fetch/$s_!nlDK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb424ac8-87bf-45dd-bd83-7ba6250711a6_1060x890.png 1272w, https://substackcdn.com/image/fetch/$s_!nlDK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb424ac8-87bf-45dd-bd83-7ba6250711a6_1060x890.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nlDK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb424ac8-87bf-45dd-bd83-7ba6250711a6_1060x890.png" width="1060" height="890" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb424ac8-87bf-45dd-bd83-7ba6250711a6_1060x890.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:890,&quot;width&quot;:1060,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nlDK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb424ac8-87bf-45dd-bd83-7ba6250711a6_1060x890.png 424w, 
https://substackcdn.com/image/fetch/$s_!nlDK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb424ac8-87bf-45dd-bd83-7ba6250711a6_1060x890.png 848w, https://substackcdn.com/image/fetch/$s_!nlDK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb424ac8-87bf-45dd-bd83-7ba6250711a6_1060x890.png 1272w, https://substackcdn.com/image/fetch/$s_!nlDK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb424ac8-87bf-45dd-bd83-7ba6250711a6_1060x890.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Compare that to the typical skills of somebody who has traditionally had a role of Business Analyst, shown below.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AEDZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f5be9eb-fd64-49b0-86e6-b6b27bb634a3_1060x890.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AEDZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f5be9eb-fd64-49b0-86e6-b6b27bb634a3_1060x890.png 424w, https://substackcdn.com/image/fetch/$s_!AEDZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f5be9eb-fd64-49b0-86e6-b6b27bb634a3_1060x890.png 848w, https://substackcdn.com/image/fetch/$s_!AEDZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f5be9eb-fd64-49b0-86e6-b6b27bb634a3_1060x890.png 1272w, https://substackcdn.com/image/fetch/$s_!AEDZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f5be9eb-fd64-49b0-86e6-b6b27bb634a3_1060x890.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AEDZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f5be9eb-fd64-49b0-86e6-b6b27bb634a3_1060x890.png" width="1060" height="890" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2f5be9eb-fd64-49b0-86e6-b6b27bb634a3_1060x890.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:890,&quot;width&quot;:1060,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AEDZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f5be9eb-fd64-49b0-86e6-b6b27bb634a3_1060x890.png 424w, https://substackcdn.com/image/fetch/$s_!AEDZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f5be9eb-fd64-49b0-86e6-b6b27bb634a3_1060x890.png 848w, https://substackcdn.com/image/fetch/$s_!AEDZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f5be9eb-fd64-49b0-86e6-b6b27bb634a3_1060x890.png 1272w, https://substackcdn.com/image/fetch/$s_!AEDZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f5be9eb-fd64-49b0-86e6-b6b27bb634a3_1060x890.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>People with a Business Analysis background and this set of skills typically engage with Stakeholders a lot more than people with an engineering background, and as a result seem to use business language more often.</p><p>When I am working with a data team for the first time, I can often guess the language that team will use based on the primary skills or roles in the team.
</p><p>Or I can review the things they produce and the language used within those things and guess at the primary roles and skills within that team.</p><h2>Metadata Driven Information Factories</h2><p>When I talk to data practitioners about being Business Language driven, they will often jump straight to the pattern of metadata-driven development.</p><p>As I was writing this article I ran a LinkedIn poll on this.<br><br><a href="https://www.linkedin.com/posts/shagility_as-a-data-practitioner-are-you-code-driven-activity-7414943978749480960-EgPO">https://www.linkedin.com/posts/shagility_as-a-data-practitioner-are-you-code-driven-activity-7414943978749480960-EgPO</a></p><p>In the post I wrote:</p><blockquote><p><br>As a data practitioner are you code driven or metadata driven in your way of working?<br><br>Do you write code that creates the data structures you need. <br><br>- And then use those structures or that code to create the &#8220;map&#8221; of the data model those structures represent.<br><br>Or do you define the data structures as &#8220;metadata&#8221; and have code that reads that metadata and then automatically creates the structures.<br><br>- And then use that metadata to create the &#8220;map&#8221; of the data model those structures represent.<br><br>(I tried to abstract a whole bunch of different context languages into 2 simple patterns above, did a so so job) </p><p></p></blockquote><p>I have been in the data domain for enough decades to have watched metadata-driven data tools emerge and die with each new technology wave.</p><p>Our AgileData.cloud product is a form of metadata-driven capability, with a sprinkling of business language.</p><p>As the patterns for Context Planes start to emerge to support GenAI patterns, I think we will see these tools re-emerge.<br><br>But they will contain Context in the form of Business Language rather than metadata in a language of technology.<br><br>I look forward to the day we can capture this Context using the 
language of the business reality, and that powers the creation of the other languages for us automagically.</p><h2>In Summary</h2><p>One of the reasons I write is to help me think.</p><p>Sometimes I can sit down and write quickly and with clarity, as it&#8217;s something I have thought about for a long while and something I have taught other people.</p><p>Sometimes it is a new set of patterns or pattern templates and I have yet to find clarity, so I write to think and iterate towards that clarity.</p><p>When it&#8217;s the latter I use a constraints model: I time box the content, to stop me going down endless rabbit holes.</p><p>This article was deffo the latter.<br><br>But the interesting thing is it&#8217;s based on a slide, a piece of content, a pattern I have used for years.</p><p>So the fact that I still don&#8217;t have clarity on that pattern is a surprise to me.</p><p>And it means it probably never resonated with anybody else when I used it, due to that lack of clarity.<br><br>In the famous words of the Terminator, &#8220;I&#8217;ll be back&#8221; on this one.<br></p>]]></content:encoded></item><item><title><![CDATA[Can AI tools bring back data modeling with Andy Cutler ]]></title><description><![CDATA[AgileData Podcast #78]]></description><link>https://agiledata.info/p/can-ai-tools-bring-back-data-modeling</link><guid isPermaLink="false">https://agiledata.info/p/can-ai-tools-bring-back-data-modeling</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Tue, 23 Dec 2025 19:21:57 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/36b67043-cc5a-4414-9283-c8b776731cb1_800x800.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Join Shane Gibson as he chats with Andy Cutler about the art of data modeling and the potential of AI tools to improve the art.</p><blockquote><p><strong><a href="https://agiledata.substack.com/i/182448381/listen">Listen</a></strong></p><p><strong><a 
href="https://agiledata.substack.com/i/182448381/google-notebooklm-mindmap">View MindMap</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/182448381/google-notebooklm-briefing">Read AI Summary</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/182448381/transcript">Read Transcript</a></strong></p></blockquote><p></p><h2>Listen</h2><p>Listen on all good podcast hosts or over at:</p><p><a href="https://podcast.agiledata.io/e/can-ai-tools-bring-back-data-modeling-with-andy-cutler-episode-78/">https://podcast.agiledata.io/e/can-ai-tools-bring-back-data-modeling-with-andy-cutler-episode-78/</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://podcast.agiledata.io/e/can-ai-tools-bring-back-data-modeling-with-andy-cutler-episode-78/&quot;,&quot;text&quot;:&quot;Listen to the Agile Data Podcast Episode&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://podcast.agiledata.io/e/can-ai-tools-bring-back-data-modeling-with-andy-cutler-episode-78/"><span>Listen to the Agile Data Podcast Episode</span></a></p><p></p><blockquote><p><strong>Subscribe:</strong> <a href="https://podcasts.apple.com/nz/podcast/agiledata/id1456820781">Apple Podcast</a> | <a href="https://open.spotify.com/show/4wiQWj055HchKMxmYSKRIj">Spotify</a> | <a href="https://www.google.com/podcasts?feed=aHR0cHM6Ly9wb2RjYXN0LmFnaWxlZGF0YS5pby9mZWVkLnhtbA%3D%3D">Google Podcast </a>| <a href="https://music.amazon.com/podcasts/add0fc3f-ee5c-4227-bd28-35144d1bd9a6">Amazon Audible</a> | <a href="https://tunein.com/podcasts/Technology-Podcasts/AgileBI-p1214546/">TuneIn</a> | <a href="https://iheart.com/podcast/96630976">iHeartRadio</a> | <a href="https://player.fm/series/3347067">PlayerFM</a> | <a href="https://www.listennotes.com/podcasts/agiledata-agiledata-8ADKjli_fGx/">Listen Notes</a> | <a href="https://www.podchaser.com/podcasts/agiledata-822089">Podchaser</a> | <a 
href="https://www.deezer.com/en/show/5294327">Deezer</a> | <a href="https://podcastaddict.com/podcast/agiledata/4554760">Podcast Addict</a> |</p></blockquote><div id="youtube2--LXPaqBZoFo" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;-LXPaqBZoFo&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/-LXPaqBZoFo?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>You can get in touch with Andy via <a href="https://www.linkedin.com/in/andycutler/">LinkedIn</a> or over at <a href="https://linktr.ee/andycutler">https://linktr.ee/andycutler</a></p><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? The Information Product Canvas helps you get clarity in 30 minutes or less.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><h2>Google NotebookLM Mindmap </h2><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fO4p!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!fO4p!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png 424w, https://substackcdn.com/image/fetch/$s_!fO4p!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png 848w, https://substackcdn.com/image/fetch/$s_!fO4p!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png 1272w, https://substackcdn.com/image/fetch/$s_!fO4p!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fO4p!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png" width="1200" height="2835.164835164835" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:3440,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:1988975,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/182448381?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fO4p!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png 424w, https://substackcdn.com/image/fetch/$s_!fO4p!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png 848w, https://substackcdn.com/image/fetch/$s_!fO4p!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png 1272w, https://substackcdn.com/image/fetch/$s_!fO4p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48fb47cd-a498-4d3e-ae5e-af9fbefaf315_4121x9737.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p></p><h2>Google NoteBookLM Briefing</h2><h2>Executive Summary</h2><p>This document synthesizes a discussion between data experts Shane Gibson and Andy Cutler, focusing on the persistent challenges and future direction of data modeling within the modern data landscape. The central argument is that while data technology has rapidly evolved toward accessible, powerful cloud platforms, the discipline of data modeling has been neglected, creating a significant knowledge and practice gap.</p><p>The conversation identifies a &#8220;repeated and constant battle&#8221; to prioritize modeling over the immediate appeal of technology, which provides instant feedback that modeling processes lack. 
This issue is compounded by a decline in traditional mentorship, where senior practitioners historically guided newcomers. Modern data platforms from major vendors like Microsoft, Snowflake, and Databricks are criticized for lacking integrated, opinionated tools that guide users through modeling processes, forcing practitioners to manually implement common patterns like Slowly Changing Dimensions (SCD) Type 2.</p><p>The primary conclusion is that Artificial Intelligence, particularly Large Language Models (LLMs), presents a transformative solution. AI is positioned not merely as a code generator but as a new form of mentor and assistant. It can educate novices, generate starter models, and, crucially, act as an &#8220;antagonistic&#8221; agent to stress-test models for future flexibility&#8212;replicating the critical feedback once provided by experienced data modelers. The effectiveness of these AI tools, however, hinges on providing them with opinionated constraints and clear business context to generate practical, fit-for-purpose models rather than theoretical, unimplementable ones.</p><h2>The Evolution of Data Platforms and Recurring Patterns</h2><p>The discussion begins by contextualizing the current data landscape within a 25-year evolution of technology. 
This history highlights a recurring cycle of platform development and a significant shift from capital-intensive, on-premises infrastructure to flexible, cloud-based services.</p><ul><li><p><strong>From On-Premises to Cloud:</strong> The journey is traced from early-2000s technologies like ColdFusion and SQL Server 2000, which ran on physical hardware that teams had to purchase and manage themselves (Gibson recalls buying Compaq 386s in his first job), to the advent of the first cloud data warehouses like AWS Redshift, and finally to modern platforms like Snowflake, Databricks, and Microsoft Fabric.</p></li><li><p><strong>Democratization of Compute:</strong> This shift democratized access to powerful computing resources, moving from multi-thousand-dollar hardware purchases to pay-as-you-go cloud services.</p></li><li><p><strong>Recurring Cycles:</strong> A pattern is noted where the industry moves from installable software to pre-configured appliances and now to cloud-native databases. Despite these technological waves, fundamental challenges, particularly in data modeling, reappear. As Gibson notes, &#8220;every technology wave it seems to become hot and then cold.&#8221;</p></li></ul><h2>The Persistent Challenge of Data Modeling</h2><p>A core theme is the struggle to maintain the discipline of data modeling in the face of rapid technological advancement. It is often seen as a difficult, time-consuming process that lacks the immediate gratification of working with new tools.</p><ul><li><p><strong>A Constant Battle:</strong> Andy Cutler describes a &#8220;repeated and constant battle to make sure that data modeling is at the forefront of a data platform project.&#8221; He argues that modeling is frequently deprioritized in favor of focusing on technology.</p></li><li><p><strong>Architecture vs. Modeling:</strong> A common point of confusion is the conflation of data architecture patterns with data modeling patterns. 
Cutler clarifies this distinction: &#8220;The architecture enables the modeling. The modeling is put over the architecture.&#8221; He notes that patterns like the Medallion Architecture are data layout patterns, not a substitute for disciplined modeling techniques like Kimball or Data Vault.</p></li><li><p><strong>The Lack of Instant Feedback:</strong> A key insight is that technology provides immediate, binary feedback (it works or it doesn&#8217;t), which is psychologically rewarding. Data modeling, in contrast, does not. As Gibson puts it, &#8220;I can&#8217;t get instantaneous feedback that my model is good or bad or right or wrong.&#8221; Cutler adds that the weakness of a model may only surface &#8220;six months, a year down the line when all of a sudden something happens&#8221; and it turns out that &#8220;the model isn&#8217;t flexible enough.&#8221; This delayed feedback loop makes technology more appealing to practitioners.</p></li></ul><h2>The Decline of Mentorship and the Knowledge Gap</h2><p>The conversation highlights a critical loss of institutional knowledge transfer. As tools have become more accessible and projects faster-paced, the traditional mentorship structures that trained previous generations of data professionals have eroded.</p><ul><li><p><strong>The &#8220;Grumpy Old DBA&#8221;:</strong> Learning was often driven by experienced seniors, colloquially the &#8220;grumpy old DBA,&#8221; who provided critical feedback and guidance on performance, design, and best practices. This hierarchy of mentoring was essential on expensive projects where mistakes were costly.</p></li><li><p><strong>Erosion of Foundational Concepts:</strong> With modern, abstracted tools, new practitioners are often not exposed to foundational concepts. 
The example cited is a user asking, &#8220;what is data persistence?&#8221;&#8212;a concept ingrained in older professionals who used tools that required manual saving (e.g., pre-cloud Excel).</p></li><li><p><strong>Lack of Accessible Learning Resources:</strong> While foundational books from authors like Steve Hoberman and The Kimball Group still exist, formal courses and guided learning paths for modeling are less prevalent. Unless actively guided to these resources, newcomers may not discover them.</p></li></ul><h2>The Inadequacy of Modern Data Modeling Tools</h2><p>A significant contributor to the modeling gap is the lack of robust, integrated, and opinionated modeling tools within major data platforms.</p><ul><li><p><strong>Vendor Agnosticism:</strong> Vendors like Microsoft, Databricks, and Snowflake avoid baking specific modeling methodologies into their platforms. They provide a &#8220;canvas&#8221; and &#8220;paintbrush&#8221; but &#8220;don&#8217;t help you draw the picture.&#8221; This forces users to bring their own process and often use disconnected, third-party tools.</p></li><li><p><strong>The SCD Type 2 Example:</strong> The implementation of Slowly Changing Dimension (SCD) Type 2 is a prime example of a common, well-defined modeling pattern that largely lacks out-of-the-box support. Practitioners are still required to write custom code to handle historical tracking, even though it&#8217;s a fundamental requirement in dimensional modeling. 
Databricks (Delta Live Tables) and dbt (Snapshots) are noted as exceptions that offer some built-in functionality.</p></li><li><p><strong>From Conceptual to Physical:</strong> There is a lack of end-to-end tooling within platforms like Microsoft Fabric that facilitates the entire modeling lifecycle, from conceptual design through logical design to the automated generation of the physical model.</p></li></ul><h2>Artificial Intelligence as the Future of Data Modeling</h2><p>The discussion concludes that AI, particularly in the form of specialized LLMs, is poised to fill the void left by declining mentorship and inadequate tooling. AI can act as an expert assistant, a sounding board, and a critical partner throughout the modeling process.</p><ul><li><p><strong>AI as Educator and Mentor:</strong> For those new to the field, AI can act as a guide, explaining different modeling patterns (e.g., Dimensional, Data Vault, Third Normal Form) and helping to translate business requirements into an initial model. This helps bridge the knowledge gap. The tool Ellie AI is mentioned as a specific example of an LLM-powered tool focused on guiding users through data modeling.</p></li><li><p><strong>From Generation to Antagonism:</strong> The most powerful application of AI is not just in generating a model, but in stress-testing it. The concept of using an AI to be &#8220;antagonistic&#8221; is raised, where the user can prompt it to find weaknesses and potential future problems.</p></li><li><p><strong>The Power of Opinionated AI:</strong> An unconstrained LLM may default to the most prevalent pattern in its training data (likely Kimball modeling, due to the volume of public content). The true value emerges when the AI is given specific constraints and opinions. 
Key inputs that improve AI model generation include:</p><ul><li><p><strong>Source Context:</strong> Providing the AI with source schemas and metadata.</p></li><li><p><strong>Design Patterns:</strong> Instructing the AI to use a specific, opinionated modeling pattern (e.g., &#8220;concepts, details, and events&#8221;).</p></li><li><p><strong>Business Boundaries:</strong> Using artifacts like an &#8220;information product canvas&#8221; to define the specific business outcomes the model must support, preventing it from over-engineering.</p></li></ul></li><li><p><strong>Multi-Agent Approach:</strong> A proposed advanced approach involves using multiple AI agents with different perspectives (e.g., one focusing on source systems, one on business processes, one on reporting outcomes) and having them &#8220;antagonize each other&#8221; to arrive at an optimal, pragmatic model that balances all constraints. This mimics the cognitive process of an experienced human modeler.</p></li></ul><p></p><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? The Information Product Canvas helps you get clarity in 30 minutes or less?</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><p></p><h2>Transcript</h2><p><strong>Shane</strong>: Welcome to the Agile Data Podcast. I&#8217;m Shane Gibson.</p><p><strong>Andy</strong>: And I&#8217;m Andy Cutler.</p><p><strong>Shane</strong>: Hey, Andy. Thanks for coming on the show. 
Today we&#8217;re gonna have an intriguing chat around data modeling: whether it&#8217;s been done, is not being done, or is gonna be done, and how tools and AI can fit into it. It&#8217;s a passion of mine. But before we jump into that conversation, why don&#8217;t you give a bit of background about yourself to the audience?</p><p><strong>Andy</strong>: Yeah, sure. Thanks, Shane. So my name&#8217;s Andy Cutler, and I&#8217;ve been working in the data space for about 25 years. So back in 1999, when I was working for a tiny little UK drum and bass record label called Good Looking Records, I was given the role of updating the company website, and that ran on a technology called ColdFusion, and I got quite into web design.</p><p>So I then managed to get myself a job, and I guess it was the first IT job that I had, designing websites. Then that website agency bought a SQL Server 2000 license, and I know there&#8217;s older versions of SQL. I&#8217;m really aware of that. But that was the first version of SQL that I got my hands on, and I was tasked with building the data models and the stored procedures to run CMS systems.</p><p>And then five years later, so mid-2000s, I am working in the data warehousing space. So I&#8217;d learned everything about database normalization, third normal form, so on and so forth. And then I was told to unlearn it all because I needed to denormalize everything for data warehouses. So I&#8217;ve literally worked for the last 20 years in data warehousing.</p><p>The last few years have all been cloud-based. I actually started my cloud journey with AWS and Redshift. That was, we&#8217;re going on for sort of 12, 13 years ago now. And then when Azure really started to motor with the products and services, I was then using tools like Azure SQL Data Warehouse.</p><p>That then moved into Synapse Analytics, and for the last couple of years it&#8217;s been Microsoft Fabric. 
So really it&#8217;s been predominantly the Microsoft data space that I&#8217;ve been working in.</p><p><strong>Shane</strong>: Excellent. ColdFusion and SQL Server in the early days. That was back in the days where we had to buy our own hardware and put it under our desk or in a cupboard, pre-data center. I remember back then we were doing some stuff in my first job, and we were buying Compaq 386s, and the argument was, do we get an SX or a DX?</p><p>And I can&#8217;t remember exactly, but I think it had four meg of memory and it cost the organization 35,000 New Zealand dollars back then. Yeah, times have changed. Right now you can spin up those kinds of things by just putting your credit card in, or even getting your free tier, and you&#8217;ve got a massive amount of firepower in there.</p><p><strong>Andy</strong>: Yeah, and that&#8217;s the thing: it&#8217;s also a little bit disconcerting now because all that compute is sitting in the cloud. Certain vendors like to show compute in certain ways and show you cores, and other vendors like to obfuscate that behind other kinds of terminology, which, yeah, I guess Microsoft do.</p><p>So you&#8217;re trying to map the cloud compute to what you&#8217;re doing on premises, right? And saying, okay, I&#8217;ve got a certain number of cores and I&#8217;m running this certain amount of workload on premises. What does that look like when I go to the cloud and I&#8217;m then dealing with a service that I can&#8217;t exactly a hundred percent map to those on-premises cores? But yeah, I totally get it.</p><p>You&#8217;re choosing hardware to run that software. You&#8217;re configuring that software as well. You&#8217;re configuring that software to death to get as much as you can out of that hardware.</p><p><strong>Shane</strong>: Yeah. And I think, like you, when I had my consulting company, when Redshift and AWS came in, we jumped on board really fast. 
&#8217;Cause in those days you had to buy a big million-dollar Teradata box, or you had to buy some Oracle database and, again, rack and stack it with some leased hardware. And when Redshift turned up, that really was the first cloud database for analytics. And it&#8217;s interesting how it&#8217;s lost market share. It was first to market and it&#8217;s obviously been overtaken by Snowflake and Databricks and a few others. So what&#8217;s interesting for me is we see these patterns getting repeated.</p><p>We see databases where we used to have to install them, and then we see databases as an appliance, where you buy the hardware and the database used to come with it pre-installed. Now we&#8217;re seeing cloud databases, Redshift being the first, and some more, oh, should I say, modern ones that solve some of the problems, like having to vacuum your database, for anybody that&#8217;s dealt with a Redshift cluster before. And so one of the things that&#8217;s interesting is this idea of data modeling. Because I&#8217;ve been in the data space for 35 years and we&#8217;ve always modeled data, but every technology wave it seems to become hot and then cold. It&#8217;s yep, we model, and then no, we democratize.</p><p>And people can do all the work without any conscious data modeling. So what are you seeing in the modeling space and in your part of the world at the moment?</p><p><strong>Andy</strong>: I am seeing, and I&#8217;m gonna speak honestly and off the cuff here, I&#8217;m seeing a repeated and constant battle to make sure that data modeling is at the forefront of a data platform project. That it isn&#8217;t just about the technology and it isn&#8217;t just about the data layout patterns that we&#8217;ve seen.</p><p>One of my sort of bugbears is that I&#8217;ve seen a lot of articles on LinkedIn comparing architecture patterns with data modeling patterns, and I&#8217;m thinking, hang on, you are really comparing apples to oranges. 
The architecture enables the modeling. The modeling is put over the architecture. I can&#8217;t imagine a scenario where any CIO or CTO is gonna talk to members of their team and ask them, should we architect our data a certain way versus modeling it a certain way?</p><p>No. They are complementary. So in the last few weeks, I&#8217;ve been thinking about the data modeling side of things and asking myself, is it because it&#8217;s hard to do? Is it because it requires thought and time and collaboration, where sometimes technology is a little bit of an easier thing to do?</p><p>There&#8217;s still lots of human elements in using and deploying and working with technology, but I just feel that with data modeling, you really have to have that community grounding where people are working together to get what they need from the data into a shape that&#8217;s actually useful for their business. So yes, that was my framing on that.</p><p><strong>Shane</strong>: It&#8217;s an interesting lens, and for me, I&#8217;m with you, right? I talk about data architecture layers. So how we are laying out the architecture of our data across our platform. And I&#8217;m a great fan of layered architectures. I&#8217;ve used them for many years and I see massive value in them. And then once I have an idea of the layers, then I can talk about what data modeling patterns we want to use in each layer. And I&#8217;m a big fan of Mixed Model Arts, which Joe Reis talks about. In my experience, I&#8217;ve never actually used a single data modeling pattern. Even back when I was dimensionally modeling, I always had a persistent staging area that was some form of native relational model that matched the structure of the source system before I moved it into the dimensional model. So if I think about that in terms of data architecture, and then I think in terms of data modeling, then I think in terms of technology, and I ask myself this question: if I was moving into the data domain. 
For the first time at the beginning of my career, where do I find any information around data modeling? Like, how do I learn? Because of the courses and the books that we had when we started, the books are still there, but the courses have disappeared to a degree. I think part of the problem is lack of accessibility to that content, for some reason. But the other thing that you just raised, and I&#8217;ve never thought about it this way, is, as a technologist, if I want to learn a product, I give it a go, I read some of the documentation. Like, these days I&#8217;ll probably Perplexity it. And I get to have a go and I get immediate feedback. So I can install the software or turn it on in the cloud, and then I&#8217;m immediately able to log on, and then I can give it a whack. I can try to use it and it will give me feedback. If I&#8217;m doing that with data modeling, that&#8217;s not true. I can&#8217;t just turn something on. I can&#8217;t get instantaneous feedback that my model is good or bad or right or wrong or doing what I want. And yeah, it&#8217;s an interesting point you raised. Maybe that is the reason that people love to play with technology, because they can just learn it with instantaneous feedback, and probably that adrenaline rush.</p><p>That first time you load the data up and your dashboard turns up with a pretty graph, you&#8217;re like, eh, that was pretty cool. That&#8217;s a bit of an endorphin hit. Whereas, I created a data model in Miro? Yeah, that&#8217;s a picture. It&#8217;s an interesting lens you&#8217;ve got.</p><p>Do you agree that one of the problems is lack of access to content and ways of learning what data modeling is and how you use it?</p><p><strong>Andy</strong>: I think it always has to be a constant conversation with the data people around modeling, and it always has to be brought up, but it always has to be marketed and communicated that the technology is just the starting point. Modeling comes into it as well. 
You are delivering a product that is the amalgamation of the technology, of course, and the process that the organization requires that data technology to do, and you&#8217;ve got there through modeling and landing the data in the right way for that organization.</p><p>If we go back to resources, you touched on a point there around the books that we had access to and the resources that we had.</p><p>So yeah, I remember Steve Hoberman&#8217;s data modeling books. I then remember the Kimball Group books. I mean, I&#8217;m looking at the Kimball Reader on my shelf right now, and the dimensional modeling toolkit. Because, yes, you required database technology to implement that model, but the technology could be from Microsoft, it could be from Oracle, MySQL, Postgres.</p><p>It didn&#8217;t need to be a specific vendor. You were applying a process to a technology. So unless you are guided towards these resources, and I remember being told in the early 2000s, I think it was a database administrator who told me, to go and learn about normal forms, database normalization, because the data had to be structured in a certain way to facilitate a transactional system, right?</p><p>And then when I started to move into more of the data warehousing and analytics side of things, again, I was guided. It was a MicroStrategy consultant that I was shadowing so that I could understand the data warehouse that he was implementing.</p><p>Again, that consultant pointed me in the right direction. He pointed me towards books and said, I can teach you the basics of this now, but of course you are going to have to get hands on. You&#8217;re gonna have to learn this. And at some stage, I was told that I would have to learn these things and.</p><p>
And I think that unless you&#8217;ve got those people now who are guiding people to do that, people have gotta be very proactive and they&#8217;ve gotta be going out there and they&#8217;ve gotta be saying, okay, I&#8217;m building a data platform. I&#8217;m using this technology.</p><p>What else do I need in terms of my knowledge that I need to apply to that as well? So there is a certain amount of guidance, and this is, I hope, being passed down by people. There are, great people out there like Johnny Winter who are talking about data modeling and, we&#8217;ll get onto the data modeling AI topic in a little bit that I.</p><p>Essentially took from Johnny, right? Johnny was talking about this and I looked at this and thought, wow, this is, yes, this is relevant. I like this. But yeah, so that&#8217;s what I feel. The other thing is, if you look at vendors, are vendors necessarily pointing people in the direction of the modeling processes?</p><p>I&#8217;m a Microsoft data developer, and throughout the years there have been various books around how to apply things like dimensional modeling, a modeling patterns to Microsoft products, right? And even now, you can go onto the Microsoft Fabric learn documentation, and in the warehouse, the fabric warehouse documentation, there&#8217;s resources around dimensional modeling.</p><p>They will tell you how you can do that.</p><p>So the vendors, like I said, have been providing some of this guidance and some of this documentation, but it&#8217;s very it&#8217;s small fry, right? It&#8217;s a few pages in their documentation on how to do it. So you&#8217;ve gotta, you&#8217;ve gotta be mindful of the fact that when you are coming at these data platforms, that you&#8217;ve got to be doing the modeling behind it as well.</p><p>And it has to be front and center to a project. Yeah, so that&#8217;s what I think about that.</p><p><strong>Shane</strong>: Yeah. 
And I agree with you. If I remember, a lot of my learning was from that grumpy old DBA. When you used to deploy something and it ran like shit, I remember the standard response from the grumpy old DBA. When you said, oh, my ETL loads are going too slow, they used to always reply with, compared to what? That was the standard response. And then, eventually, they&#8217;d go and help you tune it. They&#8217;d typically tell you it was your data model or your code that was wrong, not the database. And then, again, if I think back, I think you&#8217;re right that often the projects we were doing were expensive, and expensive &#8217;cause of the people and time. Therefore there was a whole hierarchy of mentoring, because you couldn&#8217;t afford to have somebody just rush off and do something that didn&#8217;t fit the way everybody else was working. So there was always a mentoring process where somebody more senior and experienced than you took you on the journey to learn to do things the way they did. And I think what&#8217;s happened is, as we&#8217;ve got new technologies, we&#8217;ve got new forms of democratization, and therefore people are able to do the work quicker and faster using those tools. And we&#8217;ve lost that mentoring process, that knowledge transfer outside the tool itself. And also people aren&#8217;t being taught what we got taught in the early days.</p><p>We did a podcast or a webinar and we were talking about data layered architectures, and a template that I&#8217;ve been working on. And one of the questions in that was, is the data persisted in the layer or is it virtualized, right? Or is it temporal memory? And one of the questions we got online is, what is data persistence? And I sat back and went, holy shit. And one of the comments from Chrissy, who was doing the webinar in Ramona with me, was, we know what persistence is because we&#8217;ve worked with tools when Excel didn&#8217;t save. 
And if you didn&#8217;t hit the little diskette and persist that data down to your laptop, and your laptop crashed, you&#8217;d lose it. Now we&#8217;ve got Google Sheets, where you type, it saves, it persists the data. We don&#8217;t have to care about that. And so I think for some of us, we forget some of the core foundational concepts that we just learned by doing things outside of data, to a degree, that we&#8217;ve applied. So if we think about that, then is the problem with data modeling right now a lack of tools? Really, there is still a lack of data modeling tools in our data stack. Or is it a lack of mentoring, and therefore is AI the answer? Do we actually now see AI bots or AI clones, or whatever we wanna call them, that actually become the data modeling mentors for people that don&#8217;t have a physical mentor like we did? What do you think?</p><p><strong>Andy</strong>: I think that the technology itself, and it could be Microsoft technology, it could be Oracle, it could be Databricks, Snowflake, so on and so forth, they don&#8217;t have any guided ways of creating a model. I reckon the last time that I used a piece of software that guided me through the process, that was very aligned with a very specific data modeling technique, was Analysis Services Multidimensional, where objects inside this model</p><p>were named after a specific modeling pattern. So your facts, your measures, and your dimensions, if we&#8217;re talking about Kimball dimensional modeling. So that was really the last time that I used a piece of software that was quite aligned and essentially had a wizard that would take you through a modeling process.</p><p>&#8217;Cause of course now we&#8217;ve got different modeling processes. We could model, yes, third normal form. We could use dimensional modeling, we could model it as Data Vault, and no vendor is locking themselves into a specific pattern, right? 
Vendors are saying, hey, you can bring this specific type of modeling to our software, so they&#8217;re not baking in a specific modeling process.</p><p>One of my real asks in terms of software is to start to bring in some of these modeling aspects. I would love functionality like slowly changing dimensions. This is a dimensional modeling concept which adds historical records to a reference table, to a dimension table.</p><p>And this is just my experience with the software that I&#8217;ve used. I really only see Databricks having done this, with slowly changing dimension type two in their Delta Live Tables, or Lakeflow Declarative Pipelines. DBT has implemented slowly changing dimensions in something called snapshots.</p><p>So it is there, but it&#8217;s a little piecemeal, and it&#8217;s not necessarily being massively called out in terms of: this is the modeling pattern, and this is the feature that we&#8217;ve implemented for you to be able to realize that modeling pattern. So then you were talking about AI, right? And this is where we start to get into productivity.</p><p>This is where we start to get into how someone with not much experience of something can ask AI to help them. And this is a classic case of where AI can help and start to move someone towards understanding how they work with the business, and how they would evolve a pattern as well. And like you were saying before, people can get hands-on with technology, they can get cracking with technology.</p><p>It&#8217;s binary: something&#8217;s gonna work or it&#8217;s not. If you model some data, you might not know whether the model you&#8217;ve created works until six months or a year down the line, when all of a sudden something happens that means the model isn&#8217;t flexible enough. It hasn&#8217;t been thought through; there hasn&#8217;t been enough collaboration to understand the impact.</p><p>I went to the Fabric London user group. 
I was talking at the user group, but I was talking about technology, right? I was talking about a specific feature within Fabric called Materialized Lake Views. A little segue from our conversation, but it&#8217;s a feature. It&#8217;s a feature in a piece of software from a vendor.</p><p>Johnny Winter was there, and he was talking about data modeling. He was talking about Sunbeam modeling. And this is a modeling practice that I realise I&#8217;ve used in the past, only because Johnny Winter has been talking about it, because it brings together a couple of areas of data modeling that I have used.</p><p>I just never thought there was this portmanteau of these things: BEAM, business event analysis and modeling, from Lawrence Corr and the great Agile Data Warehouse book with Jim Stagnitto, and Sun modeling, championed by Mark Whitehorn (which I now know is the pronunciation), with a central event, and then your sunbeams were your reference data, in terms of how you brought context to that data.</p><p>And of course that was just the modeling aspect. But Johnny then started to show a tool called Ellie AI, and that really got me thinking. So this is a tool that&#8217;s focused on data modeling. It&#8217;s focused on understanding how to build enterprise data warehouses, how to help someone and guide them through the process of creating a data model.</p><p>And it&#8217;s very different from something like a data warehouse automation tool, right? There are data warehouse automation tools out there, WhereScape and such, in which you pretty much have to know the modeling pattern while you are using them. Whereas all of a sudden I&#8217;m looking at Ellie AI and thinking, oh, okay, I get this now.</p><p>It is essentially an LLM, or maybe multiple LLMs, that&#8217;s been trained on data modeling, perhaps on architecture, layout patterns, technology even, &#8216;cause that can then help shape how it helps the user as they prompt their way to a data model. 
And at first I looked at it and thought, it&#8217;s just another AI tool.</p><p>There are millions of AI tools out there now. But I did think, hang on, if this is an AI tool that&#8217;s helping people data model, this can only be a good thing. It could only be of benefit to people to use something like this to help them through the process of modeling. They&#8217;ve got the technology; they can use Copilot or ChatGPT to help them write code.</p><p>But now they&#8217;ve got these tools with the subject matter expertise of data modeling to help them and guide them to a state where they might not have these problems in 6, 9, 12 months&#8217; time of a model that isn&#8217;t flexible enough, because they can prompt their way into flexibility. So that was what I was looking at specifically with the AI tools.</p><p>Shane, I&#8217;m curious what you think about somebody who doesn&#8217;t have much domain expertise in data modeling, but is working in the data space, using these AI tools for modeling.</p><p><strong>Shane</strong>: There&#8217;s a lot to unpack there. I&#8217;ve just made a whole page of notes. So let me go through from the beginning of what you talked about and replay it back, with my thoughts around it. I liked your point around opinionated tools. If I think back to previous waves of technology for the data space, the tool really supported one modeling pattern. And realistically it was Kimball. Kimball was the number one data modeling pattern that I ever saw in data warehousing in the old days. And that was because he wrote good books, he shared his content online with his blog, which was free and easy to access, and he ran great courses. So in terms of getting access to how to model, his was the most accessible content that you could find, and it made sense. I&#8217;m a big data vault fan. I like the physical data vault modeling pattern. 
I&#8217;m not a fan of the data vault BI methodology, but I find that content on, and access to, how to model using the data vault pattern of hubs, sats and links is incredibly hard to find.</p><p>It is poorly written. It is paid. And so I think that&#8217;s why, with the advent of DBT, when people started realizing they had to model data consciously, we saw Kimball take off again, right? Because the old content is still valid. Just on that though, I was really intrigued by your SCD two comment. So let&#8217;s take a segue on that, and I&#8217;ll come back to the AI stuff in a minute. Because I remember with the original ETL tools that we were using, there was never a native SCD two node. We always had to bloody well write that node ourselves. Then when the cloud analytics databases came out, it would&#8217;ve just made sense to me for SCD two behavior, that pattern of historical recording of changed data, to be a database feature.</p><p>Not a piece of code to detect the change and store it, but just a feature in the database to say, this table is an SCD type two table, because the database could take care of the change detection and the end dating or the flagging of an is-current record. And so it&#8217;s really interesting that SCD type two is one of the patterns that you use all the time when you&#8217;re physically modeling using the dimensional pattern, but somehow we still seem to be lumbered with writing some code that deals with it. Is that your experience? Are you still seeing people having to write code to implement the type two pattern?</p><p><strong>Andy</strong>: Yes, I am, basically, and this was even talked about a few days ago. I was at a little conference in Birmingham called Fab Fest, which was focused on Microsoft Fabric. 
I was speaking about a feature called Materialized Lake Views, which is essentially like Databricks&#8217; Delta Live Tables, or Lakeflow Declarative Pipelines as they&#8217;re called now.</p><p>&#8216;Cause they&#8217;re abstracting it away from Delta, because it&#8217;s not just Delta that this technology supports, but also DBT. And someone said, why have we not got this SCD functionality out of the box? Because it&#8217;s almost like slowly changing dimensions have transcended the specific methodology in which they would be used.</p><p>And what I mean by that is, when you started learning Kimball and you started learning dimensional modeling, one of the subcategories was dimensions, and one of the sub-subcategories was slowly changing dimensions and all of these different types.</p><p>So type two has probably now become the champion, right? It&#8217;s the one that will track changes over time by adding new rows of data and the associated metadata to keep it. Yes, there are type four and type six and type three, and all of those have their reasons to be implemented. With type three, you&#8217;ve got multiple columns which can store the current value and then the previous values, but they are a little bit more difficult to implement.</p><p>So most of the time people will use slowly changing dimension type two, because that&#8217;s the one that is the most relevant, the most prevalent, and the one that&#8217;s easy to implement. However, unless you are using tools that have this functionality built in, it&#8217;s not massively prevalent and soaked into the data landscape. Like I said, DBT has functionality called snapshots, and in Databricks&#8217; Lakeflow Declarative Pipelines you can declare an SCD type.</p><p>So this person at this conference, I remember, was saying, I&#8217;m still having to write my code to implement my slowly changing dimension. 
I&#8217;m still having to say, these are the columns that I would like to track.</p><p>This is the key that I would like to join on. These are the columns that I&#8217;ve defined for my metadata: my from, my to, my is active, if you have that, and any other metadata columns that you want in order to be able to track your changes. And then my last point on this is the medallion architecture. Now, I&#8217;m not gonna go into the ins and outs of whether we should call it the medallion architecture, because to my mind it&#8217;s a data layout pattern.</p><p>It&#8217;s not necessarily an architecture, but it has these different zones of data. We know bronze is raw, silver is cleansed, and gold is modeled. But silver, which is the cleansed data, hasn&#8217;t yet got to a stage in which it&#8217;s been modeled to a specific pattern. There are a lot of advocates that want to apply slowly changing dimension functionality to that silver data, but it&#8217;s not modeled yet.</p><p>It exists at the same granularity as the raw data. Obviously it&#8217;s gone through dedupe, it&#8217;s gone through cleansing and all that kind of stuff, but it&#8217;s not modeled yet, and we&#8217;re applying something that was a modeling pattern, or a modeling feature, to this store of data. So I find that quite interesting as well: SCD has almost been extracted as a feature of a modeling pattern and can now just be used as a way of tracking changes over time.</p><p>But to your point, I just don&#8217;t see enough of that functionality automatically added to database and data products. People are still having to do it themselves.</p><p><strong>Shane</strong>: And that affects adoption, because type two implementation is a simple technical pattern where we know what the steps are, right? We know it is detect change, insert row, start date, end date and, if you like, an is-active or is-current flag. 
We could actually just add all of those. And I remember in the early days, when we were constrained on database technology, where the servers were expensive and the cost of the licenses was horrendous, we had to optimize to reduce costs. You probably remember it: we would argue type one versus type two. We would type one by default, and we would type two where we knew there was value, because the cost of type two was higher than type one. Now with cloud analytics databases, we really don&#8217;t give a shit, we just type two everything because we can. And it saves us problems later. And people are more expensive than those databases. So we can be lazy to a degree, but there&#8217;s value in being lazy. What&#8217;s interesting for me is that type two is a very well-known pattern, yet it hasn&#8217;t become opinionated in tools. And when we talk about data modeling patterns, where we go snowflake versus star versus data vault versus anchor versus hook versus unified star schema, there are all these other patterns that are actually a little bit harder to be opinionated about, and therefore harder to bake into a tool. So if we can&#8217;t do the simple stuff, how do we expect to do the hard stuff? And then the last thing I&#8217;ll say, before we move on to the next part of the point you made earlier, is: your silver&#8217;s not mine. I think medallion has been great because it&#8217;s reinvigorated the conversation around layered data architectures and the value of them. But where you talk about your silver as being cleansed, I talk about my silver as designed and modeled. We have an opinionated data modeling pattern, which is concepts, details, and events.</p><p>That&#8217;s how we model what we call our silver. Our raw is historicized: we are effectively applying an SCD two type pattern on our raw data, for a whole lot of reasons. 
So again, the problem with medallion is that it&#8217;s a nice way of describing a layered architecture, but as soon as we get into any detail of what you&#8217;ve got in your layer, it&#8217;s not what I&#8217;ve got in my layer. And that&#8217;s okay.</p><p>As long as you tell me that you are cleansing in silver and it matches the structure of your source, and you tell me you model in gold, I get it. Now I get your architecture, just by you using those words for the opinion you have applied. So I think that&#8217;s the key: it&#8217;s no longer &#8220;there&#8217;s only one way to medallion.&#8221;</p><p>It&#8217;s &#8220;what do you mean by silver?&#8221; So let&#8217;s jump on then and talk about tools. In the past we had tools like ERwin and, oh God, EA Sparx. We had some really hard-to-use data modeling tools to draw diagrams of what our models looked like, and typically they were completely disconnected. We found it really hard to draw a diagram for our conceptual and physical model and then easily get that model instantiated in our database. And with the modern data stack, the previous wave, now that it&#8217;s dead, we saw tools like Ellie and SqlDBM; we saw visual modeling tools come out. But what was interesting is they became a category; they were a part of your stack.</p><p>So if you had a modern data stack, you&#8217;d end up with five to 10 different tools to do your end-to-end processing. And we&#8217;ve seen a lot of consolidation in the market. We&#8217;ve seen a lot of those tools that were part of the stack disappear, or get acquired, or become features in one of the other tools. We haven&#8217;t seen that for data modeling. We haven&#8217;t seen the data modeling capability being brought back into those end-to-end stacks. 
And again, I don&#8217;t do a lot of work in Fabric, but I don&#8217;t think, apart from the Power BI and SQL Server Analysis Services part of the Microsoft stack, they&#8217;ve ever really had a modeling tool, have they?</p><p>There&#8217;s no tool I would go into that would help me create a conceptual or a physical data model and instantiate the physical model in a database within the Microsoft stack. Or is there?</p><p><strong>Andy</strong>: So this is probably one of the most asked questions in forums around the data modeling aspect, because as you&#8217;ve said, we&#8217;ve had tools over the years that have enabled us to do data modeling. Even SQL Server Management Studio, which you can download for free, has got this almost live data modeling process attached to it, where it&#8217;s very much a physical design pattern.</p><p>You can&#8217;t logically design something in its interface. In Fabric, we&#8217;re still there. We haven&#8217;t got anything that&#8217;s gonna help us logically design a data model out of the box. There&#8217;s nothing that we can start with and say, okay, I wanna start with the conceptual data model.</p><p>I wanna go down into a logical design, and then finally a physical design of my lakehouse, of my warehouse. We&#8217;re still having to use other tools to be able to do that, whether people still use Visio, or whether they&#8217;re using other things, AATE or SQL database modeler.</p><p>I think even Toad. For these kinds of modeling tools, one of the things that I didn&#8217;t like was when Microsoft deprecated some of the data modeling tools that were available within Visual Studio. I just thought, you&#8217;ve got to have something that can help a process. When I look at Fabric, or even Databricks, Snowflake, all of these cloud vendors, they are the canvas, they are the paintbrush, but they don&#8217;t help you draw the picture.</p><p>You need a process to go and help you draw that picture. 
And there are cloud tools that help you do the data modeling. I think a lot of them are paid, because to go from conceptual to logical and then physical is a natural progression. People want to be able to create the physical data models.</p><p>Ultimately, yes, some people might be a little bit annoyed that they&#8217;re forced to go through the conceptual and the logical modeling processes before they get down to the physical. But then they want to be able to click a button and have it generate the code necessary for them to run and create that physical model.</p><p>No, I don&#8217;t see anything. If we&#8217;re talking about Microsoft Fabric specifically, I don&#8217;t see anything in Microsoft Fabric that&#8217;s gonna help you from the conceptual all the way down into the physical. </p><p><strong>Shane</strong>: And again, software should be opinionated. So the way we do it is I create the conceptual model, and then it generates a physical model without me doing anything, because our physical model is opinionated. Our conceptual model is open in terms of the things you create, the concepts, the who-does-whats, &#8216;cause I&#8217;m a great BEAM fan as well. </p><p>Lawrence Corr&#8217;s book is one of the ones I read early. When I had my consulting company in New Zealand, pre-COVID, before things went online, we used to fly him over as often as we could to teach our customers how to BEAM and event model their data.</p><p>&#8216;Cause it was so useful. And that idea of your conceptual model of who does what, your core concepts, depends on the industry, depends on the business case, the usage, the actions and outcomes you want to take. So it is a little bit less opinionated, to a degree. But once you&#8217;ve got that sorted, you can make your physical modeling pattern incredibly opinionated. 
And that&#8217;s why I think these disconnected data modeling tools are struggling, from what I can tell. And you&#8217;re starting to see them now bring in the ability to actually generate the ETL, the code to deploy the model and load the model.</p><p>Because that&#8217;s the space you have to be in. You have to be able to create the model and test that it has value. And so, coming back to the AI tools: I&#8217;ve played a lot with the LLMs, played a lot with the bots, and I can see that as a helper friend, as an assistant, it allows me to ask questions, provide some context around the industry and the use case, and get back a starter-for-ten model. And that&#8217;s really useful. But if I think back to the mentoring I had earlier in my career from those grumpy data modelers, what they used to do was stress test the model. There was something magical about the way you could give them a data model, they could just look at it, and then they could call bollocks on the things you got wrong.</p><p>It was just that innate pattern matching in their heads, &#8216;cause they&#8217;d done it so often. They could go, yeah, that relationship&#8217;s not a one-to-many, it&#8217;s a many-to-many. It&#8217;s not gonna survive. Or, the rate of change on that table is gonna be horrendous. You&#8217;re gonna blow out your Oracle instance, or you&#8217;ll have to upgrade to Oracle RAC, which will cost you two legs, one arm and your three newborn children. I wonder if that is actually where AI tools have to take us: effectively an agent model, right? Where there is one agent that helps us build the model and one that helps us stress test the model. Which goes back to that point you made a long time ago, which was that when we work with technology, we get instantaneous feedback on whether the technology&#8217;s working or not.</p><p>When we work with data modeling, we don&#8217;t. So maybe that&#8217;s where the AI tools need to take us: one agent to help us create the model. 
And then another agent, which is the grumpy old data modeler that tells you where you got it wrong. What do you think?</p><p><strong>Andy</strong>: So I think AI can help all the way through that process. And this is where we start to look at AI less as a technical tool that will help us do something, and more as a sounding board, as something that we can ask to be quite antagonistic. As you said, and as I raised earlier about AI tools helping with data modeling, you could go to an AI tool and say, someone has told me to design x, y, z system, for argument&#8217;s sake.</p><p>Let&#8217;s say it&#8217;s a data warehouse or a lakehouse, and we&#8217;re gonna use a data modeling technique that is best for reporting and analytics. And the LLM might reply and say, okay, well, here are a few modeling patterns, and dimensional modeling is the one that you would use for a data warehouse, and so on and so forth.</p><p>So let&#8217;s say they then pick that and carry on. They&#8217;ll then ask it questions like, I&#8217;ve got all these different entities in my source system; how will a dimensional model help me? So it&#8217;ll then work through those entities and say, okay, you&#8217;ve got these entities that look like they can be grouped together.</p><p>Perhaps that&#8217;s a dimension. And it&#8217;ll work through the process with you. So let&#8217;s say you&#8217;ve got that sorted, and you&#8217;ve applied your critical thinking and not just accepted everything that the LLM has given you. You&#8217;ve gone back and checked. Maybe you&#8217;ve Googled, maybe you&#8217;ve asked other people that are experts in that area and said, okay, you know what?</p><p>I&#8217;ve spent a whole day generating this model. It would&#8217;ve taken me two weeks if I&#8217;d had to learn the theory and then do it. What do you think? I&#8217;m sense checking it. 
Someone might come back and say, okay, we can tweak a couple of things, but actually that looks pretty good. Great. So we&#8217;ve got our starting point.</p><p>Then you talked about stress testing, and yes, this is where you can ask the LLM to be antagonistic and say, this is the data model we&#8217;ve got here, because the business wants to be able to report on these X number of attributes and this is how they want to measure it. But I know that there are other source systems and things like that.</p><p>Can you tell me what problems this model might have in the future? Essentially, you are asking the AI to try and do some future proofing and some troubleshooting for you. And it might come back with some generic points: perhaps your product dimension isn&#8217;t deep enough.</p><p>Or what about other entities that you might need to bring in, that you haven&#8217;t yet got links to in your fact tables? But it&#8217;s gonna help surface potential problems for you to then deal with. So I totally get that as well. And then the third point that I wanna add is about anticipating changes.</p><p>We touched on Lawrence Corr and the Agile Data Warehouse book a little bit earlier, which, yes, is something that I go back to constantly, and we&#8217;ve got BEAM in there to help us do this. But the Agile Data Warehouse book is also there to help us iterate over a model.</p><p>So perhaps we then say to the LLM, look, this first version of the model, okay, we&#8217;ve gotta set it in stone now because we&#8217;ve got project deadlines. We need to get the data in because the business is gonna build X amount of reports. And we tried to make the model as generic as possible.</p><p>&#8216;Cause we don&#8217;t want a report-driven model. We want a data-driven model in here. But help us understand what we might need to do to modify the model and be a little bit agile. 
So I would say that using the AI tools to help us generate the model is just the first part. Most of it is gonna be about asking the LLM to be quite antagonistic about the model it&#8217;s generated.</p><p>So yeah, I think that it&#8217;ll help us build the model, but the most important thing is to antagonize the model, test it, hopefully point people in the right direction in terms of future proofing, and certainly surface issues that might happen in the future. They might need to fix those issues. I don&#8217;t think I&#8217;ve worked with a dimensional model yet that doesn&#8217;t incur a certain amount of technical debt in how it&#8217;s implemented.</p><p>But if you can mitigate those things earlier on, that&#8217;s just gonna be of benefit. So that&#8217;s my thoughts on that, Shane.</p><p><strong>Shane</strong>: Yeah, and that&#8217;s that problem between doing a model quickly that gets value now and trying to boil the ocean</p><p><strong>Andy</strong>: Yeah.</p><p><strong>Shane</strong>: for all changes in the future, or an enterprise data model. Again, going back to that anti-pattern we had in the days of a data modeler sitting in a room for two years doing one enterprise data model to rule them all that nobody implements. One of the things we&#8217;re doing is experimenting in this space. We have a bunch of partners who use our platform, and we got one of them to experiment. We had a use case around Google Ads: the partner needed to bring Google Ads data in and deliver it for a customer. And so that first part, the opinionated agent that&#8217;s gonna help you do the initial model, was really interesting. What we found, first, was that we were lucky that Google Ads effectively gave us context about the tables. It brought in metadata that described the tables quite well. So that was really valuable for the agent, because it now had a bunch of hints about what the source data looked like. 
The next thing was we have an opinionated design pattern for the way we model.</p><p>And our agent already knew about that, so it knew what the rules of the game were. It knew it couldn&#8217;t dimensionally model it, it knew it couldn&#8217;t anchor model it, it knew all the patterns it couldn&#8217;t use. It knew what our pattern looked like. So that opinion was effectively already in the agent as a bunch of rules. We also gave it the information product canvas that the partner had done, which is a description of the actions and outcomes and business questions that need to be answered first. And what that gave the agent was a boundary, to say, don&#8217;t model all the Google Ads data; only model the data that&#8217;s gonna support this outcome. So again, it gave it a boundary that let it lightly model. Now, what we didn&#8217;t do was the bit that you are just raising, which is to then stress test change, right? We didn&#8217;t ask what&#8217;s gonna happen next; we deal with that in different ways at the moment. But I&#8217;m gonna think about that one really well. So the boundary of opinion that was given to the modeling agent meant it did a good job. If I had just said to an LLM, I&#8217;ve got Google Ads data, model it, well, I have a theory, and I&#8217;m gonna go test this: I reckon every time I ask it, even though it&#8217;s non-deterministic, it&#8217;s gonna Kimball model it. The reason I say that is, if you think about why Kimball and dimensional modeling has become the number one modeling technique for DBT, it&#8217;s because if you are an analyst moving into the engineering space, and you hear that you need to model some stuff, and you use an LLM or you go Google search, you&#8217;re gonna come back with dimensional modeling every time. &#8216;Cause as I said, it&#8217;s the most freely available, well-described content in the world for data modeling in an analytics space, in my opinion. 
So therefore the LLMs, which trained on everybody else&#8217;s content without paying for it (won&#8217;t go into that one), are gonna have their richest content on dimensional modeling. So the interesting question on that one is: if I go and ask an LLM to model it with no constraint, no opinion, I bet it&#8217;ll come back with Kimball modeling.</p><p>What do you think?</p><p><strong>Andy</strong>: And I think the LLM would probably not be displaying any emotion. So let me expand on that. You touched on data vault earlier, and you said that&#8217;s a data modeling pattern that you like. If we go back several years, actually decades, we were talking about Kimball versus Inmon.</p><p>We were talking about dimensional modeling versus third normal form. Then Dan Linstedt comes along and we&#8217;ve got data vault. There was lots of emotion involved in people comparing these technologies. In fact, all of those people, Bill Inmon, Dan Linstedt, Ralph Kimball, they all said, and it&#8217;s in their books, that these modeling patterns are complementary.</p><p>Bill Inmon would say your enterprise data warehouse can be third normal form, but for reporting and for feeding into analytical tools, the Kimball model, the dimensional model, is great. Kimball would say, ah, okay, you can do that, but you can also do your enterprise data warehouse in dimensional modeling.</p><p>Okay, there might be a little bit of difference of opinion there, but they were still complementary. Even data vault: you can&#8217;t just plug data vault straight into analytics tools. And I will admit and agree that it&#8217;s great for tracking changes over time at the lowest level of granularity, giving you the most flexibility to do what you want with it afterwards.</p><p>But a dimensional model is very good for plugging in. But of course, we see all those debates out there. 
This versus this versus this. The LLM will take those arguments into consideration, but it has no bias. It has no emotion attached to any of those modeling patterns. So I would say that an LLM will probably come out with dimensional modeling, because it&#8217;ll reason that, okay, you can store your data this way, but it&#8217;s not going to be what you need to design the model that&#8217;s going to be delivered to the business.</p><p>And I would like to test this as well, and antagonize an LLM around these modeling patterns and ask it, okay, I&#8217;m gonna be designing this; what do you think the best model is gonna be? And then I might add in a few tripwires and say, what about data vault? And what about this?</p><p>And I&#8217;m hoping that the LLM would say, yes, you can use those modeling patterns, but it&#8217;s generally agreed that they are complementary, and that you can add a dimensional modeling pattern onto a third normal form or a data vault. But I am hoping that the AI has less, shall we say, emotion attached to picking that data modeling pattern.</p><p><strong>Shane</strong>: If it&#8217;s got no emotion, we should ask it for a definition of data product or semantic layer. </p><p><strong>Andy</strong>: Yeah.</p><p><strong>Shane</strong>: We&#8217;d love to argue. I&#8217;m gonna disagree with you on that one. And the reason is I don&#8217;t think LLMs are reasoning; they are pattern matching and tokenization based on a bunch of content. My hypothesis, and it is just a hypothesis, is that the Kimball and dimensional content is far more widely available, and therefore the model has a bias towards using it. However, it&#8217;s just a hypothesis. And one of the things that Joe Reis has been doing as part of the Practical Data Modeling community is just testing stuff live with a bunch of people. I&#8217;m gonna suggest you and I do that. I suggest that we figure out how to do a live session. 
Bring up multiple LLMs and, with a bunch of other people watching or helping us, we just bash the snot out of it and try and see: is there actually a bias for a modeling technique? But if I take that away for now, your comment around Sunbeam really intrigued me, because Johnny&#8217;s mentioned it before and I struggled to find any content around it. It&#8217;s one of those data modeling patterns where it is actually quite hard to find anything about it. And good point, I need to get somebody on the podcast to come and explain it. So if I wanted an LLM to assist me in designing a Sunbeam model, because I&#8217;m opinionated that that&#8217;s the model I prefer for whatever reason, I think it&#8217;s gonna struggle, and that will be another test. Maybe we can give it a bash on that too, to see. So I think if you&#8217;re using an LLM to assist you in modeling and you are using a well-known pattern that you are opinionated about, so Data Vault, dimensional, third normal form potentially, you probably don&#8217;t need to use a lot of reinforcement with the LLM because it knows what you&#8217;re talking about.</p><p>If you wanted to bring in some of the more obscure modeling patterns, like Sunbeam, I think you&#8217;re gonna actually have to pass it a reinforcement of that content. So one of the interesting things about Joe recently, what he does with the Practical Data community, is he does live sessions where we all jump on screen share and we actually test out a hypothesis. And so one of the ones that he did was this idea of: could we start with a business problem in an industry none of us knew, get the LLM to help us understand the industry, create a conceptual model, move it to a logical model, move it to a physical model, and actually implement that physical model in a database? And that was fun. But what was interesting for me was observing the way Joe approached it, and he comes from more of a data science background than I do. 
So he tended, from what I saw, to approach it from an EDA, an exploratory data analysis, approach. So he would be looking at the data sources and trying to understand the data that&#8217;s coming in, because that&#8217;s how he thought. Whereas for me, I come from more of a business background. That&#8217;s how I&#8217;ve been trained.</p><p>So more of that who does what, business process stuff. A lot of the BEAM first part of the book. And so for me, I typically wanna understand the who does what, the core business events, the core concepts, and that&#8217;s how I model. So again, I think in terms of an AI assistant that helps you model, you probably want to train it around your modeling process.</p><p>Or maybe you don&#8217;t actually, maybe the LLM should decide what inputs it needs, rather than you being opinionated on how you typically model. What do you think? First of all, which way do you model? Do you think source specific and understand the data first, or do you try and understand the core business events?</p><p>And then do you think you should be opinionated to the LLM about which way it should approach it?</p><p><strong>Andy</strong>: I&#8217;ve always modeled from the source system side of things and then taken into consideration the requirements from the business. And I know why that is, and that&#8217;s because I am fundamentally a technical person. So I&#8217;ll always want to default to looking at the technical aspects of things. Hey, I can look at source systems, I can understand schemas, I can understand tables and columns and the domain values within that, and then I can look at matching that to what the business wants.</p><p>And I&#8217;ve worked with people that work the other way around. They are interested in the art of the possible. They&#8217;re interested in what does the business need to make the decisions, and then let&#8217;s go and find what we need from those source systems. 
So I&#8217;ve moved my needle a little bit more towards the art of the possible, right?</p><p>This is what the business ultimately needs in terms of decision making. If they need to be more operationally efficient, if they want to take advantage of opportunities in the market, challenge competitors, whatever, I then go and look at those source systems and see what&#8217;s available. And then sometimes you really can&#8217;t realize some of that data that is needed for the business, right? Because it&#8217;s just out of the hands of those source systems. And I suppose you could calculate it and you could infer it from that data, but I&#8217;ve tended to work that way. And then interestingly, when you were talking about AI and Joe working through the kind of modeling aspects from that perspective, yes, I can totally see someone asking the model to work with them in a specific way and say, right, here is my source system.</p><p>This is what I need to do. Now with those source systems, if you&#8217;ve got access to schemas and you can provide that schema to an LLM, and let&#8217;s say it&#8217;s an LLM that&#8217;s secure, it&#8217;s within your organization or it&#8217;s within your data boundaries, you can give the LLM that schema and then you ask it to model that for you.</p><p>You may then say to that LLM, I need to join it with other systems as well, and this is the system and this is the schema. So it could then help you generate the model to incorporate multiple systems, or show you some examples of how you can join those systems together.</p><p>What I wouldn&#8217;t want an LLM to do is just go crazy with the business requirements and say, okay, this business operates in this domain, so this is all of the data that they need to be ultra competitive and at the top of their game. 
And then I&#8217;ve gotta scrabble around trying to desperately find where I&#8217;m going to get this data from, the data that the model says is going to help me build the perfect data model for the business.</p><p>I guess I&#8217;m a data modeling pragmatist. I would look at the source systems and I would look at how they support the business in what they want to do. And then I would probably work with the LLM in that fashion.</p><p>I&#8217;d be saying, okay, here&#8217;s the framing. Here&#8217;s the context. This is what we need to measure, but this is what we&#8217;ve got in our source systems, and these are the hard facts about what we&#8217;ve got in the source systems. Help me build that model that can realize what the business wants, but work within the limits of what I&#8217;ve got with those source systems.</p><p>So that&#8217;s what I would do.</p><p><strong>Shane</strong>: I was just thinking then, back to my point about enterprise data modelers sitting in the cupboard for two years to come out with the most beautiful data model ever. That&#8217;s basically a person going into an LLM now and saying, you have no constraints. Here&#8217;s the industry. Gimme a data model from scratch that does everything.</p><p>It&#8217;s quicker. It&#8217;s not two years, but it&#8217;s just as unimplementable, that&#8217;s a bad word. One of the things we found was if we think about ChatGPT 5 and this idea that we didn&#8217;t get the AGI that we expected, but what we got was a better interface. So instead of having to decide what type of LLM foundational model you wanted to use, you tell it what you want to achieve and it works out which model is the best fit for you. And one of the things we found when we were experimenting was, if we had one agent, our agent&#8217;s called ADI, if we asked her to do everything, she was okay at it, but she wasn&#8217;t great. Then we broke her out into sub-agents. 
So we had ADI the data modeler, ADI for what we call change rules, like the ability to write ETL, and we gave her clearer opinions and a clearer boundary.</p><p>We got a really good uplift in the accuracy, right? And in the evals that we got back, in terms of, we got better responses that made our lives easier. So one of the things, when I&#8217;m teaching my canvas, I actually use something that Lawrence talked about, right? Which is modeling based on source, modeling based on report, or modeling based on business process. It&#8217;s one of the things that stuck with me for many years. And so now I&#8217;m thinking, based on that, what we really probably need is an agent that we can go to and say, give us a model based on the source system. And another agent where we can say, give us a model based on the core business events or business processes, or the who does what. And a third one where we go, give us a model based on outcome. And then we give it the constraints of what we actually wanna achieve in the next iteration, what do we actually have to deliver, and we get it to model it for us. So based on those three, tell us what the actual optimal model is to achieve this outcome and then stress test it for me. So that is getting a bunch of inputs, right? Because if I think about it, that&#8217;s what great modelers do. They take a stance, but then they always jump. You&#8217;re technical, you go source first, but then you go, right, now what are the core business events?</p><p>Customer orders product, good. I don&#8217;t need to worry about store ships product in this iteration, so I&#8217;m only modeling in that boundary. And then, okay, what do we know we have to deliver? Is it a dashboard, is it a data service? And then you are using that to iterate the initial stance of that model until you get something that is fit for purpose.</p><p>If I think about it, that&#8217;s what we do as humans. So that&#8217;s probably the process we need to encourage the LLMs to do. 
What do you think?</p><p><strong>Andy</strong>: I like that idea of asking each of those models to come up with their specific version based on those constraints. You do it by source system, you do it by business process, and then you get them to antagonize each other in terms of getting to a realistic result, taking on board each of those aspects.</p><p>And what did make me laugh in my head when you were talking about a great modeler is that in the Rocky films there was Apollo Creed, who said to Rocky, you fight great, but I&#8217;m a great fighter. And I think that about the modeling domain as well.</p><p>You can point at a person and say, you model great, but I&#8217;m a great modeler. And that&#8217;s just built up through experience. That&#8217;s just built up through battle testing models that you&#8217;ve created over the years and iterating over those models over the years.</p><p>And like you said a couple of points before about the LLMs generating these models and coming up with dimensional models, because that&#8217;s what the prevalent documentation will have: those models have been trained over that documentation, so those models will be like, ah, okay, this seems to be the most popular way of doing things. So then when those LLMs are generating a model based on a source system, it&#8217;s gonna be based on the context of them understanding that source system and what the output of that source system is.</p><p>I look at something like Dynamics, which has generally been something that is quite difficult to model when so much customization happens within the platform. There is no real vanilla implementation of Dynamics, which means that the LLM is gonna have to understand business context, not just source system, because I guess the source system is just going to have all these entities where it might think, I don&#8217;t have any information, because those sorts of things haven&#8217;t been discussed before.</p><p>
And as human beings, I guess we can reason over those things and we can hypothesize and we can make a best guess and iterate over it. Perhaps the LLM can&#8217;t really do that because it&#8217;s been trained on previous data, previous examples, and it doesn&#8217;t have the ability to think outta the box if it hasn&#8217;t encountered something before.</p><p>Yeah, interesting point there, Shane.</p><p><strong>Shane</strong>: Although I would posit that it probably has encountered it because it&#8217;s got access to information outside of data warehouse data modeling. So it&#8217;s gonna have all the books on Dynamics implementations, it&#8217;s gonna have all the blog posts of people who have customized it. But yeah, I get your point, especially things like SAP, where, who knows how that bloody thing works.</p><p>It&#8217;s gonna have more knowledge than I do on that, but less knowledge than an enterprise SAP data modeler who does it for a living. So again, I think it comes back to being clear about when you want to provide an opinion and when you don&#8217;t. So when you wanna provide an opinion, &#8216;cause it&#8217;s important that the LLM or the agent stays within that boundary for you, or where you just leave it because it in theory has access to more expertise and knowledge than you have in a specific space. And so if we go back to medallion, if we go back to layered data architectures, that&#8217;s actually a really good place where you might wanna be opinionated. Because if you are saying, I want you to help me do a conceptual model, you probably don&#8217;t care about your layered architecture. But if you are saying, I want you to help me create physical models that I&#8217;m gonna implement, I know that my layered architecture is, I have a designed layer that is concepts, details, and events, and my consume layer is a one big table. And I know that your cleanse layer is source specific data structured with data being cleansed. 
And your gold layer is, I&#8217;m guessing, a dimensional model.</p><p>If each of us had the same agent, but we put in those opinionated boundaries, we are gonna get back physical data models that are more fit for purpose for us to implement in our platforms of choice. And realistically, if the LLM came back to you and said, throw away your dimensional models and do one big table from cleansed, you are probably gonna look at that and go, the cost of change is quite high.</p><p>I really need to understand why you&#8217;re making me do that, because I have to learn a whole lot of new things and I have to rebuild everything. So I think, again, being opinionated where it makes sense to be opinionated is important. And then being free and hippie and open to whatever the great AI agent in the world tells us as a start for 10, and then stress testing it with expertise.</p><p>I think that&#8217;s where we&#8217;re gonna end up. What do you think?</p><p><strong>Andy</strong>: So I think that opinionated aspect is quite important in terms of how you approach the LLM, because being opinionated about something means that you&#8217;ve got strong convictions in how you want to do something. Which then also means that someone else could be opinionated and have strong opinions in how they want to do something.</p><p>And as we know over the years, when you&#8217;ve been working with other people that have experience doing these things, there might be a crossover, there might be some differences of opinions. How is the LLM going to know those differences of opinions that you can work out and collaborate on? Yes, it&#8217;s got this body of knowledge.</p><p>Hopefully the people with the opinions have been providing information, writing books, writing blog posts that the LLM can take on board and learn and use. But if I come to a project and use an LLM to create my medallion, 
and I also ask it to create my data model based on my biases and based on my experience and the way that I wanna do things, I&#8217;m probably gonna guide the LLM to a certain conclusion.</p><p>Someone else is gonna come along and say, I wanna start this project. I wanna lay my data out. Do you have some thoughts about how I wanna lay it out? I have used this in the past and then I wanna use data modeling to do it. It might come up with a slightly different outcome based on your own strength of conviction and your biases that you&#8217;ve put into the LLM.</p><p>It might come back with something that is generic at first, but then you might say to the LLM, actually, I don&#8217;t want my ization in silver, I want it in raw because that&#8217;s traditionally where I&#8217;ve put it. Whereas someone else could say, okay, why have you asked me to put my ization in raw? I&#8217;ve learned that it&#8217;s supposed to be in silver. But if we remove the LLM from this conversation for a second, it&#8217;s almost like we&#8217;d get the same outcome if it was just humans.</p><p>Because humans would go into an organization, implement a data platform, and based on their experience and biases, they would implement it a certain way. A different set of people could go into that same organization and implement it slightly differently.</p><p>So I don&#8217;t think we&#8217;re necessarily solving the problem of these differences of opinions or the convictions that you have. It&#8217;s just using another tool to help possibly reason, to help possibly have this entire body of knowledge and experience at our disposal that we can interrogate, and hopefully get to a more human slash AI reasoned outcome.</p><p>That&#8217;s what I use it for. So I would use AI to help me model, based on my experience, but I&#8217;m also not going to be arrogant enough to think I know everything there is to know about data modeling. 
So I also want to ask it questions about, have we thought about it this way or have we thought about it the other way?</p><p>Just so that I can bring a little bit of critical thinking to it as well, and then evaluate the outcome. Yeah, it&#8217;s an interesting point there, Shane.</p><p><strong>Shane</strong>: I like where we&#8217;ve got to, &#8216;cause it goes back to where we started, which is if you are new to the data domain and you wanna understand data modeling, there really isn&#8217;t a lot of great content around apart from the Kimball stuff. And there are other techniques. So now LLMs give us access to ask questions and effectively get educated on what the art of the possible is.</p><p>What could I do? The second thing is the mentors, the grumpy people who could just look at your model and tell you where you got it wrong. They&#8217;re not really so prevalent in organizations anymore for some reason. So again, using the LLM to provide that expertise, that rigor, that stress testing, that feedback is really valuable. There is another lens, and we don&#8217;t have time to go into detail, but I&#8217;ll just drop it in here &#8216;cause maybe we come back and have another chat about this idea when you&#8217;ve thought about it a bit, which is centralized platforms may disappear. So if we think about CRMs, they are centralized platforms that everybody uses with a core bunch of features. They&#8217;ve been built in a way that becomes reusable. And what we&#8217;re seeing now with vibe coding is actually you could build one feature, one app that does one thing really well, really quickly, and it doesn&#8217;t need to be part of a shared platform. And that&#8217;s gonna be a really interesting change in the market when that happens. That also means that you can vibe code something that has 30,000 lines of code. And while you should know what it does, potentially you don&#8217;t need to care if it&#8217;s safe and secure and it does the job. 
You don&#8217;t need the expertise to understand how it does it. If we apply that to the data domain and this idea of moving away from shared data platforms, say I&#8217;ve created this information product that does one thing well, and I&#8217;ve got the LLM to design five different architectural data layers using 17 different data modeling techniques and a bunch of code I don&#8217;t understand. As long as it doesn&#8217;t have to be reusable, then maybe we don&#8217;t care. And again, I&#8217;m old. I find reuse really valuable. I find expertise and shared language really important. But maybe the new world is one-and-done information products where the LLMs are doing everything completely differently each time you deliver a product. But we&#8217;re out of time.</p><p>So I&#8217;m gonna leave that one on the table, maybe have a think about it, and that could be a good follow-up conversation around how would we use AI and LLMs to remove the need for reuse and shared platforms in the new age world. But before we finish off, how do people find you? How do they see what you&#8217;re reading, what you&#8217;re writing?</p><p>You&#8217;re obviously spending a lot of time going to some great conferences, so what are those conferences and how can people find you?</p><p><strong>Andy</strong>: Yeah, so I think the first thing is Linktree. So Linktree is basically my go-to in terms of a jumping off point for people to find me. So that&#8217;s basically Linktree slash Andy Cutler, so A-N-D-Y-C-U-T-L-E-R, &#8216;cause that&#8217;ll take you to my company, so data high.com.</p><p>That&#8217;ll take you to my community blog, which is serverlesssql.com. There&#8217;s my Bluesky account, there&#8217;s YouTube, which is data high, and then my LinkedIn as well. So the Linktree, Andy Cutler, will take you to everything you need. As for the conferences, 
I am in the Microsoft space and predominantly Fabric, which is where I spend, as the Fresh Prince of Bel-Air would say, most of my days.</p><p>The next conference is over in Oslo, so that is Fabric February. There&#8217;s obviously SQLBits, which is one of the UK&#8217;s biggest data conferences as well. So I would say I generally post on LinkedIn: blog posts, opinions, lots of conversation as well.</p><p>And as I said, yeah, Linktree, Andy Cutler, that&#8217;s where you&#8217;ll find me.</p><p><strong>Shane</strong>: Excellent. Alright, hey, thanks for a great chat around AI and LLMs and data modeling, and I hope everybody has a simply magical day. </p><h2>&#171;oo&#187;</h2><div class="pullquote"><p><em>Stakeholder - &#8220;That&#8217;s not what I wanted!&#8221; <br>Data Team - &#8220;But that&#8217;s what you asked for!&#8221;</em></p></div><p>Struggling to gather data requirements and constantly hearing the conversation above?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Bu2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" width="387" height="342" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:342,&quot;width&quot;:387,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19725,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/160520537?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Want to learn how to capture data and information requirements in a repeatable way, so stakeholders love them and data teams can build from them, by using the Information Product Canvas?</p><p>Have I got the book for you!</p><p>Start your journey to a new Agile Data Way of Working.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://adiwow.com/168&quot;,&quot;text&quot;:&quot;Buy the Agile Data Guide now!&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://adiwow.com/168"><span>Buy the Agile Data Guide now!</span></a></p><h2>&#171;oo&#187;</h2>]]></content:encoded></item><item><title><![CDATA[Recording the path to your "AI Agent" responses]]></title><description><![CDATA[If you can't see what path was taken, you can't safely experiment with that path]]></description><link>https://agiledata.info/p/recording-the-path-to-your-ai-agent</link><guid isPermaLink="false">https://agiledata.info/p/recording-the-path-to-your-ai-agent</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Mon, 22 Dec 2025 16:53:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!TrNs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>How did you get that answer?</h2><p>As Nigel and I keep testing out different use cases for where our Ask ADI capability can help data professionals reduce the cognition and effort required to do complex data work, we stumbled on an interesting pattern we both use.</p><h3>Building the House while living in it</h3><p>As we are using ADI in anger to do the data work for our AgileData.team Fractional Data Service customers and we are also 
extending out the capabilities in our AgileData.cloud data platform at the same time, we found we would often use ADI to help us with a task and then ask her <em><strong>&#8220;How did you get that answer&#8221;.</strong></em></p><p>This would often be triggered with ADI giving us one of three responses:</p><ol><li><p>Something that was right</p></li><li><p>Something that was wrong</p></li><li><p>Something that was unexpected</p></li></ol><h3>Something that was right</h3><p>For these ones we used the answer and moved on to the next step in the data work, singing with joy that she is doing what we designed her to.</p><h3>Something that was wrong</h3><p>She would give us an answer that could politely be called hallucinating; in the data world I typically think of it as just being wrong.</p><p>In this scenario we always want to understand the path she took to return the wrong results, so we can figure out what we would change to make her more accurate in the future.</p><p>So we would ask her <em><strong>&#8220;How did you get that answer&#8221;.</strong></em></p><h3>Something that was unexpected</h3><p>She would reply with something that made us go, wow that was unexpected, but actually bloody good, how the hell did she get that response?</p><p>Which meant we would next want to understand the logic path that she took to get that response.</p><p>Hence the follow up <em><strong>&#8220;how did you get that answer?&#8221;</strong></em> question.</p><p>In hindsight the pattern we were doing was a form of eval, which we were doing repeatably but manually.</p><h2>And then we started scaling it</h2><p>As we tweaked the Google Gemini models we used, tweaked what and how we stored Context in the Context Plane, tweaked the prompts and reinforcement objects we stored in the Context Plane and made accessible to ADI, and so on, we decided she was good enough to be out in the hands of our AgileData.network partners.</p><p>To be clear they were well aware that she was still in an 
&#8220;discovery&#8221; mode, and not to be let loose directly on their customers without the partner in the loop, and was not ready to be given directly to the customers themselves.</p><p>But one of our core AgileData principles is co-design. It&#8217;s how we can scale what we do so fast with just two co-founders. We know that putting patterns and features into the hands of our talented partners is a much faster way to iterate and scale.</p><h3>WTF did they just try to do?</h3><p>And of course you can imagine what happens when you put a fairly permissive AskAI capability into a data platform that covers the complete gamut from data collection to data consumption, and enables every data task in-between, into the hands of data practitioners who are by default very early adopters, who are working across multiple customers, in multiple industries and are trying to be at the edge if not bleeding on that edge &#8230;.</p><p>They started doing things that made us say, why the fook did they try and do that, what were they trying to achieve?</p><p>And more importantly, how the hell did ADI come up with that answer, and was it right or wrong, or excitingly unexpected?</p><h3>Log everything</h3><p>One of the patterns we apply to the AgileData.cloud is we log everything.</p><p>So of course we were already logging the question they asked ADI and the response they got.</p><p>But this didn&#8217;t help us work out the rest.</p><h3>Manual processes don&#8217;t scale</h3><p>So of course the first thing we tried was taking the question they asked and typing it into our own AgileData Tenancies.</p><p>You can imagine what happened.</p><ul><li><p>The data in our tenancies is different to theirs.</p></li><li><p>You typically won&#8217;t get exactly the same answer to the same question with an LLM based model.</p></li></ul><p>Epic fail and not a lot of use to help iterate ADI&#8217;s behaviour.</p><h3>Not just &#8220;how many sales were there&#8221;</h3><p>The natural language and
non-deterministic behaviour of the Google Gemini models we use under the covers for ADI is where a lot of the value resides.</p><p>And this means she can help with a lot more data tasks than the typical &#8220;Text to SQL&#8221; use case every data vendor is chasing as table stakes these days.</p><p>And so the questions being asked and the data work being done by our talented partners were a lot more than the simple <em>&#8220;how many sales were there&#8221;</em>.</p><p>Which meant we couldn&#8217;t just implement something like a judge pattern to make sure ADI was returning the correct number to each question.</p><h2>Iterate with simplicity</h2><p>As part of our Way of Working we always try to decompose the work to be done into the smallest chunks possible, starting off with simplicity and adding complexity later.</p><h3>Automatically log what path was used</h3><p>The first thing we did was extend the logging to include the path ADI took to get the answer, so we could review it after the fact.</p><p>Originally this was just logged in the background, but we found that this logic was actually useful to the Data Practitioner, so we surfaced it as part of the ADI response.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TrNs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TrNs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png 424w, 
https://substackcdn.com/image/fetch/$s_!TrNs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png 848w, https://substackcdn.com/image/fetch/$s_!TrNs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png 1272w, https://substackcdn.com/image/fetch/$s_!TrNs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TrNs!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png" width="1200" height="578.5714285714286" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:702,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:189334,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/182334326?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!TrNs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png 424w, https://substackcdn.com/image/fetch/$s_!TrNs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png 848w, https://substackcdn.com/image/fetch/$s_!TrNs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png 1272w, https://substackcdn.com/image/fetch/$s_!TrNs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ba820c-afbb-4d94-b4a8-823fa57e37d7_1753x845.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>You can see the path being returned here, expressed as a set of assumptions.</p><h3>Ask for feedback</h3><p>That still doesn&#8217;t tell us whether the path used was actually the best path, or whether the response helped the partner do the data work quicker and easier, so we reuse the feedback pattern from social media products: partners can quickly give each response a thumbs up or a thumbs down.</p><p>As our partners know they are co-designing with us, this quick and easy feedback loop provides value without slowing them down from doing the data work and delivering value to the customer.</p><h3>Should I trust the response?</h3><p>We also decided to experiment with providing a confidence score on each ADI response. We find this useful when evaluating the responses after the fact; it will be interesting to see whether it helps our partners or not.</p><h2>Another Use Case</h2><p>Here is another use case where we Ask ADI to help us model the data from Google Analytics.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3-NU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3-NU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png 424w, 
https://substackcdn.com/image/fetch/$s_!3-NU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png 848w, https://substackcdn.com/image/fetch/$s_!3-NU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png 1272w, https://substackcdn.com/image/fetch/$s_!3-NU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3-NU!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png" width="1200" height="843.9560439560439" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1024,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:457783,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/182334326?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!3-NU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png 424w, https://substackcdn.com/image/fetch/$s_!3-NU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png 848w, https://substackcdn.com/image/fetch/$s_!3-NU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png 1272w, https://substackcdn.com/image/fetch/$s_!3-NU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fd3208d-6b33-47f2-ac8a-416f9a5bb4d5_1753x1233.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s6xs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s6xs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png 424w, https://substackcdn.com/image/fetch/$s_!s6xs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png 848w, https://substackcdn.com/image/fetch/$s_!s6xs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png 1272w, https://substackcdn.com/image/fetch/$s_!s6xs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s6xs!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png" width="1200" height="264.56043956043953" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:321,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:109377,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/182334326?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s6xs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png 424w, https://substackcdn.com/image/fetch/$s_!s6xs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png 848w, https://substackcdn.com/image/fetch/$s_!s6xs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png 1272w, https://substackcdn.com/image/fetch/$s_!s6xs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e90072c-851a-481a-afe2-4117e7b1d654_1753x387.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Again we are logging the path ADI took to respond, but with slightly different language to make it fit the Context of the question more.</p><p>And ADI is 
providing a suggested next step and asking if she can help.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vlLr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vlLr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png 424w, https://substackcdn.com/image/fetch/$s_!vlLr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png 848w, https://substackcdn.com/image/fetch/$s_!vlLr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png 1272w, https://substackcdn.com/image/fetch/$s_!vlLr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vlLr!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png" width="1200" height="496.15384615384613" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:602,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:189274,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/182334326?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vlLr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png 424w, https://substackcdn.com/image/fetch/$s_!vlLr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png 848w, https://substackcdn.com/image/fetch/$s_!vlLr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png 1272w, https://substackcdn.com/image/fetch/$s_!vlLr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4e8d578-3c91-4c01-acc7-ed336b6098d6_1753x725.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And as mentioned before the user will always go somewhere we don&#8217;t expect.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cCT2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cCT2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png 424w, 
https://substackcdn.com/image/fetch/$s_!cCT2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png 848w, https://substackcdn.com/image/fetch/$s_!cCT2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png 1272w, https://substackcdn.com/image/fetch/$s_!cCT2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cCT2!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png" width="1200" height="596.7032967032967" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:724,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:315396,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/182334326?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!cCT2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png 424w, https://substackcdn.com/image/fetch/$s_!cCT2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png 848w, https://substackcdn.com/image/fetch/$s_!cCT2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png 1272w, https://substackcdn.com/image/fetch/$s_!cCT2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe26c984-d357-42fa-8e1b-404ab6492e97_1753x872.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The point of these examples is not that the Gemini LLM provided definitions for commonly used Google Analytics concepts; any LLM will do that, and you could just use ChatGPT on its own.</p><p>The point is that we end up with full visibility of what was asked, what the answer was and how that answer was derived.</p><p>And with that visibility we can decide which use cases are the most valuable to iterate Ask ADI on, based on the things our AgileData.Network partners actually need her to help them with.</p><h2>And one more thing</h2><p>And you can see where we will take this.</p><p>Given ADI is embedded in the data platform that the data practitioner is doing the actual data work in, we can increase the level of assistance over time.</p><p>ADI should probably respond to the Google Analytics data questions with a response that is tailored to how you do this work in the AgileData.cloud, and that uses the patterns and language we use in that platform (Information Product Canvas, Concepts instead of Entities, etc).</p><p>ADI should probably create the Concept Model for those Concepts.</p><p>ADI should probably populate the Business Glossary with the default definitions for those Concepts.</p><p>ADI should probably use the Context we have defined before for Google Analytics to differentiate between a Pseudo User and a User.</p><p>ADI should probably rehydrate the Change Rules (data transformations) needed to populate those Pseudo Users from the GA4 event data.</p><p>ADI should probably &#8230;. 
[insert here a use case we see our partners do that we have never thought about]</p><h2>Patterns you can adopt</h2><p>Here are some simple patterns you can adopt as you continue your &#8220;AI&#8221; journey in your organisation:</p><ul><li><p>Log all questions asked and all responses given by your AskAI feature, somewhere you can see and query them</p></li><li><p>Log the path your LLM took to provide the response</p></li><li><p>Find a way to gather feedback as you scale</p></li><li><p>Put it in the hands of your early adopters as soon as possible and let them help you co-design the most valuable areas to iterate on next</p></li></ul><p></p>]]></content:encoded></item><item><title><![CDATA[The pattern of Metric Trees with Timo Dechau ]]></title><description><![CDATA[AgileData Podcast #77]]></description><link>https://agiledata.info/p/the-pattern-of-metric-trees-with</link><guid isPermaLink="false">https://agiledata.info/p/the-pattern-of-metric-trees-with</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Fri, 21 Nov 2025 18:09:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/tThvtGdGEig" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Join Shane Gibson as he chats with Timo Dechau about Metric Trees</p><blockquote><p><strong><a href="https://agiledata.substack.com/i/179577155/listen">Listen</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/179577155/google-notebooklm-mindmap">View MindMap</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/179577155/google-notebooklm-briefing">Read AI Summary</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/179577155/transcript">Read Transcript</a></strong></p></blockquote><p></p><h2>Listen</h2><p>Listen on all good podcast hosts or over at:</p><p><a 
href="https://podcast.agiledata.io/e/the-pattern-of-metric-trees-with-timo-dechau-episode-77/">https://podcast.agiledata.io/e/the-pattern-of-metric-trees-with-timo-dechau-episode-77/</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://podcast.agiledata.io/e/the-pattern-of-metric-trees-with-timo-dechau-episode-77/&quot;,&quot;text&quot;:&quot;Listen to the Agile Data Podcast Episode&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://podcast.agiledata.io/e/the-pattern-of-metric-trees-with-timo-dechau-episode-77/"><span>Listen to the Agile Data Podcast Episode</span></a></p><p></p><blockquote><p><strong>Subscribe:</strong> <a href="https://podcasts.apple.com/nz/podcast/agiledata/id1456820781">Apple Podcast</a> | <a href="https://open.spotify.com/show/4wiQWj055HchKMxmYSKRIj">Spotify</a> | <a href="https://www.google.com/podcasts?feed=aHR0cHM6Ly9wb2RjYXN0LmFnaWxlZGF0YS5pby9mZWVkLnhtbA%3D%3D">Google Podcast </a>| <a href="https://music.amazon.com/podcasts/add0fc3f-ee5c-4227-bd28-35144d1bd9a6">Amazon Audible</a> | <a href="https://tunein.com/podcasts/Technology-Podcasts/AgileBI-p1214546/">TuneIn</a> | <a href="https://iheart.com/podcast/96630976">iHeartRadio</a> | <a href="https://player.fm/series/3347067">PlayerFM</a> | <a href="https://www.listennotes.com/podcasts/agiledata-agiledata-8ADKjli_fGx/">Listen Notes</a> | <a href="https://www.podchaser.com/podcasts/agiledata-822089">Podchaser</a> | <a href="https://www.deezer.com/en/show/5294327">Deezer</a> | <a href="https://podcastaddict.com/podcast/agiledata/4554760">Podcast Addict</a> |</p></blockquote><p></p><div id="youtube2-tThvtGdGEig" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;tThvtGdGEig&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe 
src="https://www.youtube-nocookie.com/embed/tThvtGdGEig?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>You can get in touch with Timo via <a href="https://www.linkedin.com/in/timo-dechau/">LinkedIn</a> or over at <a href="https://timodechau.com">https://timodechau.com</a></p><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? The Information Product Canvas helps you get clarity in 30 minutes or less?</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><h2>Google NotebookLM Mindmap </h2><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Yym7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Yym7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png 424w, https://substackcdn.com/image/fetch/$s_!Yym7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png 848w, 
https://substackcdn.com/image/fetch/$s_!Yym7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png 1272w, https://substackcdn.com/image/fetch/$s_!Yym7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Yym7!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png" width="1200" height="1626.923076923077" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1974,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:5073889,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/179577155?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Yym7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png 424w, 
https://substackcdn.com/image/fetch/$s_!Yym7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png 848w, https://substackcdn.com/image/fetch/$s_!Yym7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png 1272w, https://substackcdn.com/image/fetch/$s_!Yym7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2faf6195-535a-4133-9c15-e07f6f49f9b4_9690x13137.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p></p><h2>Google NotebookLM Briefing</h2><h3>Executive Summary</h3><p>Metric Trees: A Framework for Strategic Alignment and Business Transparency</p><p>Metric trees are a powerful framework for visualizing the mathematical and logical relationships between key business metrics. Their primary function is to deconstruct high-level, often non-actionable, &#8220;output metrics&#8221; like Monthly Recurring Revenue (MRR) into a hierarchy of actionable &#8220;input metrics&#8221; that individual teams can directly influence. This transforms abstract company goals into a tangible map that guides strategic planning and execution.</p><p>The core value of a metric tree lies in its ability to serve as a shared language and a visual representation of the entire business model. It fosters alignment across disparate teams (such as product, marketing, sales, and data) by clarifying how their specific activities contribute to top-line objectives. By creating this transparent map, organizations can have more focused conversations about priorities, diagnose performance issues, and measure the true impact of strategic initiatives.</p><p>Implementing metric trees effectively requires a blend of product-thinking patterns and data discipline. Methodologies like Event Storming and Domain-Driven Design are crucial for identifying the core processes and high-value &#8220;heartbeat events&#8221; that underpin the business. 
Ultimately, a metric tree is a strategic tool for planning, communication, and high-level monitoring; it complements, rather than replaces, the need for deep-dive, explorative dashboards.</p><h3><strong>The Challenge: Isolated Metrics and Disconnected Teams</strong></h3><p>Organizations often struggle with a fragmented understanding of performance, driven by metrics that are viewed in isolation and teams that operate on different &#8220;planets.&#8221; This disconnect manifests in several key challenges:</p><p>&#8226; <strong>The Problem of Isolated Metrics:</strong> Stakeholders are frequently presented with long lists of potential metrics (the &#8220;PDF with 100 SaaS metrics&#8221; problem) without any context for how they interrelate. A metric like MRR, viewed alone, offers little guidance on what actions to take. As Timo Dechau states, the critical missing element is the &#8220;relationship&#8221; between metrics, which is essential for making them actionable.</p><p>&#8226; <strong>The &#8220;Iceberg&#8221; of Complexity:</strong> Metrics that appear simple on the surface, such as &#8220;Active User&#8221; or &#8220;MRR,&#8221; conceal immense complexity. Defining these metrics accurately requires a deep understanding of the product&#8217;s specific use cases and the business&#8217;s various edge cases, a process that can take weeks or months. A well-defined metric can &#8220;change the track for a company,&#8221; but a poorly defined one creates confusion.</p><p>&#8226; <strong>Siloed Data Cohorts:</strong> The data domain itself is often fragmented. Cohorts focused on the data warehouse, product analytics, and data science frequently use different techniques and technology stacks, even when working with similar data patterns. 
Product analytics, with its focus on behavioral event data and sequence analysis (funnels, cohorts), has historically operated separately from classic Business Intelligence (BI), creating what Dechau describes as &#8220;two different planets&#8221; that rarely interact.</p><h3><strong>Defining the Metric Tree</strong></h3><p>A metric tree, also known as a driver tree, is a visual framework that maps the relationships between metrics, showing how lower-level inputs drive higher-level outputs.</p><p>&#8226; <strong>Core Concept:</strong> It functions as a deconstruction of a primary business goal into its constituent parts. Every metric has a relationship to another, and the tree makes these connections explicit.</p><p>&#8226; <strong>Primary Function:</strong> Its purpose is to translate a high-level, non-actionable <strong>output metric</strong> into a series of actionable <strong>input metrics</strong>.</p><p> &#9702; <strong>Output Metric:</strong> A lagging indicator that reflects past success (e.g., Revenue, Profit). It is difficult for a team to influence directly.</p><p> &#9702; <strong>Input Metric:</strong> A leading indicator that teams can directly control (e.g., New Accounts, Conversion Rate from Trial).</p><p>&#8226; <strong>Structure:</strong> The tree is a hierarchical model that can often be expressed as a mathematical equation. 
For example:</p><p> &#9702; MRR is composed of <code>New MRR</code> + <code>Expansion MRR</code> - <code>Contraction MRR</code> - <code>Churned MRR</code>.</p><p> &#9702; <code>New MRR</code> can be broken down further into <code>New Subscribers</code> * <code>Average Plan Price</code>.</p><p> &#9702; <code>New Subscribers</code> can be derived from <code>New Accounts</code> * <code>Account-to-Subscriber Conversion Rate</code>.</p><p>This decomposition continues until the metrics at the lowest level of the branches are things a team can directly execute against, such as running more webinars to increase <code>New Accounts</code>.</p><h3><strong>The Strategic Value of Metric Trees</strong></h3><p>The primary value of a metric tree is not just in the metrics themselves, but in the clarity, alignment, and strategic conversations it enables.</p><h4><strong>A Shared Map for the Business</strong></h4><p>The metric tree acts as a universally understood &#8220;map&#8221; of the business operating model.</p><p>&#8226; <strong>Creates a Common Language:</strong> It allows different departments to point to the same part of the map and understand how their work affects others, breaking down communication silos.</p><p>&#8226; <strong>Fosters Transparency:</strong> It makes the mechanics of the business model clear to everyone. For many employees, it may be the first time they see a clear illustration of how the company generates revenue.</p><p>&#8226; <strong>Reveals Interdependencies:</strong> The map highlights how initiatives in one area can impact metrics elsewhere. 
Dechau notes that workshops to build these trees often lead to &#8220;eye-opening&#8221; moments where teams realize their actions might be inadvertently hurting another revenue stream.</p><h4><strong>Driving Action and Measuring Impact</strong></h4><p>Metric trees connect daily work to strategic goals, making it easier to plan and measure initiatives.</p><p>&#8226; <strong>Connects Strategy to Execution:</strong> Teams can clearly see their area of influence on the map. A marketing team knows its efforts to generate new accounts are a direct input to the company&#8217;s overall revenue goal.</p><p>&#8226; <strong>Measures Initiative Success:</strong> A specific metric tree, or &#8220;sub-tree,&#8221; can be built for a new initiative. This allows the team to define success upfront and provides a &#8220;control instance&#8221; to validate whether local efforts (e.g., A/B tests) are creating a meaningful impact on the larger business goals.</p><p>&#8226; <strong>Identifies Opportunities:</strong> By populating the tree with data, teams can spot areas of high potential. For instance, a part of the funnel with high volume but low conversion rates becomes an obvious target for optimization.</p><p>&#8220;If a metric is not actionable, it will have a hard life. It lives lonely on this dashboard and no one has an idea what to do with it.&#8221; - Timo Dechau</p><h3><strong>Implementation and Good Practices</strong></h3><p>Building and using a metric tree is a strategic exercise that requires a structured approach and an awareness of its limitations.</p><h4><strong>Starting the Journey</strong></h4><p>1. <strong>Map the Process:</strong> The first step is to gain a deep understanding of the customer journey. This is best achieved through a collaborative workshop, like an <strong>Event Storming session</strong>, involving people from all relevant disciplines who can map the process from initial awareness to long-term retention. 
This process naturally identifies the key milestones and potential metrics.</p><p>2. <strong>Align with Strategy:</strong> It is critical to sit down with the leadership team to understand their strategic priorities for the next 6-12 months. This alignment ensures that the initial metric tree focuses on what is most important to the business, which dramatically increases buy-in and adoption.</p><h4><strong>The Art of Event Tracking</strong></h4><p>A robust metric tree is built on well-defined data. This requires moving beyond simplistic interaction tracking to a more meaningful model of product usage.</p><p>&#8226; <strong>Tracking Interactions</strong></p><p> &#9702; <strong>Approach:</strong> Capturing every click, scroll, and granular user action. This is the &#8220;take everything and look at it later&#8221; approach.</p><p> &#9702; <strong>Result:</strong> A high volume of noisy, low-signal data that is difficult for humans to analyze and disconnected from business success.</p><p>&#8226; <strong>Tracking Product Usage</strong></p><p> &#9702; <strong>Approach:</strong> Applying <strong>Domain-Driven Design</strong> to identify core business entities (e.g., Account, Subscription, Project) and their lifecycles (e.g., Created, Updated, Deleted).</p><p> &#9702; <strong>Result:</strong> A small set (~15-20) of high-value, meaningful events that directly reflect product usage and can be used to build core metrics.</p><p>A key goal is to identify the <strong>&#8220;Heartbeat Event&#8221;</strong>, the single, central event that proves the product is alive and delivering value. For Slack, this is sending a message; for Miro, it&#8217;s adding an asset to a board. This core event can often be used to define multiple key metrics.</p><h3><strong>Practical Considerations and Limitations</strong></h3><p>&#8226; <strong>Keep it Simple:</strong> Avoid the temptation to &#8220;boil the ocean&#8221; by mapping every conceivable metric. 
An overly complex tree with 90+ nodes becomes an un-operational &#8220;monster.&#8221; It is better to start with a simple model and use sub-trees for specific initiatives.</p><p>&#8226; <strong>Acknowledge Timelessness:</strong> A standard metric tree exists in a timeless space. It does not inherently account for the time lag between an action and its result (e.g., an increase in new accounts may not impact revenue for 60-90 days). This requires separate <strong>cohort analysis</strong>, which does not fit the tree structure.</p><p>&#8226; <strong>It&#8217;s Not a Dashboard:</strong> A metric tree is a tool for planning, communication, and high-level monitoring. It is not designed for deep-dive, explorative analysis to find the root cause of a problem. Dashboards are still required for that function.</p><p>&#8226; <strong>Who Does the Work?</strong> The skillset required&#8212;combining product thinking, data expertise, and strategic facilitation&#8212;is often found in senior data leaders. <strong>Heads of Data</strong> are well-positioned to lead this work, as it provides them a strategic lever to connect their team&#8217;s efforts directly to business value.</p><h3><strong>The Role of North Star Metrics</strong></h3><p>The concept of a North Star Metric is closely related to, but distinct from, the top-line metrics on a business-focused metric tree.</p><p>&#8226; <strong>Correct Definition:</strong> A North Star Metric is fundamentally tied to <strong>successful customer experience</strong>. It measures the value customers receive from the product. Revenue is the <em>result</em> of delivering this value, not the value itself.</p><p>&#8226; <strong>Common Pitfall:</strong> A frequent mistake is labeling a revenue goal like &#8220;New MRR&#8221; as a North Star Metric. This confuses an internal business outcome with customer success. 
Shane Gibson jokes these should be called &#8220;South Star&#8221; or &#8220;East Star&#8221; metrics instead.</p><p>&#8226; <strong>Relationship to Metric Trees:</strong> The North Star Metric is an ideal candidate for the top of a <em>product-focused</em> metric tree, with input metrics below it defining the user behaviors that lead to a successful customer experience.</p><p>On North Star Metrics: &#8220;The concept of a North Star metric is it&#8217;s always bound to successful customer experience, it has nothing to do with revenue. Obviously we hope when people have a good customer experience that... it has a causal connection to more revenue.&#8221; - Timo Dechau</p><h3><strong>Future Outlook: Metric Trees in an AI-Driven World</strong></h3><p>As technology evolves, the principles behind metric trees become even more critical for managing complexity and providing essential business context.</p><p>&#8226; <strong>Context for LLMs:</strong> A defined metric tree provides a structured map of the business model. This is invaluable context for AI agents and LLMs, enabling them to perform more accurate &#8220;what-if&#8221; simulations, brainstorm strategies, and answer complex business questions.</p><p>&#8226; <strong>Governing Democratized Development:</strong> The rise of AI will enable the rapid creation of thousands of small, single-purpose applications (&#8220;AI slop apps&#8221;). This will create immense data complexity. A framework of core concepts (e.g., a universal definition of &#8220;user&#8221;), governed events, and a central metric tree will be essential to ensure these new applications contribute to meaningful business goals rather than creating data chaos. The metric tree provides the &#8220;principles, policies, and patterns&#8221; needed for this new landscape.</p><p></p><h2>Transcript</h2><p><strong>Shane: </strong>Welcome to the Agile Data Podcast. I&#8217;m Shane Gibson.</p><p><strong>Timo</strong>: And I&#8217;m Timo Dechau.</p><p><strong>Shane</strong>: Hey, Timo. Thank you for coming on the show today. I would like to talk about metric trees, something that&#8217;s intrigued me for a little while, so it&#8217;s great to have somebody who knows a lot about them on the show. But before we rip into that, why don&#8217;t you give the audience a bit of background about yourself.</p><p><strong>Timo</strong>: I didn&#8217;t start out in data. I don&#8217;t know, an interesting question would be whether anyone actually ever started out in data. I started out in product. I will give some connections to metric trees later, and why it was really important for me to connect them with product. I started out in product and was very quickly annoyed by people doing features just by gut feeling. I wanted to have a different kind of layer, so I started to introduce analytics data to it. So this was really the early days. It was not even called product analytics at that point in time. The term came later. So I spent a lot of time, while moving through different kinds of product roles, always setting up data setups for that. I also did the first connections there with data warehouses. Just a little bit. And then after eight years in product, I decided to go all in on data.</p><p>So it was more focusing on the analytics side of things. 
So I&#8217;d say classic marketing analytics and product analytics. At that point in time, it was the high time in Berlin where a lot of e-commerce startups were really growing a lot. And so I was doing first data warehouse setups, mostly in the marketing analytics space. Like, people wanted to have their own attribution models and these things. I did this, and then spent a lot of time in this kind of area. And then at some point I thought, okay, I want to go back to the roots and do more product analytics again. But I wanted to do product analytics in the data warehouse, which required some things that were not so common at the time.</p><p>So one is an event data model, which some people had some experience with, but potentially not for product use cases. And then the second thing is there was no approach. Amplitude and PostHog were the classic tools that you do product analytics with, and it then became possible to put them on top of a warehouse. But then I figured out that, most of the time, they&#8217;re too complicated for just doing product work. So in the end, you have to have a metric-first product analytics approach. And this is where I came back to metrics, where I never really had a good relationship with them.</p><p>So we might talk about this in this session. But I had to basically revisit my relationship to metrics. And this is where I came across metric trees again. I had them already at university. And yeah, I started to dive deeper into this.</p><p><strong>Shane</strong>: I&#8217;ve followed your stuff for a long time, so thank you for sharing so much great content. And what&#8217;s always intrigued me is in the data domain, I could almost see a data warehouse cohort, a product analytics cohort, and a data science cohort. And although each of those cohorts is using very similar patterns, very similar techniques, they always seem to use different technology stacks. And then we had the whole, what was it? 
Composable, combustible, whatever it is. Yeah. Yeah. So that idea that you don&#8217;t need a dedicated product analytics tool. You can just do it off your cloud data warehouse or cloud data warehouse database.</p><p>So it&#8217;s really interesting that there was almost a separate track, wasn&#8217;t there, for product analytics and the techniques you used and the technologies, compared to some of the other cohorts or tracks in that data domain.</p><p><strong>Timo</strong>: It is super weird because sometimes you talk to people in the data space, let&#8217;s say people who do classic BI, and this was my problem when I was starting out, because I was looking for a specific way to model product data, the same as marketing data. Marketing data is a bit less, but it&#8217;s a lot of behavioral data.</p><p>So a lot of event data. And you do sequence analysis, so funnels or cohorts and something like that. And I really had a hard time talking to people who are in the classic BI space and telling them that most of the data models don&#8217;t really work for me because I have to do sequence analysis.</p><p>So I cannot create 200 fact tables. It would be a bit crazy. And I really had a hard time explaining, but it was also interesting because I learned a lot from them. I think some learned a little bit from me. But it&#8217;s interesting because everything is called data, but I always say they live on two different planets and they rarely visit each other.</p><p>I think it&#8217;s getting better. So I think there&#8217;s more convergence now, and that is something which I, for example, work a lot on. 
So my last, let&#8217;s say my last mission, if it&#8217;s my last, I don&#8217;t know, but my biggest one, is really crossing the whole marketing and product behavioral data with revenue data, which then makes it really interesting if you do this.</p><p><strong>Shane</strong>: We had a customer, and they were a scale-up.</p><p>And so we needed to do some work around standard metrics for a startup. The pirate metrics, AARRR, all those kinds of good things. And when you first go into those areas, you go, how hard can it be? And then you find out.</p><p>And it&#8217;s, damn, this should be a solved problem right now.</p><p>The core metrics for startups should be well-defined. They all use standard types of software-as-a-service products for capturing subscriptions and signups and all those kinds of things. There&#8217;s three or four of them that everybody tends to use. It shouldn&#8217;t be hard; there should be patterns out there that you could just implement and make it really easy. And again, whenever you say those words, you know it&#8217;s gonna bite you in the bum, and then you start doing it and you&#8217;re like, actually, for some reason there&#8217;s a whole lot of complexity just hidden under the top of that layer that bites you.</p><p><strong>Timo</strong>: I just do MRR calculation. So, let&#8217;s say, technically these are all the same metrics. So there&#8217;s, let&#8217;s say, a whole metric set around MRR. But the tricky thing is, how do you define it under the hood? Because everyone comes to you, and in every case the company tells me, no, we have a very standard subscription model.</p><p>There&#8217;s nothing crazy in it. And then I always know, okay, yeah, this will hold for two weeks until we do the first investigation into all the edge cases, and then we will just uncover all the edge cases. Ah yeah, there we had to do something different. Oh yeah. 
And then yeah, we had this switch from one subscription to the... and so it&#8217;s always messy.</p><p>And that&#8217;s the same with product stuff. I think the pirate metrics are still a really good framework to understand how the product, and let&#8217;s say the business around it, is evolving. But let&#8217;s say just go for something like retention or active user or activated. So activated by default is in the end one metric, but it can take you weeks and sometimes months to redefine how you basically calculate this, because it really comes down to how the product works and what kind of use cases you are actually solving.</p><p>So it&#8217;s a fascinating, interesting thing, but it&#8217;s also good &#8217;cause in the end it&#8217;s one metric that you just see, but it&#8217;s an iceberg. So on the top you just see this one metric, but it&#8217;s quite complex underneath. But this is where the fun is. If you really do a good job there with the definition and so on, it can really change the track for a company.</p><p><strong>Shane</strong>: Even then though, outside of the metrics, there&#8217;s still some core patterns you&#8217;ve gotta decide from an engineering point of view. So again, when I look at product analytics, when I look at event tracking in a software product, typically you have a choice: pull every event, so effectively pull event logs to say lots of things happened, and then apply the events you care about after the fact.</p><p>So more of a lake-type pattern. Or only define the events you want in the product at the beginning. And then only bring those in and use those. And if you want to look at a new event, you&#8217;ve gotta go and define it again. 
And there&#8217;s a trade-off there, right, between these.</p><p>If you get every event, you get this wash of noise and it&#8217;s really hard to figure out where the signal is.</p><p>But when I worked with a lot of product teams, they seemed really reticent to define the events that were important. It was easy for them to say, I&#8217;ll just take everything and we&#8217;ll look at it later, versus this is the event that will tell us that this feature was successful.</p><p>That&#8217;s actually quite a hard metric to define.</p><p><strong>Timo</strong>: It is. This is my second passion topic, and this is the one I wrote a book about. I think the problem is, when you go into this, you have to distinguish between tracking interactions and actually tracking product usage. And most of the setups are tracking interactions, where people click. And it&#8217;s quite understandable why you do this, because you have to define something. So it&#8217;s not everyone&#8217;s job to define what you want to track. It&#8217;s usually a side job that you have to do in between. So, let&#8217;s say, the most approachable way to do this is you open up your own application, then you see, okay, where can people actually do something?</p><p>And so they&#8217;re like, okay, they can click here, they can click here and they can do this. And this is what I mean by tracking interactions. If, let&#8217;s say, this whole output is just for machines, then it might be okay. We are still not there, where, let&#8217;s say, you can collect all this very noisy, very granular data, and then you have something running over it, finding some patterns.</p><p>So far I haven&#8217;t really seen things that go in this direction, but it might come at some point. But when a human has to analyze it, this doesn&#8217;t work. It&#8217;s far too far away from what actually defines product success or what actually defines business success. 
So it has too much noise, as you said, and so what I ask product teams is: define the use cases and define the jobs to be done.</p><p>Define the entities that make up your product. So in the end, it&#8217;s classic domain-driven design, what I do with them. And so, okay, let&#8217;s define the different kinds of entities that build up your product. It&#8217;s usually five or six. And then we define how the lifecycle looks for each. So let&#8217;s say we have an account: it can be created, can be updated, can be deleted. So three events. Updated you usually don&#8217;t need, because there&#8217;s not often business value in it if someone updates an account. So who would optimize on account updating, usually? Yeah, there are edge cases. But you can do this for everything else. You can do it for subscriptions. You can do it for whatever your product is built of. And then, this is usually my take, you end up with 15 events and, in the same way, you end up with 10 metrics. It&#8217;s sometimes really quite crazy if you really have a good setup. So right now, I was just working on a project where we had one event that was defining six core metrics that could explain if the business is running or not. Because, and this is always my interesting take when I work with a client, in every setup you have something which I call a heartbeat event. And this is a central event. If you just need to track one event, this is the one that tells you whether your product is still alive or not.</p><p>So for Slack, it&#8217;s messages; for Miro, it&#8217;s, let&#8217;s say, an asset being added to a board or so. You always have this one event where you know, okay, when this is still coming in, things are good. We don&#8217;t know about it, but they&#8217;re still there, looking good. 
And you have to get to this level; then you can basically tame the chaos that product analytics can cause.</p><p><strong>Shane</strong>: It sounds a bit like a North Star metric. This idea that yes, all the other things are important, but actually, if you can define the one thing that really is the core, focus on that, and the other things are useful, but if you don&#8217;t focus on that one thing...</p><p><strong>Timo</strong>: I always have to fight a little bit for the North Star metric because there are so many definitions out there, and I think the worst case is when someone comes to you and says, yeah, new MRR is our North Star metric, where you say, no, it&#8217;s actually not. Let&#8217;s say the concept of a North Star metric is it&#8217;s always bound to successful customer experience; it has nothing to do with revenue. Obviously we hope when people have a good customer experience that, let&#8217;s say, it has a causal connection to more revenue. And then the North Star metric is usually also built up from two or three input metrics that lead to it. But in the end, it&#8217;s the same as what you said.</p><p>You have a set of three metrics that could explain how your product is actually delivering value at the moment.</p><p><strong>Shane</strong>: I&#8217;m with you in that the North Star metric is success for the customer, not success for the business, because if the customers get the success that you promised them, then your business will grow. But yeah, time and time again, I see North Star metrics being internal metrics. And it&#8217;s, maybe we should just go read the definition again or call it something else, South Star metric or East Star. Like, just give it a different name. That&#8217;s okay.</p><p><strong>Timo</strong>: Yeah. You can always call it, like, this is our... yeah, I don&#8217;t know. I think North Star metric is just a great name, so it&#8217;s a great label. So everyone understands what it is. It sounds great, like adventure. 
So this is our North Star. We have to go there.</p><p><strong>Shane</strong>: That&#8217;s my standing joke: as data people, we laugh about the fact that our stakeholders can&#8217;t define active customer, active subscription. Yet in the data domain, we argue about the definition of a semantic layer, a data product, a North Star metric, day in, day out.</p><p>So we&#8217;re as bad, if not worse than them. So on that note, let&#8217;s go and talk about metric trees. So if somebody said to you, what the hell is a metric tree? How would you describe it?</p><p><strong>Timo</strong>: First of all, the same. I think it might also be a definition problem. So some people call it a driver tree. There&#8217;s also the term KPI tree. I&#8217;m not really good at definitions. I know that there are slightly different variations of it. So I&#8217;ll give you my explanation and my definition of it. So one problem that you often have with metrics is when you take a metric alone. So let&#8217;s say you have a software as a service, and you&#8217;re not really sure what you should measure. You are on LinkedIn and see one of these posts where someone is posting, hey, I just compiled the greatest collection of software as a service metrics, like common metrics, and you get my PDF with these 100 metrics.</p><p>These things never really work for anyone. And the reason they don&#8217;t work is that everyone then picks a metric and looks at it in isolation. So, I don&#8217;t know, for example, MRR, monthly recurring revenue. So you have it standing alone there and then you don&#8217;t really have an idea what you should do with it.</p><p>Because two things are missing in what defines the metric. Mostly, it&#8217;s missing a relationship. So let&#8217;s say every metric has a relationship to another metric. I cannot even think of one which is really standing alone. So usually you have a relationship, and why do you need this relationship? 
Because not every metric is very actionable. So we talked about this North Star, when you say, hey, revenue is our North Star. So the problem with MRR, monthly recurring revenue, is that it&#8217;s an output metric, so it&#8217;s something that comes out at the end. So when the CEO goes into a meeting and, let&#8217;s say, assembles all the heads of the different teams and says we have to increase MRR, everyone would say, yeah, sure.</p><p>Sure, it&#8217;s our business model, so it makes sense, but no one would immediately come up with an idea, oh yeah, sure, we should do that to increase MRR. No, you usually break it down. So you break it down into, oh, okay, when we need more MRR, then maybe we need more accounts, because they could end up in a subscription and then this would be new MRR. But then this is interesting, because this is already the first step to build out a metric tree, because then you have MRR, and then you say, okay, what is actually making up MRR? New subscribers multiplied by an average plan price or average price, and then new subscribers you can build up from new accounts with a conversion rate. And then you get to your first version of a metric tree. And I think the nice thing about a metric tree is that it explains how you can get from something that you will end up with to something that you can directly influence. So if you go to a marketing team and tell them, hey, we need more new accounts, they usually have an idea what to do. So they will be like, oh yeah, okay, maybe we have to run more webinars or we have to invest in a podcast or whatever. And so there you make it actionable. And at least this is my driving force for why metric trees are interesting: they help you to break down something that you want to achieve into something that you can actually do. And this is the tricky thing. So if a metric is not actionable, it will have a hard life. 
It lives lonely on this dashboard and no one has an idea what to do with it.</p><p><strong>Shane</strong>: So if I think about it, what you&#8217;re really saying is there&#8217;s a bunch of metrics and they have a relationship to each other.</p><p>And we treat it like a tree. So if we think about an HR, human resources, org chart where we typically see a tree of people and people report up this tree, what we are saying is if we can find the metrics with relationships that behave that way, so these metrics support this other metric which supports this other metric, then that relationship helps us work out what we need to change in our business to move one or many of those metrics.</p><p>Is that kind of the concept?</p><p><strong>Timo</strong>: Yeah. That&#8217;s the concept. You can also use it for that. Where I like to use it is to really try to explain how the business works in a, let&#8217;s say, common concept that at least the data team understands and that the business team understands. And this is something where I would say the metric tree so far worked best.</p><p>So I tried different kinds of formats to bring both teams together. We did it as, okay, what kind of events do we have to measure? Usually it doesn&#8217;t really work well because it&#8217;s too abstract. You sometimes find common ground, but it&#8217;s too far away from what people really care about. If you really build up a metric tree, you have really interesting conversations. Let&#8217;s say, when I do these workshops, it&#8217;s usually with startups, where it&#8217;s easier to get relevant people into one room.</p><p>So then we have people from all the different kinds of disciplines. And we have really good conversations because for a lot of people it&#8217;s the first time that they see how the revenue is actually coming together in the company. 
So this is, let&#8217;s say, often, unfortunately, for data people the first time that they see, oh, this is how we calculate revenue. Because it&#8217;s usually one person who does this part, but not everyone. And then for product it&#8217;s the same, they&#8217;re like, oh, okay, I was not aware that we also do this to get revenue. So it&#8217;s a really interesting exercise that can open up a lot of black boxes and help people understand how the whole system is actually building up. And then, obviously, if you go a little bit further down, all the different teams might find a place in this tree where they say, oh, this is actually where I work. It&#8217;s hard for the data team; they usually don&#8217;t find a place there. But they&#8217;re the ones who can build the metric tree and can provide the metrics. But let&#8217;s say you can have a part where you can see what product can do, you will have a part where you can see what sales do. Then you will have a part where you can see, oh, this is actually the influence area of marketing. And then in theory you can analyze how these different kinds of levers that you have, let&#8217;s say on a lower level, are actually contributing to the final output metrics. Every time I did this kind of format in a workshop, there was always some kind of eye-opener in there, where someone said, oh, I never looked into this. Or someone even said, actually, right now, I think with some initiatives we are starting to hurt our revenue and so on.</p><p>No one had recognized that, because now we see for the first time that when we do this, let&#8217;s say this stream of revenue will have a problem, which no one really thought about. And so this is something where metric trees definitely can help to bring transparency into. 
Let&#8217;s say the mechanics, how the business is actually making money, that often is completely overlooked, because a dashboard in theory does it, but it doesn&#8217;t have the structure.</p><p>The tree structure is nice because someone can look at it and immediately understand what it does. Yeah. So go from the bottom up to the top or the other way down. </p><p><strong>Shane</strong>: I think that&#8217;s part of the core value of it, that simple map. &#8217;Cause what we know is there&#8217;s a lot of complexity, and when we draw a simple map, a simple diagram, two things happen. People can visually ingest it and understand it. There&#8217;s something about a human looking at a map that just naturally sees the pattern.</p><p>And the second part is you can point to part of the map and everybody knows they&#8217;re talking about the same thing. So that idea of different parts of the organization, different silos, understanding how they affect another metric. So, you know, your example, how do we increase monthly recurring revenue? Well, there&#8217;s a bunch of metrics that the marketing team can affect: number of ads placed, number of prospects found, optimizing the funnel from prospect to signup.</p><p>And they can look at that part of their map and go, I can do some work in there. And that in theory will increase our monthly recurring revenue, because that&#8217;s up the tree. So that idea of a map gives us understanding and also gives us the clarity of a shared language: I&#8217;m focusing on this part of the tree.</p><p><strong>Timo</strong>: Yeah, I&#8217;m very happy that you say the word map, because this is actually what it is. It&#8217;s one map that can be very useful. So at least when I create one for the projects, it&#8217;s not that we constantly work with it. 
So we might get to this point, like how operational are metric trees, but where I always use it is when we come together and discuss big things.</p><p>Okay, what kind of metrics should we focus on? Let&#8217;s say, what kind of initiatives should we look into? So it can always help to have it sitting around, and when we say, okay, look, now we want to focus on this kind of area, you can always look then, let&#8217;s say, in the tree.</p><p>And the nice thing is when you have the tree with some data, then you can see, okay, we focus on this kind of area because we have the feeling we have a lot of volume here. But we can see, for example, the conversion rates in these kinds of areas are not really high. So, let&#8217;s say, there&#8217;s quite nice potential for us to improve things just slightly but see a big impact, because we see that a lot of push comes in there. And so then we can analyze how this works. For that, it&#8217;s really nice to have it because, again, everyone immediately knows where we are. So it&#8217;s not that you have to explain at length how this might have an impact on revenue, because immediately everyone sees that.</p><p><strong>Shane</strong>: The good thing for data teams is it means they don&#8217;t have to go look at yet another stupid strategy PowerPoint with four boxes that have no context and no understanding and no data.</p><p>You might have to, but at least they can say, that part of the pyramid or, what are we doing this week, the circle, whatever the latest consulting picture is for selling a story.</p><p>How does that map to the tree? One of the things you said, though, was data teams would struggle to figure out which metrics are theirs, and that kind of gave me an idea. 
So, you know, one of the things I work on a lot is this idea of an information product, and one of the key things I teach teams is focus on the action and the outcome.</p><p>If you don&#8217;t understand what action the stakeholder&#8217;s gonna take and what outcome&#8217;s gonna be delivered from that action and the value of that outcome, then really there&#8217;s a risk you&#8217;re doing data work that has no value. While we say stakeholders are accountable for that, actually as data teams we should be as well. We should be holding our stakeholders to account that they can actually describe the value, at least the outcome that&#8217;s been taken.</p><p>So what would be interesting is if there&#8217;s a metric tree in place, a data team should be able to point to the metrics that they&#8217;re doing data work for or with to improve: I&#8217;m working with the marketing team to use data to reduce the time between a prospect being identified and signing up for an account.</p><p>And so they should actually be able to use the metric tree to show where the data work is adding value to the organization in conjunction with those stakeholders, those different business operating groups.</p><p><strong>Timo</strong>: Yeah, exactly. I think this is where it can really play a nice role, because you can sit in these, let&#8217;s say you work with the marketing team. The marketing team does some planning for the next two or three months. So they have some ideas, let&#8217;s say they come up with three initiatives that they want to do. And so you can take every initiative and you can then say, okay, here on the big company metric tree, these initiatives are trying to improve these kinds of areas. And then we can use this metric there, as always, as our control instance to really see, okay, do we see some impact? Because that&#8217;s a tricky thing. 
You can run a lot of initiatives and locally it looks really great, but when you look at it in the big picture, everyone is like, yeah, I don&#8217;t know. A/B tests look great, but I don&#8217;t see any kind of uptick anywhere. So it&#8217;s definitely nice to have, okay, this is the one big metric that it should, let&#8217;s say, influence, and let&#8217;s see if we can fire it up enough that we can see some influence. Then the second thing you can do is, let&#8217;s say, create different views of metric trees. So you can do this at a very high level for a company, but then you can also take every initiative and just build a metric tree for every initiative.</p><p>This is something which I often like to do because, again, it helps you to understand, okay, how do we measure the success of this initiative? So you might have this one big metric that we identify in the company metric tree that, let&#8217;s say, is at the top end. And then we break it down for this kind of initiative.</p><p>And so then, let&#8217;s say you want to improve conversion rate. So then we can build the whole surrounding around this conversion rate and maybe even break it down by three different channels, because this is what we are trying to achieve. You&#8217;re trying to get more people from this kind of channel.</p><p>So then you basically bake strategy into the metric tree, and the strategy connects directly to the initiative that marketing is driving. And then everyone has a very clear idea what you do. And then also you will build something that, in the end, marketing and you can use to explain if this initiative was successful or not. And you can validate it beforehand. You can ask people, okay, look, when we deliver this, does this make sense to you? Let&#8217;s say when the initiative is over in four weeks and we report on these metrics, and we define, okay, success looks like when we move this part, I don&#8217;t know, by 10%. So does everyone agree with that? 
And that is because I have the feeling it makes it easier for a lot of people to think in that way. So that they don&#8217;t say, yeah, I don&#8217;t really have an idea if I should agree or not to this, because at least, yeah, I think a metric, at least they understand. They might ask, okay, how do we define this kind of metric? But that&#8217;s fine. And so then we can spend some time to say, okay, how we define it. But I like this approach a lot. I also like to not always see the metric tree as this one big, okay, we explain the whole business model, but really to use it to explain how I run a specific initiative. It&#8217;s also nice for me. Let&#8217;s say personally, when I work on these, let&#8217;s say, as a supporting part for data for these initiatives, it&#8217;s a great brainstorming tool for me. So let&#8217;s say someone comes up with this initiative, I do a first version of a metric tree. Then I ask, does it really represent what they&#8217;re actually doing there?</p><p>And then I say, yeah, to a very generic part, but maybe not customized enough. So I&#8217;m always trying to tweak the metric tree so that the people who run the initiative immediately find it in there. So let&#8217;s say I do something for e-commerce. And the e-commerce is really pushing on getting highly loyal customers. Let&#8217;s say they really want to improve the segment of highly loyal customers who buy all the time. There are these, let&#8217;s say, high buyers that you sometimes have. So when I just report on, let&#8217;s say, new customers or returning customers, then I don&#8217;t really have it covered, because yes, we are going for returning customers, but for a special segment within returning customers.</p><p>So therefore what I can do is I can say, okay, I break it down into three groups, new, returning and, let&#8217;s say, VIP customers, and then with the VIP customers, for the first time, I make it visible what kind of impact the initiative right now has. 
Or let&#8217;s say I can see, okay, how the share of these users is growing or not growing or whatever. And so this gives me a lot of room to build something that can really support what the business is trying to do, but do it in a language or in a way that at least most of the people understand.</p><p><strong>Shane</strong>: All righty. So I&#8217;ve got a shit ton of notes right now. So let&#8217;s go through them one by one, because there&#8217;s so much gold in there. So the first thing you pulled out is this idea that actually a metric tree is a shared language. Yeah. So you&#8217;ll define a metric tree for what you think you heard from the organization of how their business operates.</p><p>And by putting that map, by putting that tree in front of them, they&#8217;re gonna identify where you&#8217;ve got it wrong. They&#8217;re gonna look at it and go, yeah. Oh no, hold on. We don&#8217;t do that. Or we are different, or I don&#8217;t understand. It becomes that visual shared language of an entire business and their operating model. </p><p><strong>Timo</strong>: It maps the process to some degree, and that makes it easy for people to see, okay, is it actually what we do? </p><p><strong>Shane</strong>: Then the second thing is you said when you want to go and do something new, you wanna do an initiative or some investment or change a process or go into a new market, you can pick the metrics you think you are gonna impact and you can guess how much you&#8217;re gonna impact them by. I remember many years ago, it was probably late nineties, early two thousands, I got into the whole balanced scorecard thing.</p><p>I was working for a vendor. Balanced scorecards were hot. A couple of &#8217;em were trying to build software. And one of the things was, you had this metric and effectively you could put a budget on it. We are gonna increase that by 10% over the next quarter. 
Now the software itself was pretty hokey.</p><p>But it was the conversations around what are we doing and how&#8217;s it gonna influence that metric. So that&#8217;s one of the key things you called out: by saying that if we do these things, let&#8217;s guess which metrics are going to change. Is that what you&#8217;re saying?</p><p><strong>Timo</strong>: Yes, exactly. And then also really make sure that you measure this metric in the right kind of way. So that, for example, let&#8217;s say you have a corridor so that you can see, okay, is this a normal movement of the metric or not? Or that the initiative is big enough that it can move the metric. This is a tricky thing. Often, at least when I work in product or marketing, you see a lot of initiatives that are just, let&#8217;s say, very tiny bits. &#8217;Cause, I don&#8217;t know, they&#8217;re not really bold. And then you will not see anything. And you can even tell this beforehand: okay, we try to optimize 5% of the typical audience that we get in there. So we are working on this. There might still be a strategic reason for that, but then you can still make it clear. It&#8217;s, okay, so we work on that. But because, let&#8217;s say, the sample or the audience is so small, we will not see an impact when we look at the completely blended global metric. So then, for example, you have to break down this metric, maybe into different kinds of paths, so that you can still see something. But this check-in really helps to see, okay, how do we actually want to see if it makes an impact or not? &#8217;Cause sometimes, just because of the setup, you won&#8217;t see anything. You can already know that it might not happen.</p><p><strong>Shane</strong>: And that was one of the problems with the balanced scorecard. 
So we had this idea of cause and effect. We said that, I think you call them input and output metrics, but this idea that if I improve this metric on the bottom or the left, then it has a cause and effect with a metric above it or on the right.</p><p>And back then, we really wanted to find correlation. We really wanted to say, well, actually, if we just throw all the data at the machine, the machine should be able to find causality or correlation between the metrics. I should be able to codify this. It should be accurate. And we never got there.</p><p>We still don&#8217;t have the technology or the patterns to do that at the moment.</p><p><strong>Timo</strong>: No, I don&#8217;t think so. This is definitely not my area of expertise. I know some people do some things like that, so they try to find out, because you have some relationships when you build a metric tree that are not deterministic. So let&#8217;s say a classic metric tree you can write as an equation, and that&#8217;s also quite nice that you can do it. But sometimes, for example, let&#8217;s say you have a metric which is called active users. And so you have a very well-defined active user definition where you say, okay, the people don&#8217;t just show up, they also do valuable stuff within your product. And so once they do it, and they do it within the last 30 days, we basically flag them as active users. So there&#8217;s definitely a correlation between active users and at some point starting a subscription. Let&#8217;s say you have a free plan, and so you have an active user and a free plan. So there&#8217;s a correlation between the two, because you might assume, okay, someone has to be active to end up in a subscription. 
But let&#8217;s say it&#8217;s not a direct connection, so you cannot really say, okay, whenever we get someone as an active user, we have a probability of, I don&#8217;t know, 57% that they will end up in a subscription. I guess this is more straightforward to calculate, but even there, I never really came across, let&#8217;s say, super good models that can predict this very well.</p><p>If someone knows this, please let me know. Ping me on LinkedIn. I&#8217;m always interested.</p><p><strong>Shane</strong>: Now the answer&#8217;s just AI slop, right? You can just AI it.</p><p><strong>Timo</strong>: Yeah. </p><p><strong>Shane</strong>: And maybe you can, maybe it&#8217;s good, that non-deterministic pattern matching will be really good at this. I dunno, I haven&#8217;t tried lately.</p><p><strong>Timo</strong>: So if we go to what you had before, like, we create a lot of noise and we might throw it in and we might find something, yes. But I think for correlation analysis, potentially not.</p><p><strong>Shane</strong>: But the key is the visualization of a map, the conversations about what&#8217;s on the map, the conversations about what you&#8217;re doing to improve the numbers on the map, the relationships. That actually is where the value is. So yes, we could get programmatic, deterministic correlation across the metrics. That would be awesome.</p><p>But actually, the conversations. So the next one is, our natural reaction is to boil the ocean. </p><p>If I think about an information product that&#8217;s lifetime value, I tend to say to organizations, why don&#8217;t we break it down? We know that to do a lifetime value model, we actually have to have revenue, cost to serve, churn. There&#8217;s a whole set of sub-models, sub-products, that have value. Why don&#8217;t we build the revenue model first and give you a revenue information product? 
Why don&#8217;t we do the cost to serve second, and then over time we&#8217;ll get to that lifetime value. I can imagine with metric trees, the natural reaction for some data people is to define every metric.</p><p>So draw the map end to end. Define every metric before we even start doing anything. Whereas for me, I like the ability to change fast. So for me, I would maybe sketch out a map really quickly as kinda like a blueprint of what we think it looks like, and then focus on some metrics and define them and build them and deliver them and monitor them and use that information.</p><p>And then kind of color in the map over time as you learn more. How do you do it?</p><p><strong>Timo</strong>: So I had a phase where, let&#8217;s say, I was in the rabbit hole and I did one exercise where I was trying to map my whole content production as the most extensive metric tree possible, and I still have it somewhere in Miro. It was really a monster. It was like, no one could ever implement this. I think it had in the end, I don&#8217;t know, 90 nodes. And obviously this doesn&#8217;t work, so I could easily see that. It was a nice exercise, a nice thing to do over the weekend. But it&#8217;s non-operational, so it doesn&#8217;t provide any operational value to do this. And by the way, it would not even be possible to track the whole thing, because it would include a lot of attribution stuff, which is not possible. So the same as what you say, I think the model that I use now is: try to really keep it simple. I don&#8217;t know, let&#8217;s say also try more to work with sub-trees and don&#8217;t be too, let&#8217;s say, deterministic with it. So give yourself the liberty to say, hey, I create, let&#8217;s say, a specific tree for this kind of new product feature that gets introduced and have no idea how it&#8217;s connected to the other tree.</p><p>Now that&#8217;s fine. 
I live with that because local optimization is still better than no optimization, so therefore be nice to yourself. But no, you definitely have to keep it quite simple, and then you have to know what you can do with it. So I know that some people use it for root cause analysis, and I think for that it&#8217;s really quite nice because you can say, okay, look, we lost so much new revenue, let&#8217;s say, compared to last month. So can we follow the tree to really see where stuff went off? So it can help you to, let&#8217;s say, get a starting point, but then still, the deep dive is something different. The metric tree will not tell you where it happened. And then another thing that I discovered, which is often overlooked, especially when you talk about metrics and the metric tree, and it&#8217;s a really big risk, is that time is completely out of this thing. So the metric tree lives in a timeless space. So the problem is, for example, you increase new accounts in the example that I had before. And then you wonder, okay, why don&#8217;t I see any kind of new revenue? Yeah, because it happens in 60, 90 days, whatever your model is or whatever the average time is that people usually take to upgrade into a subscription.</p><p>So the metric tree always does a very bad job where things, let&#8217;s say down in the branches, are already having an impact and you see the impact three months later at the top and you have no idea where it&#8217;s coming from. So if you would do it properly, you would do a cohort analysis where you cohort everything that happens.</p><p>But this doesn&#8217;t work on a tree. I think the important thing about working with trees is really to know the limitations, where it&#8217;s great or not. As I said, for me, it&#8217;s mostly for planning, for brainstorming, for communication. I guess some people use it also. 
For these check-in meetings, let&#8217;s say you have a weekly meeting, you check quickly, okay, how&#8217;s the business performing in specific areas? I think that can work as well. But it&#8217;s not like some people thought, oh, get rid of the dashboards and we just do metric trees. So no, that doesn&#8217;t really work. You still need specific, let&#8217;s say, explorative dashboards to figure out why a metric is actually looking a little bit wonky this month.</p><p><strong>Shane</strong>: I think back in the balanced scorecard days, there was this concept of simulation. It was almost like a digital twin, although that term wasn&#8217;t around then, from memory. So you could actually go in and say, if I change this input metric by a certain amount, what&#8217;s the flow through?</p><p>And from memory we were inferring a delay. Like you said, number of prospects coming into the funnel, how long would it be before they create an account and go onto a paid plan? How long would it be before that money turns up if it&#8217;s a 30 day window for the bill? And so you bake that into the model to a degree so you could simulate some changes, but that was quite technical.</p><p><strong>Timo</strong>: What I often do is I take a metric tree and I just break it down on a spreadsheet and then create something which we can call a growth model, where I then have the metric tree on the left defining all the different kinds of rows, and then I can put the timeline on the right in there, and then I can just model what you just described.</p><p>I can do some forecasting. And for me, the interesting part is I see the mechanics of the business, and then, look, I can change some conversion values and can see, okay, what impact does it have when we actually increase this by 10% and so on. And then I see how it plays out. 
So it&#8217;s a very basic and amateur way to do forecast simulation, but often quite enough for most of the stuff.</p><p><strong>Shane</strong>: And again, that relationship&#8217;s an interesting one and you just made me think about it. So we&#8217;re doing a whole lot of work in what we call the context planes. It&#8217;s this idea of taking the context of our data and bringing it back in; for us it&#8217;s centralized.</p><p>And we think about it as four types of context. We think about business context: actions, outcomes, glossaries, descriptions, those kinds of things. We think about structural context: physical schema, data types and all those kinds of things we know about our data. We think about operational context.</p><p>So when was data loaded, when was it refreshed? What&#8217;s the quality score? Those kinds of things. The last one we think about is agent context: the prompting, reinforcement, those kinds of things. And so there&#8217;s a bunch of object types we have. So we have actions as an object type, outcomes as an object type, metrics as an object type, a fact, a value as an object type.</p><p>Then the last part of it is the relationships across those object types. And the reason we&#8217;re working on this is this context map. Effectively, if you give it to an LLM, it&#8217;s really good at using it to answer questions. So you can do what we call blast radius.</p><p>If I change this bit of code, if I change this object, if I change this metric, what&#8217;s impacted? And so thinking about metric trees, they&#8217;re giving us a relationship across those metrics, which is effectively describing the business model for the organization. 
And that relationship would be really valuable to an LLM if you&#8217;re using an AI agent, because you can say, if I touch this metric, you know what the relationships are, infer what&#8217;s gonna happen. And that actually might be a non-deterministic way of getting a simulation quicker than having to build really complex data models.</p><p><strong>Timo</strong>: Yes. So I was experimenting a little bit with that. Let&#8217;s say in the easiest way, you just do a copy of your metric tree and put it in and then say, okay, can you please run this in specific ways. I also did some experimentation with, let&#8217;s say, a descriptive YAML format for metric trees, but was not really happy with that. So far I have abandoned the idea to do this. But no, you&#8217;re right. Also because when you put it in an LLM, then, for example, you can ask, okay, can we run a cohort simulation on top of that? So let&#8217;s assume we run this initiative in the next months where we think, okay, we will increase this kind of metric, and then we want to see what the impact will look like over the next months.</p><p>And what I often see, for example, when I have, let&#8217;s say, a full data setup for, let&#8217;s say, a startup, is that you can always use the metric tree in an LLM and then you can say, okay, let&#8217;s brainstorm on this. So right now we have this problem here, or we want to dive deeper into retention.</p><p>It&#8217;s like, this is the whole picture, so can you help us to break it down? So can you give us different kinds of versions of, let&#8217;s say, what would the level underneath look like? And still at the moment, this is my favorite LLM workflow or, let&#8217;s say, use case: really to use it as this massive brainstorming machine where you can say, look, okay, let&#8217;s look at this angle.</p><p>And I can say, oh, okay, let&#8217;s slightly change this kind of angle. If we bring this, and then because you have so many things, then. 
Mapped out to you so you can immediately say, oh yeah, this is the direction we should follow. Let&#8217;s go deeper there.</p><p>So it definitely helps to not do crazy stuff because you give it, let&#8217;s say, a form or structure.</p><p><strong>Shane</strong>: And then my natural reaction is, oh no, because LLMs are non-deterministic, the simulations it&#8217;s gonna run are wrong. But we&#8217;ve just already said that correlation of cause and effect across the metrics is hard to do, if not impossible. So a non-deterministic engine is just behaving like a human going, these are the patterns we see.</p><p>So actually using it to do those what-if analyses, those simulations, those how-are-these-things-related questions, is just as valuable as humans doing it. I hadn&#8217;t thought about it that way.</p><p><strong>Timo</strong>: And the LLMs nowadays would usually take the other approach and would write a Python program.</p><p>Often, when you ask these things, they say, yeah, let me write a quick script for you. It usually does that, and then you can just check. </p><p><strong>Shane</strong>: Interesting space. I think we&#8217;ve seen metric layers come out as BI semantic layers. And from my view, they&#8217;re not making it, right? They&#8217;re not really getting traction in the market. Maybe we&#8217;ll see a reinvigoration of the balanced scorecard products as metric tree products,</p><p>&#8216;cause there&#8217;s probably some value there. Alright, so let&#8217;s just go back to basics. I&#8217;m rocking into an organization, or I&#8217;m an organization and I wanna start off this journey. We&#8217;ve talked about the map being the most important thing, the shared conversation. So we need to build that out over time.</p><p>And we&#8217;ve talked about doing it step by step, don&#8217;t boil the ocean. We&#8217;ve talked about the fact that actually defining metrics is hard. 
So each one of those, you&#8217;re gonna think, oh, it&#8217;s only MRR. How hard can that be? You&#8217;ll find out. So we know that each metric is actually a lot harder to define and implement than we normally think, and takes time.</p><p>Where do you start? So if we think about this idea of input and output metrics and metrics in the middle, we&#8217;ve got this idea of a heartbeat metric, the core metric for your organization, that actually is the one you want to look at the most, and that&#8217;s gonna be made up of a whole lot of input metrics, we know that. Where do you start? Do you tend to start at the input side? Do you tend to start at the output side? Do you tend to start in the middle? How do you decide which metric to do first?</p><p><strong>Timo</strong>: Huh, that&#8217;s a good question. I start in two areas. Area number one is I usually bring, let&#8217;s say, a set of people of this company into one room who can explain, let&#8217;s say, the whole customer journey or the whole customer flows in this company to me. I usually work with, let&#8217;s say, startups, which are between seed and Series A.</p><p>So therefore, the whole business model or the whole business processes are not super complex yet. So it&#8217;s definitely possible in three hours to map the whole thing out. So I do a classic event storming session. Well, not classic; event storming is a little bit broader and it does more things. So I do a reduced version because I&#8217;m just interested in specific things. But when you ask a company to explain how the customer is basically going through the whole process, then you will identify these things. So you will identify, okay, what are the actually important points that they have to come to that, in the end, mean success for you? So we will find this one place usually pretty quickly where we say, okay, this is a sweet spot. So we definitely have to have a metric for that. 
We just have to see how we define it. For example, with an active user, as with North Star metrics in general, it&#8217;s a definition process that takes longer because you come up with a hypothesis where you say, okay, people have to, okay, I&#8217;m just making this up, let&#8217;s say when you&#8217;re Miro, they have to create one new dashboard every month, they have to share at least two, and then they have to add at least 20 cards on a board or so. Then you would say they&#8217;re active. So this definition has to be fine-tuned over time to see, okay, does it actually stick? This comes later. So I start with understanding the whole process. This is part number one. Part number two was something that I learned later. I started very early on with these event storming maps because I used them already in other projects for other things. They&#8217;re really great to understand the process very quickly. The part that was always missing for me, that came later for me, was that I sit down with the leadership team to understand strategy. I really want to understand where they want to go in the next six months. So what is the thing they want to improve, which kind of direction they want to go, because from there I get the priorities, which is what you just asked.</p><p>Okay, where do we get started? So what do we have to build out first? I always try to bring it as close as possible to the strategy that the company is pursuing right now, because if I do it, I usually get a lot more interest from management teams, and then from everyone else, because they have to report towards this. If I don&#8217;t do this, people would say, yeah, it&#8217;s interesting, but it doesn&#8217;t really, let&#8217;s say, count into our current initiative. So if I fail to support a current strategic push into something, then usually I get a really low adoption rate. 
So this is like the second, I would say most essential, part: that I really try to understand, okay, what they&#8217;re trying to achieve in the next six months.</p><p><strong>Shane</strong>: So once you&#8217;ve blueprinted out or event stormed the kind of, I think of it as a blueprint, you&#8217;re creating a hypothesis of what the metric tree might look like when you&#8217;re finished. Then what you&#8217;re saying is you then look and talk to &#8216;em about their strategy to figure out where you should zoom in. You&#8217;re basically saying, as part of that event map, that metric map, and based on your strategy, I think we should do these ones first because they seem the most valuable, the ones that&#8217;ll get the most traction or have the largest conversation and adoption internally, which is the goal: to get more people to understand it and use it and go, yeah, these are valuable.</p><p>Let&#8217;s do the rest of them.</p><p><strong>Timo</strong>: It&#8217;s even slightly different. It&#8217;s not like just highlighting the areas. Often there&#8217;s a different version of a metric tree that I use. So you can take a metric and you can break it down by specific things. So this is the classic thing that you do in BI as well. So you have sales, you can break it down by something. You can do the same thing with the metric tree. So you can take a total metric and then you can break it down by some kind of segmentation, whatever it is. In this segmentation, I can try to incorporate the different kind of strategy. So I want to have, let&#8217;s say, okay, let&#8217;s say Miro, I&#8217;m always using Miro because most people have used it before, so let&#8217;s say they push into AI-supported workspaces. 
So then I&#8217;m trying to set up metrics that will highlight, let&#8217;s say, the adoption of AI-supported workspaces versus non-AI-supported workspaces, so that everyone at first glance can immediately see, are we making progress with that or not? And so it&#8217;s really like creating a variant of the generic metric tree to say, okay, this is how it would be useful for the current strategy.</p><p><strong>Shane</strong>: Okay, so lemme play that one back because I just want to make sure I get it. And it&#8217;s this idea of the &#8220;by&#8221; statements of these metrics: I want this metric by channel. Before we started, we talked about the tools we use to create our content. And we talked about the fact that I use Zencastr for recording, I use Descript for editing and I use Podbean for publishing.</p><p>And now what I&#8217;m seeing is a convergence. Yeah. I&#8217;m seeing each of those three products add the other features that I do need into their product. And I was ranting about how that actually degrades the value of the one thing I use &#8216;em for. So let&#8217;s say that we&#8217;re Zencastr and we wanna move into real-time streaming and video editing.</p><p>So what I&#8217;d be looking for is the metric of active, whatever, maybe active users, maybe active sessions. And then I wanna be saying, okay, how many people are moving from audio recording only to video recording? Because that&#8217;s an input metric. If they&#8217;re not doing that, they&#8217;re never gonna use our video editing feature.</p><p>And then how many of those move from video recording to video editing, and then segmenting it maybe by region? And we say actually the US is the target market, so how many people? And by doing that, we&#8217;re effectively going, there&#8217;s a really small metric. 
Effectively, we&#8217;ve broken it down to its smallest part, but it&#8217;s there to support our strategy of getting more people using our video editing.</p><p>And so, okay, that makes sense for me.</p><p><strong>Timo</strong>: No, this is a perfect example. So exactly like that. And usually it then makes it a lot easier to talk to the people, and people will find it very interesting. They want to see the results from that.</p><p><strong>Shane</strong>: And you can get instant feedback, &#8216;cause somebody can say, actually yes, we are going after the video editing market. That&#8217;s one of our initiatives. &#8216;Cause we all know that most organizations have 52 initiatives every quarter. But they can say, that&#8217;s not our most valuable initiative. This one over here actually is our most valuable and that&#8217;s what we should work on.</p><p>So when you are talking, you&#8217;re bringing a whole lot of patterns to the table to support the metric trees. So you are talking about event storming. You&#8217;ve talked about jobs to be done. You&#8217;ve talked about domain-driven design, you&#8217;ve talked about journey mapping.</p><p>There&#8217;s a whole lot of really valuable product and data and agile patterns you&#8217;ve just talked about.</p><p><strong>Timo</strong>: You can see where I&#8217;m coming from. So there&#8217;s a big product influence in this whole thing. Yeah. </p><p><strong>Shane</strong>: They&#8217;re all valuable patterns that support what seems to be a simple pattern of metric trees, which is define a metric and say how they relate.</p><p>Who does the work then? Because if I talk about a data team, often I don&#8217;t see data teams applying product thinking or product patterns, they&#8217;re data teams. 
And so you get this idea of a data product manager now who kind of bridges it. But who is the type of person that would do this work, given the level of pattern understanding and experience that you actually need to make metric trees work to the best of their ability, not just defining a metric on a dashboard? Who do you see doing it?</p><p><strong>Timo</strong>: That&#8217;s an excellent question. I don&#8217;t really see data product managers, obviously because I don&#8217;t really see them happening so much. So from my experience, when I talk to some people who use metric trees, I would say most of the time these are heads of data leading the data team.</p><p>Why is it a good fit? Because they have a strategic role, and some of these people struggle a little bit to have a strategic role because they love data stuff. So it&#8217;s a tricky process to go from something which was very operational to something which becomes more strategic. But let&#8217;s say the one or two people that I had long talks with about this were all heading data teams, and they were completely concerned about, okay, how can we do a better job to connect to what the business is actually doing? They also had a high frustration level, to be fair: so they define North Star metrics without us, and they do all these crazy things where we actually, as a data team, should be involved, but there&#8217;s a reason why they don&#8217;t talk to us, so maybe we are not really well prepared for that.</p><p>And so this is where they invest their time. And I think for someone who&#8217;s leading a data team, all these things are good things to learn because they have a very strategic aspect to them. And you have the possibility to understand the business, which helps a lot.</p><p>Let&#8217;s say as the head of data, you are the chief sales officer for all the data work. 
So therefore, to have a good idea where your work makes the most impact and gets, let&#8217;s say, your team a lot of fame is definitely not bad to learn. So I would place it most likely there, to also not create a different new role for that.</p><p><strong>Shane</strong>: And it comes back to that value of the data team, value of the things they&#8217;re doing. And in my head, within the Information Product Canvas, we have this area called business questions, the questions that we want to answer with that product. And from there I can infer metrics relatively easily.</p><p>And so therefore, if I think about it as a map, it goes: a business question is supported by a metric, and an action is supported by a business question. If I answer that business question, what action are you gonna take? Now, when I think about metrics in a metric tree as more a business model, I&#8217;ve gotta think about where it fits in the model in my head, because now I&#8217;m going, oh, actually, hold on. They&#8217;re not really a subset of a business question. They&#8217;re sitting in this pattern map in a different place, and I&#8217;m gonna need to think about that one, &#8216;cause that&#8217;s changed my thinking around metric trees, because I just started off this conversation around it&#8217;s just a metric and, yeah, I need a metrics layer, rather than it is actually a simulation or a visualization of a business model. </p><p><strong>Timo</strong>: Yeah. Yeah. Business process. So like both. It&#8217;s more in this direction.</p><p><strong>Shane</strong>: Yeah. That&#8217;s perfect. All right. So if people want to find you and find your writing, and you run a course that teaches some or all of this stuff?</p><p><strong>Timo</strong>: Yeah, not yet. Okay, so a guest and I, so if, let&#8217;s say, you look for metric tree content, you will come across a small group of people who wrote about it. And so a guest is one, I&#8217;m another one. 
There are some others; Arby has written a lot of stuff about it. Ollie as well.</p><p>So we are working on a course, but we want to do it right, so this takes a bit of time. We already did some iterations. We have some ideas. Now, in general, if people want to read what I&#8217;m writing, you can go to de odeo.com where I write my newsletter. I sometimes write about metric trees. Metrics, product analytics, event data models is usually what I write about. And sometimes I do this as well on LinkedIn. You can also just follow me there. Or if you have a direct question, you can also just write me directly on LinkedIn. I usually try to answer them.</p><p><strong>Shane</strong>: Excellent. I read your content. It&#8217;s great. You can tell when somebody knows their craft and then they spend the time to figure out how to write something that teaches it. &#8216;Cause I dunno about you, I find I can write content quickly if I don&#8217;t worry about simplicity. But when I try and write something to explain a complex thing with simplicity, that takes me a long time to get it where I&#8217;m like, yep, I think I&#8217;m happy.</p><p>And then testing it, giving it to somebody so they can tell me what it actually means, that is a lot harder. And the way you write, I can read it and go, ah, actually I think I get it.</p><p><strong>Timo</strong>: That&#8217;s good. No, like I just wrote, I think this is the longest that I ever wrote, like an 8,000-word piece about the history and future of digital analytics. And I think this thing was brewing in my head for four months, back and forth with different kind of variations. So like some sketches on paper and then, ah, I&#8217;m not really happy.</p><p>No, this is the wrong direction. 
And so yeah, these things take long until</p><p><strong>Shane</strong>: So on that one, what we&#8217;re seeing now is that generation of AI apps is incredibly fast. And 30, 40 years ago, we used to struggle with one system sitting on a mainframe or whatever and getting the data. And then we moved to enterprise resource planning and CRMs.</p><p>So we always ended up with five to 10, and then we moved to software as a service. So we ended up with 15 to a hundred to a thousand systems. Now, with the ability to spawn up an app in a heartbeat, we are gonna end up with a problem of a thousand, 10,000 systems that capture data for an organization that allows that to happen. That&#8217;s gonna change the way we do product analytics, isn&#8217;t it?</p><p>Or is it not? Do you think that the techniques and patterns and technologies we have today are gonna be able to handle this idea that there are 20,000 one-shot apps in an organization that are being built for a very small use case for a very small set of personas?</p><p><strong>Timo</strong>: That&#8217;s a very interesting question. I would say in general, not. So for example, what really took me some time to get to, and others came to this point as well, and unfortunately it&#8217;s not taught so much in product analytics, is that in the end the tricky thing in product analytics is where you find a high abstraction layer that still explains the product enough, but is simple enough that you can basically do some calculation and good stuff around it.</p><p>So what really works well is the user state model, which is in the end a growth model. So you say, okay, we have a new user. The new user can become activated. It can become active, it can become at risk when it stops being active, then it can be dormant when basically nothing happens anymore, and then it can be resurrected and whatever.</p><p>So you can basically create a loop where people can move through different kind of states. 
If you take this kind of model, then you can have 400 things under it, as long as you can map the things that are happening in them to one account. This is always the tricky thing in product analytics. So as long as you can do this, you can still then abstract it on this high level. You can still say, okay, people do different things. Or let&#8217;s say we capture different things that people are doing in, let&#8217;s say, these 100 internal tools that we are using; as long as we can bubble them up into one place where we can say, okay, look, when people show these kind of activities, we flag them as an active user, I would say the system still would work. I would say this is the only escape. If you go the classic product analytics approach where you try to track everything and then try to figure something out, obviously this is something you cannot win. You really have to go to a very high abstraction layer. I think the tricky thing is really how you do identity stitching. So let&#8217;s say the basic function of making marketing and product analytics work is you have to combine all these different kind of things. So if you cannot combine all the different kind of signals happening somewhere, then obviously you cannot analyze them for this one account.</p><p>So then you basically have 20 phantom accounts that are actually one, but you have no idea that they are. And this, I think, can really be an issue, especially when these tools can pop up everywhere in the company and you don&#8217;t have a concept to make sure, oh, at least we should identify the people in similar ways everywhere.</p><p><strong>Shane</strong>: Actually, you&#8217;re just giving me a visual map in my head of how you put all this together potentially. So effectively what I think you are saying is, when we have 10,000 apps that are one-shot apps, the first thing we have to do is make sure we&#8217;ve defined the core concepts of our organization.</p><p>So concept of a user. Yeah. 
And that&#8217;s important. We know that anything to do with a user is important to our business. So that concept needs to be defined and held. And then if your application is touching a user, touching that concept, then you need to be aware of that. So you are touching the user concept, and that&#8217;s an important concept for us.</p><p>And then we have the metrics that are a form of statement around active user but also state of user, so there&#8217;s a state flow. So your application actually has to be able to do whatever it needs to do to tell us when there&#8217;s a state change for that, so we can define active or inactive.</p><p><strong>Timo</strong>: The interesting thing is, usually I like to model the state changes, so I don&#8217;t want to have the applications define it, because we will play around with this. So we will have different definitions over time. We might even break it down. We might have six definitions of an active user. This is great stuff that you can do in a data model, in a data warehouse.</p><p>I usually tend not to have it already on the application layer. I just need to track all the things that are happening. This is why events are nice, &#8216;cause events I can use to derive a state change.</p><p><strong>Shane</strong>: But to do that from a governance point of view when you&#8217;ve got 10,000 apps, you&#8217;re gonna have to say to them, in your app, when a user changes to this state, you need to push it out as an event that I can see.</p><p>That has to be a governed thing: you have to actually do these events,</p><p>&#8216;cause these are the core events. If you don&#8217;t do this, your app is not valuable. And then how do we know it&#8217;s not valuable? What we can do is show them the metric tree and say, if you&#8217;re not telling us that you are changing the state of a user, then this metric won&#8217;t move. 
And if this metric doesn&#8217;t move, then these other metrics don&#8217;t move.</p><p>And they&#8217;re our core North Star metrics. Internal metrics, sorry. Our North Star metrics, see, I got it wrong then; it&#8217;s like our internal metrics are North Star. No, they&#8217;re not. North Star metrics are about the customer. So they&#8217;re our internal metrics, but they&#8217;re the important ones. So if you can&#8217;t tell us you&#8217;re improving that metric, then why the hell are you building that app?</p><p>And it&#8217;s a combination of all those patterns, and defining them early, so that all those AI slop applications actually fit into this governance. </p><p><strong>Timo</strong>: I definitely have to make a case for AI slop applications. So I privately love to build them. Because I&#8217;m a former product person, I still am a product person at heart, and my biggest problem was always that I need to see things so that I can make decisions. So you can create a wireframe, sure. But it&#8217;s not really like having an app in front of you. So now, for the first time, you really have the possibility that you can go down three ways. I would say, okay, I would do it like this. You can compare them and you can say, yeah, no, this totally makes more sense. So I think it&#8217;s really great what&#8217;s happening, but you&#8217;re completely right.</p><p>So it will create an interesting complexity for us.</p><p><strong>Shane</strong>: Oh, we&#8217;ve seen this before, and democratization is brilliant. The ability to put the tools in the hands of people that aren&#8217;t professionals in their art is great. That is massively valuable. We&#8217;ve seen it time and time again, but we&#8217;ve also seen the impact when we don&#8217;t bring in the principles, policies and patterns that are useful. DBT allowed a lot of people to write transformation code, which is great. 
It removed the bottleneck of centralized data engineers who were never allowed to do it fast enough, but what we ended up with is 5,000 DBT, and I&#8217;m using air quotes here, models, and we lost the definition of a data model, of a conceptual model, of those kind of things, because we didn&#8217;t apply the policies, patterns and principles that were valuable. So I can just see, with the ability for democratization of building apps, which is great, we are gonna have the same problem. And for me, this idea of events, defining the conceptual model, and a metrics tree could be the things that we use to create those principles, policies, and patterns. </p><p><strong>Timo</strong>: That&#8217;s true. I also never thought about this in this way. I think it totally makes sense. No, you&#8217;re completely right. I&#8217;m a big believer in democratization on one side, but on the other side, have a really good foundational concept, where you define specific metrics that will tell you if the thing still works. So for DBT, my classic metric is how long does it take a new member of your team to understand the model and make a production commit. If you have 5,000 models, that will take ages. </p><p><strong>Shane</strong>: Whereas my metric is: what was the original time from a stakeholder saying they had a problem to being served with something that solved it with data? And now you&#8217;ve moved to a team of 10 new analytics engineers and DBT, has that time come down? &#8216;Cause if it hasn&#8217;t, that&#8217;s great.</p><p>Busy work. Thank you for hiring more people and being really busy. And the other one that I often talk about when I run my course is the clock starts when the stakeholder says they have a problem. Not when that problem hits the Jira queue. 
Prioritized for the data team. Because if that takes three months, actually the stakeholder has already said it&#8217;s three months late.</p><p>And it&#8217;s not the data team&#8217;s fault, because they&#8217;re not allowed to work on it until it hits the team. But if we think about nodes and links and metric trees, system maps, now we just wanna focus on the prioritization process because that&#8217;s where it&#8217;s broken. That&#8217;s three months. And if the team takes a day, it&#8217;s still three months and a day as far as the stakeholder is concerned.</p><p>So same kind of patterns as you: some form of event storming, some form of jobs to be done, some form of domain-driven bucketing, some form of journey mapping. We can apply that to the way teams work as well. It&#8217;s the same set of patterns and they have value. Yeah, end of rant. But I can see that metrics tree and those event definitions of those core events as being really valuable in the democratization we&#8217;re moving to.</p><p>Excellent. All right, so at the beginning you said, oh, I&#8217;m not sure we can talk about metric trees for an hour. And I said we&#8217;ll talk about lots of stuff, as we did. Hey, look, thank you for coming on the show. It&#8217;s been awesome. 
I&#8217;ve learned lots and I hope everybody has a simply magical day.</p><h2>&#171;oo&#187;</h2><div class="pullquote"><p><em>Stakeholder - &#8220;Thats not what I wanted!&#8221; <br>Data Team - &#8220;But thats what you asked for!&#8221;</em></p></div><p>Struggling to gather data requirements and constantly hearing the conversation above?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Bu2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" width="387" height="342" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:342,&quot;width&quot;:387,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19725,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/160520537?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Want to learn how to capture data and information requirements in a repeatable way so stakeholders love them and data teams can build from them, by using the Information Product Canvas.</p><p>Have I got the book for you!</p><p>Start your journey to a new Agile Data Way of Working.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://adiwow.com/168&quot;,&quot;text&quot;:&quot;Buy the Agile Data Guide now!&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://adiwow.com/168"><span>Buy the Agile Data Guide now!</span></a></p><h2>&#171;oo&#187;</h2>]]></content:encoded></item><item><title><![CDATA[Diagramming to Understanding Your Data Estate with Rob 
Long]]></title><description><![CDATA[AgileData Podcast #76]]></description><link>https://agiledata.info/p/diagramming-to-understanding-your</link><guid isPermaLink="false">https://agiledata.info/p/diagramming-to-understanding-your</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Mon, 13 Oct 2025 20:09:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/-vLJDI_tBa0" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Join Shane Gibson as he chats with Rob Long about diagramming patterns that you can use to quickly document and share your data estate.</p><blockquote><p><strong><a href="https://agiledata.substack.com/i/176072946/listen">Listen</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/176072946/google-notebooklm-mindmap">View MindMap</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/176072946/google-notebooklm-briefing">Read AI Summary</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/176072946/transcript">Read Transcript</a></strong></p></blockquote><p></p><h2>Listen</h2><p>Listen on all good podcast hosts or over at:</p><p><a href="https://podcast.agiledata.io/e/diagramming-to-understanding-your-data-estate-with-rob-long-episode-76/">https://podcast.agiledata.io/e/diagramming-to-understanding-your-data-estate-with-rob-long-episode-76/</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://podcast.agiledata.io/e/diagramming-to-%E2%80%8Aunderstanding-your-data-estate-with-rob-long-episode-76/&quot;,&quot;text&quot;:&quot;Listen to the Agile Data Podcast Episode&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://podcast.agiledata.io/e/diagramming-to-%E2%80%8Aunderstanding-your-data-estate-with-rob-long-episode-76/"><span>Listen to the Agile Data Podcast Episode</span></a></p><p></p><blockquote><p><strong>Subscribe:</strong> <a 
href="https://podcasts.apple.com/nz/podcast/agiledata/id1456820781">Apple Podcast</a> | <a href="https://open.spotify.com/show/4wiQWj055HchKMxmYSKRIj">Spotify</a> | <a href="https://www.google.com/podcasts?feed=aHR0cHM6Ly9wb2RjYXN0LmFnaWxlZGF0YS5pby9mZWVkLnhtbA%3D%3D">Google Podcast </a>| <a href="https://music.amazon.com/podcasts/add0fc3f-ee5c-4227-bd28-35144d1bd9a6">Amazon Audible</a> | <a href="https://tunein.com/podcasts/Technology-Podcasts/AgileBI-p1214546/">TuneIn</a> | <a href="https://iheart.com/podcast/96630976">iHeartRadio</a> | <a href="https://player.fm/series/3347067">PlayerFM</a> | <a href="https://www.listennotes.com/podcasts/agiledata-agiledata-8ADKjli_fGx/">Listen Notes</a> | <a href="https://www.podchaser.com/podcasts/agiledata-822089">Podchaser</a> | <a href="https://www.deezer.com/en/show/5294327">Deezer</a> | <a href="https://podcastaddict.com/podcast/agiledata/4554760">Podcast Addict</a> |</p></blockquote><p></p><div id="youtube2--vLJDI_tBa0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;-vLJDI_tBa0&quot;,&quot;startTime&quot;:&quot;2s&quot;,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/-vLJDI_tBa0?start=2s&amp;rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>You can get in touch with Rob via <a href="https://www.linkedin.com/in/robert-s-long">LinkedIn</a> or over at:</p><div class="embedded-publication-wrap" 
data-attrs="{&quot;id&quot;:4151733,&quot;name&quot;:&quot;AtLongLastAnalytics&quot;,&quot;logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Y2Sn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff873296-b1ce-48f9-9ad6-7a57cc53a452_500x500.png&quot;,&quot;base_url&quot;:&quot;https://atlonglastanalytics.substack.com&quot;,&quot;hero_text&quot;:&quot;A one-stop shop for all things data engineering, analytics, and strategy.&quot;,&quot;author_name&quot;:&quot;AtLongLast Analytics&quot;,&quot;show_subscribe&quot;:true,&quot;logo_bg_color&quot;:&quot;#ffffff&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPublicationToDOMWithSubscribe"><div class="embedded-publication show-subscribe"><a class="embedded-publication-link-part" native="true" href="https://atlonglastanalytics.substack.com?utm_source=substack&amp;utm_campaign=publication_embed&amp;utm_medium=web"><img class="embedded-publication-logo" src="https://substackcdn.com/image/fetch/$s_!Y2Sn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff873296-b1ce-48f9-9ad6-7a57cc53a452_500x500.png" width="56" height="56" style="background-color: rgb(255, 255, 255);"><span class="embedded-publication-name">AtLongLastAnalytics</span><div class="embedded-publication-hero-text">A one-stop shop for all things data engineering, analytics, and strategy.</div><div class="embedded-publication-author-name">By AtLongLast Analytics</div></a><form class="embedded-publication-subscribe" method="GET" action="https://atlonglastanalytics.substack.com/subscribe?"><input type="hidden" name="source" value="publication-embed"><input type="hidden" name="autoSubmit" value="true"><input type="email" class="email-input" name="email" placeholder="Type your email..."><input type="submit" class="button primary" value="Subscribe"></form></div></div><p>The article we talk about in the podcast 
episode:</p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:164669741,&quot;url&quot;:&quot;https://atlonglastanalytics.substack.com/p/data-producer-consumer-diagrams&quot;,&quot;publication_id&quot;:4151733,&quot;publication_name&quot;:&quot;AtLongLastAnalytics&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Y2Sn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff873296-b1ce-48f9-9ad6-7a57cc53a452_500x500.png&quot;,&quot;title&quot;:&quot;Data Producer-Consumer Diagrams: Understanding Your Data Estate&quot;,&quot;truncated_body_text&quot;:&quot;Read time: 8 minutes&quot;,&quot;date&quot;:&quot;2025-05-30T16:31:01.862Z&quot;,&quot;like_count&quot;:1,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:319352722,&quot;name&quot;:&quot;AtLongLast Analytics&quot;,&quot;handle&quot;:&quot;atlonglastanalytics&quot;,&quot;previous_name&quot;:&quot;Rob Long&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8a3ab027-fa30-47db-8616-0ba10193864f_460x460.png&quot;,&quot;bio&quot;:&quot;Independent consultant spanning data engineering and analytics, data strategy, and Microsoft Azure. Trying to make data easier to understand and more accessible! 
&quot;,&quot;profile_set_up_at&quot;:&quot;2025-02-18T18:41:22.316Z&quot;,&quot;reader_installed_at&quot;:&quot;2025-02-21T04:15:39.693Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:4234141,&quot;user_id&quot;:319352722,&quot;publication_id&quot;:4151733,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:4151733,&quot;name&quot;:&quot;AtLongLastAnalytics&quot;,&quot;subdomain&quot;:&quot;atlonglastanalytics&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;A one-stop shop for all things data engineering, analytics, and strategy.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff873296-b1ce-48f9-9ad6-7a57cc53a452_500x500.png&quot;,&quot;author_id&quot;:319352722,&quot;primary_user_id&quot;:319352722,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-02-18T18:43:38.667Z&quot;,&quot;email_from_name&quot;:&quot;AtLongLast Analytics Newsletter&quot;,&quot;copyright&quot;:&quot;AtLongLast Analytics LLC&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[]}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" 
href="https://atlonglastanalytics.substack.com/p/data-producer-consumer-diagrams?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!Y2Sn!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff873296-b1ce-48f9-9ad6-7a57cc53a452_500x500.png" loading="lazy"><span class="embedded-post-publication-name">AtLongLastAnalytics</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Data Producer-Consumer Diagrams: Understanding Your Data Estate</div></div><div class="embedded-post-body">Read time: 8 minutes&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">10 months ago &#183; 1 like &#183; AtLongLast Analytics</div></a></div><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? 
The Information Product Canvas helps you get clarity in 30 minutes or less.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><h2>Google NotebookLM Mindmap </h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Om6c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Om6c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png 424w, https://substackcdn.com/image/fetch/$s_!Om6c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png 848w, https://substackcdn.com/image/fetch/$s_!Om6c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png 1272w, https://substackcdn.com/image/fetch/$s_!Om6c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Om6c!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png" width="1200" height="3028.846153846154" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:3675,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:3941575,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/176072946?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Om6c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png 424w, https://substackcdn.com/image/fetch/$s_!Om6c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png 848w, https://substackcdn.com/image/fetch/$s_!Om6c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Om6c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdca1c6d-5d66-4af7-a80a-5ad3f7e25e2a_5918x14936.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p></p><h2>Google NotebookLM Briefing</h2><h2>Executive Summary</h2><p>This document synthesizes key insights from a discussion with data strategy consultant Rob Long regarding the use of <strong>Data Producer-Consumer Diagrams</strong> as a powerful framework for understanding, managing, and communicating the complexities of a data estate. 
The core concept involves mapping the flow of data through a series of <strong>Producers</strong> (entities that generate/capture data), <strong>Consumers</strong> (entities that use data for a purpose), and <strong>Handovers</strong> (the interfaces between them).</p><p>The primary value of this approach lies in its ability to create a unified, workflow-driven view that integrates disparate organizational artifacts like organizational charts, technology architecture diagrams, and data lineage maps. These diagrams are not meant to replace existing technical documentation but to complement it by providing a more accessible, story-driven perspective. They can be tailored in granularity&#8212;from high-level conceptual models for executives to detailed technical maps for engineers&#8212;allowing for targeted communication.</p><p>Key applications include clarifying team roles and responsibilities, performing impact analysis for technology changes, defining the scope of data contracts beyond mere technical schemas, and identifying process inefficiencies. By visually representing the interplay of people, processes, technology, and data, these diagrams provide the necessary scaffolding for informed, strategic conversations and decision-making, ultimately reducing organizational friction and aligning data initiatives with business objectives.</p><p>--------------------------------------------------------------------------------</p><h2>1. The Producer-Consumer Framework</h2><p>The foundation of this methodology rests on a simple yet powerful set of definitions derived from systems thinking, providing a common language to describe any data workflow.</p><h3>Core Concepts</h3><ul><li><p><strong>Data Producer:</strong> An individual, team, or system responsible for generating or capturing data. 
Examples range from a physical sensor reading temperature to a CRM system capturing customer interactions.</p></li><li><p><strong>Data Consumer:</strong> An entity that receives and utilizes data for a specific purpose, such as answering business questions, training a model, or generating a report.</p></li><li><p><strong>Handover (or Link):</strong> The critical interface where a Producer transfers data to a Consumer. This encompasses the mechanisms and agreements governing the exchange, including data contracts, quality checks, compliance rules, and delivery formats.</p></li></ul><p>This terminology is analogous to the systems thinking concepts of &#8220;Nodes&#8221; (where a job is done) and &#8220;Links&#8221; (the handover between nodes).</p><h3>Fundamental Value Proposition</h3><p>The central benefit of this framework is its ability to unify disparate views of a data estate into a single, coherent narrative. As described by Rob Long, &#8220;it helps give you a unified view of your data estate... together they give you that workflow driven kind of diagrams which help unify everyone and reduce organizational friction.&#8221;</p><ul><li><p><strong>Complements Existing Artifacts:</strong> It enriches traditional documents like vertical organizational charts by providing a horizontal, workflow-oriented perspective. It integrates views of people (org charts), technology (architecture diagrams), and data movement (lineage diagrams).</p></li><li><p><strong>Workflow-Driven Perspective:</strong> The diagrams focus on the end-to-end flow of work and data, clarifying how value is created and transferred across teams and systems.</p></li><li><p><strong>Establishes a Common Ground:</strong> By using a simple, intuitive model, it allows technical and non-technical stakeholders to engage in meaningful conversations about complex data processes.</p></li></ul><h2>2. 
Key Applications and Use Cases</h2><p>The producer-consumer diagramming approach is a versatile tool with a wide range of practical applications for data strategy, architecture, and team management.</p><h3>Mapping a Multi-Layered View</h3><p>The framework is capable of mapping multiple organizational dimensions onto a single diagram, providing a rich, contextualized picture. This includes mapping:</p><ul><li><p>Data systems and their interactions.</p></li><li><p>The flow of data through different architectural layers (e.g., Bronze, Silver, Gold in a Medallion Architecture).</p></li><li><p>Team design and the boundaries of responsibility.</p></li><li><p>The specific technologies and tools used at each stage.</p></li><li><p>The personas (e.g., Data Engineer, Business Analyst) involved in the workflow.</p></li></ul><h3>Variable Granularity for Diverse Audiences</h3><p>A key strength is the ability to adjust the level of detail to suit the audience and the story being told.</p><ul><li><p><strong>High-Level (Executive View):</strong> A simple map with a few nodes (e.g., &#8220;Source System,&#8221; &#8220;Data Lake,&#8221; &#8220;Data Warehouse,&#8221; &#8220;Reporting&#8221;) can tell a clear, conceptual story without overwhelming detail.</p></li><li><p><strong>Fine-Grained (Technical View):</strong> The same map can be expanded to show intricate details within each node, such as specific data quality rules, data mastering processes, metric definitions, and the technologies involved.</p></li></ul><p>This flexibility allows for the creation of &#8220;variants very simply which tell different stories for the audience.&#8221;</p><h3>Enhancing Handovers and Data Contracts</h3><p>The framework places significant emphasis on the &#8220;handover,&#8221; treating it as a critical point for negotiation and clarity.</p><ul><li><p><strong>Identifying Waste:</strong> It helps uncover process inefficiencies, such as when a producer generates data that is never used or when a 
consumer needs information that is not provided, forcing them to perform redundant work.</p></li><li><p><strong>Informing Data Contracts:</strong> The model is crucial for defining what needs to go into a data contract. It pushes the concept beyond technical specifications (schema, field types) to what it truly should be: &#8220;an agreement between two parties... a negotiation between the producer and the consumer about what&#8217;s needed for both sides to do their job well.&#8221; This includes non-technical aspects like documentation levels and support expectations.</p></li></ul><h3>Facilitating Feedback and Agile Processes</h3><p>The diagrams inherently support modern development practices by visualizing necessary communication channels.</p><ul><li><p><strong>Feedback Loops:</strong> In reality, data flows are not purely unidirectional. The model highlights the importance of feedback loops from consumers back to producers to report errors, document issues, or request changes. This moves teams away from &#8220;throwing things over the fence&#8221; and towards collaborative problem-solving.</p></li><li><p><strong>Agile Gates:</strong> It provides a structure for implementing agile patterns like &#8220;Definition of Ready&#8221; (what a consumer needs before starting work) and &#8220;Definition of Done&#8221; (what a producer must complete before handing off work).</p></li></ul><h2>3. 
A Practical Example: Mapping a Data Workflow</h2><p>A concrete example illustrates how these concepts are applied to map an entire data pipeline, layering multiple dimensions of information.</p><h3>Scenario Overview</h3><p>A common data workflow can be visualized with five primary nodes:</p><ol><li><p><strong>Source System</strong></p></li><li><p><strong>Data Lake</strong></p></li><li><p><strong>Data Warehouse</strong></p></li><li><p><strong>Analysis/BI Layer</strong></p></li><li><p><strong>Report</strong></p></li></ol><h3>Layering Information</h3><p>This basic flow can be enriched with additional layers of context to tell a more complete story.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FOs0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FOs0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png 424w, https://substackcdn.com/image/fetch/$s_!FOs0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png 848w, https://substackcdn.com/image/fetch/$s_!FOs0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png 1272w, https://substackcdn.com/image/fetch/$s_!FOs0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FOs0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png" width="715" height="411" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:411,&quot;width&quot;:715,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:55892,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/176072946?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FOs0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png 424w, https://substackcdn.com/image/fetch/$s_!FOs0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png 848w, https://substackcdn.com/image/fetch/$s_!FOs0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png 1272w, https://substackcdn.com/image/fetch/$s_!FOs0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0692f4bb-f6fe-4aae-b261-9bc230a1a438_715x411.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><h3>Derived Insights and Strategic Questions</h3><p>This single, relatively simple diagram immediately provides significant value:</p><ul><li><p><strong>Quick Comprehension:</strong> It tells an instant story. An observer can see, &#8220;you&#8217;re a Microsoft stack... I&#8217;m not seeing Snowflake, I&#8217;m not seeing Databricks.&#8221;</p></li><li><p><strong>Generates Insightful Questions:</strong> The map acts as a catalyst for deeper inquiry. 
For instance:</p><ul><li><p>Are the Data Analysts read-only in the Data Warehouse, or can they write transformations?</p></li><li><p>Where are business metrics defined? Are they only in the Power BI semantic layer?</p></li><li><p>Is there a mismatch between the GUI-based tools (Power BI, Excel) and the skills of a potential hire who is a &#8220;hardcore Python coder&#8221;?</p></li></ul></li><li><p><strong>Identifies Boundaries:</strong> The diagram visually delineates responsibilities. It becomes clear where ownership lies within a single team (e.g., Analyst BI to Report) and where cross-team handovers occur (e.g., Data Engineering to Data Analytics), highlighting areas that require formal processes for communication and problem-tracking.</p></li></ul><h2>4. Strategic Benefits and Implementation</h2><p>Beyond tactical mapping, the producer-consumer framework is a strategic tool for driving change, fostering alignment, and building robust data capabilities.</p><h3>Driving Informed Decisions</h3><p>The diagrams provide &#8220;the scaffolding for meaningful conversation and decision-making.&#8221;</p><ul><li><p><strong>Impact Analysis:</strong> They make the &#8220;blast impact&#8221; of a proposed change instantly visible. For example, replacing a BI tool is not just a single change; the map shows it could necessitate replacing &#8220;one-third of our data layered architecture.&#8221;</p></li><li><p><strong>Challenging Assumptions:</strong> By providing a concrete reference point, the diagrams allow stakeholders to challenge decisions with an informed opinion, moving conversations away from pure conjecture.</p></li></ul><h3>Collaborative Creation and Alignment</h3><p>The process of creating these diagrams is as valuable as the final artifact.</p><ul><li><p><strong>Workshop Approach:</strong> A highly effective method involves collaborative workshops where a team jointly maps its processes. 
This is a quick way to document workflows and often reveals stark differences between how leaders believe work is done and how it actually is.</p></li><li><p><strong>Synthesizing Perspectives:</strong> Another successful technique involves creating separate diagrams with different groups (e.g., technical contributors, managers, executives) and then bringing them together. This process uncovers misalignments in definitions (e.g., what constitutes a &#8220;data set&#8221;) and processes, leading to the creation of a new, shared &#8220;truth&#8221; that becomes the adopted standard.</p></li></ul><h3>A Tool for Strategy, Not Just Technology</h3><p>The framework encourages a holistic view of data strategy, aligning with the &#8220;four pillars&#8221; of <strong>People, Process, Technology, and Data</strong>. It forces a focus on foundational questions before technology selection:</p><ul><li><p>What does success look like and how will it be measured?</p></li><li><p>What team design and ways of working are needed to achieve the goal?</p></li></ul><p>The architecture then becomes &#8220;a means to the end... how you achieve the goal, it&#8217;s not the goal.&#8221; This helps in designing a &#8220;minimal system to achieve the goal and to hit success&#8221; rather than an over-engineered solution.</p><h2>5. 
The Underutilization of Systems Thinking in Data</h2><p>Despite the long history of systems thinking in fields like lean manufacturing, its application in the data domain remains rare.</p><h3>The Communication Gap</h3><p>The reluctance to adopt these end-to-end mapping techniques may stem from several factors:</p><ul><li><p><strong>Hyper-Specialization:</strong> Data roles are often highly specialized (e.g., Scala developer, dbt modeler), with practitioners not always being taught to consider the entire system.</p></li><li><p><strong>Lack of End-to-End Onboarding:</strong> Unlike factory workers who &#8220;walk the line&#8221; to understand the full process, data professionals are often not onboarded with a holistic view of the data flow.</p></li><li><p><strong>Communication Skills:</strong> There is a recognized gap in &#8220;soft skills,&#8221; particularly the ability to communicate complex technical ideas to diverse, multi-disciplinary audiences.</p></li></ul><h3>Bridging the Technical and Business Worlds</h3><p>Producer-consumer diagrams are a vital tool for bridging this gap.</p><ul><li><p><strong>Making Complexity Accessible:</strong> They distill complex ideas into simple, understandable stories. As Rob Long notes, a key skill is to &#8220;make information accessible, whether that&#8217;s just choosing the right type of diagram, using the right vocabulary.&#8221;</p></li><li><p><strong>Educating Stakeholders:</strong> The diagrams can be used to explain technical concepts like data lineage at a conceptual level to executives. 
This isn&#8217;t to make them experts, but to build awareness so they can better understand the value and necessity of investments in areas like data contracts and observability.</p></li><li><p><strong>Fostering Business Acumen:</strong> The framework encourages data professionals to become more business-driven by understanding how their technical work fits into the broader value stream, aligning with the principle that &#8220;as companies want to become more data driven, data engineers should want to become more business driven.&#8221;</p></li></ul><p></p><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? The Information Product Canvas helps you get clarity in 30 minutes or less.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><p></p><h2>Transcript</h2><p><strong>Shane</strong>: Welcome to the Agile Data Podcast. I&#8217;m Shane Gibson.</p><p><strong>Rob</strong>: I&#8217;m Robert Long.</p><p><strong>Shane</strong>: Hey, Rob. Thanks for coming on the podcast. Today I wanted to have a chat with you about this article you wrote that I really loved. It was called Data Producers Consumer Diagrams, and Understanding Your Data Estate. But before we rip into that, why don&#8217;t you give the audience a bit of background about yourself.</p><p><strong>Rob</strong>: Yeah, I&#8217;d love to. So I got started with data in academia. I actually have a PhD in applied mathematics and geophysics. So I was doing mathematical modeling and numerical simulation of planets and stars. Things like sunspots and storms on Jupiter. 
And without realizing at the time, what I was doing was bringing machine learning and data engineering to that field, &#8217;cause these simulations generate big data and we were implementing supervised machine learning, but it was still a step up for that discipline group of people. Since then, I&#8217;ve dabbled all over the place, so I had a couple of years where I ran a software as a service, helping schools in England improve their national ranking. So again, very data centric. And then throughout my career I&#8217;ve gone through and I spent a couple of years as a senior data engineer in defense in the UK, so really helping &#8217;em upskill in using cloud platforms and how they do data, and was using GenAI to help with content creation, so scripts, audio, imagery, things like that. And now I&#8217;ve moved from the UK to the US at the start of this year, and I run a consultancy where I focus mainly on enabling analytics. So lots of data engineering work and data strategy. And that latter part is really where this kind of producer-consumer framework came from, as I&#8217;ve been helping companies new to data build the foundation. So really getting them set up on the right path. And that&#8217;s how this kind of came to be.</p><p><strong>Shane</strong>: Let&#8217;s do a little bit of anchoring around terms, &#8217;cause you use slightly different terms than I do, so we&#8217;ll anchor those. So just talk me through this idea of how you define a data producer, a data consumer, and the concept of a handover.</p><p><strong>Rob</strong>: So I think a producer, in very broad strokes, is an individual, a team or a system that produces, which means to generate or capture, data. So that could be a thermometer reading temperature, it could be a CRM system, whatever it is. A data consumer is someone who takes in that data for a purpose. And so there you&#8217;re thinking they need data to look a certain way so they can use it to answer certain questions. 
And then a handover is really this interface where the two meet. So that&#8217;s where a producer is literally handing over, in this case, data to a consumer. And so there you might think about terms like data contracts; you might have quality checks, compliance, whatever it is. And so I build these networks on those three ideas really.</p><p><strong>Shane</strong>: So the language I tend to use is, I was a great fan of a TEDx video called How to Make Toast. And it&#8217;s around system thinking, so very much the same. And he talks about nodes and links. So for me, I talk about nodes and links. A node is where a job is done, something happens, and then the link is the handover to another node.</p><p>A data producer, in terms of data being created in the CRM or a sensor creating an event, is a node, and when the data moves somewhere else to get worked on or consumed, that&#8217;s a link to the next node. I&#8217;ll jump between your terminology and my terminology all the way through, as I always do. So we&#8217;ve got this idea of producers, consumers, and handovers, or nodes and links. How do we use them? Why do we care? Where&#8217;s the value?</p><p><strong>Rob</strong>: I think the value is really in unifying other things that already exist. So lots of companies, for instance, might have an organization chart, which gives you an almost vertical view of the roles or teams and where they sit. But as we know, lots of data work happens horizontally. But there again, it&#8217;s not to replace org charts, but it helps enrich them because these producer-consumer diagrams are very workflow driven. We&#8217;re trying to capture a flow from start to end. So that&#8217;s one use case where it lets you overlay these different artifacts you already have. As I said, an org chart is typically representing people. You might have your architecture diagrams, which represent technology. 
You might have data lineage or flow diagrams, which capture how data&#8217;s moving through these systems, and all of them, to some degree, you can represent as producers and consumers &#8217;cause they&#8217;re either generating or using data. And so I think this is really powerful &#8217;cause it helps give you a unified view of your data estate. All of these artifacts are really good, but together they give you that workflow-driven kind of diagram, which helps unify everyone and reduce organizational friction.</p><p><strong>Shane</strong>: I agree. I think the idea of using a map or a diagram to visualize a concept is really valuable as long as we don&#8217;t overload it. We don&#8217;t make it too complex in itself. So those maps should be as simple as possible. And so we can use it to map systems. We can, like you said, use it to map org charts.</p><p>We can use it to map flow of data. We can use it to map data layers and the rules on each of those layers. We can use it to map team design. We can use it to map a flow of work, as you say: somebody works on something and then hands it off to somebody else who uses that. I think, if we go back to some of the original data mesh thinking, there was this idea of input and output ports.</p><p>So we&#8217;re talking about that: somebody does a bit of work and it gets handed over to somebody else who does another bit of work, and we&#8217;re just visualizing that workflow or that flow. Is that how you see it?</p><p><strong>Rob</strong>: That&#8217;s exactly it.</p><p><strong>Shane</strong>: Let&#8217;s give a concrete example. Talk me through an example of one of these maps.</p><p><strong>Rob</strong>: Going through an example is really useful. 
&#8217;Cause there&#8217;s also this concept that you can look at different granularities, and I showed diagrams of that in the article, where at a very high level you might just have your external system, which is producing data, and then you bring it internally into your architecture, where you have your own processes for data.</p><p>So there we&#8217;ve got the external being the producer, the internal being the consumer. But if we look more deeply internally, we might have a data preparation layer where we ingest, do quality, create our data models, and then the reporting function, which takes from that middle block, which is now both consuming data and producing data. So you get this cool thing where, as you add nodes, things can turn from just being a producer to a producer-consumer. And so I think that&#8217;s an example where these networks are dynamic, but they also can have as much information as you want &#8217;em to have. So to your point, at a very high level, you might want a very small number of nodes and keep it almost like a conceptual model. And then as you go down, you might want to get into the real nitty gritty, detailed view to understand exactly, for instance, which technologies are doing what, which teams are responsible where, and things like that.</p><p><strong>Shane</strong>: I think the level of detail can vary. So for example, if we think about layered data architectures, which have been around for donkeys, but are now hot and popular again because of the medallion architecture, we can have a very high level map of nodes and links where there&#8217;s five: source system, where the data&#8217;s generated, bronze, silver, gold, and then the consumer.</p><p>And that&#8217;s only five nodes, and it tells a very simple story. Now, what we know as data people is there&#8217;s a whole lot of extra complexity. Where are the data quality rules being applied? Are there data contracts across each of those nodes? 
There&#8217;s nodes within that. Whenever we get into silver, how are we conforming data?</p><p>How are we mastering data? How are we bringing in reference data? A whole lot of things. Where are we defining those metrics? So we can take that very simple diagram with only the five nodes, and then we can break it down into even more detail when it makes sense, so we can present the same system with two different stories depending on who our audience is and the story that we want to tell.</p><p><strong>Rob</strong>: I think that&#8217;s really powerful. That&#8217;s exactly right. As you said, you might want the data engineering team to understand what goes on within the silver layer, but your business analysts who are consuming at the end, they don&#8217;t necessarily need to know, care, or even have access to the data that&#8217;s there. And so that&#8217;s exactly right. It&#8217;s powerful because you can create variants very simply, which tell different stories for the audience, and it lets you, again, unify all those other bits as needed.</p><p><strong>Shane</strong>: One of the other ways we can use it is this idea of, when there&#8217;s a handover or when there&#8217;s a link, what&#8217;s actually involved. So what is handed over? Is it just a blob of data? Is it a blob of data with a schema? Is it a document? What does the person hand over to the next person?</p><p>What does the producer hand over? And what does the consumer expect? What do they need to be handed over to them to do their step in that process? And a lot of people don&#8217;t think about that. 
We don&#8217;t bring in that lean system thinking to say, actually, where&#8217;s the waste, in terms of the producer generating some stuff that&#8217;s never used, but also waste because the consumer needs some information that&#8217;s never provided, and now they&#8217;ve gotta go and do all that work again and introduce that wasted effort that&#8217;s already been done, because nobody told them that it has been done or what it was.</p><p><strong>Rob</strong>: It plays into a bigger story. In a very ideal case, if everything&#8217;s perfect, these are very unidirectional networks. You take data from source, process it, it&#8217;s perfect, so it&#8217;s consumed. In reality, it&#8217;s more complex because you want to empower consumers at each handover to be able to report errors and feedback and say, you&#8217;ve given me this with a bunch of columns that I dunno what they are &#8217;cause they&#8217;re undocumented or they&#8217;re new in the API or for whatever reason. And so it&#8217;s really important to have both the producer-driven flow, which is in my mind left to right from source to target, and the feedback loops that go in the other direction. And so then you can have both the functional benefits of fixing errors quickly, knowing that they exist, what have you, but also, at an almost people level, remove some of the idea of playing the blame game, because you&#8217;re no longer throwing things over the fence. It&#8217;s about having well-defined processes so that everyone can work together to fix these problems, and I think that&#8217;s really important.</p><p><strong>Shane</strong>: Yeah, and if we look at it again from that system thinking, we can have different lenses on what those processes are. So for example, if we look at the flow of data, we can look at data contracts. 
And we can look at data contracts as more than just system of capture to the people doing the data wrangling.</p><p>When we get it into that rule layer, we can actually have data contracts between every node and link. So if I&#8217;m transforming some data and creating a metric, what&#8217;s the contract for that metric? Is there a standard YAML format? Is there certain descriptive metadata I&#8217;ve gotta store?</p><p>We can actually put data contracts in between each of those steps.</p><p>If I look at ways of working, we can bring out some of the patterns from Agile and Scrum, where we can do definition of ready and definition of done. So the consumer can say, Hey, if I don&#8217;t get given these things, then you don&#8217;t meet my definition of ready.</p><p>Therefore, I&#8217;m not gonna start work because I don&#8217;t have the tools I need to do the job I need to do. As a producer, I can have a definition of done. Here&#8217;s all the things I would expect myself or anybody else in my team to have done before I&#8217;m saying the work&#8217;s done and it&#8217;s ready to be handed off to the next node or the next consumer.</p><p>So we can bring that idea of gates and patterns and pattern templates into both the way the data works, the way our data platforms work, and the way our teams work.</p><p><strong>Rob</strong>: Exactly that, and that&#8217;s why I really like this idea. I think it&#8217;s always gonna come up alongside the idea of data contracts, which is that acceptance criteria between producers and consumers for data sets, effectively. But producer-consumer diagrams are really important for telling you what needs to go into a data contract, &#8217;cause it&#8217;s far more than just technical, as you said, because if the analysts are receiving data from an engineering team, they need and expect a certain level of documentation, formats, whatever it is. 
And so it is really about, yeah, removing friction and unifying those requirements and expectations from both sides. So that&#8217;s exactly why I really like these. The framework is the pattern to use.</p><p><strong>Shane</strong>: And I think one of the things about data contracts is everybody looks at it from a technical point of view. They look at it as what is the schema, what is the field type? What is the grain? But they forget the actual word contract, which basically means an agreement between two parties.</p><p>So it is a negotiation between the producer and the consumer about what&#8217;s needed for both sides to do their job well. That&#8217;s how we should treat it: we should actually negotiate the contract, not just treat it as a technical task.</p><p><strong>Rob</strong>: I a hundred percent agree, and I think that&#8217;s probably improving, but at least in lots of cases I&#8217;ve seen, data contracts are taken to be technical and not have the surrounding pieces, which, I agree with you, they&#8217;re missing.</p><p><strong>Shane</strong>: I&#8217;ve never seen a data contract that has the level of documentation involved in the handover unless it was an API contract, which is in theory self-documenting. But we don&#8217;t put APIs between our bronze, silver, and gold layers. We should probably, we should definitely, put contracts in between each of those moving parts.</p><p>We don&#8217;t put data contracts in between the way we capture requirements as a business analyst and hand them off to a data team, if we are running a separate team that does requirements gathering and a separate team doing the build, which again, I don&#8217;t actually suggest you do. Put them in the same team; it solves a whole lot of problems.</p><p>So one of the other things you have done is you&#8217;ve used that idea of consumers and producers to actually map it to tools and technologies that are being used. 
So just wanna talk us through how you do that and where the value from that pattern is.</p><p><strong>Rob</strong>: As I said, I view these as a way to unify these different lenses: technology, people, process, data. And so when we talk about, for instance, ingesting data from an external system, we&#8217;re gonna use a technology to do that. So I&#8217;m gonna use the example of the Azure ecosystem, because that&#8217;s what I mainly work with. But with any ETL tool, for instance Azure Data Factory, you can then say this edge of the diagram, this handover, is going to be performed by Azure Data Factory. And then internally, once you&#8217;ve ingested it and gone through your medallion architecture, that&#8217;s gonna be some data warehouse, Postgres, Synapse Analytics, whatever data platform you are using, through to, once you get to the reporting stage, you&#8217;re gonna have a BI tool, typically, whether that&#8217;s Python, Power BI, Tableau. And so really the technologies in general map to the handovers. There are cases where they can be the nodes as well, the producers or consumers. And then I think what that also lets you do, again, the benefit of that, is it lets you have an informed decision about whether you really need this technology. So when you&#8217;re thinking about cost, performance, capability, when you&#8217;re talking about whether your team has the skills to use this technology or you have to learn something new, again, by bringing it all together, you can make an informed decision about that. So it&#8217;s much less the preference of the architect or listening to a vendor pitch, and it lets you get that view of what you need to use.</p><p><strong>Shane</strong>: I think the other thing I liked in the diagram you used as an example is you also bring in the personas. So you are mapping out the flow of the data effectively. 
Then the technologies, the tools that are being used in each step of that flow, and then the personas you&#8217;re expecting to use those tools.</p><p>So you are differentiating between the tools the data engineers will use and the tools a persona of data analyst will use.</p><p><strong>Rob</strong>: Yeah, and as I said, a lot of the people I work with and my clients are less mature with their data capability. And so for them, a lot of them start with the Google definition of data engineer or data analyst. So it&#8217;s much more useful then to use these types of diagrams, map the technology, and really distinguish the responsibilities of the personas so then they can hire the right people, get the right skills. But it also again reduces friction, because if you are brought in as a data engineer and you&#8217;ve got clear roles and responsibilities, you&#8217;re not gonna kick up a fuss because you&#8217;ve been given something that you deem as pure analytics or whatever the case is. And so I think again, that&#8217;s really useful to have because it gives you that unified view of your data estate and lets you make informed decisions about technology, team structure, and how data flows from source to target.</p><p><strong>Shane</strong>: The thing I like about it is it tells me a story. It&#8217;s a map at a high level and I can understand it, I can infer some stuff and I can get a bunch of questions. So the one I&#8217;m looking at, you start off by saying there&#8217;s a source system that goes across to a data lake, that goes across to a data warehouse, that goes across to an analysis BI layer, and then across to a report.</p><p>So I can look at that and go, I think you&#8217;re running a three tier architecture: data lake, data warehouse, and an analysis BI layer. And then you&#8217;ve got source systems of data being produced, and then reporting content on the right hand side where people go and use it. 
And then you are breaking it down to tools.</p><p>So you&#8217;re saying that effectively Azure Data Factory has been used to extract and load the data into the lake, Synapse Analytics has been used to munge and wrangle the data, Power BI has been used to create the analysis BI layer, and then PowerPoint and Excel have been used for the primary reporting.</p><p>And then you&#8217;ve got the idea that data engineers sit on the left and do half the work, and data analysts do the other half. And so I can look at that and I can go, good. There&#8217;s no Omni, there&#8217;s no ThoughtSpot. I can get an understanding quickly of the technology and I go, yeah, you&#8217;re a Microsoft stack, I&#8217;m not seeing Snowflake. I&#8217;m not seeing Databricks. I know that effectively you&#8217;re gonna be on Azure, not GCP and not AWS. I can see you&#8217;ve got two sets of personas in there. And that raises a question, because in the way you&#8217;ve laid it out, it looks like the data analysts are reaching back into the data warehouse.</p><p>So I can have a question of are they read only, or are they actually able to write transformation code in the warehouse? Your analysis BI layer is Power BI, so I&#8217;m just gonna assume that it&#8217;s a DAX, semantic-layer kind of thing, and I&#8217;m gonna ask you questions around where metrics are defined. Are they only in that layer or are they somewhere else?</p><p>Your data lake, I&#8217;m gonna assume it&#8217;s Azure Data Lake Gen2 or something like that, even though you haven&#8217;t specified it. So I&#8217;m gonna ask you those questions. So again, I can get a really quick map of that environment, of that estate, and I can ask you a whole bunch of questions for things that I just want to know because I need to know, or I&#8217;m just nosy and I wanna know. 
And that&#8217;s the value of these maps.</p><p><strong>Rob</strong>: Yeah, and I think what you&#8217;ve just touched on is really useful in that you can create different views, like we discussed earlier. So you might have this, which is your more executive-focused story, and then for your technical users, your power users, whomever, you might have that finer grain that you&#8217;re talking about, where you go into more specifics about technology, about the rules. And again, you could also go into more detail. For instance, in my experience, a lot of frustration and friction comes from the handovers, from that mismatch of expectation, definition of ready, definition of done. And so this lets you account for some of those. It lets you see quite clearly who owns which parts.</p><p>Is it within one team? So the analysis BI to the report layers are both owned by the data analysts. So the part of the story there is that they should be able to do that internally. Whereas if you look at the handover from data engineering to data analytics, there we might need some cross-team work, and you have to work out the processes for that. But again, you&#8217;re informed of that with this diagram, so you can start to flesh out how that&#8217;s gonna work. Who&#8217;s responsible for what, how communication happens, if it&#8217;s a Slack conversation, a Jira ticket, whatever you want to use for that kind of problem tracking. Those are the extra layers of information you can get when you go to finer granularity.</p><p><strong>Shane</strong>: And I think it&#8217;s that idea of a boundary, because my eye is naturally creating boundaries and saying these things, these nodes, these links are in the boundary of a data analyst, and these other ones are in the boundary of an engineer. 
And now I&#8217;m gonna worry about whether the boundaries look weird to me or whether there&#8217;s a touching of the boundaries or an overlap of the boundary.</p><p>One of the things you do though, when you diagram this, is you&#8217;re effectively bucketing or putting together the consumer and the producer node with no line. And I&#8217;m assuming that&#8217;s your visual way of saying that a person consumes something and then produces something, and that in my head is within the node before they hand off to somebody else.</p><p>Is that just a visual style of the way you do it?</p><p><strong>Rob</strong>: It is. So it was trying to capture the logic that, for instance, the data warehouse consumes from the previous node, the data lake, and that same data warehouse produces to hand over to the BI tool. So again, you&#8217;re right, you can look at this and view this in different ways. For me, it&#8217;s really just about having the node labeled as either a producer or a consumer or a producer-consumer pair. And yeah, there&#8217;s different ways of doing this. This was one of the simplest ways that I could fit quite a lot of information into an easy-to-read diagram. And so that&#8217;s a complete design choice that I&#8217;ve had success with.</p><p><strong>Shane</strong>: And I think it&#8217;s just a choice of language. Now I understand that when I look at those, they&#8217;re effectively an ensemble. And I&#8217;m looking for ensembles where there are these consumer-producers, which means there&#8217;s ins and outs within that node, or I&#8217;m looking for one where there&#8217;s only a producer or a consumer task effectively.</p><p>Again, it helps me tell that story.</p><p><strong>Rob</strong>: And in general, obviously you very rarely want to produce data that&#8217;s not being consumed. So typically they will come in pairs, and obviously there&#8217;s always the question of where do you end the diagram. 
So in the case we&#8217;re talking about, the reporting layer says consumer producer, but the producer&#8217;s not connected to a consumer. So again, the bit that&#8217;s missing from there is that there is a decision maker, an executive who takes that report and makes a decision, takes an action. And so that&#8217;s also where I say it&#8217;s as much art as science when you make these kinds of diagrams: there&#8217;s no rule for granularity or your start and end points.</p><p>Most of the time there is a logical start point, but where you decide to kinda end the flow, as I said, can vary a lot, and it really just depends on what you&#8217;re trying to convey with that particular story.</p><p><strong>Shane</strong>: And that&#8217;s the important part: you are telling a story. So I could take that five node one you&#8217;ve got of source system, data lake, data warehouse, analysis BI and report, and I could extend it to the right and actually say actions that are taken off those reports, and outcomes that are delivered by those actions, and value that&#8217;s delivered from the outcome.</p><p>If that&#8217;s the story I wanted to tell.</p><p><strong>Rob</strong>: Yeah, and then you&#8217;d obviously add a third persona, the executive or whatever it is. Exactly. That&#8217;s how easy it is to iterate these and see the difference between where you are and a proposed change to your process, framework, technology stack, whatever it is. It directly gives you the implications and new connections you need to manage and prepare for.</p><p><strong>Shane</strong>: And that&#8217;s the other part: it helps with the story. So for example, in this scenario, if I said, oh look, I want to introduce a new BI tool, consumption tool, last mile tool, I&#8217;m either gonna have to find one that is compatible with the Power BI semantic BI layer. 
If not, we know that the blast impact of that decision is we now actually have to have two semantic BI layers, or replace the Power BI one with one that actually serves multiple last mile tools.</p><p>So again, I can point to something and say, if we do this, then we are gonna affect that, and visually point to the map so people understand. Oh, holy shit, actually that&#8217;s one third of our data layered architecture that we are replacing. If we&#8217;ve only been working on it for six months, maybe.</p><p>If we&#8217;ve been working on it for 10 years and we can&#8217;t programmatically migrate it, we&#8217;ve got a lot of work coming. So again, it&#8217;s really valuable to be able to point to parts of the map and say, we&#8217;re talking about that. We&#8217;re talking about the UK, not the US. That&#8217;s the value of the map.</p><p><strong>Rob</strong>: Exactly. That&#8217;s it.</p><p><strong>Shane</strong>: And then the other thing we can do: we can extend left and right, and we can also change the grain effectively, so we can add more rows. So for example, at the moment we&#8217;re saying that all source systems are producers and they&#8217;re the same. But if I wanted to, I could create different rows.</p><p>I could say we&#8217;ve got relational versus NoSQL sources, or we&#8217;ve got streaming versus batch sources. So if I wanted to, I could just add more rows, more nodes and links as a row, to add more complexity or tell a different story with this map. Couldn&#8217;t I?</p><p><strong>Rob</strong>: Exactly. You can add more rows in terms of different sources or even, like you said, make it modular and take out the Synapse piece and say, what if I replaced it with Databricks? And then you can obviously also account for new data. So in this case, the data warehouse is just producing datasets to give to the analytics and BI tool. But you might decide that you want to also capture system data from your warehouse. So from Synapse, maybe. 
How long your Spark clusters have been running, how many outages you&#8217;ve had, how much it&#8217;s cost. And so you can also kind of, not add new rows, but add an extra dimension where it&#8217;s not just going left to right, but you have metadata that kind of goes up and down as well. So you can add different directions or paths possible in your network.</p><p><strong>Shane</strong>: So do you tend to design the complex map first and then simplify it to tell different stories, or do you tend to start off simply and then add the complexity as different versions as you go? Which way do you work?</p><p><strong>Rob</strong>: So I do something very weird, where I start with a super high level simple flow. So I really just want to get the conceptual model right, and then I skip the middle bit and jump to the perceived finest grain and do the real in-depth: these data quality rules, these data contracts, these technologies. Then I backfill the middle bit, because I find it really easy to question the requirements at that very high level, and then, by jumping to the really low level, find the real complexities. So like I said, if there&#8217;s a skills mismatch with the team versus technology, I can then propagate that upwards. I&#8217;ve got a few horror stories where I did go from top to bottom, got to the bottom, and then realized something didn&#8217;t work.</p><p>And you have to backfill the whole thing then. And even though you&#8217;re only changing one node, it inherently sometimes changes the nodes that it&#8217;s paired with, obviously &#8217;cause things work differently and connect differently. So that&#8217;s how I work. I think it can work both ways, starting at the very high level or the very nitty gritty, detailed level. I think it&#8217;s a matter of preference and how good your initial requirements are. I find the better the requirements. 
The easier it is to start at the high level and then go from there.</p><p><strong>Shane</strong>: And then you are handcrafting these, you are drawing them as if they&#8217;re pictures. You&#8217;re not using a tool to help you with this.</p><p><strong>Rob</strong>: A little bit of both. So I do start very conceptual, hand drawing in a tool like draw.io, but I do automate some of this with Python in terms of just treating it like a knowledge graph. And you can definitely create a simple CSV file with your nodes and edges and get it to populate a graph. So I do a hybrid approach to get these ready.</p><p><strong>Shane</strong>: It&#8217;s interesting that there&#8217;s a gap in the tooling. I think at the moment, we used to have things like Sparx Enterprise Architect, which was a horrific product to try and use. I used to do enterprise architecture and I&#8217;d go into an organization and be forced to use that tool. If I knew in advance, I wouldn&#8217;t take the gig; it just made you so slow, and such waste. And like you, I just use draw.io. But it&#8217;s a graph problem: we&#8217;re talking about nodes and links, effectively ins and outs and relationships. So it&#8217;s surprising that there&#8217;s actually not a great tool for defining it, but then also creating the simple stories.</p><p><strong>Rob</strong>: I say, I think the closest I&#8217;ve seen is Mermaid.js, so it&#8217;s diagramming as code. So it works, but you just don&#8217;t get the customization to make it more user friendly. So it&#8217;s great for spinning up dirty diagrams that a technical team would love, but to sell this to an exec, it falls short, exactly as you&#8217;re talking about.</p><p><strong>Shane</strong>: You&#8217;re telling a story, so therefore it helps if the story&#8217;s attractive to look at, not ugly. And yeah, I&#8217;ve done ones where I&#8217;ve used some of the LLMs, so I&#8217;ll put it in as a CSV, get it to give me the Mermaid syntax and then put it into a Mermaid viewer. 
But it&#8217;s ugly.</p><p>It tells a story, but not in the most attractive way. I think the other thing is, again, overcomplicating at your risk. So if you can keep it left to right, in an English-speaking country people will understand it. As soon as you start branching off to if/else statements, where you&#8217;re coming along and you&#8217;ve gotta go up and/or down, again, that just increases the complexity of the diagram and the story you&#8217;re telling.</p><p>So if that&#8217;s the story you need to tell, then do it. I think the other thing is how many lenses or dimensions you bring to it. So for the one that you&#8217;ve got here, there&#8217;s a flow of the data and the data layers. There&#8217;s the technology and then the personas. And I find that three is normally the maximum you get to before, again, you start bringing complexity in.</p><p>So if you add another two in there, you&#8217;ve gotta do that consciously. You&#8217;re consciously saying, I want to make this a more complex technical diagram than a simple map.</p><p><strong>Rob</strong>: Fully agree. And again, it&#8217;s all about knowing your target audience and the story you wanna tell. So again, I think these are not your super technical architecture diagrams, and they&#8217;re not meant to replace them. They&#8217;re meant to complement them by giving a view which is consumable by your executive team, your managers, whomever else, decision makers who don&#8217;t know what the icon for Databricks is, for instance. So when you show them an architecture diagram with networks and tool-specific icons, it&#8217;s noise, and this is about almost filtering that into something useful for that audience.</p><p><strong>Shane</strong>: One of the ways I use this nodes and links format and this idea of maps is as a workshop with data and analytics teams when we want to change the way they work. So the way it works is I get the team together. 
I basically put something on the left and something on the right, either on a wall if it&#8217;s in person, or on a virtual whiteboard if it&#8217;s not.</p><p>So an example would be data sources on the left, information consumers on the right. And I ask &#8216;em just to brain dump. Brain dump all the data sources on the left and brain dump all the people that use whatever you produce on the right. So I start giving a bit of a map. And then I say to them, just use some stickies and do a sticky, a node, for everything that you do to get the data from that left to that right.</p><p>It&#8217;s always interesting. So some people do very high-level stickies. Some people do very detailed. At this stage I don&#8217;t care, I don&#8217;t give them any boundaries. And then once they&#8217;ve done that, I get &#8216;em to group those stickies together. So effectively, where you&#8217;re doing the same task, put it in the same area.</p><p>So your idea of, where it&#8217;s in, in and out, and it looks like it&#8217;s the same. And from there we now have a flow of work. I can get them to use the dot pattern to say, where do you think it&#8217;s broken? Where do you wanna invest and change? And a whole lot of other things. But I find it a really quick way of getting a team to document their processes without endless interviews and documentation.</p><p>And also it&#8217;s really interesting when a leader sits in the room and they use the words, &#8220;But that&#8217;s not how we do it, right? We do this,&#8221; and the team just laugh at them and go, &#8220;No, this is always how we do it.&#8221; Or when you have two teams that actually have completely different processes, there&#8217;s some things that are shared and there&#8217;s some things that aren&#8217;t. Kind of looking at it from that lens, give me some examples of how you actually create these diagrams. Is it just you? Have you ever done it in a team environment?</p><p><strong>Rob</strong>: So there&#8217;s two ways that I&#8217;ve had good success. 
One is where I&#8217;ve been brought in almost as a contractor, and so they&#8217;ve given me the requirements and a few hours of someone&#8217;s time, so I go through the requirements with them and then create this myself. Then the value is really in the playback and discussion session. But like you, I&#8217;ve also had success doing a whiteboard activity where we just go through with maybe one of the data teams, the kind of technical hands-on person, to create the flow. Then we go with their manager independently to create the flow. Then we go with the exec level and create the flow. And then you get, like you said, very different viewpoints of how they think things are working. And that&#8217;s how you get real change at a process level because it&#8217;s not what people thought it was. And so I&#8217;ve had great success with that. And then it tends to end with a session where we all come together, put them all up, and go through and discuss differences, similarities, and often we come up with a new truth, which is then the one that&#8217;s adopted and implemented. So it&#8217;s normally some Frankenstein monster of all of them. But it turns out that&#8217;s the one that&#8217;s useful because it brings together the different views people have and brings together the best of each way of working. And so that&#8217;s really where I think the best lessons learned are and how you actually make change using this as a tool. Yep.</p><p><strong>Shane</strong>: And I can imagine when you&#8217;re doing that again, now you&#8217;ve got, say, three maps, right? To keep it really simple. So three expectations of how the system works. To help get agreement, you can point to parts of the map. You can point to the executive part of the map where it&#8217;s got the words AI agent and you can say nowhere in the other two maps does that have any idea of LLMs or agentic behavior. So we need investment, we need to add that node because it&#8217;s just missing. 
We&#8217;re not investing in that right now. So by pointing to things in the maps, you can get agreement to add things and then you can get agreement to take things away, I&#8217;m assuming.</p><p><strong>Rob</strong>: Yep, exactly that. And another similar example is where you might just need to end up updating their business glossary. So it might just be, and this is a real case, where the IC, so the technical contributor, versus the executive had different definitions of dataset, and so their flows looked very different. And it&#8217;s just because dataset was taken for granted to mean dataset, and there we just went into the business glossary, added a new entry, and moving forward communication was easier &#8216;cause they were talking the same language. So you can both modify things and update how they work.</p><p><strong>Shane</strong>: Yeah, actually that&#8217;s a really good point, that when you&#8217;re doing these maps, you need to be iterating a bunch of definitions, a business glossary, so that if there is a box, if there&#8217;s a node and it&#8217;s got a word, that word needs to be defined. Because otherwise everybody&#8217;s gonna look at the map and go, Christchurch, ah, that&#8217;s Christchurch, New Zealand. No, it&#8217;s Christchurch in the UK, in Dorset. Oh, okay. Actually, it needs to say Christchurch, New Zealand and Christchurch, UK. Otherwise we are gonna look at different parts of the map thinking it&#8217;s the same thing. I think there are two other areas where I&#8217;ve seen value during this process. When I get brought in to do data blueprints for organizations, I will create these as a way of articulating a story like you said. It will help me to understand the thing I&#8217;m trying to map, the system I&#8217;m trying to define as a blueprint. 
And then use that to test and iterate and get feedback on whether I&#8217;m on the right track for what the organization thinks they want.</p><p>And the second one is when you go in and do a review or a stock take. I do exactly what you said as well. I will talk to the technical people and get them to help me draw the nodes and links diagram or get them to do it. And then I&#8217;ll go read the technical documentation, the solution design, any documentation, and see how well it conforms to their understanding, because that means one of two things: the documentation is out of date, which is typically the case, or the people have an impression of how the system&#8217;s working, but that&#8217;s not actually what&#8217;s happening.</p><p>So those two use cases for this pattern, I&#8217;ve found, are really valuable as well.</p><p><strong>Rob</strong>: Yeah, I think we&#8217;ve got a very shared experience with those. The last one, which just came to mind, was there&#8217;s an educational piece, which is really interesting in that when we talk about these nodes and edges, producers, consumers, as I said, I like to use &#8216;em for different use cases, but for instance, executive level people may not know what a data flow diagram or data lineage is, but if they&#8217;re happy with this as a conceptual map, source, data warehouse, analysis, so on, then you can give the example, for instance, that if you instead make the nodes datasets, then if you think about it, it&#8217;s really just a data flow diagram or a data lineage diagram. And so there&#8217;s a nice educational piece where something that they&#8217;re comfortable with can help them understand something more technical that they don&#8217;t run into. And I think that&#8217;s actually never the intent, but always a nice side effect of using these.</p><p><strong>Shane</strong>: I think we need to be careful there though, because all data lineage diagrams look awful. They look incredibly complex. 
They look like a Frankenstein version of the London Underground. And therefore we need to be really careful that telling that story has some value.</p><p>Because it&#8217;s a tool for data professionals to be able to go into the detail to find the bit of nodes and links that they need to solve a problem. I think it&#8217;s like data models, and, ghost of data past, the enterprise data modeler would print everything out on A0, have it up on the wall and be very proud at the size of their data model, the number of nodes and links and how complex it was. Nobody else gave a shit. In fact, it actually did them a disservice because it was like, I don&#8217;t understand that. And I think we&#8217;ve gotta be careful with data lineage as well. That actually it&#8217;s a tool for us, for data professionals. It&#8217;s not a great map for information consumers.</p><p>So I&#8217;m with you and understand the concepts, and people can look in and go, oh yeah, it looks like a really complex version of what I understand. I&#8217;m probably not gonna go near it &#8216;cause I don&#8217;t need to.</p><p><strong>Rob</strong>: So yeah, exactly that. When I say to help &#8216;em understand what it is at a conceptual level, it&#8217;s not an excuse to start showing up with those lineage diagrams. However, it does mean if they know what a lineage diagram is and the consequences for contracts, observability, things which can affect cost, value, have implications, it&#8217;s then easier to get time or money assigned for those types of projects because they have some awareness of it, versus you rocking up out of the blue saying our observability is a mess. So I fully agree. I&#8217;m definitely not saying to try and turn the execs into data lineage experts, but again, just to increase awareness. Something I say a lot is, just as companies want to become more data-driven, data engineers should want to become more business driven. 
And so it&#8217;s all about that.</p><p>Just, I&#8217;m not expecting data engineers to know the business through and through, but we should be aware of key metrics, processes, streams of revenue, what have you. And so I think it&#8217;s just about the conceptual awareness more than anything else.</p><p><strong>Shane</strong>: I think the other thing is the complexity of the map also tells me a story. In our product, we run a relatively simple three-tier architecture. We have history, design, and consume. And then within our design layer, there are effectively three objects you can create. You can create a concept object, which is a list of keys for a thing: customer, supplier, employee, product, order, payment.</p><p>You can create detail about it: customer name, product SKU, order quantity, payment dollar amount. And you can create an event, a relationship between them: customer order product. And that&#8217;s it. There&#8217;s only three types of objects you can create. And what that means is when I look at our lineage graph, it may have lots of left to rights, but the number of columns in it is very light, which means when I have to troubleshoot, I have a very short conversation with myself.</p><p>Is it the history layer? Is it one of the concept, detail or event in design, or is it the consume layer? When I go look at other organizations, they&#8217;re creating transformation code that has lots of create tables, temporary tables in between, and I&#8217;ve now got 16 columns for a relatively simple transformation.</p><p>That complexity comes with cost.</p><p>And we&#8217;ve gotta really understand that. 
And then the other thing we can do is, if we think about it, if these maps become context, if they become metadata, if they become something we can query, we can actually put a boundary around a map and say, hey, if we replace the source system, how many of those nodes and links, how many of those consumer and producer ensembles need to be touched?</p><p>Oh, 250 out of how many? Out of a thousand. Okay. So what we&#8217;re saying now is we actually have to refactor 25% of our entire estate, and we can get a sense of the cost of that change, at a really high level, not in detail, but we can start to really understand how much of an impact on the system we&#8217;re gonna make when we make these types of changes.</p><p><strong>Rob</strong>: Yeah, I don&#8217;t have anything to add. That&#8217;s exactly right.</p><p><strong>Shane</strong>: Which again comes back to this: actually turning this context, this metadata, into actual data we can use is probably something that we should think about a lot more. Because I&#8217;m like you, I just draw them. I&#8217;ve thought about automating them to make my life of drawing them easier, but I haven&#8217;t actually thought about using them as a global repository to help me make better decisions.</p><p><strong>Rob</strong>: Yeah, I think there&#8217;s something really interesting there because, as I said, I see these very much as a complement to those other artifacts that already exist. Your org charts, your architecture diagrams, data dictionaries, business glossaries. So I think there is a really powerful layer there, that if you could bring all this together as your context layer and then, yeah, use that again just as context.</p><p>I think there&#8217;s something really powerful you could do with that. I dunno of anyone or anything that&#8217;s implementing that or even thinking about that yet.</p><p><strong>Shane</strong>: And it comes back to EA, Sparx. 
And all those standards and the tools that complied with those standards or applied those standards, that&#8217;s what they were doing. They were bringing all these different dimensions or lenses about everything we know about an organization. The problem was the tools were just horrible to use.</p><p>They weren&#8217;t friendly. And then the diagrams they produced were ugly. So I think that&#8217;s the key, is you&#8217;ve gotta make it easy to create and you&#8217;ve gotta make it easy to consume. Producer and consumer.</p><p>But do it in the way you define your system as much as the way you define the way you work, the way you define your technology, your architecture, and your data flows.</p><p>So yeah, I think that&#8217;s a good point.</p><p><strong>Rob</strong>: &#8216;Cause that&#8217;s really what I try to do. I&#8217;m a big fan of Dylan Anderson&#8217;s posts about people, process, technology and data, the four pillars. And I always use these in that context of bringing those four together and how you build a data capability or data strategy, which in turn helps you achieve your business strategy, right?</p><p>So it&#8217;s all about that kind of hierarchy of do what we can and hope that we&#8217;ve done enough for it to propagate up into something more meaningful.</p><p><strong>Shane</strong>: I agree. So when I do the data blueprints, I focus on team design and ways of working, flows of work, as much as I focus on architecture. &#8216;Cause I got sick of strategies or designs that were just a bunch of technology boxes and none of the other things that were important. The other thing I do add though, is I start with measures of success, which is, if we&#8217;re gonna spend all this money changing what we do or implementing this new platform or whatever, what does good look like? How do we measure the investment was worth it? Is that increasing the amount of self-service being done by people outside the data professionals? 
Is it data or information being delivered faster or with a higher quality? What does it actually look like if we are successful after we spend all this time and money? &#8216;Cause I&#8217;m surprised at the number of people that don&#8217;t even think about that.</p><p>They just deep dive straight into the architecture map.</p><p><strong>Rob</strong>: Yeah. Whereas I think, again, I&#8217;m fully aligned with you: stuff like the architecture map is almost a means to the end. It&#8217;s how you achieve the goal. It&#8217;s not the goal. And so I also start with the people and process driven part. What do we want to achieve? Ways of working to achieve it. How do we know we&#8217;re successful? And then as I said, things like your team design or responsibilities, they then inform my architecture choices. A lot of times I&#8217;m never gonna suggest to someone who&#8217;s got two data people on their team to try and pick up an enterprise-level data platform plus storage, plus BI, or whatever the case is.</p><p>It&#8217;s all about almost the minimal system to achieve the goal and to hit success. And that&#8217;s again why I think this just helps my thought process. Keep it simple, keep it trackable, and just have impact versus bells and whistles, which you can always add on later as and when you need them.</p><p><strong>Shane</strong>: And again, going back to that diagram you did, I can also see ways of identifying who we&#8217;re gonna hire. In the one that you drew, it talks about Power BI being the semantic BI layer, PowerPoint and Excel being the primary reporting last mile tool. Then you&#8217;re gonna hire analysts that are used to using GUIs, draggy-droppy, those types of things.</p><p>Maybe a lot of Excel, which makes sense going back to the DAX formulas. But if you get somebody that&#8217;s a hardcore Python coder who wants to just use a CLI, there&#8217;s gonna be a mismatch between the system that&#8217;s in place and their expectations. 
Now, you can deal with that by giving them a different set of tools, but now we&#8217;re gonna have this conversation of how the hell does a CLI with Python code talk to the semantic BI layer?</p><p>Because, sure as shit, they&#8217;re gonna wanna punch back into the data warehouse layer and use the data or even back into the lake because that&#8217;s what they&#8217;re used to, the way they&#8217;re used to working. And that&#8217;s okay, as long as you understand there&#8217;s a mismatch and you&#8217;re gonna have to change something.</p><p>But if you go in there thinking you don&#8217;t, now you&#8217;ve got a problem. And I can point to you where that problem is. It&#8217;s a mismatch between skills of your analysts and the system you&#8217;ve built. So I think that&#8217;s important. Again, taking these different maps, different dimensions, and being able to compare them.</p><p><strong>Rob</strong>: Yep. And I think as a consultant, it gives me some validity. I&#8217;ve not just given you a list of tech, I&#8217;ve not just given you an architecture diagram that you probably don&#8217;t understand. I&#8217;ve given you not just the tools, but the personas, the workflows. I&#8217;ve talked through how I went from your requirements to this proposed solution. And again, it&#8217;s transparent. It lets you have useful discussions with people. It lets you align their priorities, whether it&#8217;s cost, people, performance, whatever it is. And that&#8217;s why I really like this. And as you said, that kind of mismatch, identifying it early, you then get to make an informed decision. I know I&#8217;ve said informed a lot. That&#8217;s what these diagrams give you. They give you informed decisions from the start before you are too committed to anything.</p><p><strong>Shane</strong>: It&#8217;s also a decision that can be challenged because again, I can point to a box, I can point to a node, I can point to an ensemble, I can point to a line. 
I can point to a consumer-producer pair and say, that doesn&#8217;t make sense to me. I can point to a handoff and say, that looks like it&#8217;s missing something, or that handoff looks like it&#8217;s waste.</p><p>I can now start to challenge some of those informed decisions with an informed opinion.</p><p>And I think that&#8217;s important. Again, it becomes less conjecture. Still an opinion, but I can actually try and get some clarity on where I&#8217;m disagreeing.</p><p><strong>Rob</strong>: Yeah. It gives you the scaffolding for meaningful conversation and decision making. It&#8217;s less about opinions, or, there&#8217;s still opinions, but it&#8217;s less about opinions without context and more about how opinions fit into a workflow, which involves technology, people, data, all the different parts. Yeah, so I fully agree.</p><p><strong>Shane</strong>: One I remember, back in the ghost of data past when we were doing big requirements up front, and I used to hate it: every now and again, there&#8217;d be one of these mega projects, transformational things, and you used to get a list of requirements, a number of the bloody things.</p><p>And that was effectively the input into any of the system plans and your blueprints. And I used to map the requirements to the nodes, so this node supports requirements 54B, 27C. It helped mitigate some of the arguments of, where did this come from? But I&#8217;m not sure the juice was worth the squeeze. I kind of found it waste.</p><p>What about you? Do you actually map any of these back to requirement statements?</p><p><strong>Rob</strong>: It depends on the level of the requirement statement, so I definitely don&#8217;t religiously apply it to all of them, but I think the requirements in general give me the context, and that&#8217;s how I treat it in general. If the requirement is you need to ingest data from system A, that informs some choices. 
If your requirement is less specific and a bit more hand-wavy or conceptual, then I&#8217;m definitely not going to slog to try and assign it to a node. I&#8217;m going to use it as broad context so that when the discussion comes up, I can describe my choices in context of that. But that&#8217;s it. So I think, yeah, like you in the early days, &#8216;cause this came from experience, the reason why I&#8217;m attached to this idea is I was working a job where none of this was in place and so we had to do this just out of necessity. And so I think, yeah, I&#8217;m like you, I started by very rigidly trying to almost one-to-one map the requirements to the flow, and it doesn&#8217;t work. Or you end up with a very rigid workflow and you don&#8217;t have any kind of freedom to make something better. And so now, yeah, I&#8217;m selective over which ones I directly incorporate versus use as context.</p><p><strong>Shane</strong>: I think actually, as I was thinking about it as well, I often use the requirements to identify where I have complexity in my map. For example, if I have a requirement that data&#8217;s gotta be able to come from the source system, the system of capture or production, and be available in a last mile capability to an end consumer in less than two seconds, I&#8217;m gonna have some kind of streaming architecture. And so if I look at your map, yeah, that is a typical batch-orientated architecture. Like I could stream it, but I&#8217;d be really surprised if it would be streaming with those layers. So if I then have to do another flow, if I have to do another row on that diagram, which immediately makes it more complex, and that&#8217;s only so I can stream, then I gotta justify where that came from, and then potentially I could rearchitect the layered architecture you have to be stream only. 
If it meets every other requirement, but now I can go back to what&#8217;s forcing me to have that complexity, and is there any way I can remove the complexity without introducing more complexity?</p><p>Because sometimes having only one row means it&#8217;s trying to do too many things. And therefore it&#8217;s even more complex. You&#8217;re just hiding it. So yeah, I think actually, thinking about it, those broad brush requirements help me, again, put a boundary around things and say, I have to do this for these reasons.</p><p>If those reasons aren&#8217;t valid or important, then I can stop doing those things.</p><p><strong>Rob</strong>: Yep. I also think some requirements are very specific. And like we talked about earlier, they might only be applied to a finer grain version of this flow. So when I worked in defense, you can imagine there were quite a lot of strict rules about data encryption and personal information masking. So there, I wouldn&#8217;t worry about showing that in, for instance, the high-level workflow, but in the more data engineering focused one, that&#8217;s where I would include it. So again, there&#8217;s a trade-off there, a decision about, it&#8217;s almost twice the work, but it can be twice as impactful to have those two views, at different grains, of this one system. One for the kind of exec level, one for the technical level.</p><p><strong>Shane</strong>: But again, that helps the collaboration conversation. So if you tell me that we need to mask people&#8217;s personal identifiable information in that scenario, their names, their date of births, maybe some of their deployment information, I&#8217;m gonna ask you, where are we masking it?</p><p>Am I masking it in the lake, the warehouse, and in the BI semantic layer?</p><p>Or are we saying actually the lake can hold the raw data? 
When we get to the data warehouse layer, then we&#8217;re gonna mask it, which means we now need to control access to that data lake layer so that only certain people with certain authority can see the data in there. So now I have a boundary, nobody&#8217;s allowed into that layer unless they pass a certain security level. So again, they&#8217;re just helping us make tradeoff decisions and also have a conversation about what happens where and what doesn&#8217;t. And what does the contract look like?</p><p><strong>Rob</strong>: Yeah, exactly. And then again, it gives you that extra view, like you talked about, kinda access control. If you&#8217;re in the cloud, there is a different view of this diagram where you might talk about networking or access management. And again, the exec level probably don&#8217;t need to see that, even if it&#8217;s a requirement, you might just have that as a bullet underneath with a check mark. But then in the more technical view, you really say, we&#8217;ve made this group which has these permissions, and that&#8217;s how we&#8217;ve satisfied the requirement. Again, it&#8217;s all about, yeah, controlling the information you present and at which grain you capture those requirements.</p><p><strong>Shane</strong>: So this idea of nodes and links has been around for ages. This idea of system thinking, it came outta lean manufacturing. It&#8217;s been around for a long time. The idea of business process mapping and understanding the flow of work has been around for ages. The idea of enterprise architectures and diagrams that hold the ability to tell different stories at different levels, been around for ages.</p><p>Why do you think in the data domain, it&#8217;s very rarely used?</p><p><strong>Rob</strong>: It&#8217;s a great question. I dunno the answer. I think in my experience, data people have always, for whatever reason, decided not to learn from, say, software engineering, like data engineers. 
And so data folks tend to be, can be very technical. They&#8217;re almost very cultish in, like, they do data. And I think there&#8217;s very few people, relative to the number of data folks, who actually understand the business and how processes work and how to communicate that. So I think one of the biggest things I learned from academia that I&#8217;ve brought to my career is communication, is presenting to multidisciplinary audiences. And I think that&#8217;s something that&#8217;s just skipped if you go from undergraduate degree to an internship, and that piece of learning is missed a lot of the time. And so I&#8217;m not saying that&#8217;s the only or main reason, but it&#8217;s a practical reason, I think, that I&#8217;ve seen in my career that prevents this kind of thing from picking up traction.</p><p><strong>Shane</strong>: I think we segment the work into hyper specialization and then we start at the lowest level. So we introduce the idea of a data engineer that&#8217;s gonna write code, or we bring in that you&#8217;re gonna be an analytics engineer that&#8217;s gonna write a model in dbt, and we don&#8217;t teach end-to-end system thinking as a framework, as a set of patterns that you&#8217;ve gotta understand first before you can go and do the work.</p><p>If you walk into a factory, and again, I worked in the factory as a kid 30 odd years ago, but you walk the line, you&#8217;d understand the flow when you get onboarded, the flow of work from the beginning of the factory to the end, so that you understood where your station was, what your part in it was.</p><p>And I don&#8217;t think we do that in the data world. I think the other thing is, and one of the reasons I wanted to get you on is that article you wrote, it was simple to understand, for me at least. The thinking is really aligned to the way I think, and that probably helped. 
But what I often find, especially if I look at academic writing, it&#8217;s really research based. It&#8217;s quite technical. It&#8217;s lots of complex ideas that aren&#8217;t distilled into a story I can understand. I read them and I&#8217;m like, ah, I don&#8217;t get it. And so to take that complexity and write it with simplicity is actually really hard. Again, big ups to you for writing an article that distilled what is quite a simple idea, but can be complex, down into something that&#8217;s easy to understand in the written word.</p><p><strong>Rob</strong>: Yeah, I think part of that comes from, even though I&#8217;ve jumped fields a lot in my career, the one consistent has been an interest in mentoring and developing others. And so to do that, you really need to be able to make information accessible, whether that&#8217;s just choosing the right type of diagram, using the right vocabulary, whatever it is. So a lot of these ideas that I am drawn to, yeah, it&#8217;s often something which I&#8217;ve not seen someone else explain quite simply. So I take up the challenge to do it because I think it has value, even if only two or three people read that and say, I now get it. I&#8217;m happy with that.</p><p>That&#8217;s worth it. But I think that kind of thinking, as you said, the systems thinking, the context, being able to understand how these systems come together, the modular parts that they&#8217;re made up of, and then how to communicate that not just between technical teams, so not just between data engineers, but also data engineers and data analysts, but also data engineers with managers, with C-suite. That is, I think, one of the biggest gaps. And especially now with lots of buzz that you see about, we don&#8217;t need junior engineers. There&#8217;s not just a technical deficit there just because you can outsource some work to LLMs. 
There is a real issue of new starters not picking up the, in quotes, soft skills, the communication skills, and learning from the start how to talk about these ideas, to communicate these ideas. And I think that&#8217;s something which is a concern for me, and something which podcasts like this, I think, do a really good job of starting to address, giving people another avenue to learn something like this quite easily.</p><p><strong>Shane</strong>: Yeah, I think there&#8217;s the whole argument right now about the role of the junior or the role of the senior when we all get 10x&#8217;d. It&#8217;s gotta be interesting, because I think education has to change. Because the skills that we learn, the technical skills of how to code, are gonna be supported by the tools of the future, which actually just leaves us the system problem: actually understanding how to daisy chain those tools and that code together to become far more valuable.</p><p>But just go back to that writing process. I find it relatively easy to write complex words. I can just brain dump and I can just write lots of complex stuff. But to then refine it down into something that is clear and simple, that reduces the complexity and increases the cognition when you&#8217;re reading it, I find that the hard effort. That&#8217;s where I&#8217;ve really gotta focus and iterate time and time again and spend my time. Is that what you find, or is it like the way you do your diagrams, you find it slightly different in terms of the process you use?</p><p><strong>Rob</strong>: No, I think it&#8217;s quite similar. Again, from my academic work, I would write quite technical, scientific papers, but then I would convert those into PowerPoint presentations for conferences and stuff. And so I write stuff like this the same way. 
So I go from a technical idea that I understand, and then I always start by creating the diagrams that capture what I&#8217;m thinking, and then the words I can just naturally fit around that. &#8217;Cause once I have the core concept in images, I can just walk through the process of: what do you need to know to understand this image? Then what do I want you to take away from this image? And so that&#8217;s always my writing process of taking something complex and making it more accessible.</p><p><strong>Shane</strong>: I think again, that idea of writing some complex words and then drawing yourself a simple map to see what&#8217;s missing or what needs to be added, that helps, that visual to written and back and forward.</p><p><strong>Rob</strong>: That&#8217;s why I always have a notepad on my desk. &#8217;Cause I always find, even if I&#8217;m reading or learning something, if I can draw it out, I can understand it, because I&#8217;m a very visual learner, like you said. So, yeah, that&#8217;s how I go through this.</p><p><strong>Shane</strong>: So if people wanna follow you and read what you&#8217;re writing and get some more of these cool ideas, these cool patterns in a way that&#8217;s simple to understand, how do they find you?</p><p><strong>Rob</strong>: So the two best places to find me are on LinkedIn, where Robert S. Long, I think, is my handle, and on Substack at Long last Analytics. So that&#8217;s the name of my consultancy. My surname&#8217;s Long, and I like cheesy things, so I like the &#8220;at long last, I&#8217;ve solved your problem&#8221; aspect. So that&#8217;s where you can find me.</p><p><strong>Shane</strong>: I&#8217;ve been reading your stuff for a little while, and before you mentioned that, I only just got the joke as we&#8217;re doing this podcast. I was like, ah, actually, hold on. That&#8217;s your last name. So well done. I like cheesy as well, but normally I pick it up a lot quicker than that.</p><p>Excellent. 
All right, anybody who wants to read what Rob&#8217;s writing, go to at long last Analytics on Substack and hook up with him on LinkedIn. Otherwise, I hope everybody has a simply magical day. </p><h2>&#171;oo&#187;</h2><div class="pullquote"><p><em>Stakeholder - &#8220;That&#8217;s not what I wanted!&#8221; <br>Data Team - &#8220;But that&#8217;s what you asked for!&#8221;</em></p></div><p>Struggling to gather data requirements and constantly hearing the conversation above?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Bu2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" width="387" height="342" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:342,&quot;width&quot;:387,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19725,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/160520537?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Want to learn how to capture data and information requirements in a repeatable way, so stakeholders love them and data teams can build from them, by using the Information Product Canvas?</p><p>Have I got the book for you!</p><p>Start your journey to a new Agile Data Way of Working.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://adiwow.com/168&quot;,&quot;text&quot;:&quot;Buy the Agile Data Guide now!&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary 
button-wrapper" href="https://adiwow.com/168"><span>Buy the Agile Data Guide now!</span></a></p><h2>&#171;oo&#187;</h2>]]></content:encoded></item><item><title><![CDATA[A template to help you define more details in your "fluffy" Medallion data architecture]]></title><description><![CDATA[Putting some more "Meat and Potatoes" into your Data Architectures]]></description><link>https://agiledata.info/p/a-template-to-help-you-define-more</link><guid isPermaLink="false">https://agiledata.info/p/a-template-to-help-you-define-more</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Wed, 01 Oct 2025 23:15:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ddzg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Medallion &#8220;Architecture&#8221; is a good thing</h2><p>Because it has made conversations around Layered Data Architectures cool again.<br><br>But as far as Data Architectures go, it is a pretty light one (maybe that&#8217;s why it has been so popular).</p><p>When you talk about what goes in your Silver layer, and I talk about what goes in my Silver layer, I&#8217;m never sure we are talking about the same thing or if we are talking at cross purposes.</p><p>And it&#8217;s not all the Medallion Architecture&#8217;s fault. When I ask you in what layer you conform your data, you probably describe &#8220;conforming data&#8221; differently to me.</p><h2>Time for an Agile Data Guides Pattern Template</h2><p>So as I tend to do, after I have chuntered about this problem for a while, I then look to see how I might solve it.</p><p>And in this case I decided that a Pattern Template would be the best way.</p><h2>Agile Data Guides - Data Architecture Layers Pattern Checklist</h2><p>So over the past few months I have been iterating and testing the Pattern Template and have got it to the stage that 
I want to share it widely and get some more feedback.<br><br>The template is currently a Google Sheet, as that was the easiest way for me to iterate it.</p><p></p><blockquote><p><a href="http://adiwow.com/5290">http://adiwow.com/5290</a></p></blockquote><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;http://adiwow.com/5290&quot;,&quot;text&quot;:&quot;Get the template&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="http://adiwow.com/5290"><span>Get the template</span></a></p><p></p><p>You can browse this version, but you will need to copy it if you want to actually use it.</p><p>The Pattern Template is open source, so feel free to grab it, use it, abuse it, change it, do whatever is needed to get value from it.</p><h3>Sharing is Caring</h3><p>Any feedback on what was useful, what was pants and what I should add or change next is always appreciated.</p><h2>Quick Overview of the Template</h2><h2>Instructions</h2><p>The <em><strong>Instructions</strong> </em>tab is a quick and very rough overview.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1IRv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1IRv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png 424w, https://substackcdn.com/image/fetch/$s_!1IRv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png 848w, 
https://substackcdn.com/image/fetch/$s_!1IRv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png 1272w, https://substackcdn.com/image/fetch/$s_!1IRv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1IRv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png" width="1456" height="1239" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1239,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:156113,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/175058470?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1IRv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png 424w, 
https://substackcdn.com/image/fetch/$s_!1IRv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png 848w, https://substackcdn.com/image/fetch/$s_!1IRv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png 1272w, https://substackcdn.com/image/fetch/$s_!1IRv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8449410a-2f37-4e54-9cad-713187388cf7_1459x1242.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<h2>Template</h2><p>The <em><strong>Template</strong></em> tab is where you do the busy work. I suggest you copy it to a new tab and name it after your Data Platform or Organisation.<br><br>If you&#8217;re a Consultant, create a separate Google Sheet or a separate tab for each Organisation you work with.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uvMp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uvMp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png 424w, https://substackcdn.com/image/fetch/$s_!uvMp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png 848w, https://substackcdn.com/image/fetch/$s_!uvMp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png 1272w, https://substackcdn.com/image/fetch/$s_!uvMp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!uvMp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png" width="1456" height="1193" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1193,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:152706,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/175058470?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uvMp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png 424w, https://substackcdn.com/image/fetch/$s_!uvMp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png 848w, https://substackcdn.com/image/fetch/$s_!uvMp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png 1272w, https://substackcdn.com/image/fetch/$s_!uvMp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa94f0bc6-ed5a-46dc-b094-9a8d19da66a0_1516x1242.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Define your layers</h3><p>The first step on the Template tab is to define your Data Layers.</p><p>The Template has 4 layers. Give them names: you can use Bronze, Silver, Gold etc., or any other names you use internally.</p><p>Here is what we use for our AgileData Platform.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3zko!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3zko!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png 424w, https://substackcdn.com/image/fetch/$s_!3zko!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png 848w, https://substackcdn.com/image/fetch/$s_!3zko!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png 1272w, https://substackcdn.com/image/fetch/$s_!3zko!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3zko!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png" width="1200" height="277.74725274725273" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:337,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:90036,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/175058470?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3zko!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png 424w, https://substackcdn.com/image/fetch/$s_!3zko!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png 848w, https://substackcdn.com/image/fetch/$s_!3zko!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png 1272w, https://substackcdn.com/image/fetch/$s_!3zko!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed155b04-384b-426a-9bb8-269719bcf1a0_1825x422.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><br>If you need more Layers, copy a Column and paste it to the right, making sure you copy all the lookup cells in each row.</p><p>If you hover over a row 
heading you will see a short description of what  Principle, Policy or Pattern that row is defining.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9HM4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9HM4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png 424w, https://substackcdn.com/image/fetch/$s_!9HM4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png 848w, https://substackcdn.com/image/fetch/$s_!9HM4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png 1272w, https://substackcdn.com/image/fetch/$s_!9HM4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9HM4!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png" width="1200" height="356.80345572354213" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:413,&quot;width&quot;:1389,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:48028,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/175058470?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9HM4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png 424w, https://substackcdn.com/image/fetch/$s_!9HM4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png 848w, https://substackcdn.com/image/fetch/$s_!9HM4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png 1272w, https://substackcdn.com/image/fetch/$s_!9HM4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7069946c-5e86-4220-ab54-fd0949b7be4f_1389x413.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>Click on the drop down in the Cell for that Row and Layer and you will get a list of Options.<br><br>Select one or many of those options; it&#8217;s up to you.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M1Jb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43f594be-48b6-4072-a13b-868128f753fd_1523x532.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M1Jb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43f594be-48b6-4072-a13b-868128f753fd_1523x532.png 424w, 
https://substackcdn.com/image/fetch/$s_!M1Jb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43f594be-48b6-4072-a13b-868128f753fd_1523x532.png 848w, https://substackcdn.com/image/fetch/$s_!M1Jb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43f594be-48b6-4072-a13b-868128f753fd_1523x532.png 1272w, https://substackcdn.com/image/fetch/$s_!M1Jb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43f594be-48b6-4072-a13b-868128f753fd_1523x532.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M1Jb!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43f594be-48b6-4072-a13b-868128f753fd_1523x532.png" width="1200" height="419.5054945054945" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/43f594be-48b6-4072-a13b-868128f753fd_1523x532.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:509,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:100441,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/175058470?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43f594be-48b6-4072-a13b-868128f753fd_1523x532.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!M1Jb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43f594be-48b6-4072-a13b-868128f753fd_1523x532.png 424w, https://substackcdn.com/image/fetch/$s_!M1Jb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43f594be-48b6-4072-a13b-868128f753fd_1523x532.png 848w, https://substackcdn.com/image/fetch/$s_!M1Jb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43f594be-48b6-4072-a13b-868128f753fd_1523x532.png 1272w, https://substackcdn.com/image/fetch/$s_!M1Jb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43f594be-48b6-4072-a13b-868128f753fd_1523x532.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This then sets the rules for that Principle, Policy or Pattern for that Layer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6EbG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6EbG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png 424w, https://substackcdn.com/image/fetch/$s_!6EbG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png 848w, https://substackcdn.com/image/fetch/$s_!6EbG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png 1272w, https://substackcdn.com/image/fetch/$s_!6EbG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6EbG!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png" 
width="1200" height="419.5054945054945" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:509,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:96054,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/175058470?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6EbG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png 424w, https://substackcdn.com/image/fetch/$s_!6EbG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png 848w, https://substackcdn.com/image/fetch/$s_!6EbG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png 1272w, https://substackcdn.com/image/fetch/$s_!6EbG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0724f603-0ffe-451c-afd2-86e0a7fc0537_1523x532.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The value of the Pattern Template is being able to see what is defined in each layer and more importantly see the differences between the layers.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!983O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!983O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png 424w, https://substackcdn.com/image/fetch/$s_!983O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png 848w, https://substackcdn.com/image/fetch/$s_!983O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png 1272w, https://substackcdn.com/image/fetch/$s_!983O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!983O!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png" width="1200" height="398.0769230769231" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:483,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:63774,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/175058470?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!983O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png 424w, https://substackcdn.com/image/fetch/$s_!983O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png 848w, https://substackcdn.com/image/fetch/$s_!983O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png 1272w, https://substackcdn.com/image/fetch/$s_!983O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6d6eb92-562a-40a3-97b7-8582a0a9b253_1751x581.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Rinse and repeat for each row and layer until you have a populated template.</p><h2>Example - AgileData</h2><p>The <em><strong>Example - AgileData </strong></em>tab has an example of a completed template based on our AgileData Platform.<br><br>I use this to test the template as I iterate it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ddzg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Ddzg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png 424w, https://substackcdn.com/image/fetch/$s_!Ddzg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png 848w, https://substackcdn.com/image/fetch/$s_!Ddzg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png 1272w, https://substackcdn.com/image/fetch/$s_!Ddzg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ddzg!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png" width="1200" height="1264.340425531915" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1238,&quot;width&quot;:1175,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:213530,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/175058470?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ddzg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png 424w, https://substackcdn.com/image/fetch/$s_!Ddzg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png 848w, https://substackcdn.com/image/fetch/$s_!Ddzg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png 1272w, https://substackcdn.com/image/fetch/$s_!Ddzg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8273583-94ce-4b39-ba8a-cfbfdaae2b87_1175x1238.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Dictionary</h3><p>The <em><strong>Dictionary </strong></em>tab reads from the <em><strong>Lookups</strong></em> tab.</p><p>It gives you a Description for each row and for each value that is available in that row.</p><div class="pullquote"><p>These Descriptions need work, so reach out if you want to help iterate them to make them clearer.</p></div><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F2NR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!F2NR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png 424w, https://substackcdn.com/image/fetch/$s_!F2NR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png 848w, https://substackcdn.com/image/fetch/$s_!F2NR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png 1272w, https://substackcdn.com/image/fetch/$s_!F2NR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!F2NR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png" width="1456" height="980" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:980,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:375570,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/175058470?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F2NR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png 424w, https://substackcdn.com/image/fetch/$s_!F2NR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png 848w, https://substackcdn.com/image/fetch/$s_!F2NR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png 1272w, https://substackcdn.com/image/fetch/$s_!F2NR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f55e73b-9b07-4782-b722-4733442500ff_1839x1238.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5-dd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5-dd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png 424w, https://substackcdn.com/image/fetch/$s_!5-dd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png 848w, https://substackcdn.com/image/fetch/$s_!5-dd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png 1272w, https://substackcdn.com/image/fetch/$s_!5-dd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5-dd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png" width="1456" height="980" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:980,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:341604,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/175058470?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5-dd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png 424w, https://substackcdn.com/image/fetch/$s_!5-dd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png 848w, https://substackcdn.com/image/fetch/$s_!5-dd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png 1272w, https://substackcdn.com/image/fetch/$s_!5-dd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b688b2d-0fe0-401d-a6d1-217169364930_1839x1238.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h3>Lookups</h3><p>The <em><strong>Lookups </strong></em>tab is where I have defined the Values and their Descriptions that are used in the <em><strong>Template</strong></em> tab.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!a0z-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!a0z-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png 424w, 
https://substackcdn.com/image/fetch/$s_!a0z-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png 848w, https://substackcdn.com/image/fetch/$s_!a0z-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png 1272w, https://substackcdn.com/image/fetch/$s_!a0z-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!a0z-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png" width="1456" height="980" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:980,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:315849,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/175058470?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!a0z-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png 424w, https://substackcdn.com/image/fetch/$s_!a0z-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png 848w, https://substackcdn.com/image/fetch/$s_!a0z-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png 1272w, https://substackcdn.com/image/fetch/$s_!a0z-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61faab8a-b5c9-4d2d-a701-60d835df7921_1839x1238.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Iterating the Pattern Template</h2><h3>More Lookup Values</h3><p>Feel free to add your own Lookups to the <em><strong>Lookups</strong></em> tab.</p><p>If you add a lookup value below the other values it should turn up on the <em><strong>Template</strong></em> tab automagically.</p><p>But you will need to manually add it to the <em><strong>Dictionary</strong></em> tab.  That tab is formula driven so just copy the formulas from another row and edit them.</p><h3>Notes</h3><p>The notes on the <em><strong>Template</strong></em> tab are manually copied and pasted from the <em><strong>Lookups</strong></em> tab as I haven&#8217;t spent any time working out how to automate that in Google Sheets.<br><br>So if you change anything on the <em><strong>Lookups</strong></em> tab you will need to manually copy and paste those notes again.</p><h2>Next steps</h2><p>I will keep iterating the Pattern Template based on feedback and as I test it with more Data Teams and Organisations.</p><p>I am also keen to move it to an App to make it easier to maintain.</p><p>That App will need to be open source / free, as in no paywall to use.<br><br>If you&#8217;re keen to develop this with me, feel free to reach out.</p>]]></content:encoded></item><item><title><![CDATA[Define Once Reuse Often (DORO) and my friend Disco]]></title><description><![CDATA[Its amazing what you find valuable a second time around and end up reusing]]></description><link>https://agiledata.info/p/define-once-reuse-often-doro-and</link><guid isPermaLink="false">https://agiledata.info/p/define-once-reuse-often-doro-and</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Wed, 10 Sep 2025 05:36:54 GMT</pubDate><enclosure 
url="https://substack-post-media.s3.amazonaws.com/public/images/0c236068-c6cd-4a7f-bf99-ab902ea26e42_10000x10000.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We are doing some experimentation on how we can improve ADI in the sub domain of Data Modeling.</p><p>As we worked with an Agile Data Network partner this week to help them onboard a new Customer, we decided to use an ADI first approach for all the data work in the AgileData App and Platform and see what would happen.</p><p>And of course part of all data work is data modeling.  </p><div class="pullquote"><p>It doesn&#8217;t matter if you consciously data model or not, as soon as you transform or store data you are modeling data.</p><p>We prefer to consciously model data instead of letting it happen unconsciously.</p></div><p>So we got ADI to look at the source system data that had been collected into History (data we had never modeled before) and got her to take us through the data modeling process.</p><p>Our Agile Data Network partner had already worked with the customer to understand their required Information Products and so had a good understanding of the Core Business Concepts, Core Business Processes, Facts, Measures and Metrics that would potentially meet their organisational needs.</p><p>This also stopped us letting ADI define a source system specific data model that would break on the first engagement with a Stakeholder&#8217;s changing requirements.</p><p>ADI did OK, but we always know we can do better, so time to McSpikey.</p><h2>Agile Data Disco</h2><p>Last year we did a raft of work for a Customer, where we reverse engineered 100&#8217;s of their Cognos Report definitions to help document their legacy data platform before they moved to a greenfield Modern Data Stack.  
This work removed the need for a team of Business / Data Analysts to spend months documenting the legacy system.</p><p>We ended up building this in public and semi-productising it, calling it Agile Data Disco.</p><p> You can read about that journey here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;c16d90f2-3177-4d4f-af78-bed4d27df2e1&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;We are working on something new at AgileData, follow us as we build it in public #AgileDataDisco&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:2774203,&quot;name&quot;:&quot;Shagility&quot;,&quot;bio&quot;:&quot;I help data and analytics teams change the Way they Work in a Simply Magical Way&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f09a2d19-6707-4ef9-a4e3-a5e770fb640f_1406x853.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2024-06-05T23:05:53.759Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9d003782-1b80-4220-9bab-c84441acd5af_2726x2958.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://agiledata.substack.com/p/we-are-working-on-something-new-at&quot;,&quot;section_name&quot;:&quot;AgileData Product&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:145357014,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:0,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;Agile Data N&#8217; 
Info&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!ErtR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8892c64-a0c7-4c7b-9f49-a73be5280f22_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>We saw a potential market in solving the data problem of understanding what a legacy data platform actually contains.<br><br><a href="https://agiledata.team/data-problems/use-case/legacy-data-platform-discovery/">https://agiledata.team/data-problems/use-case/legacy-data-platform-discovery/</a></p><p>You can see an interactive demo of the final Agile Data Disco product we built to help us do that Fractional Data Work here:</p><p><a href="https://agiledata.cloud/disco/#demo">https://agiledata.cloud/disco/#demo</a></p><p>One of the things we did as part of Disco was some interesting prompt engineering to take a single input (say a blob of SQL, a log from SQL execution or the definition of a report) and from that create a series of useful populated Pattern Templates as the output.<br><br>The outputs we generate are:</p><ul><li><p>Information Product Canvas</p></li><li><p>Event Model</p></li><li><p>Conceptual Model</p></li><li><p>Physical Model</p></li><li><p>Reporting Model</p></li><li><p>Business Glossary</p></li><li><p>Data Dictionary</p></li><li><p>Metric Definitions</p></li><li><p>Bus Matrix</p></li><li><p>Source Mapping</p></li></ul><p>And we also did another McSpikey with Disco where I uploaded an image of a completed Information Product Canvas and had Disco generate all those objects from that one image.<br><br>You can read about that one or see the Interactive Demo for it here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;7b7bc894-7ecb-4b09-9ae9-7debb3bc946e&quot;,&quot;caption&quot;:&quot;A while ago we added the ability to upload an image to AgileData 
Disco, it was so customers could upload screenshots of their Dashboards and we could get Disco to document their Data Environments based on those images.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Can we use an Information Product Canvas image to start the data design process?&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:2774203,&quot;name&quot;:&quot;Shagility&quot;,&quot;bio&quot;:&quot;I help data and analytics teams change the Way they Work in a Simply Magical Way&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f09a2d19-6707-4ef9-a4e3-a5e770fb640f_1406x853.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-02-19T21:44:49.555Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ecc0bb9a-cfed-4d4b-8cd6-336d567e4c5e_183x137.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://agiledata.substack.com/p/can-we-use-an-information-product&quot;,&quot;section_name&quot;:&quot;AgileData Product&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:157496656,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:0,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;Agile Data N&#8217; Info&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!ErtR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8892c64-a0c7-4c7b-9f49-a73be5280f22_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>Can you guess where we went with this</h2><p>Yup we took the core parts of Disco and did a McSpikey to see how much of the Disco prompt and reinforcement 
logic we could reuse to improve ADI.</p><h2>Results of the McSpikey still to come</h2><p>I still need to take the time to write up the results of the McSpikey properly, but the TL;DR is:</p><ul><li><p>Reusing the Disco prompt and reinforcement logic had massive value, &#8220;time to value&#8221; wise;</p></li><li><p>We ended up experimenting with breaking out an ADI sub agent, ADI the Agile Data Modeler, which had some real benefits in improving the ADI responses.</p></li></ul><h2>A word from our sponsor</h2><p>One of the things we got asked in the early days of Disco was whether, rather than just documenting what was in the legacy Data Platform, we could automagically migrate it to a new data platform.</p><p>When we thought about it, that looked like a very complex problem that would take a few years of development to make feasible, and we would end up with a many to many problem before it was viable: we would need to read from many tools and technologies and also need to write to many tools and technologies.  Until you had critical mass at both the read and write ends the product wouldn&#8217;t be viable.</p><p>And we also saw data platform modernisation as a great catalyst for organisations to rethink the way their data teams worked, and to rearchitect a lot more than just their database, ETL tool and BI tool.  
So we weren&#8217;t fans of supporting a better &#8220;like for like&#8221; modernisation pattern.</p><div class="pullquote"><p>Getting permission to replace your data stack is often the easiest business case to get signed off to be able to change the way your data team works.<br><br>And also a way to finally pay back those years of technical debt (by rebuilding it all again)</p></div><p>But we also had in the back of our mind the idea that if we could take a Customer&#8217;s legacy data platform and automagically migrate it to the AgileData Platform with minimal human effort, that would be very valuable in removing one of the key points of friction for working with potential Fractional Data Service customers who had already invested a shit ton of money in a data platform (legacy or modern).</p><p>Because our AgileData Platform is based on our very opinionated Ways of Working, Data Engineering and DataOps patterns, this wouldn&#8217;t be a &#8220;like for like&#8221; but an automated rebuild from new.</p><p>As we experiment with ADI the Agile Data Modeler we are wondering if that Agent on its own has some value.  Upload your current data model, and get back a bunch of candidate data models to review.</p><p>This wouldn&#8217;t be just an LLM going text to text; we would extend the reinforcement model we used for Disco, and bring in our opinionated Business, Concept and Physical data modeling patterns to it as well.</p><p>If I have to place a bet (and I do) I am going to carry on with the Context Plane bet.  </p><p>All the experimentation and development we do in that space is also immediately available in the AgileData App and Platform, so another form of DORO.</p><p>But if you think there is value in ADI the Agile Data Modeler on her own, reach out and let&#8217;s have a chat.  
I would be keen to understand the use case you have in mind.</p><h2>An incoherent stream of Context</h2><p>You can find all the previous articles with my train of thought listed in this thread:<br><br><a href="https://agiledata.substack.com/t/context-plane">https://agiledata.substack.com/t/context-plane</a></p><p>We are building the Context Plane while flying it, so always looking for early adopters to help us decide the final destination.<br><br>If you want a virtual chat grab a slot here:<br><br><a href="https://contextplane.ai/contact-us/#bookemdanno">https://contextplane.ai/contact-us/#bookemdanno</a></p>]]></content:encoded></item><item><title><![CDATA[UX patterns for the Context Plane]]></title><description><![CDATA[Ways that the Context needs to be accessed and by whom]]></description><link>https://agiledata.info/p/ux-patterns-for-the-context-plane</link><guid isPermaLink="false">https://agiledata.info/p/ux-patterns-for-the-context-plane</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Tue, 09 Sep 2025 01:48:55 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!U1lG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#8220;One size doesn&#8217;t fit all&#8221;. </p><p>Different personas and use cases demand different ways of interacting with Context.<br></p><blockquote><p><strong>&#8220;Context&#8221; of this post</strong></p></blockquote><p>I often find writing helps me coalesce and refine my thoughts when new patterns start to emerge, but aren&#8217;t very clear yet.  
</p><p>So this article is a brain dump / train of thought continuation of the architecture needed to have one Context Plane to rule them all, as part of a proposed &#8220;AI Data Stack&#8221;.<br><br>This article provides an overview of the specific persona types / use cases I have identified so far that need to access the Context Plane and the typical UX patterns for some of those. </p><h2>Updated Context Plane Architecture</h2><p>This iteration in my thinking has resulted in an update to the architecture diagram for the Context Plane:</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U1lG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U1lG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png 424w, https://substackcdn.com/image/fetch/$s_!U1lG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png 848w, https://substackcdn.com/image/fetch/$s_!U1lG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png 1272w, https://substackcdn.com/image/fetch/$s_!U1lG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!U1lG!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png" width="1200" height="983.2417582417582" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1193,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:610919,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/173056020?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U1lG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png 424w, https://substackcdn.com/image/fetch/$s_!U1lG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png 848w, https://substackcdn.com/image/fetch/$s_!U1lG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png 1272w, 
https://substackcdn.com/image/fetch/$s_!U1lG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f30cafc-552e-449d-b050-6ca123ff793e_1479x1212.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><br></p><h2>The four personas / use cases</h2><p>Here are the 4 persona types / use cases I have discovered so far:</p><ul><li><p><strong>Human, GUI centric</strong><br>A person who wants to use a web based App, chat or Graphical User Interface to access/discover/explore/update the Context.</p></li><li><p><strong>Human, Code Centric</strong><br>Wants to 
use a Command Line Interface (CLI) or Code based App to access/discover/explore/update the Context.</p></li><li><p><strong>System</strong><br>Systems can access the Context directly, either querying the Context or writing to it programmatically.</p></li><li><p><strong>Agent to Agent</strong><br>AI Agents can access/discover/explore/update Context autonomously, collaborating with other agents without a human in the loop.</p><p></p></li></ul><h2>The UX patterns</h2><p>Here are the UX patterns I have experimented with that seem to make sense:</p><ul><li><p><strong>Human, GUI centric</strong></p><ul><li><p>GUI centric Data Catalog</p></li><li><p>GUI centric Chatbot</p></li></ul><p></p></li><li><p><strong>Human, Code Centric</strong></p><ul><li><p>GenAI App</p></li><li><p>CLI tool</p></li></ul><p></p></li><li><p><strong>System</strong></p><ul><li><p>APIs</p></li></ul></li></ul><p></p><h3><strong>GUI centric Data Catalog</strong></h3><p>Your typical browser based Data Catalog interface.</p><div id="youtube2-RZSCBIhGBn4" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;RZSCBIhGBn4&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/RZSCBIhGBn4?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Prefer a manual click through version?</p><p><a href="https://guides.agiledata.io/demo/cmflomypg048l170irpj5zf6h">https://guides.agiledata.io/demo/cmflomypg048l170irpj5zf6h</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://guides.agiledata.io/demo/cmflomypg048l170irpj5zf6h&quot;,&quot;text&quot;:&quot;Click through demo&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" 
href="https://guides.agiledata.io/demo/cmflomypg048l170irpj5zf6h"><span>Click through demo</span></a></p><p></p><h3><strong>GUI centric Chatbot</strong></h3><p>Your typical browser Chatbot and Text to SQL interface.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jEDG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jEDG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png 424w, https://substackcdn.com/image/fetch/$s_!jEDG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png 848w, https://substackcdn.com/image/fetch/$s_!jEDG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png 1272w, https://substackcdn.com/image/fetch/$s_!jEDG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jEDG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png" width="1456" height="1152" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:589833,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/173056020?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jEDG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png 424w, https://substackcdn.com/image/fetch/$s_!jEDG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png 848w, https://substackcdn.com/image/fetch/$s_!jEDG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png 1272w, https://substackcdn.com/image/fetch/$s_!jEDG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F072af401-42d0-4337-8274-2ecc4be25258_1571x1243.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!O_57!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!O_57!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png 424w, 
https://substackcdn.com/image/fetch/$s_!O_57!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png 848w, https://substackcdn.com/image/fetch/$s_!O_57!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png 1272w, https://substackcdn.com/image/fetch/$s_!O_57!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!O_57!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png" width="1456" height="293" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:293,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74506,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/173056020?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!O_57!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png 424w, https://substackcdn.com/image/fetch/$s_!O_57!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png 848w, https://substackcdn.com/image/fetch/$s_!O_57!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png 1272w, https://substackcdn.com/image/fetch/$s_!O_57!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e7f188c-5dce-4ba6-9e76-745cecf2cd95_1571x316.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mRY2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mRY2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png 424w, https://substackcdn.com/image/fetch/$s_!mRY2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png 848w, 
https://substackcdn.com/image/fetch/$s_!mRY2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!mRY2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mRY2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png" width="1456" height="1144" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1144,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:283618,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/173056020?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mRY2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png 424w, 
https://substackcdn.com/image/fetch/$s_!mRY2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png 848w, https://substackcdn.com/image/fetch/$s_!mRY2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!mRY2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3738d6e5-2ff6-4249-9961-136440a03b05_1573x1236.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!O2Jv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!O2Jv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png 424w, https://substackcdn.com/image/fetch/$s_!O2Jv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png 848w, https://substackcdn.com/image/fetch/$s_!O2Jv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!O2Jv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!O2Jv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png" width="1456" height="1144" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1144,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:293870,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/173056020?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!O2Jv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png 424w, https://substackcdn.com/image/fetch/$s_!O2Jv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png 848w, https://substackcdn.com/image/fetch/$s_!O2Jv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!O2Jv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53380c1a-af14-4225-a9f5-1a794576d1c8_1573x1236.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mKPr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mKPr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png 424w, 
https://substackcdn.com/image/fetch/$s_!mKPr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png 848w, https://substackcdn.com/image/fetch/$s_!mKPr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!mKPr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mKPr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png" width="1456" height="1144" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1144,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:200008,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/173056020?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!mKPr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png 424w, https://substackcdn.com/image/fetch/$s_!mKPr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png 848w, https://substackcdn.com/image/fetch/$s_!mKPr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!mKPr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5908839-527b-4474-a003-85be2c74a3cb_1573x1236.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h3><strong>GenAI App </strong></h3><p>Typical Claude &#8220;AI Agent&#8221; app tool interface.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y-1G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y-1G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png 424w, https://substackcdn.com/image/fetch/$s_!Y-1G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png 848w, https://substackcdn.com/image/fetch/$s_!Y-1G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png 1272w, https://substackcdn.com/image/fetch/$s_!Y-1G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Y-1G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png" width="1456" height="1129" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1129,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:287899,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/173056020?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y-1G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png 424w, https://substackcdn.com/image/fetch/$s_!Y-1G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png 848w, https://substackcdn.com/image/fetch/$s_!Y-1G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png 1272w, https://substackcdn.com/image/fetch/$s_!Y-1G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11776f26-ed4f-453c-b685-da3ac852d3e5_1580x1225.png 
1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>CLI Tool</strong></h3><p>Typical Gemini CLI command line interface.</p><p>[Screenshot TBA]</p><p></p><h3><strong>API&#8217;s</strong></h3><p>Typical API endpoints.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7F_e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source 
type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7F_e!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png 424w, https://substackcdn.com/image/fetch/$s_!7F_e!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png 848w, https://substackcdn.com/image/fetch/$s_!7F_e!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png 1272w, https://substackcdn.com/image/fetch/$s_!7F_e!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7F_e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png" width="1456" height="1154" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1154,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:240018,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/173056020?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7F_e!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png 424w, https://substackcdn.com/image/fetch/$s_!7F_e!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png 848w, https://substackcdn.com/image/fetch/$s_!7F_e!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png 1272w, https://substackcdn.com/image/fetch/$s_!7F_e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df0d9f5-0573-4319-af6d-e116bc72d56f_1570x1244.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8KDs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8KDs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png 424w, 
https://substackcdn.com/image/fetch/$s_!8KDs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png 848w, https://substackcdn.com/image/fetch/$s_!8KDs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png 1272w, https://substackcdn.com/image/fetch/$s_!8KDs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8KDs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png" width="1456" height="1154" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1154,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:226052,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/173056020?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!8KDs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png 424w, https://substackcdn.com/image/fetch/$s_!8KDs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png 848w, https://substackcdn.com/image/fetch/$s_!8KDs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png 1272w, https://substackcdn.com/image/fetch/$s_!8KDs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b481df-9774-49c3-b405-c14de7d8ee95_1570x1244.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Missing UX patterns</h2><p>The key UX pattern I have yet to discover is how the Agent to Agent UX works.</p><p>I think we will need to do a McSpikey with the Google A2A to understand the options in that space a little more.</p><h2>The Technology patterns</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qvhG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe837de5c-289c-4e22-94e9-b3f701b3e6c2_5996x2984.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qvhG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe837de5c-289c-4e22-94e9-b3f701b3e6c2_5996x2984.png 424w, https://substackcdn.com/image/fetch/$s_!qvhG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe837de5c-289c-4e22-94e9-b3f701b3e6c2_5996x2984.png 848w, https://substackcdn.com/image/fetch/$s_!qvhG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe837de5c-289c-4e22-94e9-b3f701b3e6c2_5996x2984.png 1272w, https://substackcdn.com/image/fetch/$s_!qvhG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe837de5c-289c-4e22-94e9-b3f701b3e6c2_5996x2984.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!qvhG!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe837de5c-289c-4e22-94e9-b3f701b3e6c2_5996x2984.png" width="1200" height="597.5274725274726" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e837de5c-289c-4e22-94e9-b3f701b3e6c2_5996x2984.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:725,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qvhG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe837de5c-289c-4e22-94e9-b3f701b3e6c2_5996x2984.png 424w, https://substackcdn.com/image/fetch/$s_!qvhG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe837de5c-289c-4e22-94e9-b3f701b3e6c2_5996x2984.png 848w, https://substackcdn.com/image/fetch/$s_!qvhG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe837de5c-289c-4e22-94e9-b3f701b3e6c2_5996x2984.png 1272w, https://substackcdn.com/image/fetch/$s_!qvhG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe837de5c-289c-4e22-94e9-b3f701b3e6c2_5996x2984.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We already had a lot of the technology patterns in place before we started experimenting with the Context Plane.</p><p>Things like the browser-based Data Catalog capability.</p><p>We also had a lot of ADI already built, based on &#8220;AI Assisted&#8221; features we have experimented with over the last 6-odd years.</p><p>Our App and Platform architecture has always been based on API&#8217;s in the middle:</p><p>App &gt; API &gt; Context &gt; Code / Data</p><p>The main iteration, technology-wise, has been the addition of an MCP server.   
This has allowed the use of tools like Claude and Gemini CLI.</p><p>We have iterated ADI to use the MCP server to access the Context (well, we actually use a hybrid access model, but I&#8217;ll leave the diagram as simple as this for now).</p><h2>So many new questions</h2><h3>User?</h3><p>UX stands for User Experience, but some of these persona types and use cases are machines, not humans. Should they still be referred to as Users?</p><h3>BI Semantic Layer?</h3><p>Where do the typical BI Tools and the &#8220;BI Semantic Layer&#8221; pattern fit into this?</p><p>For Context Plane we are only holding an Organisation&#8217;s Context, not their Data, so we can&#8217;t execute any queries like we can in the AgileData App and Platform.</p><p>Or do we want to look at generating a query the human can cut and paste into their data platform?</p><p>Or do we want a Context Agent to push the Query to a BI agent inside the Organisation&#8217;s agent ecosystem?</p><p>We don&#8217;t provide a caching layer or query rewrite patterns, which is what the BI Semantic Layers / Metric Layers are doing these days.  
I&#8217;m pretty sure we don&#8217;t want to go there.</p><p>When will BI Tools move to using MCP servers as a way of querying the data?</p><p>When will they all put Agents in front of their BI Semantic Layers?</p><h3>One step forward, but a raft of new uncertainties</h3><p>So many new questions, so few answers.</p><p>Looks like there are even more McSpikeys to add to the list!</p><h2>Wood from the Trees</h2><p>Still a way to go before I have a coherent set of Patterns for the &#8220;Context Plane&#8221; and the &#8220;AI Data Stack&#8221; that I can Coach / Mentor / Teach somebody else, or present as a robust Architecture map.</p><p>But as I have already said, writing my half-formed ideas helps me think.</p><h2>An incoherent stream of Context</h2><p>You can find all the previous articles with my train of thought listed in this thread:<br><br><a href="https://agiledata.substack.com/t/context-plane">https://agiledata.substack.com/t/context-plane</a><br><br>We are building the Context Plane while flying it, so we are always looking for early adopters to help us decide the final destination.<br><br>If you want a virtual chat, grab a slot here:<br><br><a href="https://contextplane.ai/contact-us/#bookemdanno">https://contextplane.ai/contact-us/#bookemdanno</a></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[AgileData Data Match, AgileData Engineering Pattern #7]]></title><description><![CDATA[The Data Match pattern provides an automated, granular comparison capability to efficiently identify and report discrepancies between two datasets, moving from row counts to specific data values.]]></description><link>https://agiledata.info/p/agiledata-data-match-agiledata-engineering</link><guid isPermaLink="false">https://agiledata.info/p/agiledata-data-match-agiledata-engineering</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Thu, 04 Sep 2025 21:24:18 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!_NEi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1><strong>Data Match</strong></h1><h2><strong>Quicklinks</strong></h2><blockquote><p><strong><a href="https://agiledata.substack.com/i/172820886/description">Description</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/172820886/pattern-context-diagram">Context Diagram</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/172820886/agiledata-pattern-template">Pattern Template</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/172820886/press-release-template">Press Release Template</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/172820886/agiledata-app-platform-example">AgileData App / Platform Example</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/172820886/agiledata-podcast-episode">AgileData Podcast Episode</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/172820886/agiledata-podcast-episode-mindmap">AgileData Podcast Mind Map</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/172820886/agiledata-podcast-episode-transcript">AgileData Podcast Transcript</a></strong></p></blockquote><h2><strong>Agile Data Engineering Pattern</strong></h2><p>An AgileData Engineering Pattern is a repeatable, proven approach for solving a common data engineering challenge in a simple, consistent, and scalable way, designed to reduce rework, speed up delivery, and embed quality by default.</p><h2><strong>Pattern Description</strong></h2><p>The <strong>Data Match</strong> pattern provides an <strong>automated, granular comparison</strong> capability to efficiently identify and report discrepancies between two datasets, moving from row counts to specific data values. 
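</p><p>As a rough illustration of what "moving from row counts to specific data values" can look like, here is a minimal Python sketch. It is not the AgileData implementation (the podcast transcript further down describes that as query templates optimised for BigQuery); the function name <code>data_match</code>, the in-memory rows and the <code>customer_id</code> / <code>dob</code> columns are all assumptions made up for this example.</p><pre><code class="language-python">def data_match(left, right, key, columns):
    """Compare two lists of row dicts in layers: row counts, then keys, then values.

    Hypothetical helper for illustration only; assumes key values are unique per side.
    """
    report = {
        "row_counts": (len(left), len(right)),  # Layer 1: cheapest check first
        "only_in_left": [],
        "only_in_right": [],
        "value_mismatches": [],
    }

    # Layer 2: keys present on one side but not the other.
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    report["only_in_left"] = sorted(set(left_by_key) - set(right_by_key))
    report["only_in_right"] = sorted(set(right_by_key) - set(left_by_key))

    # Layer 3: compare the chosen columns, only for keys found on both sides.
    for k in sorted(set(left_by_key).intersection(right_by_key)):
        for col in columns:
            if left_by_key[k][col] != right_by_key[k][col]:
                report["value_mismatches"].append(
                    (k, col, left_by_key[k][col], right_by_key[k][col])
                )
    return report


# Made-up sample data: customer 3 never arrived, customer 2 has a different dob.
left = [
    {"customer_id": 1, "dob": "1980-01-01"},
    {"customer_id": 2, "dob": "1990-05-05"},
    {"customer_id": 3, "dob": "2000-12-31"},
]
right = [
    {"customer_id": 1, "dob": "1980-01-01"},
    {"customer_id": 2, "dob": "1991-05-05"},
]

result = data_match(left, right, key="customer_id", columns=["dob"])
print(result)
</code></pre><p>On tables of any real size the same three layers would normally be pushed down into the database as SQL rather than run in memory; the sketch is only meant to show the order of the checks.</p><p>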
</p><p>This 'data diff' solution transforms <strong>hours of manual data reconciliation into minutes</strong> by optimising comparisons for cloud analytics databases like BigQuery, serving as a <strong>support feature for on-demand exception handling</strong> rather than a continuous trust rule.</p><h2><strong>Pattern Context Diagram</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_NEi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_NEi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png 424w, https://substackcdn.com/image/fetch/$s_!_NEi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png 848w, https://substackcdn.com/image/fetch/$s_!_NEi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png 1272w, https://substackcdn.com/image/fetch/$s_!_NEi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_NEi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png" width="1056" height="509" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:509,&quot;width&quot;:1056,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:53914,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/172820886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_NEi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png 424w, https://substackcdn.com/image/fetch/$s_!_NEi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png 848w, https://substackcdn.com/image/fetch/$s_!_NEi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png 1272w, https://substackcdn.com/image/fetch/$s_!_NEi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31b50478-2ff6-437e-abba-c0c94f9fe50b_1056x509.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h2><strong>Pattern Template</strong></h2><h3>Pattern Name</h3><p><strong>Data Match</strong></p><h3>The Problem It Solves</h3><p>You know that moment when you're trying to figure out <strong>why your data numbers don't add up</strong> between two systems or tables? Or you're trying to check if <strong>everything from your source has made it to your target</strong>? </p><p>Often, you're faced with hours, or even days, of painstaking manual reconciliation, writing complex SQL queries, or dealing with inefficient brute-force comparisons that cost a fortune in compute resources. 
</p><p>This pattern solves the problem of <strong>quickly and efficiently identifying discrepancies</strong> between two datasets, saving immense time and frustration.</p><h3><strong>When to Use It</strong></h3><p>Use Data Match primarily as an <strong>exception-handling tool</strong> or a <strong>support feature</strong>. It's most useful:</p><ul><li><p>When <strong>something goes wrong</strong> and you suspect data misalignment between a source and a target.</p></li><li><p>For <strong>reconciling data</strong> after a migration or a complex transformation, especially when trying to pinpoint missing records.</p></li><li><p>When you need to <strong>quickly compare two tables or datasets</strong> to find differences without writing custom SQL.</p></li><li><p>When <strong>manual reconciliation is proving horrendous</strong> due to large volumes or complex logic.</p></li></ul><p>It's not designed as a core trust rule for every data movement, but rather for <strong>on-demand verification</strong>.</p><h3>How It Works</h3><p>This pattern turns a complex data reconciliation task into a few simple clicks.</p><p><strong>Trigger:</strong> A user needs to verify data consistency between two datasets because a discrepancy is suspected, or an audit is required.</p><p><strong>Inputs:</strong></p><ul><li><p>A <strong>"table on the left"</strong> (source data) and a <strong>"table on the right"</strong> (target data). 
This could include data uploaded from an Excel spreadsheet as a "new tile".</p></li><li><p>The specific <strong>"things in each table"</strong> the user wants to double check, such as primary keys or particular columns.</p></li><li><p>Access to a <strong>data catalog</strong> where all relevant "tiles" (data assets) are loaded.</p></li></ul><p><strong>Steps:</strong></p><ol><li><p>The user <strong>selects the first dataset</strong> (e.g., "tile A") and the <strong>second dataset</strong> (e.g., "tile B") from an interface.</p></li><li><p>The user specifies the <strong>columns or keys</strong> within each dataset that need to be compared.</p></li><li><p>The user initiates the comparison, often with a simple "hit go" or a <strong>"1, 2, 3, 4, 5 click exercise"</strong>.</p></li><li><p>Under the covers, the system performs an <strong>increasingly granular match</strong>:</p><ul><li><p>It starts by comparing <strong>row counts</strong>.</p></li><li><p>Then, it compares <strong>keys</strong> between the two tables.</p></li><li><p>Finally, it compares <strong>specific data values</strong> (e.g., dates of birth). This layering of rules <strong>optimises the comparison</strong> and avoids costly brute-force operations.</p></li></ul></li><li><p>The system <strong>optimises the underlying queries</strong> for the specific database environment (e.g., BigQuery), leveraging features like columnar storage and partition pruning for efficiency.</p></li></ol><p><strong>Outputs:</strong></p><ul><li><p>A <strong>report</strong> detailing "all the things in the left that aren't in the right or vice versa".</p></li><li><p><strong>Specific identification of discrepant records</strong>, such as a list of "customer IDs that haven't flowed".</p></li></ul><h3>Why It Works</h3><p>Data Match works because it <strong>automates and optimises a typically complex and manual process</strong>. 
It replaces hours of writing and running custom SQL with an intuitive, guided workflow, essentially providing a "data diff" capability as a service. </p><p>The pattern's effectiveness comes from its <strong>layered approach to comparison</strong>, moving from high-level checks (like row counts) to granular value comparisons, which makes it highly efficient and cost-effective, particularly for large datasets in cloud analytics databases. </p><p>It's like having an <strong>automated detective</strong> that quickly sifts through vast amounts of data to highlight the exact discrepancies, allowing analysts to focus on <em>why</em> the data is different, rather than <em>how</em> to find the differences.</p><h3>Real-World Example</h3><p>Consider a scenario where a data engineering team is trying to <strong>reconcile customer data</strong> that has been processed through new business rules with an existing Excel spreadsheet used by the business. Despite their efforts, they constantly find themselves "one customer out" after processing 100,000 customers, and each discrepancy is for a different, often obscure, reason. Manually finding that single missing customer is a "horrendous" and time-consuming task.</p><p>With <strong>Data Match</strong>, the team can quickly upload the Excel data as a new "tile," then use Data Match to compare it directly with their processed customer data. The tool rapidly <strong>highlights the exact single record that is out</strong>, turning "many hours of frustration" into "minutes" of investigation. This allows the team to spend their time understanding the root cause of the discrepancy with the business, rather than painstakingly searching for it.</p><p>Another example involves a <strong>data migration project</strong> where 100,000 customer records were sent via an API to a new vendor system, but only 80,000 appeared in the new system. Manually debugging this took "hours if not days". 
If Data Match had been available, they could have "back flushed" the final data loaded by the vendor as a "tile" and then compared it with the data they sent. This would have <strong>immediately identified the 20,000 records that didn't make it</strong>, saving significant time and effort in proving where the discrepancy occurred (e.g., showing the vendor that changes were made on their side, despite an agreement not to).</p><h3>Anti-Patterns or Gotchas</h3><ul><li><p><strong>Brute-Force Comparisons on Large Datasets:</strong> Trying to match everything between two very large tables without any optimisation or "layering of rules" will be <strong>extremely costly</strong> in terms of compute, credits, or tokens.</p></li><li><p><strong>Using Non-Optimised Tools:</strong> Relying on generic open-source libraries that are not specifically optimised for your cloud analytics database (e.g., a tool skewed towards row storage databases like Postgres when you're using a column-oriented database like BigQuery) will lead to <strong>inefficient queries and high costs</strong>, failing to leverage the database's performance benefits.</p></li><li><p><strong>Overuse as a Primary Trust Mechanism:</strong> Data Match is an <strong>"exception thing,"</strong> not a core "trust rule" to be run for every data movement. 
Over-relying on it for continuous validation can be inefficient and indicates a potential gap in proactive data quality monitoring.</p></li></ul><h3>Tips for Adoption</h3><ul><li><p><strong>Implement for On-Demand Use:</strong> Position Data Match as a powerful, on-demand <strong>"support feature"</strong> for when anomalies occur or specific reconciliations are needed, rather than an always-on data quality check.</p></li><li><p><strong>Optimise for Your Platform:</strong> If developing an internal version, ensure it's <strong>specifically tailored and optimised for your primary data platform</strong> (e.g., BigQuery) to maximise efficiency and minimise costs.</p></li><li><p><strong>Integrate with Data Catalogues:</strong> Make it easy for users to pick and compare any "tile" (data asset) loaded in your data catalogue, reducing the overhead of manual configuration.</p></li><li><p><strong>Focus on Post-Detection Analysis:</strong> Emphasise that Data Match quickly identifies <em>what</em> is different, enabling data professionals to then spend their valuable time on <em>why</em> the data differs and <em>how</em> to fix it.</p></li></ul><h3>Related Patterns</h3><ul><li><p><strong>Data Diff:</strong> This is the general term for the concept that Data Match embodies.</p></li><li><p><strong>Tracing Values:</strong> This related feature helps users specifically look for the flow of individual data points once discrepancies are identified by Data Match.<br> </p></li></ul><h2><strong>Press Release Template</strong></h2><h3>Capability Name</h3><p>Data Match</p><h3>Headline </h3><p>AgileData Launches <strong>Data Match</strong> to Slash Data Reconciliation Time from Hours to Minutes for Data Teams</p><h3>Introduction</h3><p>AgileData is thrilled to announce the availability of <strong>Data Match</strong>, a powerful new capability designed to simplify and accelerate the process of identifying discrepancies between two datasets. 
This feature empowers data analysts, engineers, and business users to quickly verify data consistency and pinpoint missing or mismatched records with unprecedented ease, ensuring greater confidence in their data.</p><h3>Problem</h3><p>"As a data professional, I've spent countless hours, sometimes even days, painstakingly trying to figure out <strong>why my numbers don't match</strong> between two systems or after a data migration. It's a horrendous, manual process of writing complex SQL or sifting through spreadsheets, often just to find that one elusive missing record. I just want to know what's different, quickly, so I can fix it."</p><h3>Solution</h3><p><strong>Data Match</strong> transforms this laborious task into a quick, intuitive process. Users simply select two datasets (or "tiles"), specify the keys or columns they wish to compare, and with a few clicks, the system performs an <strong>optimised, granular comparison</strong>. It efficiently checks everything from row counts to specific data values, then generates a clear report highlighting all discrepancies. This eliminates the need for manual SQL queries and immediately pinpoints the exact records that are out of sync, saving <strong>hours of frustration and reducing compute costs</strong>.</p><h3>Data Platform Product Manager</h3><p>"With <strong>Data Match</strong>, we're not just offering a new feature; we're fundamentally improving <strong>trust and auditability</strong> within our data ecosystem. It provides our users with an on-demand, highly efficient tool to quickly validate data alignment, ensuring that discrepancies are identified swiftly, reinforcing confidence in our data pipelines and overall data quality."</p><h3>Data Platform User</h3><p>"Honestly, <strong>Data Match is a game-changer</strong>. What used to take me 'hours, if not days,' to manually reconcile data or prove a discrepancy now literally takes 'minutes' with just a few clicks. 
I don't have to remember complex queries; I just hit 'go' and get my answers, letting me focus on solving the <em>why</em>, not just finding the <em>what</em>."</p><h3>Get Started </h3><p>Ready to transform your data reconciliation process from hours to minutes? <strong>Data Match</strong> is available now within the AgileData platform. Connect with your AgileData team today to learn more about how to leverage this powerful capability, or visit agiledata.io for further details on adopting new patterns to craft your Agile Data way of working.</p><h2>AgileData App / Platform Example</h2><p></p><p></p><h2>AgileData Podcast Episode</h2><p><a href="https://podcast.agiledata.io/e/data-match-agiledata-engineering-pattern-7-episode-75/">https://podcast.agiledata.io/e/data-match-agiledata-engineering-pattern-7-episode-75/</a><br></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://podcast.agiledata.io/e/data-match-agiledata-engineering-pattern-7-episode-75/&quot;,&quot;text&quot;:&quot;Listen to Podcast Episode&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://podcast.agiledata.io/e/data-match-agiledata-engineering-pattern-7-episode-75/"><span>Listen to Podcast Episode</span></a></p><p></p><blockquote><p><strong>Subscribe:</strong> <a href="https://podcasts.apple.com/nz/podcast/agiledata/id1456820781">Apple Podcast</a> | <a href="https://open.spotify.com/show/4wiQWj055HchKMxmYSKRIj">Spotify</a> | <a href="https://www.google.com/podcasts?feed=aHR0cHM6Ly9wb2RjYXN0LmFnaWxlZGF0YS5pby9mZWVkLnhtbA%3D%3D">Google Podcast </a>| <a href="https://music.amazon.com/podcasts/add0fc3f-ee5c-4227-bd28-35144d1bd9a6">Amazon Audible</a> | <a href="https://tunein.com/podcasts/Technology-Podcasts/AgileBI-p1214546/">TuneIn</a> | <a href="https://iheart.com/podcast/96630976">iHeartRadio</a> | <a href="https://player.fm/series/3347067">PlayerFM</a> | <a 
href="https://www.listennotes.com/podcasts/agiledata-agiledata-8ADKjli_fGx/">Listen Notes</a> | <a href="https://www.podchaser.com/podcasts/agiledata-822089">Podchaser</a> | <a href="https://www.deezer.com/en/show/5294327">Deezer</a> | <a href="https://podcastaddict.com/podcast/agiledata/4554760">Podcast Addict</a> |</p></blockquote><div id="youtube2-G7L5JDMIP7E" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;G7L5JDMIP7E&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/G7L5JDMIP7E?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>AgileData Podcast Episode MindMap</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!glVt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!glVt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png 424w, https://substackcdn.com/image/fetch/$s_!glVt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png 848w, https://substackcdn.com/image/fetch/$s_!glVt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png 
1272w, https://substackcdn.com/image/fetch/$s_!glVt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!glVt!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png" width="1200" height="2651.3736263736264" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:3217,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:2581633,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/172820886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!glVt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png 424w, https://substackcdn.com/image/fetch/$s_!glVt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png 848w, 
https://substackcdn.com/image/fetch/$s_!glVt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png 1272w, https://substackcdn.com/image/fetch/$s_!glVt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c75717c-6c0e-44b3-bbd8-2b0530b11857_4949x10936.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h2>AgileData Podcast Episode Transcript</h2><p><strong>Shane</strong>: Welcome to the Agile Data Podcast. I'm Shane Gibson. 
</p><p><strong>Nigel</strong>: And I'm Nigel Vining.</p><p><strong>Shane</strong>: Hey, Nigel. Another data engineering bytes. Today we are gonna talk about a feature that we term Data Match. So tell me what it is and why I care.</p><p><strong>Nigel</strong>: Data Match came out of the age-old question: how do I know something in my source is in my target?</p><p>Or as we like to say, what's in the left that isn't in the right, or vice versa. This is generally called a data diff in a lot of places. Generally, it's a pattern of doing an increasingly granular match of something on the left, which is generally a table of data, and something on the right. So we start off and we say, is the number of records the same?</p><p>Yes, it is. Great. Cool. Are all the keys the same between these two tables? Yep, they are. Cool. Now, are the dates of birth on the table on the left the same as the dates of birth on the table on the right? So effectively we go from the very wide row count down to something very specific: do the actual values match from left to right?</p><p>Now that all sounds pretty straightforward, and it technically is, but under the covers there's a whole lot of SQL and engineering patterns that are happening to basically run all those queries. So that's not something we would expect an analyst to generally do, 'cause it's a bit of a faff. So we came up with this feature called Data Match, where we effectively let a user pick a table on the left, pick a table on the right, pick the thing in each table they want to double check, and hit go.</p><p>Under the covers, we then optimize those comparisons and produce a report straight back to the user saying these are all the things in the left that aren't in the right, or vice versa.</p><p>So we've made it a 1, 2, 3, 4, 5 click exercise and you can reconcile anything in your environment and it's. </p><p><strong>Shane</strong>: I think this one, from memory, was an interesting problem. So we had a customer that we were doing the data work for. 
They had a series of business logic or rules that was in a fairly horrendous Excel spreadsheet.</p><p>So we used our way of working and we extracted the concept of those rules, and we modeled the data properly and we applied those rules. And whenever we were trying to reconcile the numbers we got with the numbers in the spreadsheet, we were always one thing out. So let's just say it was reconciling customers: they would have a hundred thousand and one customers.</p><p>We would have a hundred thousand customers. So we'd manually go through, find that one customer, work out that it was a timing problem or make sure we ran it at the same time. There would be one customer out and we'd go and check it. And then there was a bit of logic that they had that they didn't tell us about.</p><p>So we had to add that rule, and somehow we just got into this loop where we were always one customer out and it was always for a different reason, but the cost of doing that manual reconciliation was horrendous. Data Match allowed me to go, I can run that really quick. We'd grab the Excel data, I'd upload it, just dump it in like we do, get a new tile, compare it to the numbers we were producing consistently, and it would then highlight the one record that was out in, you know, a very short amount of time.</p><p>And then I could spend all the time trying to work with them about why they had this record that we didn't, or vice versa. So yeah, it just, again, took something that was many hours of frustration and made it minutes, which was great. The idea of layering those rules though, that's important, because otherwise you're just gonna brute force two very large tables, match everything, and that is gonna cost you a shit ton of compute, a shit ton of credits, a shit ton of tokens, depending on how your cloud analytics database vendor is charging you for that compute. 
</p><p><strong>Nigel</strong>: Yeah, so we poked a couple of reasonably well-known open source libraries when we first started, 'cause we're like, we're not gonna reinvent the wheel.</p><p>This seems to be a fairly solved thing. Surely there's just gonna be a package we can pull down, point it at two tables and hit go, and it will run. Technically there are, and we did start with one and it did work. Where we ran into walls was it needed quite a lot of configuration, so we effectively had to come up with a whole wrapper to pass it enough configuration to make it work.</p><p>And that was fine. That was more just a bit of app development to give it what it needs. But then some of the problem was, as is usually the case, it had been developed to run on a particular database. I think it was Postgres from memory, which is quite common, or it was MySQL. So it was heavily skewed towards a row storage database and how row storage databases work.</p><p>And so it was optimized for that. So the queries, when we came to run them on BigQuery, they ran and that was fine, but we didn't really get any of the benefits of BigQuery being a columnar database, with partition pruning and the like. So we played with it and played with it, and it got closer and closer.</p><p>In the end, we thought actually it'd be just quicker to write a template that would run on BigQuery and effectively do the same thing, but make it specific to BigQuery. And we did, and that's effectively where we got to, so we can optimize what we give to BigQuery. So it's very efficient and it runs very quickly and it doesn't really cost us anything, 'cause we know where the performance and cost savings are with BigQuery, and that's how we got to our pattern.</p><p>We effectively just took an open source one, found the strengths and weaknesses, and rolled a variant of it, uh, for us, for BigQuery. 
</p><p><strong>Shane</strong>: And I think the other thing is we only run this when we need to. So it's not baked in as a core trust rule for every movement of data through every layer for every tile.</p><p>Is it? </p><p><strong>Nigel</strong>: No. This is effectively an exception thing. This is when something goes wrong. This is somewhere where we can quickly go click and say, ah, there's 10 customers that aren't in this table where we'd expect them to be. So it's a really quick way, without having to go and customize something, because it already has all the tiles and a catalog loaded.</p><p>You can just go pick tile A, pick tile B, compare them, show me the differences, uh, and go away. So it takes away the first layer of overhead of thinking about it and gives you your answer in a report. Then you can go and do, as you said, the analysis, 'cause now I've got a list of customer IDs that haven't flowed.</p><p>I can grab one of those customer IDs and actually go specifically look for it. And that's a really quite simple proposition, because that nicely flows onto some of the other features we've built around tracing values. </p><p><strong>Shane</strong>: I remember when we did that data migration use case, remember, where we grabbed data from a legacy source system and then pushed it through us, and then made it available as an API so that the new vendor could migrate the old data into the new platform.</p><p>And we had that gentleman's agreement, which was we do all the logic to match the new business rules for the new system. So effectively they'd hit the API for the data, grab the data, load it straight into their system, and there'd be no transformations between those steps. So that we always knew when we needed to change the way the data looked, it sat with us.</p><p>And when we did that test run, all of a sudden, let's say customer again, we passed 
a hundred thousand customers out and only 80,000 turned up in their system. And we spent all that time manually trying to figure out why. And actually the answer was they had done some changes on their side between getting the data and loading it through their APIs, even though they said they wouldn't.</p><p>If I had just been able to take the final result that they'd loaded from their system and back flushed it in as a tile and then said, compare, that would've told me exactly which records didn't make it. And then yes, I would still have to talk to 'em about how come they didn't make it. But again, that would've saved hours if not days of proving we sent a hundred thousand, you loaded 80,</p><p>so we know it's somewhere between those steps, and it's nothing to do with everything to the left of us. It would've saved us time if we'd had it built back then. </p><p><strong>Nigel</strong>: Yeah, that's why, I guess, it's in the app. It's what I would call a support feature. It's something we don't use very often, but if we need to, it's there to quickly do something and we don't have to remember how do I data diff, what queries do I need to run?</p><p>Grab out some queries, change the table names and the key names in them to run them. Again, it's click, click, here's my report. You know, it's a small overhead, but when you're trying to do a whole lot of things, yeah, you're grateful for it. </p><p><strong>Shane</strong>: Yep. Hours to minutes. That's what I care about. Yep. </p><p><strong>Nigel</strong>: Excellent.</p><p><strong>Shane</strong>: Alright. 
I hope everybody has a simply magical day.</p>]]></content:encoded></item><item><title><![CDATA[Patterns to define the ROI of a data product with Nick Zervoudis]]></title><description><![CDATA[AgileData Podcast #74]]></description><link>https://agiledata.info/p/patterns-to-define-the-roi-of-a-data</link><guid isPermaLink="false">https://agiledata.info/p/patterns-to-define-the-roi-of-a-data</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Thu, 04 Sep 2025 06:24:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/J9aDUJu9d5s" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Join Shane Gibson as he chats with Nick Zervoudis about patterns that you can use to quickly and easily define the ROI of your data products, before you build them.</p><blockquote><p><strong><a href="https://agiledata.substack.com/i/172748370/listen">Listen</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/172748370/google-notebooklm-mindmap">View MindMap</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/172748370/google-notebooklm-briefing">Read AI Summary</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/172748370/transcript">Read Transcript</a></strong></p></blockquote><p></p><h2>Listen</h2><p>Listen on all good podcast hosts or over at:</p><p><a href="https://podcast.agiledata.io/e/patterns-to-define-the-roi-of-a-data-product-with-nick-zervoudis-episode-74/">https://podcast.agiledata.io/e/patterns-to-define-the-roi-of-a-data-product-with-nick-zervoudis-episode-74/</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://podcast.agiledata.io/e/patterns-to-define-the-roi-of-a-data-product-with-nick-zervoudis-episode-74/&quot;,&quot;text&quot;:&quot;Listen to the Agile Data Podcast Episode&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" 
href="https://podcast.agiledata.io/e/patterns-to-define-the-roi-of-a-data-product-with-nick-zervoudis-episode-74/"><span>Listen to the Agile Data Podcast Episode</span></a></p><blockquote><p><strong>Subscribe:</strong> <a href="https://podcasts.apple.com/nz/podcast/agiledata/id1456820781">Apple Podcast</a> | <a href="https://open.spotify.com/show/4wiQWj055HchKMxmYSKRIj">Spotify</a> | <a href="https://www.google.com/podcasts?feed=aHR0cHM6Ly9wb2RjYXN0LmFnaWxlZGF0YS5pby9mZWVkLnhtbA%3D%3D">Google Podcast</a> | <a href="https://music.amazon.com/podcasts/add0fc3f-ee5c-4227-bd28-35144d1bd9a6">Amazon Audible</a> | <a href="https://tunein.com/podcasts/Technology-Podcasts/AgileBI-p1214546/">TuneIn</a> | <a href="https://iheart.com/podcast/96630976">iHeartRadio</a> | <a href="https://player.fm/series/3347067">PlayerFM</a> | <a href="https://www.listennotes.com/podcasts/agiledata-agiledata-8ADKjli_fGx/">Listen Notes</a> | <a href="https://www.podchaser.com/podcasts/agiledata-822089">Podchaser</a> | <a href="https://www.deezer.com/en/show/5294327">Deezer</a> | <a href="https://podcastaddict.com/podcast/agiledata/4554760">Podcast Addict</a></p></blockquote><div id="youtube2-J9aDUJu9d5s" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;J9aDUJu9d5s&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/J9aDUJu9d5s?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>You can get in touch with Nick via <a href="https://www.linkedin.com/in/nzervoudis">LinkedIn</a> or over at:</p><div class="embedded-publication-wrap" data-attrs="{&quot;id&quot;:1085365,&quot;name&quot;:&quot;Value from Data &amp; 
AI&quot;,&quot;logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!DDQi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2088c5cd-8bfa-4950-8665-021c768e9e53_500x500.png&quot;,&quot;base_url&quot;:&quot;https://blog.valuefromdata.ai&quot;,&quot;hero_text&quot;:&quot;A newsletter about data &amp; AI product management&quot;,&quot;author_name&quot;:&quot;Nick Zervoudis&quot;,&quot;show_subscribe&quot;:true,&quot;logo_bg_color&quot;:&quot;#ffffff&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPublicationToDOMWithSubscribe"><div class="embedded-publication show-subscribe"><a class="embedded-publication-link-part" native="true" href="https://blog.valuefromdata.ai?utm_source=substack&amp;utm_campaign=publication_embed&amp;utm_medium=web"><img class="embedded-publication-logo" src="https://substackcdn.com/image/fetch/$s_!DDQi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2088c5cd-8bfa-4950-8665-021c768e9e53_500x500.png" width="56" height="56" style="background-color: rgb(255, 255, 255);"><span class="embedded-publication-name">Value from Data &amp; AI</span><div class="embedded-publication-hero-text">A newsletter about data &amp; AI product management</div><div class="embedded-publication-author-name">By Nick Zervoudis</div></a><form class="embedded-publication-subscribe" method="GET" action="https://blog.valuefromdata.ai/subscribe?"><input type="hidden" name="source" value="publication-embed"><input type="hidden" name="autoSubmit" value="true"><input type="email" class="email-input" name="email" placeholder="Type your email..."><input type="submit" class="button primary" value="Subscribe"></form></div></div><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? 
The Information Product Canvas helps you get clarity in 30 minutes or less?</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><h2>Google NotebookLM Mindmap </h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BD-9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BD-9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png 424w, https://substackcdn.com/image/fetch/$s_!BD-9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png 848w, https://substackcdn.com/image/fetch/$s_!BD-9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png 1272w, https://substackcdn.com/image/fetch/$s_!BD-9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!BD-9!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png" width="1200" height="3850.5494505494507" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:4672,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:8105045,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/172748370?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BD-9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png 424w, https://substackcdn.com/image/fetch/$s_!BD-9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png 848w, https://substackcdn.com/image/fetch/$s_!BD-9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png 1272w, 
https://substackcdn.com/image/fetch/$s_!BD-9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec1053a1-1cec-4670-8d7d-2d0addf0e0b6_6961x22336.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><h2>Google NoteBookLM Briefing</h2><h3>Briefing Document: Quantifying ROI for Data Products</h3><p><strong>Source:</strong> Excerpts from "AgileData 74 - Patterns to define the ROI of a data product with Nick Zervoudis" <strong>Speakers:</strong> Shane Gibson (Host), Nick Zervoudis (Guest, Independent Consultant and Trainer, Founder of Value From Data and 
AI)</p><h3><strong>1. Introduction: The Challenge of Quantifying Data ROI</strong></h3><p>The podcast episode highlights a common and significant problem in the data domain: the difficulty of quantifying the Return on Investment (ROI) for data projects and products. Organisations often struggle to move beyond identifying potential actions and outcomes from data to actually assigning a monetary value to those outcomes.</p><ul><li><p><strong>Core Problem:</strong> Stakeholders can describe the desired action and outcome (e.g., "reduce customer churn," "increase sales"), but "then you get crickets" when asked to quantify the financial impact.</p></li><li><p><strong>Nick Zervoudis' Background:</strong> Nick has spent his career in data, bridging the gap between technical and non-technical people, initially in consulting (Capgemini Invent) and then in data product management (PepsiCo, CK Delta). He's now an independent consultant, emphasising "value from data and AI." His experience spans internal and external data products, including data platforms, analytics products (dashboards, CSVs), and machine learning applications.</p></li></ul><h3><strong>2. The Shift to Data Product Thinking and Value-Centricity</strong></h3><p>The speakers note a growing, but still evolving, trend towards applying product management principles to data. This "data as a product" approach is seen as crucial for addressing the ROI challenge.</p><ul><li><p><strong>Product Thinking for Data:</strong> "It's interesting that there's this move in the last couple of years to bring product management thinking, or data as a product, that way of working from the product domain into the data domain. And I think it's been great. 
I think we've seen some real changes..."</p></li><li><p><strong>Value and Customer Centricity:</strong> While some companies have embraced this for 25 years, many are "laggards" slowly adopting a "value centric and customer centric way" of working with data.</p></li><li><p><strong>Moving Beyond "Feature Factories":</strong> Data teams often act as "feature factories" or "data request" fulfillers, building what stakeholders demand without understanding the underlying problem or value. This leads to unused dashboards and wasted effort.</p></li></ul><h3><strong>3. Key Strategy: Fermi Estimation for ROI (Back-of-the-Envelope Calculations)</strong></h3><p>A central theme is the importance of using quick, rough estimations &#8211; "Fermi estimations" &#8211; rather than striving for perfect precision at the outset.</p><ul><li><p><strong>Fermi Estimation:</strong> Named after Enrico Fermi, who made quick, order-of-magnitude estimations (e.g., the TNT equivalent of a nuclear blast). The goal is to get the "order of magnitude right," not exact numbers.</p></li><li><p><strong>Simplicity is Key:</strong> Data professionals often overcomplicate ROI calculations, thinking they need "exact numbers" and "the same rigor as a lot of the other data work." 
Instead, "a lot of the time all you need is a back of the envelope calculation."</p></li><li><p><strong>Example: Churn Reduction:</strong> If stakeholders want to reduce churn, even rough estimates of customer lifetime value, churn rate, and potential reduction (e.g., 10%) can quickly reveal if the opportunity is worth $500, $5,000, or $500,000. As a purely illustrative calculation with assumed figures: 10,000 customers at a 20% annual churn rate lose 2,000 customers a year, so a 10% reduction in churn retains 200 of them, and at an assumed lifetime value of $2,500 each that is roughly $500,000.</p></li><li><p><strong>Prioritisation Tool:</strong> These rough estimates allow for quick comparison of many opportunities (e.g., 15-150 ideas) to identify the most valuable ones, or those with the highest value per unit of effort.</p></li><li><p><strong>"S***** First Draft" Approach:</strong> Instead of asking stakeholders for a number on a blank sheet, provide them with a "s***** first draft" of your calculation. This makes it "so much easier for both technical and nontechnical stakeholders to basically critique something or provide me with an input I'm looking for if I give them the scaffolding."</p></li></ul><h3><strong>4. Collaborative Approach and Stakeholder Engagement</strong></h3><p>Quantifying ROI and building effective data products are not isolated tasks for the data team; they require deep collaboration with stakeholders.</p><ul><li><p><strong>Internal Consulting:</strong> Nick's experience in consulting and acting as an "internal consultant" for business units has taught him the value of asking "dumb questions" and drawing flowcharts with stakeholders to understand processes.</p></li><li><p><strong>Building Trust:</strong> Constant engagement and collaboration build "better relationships and trust." 
Disappearing for months after initial requirements gathering, only to return with a "homework" product, often leads to building the "wrong thing" or stakeholder resistance.</p></li><li><p><strong>Bringing Finance Along:</strong> Involving finance colleagues in the financial estimation part ensures that "when the business case shows up on their desk they don't go who is this what is this thing," but rather recognise it as something they "worked on together."</p></li></ul><h3><strong>5. Understanding the "Why" and the "Trifecta" of Benefits</strong></h3><p>Data professionals should push beyond simple data requests to understand the underlying business problem and how data can contribute to key financial benefits.</p><ul><li><p><strong>Beyond Data Requests:</strong> When a stakeholder requests "weekly sales data," Nick's response is, "no that that's not what we're here. That that's just a solution you have in mind." The data team should act like a doctor, probing symptoms to understand the actual problem.</p></li><li><p><strong>Focus on Business Outcomes:</strong> The goal is to understand "the business problem" and how data can influence "the business outcome."</p></li><li><p><strong>The "Trifecta" of Value:</strong> Most benefits can be categorised into:</p></li></ul><ol><li><p><strong>Cost Saving:</strong> Reducing operational expenses.</p></li><li><p><strong>Revenue Improvement:</strong> Generating more sales or income.</p></li><li><p><strong>Risk Reduction:</strong> Mitigating potential financial or operational risks.</p></li></ol><ul><li><p><strong>Probing Questions:</strong> By asking "what action are you going to take?" and "what outcome do you think you're going to deliver?", data professionals can uncover the true need.</p></li></ul><h3><strong>6. 
Metric Trees: Visualising Business Relationships and Sensitivity</strong></h3><p>Metric trees are presented as a valuable tool for understanding the interconnectedness of business inputs and outputs, enabling more informed decision-making.</p><ul><li><p><strong>Understanding Relationships:</strong> Metric trees help to "understand the relationship between the different inputs in my business and how these translate into outputs."</p></li><li><p><strong>Business Sensitivity:</strong> They reveal "what is my business's sensitivity for those different things." For example, a tree can show how a 10% increase in mailing list subscribers cascades through click-through rates, conversion rates, and profitability.</p></li><li><p><strong>Simplicity for Stakeholders:</strong> While the underlying calculations might be complex, the visual representation and the "output that a business stakeholder sees has to be super simple so that they can also understand this whole concept of making data-driven decisions."</p></li><li><p><strong>Avoiding Over-Engineering:</strong> Data professionals' tendency to seek extreme accuracy (e.g., "spend 3 months grabbing it, modeling it, getting the actual abandonment rate") can delay value. Metric trees support the "light touches" of the discovery/ideation phase.</p></li></ul><h3><strong>7. 
Measuring Success and the Measurement &amp; Evaluation Workstream</strong></h3><p>Proving ROI requires a deliberate plan for measuring the impact of data products, ideally integrated from the project's start.</p><ul><li><p><strong>Pre-emptive Measurement:</strong> "It's so much easier to actually figure out the ROI of something if we've done this exercise that goes, what is the business outcome we're going to be influencing here?"</p></li><li><p><strong>Dedicated Workstream:</strong> For significant projects, Nick recommends that teams "insist that there needs to be a measurement and evaluation workstream as part of the project."</p></li><li><p><strong>Defining Success Metrics:</strong> This workstream defines "what are the metrics for success." If the necessary data isn't readily available (e.g., in a metric tree), "we need to set up some kind of measurement for this new thing."</p></li><li><p><strong>Beyond Usage Metrics:</strong> Simply measuring dashboard usage (e.g., "opened and run") is often insufficient. Qualitative feedback (interviews, surveys) is "so much richer" than viewing time or open rates.</p></li><li><p><strong>Linking to Action:</strong> True value comes from enabling "better decisions" and influencing specific actions. Dashboards should be integrated into workflows (e.g., "every Monday morning I open this dashboard... and I make one, two, three actions off the back of it").</p></li><li><p><strong>Deliberate Data Collection:</strong> The "big data, data lake approach" often fails because crucial data points are missing. Being deliberate about the business problem helps identify necessary data points, and if they don't exist, "we need to create those data points. It's not a nice to have, it's a must-have condition."</p></li></ul><h3><strong>8. 
Balancing Foundational Work with Value Delivery</strong></h3><p>The discussion touches on the age-old tension between building robust data foundations and delivering immediate business value.</p><ul><li><p><strong>Avoid "Platform First" Pitfalls:</strong> "We spend two years doing a big migration promising the business that after the big migration we'll finally be able to deliver value and what do you know two years later co gets fired new co comes in first thing they do they want to rebuild the platform..." This is a common and detrimental cycle.</p></li><li><p><strong>Bundle Value with Foundations:</strong> It's crucial to "bundle any kind of let's call it technical debt or platform investment or foundational investment together with something that's going to deliver value to the business and deliver in minimum increments of value."</p></li><li><p><strong>Intentional Technical Debt:</strong> Technical debt isn't inherently bad; it's "borrowed from our future selves... in order to usually test something." There's "no point building something super robust and scalable if we don't know it's worth scaling in the first place."</p></li></ul><h3><strong>9. Quantifying the "I" (Investment) and Prioritisation</strong></h3><p>Understanding the cost side of ROI is equally important, particularly for internal prioritisation.</p><ul><li><p><strong>Beyond Value:</strong> ROI requires both value and investment. The "I part is basically the cost." This includes incremental costs (additional hours, contractors, compute) rather than sunk fixed costs.</p></li><li><p><strong>Internal Accountability:</strong> Data teams should know their operating costs and aim to "be delivering more than that, like a multiple." 
(e.g., "If we're costing 100K a week, then any given week, we should be delivering at least 110 if not 200K back to the business").</p></li><li><p><strong>Using Financial Language for Prioritisation:</strong> When prioritising, use "financial numbers" to justify decisions to stakeholders. For example, "your thing is going to cost the business 200K, but based on our projections, it's only going to make us an extra 100K."</p></li><li><p><strong>Data Product Manager's Role:</strong> While committees often prioritise based on "who has got the biggest voice," a data product manager should ideally make the final decision based on value, involving stakeholders in the process.</p></li></ul><h3><strong>10. Data as a Value Driver, Not Just a Cost Centre</strong></h3><p>The speakers challenge the notion that data teams (and other shared services like HR and IT) are solely cost centres.</p><ul><li><p><strong>Opportunity Cost:</strong> Treating departments as cost centres can make "a lot of things become invisible to the business," particularly opportunity costs (e.g., the cost of using inefficient old software).</p></li><li><p><strong>Innovating and Unlocking Value:</strong> Data is a "more nascent profession" that helps "innovate," "improve the quality of decisions," "unlock new streams of revenue," and "build new products," especially with the rise of AI.</p></li><li><p><strong>Avoiding Commoditisation:</strong> Nick doesn't want the data team to act like a cost centre because "then we're just going to default to doing bare minimum low value adding tasks that can be commoditized."</p></li></ul><h3><strong>11. Qualities of a Good Data Product Manager</strong></h3><p>The episode concludes by identifying key aptitudes for successful data product managers.</p><ul><li><p><strong>Ownership:</strong> The most critical quality: being "invested in the outcomes you're trying to enable," not just completing tasks. 
Good PMs "fill in the gaps" across technical, marketing, or financial analysis areas.</p></li><li><p><strong>Curiosity:</strong> A "sense of curiosity to learn more about your users, about your business, about the technical underpinnings of your product, about what the data actually shows and means." This prevents becoming a mere "information sifter" and enables proactive, strategic impact.</p></li><li><p><strong>Problem-First Mindset:</strong> "Be inquisitive around what the problem is understand the problem itself before you worry about the solution." This aligns with "product thinking" and "jobs to be done" frameworks.</p></li></ul><h3><strong>Key Takeaways for Action:</strong></h3><ol><li><p><strong>Embrace Fermi Estimations:</strong> Don't strive for perfect accuracy upfront. Use quick, "back-of-the-envelope" calculations to get an order of magnitude for ROI, especially in the discovery and ideation phases.</p></li><li><p><strong>Collaborate Extensively:</strong> Involve stakeholders (including finance) from the start. 
Share "s***** first drafts" of calculations and co-create understanding of processes and value.</p></li><li><p><strong>Focus on Business Outcomes:</strong> Always ask "why" and link data requests to specific actions, measurable outcomes, and the "trifecta" of cost saving, revenue improvement, or risk reduction.</p></li><li><p><strong>Implement Measurement &amp; Evaluation:</strong> For any significant data product, build a measurement and evaluation workstream into the project plan, defining success metrics and how they will be tracked.</p></li><li><p><strong>Balance Foundations with Value:</strong> Bundle foundational data work with initiatives that deliver tangible, incremental business value, avoiding lengthy "platform-first" projects.</p></li><li><p><strong>Quantify Investment:</strong> Understand and communicate the cost of data initiatives alongside their potential value to inform prioritisation decisions.</p></li><li><p><strong>Cultivate Ownership &amp; Curiosity:</strong> For data professionals, especially those in product roles, these aptitudes are crucial for understanding complex problems and driving impactful solutions.</p></li></ol><p></p><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? The Information Product Canvas helps you get clarity in 30 minutes or less?</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><p></p><h2>Transcript</h2><p><strong>Shane</strong>: Welcome to the Agile Data Podcast. I'm Shane Gibson.</p><p><strong>Nick</strong>: And I'm Nick Zervoudis.</p><p><strong>Shane</strong>: Hey, Nick. Thanks for coming on the show. 
Today we are going to talk about the thing that I never actually see, which is quantifying return on investment from data teams. Before we rip into that though, why don't you give the audience a bit of background about yourself.</p><p><strong>Nick</strong>: Yeah. And it's great to be here. So I'm Nick. I've worked my whole career in data, but always in that kind of squishy role that's somewhere between the purely technical people and the people that haven't looked at an equation since about 25 years ago. At first, that was in consulting, first for a boutique consultancy specialized in data science.</p><p>Then for Capgemini Invent. Then I made the leap to product management. I worked for PepsiCo and then for CK Delta, a subsidiary of a large conglomerate owning a lot of infrastructure businesses around the world, and now I'm on my own. I made the leap to be an independent consultant and trainer, specialized around data product management, because I see so many organizations still make the same really basic mistakes around how should we get value out of our data investments, which is also why I've named my company Value from Data and AI.</p><p>That's the short story about me. I'm someone that studied politics, but somehow has been in data his whole career. And even though I never considered myself technical, I've now gotten to the point where I'm often introduced, at least by the business stakeholders, as, oh, Nick's the technical guy.</p><p>He understands the data side. Except when I then talk to the data scientists. I don't think they think I'm a clueless person that doesn't understand anything they do, but that's how it feels compared to them. So here we are.</p><p><strong>Shane</strong>: Yeah, I'm a big fan of Hitchhiker's Guide to the Galaxy, so I talk about people like that as Babel fish: depending on which part of the organization you're talking to, you're interpreting for the other part that's not in the room normally. 
When you were in product management for those companies, was it physical products or digital products?</p><p>What part of product management were you in?</p><p><strong>Nick</strong>: Always data products. And here when I say data product I'm mishmashing a few different types. Like, you've got data platforms that are enabling other use cases. They're not necessarily delivering value end-to-end themselves. Data and analytics products, things like a dashboard or shipping out CSVs to end customers.</p><p>All the way to using machine learning and LLMs. And that's been the types of products I've managed my whole career, including when I was in consulting and was an undercover data PM and we didn't call it data PM, 'cause that title didn't exist and we didn't even know that product management was a thing.</p><p>And then the other thing that's worth calling out is that I've also dabbled in both internal-facing products, where your customers are colleagues of yours in different departments, and external data products, where we're literally selling data sets or selling dashboard access to customers that are interested in buying that data.</p><p><strong>Shane</strong>: I think it's interesting that there's this move in the last couple of years to bring product management thinking, or data as a product, that way of working, from the product domain into the data domain. And I think it's been great. I think we've seen some real changes to the way we work with data and the processes we use and everything we do. Still quite lacking. There's still a lot to learn and a lot to bring to the table. And one of the things that I wanted to talk about is this idea of return on investment, because when I'm working with organizations, it's hard enough to get them to the stage where they'll determine the potential action that will be taken from the data or information that's gonna be delivered and what the outcome to the organization might be from that action. 
But it's very rare, if ever, that I see them make the next jump to actually quantify the return on investment for that outcome. They may say, I need some data. I wanna understand all the customers that haven't bought something in the last six months. And you say, great. If I answer that question with data and information, what are you gonna do? Ah, we know that they're likely to leave, so we're gonna go and do a churn offer, send out some emails, give them a discount. Okay, if that churn campaign is successful, what's the outcome? Oh, we'll retain some customers, we'll increase our margin, get some more sales. And you go, great. If you had to quantify that as a number, what would it be? And then you get crickets. And maybe it's because they don't actually have the data to input to help them make that decision and understand what that would be. But it just seems to stop there. So how have you dealt with that, this idea of return on investment? How do you actually approach it?</p><p><strong>Nick</strong>: Yeah, it's a great question and I wanna answer it by going on a bit of a tangent first, because I wanna comment on even what you said, that, oh, in the last couple of years, companies are waking up to the need for product thinking. And it's true, a lot more companies are doing it. And then there's other companies or other data teams that have been doing it for the last 25 years.</p><p>So I feel like there's a very skewed distribution in terms of which companies are doing things in that value-centric and customer-centric way versus all the laggards, in the same way that if you look at how many people are using LLMs in production today, it's a very small number of companies around the world.</p><p>If you looked at it two years ago, it was an even smaller number. And we're in that stage where the laggards and even the people in the middle of the adoption curve are slowly getting into it. 
And I think that's relevant because, for me, a lot of these, let's say, best practices, and now how do we figure out the ROI, it's not something that's super complicated or cutting edge or, oh, you need to have all your ducks in a row, all your data in a beautiful metrics tree, before you can start asking those questions.</p><p>Because let's jump into your example, right? I thought you'd give me a harder case to start with. From what you've said, it sounds pretty straightforward.</p><p><strong>Shane</strong>: Let's start with the simple ones.</p><p>If we see how you can apply the patterns to a simple scenario, then it gives you the sense of the pattern. And then from there we can go into the horrible edge cases where we know it's a little bit harder. But again, I still find people struggle with the simple cases.</p><p><strong>Nick</strong>: So here's why, and I'm not just saying it to be smug or anything, I think the example you gave me is straightforward: because the link between the data request, let's say, and the business outcome that the customer is looking to achieve is pretty straightforward. We wanna reduce our customer churn. That means that even roughly, I'm sure, even if their BI is not very good and there's some data quality issues, they'll have some idea of, okay, how many customers are churning today? What is our average order value? What is our transaction frequency?</p><p>And therefore, based off those numbers, I know my customer lifetime value roughly. I know my number of churned customers roughly, and I can go, okay, what if I can reduce that by 10%? What is that worth? Is that worth $500? Is it $5,000? Is it $500,000? 
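</p><p><em>Worked example (an illustrative sketch, not part of the conversation): the back-of-the-envelope sum Nick describes comes down to a few multiplications. Every number below is an invented placeholder for a stakeholder to correct, and the function names are hypothetical.</em></p><pre><code class="language-python">import math


def lifetime_value(avg_order_value, orders_per_year, years_retained):
    """Rough customer lifetime value: order value x frequency x lifespan."""
    return avg_order_value * orders_per_year * years_retained


def churn_reduction_value(churned_per_year, reduction, value_per_customer):
    """Value of keeping a share of the customers who would have churned."""
    return churned_per_year * reduction * value_per_customer


def order_of_magnitude(amount):
    """Round down to the nearest power of ten, e.g. 120000 becomes 100000."""
    return 10 ** math.floor(math.log10(amount))


# Assumed inputs: none of these figures come from the episode.
clv = lifetime_value(avg_order_value=50, orders_per_year=4, years_retained=3)
value = churn_reduction_value(churned_per_year=2000, reduction=0.10,
                              value_per_customer=clv)

print(f"Lifetime value per customer: ${clv:,.0f}")
print(f"Value of a 10% churn reduction: ${value:,.0f}")
print(f"Order of magnitude: ${order_of_magnitude(value):,.0f}")
</code></pre><p><em>With these made-up inputs the estimate comes out at about $120,000, so closer to the $500,000 end of the range than to $500. The exact figure matters less than which bucket it falls in.</em></p><p>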
For me, that's the starting point.</p><p>And I think the mistake we make in terms of these kind of ROI and value estimations as data professionals is we think it has to be something that we're gonna calculate with our Python notebook, that we need the exact numbers, that it needs to be precise, and it needs to have the same rigor as a lot of the other data work that we do, when actually a lot of the time all you need is a back-of-the-envelope calculation.</p><p>I like using the term Fermi estimation, named after Enrico Fermi, who allegedly, just before the Trinity nuclear test, whipped out on a piece of paper his estimation of what's the TNT equivalent of the blast that they were about to witness. I think he estimated 10,000 and it was 24,000. I've probably missed the exact numbers. The point is he got the order of magnitude right. He was just off by a factor of two. And for these ROI estimations, it's the same thing. I'm not looking to understand, is this gonna make us 600K or 650K or 700K?</p><p>It's more, okay, this opportunity, roughly speaking, is in the 500K to 1 million range, and this other one is in the 10K to 20K range. And this optimization one of my engineers wants to do to bring down costs is gonna save us $500 a year. And so if I look at those three examples, it becomes very easy for me to go, okay, roughly speaking, which of these is more valuable?</p><p>Or maybe more valuable per unit of effort we're gonna spend. 'Cause maybe the optimization will take one day and the churn modeling thing will take, I don't know, one year. And so that is a very good starting point. And then the other thing I wanted to comment on, 'cause you say, oh, I ask my stakeholders, how do I go about doing this, or give us a number, and they're like, oh, I don't know. Do you know what makes it so much easier for them to give you a number? 
If you make that back-of-the-envelope calculation, super basic, you sketch it out on a slide, on an Excel sheet, whatever, and then you show it to them. You're sharing your screen or you show the piece of paper and then they go, oh no, that's not right because this is not our lifetime value.</p><p>Or, oh no, that's not right because of whatever other assumption you've made that's wrong. I found that it's so much easier for both technical and non-technical stakeholders to basically critique something or provide me with an input I'm looking for if I give them the scaffolding. I'm like, here's the shitty first draft.</p><p>Now you tell me what's wrong. Instead of, hey, here's a blank sheet of paper, please fill it in. Even if the blank sheet of paper is like a template for them to fill in, it's still harder to get an answer from them compared to, here's my wrong assumptions, now correct them.</p><p>So much easier if I go, hey, here's the rough Fermi estimation I've done, where I've assumed your lifetime value is this order value times this order frequency, and then that your churn rate is this much and that your growth is gonna be like that.</p><p>And then it becomes real easy for them to start giving me an answer. Because you said, oh, I asked them, how do we quantify that? And they don't know. I think a lot of the time it's that maybe when we ask, especially a non-data-savvy stakeholder, that kind of question, they might think that we're asking for something much bigger. Oh, these data guys, they're here to do smart analytics and statistics stuff with their fancy PhDs and computer science degrees and whatnot. It's, no, actually, some of these models are super simple. And for me, what I love about this approach is that it also means that it's so easy to do really quickly for a large number of opportunities, right?</p><p>Consider, let's say we've got 15 ideas or 15 requests or even 150. 
And if you can spend just a few seconds for each one to figure out, okay, roughly speaking, this is gonna save three hours a week from this employee who on average gets paid $50 an hour. So the value of this automation is this much, versus this is a cost saving that'll save this much.</p><p>And sometimes you might be missing one of the variables, right? Like in the churn model, it's, okay, will this reduce churn rate by 5%, 10%, 1%? And you can just put a number that feels reasonable. And what I found is usually that rough, 30-second estimation is almost exactly the same as the two-week version of that estimation, where we build the prototype churn model and we estimate what it's gonna be when it's in production and we test a bunch of different data sets.</p><p>And then all those times this happened in that order, I felt a bit silly, when my manager may have been like, look, just plug in 10% uplift. And I was like, no, but where did you get that number from? What if it's not 10%? And then I come back two weeks later, 'cause I was insistent that this is not a case where we can just do a Fermi estimation.</p><p>And actually it was pretty much the same number. I was like, damn, okay. Not necessarily wasted two weeks, but wasted two weeks.</p><p><strong>Shane</strong>: I agree with that. I mean, one of the things I say is, when we work on the canvases, we always get asked how long it is gonna take to deliver that product. And yeah, typically we are doing this discovery really early. So any detailed estimation is waste because, let's face it, humans are really bad at estimating anyway. But even if we weren't, we're at the discovery stage, we're still ideating. Like you said, there could be a hundred ideas. Going and spending all that time estimating how long each one of those is gonna take is waste at this level. We haven't prioritized that. 
We haven't said this is the top five, so let's not worry about it.</p><p>Let's just do a quick t-shirt sizing, pull a number out of your bum. The number will actually be quite good 'cause we're good at guessing. And then move on, and we can do more detailed estimates if we have to at a later stage. So I'm with you on that. Do it light, do it quick and use it where the value is. But to do that is quite a skill. So if I think about it, you have to understand how data works, how, effectively, metrics work, and potentially metrics trees like you mentioned. And we have to understand the organizational processes. And we have to be able to combine both of those to be able to quickly articulate how lifetime value works or how a churn number will work and what the impact of that is.</p><p>And that is a skill, and it's not a skill that I see a lot. What are your thoughts? Do you see it a lot?</p><p><strong>Nick</strong>: I think for sure, being comfortable doing it takes some practice, like anything else, and it's gonna feel harder, at least mentally. But for me, it comes naturally because I've spent most of my career either in consulting or in data teams where we acted like internal consultants for our different business units.</p><p>So I'm very used to not really knowing very much about a domain, and asking a lot of dumb questions to my stakeholders about, hey, how does this operation actually work? I draw the flow chart as we go. Maybe I'm even sharing my screen as I'm drawing out a process that a stakeholder's describing and they go, oh no, you forgot about this.</p><p>Or no, I forgot to mention that step. Or actually, this part is more complicated, 'cause when A happens, we do B, and when C happens, we do D, or whatever else. So the point I'm trying to make here is this is not an exercise that a data professional needs to do on their own. 
It is a collaborative exercise we need to do with our stakeholders, for a couple of reasons.</p><p>Number one, because like I alluded to, we don't have the full picture. Even worse, we might think we have it based on the explanation we were given. And actually it turns out our stakeholders didn't mention a whole bunch of other details that were really important. But also, secondly, for me, it's also about building better relationships and trust with our stakeholders. I've seen this happen so many times, in data or tech more generally, where the tech team comes, they have some kind of discovery workshop, and then they disappear for six months and then they show up with the thing they've built and they go, here you go. Please test it for us.</p><p>And one, what happens then is you often end up having built the wrong thing. Because again, the picture you got during that initial requirements gathering exercise was incomplete. But also, two, let's say you actually nailed it, right? You built exactly what was needed. You estimated the value potential perfectly.</p><p>Then that stakeholder turns around and goes, who are you? Who are you to tell me that I should be using this dashboard now? I've been doing this job the way I've been doing it with Excel for the last 20 years. So it's also about bringing our stakeholders with us on the journey, and that's just as true about the financial estimation parts. Maybe we need to bring our finance colleagues along the journey, or the client's finance colleagues, so that when the business case shows up on their desks, they don't go, who is this? What is this thing that these consultants or that the data team wants to do again?</p><p>They go, oh yeah, this is the thing that Shane and I worked on together, when I gave them the numbers from the budget to plug into the business case. And then I learned what a, I don't know, random forest algorithm is. 
'Cause I was curious. And it becomes so much easier to work together. And the same is true on the user-facing side.</p><p>I know it's a slightly different topic, but for me it's actually conceptually the same thing. It's so much better to build together with our customers, be they external or internal, than to do something on our own and then show up and say, here's my homework. And then you find out that the homework is wrong or that they just don't like the font you've used, 'cause that's not the font they're used to.</p><p><strong>Shane</strong>: I agree. I think that constant feedback helps us iterate and figure out where we've heard wrong or where they forgot to tell us something. Or as you said, if you wait six months, something's gonna change anyway. That may have been the most burning problem six months ago, but there's a real big chance that when we go back with the answer six months later, they've moved on.</p><p>They've fixed it with Excel, and there's something else far more important, so it's no longer top of mind. So that idea of constant engagement and collaboration has so much value. And one of the things I think you were just talking about is this, back to old school, almost business process mapping, this idea of nodes and links, and saying to the organization, how does the process work? Tell me, something happens and then what's next? And let me draw a circle and say, this is the thing happening, and here's a line to the next thing happening. And from there we can start identifying those measures that can form that return on investment. And one of the things I think you talked about was looking at it from cost saving, revenue improvement. And there's always a risk one as well. That's the trifecta that you can use: are we gonna save money? Are we gonna make money? Are we gonna reduce risk?</p><p>That's the three that they always tend to come back to, in my view. 
So I think, yeah, combining all those patterns together is really valuable.</p><p><strong>Nick</strong>: A hundred percent. And for me, exactly that trifecta is the starting point of, what is the benefit we're trying to influence at the end of the day? Because when someone says, hey, I need a data set showing weekly sales, I'm like, no, that's not what we're here for. That's just a solution you have in mind.</p><p>You're asking the doctor to prescribe you specific medication, and then just like the doctor goes, okay, I understand, you went on WebMD and you've self-diagnosed that you have this, but let's just check your symptoms to be safe. For me it's the same thing, where we go, okay, let's understand the business problem.</p><p>And this is, for me, a super simple thing, but one that so many data professionals get wrong. They see any request that comes from stakeholders as an order, a command. And a lot of the time, one, it absolutely is not like that. And the person making the request is clueless about what they need and they're coming to you for help, but maybe they've also not learned the right way to do that, to say, hey, our sales are down.</p><p>I need to figure out why. I have a hypothesis that maybe it's because one of the regions is underperforming. That's why my request on the Jira ticket said, give me a sales breakdown by region. But then when they ask that, you can start asking more probing questions to figure out, okay, what is this hypothesis exactly?</p><p>How can we test it? Maybe actually there's an element of statistical know-how that this test is gonna require that a dashboard with a line chart is not gonna solve for the stakeholder. 
Maybe we can test alternative hypotheses in parallel, which, if you know how to write code and use different models, is maybe trivial. Versus you make the dashboard, it spits out the line chart, the stakeholder looks at it, doesn't see an obvious pattern, or thinks this is gonna be too hard to make sense of, and they give up on it.</p><p>And you've ended up with dashboard number 952 that no one in the business is ever using.</p><p><strong>Shane</strong>: There's a couple of things embedded in there. So one is, if we look at product teams in the software industry or software domain, often they talk about feature factories, where somebody comes and demands a feature and doesn't tell you what it's gonna be used for. Our equivalent is a data request. Here's a data request, give me the data. Don't ask me what I want it for. I still blame Jira for both, 'cause basically both of those problems are managed in stupid Jira ticketing systems. I think the other thing you mentioned is a doctor saying you've gone and Googled WebMD or whatever. I think we're hitting that new world, right? Where actually our stakeholders are gonna LLM the answer and come to us with a data request that's based on an AI bot telling them what the answer is. So it's gonna get worse before it gets better. But one thing you mentioned was this idea of metric trees. And I'm old; people mention now that I bring up ghosts of data past. And I remember a while ago we were doing balanced scorecards, cause and effect.</p><p>This idea of saying, if we have a bunch of things we measure and we understand the business processes, then effectively we can see some causation or relationships between those metrics, and where that causation or those relationships have some value, we can infer some things that help us make better decisions. What's your view on that? Are metric trees another 
good thing, but just a reincarnation of things we've done before, or is it something different?</p><p><strong>Nick</strong>: I guess, 'cause one, I've not been around this game for as long, and two, I wouldn't call myself someone that's gone super deep into metrics trees and understands them deeply, I'm probably gonna give you a very layperson's answer. But it's that, yeah, a lot of the core concepts are not new. 'Cause for me, fundamentally it's just about saying, I need to understand the relationship between the different inputs in my business and how these translate into outputs, and to basically figure out what is my business's sensitivity to those different things. So for example, let's say I've got three potential initiatives I'm considering.</p><p>One is I wanna grow my mailing list. The other is I wanna improve my checkout rate, and the other is I wanna optimize my pricing to make more profit per sale. If you just tell me those three things, unless I have a deeply intuitive metrics tree in my head, I'm not, as the business owner, gonna automatically know which one of these is gonna make the biggest difference for my business. Whereas if I have a, forget about the fancy terms, if I just know that flow of, okay, how many people do we have in our mailing list? What is our click-through rate every time we send an email? And then of that, how many people that land on our website actually go and convert?</p><p>And then what is the average profitability of our products? If I have this information somewhere, and if I know those relationships, then I can plug in different assumptions and go, okay, what if I were to increase my mailing list by 10%? How would that cascade through all the other numbers?</p><p>Okay. If you get 10% more people in the mailing list, because you actually have a super low click-through rate and most of your traffic comes from other sources, your revenue would only go up by 0.1%. 
Whereas if you were to improve checkout rate, because you're getting traffic from all these other sources but your basket abandonment rate is quite high, actually, that would translate into quite a bit more revenue.</p><p>Then lastly, if you know the average margin you're making per product, if you were able to increase prices for certain strategic items, actually that on its own would lead to more profit than the other two things combined, times 10. Made-up example. But for me it's more about the high school economics, not even high school, more like middle school mathematics that you might have to do.</p><p>When we did the few exercises around Fermi estimation with my students, as they were doing it, I was like, oh my God, this feels like I'm an elementary school teacher. 'Cause I'm just asking these guys to, literally, they're doing multiplication. There's nothing more to it. I've given them all the assumptions and all they have to do is pick which numbers to multiply where.</p><p>And that's how I want the data team to show it to their business stakeholders. Even if there's a lot of complexity behind how we calculate basket abandonment rate or margin or whatever else. At the end of the day, the output, especially the output that a business stakeholder sees, has to be super simple so that they can also understand this whole concept of making data-driven decisions.</p><p>That's ultimately what we're trying to do. I think metrics trees are great because even just visually, they help us understand that relationship between different metrics instead of, here's our metrics report. It has multiple tabs and lots of charts, and you basically can't really figure out how to mentally connect the dots between all these things unless you have that metrics tree in your head.</p><p><strong>Shane</strong>: What I can see is data people, we love the detail. So as soon as you say cart abandonment rate, I can imagine a data person going, we can't just use an estimate for that. 
'Cause we know the data's there, so we'll just go and spend three months grabbing it, modeling it, getting the actual abandonment rate.</p><p>And then that'll make our Fermi model so much more accurate when we try and determine the ROI. And I can see that logic, but then it's about time to market. It's about, again, we don't know that that is the most important information or data product to build next. We're trying to use this technique in the discovery and ideation phase, not further down, and therefore light touches have value. But do you find that? Do you find that data people naturally want to go and grab the data, do a whole lot of work, and make it as accurate as possible?</p><p><strong>Nick</strong>: For sure. And I fall into that trap too sometimes, because I've been among data people for long enough that now I do the same. But look, at the end of the day, there's gonna be some things where actually knowing the real value is quite important. Either because your estimate might be super off, or because, actually, if we're doing these small optimization things where we're just trying to increase one thing by 1%, 'cause that 1% is still worth a big number, being off by a couple of percentage points could make the difference between something being a super profitable investment and something setting money on fire.</p><p>But for me the key thing here about both product thinking in data, building data products instead of data projects, and also metrics trees, which in a way are just a type and collection of data products, is that we don't go, oh, there's a new project we're doing now. The team needs to go and figure out the basket abandonment rate and figure out all these other metrics, 'cause we're trying to do this one-off kind of project. And even if it's not a one-off project, but it's a specific use case, we shouldn't have to build all these data models. And then you have 10,000 versions of what lifetime value looks like in your business. 
You go, we will invest a lot of our time as a data team, exactly because it is complicated and a few of these things will take many weeks to do, into building out these key metrics. And so we have our collection of metrics that form into a tree, and basically you then go, anytime I want to test a new hypothesis, I wanna explore something, I can rely on these core, standard, reusable data assets, instead of going, everything is a new project and everything is a new dbt model and everything is a new Tableau dashboard and whatever else. For me, that's where the power of it lies. It's a slightly separate topic to the ROI estimation part. I still think a lot of the time you're better off starting with your Fermi estimates, and then of course if you have the real data to plug into one of those assumptions, great.</p><p>But if you don't, don't wait until you do. Unless it's a really high-stakes investment. If you are gonna be committing to a seven-figure investment in your business, then yeah, maybe back of the envelope isn't good enough to get fully started. But if, as you said earlier, we're just trying to figure out what are the top four opportunities right now out of the 15 we have, then once we've zeroed in on the top four based on our rough and ready Fermi estimates, we can do more homework.</p><p>Then we can send the data team to go and do some more calculations if needed.</p><p><strong>Shane</strong>: I can actually imagine that as you do the Fermi calculations, you're gonna find there are certain inputs that you use on a regular basis. Especially if you are working in a specific domain. So let's say the domain boundaries are all based on organizational hierarchy or maybe business process.</p><p>So we have a sales organization that deals with the sales side of things, and they're the highest value part of the organization that we've been asked to help right now. Because we are bound within that domain, 
I can imagine that the inputs we use for the estimate, that ROI calculation, are sometimes gonna be reusable. We're gonna go, ah, maybe it's lifetime value. Maybe it's the funnel and how many people we're converting from suspect to prospect. There's gonna be this thing where we're constantly using that as part of our calculation. So we can say, because we're using it so often, it seems to have value.</p><p>Therefore doing a little bit more work on what the actual number looks like is gonna be really useful for us going forward. So again, we can use it as a way of figuring out what's the valuable thing to build, effectively.</p><p><strong>Nick</strong>: For sure, and I wanna make two comments about this. First is, it's not like a lot of these metrics we are calculating specifically to work out ROI. These are also gonna be the metrics that are part of the data product or related to adjacent data products. If, for example, we don't know our cost of fulfillment, but we're doing a use case that relates to reducing our cost of fulfillment, getting to the precise number of what our fulfillment cost is, is also part of the business case. So we start with our rough estimate, 'cause we have an average provided by finance, and then when we wanna break it down by product, that's part of the optimization project. That means that next time round we actually have a data product that is our cost of fulfillment per SKU, as an example.</p><p><strong>Shane</strong>: The other thing I can imagine people wanting to do then is go into the detail to say, if this is the return on investment that we've estimated for that information product, actually we should measure it at the end to see whether we actually did deliver it. And that makes sense. But then we come to a whole lot of complexity, because there's a whole lot of other factors that will influence whether we are getting a reduction in churn or whether we are reducing the cost to serve or the cost to produce. 
And it's really hard to isolate those other factors, to say it was this one thing we did. Now my view is I don't care. If we do some effort and it looks like it's moving the lever, as long as we keep doing effort and the lever keeps moving, we are always getting to a better place. But how do you find it? Do you find that some organizations or some teams want to then prove the ROI six months down the track?</p><p><strong>Nick</strong>: For sure. And I agree that sometimes the rigor needs to be more than a simple pre-post analysis, especially when you've got dozens of confounding factors or if you're at a place where actually you're carrying out many experiments in parallel. For me, that's one of those things.</p><p>It's, let's cross that bridge when we get to it. We don't need to assume that by default we need to carry out randomized controlled trials and super rigorous A/B tests in order to figure out what's moving the needle. But then the other thing I wanna call out is that everything you've just mentioned is something that typically does not get talked about at the start of one of these initiatives.</p><p>And so the data team, 'cause they've received a request in Jira, they go and fulfill it. There's probably a million back and forths, because the request that was initially made, actually, that's not exactly what the stakeholder wanted, but you built it. Now that they've seen it, they realize they wanted something different.</p><p>Anyway, eventually you get to your done state and then you go, oh man, it would be great to know the ROI of this thing we've built for this stakeholder. And then you realize that it's unclear what the definition of success is. 
You've not done that exercise to go, okay, what is the exact decision that's gonna be made by which person and how will we know that they're making this decision because of this data?</p><p>And for me it's a little bit like how I'd have classmates in high school who would write their essay first and then go look for citations to prove it. And you know what? In high school that worked But a little bit later down the line, it, it can't, 'cause you need to build up the essay, so to speak, off the back of citations.</p><p>Unless if you're just making things up. And similarly, I find it's so much easier to actually figure out the ROI of something, if we've done this exercise that goes, what is the business outcome we're gonna be influencing here? Because when you ask that question, you very often realize that, okay, the thing we thought we needed to build actually is a little bit different, or how do we make sure that we embed this decision support information into the decision making process of a stakeholder? How do we then measure or have some even approximate way of knowing when someone took a decision using that data that we gave them, as opposed to they just went with their gut, same as they have been, but they also opened the dashboard just to see what's there, </p><p><strong>Shane</strong>: I was gonna say our best measure at the moment is that somebody actually used it, that the dashboard's not sitting there being unused for six months. The best we typically have is it got used lots. Not that it was used for anything, but it was open and run..</p><p><strong>Nick</strong>: Oh yeah. And for me, even though I think it's useful, especially it's useful to know that something has not been opened, actually the usefulness of that telemetry drops off very rapidly after that. Because you very often have situations where someone is opening the dashboards, but then you don't know exactly what they're doing with it.</p><p>If it's even useful for them. 
And especially if there's some kind of top-down pressure of, guys, we spent all this money on this new set of MI, you all need to be accessing it, or it'll affect your performance score. Then easily someone can just open the thing, not even have their eyeballs at it, and then close it a little bit later. Or they open it and maybe, actually, viewing time has increased because your dashboard is more complicated and the user's not able to actually get things out of it.</p><p>And for me, the reason I think it's a mistake is because in most businesses, the number of users you have is just not that big, which means you don't need to rely on quantitative cold-hearted information when you can just ask people, you can have one-to-one catchups.</p><p>You can interview people, you can even have a survey form where you ask a few open-ended questions like, Hey, what do you like about it? What do you not like about it? Do you have any ideas for how we can improve it? And use that qualitative data because it's so much richer than, okay, average viewing time has gone up by three seconds.</p><p>What does that mean? How does it relate to the business? Also, really importantly, 'cause I feel like whenever people are telling me they're struggling to connect the work they're doing to business outcomes and to ROI, it tends to be BI, because the way a lot of BI is built and structured in most organizations is in a way that I just think is wrong, because it's not actually helping enable better decisions. It's just, Hey, let's build a dashboard about this. It's not, okay, as part of my workflow as an operations manager, we're gonna change it, so actually now every Monday morning I open this dashboard, which has these specific KPIs, and I make 1, 2, 3 actions off the back of it every week, and we've built instrumentation to measure how those actions play out, and so we can have an idea of how those actions improve in quality over time. 
Most of the time you don't have that kind of mapping. You don't have any measurement of either the action itself or the thing the action is looking to influence.</p><p>I'm taking an action to improve our copy so that next week's newsletter gets a higher click through rate, which means then, yeah, it's literally impossible to measure retrospectively the impact of that dashboard that you built that someone maybe looks at. But it's very unclear what the benefit is.</p><p>It's either unclear because you as the data professional just don't know exactly what they do with it and it's a black box, or it's unclear. 'cause actually there is no real benefit. Or at least it's super, super fluffy.</p><p><strong>Shane</strong>: what would you recommend to an organization, let's say the organization's got to the stage where the data team are engaging with stakeholders. They're thinking in terms of products. They're asking those questions of, with this data, what action are you gonna take?</p><p>And if that action successful, what outcome do you think you're gonna deliver? And then they've led them and done some, whiteboard numbers to say, okay let's quantify this in a really rough way, we think the value to the organization's this. And then they go and build it, and they build it quickly and lots of feedback and it's actually what was needed? How do they deal with that last bit, how do they deal with now looping back and saying, we actually want to know whether it was used, whether it helped that action, whether it helped deliver that outcome.</p><p><strong>Nick</strong>: for me, generally speaking, unless we're talking about a super small, trivial requests that we're just gonna turn around and it's not a big project. I would basically insist that there needs to be a measurement and evaluation work stream as part of the project. 
To use the analogy I used earlier, we need to have a kind of research and citation work stream happening in parallel as part of the project.</p><p>In the same way that there's gonna be a sort of scoping phase, design, we're gonna get approval for the wireframe, then in parallel we're building the data model at the same time as building the dashboard, at some point in that parallel stage we're also defining what are the metrics for success.</p><p>And if those metrics are not based on numbers that are readily available, because we've got our metrics tree or because it's part of another report, an MI, a dashboard, then we go, okay, now we need to set up some kind of measurement for this new thing that is gonna be part of the success metric.</p><p>And then as part of that, we're probably gonna build a second dashboard that's gonna be measuring that effectiveness. I'll give you an example from a project we were doing a few years ago with a container terminal. Really complex operation where actually anything you try to change is gonna have knock-on consequences on many other parts of the operation.</p><p>But after doing a combination of qualitative and quantitative analysis, we're like, okay, one of the things that we can dramatically improve for the overall productivity of the port and the north star metric of the port, which is the productivity of the big cranes loading and unloading vessels as they come in, is actually, if we can optimize the way we allocate drivers in the trucks, that can have a big impact on the overall productivity of the port.</p><p>So in that case, there were some metrics that we had already, like the big crane productivity. That was a metric that was firmly established in the organization 'cause they reported on it constantly. 
Then there were some truck driver productivity related metrics that basically didn't exist in the existing MI suite.</p><p>So it was like, okay, if we're gonna build this product to optimize driver allocations, then we also need to add something to the management information that we produce so that we can be measuring the effectiveness of this model, so we can see is it actually delivering the value we were expecting it to deliver?</p><p>And for me this points to a really important problem that a lot of organizations face, which is it's not just that, oh, if you have more data that's better and you can make more decisions, you need to have the right data. And it's why I really hate the kind of big data, data lake approach of, oh, let's just dump all the data we have.</p><p>Then the data scientists or the data miners will find value out of it. 'cause then usually what happens is when you start a project and you look at what's in the data lake, you realize that actually the super important column for the model you're trying to build, actually it doesn't make its way into the data lake.</p><p>Or it never existed in the source system in the first place. And so when we're deliberate about the business problem we're trying to solve, we can also be deliberate about what are the data points we will need and do they exist? And if they don't exist as part of this project, we need to create those data points.</p><p>It's not a nice to have it's a must have condition for the success of the project.</p><p><strong>Shane</strong>: then I'm gonna jump naturally to the next step, which is, okay, so let's say that it's optimized for the big crane and therefore we are going to affect the optimization or the workflow of the drivers. I'm naturally then gonna want to build out , that metric first so that I can benchmark the current state. So then when we make the changes, we can then see have we had a positive or negative impact? 
And then I can see how that would then potentially delay the production of the information product that has value, because now I'm spending time getting ready for the benchmarking data before I then go and build the thing that actually is gonna make the change. Is that a natural trap to fall into, or is that something that, if you can, you should do?</p><p><strong>Nick</strong>: Obviously we're talking about it hypothetically, and it's gonna be a little bit different in each organization, but even in this hypothetical, the two objections I'd have to that criticism would be, one, okay, if this is gonna be a project that'll last several weeks, let's just sequence it so that we start collecting this data from the very start of the project.</p><p>We make it the first thing we do, not the last thing we do. Actually, we've done this a bunch of times where we said, you know what? You guys aren't collecting this data 'cause it gets deleted at the end of the day. But you know what, if we start collecting it now, by the time we will need to use it for the model we're building, we will have just enough data.</p><p>So that's one thing. And in some cases maybe this will mean, yeah, the project will take longer, but if that's what's needed, then that's what's needed. It's the kind of objection that, if it was, let's say, a technical objection to do with the core of what's being built, it would be so much easier mentally to say, look, this is a must have, right?</p><p>I cannot build a model without this data. Whereas it's very tempting to say, okay, fine, I guess we can build the model without also collecting success metrics. Yeah, maybe you can get away with it once or twice. Actually, if you're systematically doing that, yeah, you're systematically not gonna know the benefit of what you're doing.</p><p>Which, for me, is so weird because the whole point of having a data team is to make an organization more data-driven. 
And what being data-driven means to me is basically being evidence-driven. We're not making decisions because of just gut feel or copying our competition blindly, but we're using evidence.</p><p>But then we are so bad at basically following our own advice and using data to inform what on earth should we be doing and is it worth doing? So that's the kind of first objection. The second objection is you probably should be collecting that data anyway, same as what we were talking about with metric trees, the things that are gonna be your success factors.</p><p>Not always, sometimes they will be super specific to a project, but other times they're gonna be core metrics that you should start tracking anyway because there's gonna be other initiatives later down the line that we should be doing. But I would rather, let's say, collect that metric about the truck driver productivity as part of this project, which is a specific optimization we're trying to do, instead of doing the other thing data teams often say, which is, oh, we need to invest in data foundations.</p><p>Oh, let's build out all the metrics first so that then we can track success later. For me that's bad for two reasons. Number one, maybe you will track the right success metrics or the right metrics, or maybe when you get to your new use case, you realize this wasn't the right thing anyway.</p><p>But also, secondly, you are proposing building something that delivers no inherent value on its own. Which is all too common. All too common that we spend two years doing a big migration, promising the business that after the big migration we'll finally be able to deliver value. And what do you know, two years later, CDO gets fired.</p><p>New CDO comes in, first thing they do, they wanna rebuild the platform and promise again that the results will come two years later. 
So for me, it's also how can we bundle any kind of, let's call it technical debt or platform investment or foundational investment together with something that's gonna deliver value to the business,</p><p>and delivering minimum increments of value. Not, oh, let's do the platform stuff first and the value making stuff later when we've left the company and it's someone else's problem.</p><p><strong>Shane</strong>: And got the new tools on our CV and got promoted in the next job, 'cause that two years was really great fun. It got paid a fortune and got a better job. We used to blame Waterfall for big requirements up front and foundational builds.</p><p>And then lots of organizations went down the agile path and we started seeing six months sprint zeros, where again, it was just pure foundational build. No, engagement with stakeholders, no value. We do have to balance it out between ad hoc behavior where we don't do any foundational work. So there is that, horrible balance of building the airplane while you're flying it, that is the balance. But it is balance, it's about the context.</p><p><strong>Nick</strong>: I agree. I'm not advocating for basically being the clueless business person that does not understand what the engineers are on about when they talk about technical debt and just wants ship me these features 'cause they're gonna make us money. But we'd need to be intentional about it.</p><p>Any piece of technical debt for me is not inherently bad. It's debt I have borrowed from our future selves, from our future data team. 
In order to usually test something, there's no point building something super robust and scalable if we don't know it's worth scaling in the first place.</p><p>But also maybe it's because tactically actually it's gonna really help if we can use this to provide some results for the business before the end of the quarter, either because there's a financial upside or just because it's gonna help us get more trust with them so then they can spend more time with us, involve us more early on in decisions around what we should do and all that good stuff.</p><p><strong>Shane</strong>: One of the things that naturally people want to do is take those outcomes and those ROI statements and link it back to the corporate strategy. There's a strategy PowerPoint somewhere with four boxes, or there's the top 11, 15, 25 initiatives that are the most important in the organization. And it's easy to gamify, it's really easy to say these metrics support the strategy. It's actually damn hard to find metrics that don't support your strategy in some way using creative language. Do you ever worry about it? Do you ever bother saying that these ROI statements, these metrics line up with the corporate strategy and the initiatives, or do you just ignore it?</p><p><strong>Nick</strong>: I'm not advocating for just, oh, we should link the initiatives we're doing to some other broader business initiative called, I don't know, be loved by our customers or get trusted by our partners, whatever. For me, that's fluff.</p><p>Sometimes it's very useful to link it, maybe so it falls under the right OKR or so that it gets the attention of the right people. But at the end of the day, for most things, you should be able to express their value in financial terms. It's either money in the bank, we have generated incremental revenue, or we have saved costs that definitely would've accrued otherwise.</p><p>In other cases, it's a little bit fluffier, but still has a financial number next to it. 
If we build an AI chatbot that helps our employees get questions about HR policies answered faster, it's not necessarily that, okay, we've saved 10 hours a week from our HR employee, and that's money in the bank, because we're still paying that HR employee full time.</p><p>Now, if you have a massive organization, you can actually go, no, we will downsize that team. That will be money in the bank. But in other cases, you can just quantify the productivity saving. You go, we have saved the legal department 15 hours a week times 20 employees paid a hundred dollars an hour, and you go, okay, that's not money in the bank, but that is the value of the productivity that the legal team now can redeploy elsewhere. And for me, it's not as good as money in the bank, but sometimes it's necessary that that's where our estimate stops. But that estimation is still so much better than just going, we are making the company more productive and AI ready.</p><p><strong>Shane</strong>: Then do you ever bring in the cost side of the data team? One of the things people often say to me is, the data team are trying to do the right thing, but nobody from the organization will engage. The request gets handed across, and they want somebody from that part of the organization to work with them as a subject matter expert, whatever. And I often say to them, have you worked out what the data team costs every week? And have you gone back and said, Hey, this is four weeks worth of build. That's a hundred thousand dollars of time. Is it worth a hundred thousand dollars to the organization? And often data teams don't do that, they don't quantify their own costs. Do you do that as part of understanding the ROI or do you not worry about it as much?</p><p><strong>Nick</strong>: Of course, 'cause ROI is not just what is the value, it's return on the investment. So the I part is basically the cost. And here it can get complicated because what are the fixed costs that we're just assuming are sunk in? 
Very basic example, will I consider the Microsoft teams license of each of my data scientists as part of the investment?</p><p>Or do I just go, look, this is a fixed cost the business has made and we're talking about incremental return on investment. Meaning how many additional hours of a data scientist time are we gonna spend? Or maybe if we need to hire contractors, it's about the cost of those contractors, or it's about the incremental compute that this will cause because we're now gonna have to pay Databricks a bunch more money because we're doing all these, building all these new models so that takes some.</p><p>I think finessing and it basically, it requires you to understand what, I don't wanna say what you can get away with in your organization, but more how does your organization think? Are there some costs that ultimately they're subsumed in some centralized budget? And so one, you'd be doing yourself a disservice if you included those because the rest of the business isn't.</p><p>And two would maybe skew the picture of whether this is a worthwhile activity for me I also think the thing that doesn't really work as effectively is to just turn around to the business and go, this request is gonna cost a hundred K. 'cause usually it's not like they will pay for that if you have some kind of cross charge model or you're like, look, we would have to hire new people to support this new initiative.</p><p>And so we need to work together to make the business case for finance to give us this additional budget. Then you can definitely do it and it makes sense and it potentially will come out of the marketing team's budget not out of yours. Again, depends how your organization does budgeting. But for me I think it more in terms of if I'm running the data team and I know that cost, I know how much it's costing us to keep the lights on every week.</p><p>First of all, I wanna make sure that at the very least we are delivering more than that, like a multiple. 
If we're costing a hundred K a week, then any given week we should be delivering at least 110, if not 200K, back to the business, and be reporting that upwards on a regular basis, 'cause it's sometimes also easy to forget.</p><p>And if it's not about that, if it's not about the kind of holistic picture of what is the data team doing, at the very least, using, even without the cost part, the value part to figure out, okay, my team could work on any of these three things for the next two months, which one is worth doing?</p><p>And then using that language, not just to decide internally amongst ourselves which use case are we gonna push for, but also, if that involves telling two stakeholders that their baby's gonna get deprioritized because we need to work on the other use case, point them to those financial numbers.</p><p>Be like, look, your thing is gonna cost the business 200K, but based on our projections, it's only gonna make us an extra hundred K. And the way I usually frame it is I go, look, if you can find a way to get acceptance for that kind of thing, maybe we can do it. But it's really hard for me to either deprioritize someone else's 10x ROI initiative that they've put in front of us, or just to justify it to my boss that, hey, we're gonna be costing the business money because Bob really wants us to build churn prediction model version 57, 'cause version 56 wasn't good enough.</p><p><strong>Shane</strong>: That comes back to that really interesting question of who does the prioritization. Because often in an organization, it's a committee, it's a group of stakeholders, whoever's got the biggest voice or whoever's the HiPPO gets to choose and it gets given to the data team. 
Whereas if we have this idea of a data product manager, or data product leader or whatever you wanna call them, then often they are the ones that should be making the final decision. They should be saying, okay, based on the value, these are the ones that are the most valuable to the organization.</p><p>So this is the one we're gonna build. But often data teams don't work that way. Again, it comes back to, they get given a data request and somebody outside the team effectively prioritizes and tells them what the next most valuable thing is, but that person's not held to account for the value.</p><p><strong>Nick</strong>: I think my answer as to which model works best also depends on the organization and the politics, who controls the budgets, the power, what happens when. But in general, I'm against either extreme. I'm definitely against, the business has decided the priorities and they just chuck them to the data team, who have data product owners whose job is actually to just decide maybe the exact sequence of delivering these things.</p><p>And there's no product work being done there. We can rant about SAFe and scaled agile another day. But then secondly, I think it's also not good, especially if it's framed as the data PM is the final decision maker. And for me it's fine if maybe me as the data PM, I've got all these requests from different departments and I do that prioritization, but then I wanna involve my stakeholders in that.</p><p>And here it gets a bit murky. It's not, I'm gonna give everyone a vote. I'll be like, look guys, we've got these five initiatives and they all have a different number of zeros attached to them in terms of what they're gonna bring to the business. 
And so maybe if Bob says, look, I know mine is only a six digit opportunity compared to everyone else's seven digit opportunity, but actually for whatever, let's say, strategic reason or because it might cause bigger problems later down the line, it's more important. And there it's gonna be a balance. Sometimes we're gonna be more consensus driven depending on how the organization works.</p><p>And other times I go, look, Bob, if this is so important, basically you need to escalate upwards because my hands are tied. I cannot put your hundred K thing in front of someone else's 9 million opportunity, assuming they're gonna take a similar amount of time anyway.</p><p><strong>Shane</strong>: I think we see that in the software product domain. We see that product managers often are reliant on influence to get the right things built, and that's effectively their job. I find it interesting that if I look at other shared service type parts of an organization, so if I look at HR and I look at finance, they are a cost, and yet they very rarely have to do ROI statements for the value they deliver. And for some reason data does. And my hypothesis is because data typically came out of the IT side of the org. And that always seems to be somehow having to justify what they're spending their time on more than those shared service organizations.</p><p>What's your view on that? Is that what you see, or do you not see that?</p><p><strong>Nick</strong>: Great question. The kind of thing that comes to mind first is actually the fact that we see, for example, HR as a cost center, and often IT also as a cost center, especially the more basic IT functions, is a problem. It's a problem because it means that a lot of things become invisible to the business. We see our expense tracking software as, that's just a fixed cost, it's a cost center. It's the cost of doing business. 
And that's why IT has selected the most competitive offer in terms of which horrible expense software we're gonna use. And what doesn't happen then, because IT doesn't have to prove the value or the return of using that software, is there's no explanation of opportunity cost. We're not looking at, for example, okay, on average, because we're using our shitty home-built software from 20 years ago that our employees need to use Internet Explorer to access to submit their expenses, instead of just giving Concur some money and using that instead, or something even newer than Concur, that wastes on average five hours a month from 20% of our employees that do lots of business travel.</p><p>And when we add up the total cost of that, actually it's a multiple of using the fancier software as a service expense software that IT doesn't wanna use. So my first challenge is actually, I think a lot of teams and departments are seen as cost centers when actually that's not right either.</p><p>Because we're not evaluating opportunity costs, we're just evaluating costs in a vacuum. But then secondly, I'd say, I think it's fine and good for data to have this slightly unfair expectation of proving our value. The reason is, maybe you tell me, look, by and large, we don't need to think about the ROI of the laptops we have and of using Microsoft Office because it's well established, it's commoditized.</p><p>We know that we just need to have it in our business. We just need to have some HR information software. We just need to have a payroll provider. And for some of those actually, yeah, cheapest is best. Doesn't make a big difference if we're using the newer fancier payroll provider. Its job is to just get money from A to B. In data, that's less the case, partly because we're a more nascent profession. We haven't figured out how to professionalize our industry, if that's even the right thing to do. There's a lot more question marks. 
But also, secondly, we're not just providing BAU services to the business.</p><p>We're helping innovate. Or at least, for me, those are the more value additive things a data team can do. It's not, we're gonna churn out BI dashboards, showing metrics for the sake of metrics. It's, we will help the business use data to improve the quality of decisions, to potentially unlock new streams of revenue, to build new products.</p><p>Now with AI, and the fact that you basically need your data function to build a lot of the infrastructure if you wanna have an AI feature in your product, that's becoming more obvious. But it was true before that as well. It was true before that machine learning could be used not just to recommend a decision in a BI dashboard that someone can ignore, but actually automate a certain part of the process, to automatically spit out recommended content for someone looking to watch a movie or looking to buy something from an e-commerce store or whatever it might be. I don't want the data team to act like a cost center, because then we're just gonna default to doing bare minimum, low value adding tasks that can be commoditized and expressed in the form of ServiceNow or Jira Analytics. And maybe we need to do a little bit of that, but if that's our main focus, I'm gonna change careers.</p><p><strong>Shane</strong>: Yeah, you made me giggle on the expense one. I remember it was probably two examples actually. One was somebody working in the defense forces that could go out and spend billions of dollars on new tanks but then had to go through seven layers of approval and the expense system to get their parking ticket, the cost to park their car, paid. And then somebody that was working for a software company and used to do seven digit sales and, again, same thing, would spend four hours of their own personal time on the weekend going through the expense system to claim back the coffees. It's just amazing when you think about the cost of that. 
Alright, just looking at time, I just want to close it out with a question around what makes a good product manager, and let me frame it in a certain way. Back in the day when I was working with teams that were trying to adopt Agile and were primarily going down the Scrum path, so picking up the patterns from Scrum, and yeah, good point on SAFe. We probably don't have time for me to rant enough about that one, but in Scrum one of the roles that was common was the idea of a scrum master. And what I found was business analysts, for some reason, naturally did the scrum master role. If I was looking for somebody to come in as a new person into that role and somebody had a business analyst background, I found they approached that role in a certain way that made them and the team successful. Whereas in my experience, and this is my experience, if somebody came in from a fairly long project management background, they didn't.</p><p>After a while I saw the pattern, and the pattern really was a project manager wanted to stand in front of the team, and a business analyst was happy to stand at the back. But what have you found, if you were looking and saying, there is a natural set of skills or a natural role that lends itself to adopting the role of a data product manager? What have you seen? Where would that be?</p><p><strong>Nick</strong>: I've thought about this quite a bit. I'd say it's two things that we can call aptitudes, but I think they're learnable as well. It's not, oh, inherently you need to be like this. It's not about are you technical, are you not. It's not, have you worked in data? It's not, have you worked in product before?</p><p>To be a good data product manager, and also a product manager in general, I think two things make the most difference. Ownership is number one. It basically means are you invested in the outcomes you're trying to enable? 
Or do you just go, my job description said I needed to do this task and I've done it, so it's not my problem the product is still not making money for the business?</p><p>That sense of ownership has a huge impact on how you go about executing in your role. And product managers often have to basically fill in the gaps. And in one team, that gap might be more on the technical leadership side. On another one it might be to do with helping marketing, or maybe you're doing the marketing 'cause there's no one to support you there.</p><p>In another one, you need to do financial analysis to figure out the right pricing or to work out the cost. So you need to be fluid. And the best way for someone to embrace that is to have that sense of ownership because they're invested in the outcome, not just thinking of their role in terms of processes and outputs, similar to those project management types that you mentioned.</p><p>And then the second thing is a sense of curiosity. Because again, every product is different, every team is different. You need to have and cultivate a sense of curiosity to learn more about your users, about your business, about the technical underpinnings of your product, about what the data actually shows and means.</p><p>Because otherwise you're gonna end up being either the kind of PM persona that doesn't understand the tech at all, doesn't join the engineers, delegates it fully, and they just become an information sifter. They go to their stakeholders and they're like, here's the copy paste answer the engineering team gave me about why the product doesn't work as you liked.</p><p>Instead of, Hey, because I understand where your feedback is coming from, I've been working with the engineering team for us to figure out how we need to rebuild this for it to make sense. 
And it's also the same in terms of understanding your stakeholders. If you don't have a sense of curiosity to be like, tell me more and let me understand this problem, you're never gonna build up that picture of the business and the pain points that you need to connect the dots and do more than just turn over requests that are coming reactively to you.</p><p>In order for you to be proactive, and in order for you to be able to actually propose initiatives to the business that are gonna make a strategic impact, you need to have that curiosity to understand what it is that we're doing, both on the business side and the tech side.</p><p>And for me, those are muscles that you can exercise or that you can let atrophy. And so what I always recommend to people that are either in the role or thinking about the role is: how can you start demonstrating those things, not just to say in a job interview that, hey, I've done them, but to be well placed to actually do a good job when you get the job.</p><p><strong>Shane</strong>: I think one of the things you said there was, for me, one of the really important ones: be inquisitive around what the problem is. Understand the problem itself before you worry about the solution. Because otherwise the solution may be a great solution, but it may not solve the problem. Which again comes from that product thinking of understanding the problem the customer has, maybe jobs to be done.</p><p>I do like the jobs to be done stuff. It helps articulate things in a way that makes me think differently, and therefore I've gotta keep challenging myself: well, actually, what is the job to be done? What is the problem to be solved? So again, just keep asking yourself that, and if you can't articulate it, that means you don't understand it. Which means if you're stuck, which you are, between organizational stakeholders and the data team, you've gotta understand both sides, and that's a good way of thinking about it. </p><p>Excellent. 
If people want to get a hold of you, what's the best way to find you, see what you're doing, what you're writing, and get in touch?</p><p><strong>Nick</strong>: Best place is probably LinkedIn. You can look me up. There's only two Nicker buds out there, and the other one does not do anything related to data. I also write some longer form articles on Substack, but all that stuff is linked on my LinkedIn, where I spend most of my time kind of writing shorter form things. </p><p><strong>Shane</strong>: Did you start out on Substack or did you start on Medium and jump to Substack?</p><p><strong>Nick</strong>: No, I started out on Substack. Before that I had a personal blog that was not data related at all.</p><p><strong>Shane</strong>: It's interesting that LinkedIn and Substack seem to be the default now for people that are sharing their knowledge in the data space and, I don't know about you, but we find both of them lacking in certain areas. So it's gonna be interesting to see if a new product turns up that takes over both the networking capability and the content sharing.</p><p>We'll see.</p><p><strong>Nick</strong>: I just keep getting Chrome extension ideas for how to make both of them suck less, and I might start building out some of them just to help my own problems. I'm a big fan of both. </p><p><strong>Shane</strong>: Well, if you can solve that one for me, something that sucks less on both of them is something I'd buy as well. So look forward to that one. Excellent. All right. Thanks for your time. That's been good in terms of articulating some clear patterns around ROI, things that are relatively simple. As one cicada says, don't boil the ocean. I think one of the key messages that I got was: don't overthink it, do enough that it's useful, and then just keep getting better and better at it.</p><p><strong>Nick</strong>: Thanks for having me, Shane. Thanks for listening to all my ramblings and yeah, a hundred percent. Keep it simple. 
It's easy to do complicated. The hard thing is to simplify.</p><p><strong>Shane</strong>: It's been a great chat. Thank you for that, and I hope everybody has a simply</p><h2>&#171;oo&#187;</h2><div class="pullquote"><p><em>Stakeholder - &#8220;That&#8217;s not what I wanted!&#8221; <br>Data Team - &#8220;But that&#8217;s what you asked for!&#8221;</em></p></div><p>Struggling to gather data requirements and constantly hearing the conversation above?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Bu2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" width="387" height="342" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:342,&quot;width&quot;:387,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19725,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/160520537?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Want to learn how to capture data and information requirements in a repeatable way so stakeholders love them and data teams can build from them, by using the Information Product Canvas.</p><p>Have I got the book for you!</p><p>Start your journey to a new Agile Data Way of Working.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://adiwow.com/168&quot;,&quot;text&quot;:&quot;Buy the Agile Data Guide now!&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary 
button-wrapper" href="https://adiwow.com/168"><span>Buy the Agile Data Guide now!</span></a></p><h2>&#171;oo&#187;</h2>]]></content:encoded></item><item><title><![CDATA["Things" and "Thing Types" in the Context Plane]]></title><description><![CDATA[How I think about the things and the thing types that need to be in the Context Plane to power "AI Agents"]]></description><link>https://agiledata.info/p/things-and-thing-types-in-the-context</link><guid isPermaLink="false">https://agiledata.info/p/things-and-thing-types-in-the-context</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Wed, 03 Sep 2025 22:30:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_naJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>How I think about the things and the thing types that need to be in the Context Plane to power "AI Agents"<br></p><blockquote><p><strong>&#8220;Context&#8221; of this post</strong></p></blockquote><p>I often find writing helps me coalesce and refine my thoughts when new patterns start to emerge, but aren&#8217;t very clear yet.  
</p><p>So this article is a brain dump / train of thought continuation of the architecture needed to have one Context Plane to rule them all, as part of a proposed &#8220;AI Data Stack&#8221;.<br><br>This article provides an overview of a way to describe both the Object Types and the Object Categories that will hold the metadata I think should be included in the Context Plane as part of a new &#8220;AI Data Stack&#8221;.</p><h4>The Context Plane Architecture</h4><p>I have written some initial thoughts about the architecture of the Context Plane and the things that should be stored or federated in it before.<br></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;d2a6fc03-99d2-40bd-9780-b20ffef35db7&quot;,&quot;caption&quot;:&quot;How I think about the things that need to be in the \&quot;Context Plane\&quot; to power \&quot;AI Agents\&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Things in the \&quot;Context Plane\&quot;&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:2774203,&quot;name&quot;:&quot;Shagility&quot;,&quot;bio&quot;:&quot;I help data and analytics teams change the Way they Work in a Simply Magical Way&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f09a2d19-6707-4ef9-a4e3-a5e770fb640f_1406x853.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-07-06T21:35:38.775Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!cCKT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d5d532f-3237-4750-8ed9-b58a17151676_11260x3336.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://agiledata.substack.com/p/things-in-the-context-plane&quot;,&quot;section_name&quot;:&quot;AgileData 
Product&quot;,&quot;video_upload_id&quot;:null,&quot;id&quot;:167407832,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:1,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;Agile Data N&#8217; Info&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!ErtR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8892c64-a0c7-4c7b-9f49-a73be5280f22_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p> </p><p>As a quick reminder my current thinking is the architecture diagram for the Context Plane should look something like this:</p><p><br></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vPvT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c378928-cf5b-4933-88d3-a7117e3652b4_5996x3336.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vPvT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c378928-cf5b-4933-88d3-a7117e3652b4_5996x3336.png 424w, https://substackcdn.com/image/fetch/$s_!vPvT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c378928-cf5b-4933-88d3-a7117e3652b4_5996x3336.png 848w, https://substackcdn.com/image/fetch/$s_!vPvT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c378928-cf5b-4933-88d3-a7117e3652b4_5996x3336.png 1272w, 
https://substackcdn.com/image/fetch/$s_!vPvT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c378928-cf5b-4933-88d3-a7117e3652b4_5996x3336.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vPvT!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c378928-cf5b-4933-88d3-a7117e3652b4_5996x3336.png" width="1200" height="667.5824175824176" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c378928-cf5b-4933-88d3-a7117e3652b4_5996x3336.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:810,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vPvT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c378928-cf5b-4933-88d3-a7117e3652b4_5996x3336.png 424w, https://substackcdn.com/image/fetch/$s_!vPvT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c378928-cf5b-4933-88d3-a7117e3652b4_5996x3336.png 848w, https://substackcdn.com/image/fetch/$s_!vPvT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c378928-cf5b-4933-88d3-a7117e3652b4_5996x3336.png 1272w, 
https://substackcdn.com/image/fetch/$s_!vPvT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c378928-cf5b-4933-88d3-a7117e3652b4_5996x3336.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Object Types stored in the Context Plane</h2><p>In that previous article I did a bit of a stocktake of all the Patterns and Pattern Templates I see in the Data Domain that hold what I would term as Context.</p><p>I have been mulling those over more as we do each McSpikey (experiment) to learn more by doing and to reduce some of the many uncertainties we have 
identified.</p><p>I am currently sitting with this list of Object Types that seem to have value being stored or federated in the Context Plane:</p><ul><li><p>Outcomes</p></li><li><p>Actions</p></li><li><p>Business Questions</p></li><li><p>Business Glossary of terms, aliases and descriptions (use tags for Aliases)</p></li><li><p>Conceptual Data Model</p></li><li><p>Logical Data Model</p></li><li><p>Physical Data Model</p></li><li><p>Data Dictionary (with schema, field types, field descriptions etc, also flags)</p></li><li><p>Facts</p></li><li><p>Transformation code</p></li><li><p>Data Quality Rules</p></li><li><p>Data Contract (Boundary of other Objects)</p></li><li><p>Measures and their formulas</p></li><li><p>Metrics and their formulas</p></li><li><p>Information Applications (Reports, dashboards, AI Agents etc)</p></li><li><p>Information Products (Boundary of other Objects)</p></li><li><p>Data Quality Scores</p></li><li><p>Notifications</p></li><li><p>Usage Statistics</p></li><li><p>Data Sync Statistics</p></li><li><p>Number of rows in tiles</p></li><li><p>Principles</p></li><li><p>Policies</p></li><li><p>Patterns</p></li><li><p>Personas</p></li><li><p>Previous effort of change</p></li></ul><h2>Semantic Language is important</h2><p>I often rail against the lack of clear Semantic definition in the Data Domain, but I am often as loose with my definitions as anybody else.<br><br>I had originally used the following terms:</p><ul><li><p>Context Object</p></li><li><p>Context Object Types</p></li></ul><p>But as is the LinkedIn way <strong><a href="https://www.linkedin.com/in/gabrieltanase">Gabriel Tanase</a> </strong>kindly reviewed my crap semantic definitions and provided a much better version.</p><p><a 
href="https://www.linkedin.com/feed/update/urn:li:activity:7366194247613501441?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7366194247613501441%2C7366400089965129728%29&amp;dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287366400089965129728%2Curn%3Ali%3Aactivity%3A7366194247613501441%29">https://www.linkedin.com/feed/update/urn:li:activity:7366194247613501441?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7366194247613501441%2C7366400089965129728%29&amp;dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287366400089965129728%2Curn%3Ali%3Aactivity%3A7366194247613501441%29</a></p><div class="pullquote"><p>The confusion here may be terminological, between "Objects" and "Object Types".<br><br>The "Context Objects" you listed are, in my view, "Context Object Types", since you are speaking in general, about classes of stuff. <br><br>OTOH, an individual instance would be a "Context Object", obviously of exactly one "Context Object Type". E.g., an individual Action is one Context Object, that belongs to the "Actions" Context Objects Type.<br><br>The four things you call "Context Types" are just super-classes / domains / groupings of Context Object Types.<br></p></div><p>So based on that I have ended up with:</p><ul><li><p><strong>Context Object</strong><br>A single instance of something in the Context Plane (e.g. the action &#8220;Approve Loan&#8221;).</p></li><li><p><strong>Context Object Type</strong><br>The class or category that defines what kind of object it is (e.g. <em>Actions</em>).</p></li><li><p><strong>Context Object Type Category</strong><br>A higher-level grouping of related object types that share a domain (e.g. 
<em>Business Context</em>).</p></li></ul><h2>Not reinventing the Wheel</h2><p>Gabriel also pointed me at some excellent documentation from Collibra in that thread, for example:<br><br><a href="https://productresources.collibra.com/docs/collibra/latest/Content/Settings/OperatingModel/to_operating-model-settings.htm">https://productresources.collibra.com/docs/collibra/latest/Content/Settings/OperatingModel/to_operating-model-settings.htm</a></p><p>I&#8217;m always keen to reuse Patterns from others, rather than reinventing the wheel so to speak, and so I went down a rabbit hole of the excellent Collibra documentation.</p><p>I came away with a feeling of an architecture that is founded on the Pattern of &#8220;thing is a thing of a thing&#8221;.</p><p>Which of course is an infinitely flexible Pattern, but to me has always been an Anti-Pattern and so not a Pattern I want to adopt.</p><p>I will need to spend more time looking into both the Data Catalog and Information Science domains as I am pretty confident that the Patterns I am looking for already exist.<br><br>But there is also value in trying to define these Patterns myself, as I learn by doing (and struggling).  
Plus when I finally think I have nutted something out and I spot that same Pattern articulated in lots of other places, then I know I am on to something repeatable and valuable.</p><h2><br>Context Object Type Categories</h2><p>Back on task, these are the Context Object Type Categories I have ended up with.</p><ul><li><p><strong>Business Context</strong><br>Captures the intent, language, and needs that connect data work to business language.</p></li><li><p><strong>Structural Context</strong><br>Describes the technical metadata that defines how data is stored, shaped, and connected.</p></li><li><p><strong>Operational Context</strong><br>Provides the live signals and guardrails (trust scores, usage stats, policies, and access rules etc) that keep data reliable and governed in practice.</p></li><li><p><strong>Agent Context</strong><br>Provides the prompts and guardrails that guide AI agents in applying context.<br></p></li></ul><p>Yes, Structural Context is the same as what is commonly called &#8220;Technical Metadata&#8221;.</p><p>My definition for Operational Context has got examples in it, which means I do not have a clear semantic definition for it yet, so I need to keep iterating that one.</p><h2>Mapping Context Object Types to Categories</h2><p>I stress tested my categorisation by trying to assign the Context Object Types to one and only one category.</p><blockquote><p>Moving to the AssistedAI pattern and letting my ChatGPT friend expand my bullet point list to become richer text and write a more detailed story, ever so slightly edited by me &#8230;.</p></blockquote><p>When people talk about data, they often focus on the raw tables, pipelines, and dashboards. 
But the real power comes from the <strong>Context</strong> that surrounds them, the language, the intent, the rules, and the patterns that let both Humans and AI agents understand what data means and how it should be used.</p><p>In the Context Plane, we capture that knowledge as a set of <strong>Object Types</strong>. These are building blocks that represent everything from business questions to transformation code, from policies to personas. Together, they create a shared layer of context that connects business, data, operational and agentic worlds.</p><p>To make sense of them, we group these Object Types into four categories: <strong>Business Context, Structural Context, Operational Context, and Agent Context.</strong></p><div><hr></div><h2><strong>Business Context</strong></h2><p>Object Types in the Business Context category capture the <em>why</em> behind the data: the intent, the questions, and the people who care.</p><ul><li><p><strong>Business Questions</strong> &#8211; The driving questions that stakeholders ask and want answered.</p></li><li><p><strong>Actions</strong> &#8211; What people (or processes) actually do with data once they have it.</p></li><li><p><strong>Outcomes</strong> &#8211; The results or changes in the business that happen because of those actions.</p></li><li><p><strong>Business Glossary</strong> &#8211; Shared terms, their aliases, and agreed definitions, so everyone speaks the same language.</p></li><li><p><strong>Conceptual Data Model</strong> &#8211; The high-level map of business concepts (customers, products, orders) and how they relate.</p></li><li><p><strong>Logical Data Model</strong> &#8211; A more detailed structure that shows how those concepts can be represented in data.</p></li><li><p><strong>Personas</strong> &#8211; Representations of the different types of people who use or consume data, each with their own needs.</p></li><li><p><strong>Information Products</strong> &#8211; The boundary objects that package up context 
into a consumable unit, such as a unified customer view, a churn model, or a financial performance pack.</p></li></ul><div><hr></div><h2><strong>Structural Context</strong></h2><p>Object Types in the Structural Context category describe the <em>what</em> of data: the way it&#8217;s stored, shaped, and transformed to become useful.</p><ul><li><p><strong>Physical Data Model</strong> &#8211; The actual schema and structures in the database.</p></li><li><p><strong>Data Dictionary</strong> &#8211; A catalog of fields, data types, and descriptions, along with useful flags.</p></li><li><p><strong>Facts</strong> &#8211; Core, measurable events or counts in the data (e.g. sales transactions, logins).</p></li><li><p><strong>Transformation Code</strong> &#8211; The SQL, scripts, or pipelines that turn raw data into shaped, ready-to-use structures.</p></li><li><p><strong>Data Quality Rules</strong> &#8211; Checks and guardrails that ensure the data stays trustworthy.</p></li><li><p><strong>Measures</strong> &#8211; Defined calculations, such as revenue or average handling time.</p></li><li><p><strong>Metrics</strong> &#8211; Business-relevant measures with formulas and thresholds (e.g. Net Promoter Score, Churn Rate).</p></li><li><p><strong>Information Applications</strong> &#8211; The outputs where information is applied and consumed: reports, dashboards, AI agents, apps.</p></li></ul><div><hr></div><h2><strong>Operational Context</strong></h2><p>Object Types in the Operational Context category describe the <em>how well</em> and <em>at what cost</em> dimensions of data. 
These objects let us monitor and manage the reliability and change effort of the system.</p><ul><li><p><strong>Usage Statistics</strong> &#8211; Insights into who is using which data, when, and how often.</p></li><li><p><strong>Data Quality Scores</strong> &#8211; Aggregated indicators that show how healthy the data is across rules and checks.</p></li><li><p><strong>Data Sync Statistics</strong> &#8211; Performance and freshness indicators of data movement.</p></li><li><p><strong>Number of Rows in Tiles</strong> &#8211; A quick proxy for data size and growth in a given slice of the platform.</p></li><li><p><strong>Notifications</strong> &#8211; Alerts that something requires attention, whether it&#8217;s a failed job or a threshold breach.</p></li><li><p><strong>Principles, Policies, Patterns</strong> &#8211; The guiding rules and repeatable approaches that shape how data is managed.</p></li><li><p><strong>Previous Effort of Change</strong> &#8211; A signal of how much time, cost, and disruption a past change required, helping estimate future impact.</p></li></ul><div><hr></div><h2><strong>Agent Context</strong></h2><p>Finally, Object Types in the Agent Context category deal with the <em>prompts</em> and instructions that guide AI agents in using all this context. 
These Objects define the boundaries and cues for generative systems to behave predictably and helpfully.</p><ul><li><p><strong>Prompts</strong> &#8211; The structured inputs, templates, and guardrails that tell an AI agent how to use context objects to answer questions, generate insights, or perform actions.</p></li></ul><div><hr></div><h2><strong>Why it Matters</strong></h2><p>Each Object Type might seem less than valuable in isolation, but together they form a unified layer of meaning, that can be used by Humans and AI Agents alike to gain understanding.</p><p>By capturing both the <strong>why</strong> (business intent) and the <strong>what</strong> (structural detail), as well as the <strong>how well</strong> (operational health) and the <strong>how to guide agents</strong> (AI context), the Context Plane ensures that every stakeholder&#8212;whether a business leader, a data engineer, or an AI agent&#8212;works from the same playbook.</p><p>That&#8217;s how you move from disconnected metadata and data assets to a coherent ecosystem where context is always present, always available, and always shared.</p><blockquote><p>And back to all me again &#8230;.</p></blockquote><h3>And once again with a map</h3><p>Visual maps often give a better quick view than plain text, so here you go:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_naJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_naJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png 424w, 
https://substackcdn.com/image/fetch/$s_!_naJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png 848w, https://substackcdn.com/image/fetch/$s_!_naJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png 1272w, https://substackcdn.com/image/fetch/$s_!_naJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_naJ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png" width="1200" height="366.75824175824175" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:445,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:474984,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/172724737?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!_naJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png 424w, https://substackcdn.com/image/fetch/$s_!_naJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png 848w, https://substackcdn.com/image/fetch/$s_!_naJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png 1272w, https://substackcdn.com/image/fetch/$s_!_naJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F446cea6d-8c0e-4b81-a18d-ddc9e5a262d3_2179x666.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Far from Done Done</h3><p>Based on my experience to date, I will be iterating the Object Types and their categorisation for a while yet, but I think I have the four Context Object Type Categories pretty stable.</p><p>Famous last words?</p><h2>Wood from the Trees</h2><p>Still a way to go before I have a coherent set of Patterns that I can Coach / Mentor / Teach somebody else for the &#8220;Context Plane&#8221; and the &#8220;AI Data Stack&#8221;, or present as a robust Architecture map.</p><p>But as I have already said, writing my half-formed ideas helps me think.</p><h2>An incoherent stream of Context</h2><p>You can find all the previous articles with my train of thought listed in this thread:<br><br><a href="https://agiledata.substack.com/t/context-plane">https://agiledata.substack.com/t/context-plane</a></p><p>We are building the Context Plane while flying it, so we are always looking for early adopters to help us decide the final destination.<br><br>If you want a virtual chat, grab a slot here:<br><br><a href="https://contextplane.ai/contact-us/#bookemdanno">https://contextplane.ai/contact-us/#bookemdanno</a></p>]]></content:encoded></item><item><title><![CDATA[Context Plane AI Agent in a Web App]]></title><description><![CDATA[it works but is it valuable, viable, feasible and useful?]]></description><link>https://agiledata.info/p/context-plane-ai-agent-in-a-web-app</link><guid isPermaLink="false">https://agiledata.info/p/context-plane-ai-agent-in-a-web-app</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Wed, 20 Aug 2025 21:29:00 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!1G3d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Is having an AI Agent in the Web App for the Context Plane a good idea?</p><blockquote><p><strong>&#8220;Context&#8221; of this post</strong></p></blockquote><p>I often find writing helps me coalesce and refine my thoughts when new patterns start to emerge, but aren&#8217;t very clear yet.</p><p>So this article is a brain dump / train of thought continuation of the product and architecture needed to have one Context Plane to rule them all, as part of a proposed &#8220;AI Data Stack&#8221;.<br><br>This article provides an overview of adding the AI Agent capability to the AgileData Web App.</p><h2>Not starting from scratch</h2><p>We have been experimenting with an in-App &#8220;Chatbot&#8221; helper in the AgileData App for multiple years, so I was lucky we already had the core UX needed for us to experiment with this feature.</p><p>And we had already integrated Gemini with the AgileData App a while ago. 
We already used it to assist or automate a few of the more complex bits of data work.</p><p>So it was &#8220;just&#8221; a case of switching the Gemini Model we were using over to Gemini 2.5 Pro to get the reasoning capability, and adding the connector to expose it to our MCP Server.</p><div class="pullquote"><p>Simples!</p><p>(he says as Nigel did all the work)</p></div><p>So from an AgileData App UX point of view, nothing changed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oimh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oimh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png 424w, https://substackcdn.com/image/fetch/$s_!oimh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png 848w, https://substackcdn.com/image/fetch/$s_!oimh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png 1272w, https://substackcdn.com/image/fetch/$s_!oimh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!oimh!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png" width="1200" height="873.6263736263736" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1060,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:99696,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oimh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png 424w, https://substackcdn.com/image/fetch/$s_!oimh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png 848w, https://substackcdn.com/image/fetch/$s_!oimh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png 1272w, 
https://substackcdn.com/image/fetch/$s_!oimh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a6088b0-4e71-49ea-8b52-2bcead007e3b_1714x1248.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Same Prompt</h2><p>We have a simple set of prompts already defined in the AgileData Platform for ADI.</p><pre><code>You are ADI, the AI assistant inside the AgileData.io product. Your purpose is to guide a data team of one&#8212;a data analyst&#8212;in configuring a data warehouse using the AgileData App and Platform.

Context &amp; Capabilities:

You have access to:
&#9;&#8226;&#9;The AgileData user guides
&#9;&#8226;&#9;The screens and workflows within the AgileData app
&#9;&#8226;&#9;The APIs that power the AgileData app and platform
&#9;&#8226;&#9;The data stored in the AgileData platform

The Challenge:
&#9;&#8226;&#9;Every set of data has unique challenges.
&#9;&#8226;&#9;There is no single, linear process for working with data&#8212;users must &#8220;pick a path&#8221; or &#8220;choose their own adventure.&#8221;
&#9;&#8226;&#9;Each user will dynamically use the AgileData app in different ways based on their specific needs.

Who You Are Helping:
&#9;&#8226;&#9;The user is data-savvy but not a data professional, not a highly technical user, not a data expert.
&#9;&#8226;&#9;They may not always know exactly what data task they need to do&#8212;but they know the outcome they are aiming for.

How You Should Respond:
&#9;&#8226;&#9;Be clear and practical: Guide the user toward the best approach to complete their data task efficiently.
&#9;&#8226;&#9;Adapt to their needs: If they are unclear on what to do, ask clarifying questions and suggest possible paths forward.
&#9;&#8226;&#9;Use all available information: Leverage guides, app screens, APIs, and platform data to provide the most relevant answers.
&#9;&#8226;&#9;Encourage exploration: Offer multiple options for how they could proceed, allowing them to discover the best way forward.
&#9;&#8226;&#9;Think like a mentor, not a manual: Rather than just explaining how the platform works, help the user make decisions that will get them to their goal.
&#9;&#8226;&#9;Present them with the relevant url to the relevant screen if that will help them complete the task

When you have multiple choices:
&#9;&#8226;&#9;Pick the one you think is the most suitable and do that.
&#9;&#8226;&#9;Append the other options you had to your response and tell the Data Analyst which choice you selected, let them know they can tell you to pick another option.


Your goal is not just to answer questions but to help the user successfully build their data warehouse in the most effective way possible&#8212;no matter what challenges they face.</code></pre><p>As you can see, they are based on previous experiments where we were focussed on helping a Data Analyst build a Data Warehouse as a &#8220;team of one&#8221;.</p><p>I left these prompts in place for this experiment, but I should do another McSpikey where I tailor them for the Context Plane use cases.</p><p>In previous experiments where I used Gemini CLI and Claude to access the Context in the Context Plane, there were no &#8220;system&#8221; prompts in place at all, so this should mean I get slightly different responses from Gemini 2.5 Pro in the AgileData App vs Gemini 2.5 Pro in the Gemini CLI.</p><p>ADI also has access to our Doco and example screens from the App, so again it has more Agent Context than the previous experiments with Gemini CLI had, as that only had access to the APIs via the MCP Server.</p><h2>MCP Services</h2><p>We limited the services available in the AgileData App to a subset of the available APIs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6BOQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6BOQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png 424w, 
https://substackcdn.com/image/fetch/$s_!6BOQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png 848w, https://substackcdn.com/image/fetch/$s_!6BOQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png 1272w, https://substackcdn.com/image/fetch/$s_!6BOQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6BOQ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png" width="1200" height="873.6263736263736" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1060,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:170790,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!6BOQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png 424w, https://substackcdn.com/image/fetch/$s_!6BOQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png 848w, https://substackcdn.com/image/fetch/$s_!6BOQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png 1272w, https://substackcdn.com/image/fetch/$s_!6BOQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5033a887-147d-42df-89ef-57a5388075b5_1714x1248.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>When I use my typical question to get a list of Tiles and a list of Fields in a specific Tile, I pretty much get the response I am expecting, no surprises there.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GuTB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GuTB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png 424w, https://substackcdn.com/image/fetch/$s_!GuTB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png 848w, https://substackcdn.com/image/fetch/$s_!GuTB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png 1272w, https://substackcdn.com/image/fetch/$s_!GuTB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!GuTB!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png" width="1200" height="873.6263736263736" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1060,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:683810,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GuTB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png 424w, https://substackcdn.com/image/fetch/$s_!GuTB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png 848w, https://substackcdn.com/image/fetch/$s_!GuTB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png 1272w, 
https://substackcdn.com/image/fetch/$s_!GuTB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376e836-3c6c-4a48-8c85-b15f8a1efdfa_1714x1248.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LqG8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LqG8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png 424w, https://substackcdn.com/image/fetch/$s_!LqG8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png 848w, https://substackcdn.com/image/fetch/$s_!LqG8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png 1272w, https://substackcdn.com/image/fetch/$s_!LqG8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LqG8!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png" width="1200" height="873.6263736263736" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1060,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:399325,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LqG8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png 424w, https://substackcdn.com/image/fetch/$s_!LqG8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png 848w, https://substackcdn.com/image/fetch/$s_!LqG8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png 1272w, https://substackcdn.com/image/fetch/$s_!LqG8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5c51042-7415-42cc-a10a-eba23338f50f_1714x1248.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Except &#8230;&#8230;</h2><p>When I use Claude to get the Blast Radius for a field change it gives me very rich Context back.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ulul!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ulul!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png 424w, 
https://substackcdn.com/image/fetch/$s_!Ulul!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png 848w, https://substackcdn.com/image/fetch/$s_!Ulul!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png 1272w, https://substackcdn.com/image/fetch/$s_!Ulul!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ulul!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png" width="1200" height="950.2747252747253" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1153,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:254361,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Ulul!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png 424w, https://substackcdn.com/image/fetch/$s_!Ulul!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png 848w, https://substackcdn.com/image/fetch/$s_!Ulul!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png 1272w, https://substackcdn.com/image/fetch/$s_!Ulul!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b01ecdb-b002-4d7a-bb63-7a8098f9cac5_1664x1318.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>ADI in the AgileData App however&#8230;</p><p>She ain&#8217;t being that helpful!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1G3d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1G3d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png 424w, https://substackcdn.com/image/fetch/$s_!1G3d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png 848w, https://substackcdn.com/image/fetch/$s_!1G3d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png 1272w, https://substackcdn.com/image/fetch/$s_!1G3d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!1G3d!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png" width="1200" height="871.1538461538462" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1057,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:440534,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1G3d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png 424w, https://substackcdn.com/image/fetch/$s_!1G3d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png 848w, https://substackcdn.com/image/fetch/$s_!1G3d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png 1272w, 
https://substackcdn.com/image/fetch/$s_!1G3d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09193048-4a54-46ea-aa65-58d6b6ed374a_1717x1247.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Over to Gemini CLI to see if it&#8217;s her or Gemini.</p><h4>Ah nope</h4><p>We have made a fundamental change to the MCP Server we are using since the last McSpikey. The previous version only had access to a subset of our APIs as tools. 
We changed it to have access to all APIs.</p><p>For Claude this didn&#8217;t really make much of a difference, but for Gemini CLI it seemed to have a major impact, and not in a good way.</p><p>Listing the tiles always seems to work OK; it is a simple task.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iv8a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iv8a!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png 424w, https://substackcdn.com/image/fetch/$s_!iv8a!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png 848w, https://substackcdn.com/image/fetch/$s_!iv8a!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png 1272w, https://substackcdn.com/image/fetch/$s_!iv8a!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iv8a!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png" width="1200" height="686.5384615384615" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:833,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:459322,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iv8a!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png 424w, https://substackcdn.com/image/fetch/$s_!iv8a!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png 848w, https://substackcdn.com/image/fetch/$s_!iv8a!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png 1272w, https://substackcdn.com/image/fetch/$s_!iv8a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd3c6d5a-3d15-4783-9a59-71aefe3f3803_1459x835.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>Listing the fields in a tile worked well up until we made the MCP Server Tool change.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rjFR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rjFR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png 424w, 
https://substackcdn.com/image/fetch/$s_!rjFR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png 848w, https://substackcdn.com/image/fetch/$s_!rjFR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png 1272w, https://substackcdn.com/image/fetch/$s_!rjFR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rjFR!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png" width="1200" height="109.61538461538461" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:133,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:60406,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!rjFR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png 424w, https://substackcdn.com/image/fetch/$s_!rjFR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png 848w, https://substackcdn.com/image/fetch/$s_!rjFR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png 1272w, https://substackcdn.com/image/fetch/$s_!rjFR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e51fab6-3704-48a3-84d6-5696634921fb_1459x133.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>That is the LLM version of go F$#@ yourself lol.</p><p>I had a quick go to see if I could limit the Tools that Gemini CLI could access on the client side, as we had limited them in the AgileData App for ADI and I had limited them in the Claude client.</p><p>So new entries in the Gemini CLI settings.json it is:</p><pre><code>  "securityPolicy": {
    "mode": "configured",
    "allowedTools": [
      "get_business_glossary",
      "get_catalog_tiles",
      "get_ensemble_config",
      "get_change_rules"
    ]
  }</code></pre><p>But that did nada. When I listed the tools in Gemini CLI using /mcp, it could still see them all and there was no behaviour change.</p><p>If I was very specific with my language it would sometimes work:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RHry!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RHry!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png 424w, https://substackcdn.com/image/fetch/$s_!RHry!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png 848w, https://substackcdn.com/image/fetch/$s_!RHry!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png 1272w, https://substackcdn.com/image/fetch/$s_!RHry!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RHry!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png" width="1200" height="87.36263736263736" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:106,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:56524,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RHry!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png 424w, https://substackcdn.com/image/fetch/$s_!RHry!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png 848w, https://substackcdn.com/image/fetch/$s_!RHry!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png 1272w, https://substackcdn.com/image/fetch/$s_!RHry!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09691254-a5a0-4a3d-af85-ba513117288b_1459x106.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!4d1v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4d1v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png 424w, https://substackcdn.com/image/fetch/$s_!4d1v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png 848w, https://substackcdn.com/image/fetch/$s_!4d1v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png 1272w, https://substackcdn.com/image/fetch/$s_!4d1v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4d1v!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png" width="1200" height="248.07692307692307" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:301,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:129895,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4d1v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png 424w, https://substackcdn.com/image/fetch/$s_!4d1v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png 848w, https://substackcdn.com/image/fetch/$s_!4d1v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png 1272w, https://substackcdn.com/image/fetch/$s_!4d1v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de47faa-7d79-4c44-9472-6c2b5f48dcdf_1459x302.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>But then it would go back to being a real dumb arse again.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!byuZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!byuZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png 424w, https://substackcdn.com/image/fetch/$s_!byuZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png 848w, https://substackcdn.com/image/fetch/$s_!byuZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png 1272w, https://substackcdn.com/image/fetch/$s_!byuZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!byuZ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png" width="1200" height="78.2967032967033" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:95,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:56162,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!byuZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png 424w, https://substackcdn.com/image/fetch/$s_!byuZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png 848w, https://substackcdn.com/image/fetch/$s_!byuZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png 1272w, https://substackcdn.com/image/fetch/$s_!byuZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed0a6ed7-c484-43da-8c5b-b1432294c452_1459x95.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>So back to very specific language:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!fd-8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fd-8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png 424w, https://substackcdn.com/image/fetch/$s_!fd-8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png 848w, https://substackcdn.com/image/fetch/$s_!fd-8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png 1272w, https://substackcdn.com/image/fetch/$s_!fd-8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fd-8!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png" width="1200" height="54.395604395604394" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:66,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:27769,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fd-8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png 424w, https://substackcdn.com/image/fetch/$s_!fd-8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png 848w, https://substackcdn.com/image/fetch/$s_!fd-8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png 1272w, https://substackcdn.com/image/fetch/$s_!fd-8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff48bccb7-7f7f-4063-97f7-87a78ee472d0_1459x66.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!svje!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!svje!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png 424w, https://substackcdn.com/image/fetch/$s_!svje!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png 848w, https://substackcdn.com/image/fetch/$s_!svje!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png 1272w, https://substackcdn.com/image/fetch/$s_!svje!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!svje!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png" width="1200" height="512.6373626373627" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:622,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:479060,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!svje!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png 424w, https://substackcdn.com/image/fetch/$s_!svje!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png 848w, https://substackcdn.com/image/fetch/$s_!svje!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png 1272w, https://substackcdn.com/image/fetch/$s_!svje!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c9486-deb8-4759-890c-2f09143d06f2_1459x623.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>So theories from that rabbit hole:</p><ul><li><p>Too many MCP Server Tools make the AI Agent Tool&#8217;s job harder</p></li><li><p>The previous Tools we exposed had some additional hints in the naming of the Tool that probably helped the AI Agent Tools</p></li><li><p>Gemini 2.5 Pro is not as good as Claude Sonnet 4 for this use case</p></li></ul><h4>One more thing</h4><p>I ran Gemini CLI against a different AgileData Tenancy, Newcastle, compared to the ADI version which was running in the Kapiti Tenancy (hence the change in Tile name I was asking questions about).</p><p>Let&#8217;s go try ADI in the Newcastle Tenancy and see what happens.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!ROxZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ROxZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png 424w, https://substackcdn.com/image/fetch/$s_!ROxZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png 848w, https://substackcdn.com/image/fetch/$s_!ROxZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png 1272w, https://substackcdn.com/image/fetch/$s_!ROxZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ROxZ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png" width="1200" height="871.1538461538462" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1057,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:439799,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171434525?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ROxZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png 424w, https://substackcdn.com/image/fetch/$s_!ROxZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png 848w, https://substackcdn.com/image/fetch/$s_!ROxZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png 1272w, https://substackcdn.com/image/fetch/$s_!ROxZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b0e6442-783b-4647-b0e8-0fe2cfde621f_1717x1247.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>Nope, same problem.</p><p>So I'm going to guess the problem could be:</p><ul><li><p>the Gemini 2.5 Pro model</p></li><li><p>and / or the extra MCP Server Tools</p></li><li><p>and / or the Prompts in the AgileData Platform</p></li><li><p>and / or all of the above</p></li></ul><p>that is causing the problems.</p><p>Out of time for this McSpikey, so that is an experiment for another day.</p><h2>So in Summary</h2><p>We can provide a web-based interface in the AgileData App easily enough by reusing the current ADI capability.</p><p>But we need to work out if it actually has any value.</p><p>And also validate the assumption that different personas would be using it compared to an AI Agent Tool, and whether that should impact the type of response they receive.  
Gut feel is no, it should be &#8220;any Human, any AI tool, same response&#8221;.</p><p>And I need to experiment with the combination of LLM models, MCP Server Tools and Prompts a lot more.</p><h3>Claude Sonnet 4 seems more competent than Gemini 2.5 Pro</h3><p>Well, for this use case anyway. I just seem to get back better &#8220;Context&#8221; from my Context when I use Claude over Gemini.</p><h3>As slow as a wet pig</h3><p>One of the surprises was that ADI was sooooo slow in the Kapiti Tenancy vs the Newcastle Tenancy.</p><p>Check it out.</p><h4>Newcastle</h4><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;74c48a37-25c5-4e22-9c51-31ecb76cbc81&quot;,&quot;duration&quot;:null}"></div><h4>Kapiti</h4><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;b1acd8d9-d80f-4004-920c-820dbfa4c207&quot;,&quot;duration&quot;:null}"></div><p>Almost twice as long in Kapiti vs Newcastle.</p><h4>One of these things is not like the other</h4><p>After chatting with Nigel it seems we're running Gemini 2.5 Flash in Newcastle and Gemini 2.5 Pro in Kapiti, hence the difference in response times.</p><p>So another McSpikey is needed to see how different the responses are between the two models for our key use cases.</p><h2>It's all about reducing uncertainty</h2><p>The key to a McSpikey is to reduce uncertainty, and for this McSpikey I think we have increased the uncertainty.</p><p>But that's OK, better to know that now than later when we are much further down the Context Plane path. 
</p><h2>Wood from the Trees</h2><p>Still a way to go before I have a coherent set of Patterns that I can Coach / Mentor / Teach somebody else for the &#8220;Context Plane&#8221; and the &#8220;AI Data Stack&#8221;, or present as a robust Product Overview and / or Architecture map.</p><p>But as I have already said, writing my half-formed ideas helps me think.</p><h2>Context Plane Use Cases</h2><p>You can find all the use cases where I think the Context Plane may have some value over at:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://contextplane.ai/use-case/&quot;,&quot;text&quot;:&quot;ContextPlane.ai&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://contextplane.ai/use-case/"><span>ContextPlane.ai</span></a></p><h2>An incoherent stream of Context</h2><p>You can find all the previous articles with my train of thought listed in this thread:<br><br><a href="https://agiledata.substack.com/t/context-plane">https://agiledata.substack.com/t/context-plane</a></p><p></p><p>We are building the Context Plane while flying it, so we are always looking for early adopters to help us decide the final destination.<br><br>If you want a virtual chat, grab a slot here:<br><br><a href="https://contextplane.ai/contact-us/#bookemdanno">https://contextplane.ai/contact-us/#bookemdanno</a></p>]]></content:encoded></item><item><title><![CDATA[Dimensional Data Modeling Patterns with Johnny Winter]]></title><description><![CDATA[AgileData Podcast #73]]></description><link>https://agiledata.info/p/dimensional-data-modeling-patterns</link><guid isPermaLink="false">https://agiledata.info/p/dimensional-data-modeling-patterns</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Mon, 18 Aug 2025 19:55:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/eOPGc1F9lH4" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Join Shane Gibson as he chats with 
Johnny Winter about the core patterns that make up Dimensional (Star Schema) Modeling.</p><blockquote><p><strong><a href="https://agiledata.substack.com/i/171239242/listen">Listen</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/171239242/google-notebooklm-mindmap">View MindMap</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/171239242/google-notebooklm-briefing">Read AI Summary</a></strong></p><p><strong><a href="https://agiledata.substack.com/i/171239242/transcript">Read Transcript</a></strong></p></blockquote><p></p><h2>Listen</h2><p>Listen on all good podcast hosts or over at:</p><p><a href="https://podcast.agiledata.io/e/dimensional-data-modeling-patterns-with-johnny-winter-episode-73/">https://podcast.agiledata.io/e/dimensional-data-modeling-patterns-with-johnny-winter-episode-73/</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://podcast.agiledata.io/e/dimensional-data-modeling-patterns-with-johnny-winter-episode-73/&quot;,&quot;text&quot;:&quot;Listen to the Agile Data Podcast Episode&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://podcast.agiledata.io/e/dimensional-data-modeling-patterns-with-johnny-winter-episode-73/"><span>Listen to the Agile Data Podcast Episode</span></a></p><blockquote><p><strong>Subscribe:</strong> <a href="https://podcasts.apple.com/nz/podcast/agiledata/id1456820781">Apple Podcast</a> | <a href="https://open.spotify.com/show/4wiQWj055HchKMxmYSKRIj">Spotify</a> | <a href="https://www.google.com/podcasts?feed=aHR0cHM6Ly9wb2RjYXN0LmFnaWxlZGF0YS5pby9mZWVkLnhtbA%3D%3D">Google Podcast </a>| <a href="https://music.amazon.com/podcasts/add0fc3f-ee5c-4227-bd28-35144d1bd9a6">Amazon Audible</a> | <a href="https://tunein.com/podcasts/Technology-Podcasts/AgileBI-p1214546/">TuneIn</a> | <a href="https://iheart.com/podcast/96630976">iHeartRadio</a> | <a 
href="https://player.fm/series/3347067">PlayerFM</a> | <a href="https://www.listennotes.com/podcasts/agiledata-agiledata-8ADKjli_fGx/">Listen Notes</a> | <a href="https://www.podchaser.com/podcasts/agiledata-822089">Podchaser</a> | <a href="https://www.deezer.com/en/show/5294327">Deezer</a> | <a href="https://podcastaddict.com/podcast/agiledata/4554760">Podcast Addict</a> |</p></blockquote><div id="youtube2-eOPGc1F9lH4" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;eOPGc1F9lH4&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/eOPGc1F9lH4?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? The Information Product Canvas helps you get clarity in 30 minutes or less?</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><h2>Google NotebookLM Mindmap </h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dsH3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!dsH3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png 424w, https://substackcdn.com/image/fetch/$s_!dsH3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png 848w, https://substackcdn.com/image/fetch/$s_!dsH3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png 1272w, https://substackcdn.com/image/fetch/$s_!dsH3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dsH3!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png" width="1200" height="3901.6483516483518" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:4734,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:11741457,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/171239242?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dsH3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png 424w, https://substackcdn.com/image/fetch/$s_!dsH3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png 848w, https://substackcdn.com/image/fetch/$s_!dsH3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png 1272w, https://substackcdn.com/image/fetch/$s_!dsH3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0111cf5c-a0d4-4eab-acce-3d471c6e7158_8253x26836.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p></p><h2>Google NotebookLM Briefing</h2><h3>Briefing Document: Dimensional Data Modelling Patterns</h3><p><strong>Overview:</strong> This podcast episode of "Agile Data" features Shane Gibson and Johnny Winter discussing the enduring relevance and practical applications of dimensional data modelling. Johnny Winter, a seasoned data consultant with a background spanning from Crystal Reports to modern data stacks, provides a comprehensive breakdown of core dimensional modelling concepts, common patterns, and nuances often overlooked. 
The discussion highlights why dimensional modelling remains the "number one analytical modelling technique in the world," even with the advent of new technologies and approaches like data vaults and activity schemas.</p><p><strong>Key Themes &amp; Important Ideas:</strong></p><p><strong>1. The Enduring Relevance of Dimensional Modelling:</strong></p><ul><li><p>Despite its age (Johnny started his career when it was prevalent), dimensional modelling is "pretty much still the number one analytical modeling technique in the world."</p></li><li><p>Its popularity has seen a resurgence with tools like dbt, indicating its continued applicability in modern data stacks.</p></li><li><p>The widespread availability of resources, particularly Ralph Kimball's books and blog posts, has made it highly accessible and understandable: "Kimball and Margy Ross, they wrote a hell of a lot. They write lots of books that were easy to read and understand."</p></li></ul><p><strong>2. Core Concepts: Facts and Dimensions:</strong></p><ul><li><p><strong>Dimensional modelling</strong> organises data into two primary categories: <strong>facts</strong> and <strong>dimensions</strong>.</p></li><li><p><strong>Facts</strong> are the "measurements, the things you actually potentially want to aggregate or trend," often representing "the event." Johnny uses the "how many" from Lawrence Corr's 7Ws as a fact. Examples include order value, payment value, or sick days.</p></li><li><p><strong>Dimensions</strong> provide "the context that you apply to those," enabling "slicing and dicing of the data." These relate to the "who, what, when, why, where" of an event. Examples include customer, supplier, employee, or store.</p></li><li><p>Historically, this separation was driven by "constraints" (performance and cost) in databases, but it still promotes "reuse reasons" today.</p></li></ul><p><strong>3. 
Grain of Facts:</strong></p><ul><li><p><strong>Grain</strong> refers to the "level of detail" in a fact table.</p></li><li><p>Johnny advocates for modelling at the "atomic grain" or "lowest grain anyway." He states, "You can always roll up grain easily in a query... but it's very difficult to do the other way around."</p></li><li><p>While historical constraints led to "aggregated facts" for performance, modern "column store databases and now storage formats like Parquet" make aggregating granular data much easier and reduce storage footprint.</p></li><li><p>The default should be the lowest possible grain (e.g., "order line" rather than "order") unless there's a specific, justified reason for aggregation (e.g., performance, ease of use).</p></li></ul><p><strong>4. Slowly Changing Dimensions (SCDs):</strong></p><ul><li><p>SCDs address how changes in dimension attributes are managed over time, allowing for "as-is and as-was type reporting."</p></li><li><p><strong>SCD Type 0:</strong> The attribute "never ever changes." (e.g., a time dimension with 24 hours in a day).</p></li><li><p><strong>SCD Type 1:</strong> The attribute gets "overwritten." Historical records are updated to reflect the current value, meaning "all of my historical records will now point to that value." This can obscure historical analysis.</p></li><li><p><strong>SCD Type 2:</strong> "Allows you to track changes over time." A new record is created for each change, preserving historical context. This typically involves "valid from, valid to date or a start date and end date" and a "surrogate key" to uniquely identify each version of the business entity. 
Johnny notes this is generally preferred now, though some clients still opt for Type 1 from a "historical reporting perspective."</p></li><li><p><strong>SCD Type 3:</strong> "History recording except for rather than with a type two where you get an extra record when the value changes you get an additional column." 
It only tracks the previous version.</p></li><li><p><strong>SCD Type 6 &amp; 7:</strong> Hybrids, often involving "durable keys" for more complex scenarios, but "most people do ones or twos."</p></li><li><p><strong>Key Management:</strong> The importance of <strong>surrogate keys</strong> is highlighted, especially for Type 2 dimensions, as they provide a unique identifier for each record in the data warehouse, abstracting from the business key which might not be unique across historical records. While "hash keys" or "concatenated business keys" are possible with modern tech, Johnny prefers surrogate keys as they "forces you to go back and look at your dimensions and make sure that your values exist."</p></li><li><p><strong>End-dating strategies:</strong> The choice between leaving an end date as NULL or using a 99999 value, and the concept of "windowing" (which emerged from Hadoop's insert-only preference) versus direct end-dating, are implementation-level patterns that require careful consideration based on technology and cost (e.g., BigQuery's cost optimisation for end-dating). Consistency in these patterns across a data platform is crucial.</p></li></ul><p><strong>5. Fact Table Types:</strong></p><ul><li><p><strong>Transactional Fact Table:</strong> "Pretty much append only," recording a single event as it happens.</p></li><li><p><strong>Accumulating Snapshot Fact Table:</strong> Used for processes with multiple defined steps. 
A single record is updated as the process progresses (e.g., tracking a podcast booking process from invitation to confirmation).</p></li><li><p><strong>Snapshot Fact Table:</strong> A "full data dump every single day," capturing data at a specific point in time (e.g., stock balances, month-end closing balances).</p></li><li><p><strong>Type 145 Fact Table (Accumulating Time Span Snapshot):</strong> Less commonly mentioned, this is a "process driven type fact table whereby the status of something changes and you get a new record every single time." It's like a Type 2 SCD applied to a fact table, excellent for tracking states in a process like a sales funnel. Johnny calls the common misnomer "SCD Type 2 Fact Table" an "oxymoron."</p></li></ul><p><strong>6. Dimensional Design Patterns (Conformed, Role-Playing, Junk, Degenerate):</strong></p><ul><li><p><strong>Conformed Dimensions:</strong> "Reuse the same dimension across multiple facts." This promotes consistency and reusability across different business domains (e.g., a single Date dimension used for sales, finance, and HR reporting). This is a hallmark of dimensional modelling that differs from other approaches like Data Vault.</p></li><li><p><strong>Role-Playing Dimensions:</strong> A "single dimension that can be reused for multiple different things in different contexts." The classic example is the Date dimension serving as Order Date, Delivery Date, Refund Date via different foreign keys in the fact table. Another example is a Location dimension serving as both From Location and To Location.</p></li><li><p><strong>Junk Dimension:</strong> Addresses the "centipede fact table" problem (a wide fact table with many narrow, low-cardinality dimensions). A junk dimension combines "all those little bitty low cardinality type things" into "one fat dimension that combines all possible given combinations of them." 
It's a "miscellaneous dimension" that simplifies queries, though it's "not a particularly common pattern."</p></li><li><p><strong>Degenerate Dimension:</strong> A "dimensional value" or context "only ever relevant in the context of a given fact," so you have "retained it on your facts table instead." This avoids joining to a separate dimension. While it can reduce joins, Johnny says he will "try and avoid where I can" as requirements can change, leading to a need for that attribute in other contexts, making reuse difficult and potentially complicating user queries if there's no semantic layer. Modern analytical engines are also "optimizing their engines for that BI workload for that kind of star schema type shape and it deals with them absolutely fine," so the performance argument for degenerates is often outdated.</p></li></ul><p><strong>7. Anti-Patterns &amp; Justification:</strong></p><ul><li><p><strong>Single "Thing is a Thing" Fact Table:</strong> Having one huge fact table for "every event all at the lowest grain" with highly abstract dimensions (e.g., a "people" dim for customers/employees/suppliers) is an anti-pattern. Dimensional modelling aims to reflect "business language," so dimensions should represent understandable business concepts. Separate dims for Customer, Employee, Supplier are generally preferred.</p></li><li><p><strong>Joining Across Facts Directly:</strong> This is generally discouraged due to "fan trap and chasm trap" issues and differing grains. Instead, "drill across" functionality relies on "conformed dimensionality" and aggregating queries at the dimension level.</p></li><li><p><strong>Factless Facts:</strong> A fact table that "doesn't have any of those things [measures to aggregate]," but "basically just stores the intersection of the various dimensions and ultimately end up counting rows on it to get the facts." It still has a 'fact' (the count of rows). 
Johnny prefers to just store the keys, not an extra column of '1's, as that's "bloat that you don't need."</p></li></ul><p><strong>8. Advanced Topics &amp; Nuances:</strong></p><ul><li><p><strong>Bridge Tables:</strong> Primarily used to "resolve many to many type relationships" between facts and dimensions. The example given is a joint bank account where one transaction (fact) relates to two customers (dimension), requiring a bridge table to resolve this. They can also help with "recursive hierarchies." Shane clarifies, "Bridge tables are a bridge between the facts and the dims not a bridge across facts."</p></li><li><p><strong>Late Arriving Facts:</strong> When a fact arrives <em>before</em> its corresponding dimension record has been loaded. This is handled by assigning an "unknown member" (a default surrogate key like -1 or 99999) in the fact table, and then "rolling windows" or later updates to assign the correct dimension key once it's available. This ensures "referential integrity" even if it means pipelines run "slower and a bit more expensive."</p></li><li><p><strong>Impact on BI Tools (e.g., Power BI):</strong> Power BI's VertiPaq engine (compression engine) and DAX language are "structured to work with" star schemas, making it "far more efficient" than a single large table for analytical workloads, especially for compression and querying.</p></li><li><p><strong>Layered Data Architectures:</strong> Dimensional models often sit as a "core reporting layer" on top of other modelling techniques like Data Vault (which is typically not exposed directly to analysts due to its complexity for querying). This provides "flexibility" and "context" while making data "easy for the end user."</p></li><li><p><strong>The Importance of Context and Trade-offs:</strong> The choice of pattern always depends on context. "The nuance is in absolutely understanding what patterns are available and which ones to use when." 
Sometimes, knowingly implementing an "anti-pattern" might be justified for a specific edge case if it "makes sense in the right context and you can justify it."</p></li><li><p><strong>The Consultant's Mindset:</strong> Johnny and Shane discuss the need for a "dimensional checklist" when starting with a new client, to quickly understand their existing patterns and ensure consistency.</p></li></ul><p><strong>Conclusion:</strong> This episode serves as an excellent deep dive into dimensional modelling, re-emphasising its foundational role in data analytics. Johnny Winter expertly navigates complex concepts, providing practical examples and personal insights into common challenges and best practices. The discussion underscores that while technology evolves, the core principles of dimensional modelling remain highly effective for building robust, performant, and user-friendly analytical data platforms.</p><p></p><div class="pullquote"><p><strong>Tired of vague data requests and endless requirement meetings? The Information Product Canvas helps you get clarity in 30 minutes or less?</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://agiledataguides.com/ipc&quot;,&quot;text&quot;:&quot;Fix Your Data Requirements&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://agiledataguides.com/ipc"><span>Fix Your Data Requirements</span></a></p></div><p></p><h2>Transcript</h2><p><strong>Shane</strong>: Welcome to the Agile Data Podcast. I'm Shane Gibson.</p><p><strong>Johnny</strong>: And I'm Johnny Winter.</p><p><strong>Shane</strong>: Hey Johnny. Thanks for coming on today. We are gonna talk about the patterns of dimensional modeling. It's an interesting one for me because I started my career in data when we did lots of 3NF, and then we went on to dimensional modeling.</p><p>I'm that old. And then I moved on. 
I used a bunch of other patterns and I was sitting back the other day thinking, holy shit. I've had a bunch of podcasts talk about modeling patterns, but actually had nobody on to describe dimensional modeling, which pretty much is still the number one analytical modeling technique in the world.</p><p>Before we rip into that, why don't you give the audience a bit of background about yourself? </p><p><strong>Johnny</strong>: Yeah, absolutely. So. I guess some people listening, actually, even as you did that intro, all that was going on in my head was the intro music. 'cause that's like a ripping tune and if I ever do my own podcast, like the standard's been set in terms of, uh, intro music.</p><p>But yeah, me, I'm Johnny. I'm based in Preston, in Lancashire in the UK. I have been working in data since, I always say, 2007. Somewhere at my house I've got a certificate where I went on a Crystal Reports course, so that's where I cut my teeth as a data professional. So basically a report writer, analyst type role using good old Business Objects, Crystal Reports. That's where I learned my SQL, then effectively graduated, which I think is quite a common path, from that kind of analyst type role into more the sort of full BI developer, as we called it back in the day.</p><p>Moved on to building, developing data warehouses, the ETL and all the dimensional modeling that came with that. Historically that was good old Microsoft stack, so on-premise, SQL Server Integration Services, those kinds of things. I still did the Reporting Services element of it as well. So I did the full sort of end-to-end.</p><p>It was before we got trendy and had data engineers versus analysts as it were. I used to do absolutely everything and then, yeah, graduated more into cloudy stuff from a tech perspective. 
In my last four or five years, I've been working in data consultancy for a couple of different consultancy firms, and I'm a consultant today looking to start my own independent consultancy imminently.</p><p><strong>Shane</strong>: Wow. Welcome to the chaos that is doing a business, running a business and trying to grow a business. Yeah, looking forward to it. How? How to take a 40 hour job and make it a hundred and twenty a week.</p><p><strong>Johnny</strong>: Yeah. So simultaneously terrified and excited. What is it? Find a job you love and you never have to work another day in your life.</p><p>I'm pretty sure that it works the other way as well. Find a job you love and you'll never have another day off in your life. </p><p><strong>Shane</strong>: Yeah, I think the key I say to people is always make sure you have annual leave. The problem with consulting is there's gonna be lumps and bumps. There's gonna be times where you're not working due to no choice of your own, and because of that you tend not to book holidays and then you burn out.</p><p>So yeah, just get somebody else to help you run the company, your partner or whatever, and say, this is annual leave. You've gotta take it every year. 'cause otherwise you won't and you'll regret it. Crystal Reports though, the real question there to age people is was that Crystal Reports before Business Objects or after Business Objects?</p><p><strong>Johnny</strong>: So in terms of my exposure to it, I started using it just after it got acquired. So I think it was Seagate before that. So when I started using it, the company I worked for at the time, there was this Seagate footprint everywhere. Like quite a lot of the system accounts that we'd used for accessing databases.</p><p>'cause we didn't do things properly like you would do today with service principals and whatnot. So a lot of the accounts that we accessed things with were like Seagate, everything was labeled Seagate. 
But yeah, they had just been acquired by Business Objects at that point in time. I think by the time I stopped using it was just as they got acquired by SAP as well.</p><p>But yeah, it was just the Crystal Reports side of it. I never really touched like the universe side of it, the semantic layer, which is weird 'cause I'm like a bit of a semantic layer geek now. I guess my sort of public persona in the kind of data community, a lot of it has been built around Power BI, which is a semantic model. I'm a big fan, and it's one of the reasons that I'm a big dimensional modeling fan as well, because it's one of the things that Power BI is absolutely optimized for.</p><p>But yeah, never really got to grips with universes. I was just a pure Crystal Reports guy, and at the time it was Crystal Reports always on our OLTP databases as well. When I learnt my trade, I didn't know what a data warehouse was. I didn't know what a dimensional model was. </p><p><strong>Shane</strong>: I think it's back in the days we used to do ODSs, which were really just near real time replicas or overnight replicas of the source system.</p><p>And then you were munging all those horrible tables together and there was no analytical layer as such. It was just a bunch of bloody horrible queries, which is where Business Objects and the universes actually had massive value. And then from memory, it took them ages to get Crystal Reports to run against the semantic layer.</p><p>It reminds me of SQL Server Reporting Services, SSRS, where again, that was direct query against the relational databases. It never really had the semantic layer that you got when Power BI kind of took hold. </p><p><strong>Johnny</strong>: Yeah. Funnily enough, Crystal Reports to SQL Server Reporting Services was exactly the sort of step that I took as I left the role I was doing using Crystal Reports.</p><p>We were actually in the process of migrating to SQL Server Reporting Services instead. 
So I had a bit of a foot in both camps to an extent. Definitely from an Analysis Services perspective, you could connect SSRS reports to an Analysis Services semantic layer really easily. But the Crystal Reports that I had exposure to is exactly what you described: overnight ODS, just a database replication once a night, and then you'd write your reports.</p><p>And I got forever frustrated about the fact I'd have to write the same convoluted business logic over and over again for lots and lots of different, similar concepts, similar reports for stakeholders. Oh, we need to do this thing, and I'd do it and be quite pleased with it. And it was like somebody else wants something a little bit different, perhaps a variation on it.</p><p>And I'd have to reuse some of that logic all over again, and it sent me down a bit of a rabbit hole in terms of trying to research that there must be a better way. And that is when I picked up my first copy of the Data Warehouse Toolkit and learned about what a data warehouse was. And ultimately I actually left that role because the project to implement our first data warehouse spent two years getting off the ground and eventually I ran out of patience.</p><p>'cause they'd not even started it by the time I actually ended up resigning because I was like, no, I'm fed up with this. I'm gonna go work somewhere that's actually doing the things I want to do.</p><p><strong>Shane</strong>: Forget the days where we used to spend six months to a year doing requirements and then six months to a year buying hardware and waiting to rack and stack it and get the database installed before we could even start.</p><p>The world has certainly moved in a good way. </p><p><strong>Johnny</strong>: Yeah. </p><p>This was working in the defense industry as well. So they were very risk averse and the amount of red tape we had to go through for any kind of IT type projects was just horrific. 
Anyway, so I get the impression speaking to former colleagues that even with the way the world's gone these days, it's still like that. They're still mostly on-prem based, and getting anything up and running is still very slow. So I'm glad I got out when I did.</p><p><strong>Shane</strong>: Let's go into that idea of data warehousing, dimensional modeling, star schemas, all those good words. Just kinda want to go through and just discuss the patterns. 'cause what we're seeing is quite interesting with the adoption of dbt as a tool. Dimensional modeling seems to have come back to the fore.</p><p>For those people that are modeling, or consciously modeling, dimensional modeling seems to have come back as, again, the number one modeling technique that is used in those kind of modern data stacks. So let's start off at the beginning. Dimensional modeling, uh, has this concept of facts and dims. Talk me through those.</p><p><strong>Johnny</strong>: Yeah, that is not far off the standard sort of textbook answer. If you were to describe what is dimensional modeling: organizing your data into two categories of table, facts or dimensions. The facts tend to be the measurements, the things you actually potentially want to aggregate or trend. The dimensions are effectively the context that you apply to those.</p><p>So I always talk about the slicing and dicing of the data. I think like when you first asked whether or not I'd like to be a podcast guest and talk about dimensional modeling, we talked about the fact that I was going to potentially do a bit of a blog series about dimensional modeling and that perhaps we could then wrap that up as me being a podcast guest.</p><p>And I think I wrote part one. And part two is still in draft, and I've just not got round to it. And I was like, yeah, okay, Shane, let's just do the podcast because the blogs are going slow. But I've still been putting quite a lot of thought into what the content of those blogs is gonna be. 
And one of those is absolutely the sort of, oh, dimensional modeling.</p><p>Yeah. It's just facts and dimensions. And then you dive into it and it really isn't, it's actually lots more nuanced than that. So I guess the first layer of the onion is very much the, yeah, facts, the things that you wanna be able to measure. The event, almost, is the way that I describe it, especially in the sort of Lawrence Corr business event type context.</p><p>And then, yeah, the dimensions being, when I think of my seven W's from a Lawrence Corr BEAM perspective, my how manys being my facts, and then my other W's, my who, what, when, why, where, how. Those might be the dimensions that are gonna sit around it. </p><p><strong>Shane</strong>: Yeah. Think about dims as things we want to look at, the things we can see: customer, supplier, employee, store, those kind of things.</p><p>And then the facts are things we wanna count: order value, payment value, those kind of things. Yeah, absolutely. And what we're doing is effectively we are breaking the data out into those types of tables primarily. In the early days it was around constraints. So our databases were constrained in such a way for performance and cost that we couldn't just load all the data effectively into a data lake and query it willy-nilly.</p><p>We actually had to model it in certain ways for performance reasons as well as reuse reasons. And that's where it came from. And so yeah, we've got this idea of a dim being a thing and the fact being a measure of those things. And then the next thing that often we need to talk about is grain. Yeah. So as soon as we talk about a fact, we will typically wanna have a conversation about grain.</p><p>So do you just want to explain how you think about grain of the facts?</p><p><strong>Johnny</strong>: In many ways, being brutal about it, 
I don't tend to overthink about my grain of my facts too much and like historically, as you say, from almost like a technology constraints perspective, you potentially end up with aggregated grains and things like that.</p><p>But I've always just gone in at trying to always model this sort of atomic grain at the lowest grain anyway. So grain for me is always about level of detail, and when you build a fact table, you've got to figure out what level of detail you're gonna get to. So I'm always aiming to get to the very, very lowest grain of detail I can.</p><p>You can always roll up grain easily in a query. You can always do an aggregation on top of a very granular fact table, but it's very difficult to do the other way round. So for me, that grain, it's trying to get to the lowest detail of information available. For me, to an extent, in terms of a dimensional model as well, I'm looking at the sort of the cardinality of the fact table to its dimensions as well.</p><p>So each unit of measurement that's in my fact table should only relate to one value in my dimensions. In an ideal world.</p><p><strong>Shane</strong>: We used to talk about transactional facts and aggregated facts. The aggregated facts back then was a constraint based model. We couldn't hold all the transactions and then query them with an aggregated query fast enough, we needed to materialize it or physicalize the aggregations for a performance reason.</p><p>Whereas now, not so much. We've got the fire power to be lazy and bring it all through, but we get less work up front. More value down the stream. 
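</p><p>A rough illustration of the roll-up point above: the Python sketch below holds facts at order-line grain and aggregates them to order grain. The rows, field names and the <code>roll_up_to_order</code> helper are hypothetical, made up for this example rather than taken from the conversation.</p>

```python
from collections import defaultdict

# Hypothetical fact rows at order-line grain: one row per line on an order.
order_lines = [
    {"order_id": "A1", "line_no": 1, "product": "widget", "amount": 10.0},
    {"order_id": "A1", "line_no": 2, "product": "gadget", "amount": 5.5},
    {"order_id": "B2", "line_no": 1, "product": "widget", "amount": 20.0},
]


def roll_up_to_order(lines):
    """Aggregate order-line facts up to order grain (one total per order)."""
    totals = defaultdict(float)
    for line in lines:
        totals[line["order_id"]] += line["amount"]
    return dict(totals)


order_totals = roll_up_to_order(order_lines)
print(order_totals)
```

<p>Rolling up is a one-way trip: once only the order totals are kept, the product on each line cannot be recovered, which is the argument for storing the atomic grain.</p><p>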
</p><p><strong>Johnny</strong>: Yeah, so I think column store, going down a sort of what I call a techie gubbins perspective, the advent of column stores really helped with that. Column store databases, and now storage formats like Parquet being column based as well, do make aggregating data really easy, and it helps reduce the storage footprint as well.</p><p>So you're getting the best of both worlds with it. </p><p><strong>Shane</strong>: But if you look at a fact, you're still gonna say, is the grain of that fact an order or an order line, aren't you? </p><p><strong>Johnny</strong>: Yeah, absolutely. I'd still always strive to just go as low as I can. So in that scenario, I'd always say we should build at order line, 'cause we can always roll up to order.</p><p>But yeah, absolutely. </p><p><strong>Shane</strong>: So treat your facts as the lowest grain possible for now. And then if you have to aggregate or change the grain for another fact table, then you're doing it for specific reasons. It's not an anti-pattern, but you're applying that aggregated pattern for a specific reason. Performance, ease of use.</p><p>Yeah, something like that. Rather than using it as the default. Yeah, absolutely. And then the next core word is slowly changing dimensions. Type one, two, what is it? I can't remember. Seven. Yeah, there's all these numbers </p><p><strong>Johnny</strong>: Seven's as far as I've gotten to in terms of understanding them. In fact, I mean I could say that there is no way that I could reel off what all seven are.</p><p>'cause they're just not used that much. It's zero-indexed as well. I think sometimes people forget there's a type zero to start with, too. So yeah, it's effectively the context of your dimension with regard to your facts, and the phrase I've been using most recently is as-is and as-was type reporting.</p><p>And the one that I like to talk about is like a sickness record for argument's sake. 
So if you're somebody in HR and you wanted to understand which job titles potentially, or which job roles potentially cause the most illness, and we had a fact table that recorded a transaction every time somebody had a day off sick and then a dimension for the employee.</p><p>And one of the attributes of the employee was what their job title is so that you can then slice it and dice it and say, ah, we can see that data consultants have had a hundred sick days this year. Whereas agile coaches have only had five sick days, so maybe we need to give a bit of wellness training to our data consultants.</p><p>What you've got to take into account is the fact that someone's job title can change over time. So I'm a data consultant now, but I'm gonna be a business owner in a couple of months' time. You were a data consultant and now you're an agile data coach, ish. So yeah, type zero is the, basically, it never ever changes.</p><p>The best example I always have of that is time: there will always be 24 hours in a day. There will always be 60 minutes in an hour. There'll always be 60 seconds in a minute. So if you've got a time dimension, once you've defined it, you're never gonna have to change it. Type one is that it gets overwritten.</p><p>Me, for example, currently as a data consultant, if my job title changes to business owner, the record just gets overwritten. But what that means is that all of my historical records will now point to that value. So my historical sick record would now have all of my sickness loaded as a business owner.</p><p>So it might cover up the fact that data consultant was the thing that was causing the stress. Type two basically allows you to track changes over time. 
So every time there's a change to the record, it creates a new record with it so that you can basically then say that, ah, all those absences that happened last year,</p><p>the job title is data consultant; all the absences next year, not that I'm ever gonna have any, are the business owner absences. And then you get, essentially, a more accurate reflection of that particular analysis, but you've always got to bear in mind that isn't necessarily what the users always want.</p><p>Sometimes the users want to know what the current is. We were speaking with a client the other day and they were talking mergers and acquisitions, and so they were talking about the fact that if a particular entity changed its name, they would always want the historical records to be recorded against the current name of the entity because from a historical reporting perspective, they'd want it to be overwritten.</p><p>So you gotta understand it from that perspective. Type three is history recording, except that rather than a type two, where you get an extra record when the value changes, you get an additional column, and it basically only ever tracks the previous version of it. So you don't get the full history, but you get the previous.</p><p>The good fun I've been getting into recently is type sixes and sevens. So type six is a hybrid one and three, and type seven is a hybrid one and two, which is then you get into the realms of the fact that you can do both and starting to use things like durable keys. I'm going properly down the rabbit hole at that point, </p><p><strong>Shane</strong>: and I think it'd be fair to say that most people do ones and twos.</p><p>Very rarely do you go into the other numbers unless there's a really specific use case and then it's well documented how to do it. You just need to know they exist. It's very rare that you are gonna bake that in. 
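</p><p>The sick-day example can be sketched in Python to show how a type one overwrite and a type two new row lead to different reports. Everything here (the employee <code>E7</code>, the field names, the helper functions) is hypothetical, and for brevity the sketch marks the current row with a flag and leaves out the validity dates a real type two dimension would normally carry.</p>

```python
import copy

# Hypothetical employee dimension and sickness fact, for illustration only.
base_dim = [{"employee_key": 1, "employee_id": "E7",
             "job_title": "Data Consultant", "is_current": True}]
base_facts = [{"employee_key": 1, "days": 3}]  # absences as a consultant


def current_key(dimension, employee_id):
    """Look up the surrogate key of the current row for a business key."""
    for row in dimension:
        if row["employee_id"] == employee_id and row["is_current"]:
            return row["employee_key"]
    raise KeyError(employee_id)


def change_title_type1(dimension, employee_id, new_title):
    """Type one: overwrite the attribute in place, losing the old value."""
    for row in dimension:
        if row["employee_id"] == employee_id:
            row["job_title"] = new_title


def change_title_type2(dimension, employee_id, new_title):
    """Type two: retire the current row and add a new row with a new key."""
    for row in dimension:
        if row["employee_id"] == employee_id:
            row["is_current"] = False
    new_key = max(row["employee_key"] for row in dimension) + 1
    dimension.append({"employee_key": new_key, "employee_id": employee_id,
                      "job_title": new_title, "is_current": True})


def sick_days_by_title(facts, dimension):
    """Join facts to the dimension on the surrogate key and sum the days."""
    titles = {row["employee_key"]: row["job_title"] for row in dimension}
    totals = {}
    for fact in facts:
        title = titles[fact["employee_key"]]
        totals[title] = totals.get(title, 0) + fact["days"]
    return totals


def run(change_title):
    """Apply a title change, load one later absence, and report."""
    dim, facts = copy.deepcopy(base_dim), copy.deepcopy(base_facts)
    change_title(dim, "E7", "Business Owner")
    facts.append({"employee_key": current_key(dim, "E7"), "days": 1})
    return sick_days_by_title(facts, dim)


type1_report = run(change_title_type1)
type2_report = run(change_title_type2)
print(type1_report)
print(type2_report)
```

<p>With the overwrite, all four sick days land on the new job title; with the new row, the three earlier days stay against the old title. Which of the two is "right" depends on whether the user wants as-is or as-was reporting.</p><p>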
And again, I think in the early days we used to have to make a call between ones and twos because again, we had a constraint on how much data we could store.</p><p>So we would cherry pick which dimensions we moved to a type two, because it involved more data, more complexity. Whereas now I'm guessing everybody is type two by default and then may surface a type one view. When you query the data, you always just see it as at now, and if you want to, you can query another view, which is as at a certain date if you choose it.</p><p>Or is that not true? Are people still defaulting to type one by default and then type two by exception?</p><p><strong>Johnny</strong>: My experience has been more type one by default, type two by exception. Yeah, there's so many nuanced arguments in terms of the way to implement it these days, like even getting into the realms of persisted keys and things like that.</p><p>'cause I'm like an old school surrogate key guy and I believe that's still really important from a relational integrity perspective. You get quite a lot of people who prefer the sort of indem key. So basically the importance of a surrogate key is this idea that it's almost like abstracting the business entity away from the business keys.</p><p>The fact that you're basically just having something, a key that identifies a particular record that's unique to your data platform. When you get into the realms of things like type two dimensions, it's really important to have a surrogate key because you're gonna get duplicate business keys or natural keys in there.</p><p>'cause every time the record changes, the business key's not gonna change. But you need a new unique key in your data warehouse. Some people are fans of in Keys these days, there's this idea of hashing values to produce a key that's gonna be pretty consistent. There are some drawbacks with that. The hashes aren't always perfect.</p><p>You can end up with clashes. 
I was always coached on just the incrementing key type approach, and there's loads of people who argue against that these days. My experience so far is that people have gotten a bit lazy by using the Indem key value because they can create them on the fly and they're like, oh great, we're gonna have perfect relational integrity because when we use this hash, it's gonna make all the records match, but they're still not checking that they've got that integrity between their data sources.</p><p>So you can still end up with missing records and mismatches anyway. So I tend to use the old school surrogate key 'cause it at least forces you to go back and look at your dimensions and make sure that your values exist and that you've got things like unknown members and things like that. I feel like I'm going down another rabbit hole. I've just said lots and lots of phrases, and I take for granted that I know what they mean. </p><p><strong>Shane</strong>: But the good thing about dimensional modeling, and one of the reasons I think it is still so popular, is Kimball and Margy Ross: they wrote a hell of a lot. They wrote lots of books that were easy to read and understand. They wrote lots of blog posts, which became books.</p><p>Some of those old tips and tricks, we should probably archive them so we don't lose them if those sites ever go down. They give us really good examples of, if you have this problem, this is how you deal with it. I'm a big Data Vault fan and we haven't had that gift in Data Vault land. If you read any of the Data Vault books, they're pretty dire.</p><p>They don't explain things well, in my view, 'cause this is my personal opinion, even though I use the modeling technique all the time. All the tips and tricks, there's very few of them that are usable. And lots of people have tried. 
So yeah, I think part of it is if you wanna understand what a surrogate key is, get the books or go read the blog posts and it will explain them in infinite detail in a way you can understand.</p><p>I think one of the key things you said was the surrogate key pattern. This pattern of saying, I'm gonna look up a unique business key and then I'm actually gonna store another unique key, maybe an incremental integer key that's just incremented up by one, to say in the data warehouse and the data platform that is now the identifier for this customer, this employee, this supplier.</p><p>We definitely use different technical patterns. Now, a concatenated business key, I think, is a viable pattern with the technology we have today. Hash keys. Yes, we have collisions. That is a problem we need to worry about. How often do they really happen? Would we even know? We wouldn't know. There are technical implementations, the ability to wrap trust patterns around it, to say, if I've got a key in my Salesforce and I've got a key in my operational system: in my Salesforce, business key one is Johnny; in my operational system, business key one is Shane. I have to actually do some logic to say that I can't just slam those keys and put them in the same place.</p><p>I have to make sure that they're unique. Otherwise I'm gonna do a whole lot of bad things. Whether we surrogate, whether we hash, whether we concatenate business keys, who cares? Pick one that works for you for the technology you're using and then make sure it's accurate, it's trustworthy, it's got all that rigor around it.</p><p>'cause those are the patterns that actually really count. So then let's go back to the SCD two. 
So I've got Shane the consultant and I've gone on for my lovely two weeks holiday because I'm working for a consulting company and I actually get leave and then I become Shane, the company owner, and I go and take my one day weekend that I get forced to take by somebody else for the year.</p><p>We have a bunch of techniques on how we know that record has changed: end dating versus windowing. So do you wanna talk through those? </p><p><strong>Johnny</strong>: Yeah. So in terms of the way I've always done it, in terms of all the implementations I've seen, it's always about having that valid-from, valid-to date, or a start date and end date, that can be supplemented with an is-active flag potentially.</p><p>So you always know what the current one is. I've seen different people do them different ways. Personally, for me, having a definitive start date and end date works really nicely. One of the patterns that I hate seeing is people who join on the is-active indicator. 'cause if you happen to be loading a historic record, if it's a backdated payment, that needs to go back to when you were working as a consultant.</p><p>But we load it based on the active flag being business owner. For me, that's an inaccuracy. So for me, I would always sort of time bound it between my active windows, and I always hate leaving an end date as null as well, because then you're into the realms of casting that as some high date, just to be able to figure out your date ranges.</p><p>I guess it's one of the interesting things about the pattern is even the patterns themselves have patterns within them. There's that pattern at a conceptual level, and then there's even other patterns at an implementation level that can differ as well. But yeah, type two. 
You've basically got an active-from date and an active-to date: what period of time a particular record was valid for.</p><p><strong>Shane</strong>: And this is the key, is that there's the logical modeling or the conceptual modeling of dimensional. I've got a bunch of dims, I've got a bunch of facts, and then there's the physical implementation. How am I doing my end dating strategy? So as you said, I was gonna ask you, do you have start and end dates, and if it's a current active record, is your end date a null or do you pick a 9999?</p><p>Yeah, that's always interesting. And then we got this idea of windowing, which kind of came outta the Hadoop stage. So when we moved from relational databases to Hadoop, one of the problems was actually changing a record was incredibly expensive. Yeah. So if we landed a record and then we wanted to end date it, actually, it wasn't performant.</p><p>So we moved to this upset model, this idea that we wanna be insert only if we can. And so that forced us down the idea of using a pattern of windowing. We only ever have the start date. And then whenever we run the query, we'd go back and say, run a window function, and tell me between this period, what is effectively the active record.</p><p>And that was an expensive query. So we just gotta make choices. </p><p><strong>Johnny</strong>: I've seen windowing, and it wasn't even in a Hadoop implementation, it was just in a relational database where somebody hadn't done an end date. They'd just, say, same rationale, but for the technology, probably the wrong choice. And even for that, it was an expensive query, which is why I prefer not to do it.</p><p>But again, I've missed that era to an extent. I went from relational databases and then just leapfrogged straight to lake houses and open table formats. And so with your Deltas and your Icebergs, that kind of updating historic records. 
I don't wanna go down the rabbit hole of explaining how it happens, but functionally you can, even though sort of the way under the hood it works is insert only, </p><p><strong>Shane</strong>: yeah, it is still an update of records in those technologies.</p><p>It's still a technical anti-pattern for the technologies. It's just that they've worked out ways to make it work and you shouldn't care. You may get a query go from two seconds to 2.1 seconds unless you really care, which you don't at that level. So you just gotta choose a pattern that works for you. You can do start only dates and windowing and then create views on top, and the view can give you an end date on the fly.</p><p>There's lots of choices. To make it easy for the end user versus performance, you just gotta pick one. The interesting thing for us is, with our product, we end date, because it's way cheaper. With the partitioning strategy that we use with BigQuery, it's way cheaper for us to end date the records than it is to have a windowing function.</p><p>So when we looked at it and cost was key for us, we said, it's a trade off decision we're gonna make. We're gonna go touch that record because it saves us money over everything we do if we use that pattern. So let's pick one. And then the other key is when you walk into a new site. You now got a whole lot of questions you need to ask yourself.</p><p>Are they dimensionally modeling? Great. Okay. What's the typical grain of the facts? Is it transactional? Yeah. Are there any aggregate facts? No. Okay. What's their dim strategy? It's primarily SCD type one. Okay. How do they make a decision when they need a type two? Okay. When they do a type two, what's their end dating strategy?</p><p>Are they relying on active flags? Are they start and end dating? How do they fill out the end-dated field? Is it a null? Is it a specific date? Are they windowing? Where's the query for the window? Is it left to the user? Is it in Power BI? 
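</p><p>One way to picture the start-date-only "windowing" option mentioned above is the Python sketch below, which derives each row's end date from the next row's start date for the same business key. It is roughly what a SQL <code>LEAD</code> window function partitioned by business key would do. The rows, dates, field names and the 9999-12-31 high date are assumptions chosen for illustration.</p>

```python
from datetime import date
from itertools import groupby
from operator import itemgetter

HIGH_DATE = date(9999, 12, 31)  # assumed stand-in for "no end date yet"

# Hypothetical insert-only dimension rows: each change adds a row with a start date.
rows = [
    {"employee_id": "E7", "job_title": "Business Owner", "valid_from": date(2025, 9, 1)},
    {"employee_id": "E7", "job_title": "Data Consultant", "valid_from": date(2021, 1, 4)},
    {"employee_id": "E9", "job_title": "Analyst", "valid_from": date(2023, 6, 1)},
]


def with_derived_end_dates(rows):
    """Derive valid_to from the next row's valid_from, per business key."""
    ordered = sorted(rows, key=itemgetter("employee_id", "valid_from"))
    result = []
    for _, group in groupby(ordered, key=itemgetter("employee_id")):
        versions = list(group)
        # The latest version of each key gets the high date as its end date.
        next_starts = [v["valid_from"] for v in versions[1:]] + [HIGH_DATE]
        for version, valid_to in zip(versions, next_starts):
            result.append({**version, "valid_to": valid_to})
    return result


windowed = with_derived_end_dates(rows)
for row in windowed:
    print(row["employee_id"], row["job_title"], row["valid_from"], row["valid_to"])
```

<p>The stored rows are never updated; the end date is recomputed every time the function (or, in a database, the view) runs, which is the query-time cost being traded against the write-time cost of end dating.</p><p>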
There's all these patterns within patterns that are known, but as soon as you walk into a place you've never seen before, you have to ask all those questions.</p><p>'cause you need to know, you need to follow those patterns. Heaven forbid you have a dimension where one dimension is end dated and another is windowed. Yeah, you have to say, what the fuck, what? There may be a reason, but I'm gonna ask really hard questions about why are we using a different end dating strategy for dims in the same data platform?</p><p>There's gotta be a reason for that. Apart from, oh, a new developer came on, that's why they did it.</p><p><strong>Johnny</strong>: That would absolutely fry my OCD. Just in terms of consistency. I can get quite opinionated on what my preferred patterns are, and I don't mind being challenged on those, and even the patterns I prefer have some trade offs, and if those trade offs aren't the right trade offs for a client, then it's, okay, great.</p><p>If you're wanting to optimize for something different to the kind of default that I try to optimize for, that's fine. But yeah, consistency's gotta be key. It's great though, that little spiel you just went on in terms of all those things you've gotta think of. And in my head I was like, just facts and dimensions. Dimensional modeling, just facts and dimensions.</p><p><strong>Shane</strong>: Well actually, I'll tell you what, you've just got me thinking again, so you know how we're working on that data layer checklist. Yeah. I'm wondering if there's like a dimensional checklist when you walk in as a consultant to a site you haven't seen before. You just have a little checklist where you go through and you say, here's the patterns.</p><p>It could be, let me just tick some boxes so that I'm thinking about it. The checklist is just helping me think to ask those questions. 
And I'm like, yeah, I've gotta reset because I've come off a gig for another customer that's got a slightly different pattern and now I've gotta reset my brain. And as a consultant, that flipping between organizations, that change of slight variation in pattern sometimes,</p><p>it gives you a problem because you haven't reset your brain to the greenfield. </p><p><strong>Johnny</strong>: Yeah, I've struggled with it recently with a client. </p><p>There's a couple of things that they've done. First of all, are you familiar with this? If you were to look up dimensional modeling and fact tables on the internet, the common answer says there are three types of fact table.</p><p>You've got transactional, you've got accumulating snapshot, and then you've got snapshot. Transactional is almost as it is. It's pretty much append only: as a new action happens, you keep adding to it. Accumulating snapshot tends to be more sort of process based. If we took booking this podcast as an example, I think you probably asked me in about March.</p><p>So you'd create a record that said Shane invited guest for podcast, 1st of March, and then the ball was in my court in terms of committing to a date, and I sat on my hands for a long time, and I think last week was when I was like, let's do it. So that would've been mid-July. So we'd update our record to say, date invited, date confirmed, and then the date it happened, for argument's sake. So 21st of July as it is now. So that's one record, but we've updated it three times. So, uh, accumulating snapshot has been updated based on dates. A clean snapshot, you just take a full data dump every single day. So things like stock balances are normally quite good for that.</p><p>Bank balances, maybe things like that. Finance teams are quite keen on the month end closing balances and things like that. So you take a snapshot of the data at point in time. 
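</p><p>The podcast-booking example can be sketched in Python to contrast the transactional and accumulating snapshot styles: the same three milestones produce three appended event rows in one, and a single row updated three times in the other. The structures and the <code>record_milestone</code> helper are hypothetical, and the year and the exact mid-July day are assumptions, since the conversation gives neither.</p>

```python
from datetime import date

# Transactional style: append-only, one new row per event (hypothetical data).
booking_events = []

# Accumulating snapshot style: one row per booking, milestones filled in over time.
booking_snapshot = {"guest": "Johnny", "date_invited": None,
                    "date_confirmed": None, "date_recorded": None}


def record_milestone(milestone, when):
    """Append to the transactional list and update the snapshot row in place."""
    if milestone not in booking_snapshot:
        raise KeyError(milestone)
    booking_events.append({"guest": "Johnny", "milestone": milestone, "when": when})
    booking_snapshot[milestone] = when


# The year and the mid-July day are assumptions; the transcript gives neither.
record_milestone("date_invited", date(2025, 3, 1))
record_milestone("date_confirmed", date(2025, 7, 14))
record_milestone("date_recorded", date(2025, 7, 21))

days_to_record = (booking_snapshot["date_recorded"]
                  - booking_snapshot["date_invited"]).days
print(len(booking_events), "event rows,", 1, "snapshot row")
print("days from invite to recording:", days_to_record)
```

<p>Having all the milestone dates on one row makes lag measures, such as days from invite to recording, a simple subtraction, which is a large part of why the accumulating style suits process-type facts.</p><p>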
So you've got those three types, but actually there is a fourth type that just doesn't get mentioned very often, and I think it's 'cause Kimball only thought about it as a bit of an afterthought.</p><p>People refer to it as a type 145 fact table. So have you heard of this type 145? </p><p><strong>Shane</strong>: No, I haven't actually, but I'm like, wow, what a great name, if you had to give a thing a stupid name. Yeah, </p><p><strong>Johnny</strong>: yeah. So to be fair, Kimball didn't call it a type 145, but data modelers have gone with that because of all of the supplementary information on the Kimball Group.</p><p>All of the blogs are numbered. So article number 145, if you were to look it up, basically refers to this fourth type of fact table, which is an accumulating time span snapshot. So we've basically taken two out of the three types of fact table and mashed them together somehow. The other thing that people call 'em, which really grinds my gears 'cause it's a complete oxymoron, is they'll call it an SCD type two fact table.</p><p>And it's, okay, SCD is dimension tables and this is a fact table, so it can't be a type two. But the reason people get confused with it is 'cause it works the same way. It's like a process-driven type of fact table whereby the status of something changes and you get a new record every single time, but it's a change in status.</p><p>They're really good for processes. Things like a sales funnel or something like that. If you've got a particular opportunity, it might be you get a lead and it's a lead between the 1st of January and the 1st of February, and then on the 1st of February it becomes a qualified lead, and then after two or three weeks of talking, it then becomes a proposal.</p><p>And then maybe it becomes a sale. So you've basically got the same record that goes through four different statuses. 
And then rather than capture that by making it wide and putting extra columns in to represent each stage of the process, you basically create new records and you have a start date and an end date</p><p>associated with a record, like a type two slowly changing dimension. </p><p><strong>Shane</strong>: That's what I call an administration event change. Invoice entered, invoice reviewed, invoice approved, invoice paid. 'Cause normally it would be a bunch of columns. And now I've got a whole lot of SQL magic I've gotta do if I want to go and grab those columns and make them rows for whichever BI tool I'm using.</p><p>So what you are saying is effectively we get a new row for that dimension key when there's a state change, and we have a date for that state change. What about event slamming? So let's say I've got an event of something ordered. I've got an event of a payment, I've got an event of a delivery, and I've got an event of a refund.</p><p>In my understanding, with standard dimensional modeling, I'd have four fact tables. I'd have an order table, a payment table, a delivery table, and a refund table. Is that true? </p><p><strong>Johnny</strong>: Oh, it's the classic consultant answer at this point: it depends. My personal preference when I design, and this is not so much a Kimball thing, but more of a Lawrence Corr and BEAM type thing, is I always tend to conceptually model them as separate events, and then having modeled them as separate events, if their conformed dimensionality...</p><p>There's another great soundbite for you, conformed dimensionality. It's on my list. Yeah. If their conformed dimensionality is identical, or so close to identical that it's not gonna make a difference, and if their grain is also identical, then I may well model them as a single table. But it depends. </p><p><strong>Shane</strong>: Okay.</p><p>But if I had a fact table where a thing is a thing. 
One fact table in my whole warehouse, one fact table that's every event, all at the lowest grain. That's an anti-pattern with dimensional modeling; we don't have fact tables that are thing is a thing. So let's say I've got a fact table with three keys.</p><p>Thing one, thing two, thing three. Yep. And thing one goes back to a dimension. And that dimension is a bunch of keys and then types. So that dimension holds employees, suppliers, customers. And there's a typing on it. And then thing two is the event. So it goes back to another dim, and that dim holds all the records and they're typed by order, payment, refunds, delivery, packing, all those kind of things.</p><p>And then we've got thing three, it's some kind of location dim. And so again, I go back to a dim table and I've got a bunch of keys and it's every store in the world and then every website and every URL. So I'm modeling at the highest level of abstraction, where I've effectively got three dim tables and one fact table.</p><p>And everything's a thing is a thing. That is an anti-pattern for dimensional modeling. Dimensional modeling is designed to pick up some of the business language. Yes, I should look at a dim and actually understand. That's something I can look at and understand that it holds a bucket of things that are different from a bucket of other things in my organization.</p><p>Yeah, I agree. So do we still see dimensions of people, which is customers, suppliers, and employees? Or would you tend to model a customer dim and an employee dim and a supplier dim? </p><p><strong>Johnny</strong>: For me, it always goes back, to an extent, to speaking with my business users and how they interpret them. For that kind of example, almost unquestionably they're gonna be separate dimensions.</p><p>That's gonna be an employee, a supplier and a customer. Yeah. Almost undoubtedly that's three separate concepts. 
Likely coming from three separate, or possibly separate, systems, but certainly, if they're all in one system, separate tables. Where it gets fun is when you get into the realms of, so, an example I'm working with at the moment: we've got this idea of a researcher dimension, and a researcher is actually a subset of employees, so it's, okay, should we still have an employee table and a researcher table?</p><p>We typically tackle things like that with role-playing dimensions, having an employee dimension that can be filtered and maybe abstracted in a view as different types and things like that. </p><p><strong>Shane</strong>: Yeah, and it's the same with suppliers and customers. Do I have an org dimension that is effectively role-playing customer and supplier?</p><p>It's choices. As long as they fit the dimensional modeling pattern, you've got a dim, you've got a fact. You're deciding the grain of your fact. You're deciding what form of slowly changing your dim is. You've decided how you're gonna manage your keys. It's okay. Right? You now just come into some choices about which pattern works best for you.</p><p>Let's go back to a couple of those things you talked about. Let's go through conformed dimensions first and then role-playing dimensions next, because they're key terms that we'll see in dimensional modeling all the time. </p><p><strong>Johnny</strong>: Conformed dimensions. So this is the idea that you can reuse the same dimension across multiple facts.</p><p>So if we had a sickness data model and we had periods of sickness, and we had our employees so we knew which employees had been sick, you could use that employee dimension with your sickness facts and use that as part of a modeled business process. And then you can do analysis on that. 
If you had a completely different business domain, let's go with sales, and you wanted to be able to analyze sales, and you wanted to be able to see which employees had sold the most of a particular product, you wouldn't remodel your employee for that sales domain.</p><p>You would use that conformed dimension. So this is the idea that a single entity can be used across many business events, effectively. So the classic one's date. I think not far off every dimensional model ever designed has some kind of date dimension attached to it, and you're likely just gonna reuse that date dimension, be it in a finance domain if you were doing financial reporting, or if you're doing sales reporting, or if you're doing marketing or anything HR related.</p><p>Writing and maintaining a single date table means basically that date dimension's conformed across all of your potential facts. </p><p><strong>Shane</strong>: And that's one of the things that, I don't know if it's unique to dimensional modeling, but it's definitely one of the things that you do in dimensional modeling that you don't tend to do in some of the others.</p><p>If I'm data vault modeling or I'm activity schema modeling, for me, dates are just attributes of a thing. I don't hold a hub and sat for dates in data vault. I don't hold dates as a secondary key in activity schema. So in dimensional modeling you will typically see a date dim, and it's used across every fact table.</p><p>Yeah, so another kind of unique thing, I dunno if it's unique 'cause I haven't actually seen every model in the world, but definitely a pattern of a dimensional model is a date dim. And role-playing. Talk me through role-playing dimensions. </p><p><strong>Johnny</strong>: Yeah. 
Role-playing is the idea of a single dimension that can be reused for multiple different things in different contexts.</p><p>Date is actually quite often a role-playing dimension, because if we take the example we talked about before, you might have an order date, a delivery date, a refund date. Typically, you wouldn't create three specific date tables relating to that. You'd have one date table that you can relate to your fact tables via</p><p>a variety of foreign keys, and you'd reuse that single dimension in different contexts. Date's a really typical one, but even things like, I'm trying to think of good ones. Recently I did a supply chain type thing where location was really important, but it was always a from location and a to location, shipping from one location to a different location.</p><p>But actually they could always be going in either direction. So rather than having a from location and a to location dimension, we just had a location dimension with everything in it. And then we just role-play to it, depending on the context of whether it was the destination or the origin. </p><p><strong>Shane</strong>: And so if I look at the pattern itself, effectively, in our fact table,</p><p>we have two columns with keys, but both of those keys come from the same dimension. And it's the typing of the dimension. So another one would be, if we ever wanna see customers and suppliers in the same fact, we might have an organization dim and it's typed by customer, supplier. And then we'll see those keys in two different columns in the fact table, possibly.</p><p>Let's go through the rest of the weird dimensional things. So junk dimension. Talk me through that one. That's a cool name, right? That's better than fact 145. Junk dimension. </p><p><strong>Johnny</strong>: The junk dimension. At the same time it's a terrible name 'cause it makes it sound really throwaway and not particularly valuable.</p><p>So going one back from junk dimension. 
Have you ever come across the concept of a centipede fact table, or is that a new one on you? That's a new one on me. Cool. So a centipede fact table is where you end up with a fact table that's got lots of different context around it, and the context is very specific and granular.</p><p>So you end up with a wide fact table that's got lots and lots of dimensional keys in it. And then the dimensions it's joining to end up being very narrow dimensions with only a few attributes on them. And to write any given query, you end up basically having to do lots and lots of joins all over the shop.</p><p>And they call it a centipede fact table because if you think about an ERD, the fact table ends up being very long and thin with lots of relationships coming off it like little centipede legs. And they're difficult to use because your dimensions are spread across so many different sorts of categories.</p><p>They're difficult to navigate. You've gotta write these really unwieldy SQL queries where you've got lots and lots of joins, and they only tend to be joins that are one hop. We're not talking snowflake. And so they're not that bad from a performance perspective, but from writing them they become unwieldy, and you can cure that.</p><p>There's a couple of things really. If you've got any sort of commonality between those lots and lots of small dimensions and you can perhaps denormalize them into a single entity, that works nicely. But sometimes you just can't, and the way around that, the idea of this junk dimension, is to take all those little itty-bitty, low-cardinality type things and actually just create the product of them all into one fat dimension that combines all possible combinations of them, so that you've got this one entity that you can navigate through.</p><p>A miscellaneous dimension is almost one way it could be described. 
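</p><p><strong>Editor's note</strong>: A minimal sketch of the junk dimension idea just described, taking several low-cardinality attributes and building one dimension from every possible combination of them. The attribute names and values are hypothetical and not from the conversation, and a real build would usually persist the result as a table.</p>
<pre>
```python
from itertools import product

# Hypothetical low-cardinality attributes that would otherwise each
# become a tiny dimension hanging off the fact table.
payment_types = ["card", "cash"]
channels = ["web", "store"]
gift_wrapped = ["Y", "N"]

def build_junk_dimension(*attribute_values):
    """Return {combination: surrogate_key} for every possible combination."""
    combos = product(*attribute_values)
    return {combo: key for key, combo in enumerate(combos, start=1)}

junk_dim = build_junk_dimension(payment_types, channels, gift_wrapped)

# The full combination of values identifies a row in the junk dimension;
# the fact row stores only the single surrogate key.
order_junk_key = junk_dim[("cash", "web", "N")]
print(len(junk_dim), order_junk_key)
```
</pre>
<p>With two values per attribute the dimension has eight rows, so the fact table carries one key instead of three.</p><p>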
In fact, miscellaneous dimension for me feels like a better description than a junk dimension, but it's almost this kind of, just group all these things together, because lots and lots of little homes for them doesn't make much sense.</p><p>So we're just gonna stick them in one big group together and allow people to query by that instead. </p><p><strong>Shane</strong>: And so how does the key for that work? Because now we've got a bunch of things that kind of aren't the same. So we're still gonna surrogate it with an incremental key, but the business key's gonna have no real relationship.</p><p><strong>Johnny</strong>: The business key ends up being just a composite key of every column, and that's how I've always done them. It's any given combination of a set of different things. I'm trying to think of a good example of it. So I would say in my life, I've only actually implemented junk dimensions, it might only be like, twice.</p><p>I don't find it a particularly common pattern, but I'd know how to do it if I needed to do it. And I think I'd recognize the need for it if I saw it as well. And sometimes I struggle to get my head around why I would understand that. I'd see a centipede fact table and be like, there must be a better way to consolidate this and make it</p><p>easier to navigate. </p><p><strong>Shane</strong>: So it's in your toolkit. It's not an anti-pattern. Yeah, yeah. It's a pattern you use, but you use it very rarely. It's only when you go, ah, actually this is gonna cause me a problem if I do it the normal way. Let me use that alternate pattern and bring it in. And then degenerate dimension.</p><p><strong>Johnny</strong>: Yeah. So again, degenerate dimension is another one that was gonna be in my blog series in terms of, oh, it's just facts and dimensions, isn't it? So degenerate dimension, different people seem to have different interpretations of them. 
My interpretation, effectively, is it's a dimensional value, it's context that would ordinarily exist on a dimension, but you just retain it on your fact table instead.</p><p>So you basically get rid of the need to join out to a separate table to use it. The reason for it being that it's a piece of contextual information that is only ever relevant in the context of a given fact. 'Cause at that point, if that given dimension is never ever gonna have any kind of conformity with other things, then you may as well just eliminate the need for it.</p><p>Again, if I give you that description, theoretically you could end up with very wide degenerate dimensions, which I definitely wouldn't recommend, 'cause then you end up with a wide fact table. So if it's only a small dimension where the context of it only applies on a given fact, then yeah, you use 'em as degenerates.</p><p>Again, this almost feels like a bit of a personal preference type thing, but it's a pattern I really try and avoid where I can. 'Cause the typical thing that happens for me is that you go through the requirements gathering and you come across this particular concept and it feels like it's a good fit for a degenerate dimension.</p><p>And then you go to your customer and your client and you discuss it conceptually with them: does this thing only ever exist in this context? And they say, yep, absolutely. So you do it as a degenerate dimension, and then the next requirement comes along and all of a sudden, oh yeah, we need that piece of information relating to this different fact, at a different grain as well.</p><p>And at that point, people start trying to join fact tables together, which is a discouraged practice as well. Even if it's quite high cardinality, even if it's gonna be almost one-to-one with a fact table, I will always try to avoid degenerate dimensions if I can. 
Just to promote that reusability in other processes at a later date. </p><p><strong>Shane</strong>: Again, it's not an anti-pattern, but it's a pattern that's used, really, as an exception. You have to justify why you're using that pattern. It's a pattern that I use rarely. It's a pattern that I see everybody else use all the time, and then I bang my head against the wall. One of the problems with degenerate dimensions is, think about it as you are querying the data.</p><p>You're a user coming in and you don't have a semantic layer, so you're hitting straight against the dimensional model as your semantic layer, and now you've got another rule. So the first rule is you join your dims to facts. Grab your fact table, you join it to the dims you want, and you're effectively just creating a denormalized one big table, is what you're gonna get back with those queries.</p><p>And then you are saying, oh, but actually you probably need to check the fact table to make sure there's no degenerate dimensions, because if there's an attribute you need, and it's actually in the fact table, not the dim, now you've gotta go and actually do something different to your query. So you are changing their query pattern from the standard pattern to a pattern plus, and now they've got to remember to check whether that attribute's sitting in the fact table or not,</p><p>versus attributes always sit in a dim. And so again, we don't have the ability to template our code as much. I suppose it's only a second pattern, right? Run this query and it's got degenerates. But again, why are we justifying slamming the attribute on the fact table when attributes go on a dim? Maybe back in the days when we had constraints, but we don't tend to have those now.</p><p><strong>Johnny</strong>: Yeah, I mean, the other sort of fallacy that I hear oft spouted is that in the world of Spark engines and parallel processing and those kind of things, oh, it's more efficient to keep it on the flat table. 
And I get the impression that maybe that was true four or five years ago. But all of the major players these days are optimizing their engines for that BI workload, for that kind of star schema type shape, and it deals with them absolutely fine.</p><p>I've had it a couple of times recently, it's, oh yeah, we need to reduce the number of joins, and it's like, why? The engine can deal with it now. It's not as big a problem as people think it is. </p><p><strong>Shane</strong>: I think, again, there's lots of those patterns where we have a preference and we try and justify it for the technology that we built that preference from.</p><p>But things have changed. So test it. Just run the experiment. Simulate the volume of data you think you're gonna have and the types of query you're gonna run, and run them and see which performs better. Okay, so you talked about joining across facts. So again, there's this idea of, was it a bridge table that allows you to slam</p><p>facts together or join them? I never quite remember that one. Talk me through. Yeah. </p><p><strong>Johnny</strong>: I've never used bridge tables to span facts, 'cause ultimately that's just a conformed dimension, effectively. Kimball will talk about this idea of drill across. And the way to drill across is, again, it's really strange.</p><p>I found myself in this position where I've lived and breathed this stuff for so long, I find it sometimes difficult to put into words exactly what's meant by some of it. Kimball puts it into a really great description in terms of this idea of drilling across fact tables, and, have you come across things like the fan trap and the chasm trap and all those kind of things?</p><p>And that fact tables tend to be at different grains, and if you join them together, the cardinality is not gonna match. But if you basically root your queries in your dimensions and then aggregate across facts, it works absolutely fine. 
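</p><p><strong>Editor's note</strong>: A minimal sketch of the drill-across idea just described, using Python's built-in sqlite3 module. The table names and values are hypothetical. It compares a direct fact-to-fact join, which inflates both totals, with aggregating each fact to the grain of the shared dimension first and only then joining.</p>
<pre>
```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (product_key INTEGER, amount REAL);
CREATE TABLE fact_refund (product_key INTEGER, amount REAL);
INSERT INTO dim_product VALUES (1, 'widget');
INSERT INTO fact_sales VALUES (1, 10.0), (1, 20.0);
INSERT INTO fact_refund VALUES (1, 5.0), (1, 2.0), (1, 1.0);
""")

# Naive fact-to-fact join: every sale row pairs with every refund row,
# so both totals are inflated.
naive = con.execute("""
SELECT SUM(s.amount), SUM(r.amount)
FROM fact_sales s
JOIN fact_refund r ON r.product_key = s.product_key
""").fetchone()

# Drill across: aggregate each fact to the conformed dimension's grain
# first, then join the two small result sets through the dimension.
drill = con.execute("""
WITH s AS (SELECT product_key, SUM(amount) AS sales
           FROM fact_sales GROUP BY product_key),
     r AS (SELECT product_key, SUM(amount) AS refunds
           FROM fact_refund GROUP BY product_key)
SELECT p.name, s.sales, r.refunds
FROM dim_product p
LEFT JOIN s ON s.product_key = p.product_key
LEFT JOIN r ON r.product_key = p.product_key
""").fetchone()

print(naive)
print(drill)
```
</pre>
<p>The naive join reports 90.0 and 16.0, while the drill-across query reports the true totals of 30.0 and 8.0 for the one product.</p><p>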
It's always gonna depend on how the fact tables have been structured, to an extent.</p><p>But as a rule, if you basically structure your queries that way, that's how they're designed to work and how they're designed to drill across, and that's what your conformed dimensionality does. Bridge table's an interesting one. Bridge tables are more, again, one of the patterns I talked about when we talked about defining grain, this idea that for every unit of measurement in a fact table, for every transaction, it should have a one-to-one relationship with its dimensions.</p><p>It's not always the case. The classic example is bank accounts. So I have a joint bank account with my wife. If a bill goes out on a direct debit for a mortgage, for example, that's two customers associated with it. It's one bank account, but it's two customers. So you've almost got to have a bridging table that can resolve that, so that your fact table only has one transaction and it can basically join through a bridge table and resolve out to those two customers.</p><p>So that's where bridge tables mainly get used. So I went down a real rabbit hole, and again, this is, for me, the fact that I take pride in this sort of amount of experience and knowledge that I've amassed around dimensional modeling, but I'm definitely not encyclopedic. So it's still, for me, hang on, I've got a problem.</p><p>I dunno how to solve this, but I have got a shelf of books and in those books are patterns. So the one I was trying to solve recently was around hierarchies and the best way to resolve recursive hierarchies. You can just flatten 'em out, which kind of works. If they're not fixed depth, that can get a bit messy, but it</p><p>does work. But there's also patterns you can use with bridge tables that will basically help you resolve recursive hierarchies as well. 
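</p><p><strong>Editor's note</strong>: A minimal sketch of the joint bank account example from a moment ago, using Python's built-in sqlite3 module. The table names, keys and amount are hypothetical. The fact table holds a single transaction, and the bridge table resolves the one account out to its two customers.</p>
<pre>
```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE bridge_account_customer (account_key INTEGER, customer_key INTEGER);
CREATE TABLE fact_transaction (
    txn_id INTEGER PRIMARY KEY, account_key INTEGER, amount REAL);
INSERT INTO dim_customer VALUES (1, 'Holder A'), (2, 'Holder B');
INSERT INTO bridge_account_customer VALUES (100, 1), (100, 2);
INSERT INTO fact_transaction VALUES (1, 100, 950.0);
""")

# The payment is stored once, at the grain of the transaction.
fact_rows = con.execute("SELECT COUNT(*) FROM fact_transaction").fetchone()[0]

# Going through the bridge attributes that one row to both account holders.
by_customer = con.execute("""
SELECT c.name, SUM(f.amount)
FROM fact_transaction f
JOIN bridge_account_customer b ON b.account_key = f.account_key
JOIN dim_customer c ON c.customer_key = b.customer_key
GROUP BY c.name
ORDER BY c.name
""").fetchall()

print(fact_rows, by_customer)
```
</pre>
<p>Each holder sees the full 950.0, so adding the per-customer figures together would double count. That is why totals are better taken from the fact table itself, or with an allocation weight on the bridge.</p><p>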
So that's the idea that we're talking about resolving many-to-many type relationships, and you basically end up with every level of the hierarchy becoming its own record, and then the bridge table resolves it back to the fact.</p><p><strong>Shane</strong>: Okay, so bridge tables are a bridge between the facts and the dims, not a bridge across facts, is what you just said. That's how I've always used them. </p><p><strong>Johnny</strong>: I feel like I'm gonna have to get my head in my books and see if I can find any examples of bridge tables between facts. </p><p><strong>Shane</strong>: I can't even remember what they were.</p><p>And then I just think about bridge tables from data vault, and so I'm applying a different pattern for that name in my head. </p><p><strong>Johnny</strong>: It's so bizarre. My first foray into data platforms after having been a report analyst was Kimball. And in every subsequent organization I've worked in, and I think you alluded to the fact that it's probably the number one applied pattern in the data industry,</p><p>it's all I've ever known. I've got real blind spots for data vault. I can talk about hubs and satellites and vaguely sound like I know a little bit, but I would not have the foggiest how to start if I had to do a data vault implementation. </p><p><strong>Shane</strong>: And it's a different language and it's the same patterns to a degree.</p><p>So with data vault, effectively we just take the dimensional key and we make it a hub. So it holds the key only, and then we take the attributes out of a dim and we make those sats, but we can add more than one sat table for a hub. So again, it's a very similar pattern, but it's different and the language is different and the patterns are slightly different.</p><p>And so you've gotta reset your brain, and then you go to activity schema. And that's, again, very similar but very different. 
And again, like you said, there's the depth and the breadth. You can do dimensional modeling with some of the core patterns and you'll do really well, and then eventually you'll hit an edge case where you need some of the more obscure patterns, and you need to know they exist.</p><p>But as you said, there's some good books, there's all the Kimball books, there's all the blog posts. And then with Lawrence Corr's stuff, half his book is BEAM, which is the stuff I use around the who does whats, and effectively understanding requirements and concepts and conceptually modeling. And then the other half of his book is how to apply all that to dimensional modeling.</p><p>And I remember quite a large part of the course is around ragged hierarchies and how you model those in dims, and I'm like, I don't care, I'll just use an OLAP cube; back then, it took care of it for me. So yeah. Alright, let's get onto what I think is probably the last one I can remember, which is factless facts, which kind of sounds dumb.</p><p>It's kinda like, factless facts. Yeah, totally. </p><p><strong>Johnny</strong>: I'm trying to remember who I was talking to about this the other day, and we just basically decided that's a nonsense, there's no such thing as a factless fact table. The fact table still has facts in it. It's just that there are no values that you're gonna aggregate,</p><p>if you like. So the fact basically represents the intersection of all the dimensions, and ultimately you end up pretty much counting the rows. That becomes the measure at a given intersection. Quite often when I do basic star schema examples, I'll fall back on a sales example. I was in online retail for a long time.</p><p>And so a typical sales fact table might have sales amount, it might have cost, it might have margin, it might have order quantity. These are all things that you're gonna be able to add up and average, things like that. A factless fact table doesn't have any of those things. 
It basically just stores the intersection of the various dimensions, and you ultimately end up counting rows on it to get the facts.</p><p><strong>Shane</strong>: With a factless fact table, would it just hold the keys or would you have a column with one in every row? </p><p><strong>Johnny</strong>: I would just have the keys again. So I've seen that pattern as well, when people just put a one in. So from a Power BI perspective, that's considered an anti-pattern, that's just bloat that you don't need.</p><p>It's just a count of the rows. You don't need the one 'cause you just do a count. Count star or </p><p><strong>Shane</strong>: count one. So let's go into that for a second. So everybody tells me, not everybody, but most people tell me that Power BI is far more efficient when you use the star schema versus one big table or anything else, but they never gimme the context.</p><p>And I'm like, is that when you're not using DirectQuery and you're actually bringing the data back into the Power BI layer? Or is that when you are using DAX or when you're creating a semantic model? Where is the thing that says Power BI works best with a star schema? </p><p><strong>Johnny</strong>: So I feel like I'm gonna have to shout out a couple of Microsoft, or former, MVPs now, I think.</p><p>So there's a chap, Koen Verbeeck, Dutch guy, and his catchphrase is that you must star schema all the things. And he's like a massive advocate of it. And he made stickers and t-shirts and all these things. Star schema all the things. And then there's a chap, Benni, I'm gonna butcher his name, which he'll kill me for if he listens back to this.</p><p>So he is Belgian, Benni De Jagere, he is on the Power BI CAT team, and he did a conference session that was taking the mick out of Koen a bit. Basically the session was called Star schema all the things, but why? And it was really fascinating 'cause it actually dug into it and did a load of tests. 
So some people were like, oh, you need to use a star schema 'cause you'll get better compression out of the, so Power BI is built on the VertiPaq engine, that's basically the compression engine for it.</p><p>Oh, you get better compression so you'll have a lower memory footprint. And he did a load of tests versus one big table where he basically proved, yeah, that's not really the case. And it was like, oh, star schema will load quicker 'cause you've got less data redundancy, and he did a load of tests and that wasn't really the case.</p><p>The main thing is that DAX as a language is structured to work with it well. It always strikes me as chicken and egg, in terms of, is DAX built to work with star schema? What was the way around I was thinking about it the other day? That effectively, did they make Power BI to be optimized for star schema, as opposed to people saying that star schema is what's optimized for Power BI.</p><p>The other soundbite I came up with the other day is that dimensional modeling's probably the second best pattern for everything. Speed-wise, one big table: a former colleague did his thesis on it, star schema versus one big table, and was like, oh yeah, one big table is loads better. I was like, why is it better?</p><p>It's quicker. Okay. What about maintenance and reusability? And when you get into the realms of that, actually, yeah, one big table's good and quick and easy to query, but I've got to have a different one big table for lots and lots of different things. And then if I need to update a particular attribute, I'm gonna have to update it in lots of different places.</p><p>And then again, not being an expert on it, but my impression of data vault is that flexibility-wise, it's really good. It deals with change really easily. People say it's complex to implement; I've never done one. And that potentially querying it because there's a lot of joining. 
Your tables can be quite complex and difficult to navigate from an analyst perspective.</p><p><strong>Shane</strong>: Yeah, so let's talk about that one, 'cause that's a really interesting one. And it's true. You typically use data vault in a layered data architecture, what Joe Reis calls Mixed Model Arts. So we would typically never expose the data vault structures as the core reporting layer. We would dimensional it, we would one big table it, we would activity schema it.</p><p>We'd do a whole bunch of things to make it easy, because joining lots of tables together as an analyst is an anti-pattern, in my view. You're getting 'em to do work that the machine can do for you. And then everybody goes, oh yeah, but everybody can understand how to query a star schema. And I'm like, yeah, they can if you train them.</p><p>Yeah, and they can still get it wrong. If I give them one big table, if I give 'em the table with a grain and all the columns in it, as long as I don't give them 2000 columns, it's much easier for them to query. Now, yes, what happens is if my only pattern is one big table, if I write 5,000 dbt models that do all the transformations in code with no segmentation, no layering, no shared reuse, no shared context, and I'm creating nothing but 10,000 one big tables, that's a bad pattern.</p><p>But if my code is effectively my model, the context holds the model, and I'm just hydrating the one big tables at the end, and every time I do a change, those tables are automatically refreshed with those changes, not by a human, then that's a pattern that works. If I'm data vaulting and then dimensioning, and then one big tabling,</p><p>it's a pattern. I can automate that and I can hold context and I can get the machine to do all the changes for me when I change that context. So that's my view. And I was always intrigued: with Power BI, dimensional, that's the norm. 
And that's fine, because with the norm there's lots of good articles, there's lots of things that have been written.</p><p>There's lots of people that can help you if you get stuck. And if you don't follow the norm, it's a little bit trickier. But why is it the norm? And the same with dbt. For people that now consciously model rather than unconsciously model, dimensional seems to be the flavor they use the most. Now why is that?</p><p>Is it because the information is easily accessible? Is it because that's the thing that people are being trained on the most? In our day, dimensional became so popular because Ralph Kimball did a lot of training. You could always go on a dimensional training course. It was easy to get hold of somebody that would teach you that.</p><p>Not so much with the other courses. But he doesn't do the training anymore. Yeah, he's been retired a long time. Yeah. And then Margy Ross took over, but she's retired now. So actually, who's doing the training? 'Cause it's not the people who invented the patterns. But maybe that's it. Maybe the training is still more accessible, or the books are accessible.</p><p>It's intriguing. How is that still the modeling pattern? And it has value. It's a really valuable pattern. It's not my favorite pattern, I'll be honest about that, but that's just my opinion. Like you said, you don't use a lot of junk dimensions or degenerate dimensions. That's just your choice. That's how you model.</p><p>Yeah, and that's fine. You're making conscious decisions around that. Which reminds me, there's one I missed: late arriving facts. Sorry, that was the other one that we probably need to talk about. 
So </p><p><strong>Johnny</strong>: The way I always interpreted them, and I guess this is a pattern that we've not really discussed, is this idea that you'd always load your dimensions first and then load your facts afterwards.</p><p>And part of that is to guarantee you've got that sort of relational integrity. And again, this goes back to that Indem key argument and the fact that I prefer to look up my keys after the event. So if you're gonna look up your keys when you create your fact tables, that means you've always got to load your dimensions first.</p><p>But what if, in between you loading your dimensions and you loading your facts, a new dimension member appears? So we sell a brand new product. We load our product dimension and that's got every product that exists. And then, whilst that is happening, a new product goes live and gets sold straight away. So then basically, when you load the facts, it's arrived late, because it's arrived after the dimensions have been processed and it doesn't have a matching record back in its dimension table to be able to join to.</p><p>You deal with that with an unknown member: basically a default key that gets assigned where the dimensional record doesn't exist. And then I'd always get into the realms of rolling windows for my updates, 'cause then I'd go back and revisit my fact table and update the key. Which, for me, is contrary to this idea of a transactional fact table that should be write-only, 'cause that's not true: you'd still go back for a late arriving fact and update its dimensional key.</p><p><strong>Shane</strong>: So effectively, a fact turns up and there's no dim for the fact. We'd normally do a placeholder dim, don't we? So there's a dimensional key: 99999, or zero, or minus one.</p><p>Pretty much minus one. That's right. 
There used to be a big argument, wasn't there, about what you use as the surrogate key for your late arriving fact dim. We used to argue about that all the time. So effectively the fact turns up, the dimension isn't there for whatever reason, and you bind it to this dummy dimension key. And then you go and update the fact later, when that dim actually arrives. And that way you get consistency, or referential integrity, across the dims and the facts. </p><p><strong>Johnny</strong>: It's strange to an extent, 'cause my love of dimensional modeling definitely predates even the invention of Power BI.</p><p>But having turned into a bit of a Power BI nut, I think that's helped me go deeper and further in understanding all this stuff. And that goes back, for me, to this: people would argue that the idea of the Indem key was that, if you do that in your facts, well, you don't have to go back and update it afterwards.</p><p>If it's a late arriving fact, it doesn't matter, because the key's already been predetermined, so you don't have to worry about it. But if you imagine you were doing nightly batch loads, that's a whole day where you've got a relational integrity issue. And from a Power BI perspective, that can have quite a big impact on your DAX, when you've not got proper relational integrity.</p><p>So that's one of the reasons I always prefer to do the lookup, 'cause then with the lookup you get to fall back on your unknown member if you need to. </p><p><strong>Shane</strong>: That's the key, isn't it: these weird patterns are there for a reason. 
Because when people have been using this in anger for 20, 30 years, they have found these edge cases that they needed to have a pattern to deal with, because they would turn up every now and again, and late arriving facts is one of those.</p><p><strong>Johnny</strong>: This is one of the debates we ended up getting into with the engineering teams I've worked with. It goes back to something, again, Joe Reis talks about, which is trade-offs and understanding what it is you're trying to optimize for. Because your pipelines would be more efficient and run quicker if you don't have to do that dimensional key lookup.</p><p>So your pipelines run quicker. If they run quicker, they're gonna be cheaper and your data's gonna be more available. Okay. But if my data's available with relational integrity issues, then it's not accurate data. And I'd rather do it slower and a bit more expensive but have it accurate than have the most cost-efficient pipeline because you don't think looking up my dimensions is the efficient thing to do. </p><p><strong>Shane</strong>: But also, again, a lot of the patterns were around technical constraints. 'Cause I'm sure I remember, really in the early days of Oracle, when we had foreign keys on the tables between the facts and the dims, the updates were really slow,</p><p>'cause we had on-prem servers and we were constrained around memory and disk and all that kind of stuff. And I'm pretty sure we used to drop the referential integrity, drop the foreign keys, and then load all the data, making sure that we kept referential integrity by the code, in theory, and then we'd reapply the foreign keys at the end of that process and hope like hell they bloody rebuilt, because that just helped us get the load times down from 12 hours to two hours. Now, you would never do that now, that I know of. I mean, half the cloud analytical databases don't have foreign keys for that very reason. 
But I'd be surprised if anybody's doing that pattern now: altering the tables, removing the referential integrity, doing a load and then putting it back on. </p><p><strong>Johnny</strong>: Yeah, so what's interesting, again, is another debate, but even in my on-prem days with SQL Server in a data warehousing context, we never actually applied foreign key constraints. Never.</p><p>We always just loaded it. We dealt with them with logic. So whenever we insert this record, we're gonna check it's got a key that matches, and if it doesn't, we're gonna put the unknown member in there. So battling against the constraint checks wasn't a problem. I guess when I talk referential integrity, that's probably me using my Power BI-conditioned brain, 'cause I'm not talking about the actual database constraint, I'm just talking about the effect that the database constraint would have, if that makes sense. </p><p><strong>Shane</strong>: Yeah. Referential integrity is actually a pattern that says everything has integrity. But in my head I just naturally go back to how databases apply it, and therefore whenever I use the term referential integrity, I'm falling back to that pattern of it being a technical implementation, not a logical one.</p><p>And I think what's interesting is somebody said to me the other day that I often bring up ghosts of data past. And it's intriguing, because now I start to think about patterns and I go, where did that pattern come from? Was it a ghost of data past for a technical constraint? Was it a process constraint? Was it a people constraint?</p><p>Was it an edge case that the patterns don't deal with and therefore it's still valid? Where did it come from? It's intriguing. And I wonder how much of the star schema stuff comes out of tools like Power BI. Like you said, chicken and egg. I think it's just important </p><p><strong>Johnny</strong>: to question it and have that curiosity and go with it from there.</p><p>What was it? 
What was the other thing somebody was talking to me about the other day? I had a really good deep data conversation with a former colleague, and we had a right good natter, putting the world to rights. So his problem is that he sees people apply patterns with no context. And because they've seen a pattern applied before, they assume that's the right pattern every single time.</p><p>And the nuance is in actually understanding what patterns are available and which ones to use when, and sometimes when to break the rules. Sometimes when to knowingly implement an anti-pattern, almost on purpose, 'cause actually it serves a particular edge case, it makes sense in the right context, and you can justify it.</p><p><strong>Shane</strong>: Yeah. Which makes it a pattern then, which is really weird, because actually, yeah, it's a solution to a problem that's not commonly applied but, with the context, actually works. All right. And on that one, just to close it out, if people wanna get hold of you, what's the best way for them to find you, read what you're doing, listen to what you do?</p><p><strong>Johnny</strong>: The worst thing you can possibly do is Google me, 'cause if you Google me, the first hit's gonna be an albino blues guitarist who's now dead and once performed at Woodstock. So Googling Johnny Winter doesn't work. I can't believe I've gotten this far in the podcast and not mentioned that I've got this kind of data persona called Greyskull Analytics.</p><p>But yeah, it's a He-Man reference: Castle Grayskull. It came out of the idea that He-Man's slogan was "I have the power", and I had the Power BI. So that's where Greyskull Analytics came from. And then I've turned it into sort of a massive Skeletor reference as well. So yeah, if people look me up on LinkedIn, you'll see quite a few Skeletor-themed memes being shared there, with various sort of data contexts in them as well.</p><p>Greyskullanalytics.com is my website. 
The version that you can currently see is the original version, 'cause I've reverted it back, which was really just a blog. I've got a YouTube channel as well, so you can get me on YouTube. The greyskullanalytics.com website's in the process of being revamped, 'cause I'm looking to launch my business in September. But yeah, LinkedIn or greyskullanalytics.com are the best places. Greyskull with an E as opposed to an A, 'cause I anglicized it, partly 'cause I'm English and partly 'cause I didn't want to get sued by Mattel. </p><p><strong>Shane</strong>: Maybe bring the Greyskull Analytics story in right at the beginning next time. It's how I know of you, and I've gotta do a shout-out for what I reckon must be one of the best LLM engineering prompts in the world, that you can constantly generate Greyskull images that actually look like you've got a Greyskull character sitting in your office somewhere and you're just moving them around.</p><p>The quality of those generations is pretty damn awesome. </p><p><strong>Johnny</strong>: It does well. You're not the first person to point it out either. I tried to do one of Skeletor playing cricket the other day and it couldn't get its head around that. But yeah, most of the time it manages to make a decent fist of it.</p><p><strong>Shane</strong>: Alright, it's been great. Thank you for going through all those dimensional patterns. I've ticked off another set of the modeling patterns for the podcast, and I can't believe it's taken me so long to get round to this one. It probably should have been the first one. 
But anyway, thank you for that and I hope everybody has a simply magical day.</p><h2>&#171;oo&#187;</h2><div class="pullquote"><p><em>Stakeholder - &#8220;That&#8217;s not what I wanted!&#8221; <br>Data Team - &#8220;But that&#8217;s what you asked for!&#8221;</em></p></div><p>Struggling to gather data requirements and constantly hearing the conversation above?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Bu2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg" width="387" height="342" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:342,&quot;width&quot;:387,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19725,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/160520537?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0Bu2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0Bu2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea54a17-bf89-4dc3-a46b-d039a4585eee_387x342.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Want to learn how to capture data and information requirements in a repeatable way so stakeholders love them and data teams can build from them, by using the Information Product Canvas.</p><p>Have I got the book for you!</p><p>Start your journey to a new Agile Data Way of Working.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://adiwow.com/168&quot;,&quot;text&quot;:&quot;Buy the Agile Data Guide now!&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://adiwow.com/168"><span>Buy the Agile Data Guide now!</span></a></p><h2>&#171;oo&#187;</h2>]]></content:encoded></item><item><title><![CDATA[The Joe Reis Show - The Information Product Canvas: A Shared Language for 
Data]]></title><description><![CDATA[I was lucky enough to be able to chat to Joe Reis about my recently published book, "The Information Product Canvas," and my diving into writing a second book.]]></description><link>https://agiledata.info/p/the-joe-reis-show-the-information</link><guid isPermaLink="false">https://agiledata.info/p/the-joe-reis-show-the-information</guid><dc:creator><![CDATA[Shagility]]></dc:creator><pubDate>Fri, 08 Aug 2025 22:35:25 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/rDaknYO2UH8" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>The Joe Reis Show - The Information Product Canvas: A Shared Language for Data</h1><p>I was lucky enough to be able to chat to Joe Reis about my recently published book, "The Information Product Canvas," and my diving into writing a second book.</p><h2>Watch</h2><div id="youtube2-rDaknYO2UH8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;rDaknYO2UH8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/rDaknYO2UH8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2><br>Listen</h2><p><a href="https://open.spotify.com/episode/7fJrhBZG3rXgwFg3gg7iee?si=Z4nJfCAkSLiX8EeyGFkVcQ">https://open.spotify.com/episode/7fJrhBZG3rXgwFg3gg7iee?si=Z4nJfCAkSLiX8EeyGFkVcQ</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://open.spotify.com/episode/7fJrhBZG3rXgwFg3gg7iee?si=Z4nJfCAkSLiX8EeyGFkVcQ&quot;,&quot;text&quot;:&quot;Listen on Spotify&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" 
href="https://open.spotify.com/episode/7fJrhBZG3rXgwFg3gg7iee?si=Z4nJfCAkSLiX8EeyGFkVcQ"><span>Listen on Spotify</span></a></p><p></p><h2><strong>Google NotebookLM Mind Map<br></strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IxTI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IxTI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png 424w, https://substackcdn.com/image/fetch/$s_!IxTI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png 848w, https://substackcdn.com/image/fetch/$s_!IxTI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png 1272w, https://substackcdn.com/image/fetch/$s_!IxTI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IxTI!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png" width="1200" height="3770.6043956043954" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:4575,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:6081106,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://agiledata.substack.com/i/170490859?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IxTI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png 424w, https://substackcdn.com/image/fetch/$s_!IxTI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png 848w, https://substackcdn.com/image/fetch/$s_!IxTI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png 1272w, https://substackcdn.com/image/fetch/$s_!IxTI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd313d59a-e3e0-4de2-b5b9-96a0cfaa88d1_6345x19938.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2><strong>Google NotebookLM Brief</strong></h2><h3>Detailed Briefing: Shane Gibson - The Power of the Information Product Canvas</h3><p><strong>Date:</strong> October 26, 2023</p><p><strong>Source:</strong> Excerpts from "Shane Gibson - The Power of the Information Product Canvas" (Podcast Transcript)</p><p><strong>Key Speakers:</strong></p><ul><li><p><strong>Shane Gibson (SG):</strong> Author of "The Information Product Canvas".</p></li><li><p><strong>Joe (J):</strong> Interviewer, author, and podcast host.</p></li></ul><h4><strong>1. 
The Gruelling Journey of Authorship</strong></h4><p>Shane Gibson kicks off by reflecting on the arduous process of writing a book, particularly his first, "The Information Product Canvas," and the even more challenging experience of tackling the second.</p><ul><li><p><strong>Writing as a "Nightmare":</strong> SG describes writing as an "awful" and "not enjoyable" process, despite being a prolific writer who doesn't consider himself a "natural writer." This contrasts with Joe's enjoyment of writing as a means to "think more clearly about a topic."</p></li><li><p><strong>Forced Effort vs. Natural Enjoyment:</strong> SG admits, "I had to force myself to spend time doing the writing process." His "buzz" comes from simplifying complex ideas into comprehensible words: "how do you take those things I could talk about for ages and just waffle on and get them into... a thousand words that make sense to somebody else."</p></li><li><p><strong>The "Vanity Project" &amp; Secondary Benefits:</strong> His primary motivation was to "scratch" an "itch" he'd always had: "I just want to write a freaking book and I've tried a few times and I've failed and I wasn&#8217;t  going to fail this time." A secondary, and deeply satisfying, benefit has been seeing strangers successfully use his "pattern" without his direct involvement, solving "a small problem in their data work."</p></li><li><p><strong>Beyond Just Publishing:</strong> The real challenge wasn't just writing the book, but making it a book he "didn't hate," particularly in an era where AI can "crank out a book in an hour." True value, SG suggests, comes from distilling complexity into simplicity, a process that "takes all the time."</p></li><li><p><strong>The Importance of "Voice":</strong> SG learned from an editor that his early drafts lacked his unique "voice," making the content hard to understand for a general audience. 
This underscored the need to tailor the book for those "outside our domain," making it "easy to read."</p></li><li><p><strong>Self-Publishing Woes:</strong> SG highlights the significant effort involved in self-publishing beyond writing, including navigating Amazon Kindle Direct Publishing, which he calls a "bloody nightmare." He notes the platform&#8217;s inconsistent print quality, the need for continuous uploads to check layout changes, and the unpredictable review process upon final submission. He also jokes about his book mistakenly appearing in the "online dating" category on Amazon UK due to the subtitle "stakeholders love it."</p></li></ul><h4><strong>2. The Core Problem: A Lack of Shared Language in Data Teams</strong></h4><p>SG identifies a decades-old, pervasive issue in data teams: a fundamental communication breakdown between data professionals and stakeholders.</p><ul><li><p><strong>The Misunderstanding Loop:</strong> "data teams talk to stakeholders ask them what they want stakeholders tell them the data team think they understand they go away and build something they give it back to the stakeholder and the stakeholder says that's not what I wanted." This often leads to frustrating exchanges like "but that's what you ask for."</p></li><li><p><strong>Learned Behaviours &amp; Disparate Languages:</strong></p><ul><li><p><strong>Stakeholders:</strong> Have been "trained by us as data people to talk in terms of reports and lists and data requests," e.g., "I want a dashboard and I want to have this table and I want these three fields on it."</p></li><li><p><strong>Data Teams:</strong> Converse in terms of "source systems," data formats, and "conform[ing] the dimension" &#8211; technical details irrelevant to business outcomes.</p></li></ul></li><li><p><strong>The Missing Link:</strong> "there was no shared language in the middle." This void led to the development of the Information Product Canvas.</p></li></ul><h4><strong>3. 
The Information Product Canvas: A Shared Language and Solution</strong></h4><p>The Information Product Canvas, inspired by the Business Model Canvas, provides a structured, collaborative solution to the communication gap.</p><ul><li><p><strong>Inspiration from the Business Model Canvas:</strong> SG and his team "lucked across this idea of the business model canvas," a tool for summarising business strategy "on a page with a small number of boxes." Its success lay in "this combination of a template that was easy to understand and a process to fill it out." Crucially, it was open-sourced, leading to other iterations like the Lean Startup Canvas.</p></li><li><p><strong>The Canvas's Impact:</strong> Using an A3 paper with "12 boxes" and "a specific language of how we filled it out," stakeholders began providing "what we needed." The key shift was focusing "more on what's the action and the outcome that they're trying to achieve not what's the data they wanted."</p></li><li><p><strong>What is an "Information Product"?</strong> SG&#8217;s term, which predates the popularity of  "data product," defines a bounded entity that encompasses:</p><ul><li><p><strong>Multiple Delivery Types:</strong> "a dashboard a report a data service an AI chatbot an MCP service."</p></li><li><p><strong>Necessary Data:</strong> "should contain the data that is needed to to power that product."</p></li><li><p><strong>Targeted Personas:</strong> "focused on a specific set of personas."</p></li><li><p><strong>A Clear Boundary:</strong> Like different breakfast cereals, they are distinct products with different flavours (delivery, data, personas) but serve the same fundamental purpose (breakfast/solving a business problem).</p></li></ul></li><li><p><strong>Iterative &amp; Product Thinking:</strong> The canvas facilitates an "information value stream," integrating "product thinking" into data work. 
This process moves from:</p><ul><li><p><strong>Ideation &amp; Discovery:</strong> Identifying business problems that "potentially we can solve... with data" and brainstorming "three to five potential ways we could use data to solve that."</p></li><li><p><strong>Prioritisation:</strong> For larger organisations, discovering "20 things we could do" and then prioritising "the next most valuable product to build."</p></li><li><p><strong>Data Factory/Information Factory:</strong> Moving into design, build, deploy, maintain, and enhance phases.</p></li></ul></li><li><p><strong>Benefits of the Canvas:</strong></p><ul><li><p><strong>Faster Feedback:</strong> Increased speed from problem statement to product delivery leads to quicker feedback loops and adjustments.</p></li><li><p><strong>Deliver More Value:</strong> Enables chunking down large problems into "smaller things," delivering value incrementally (e.g., a week for a revenue model part, rather than six months for a lifetime value model).</p></li><li><p><strong>Versatile Application:</strong> Primarily used in the "discovery phase," but also valuable for conceptual data modeling and understanding the "language of the business."</p></li></ul></li><li><p><strong>Greenfield vs. Brownfield Projects:</strong> The canvas is adaptable for both new projects (greenfield) and existing systems (brownfield).</p><ul><li><p><strong>Greenfield:</strong> Can quickly "lightly discover all the things we could potentially rebuild" (e.g., replacing 1,000 legacy reports). Teams can identify "150 potential products," then group and prioritise "the most valuable ones to build first" before filling out the canvas in more detail.</p></li><li><p><strong>Brownfield:</strong> Useful if current requirements gathering is slow, misunderstood, or poorly attended. It aims to fix the problem of delivering "not what they wanted."</p></li></ul></li></ul><h4><strong>4. 
Challenging Traditional Data Modelling &amp; Embracing Agility</strong></h4><p>SG critiques traditional data modelling practices, advocating for more agile and collaborative approaches.</p><ul><li><p><strong>Workshop Exhaustion:</strong> SG recalls "multi-day" data modeling workshops that were "waste" for stakeholders, highlighting the need to "not steal those three days of time from those stakeholders" and instead "chunk it down into small bits that are valuable."</p></li><li><p><strong>Isolated "Heroes":</strong> He criticises the "data modeling heroes" who "model a really good data model... in isolation," resulting in "no buy in" and "no feedback loop." This reinforces the problem of delivering "what they asked for But it's not what they wanted."</p></li><li><p><strong>Incremental &amp; Iterative:</strong> The product thinking approach promotes "incremental and iterative versus big design up front," a traditional pitfall of "enterprise data model[s]." This is crucial due to "constraints on time constraints on budgets."</p></li><li><p><strong>Data Product vs. Information Product:</strong> While acknowledging market preference for "data product," SG stands by "information product" due to its historical use within his work. He clearly differentiates between a "data asset" (purely a table) and an "information product" (data, logic, delivery, targeted users).</p></li><li><p><strong>Critique of DIKW Hierarchy:</strong> SG questions the Data-Information-Knowledge-Wisdom hierarchy, seeing it as a "convenient abstraction" but potentially "wrong" because "it assumes that that knowledge and wisdom and information are sort of a linear progression from data," ignoring feedback in the other direction. 
He struggles with its practical application: "where I struggle with data information knowledge and wisdom is like okay I get it but how do I use it?"</p></li><li><p><strong>The Blurring Lines of Data &amp; Knowledge:</strong> Joe notes the increasing vagueness of boundaries in data, especially with AI, integrating "library sciences" (ontologies, taxonomies) into the data world, which were historically "separate gangs on the street." SG is actively seeking to learn from library sciences to better understand concepts like "context" for AI.</p></li><li><p><strong>Accessibility of Knowledge:</strong> Both SG and Joe lament the inaccessibility of knowledge in some domains (e.g., data modelling, library sciences). SG points out that the success of books like Joe's "Fundamentals of Data Engineering" and his own "Information Product Canvas" lies in their ability to "take that complexity and make it simple" for newcomers.</p></li><li><p><strong>Kimball's Enduring Legacy:</strong> SG discusses Ralph Kimball's continued success in data warehousing due to the "readily available" patterns, "easy to read" books, and freely shared ideas. Kimball "made his money out of being the expert... to learn the idea in a better way."</p></li><li><p><strong>"Mixed Modelling Arts":</strong> SG suggests that just as data modelling benefits from combining techniques, publishing could adopt a "mixed analogy," leveraging various approaches.</p></li></ul><h4><strong>5. Publishing Strategies &amp; The Future of Content Creation</strong></h4><p>The conversation delves into the evolving landscape of publishing, marketing, and continuous learning.</p><ul><li><p><strong>Beyond Publishers: The Author's Burden:</strong> SG notes the "bullshit" expectation that publishers will handle marketing and sales. Instead, they prioritise an author's "following" and market access. 
Authors "do all the work they take all your money."</p></li><li><p><strong>Marketing &amp; Word of Mouth:</strong> Marketing typically drives initial sales, but "marketing only lasts for about probably the first month of a book then it's word of mouth." Positive feedback from users, such as the reader who reported that the canvas "just worked," is the true measure of success.</p></li><li><p><strong>Continuous Improvement through Feedback:</strong></p><ul><li><p><strong>Courses as Feedback Loops:</strong> SG plans to run "bad version[s]" of his course for the next book, making them "almost free" to "learn from people what messaging is not getting through." This provides "feedback to make a better book."</p></li><li><p><strong>Public Accountability:</strong> SG uses public deadlines (e.g., LinkedIn updates on writing progress) as a "forcing function" to maintain discipline, especially since he doesn't naturally enjoy writing.</p></li></ul></li><li><p><strong>The Evolving Definition of "Data Modelling":</strong> SG and Joe discuss whether merely adding metadata or context to data constitutes data modelling. Joe argues "hell yeah," expanding the definition beyond just changing data structure. This aligns with the idea that an "information product" doesn't have to be complex; a "paperclip" is still a product, just like a simple data output can be an information product.</p></li><li><p><strong>The Iterative Nature of Writing:</strong> SG likens writing a book to an "iterative agile process," where the table of contents isn't fixed, and unexpected "new chapter[s] turn up that just makes sense."</p></li><li><p><strong>AI's Impact on Content:</strong> SG muses that in the AI world, "we're going to start paywalling the ideas more because people won't pay for the content." 
He notes the current inability of LLMs to scrape Amazon's website, highlighting content providers' efforts to control their material.</p></li></ul><p>This briefing summarises the key takeaways from Shane Gibson's reflections on his publishing journey and the innovative approach of the Information Product Canvas, all delivered with that classic Kiwi directness. Chur!</p>]]></content:encoded></item></channel></rss>