{"id":911,"date":"2021-04-14T05:02:56","date_gmt":"2021-04-14T05:02:56","guid":{"rendered":"http:\/\/puneetkhosla.com\/?p=911"},"modified":"2021-04-14T05:12:34","modified_gmt":"2021-04-14T05:12:34","slug":"einstein-prediction-builder-improving-predictions-by-improving-data-quality","status":"publish","type":"post","link":"https:\/\/puneetkhosla.com\/?p=911","title":{"rendered":"Einstein Prediction Builder : Improving Predictions by improving data quality"},"content":{"rendered":"\n<p>In my last blog on Einstein Prediction Builder, we improved our prediction score by choosing the right fields for the prediction. Link to last blog \u2013 <a href=\"https:\/\/puneetkhosla.com\/?p=890\" target=\"_blank\" rel=\"noreferrer noopener\">Einstein Prediction Builder : Improving Predictions through choice of fields<\/a><\/p>\n\n\n\n<p>In this blog, we will understand why data quality is important for improving prediction results.<\/p>\n\n\n\n<p>Imagine an astrologer predicting your future without knowing your date of birth. Imagine a store forecasting sales for the months to come while counting both the store&#8217;s copy and the customer&#8217;s copy of each receipt (i.e. double counting the sale proceeds).<\/p>\n\n\n\n<p>We might encounter similar situations in our data. We are now going to walk through steps to improve data quality and thus improve our prediction.<\/p>\n\n\n\n<p>Just a quick recap, we achieved a score of 47 in the previous blog.<\/p>\n\n\n\n<p><strong><span class=\"has-inline-color has-bright-blue-color\">STEP 1 : Remove the Duplicates<\/span><\/strong><\/p>\n\n\n\n<p>When we prepare the example set, it is important to cleanse the data to remove duplicates. Salesforce Duplicate Rules or products from AppExchange can help in cleaning the data and removing the duplicates. After removing the duplicates, it is important to re-evaluate the score. 
In our case, the score increased to 53.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-style-default img-border\"><img fetchpriority=\"high\" decoding=\"async\" width=\"348\" height=\"354\" src=\"http:\/\/puneetkhosla.com\/wp-content\/uploads\/2021\/04\/EPB44.png\" alt=\"\" class=\"wp-image-912\" srcset=\"https:\/\/puneetkhosla.com\/wp-content\/uploads\/2021\/04\/EPB44.png 348w, https:\/\/puneetkhosla.com\/wp-content\/uploads\/2021\/04\/EPB44-295x300.png 295w\" sizes=\"(max-width: 348px) 100vw, 348px\" \/><\/figure>\n\n\n\n<p><strong><span class=\"has-inline-color has-bright-blue-color\">STEP 2 : Validate Data in the example set<\/span><\/strong><\/p>\n\n\n\n<p>It is important to ensure that our example set has valid data. Give the data meaning by ensuring the fields are populated with proper, meaningful values.<\/p>\n\n\n\n<p>Avoid values like &#8220;etc.&#8221;, &#8220;any other reason&#8221; or &#8220;none of the above&#8221;. Such values can result in low scores, especially if they appear on a large number of records.<\/p>\n\n\n\n<p>Check if the data is correct and fix it if there is an issue. 
This, however, requires involvement from business users to help fix the data and convert it into meaningful information.<\/p>\n\n\n\n<p>After completing the above, the score of my prediction became <strong><span class=\"has-inline-color has-bright-blue-color\">85<\/span><\/strong>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-style-default img-border\"><img decoding=\"async\" width=\"684\" height=\"374\" src=\"http:\/\/puneetkhosla.com\/wp-content\/uploads\/2021\/04\/EPB45.png\" alt=\"\" class=\"wp-image-915\" srcset=\"https:\/\/puneetkhosla.com\/wp-content\/uploads\/2021\/04\/EPB45.png 684w, https:\/\/puneetkhosla.com\/wp-content\/uploads\/2021\/04\/EPB45-300x164.png 300w\" sizes=\"(max-width: 684px) 100vw, 684px\" \/><\/figure>\n\n\n\n<p>So with some basic checks, we moved our score from <strong><span class=\"has-inline-color has-bright-blue-color\">Good <\/span><\/strong>to <strong><span class=\"has-inline-color has-bright-blue-color\">Great<\/span><\/strong>.<\/p>\n\n\n\n<p>It is important to note that it was not just improving the data quality that moved the score; it was a series of steps that helped in reaching this mark. To reiterate, the steps were:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Removing Hindsight Bias.<\/li><li>Removing fields which are mostly blank (and will not have much data in future).<\/li><li>Removing fields which have no impact.<\/li><li>Removing the Duplicate data.<\/li><li>Validating the data.<\/li><\/ul>\n\n\n\n<p>In my next blog, we will talk about the Einstein Prediction Builder Component (provided by Salesforce).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In my last blog on Einstein Prediction Builder, we were able to improve our prediction score by choosing the right fields for doing the prediction. Link to last blog \u2013 Einstein Prediction Builder : Improving Predictions through choice of fields In this blog, we will understand why data quality is important for improving prediction results. 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[7],"tags":[14,11,13,18,16],"class_list":["post-911","post","type-post","status-publish","format-standard","hentry","category-einstein","tag-ai","tag-einstein","tag-einstein-discovery","tag-einstein-prediction-builder","tag-salesforce"],"_links":{"self":[{"href":"https:\/\/puneetkhosla.com\/index.php?rest_route=\/wp\/v2\/posts\/911","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/puneetkhosla.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/puneetkhosla.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/puneetkhosla.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/puneetkhosla.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=911"}],"version-history":[{"count":8,"href":"https:\/\/puneetkhosla.com\/index.php?rest_route=\/wp\/v2\/posts\/911\/revisions"}],"predecessor-version":[{"id":922,"href":"https:\/\/puneetkhosla.com\/index.php?rest_route=\/wp\/v2\/posts\/911\/revisions\/922"}],"wp:attachment":[{"href":"https:\/\/puneetkhosla.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmed
ia&parent=911"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/puneetkhosla.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=911"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/puneetkhosla.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=911"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}