Create a Dataset (Salesforce Object) using Dataset Builder and Dataflow

In my previous blog, I talked about creating a dataset from a CSV file. Now we will look at another approach: creating a dataset using the dataset builder and a dataflow.

A big question – what is a dataflow?

A dataflow, in simple terms, is a way to start with a particular set of data and then change it, build on it, combine it with other data, derive new columns based on values in other columns, etc. to come up with your final data. Put simply, it is a set of steps or instructions performed on data to prepare it for further analytics and insights.
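To make the idea concrete, here is a tiny sketch in plain Python. It is not how Einstein Analytics runs a dataflow internally; it only mimics the pattern of starting with data, deriving a column, and combining it with other data. The account rows, owner names, and the "Size" rule are all made up for illustration.

```python
# Toy illustration of the idea of a dataflow: start with data, derive a
# column, then combine with other data. All names and values are hypothetical.

accounts = [
    {"Name": "Acme", "OwnerId": "U1", "AnnualRevenue": 1200000},
    {"Name": "Globex", "OwnerId": "U2", "AnnualRevenue": 300000},
]
owners = {"U1": "Ada Lovelace", "U2": "Alan Turing"}


def derive_size(row):
    """Step 1: derive a new column from a value in an existing column."""
    size = "Large" if row["AnnualRevenue"] >= 1000000 else "Small"
    return {**row, "Size": size}


def add_owner_name(row, owner_lookup):
    """Step 2: combine with other data using a shared key (OwnerId)."""
    return {**row, "OwnerName": owner_lookup.get(row["OwnerId"])}


def run_dataflow(rows, owner_lookup):
    """Run the steps in order and return the final rows."""
    return [add_owner_name(derive_size(row), owner_lookup) for row in rows]


final_rows = run_dataflow(accounts, owners)
for row in final_rows:
    print(row["Name"], row["Size"], row["OwnerName"])
```

Each function is one "step", and the final rows are what would be saved as the dataset.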

Here we will create a dataset of Accounts along with their owner information (Full Name and Photo).

STEP 1 : Click on Create Button & Select Dataset

There are multiple ways to start creating a dataset. After entering Analytics Studio, you can either click the “Create” button (in the top right corner) or go into your Einstein Analytics app and click the button there. Then select Dataset.

Create a dataset

STEP 2 : Choose Data Source

Choose Salesforce Data as the data source.

Data Source

STEP 3 : Add New Dataset Details

Enter the new dataset name and choose whether to create a new dataflow or add the dataset to an existing dataflow. Let’s name the dataset Accounts and choose a new dataflow named Account with Owner Details.

New Dataset

STEP 4 : Select the objects and Fields

Select the root (or base) object from which we will start building the dataflow. Ideally, this should be the main object from which you want to create further associations.

Choose Base Object

We will choose Account as the Base object.

Base Object

Click the plus symbol to add the Account fields that you want included in your dataset.

Account Fields

We want Accounts with owner details, so click the Relationship tab to bring in the related Owner fields.

Relationship Tab

Click the plus symbol on the User (Owner ID) node and select the fields. If the list is long, you can search for a field; here we will search for the Full Name field. When finished, click the Next button.

Choose Fields

STEP 5 : Select the App

Select the App where you want to save your dataset.

Select App

Einstein now does the magic of creating the dataflow and running it in the background.

Dataset Creation

You can now see the dataflow that has been created. If you prefer not to go through the above steps, you can also create the dataflow directly. I will cover dataflows in upcoming blogs.

However, to give an idea, the dataflow created the following:

  • 2 sfdcDigest nodes – These nodes indicate that data is coming from Salesforce objects (here, Account and User).
  • 1 augment node – This node joins the Account and User objects, i.e. based on the User Id in the Owner field, it adds the owner’s user information to the Account data.
  • 1 sfdcRegister node – This node creates (registers) the dataset.

Dataflow
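For a rough idea of what sits behind that picture, a dataflow is stored as a JSON definition with one entry per node. The sketch below builds such a definition as a Python dictionary and prints it as JSON. The node names (Extract_Account and so on), the chosen fields, and the "Owner" relationship prefix are my assumptions for illustration; the definition that the dataset builder generates in your org will use its own names and field list.

```python
import json
from collections import Counter

# Sketch of a dataflow definition with the four nodes described above.
# Node names and field selections are assumptions, not the exact output
# of the dataset builder.
dataflow = {
    "Extract_Account": {
        "action": "sfdcDigest",
        "parameters": {
            "object": "Account",
            "fields": [{"name": "Id"}, {"name": "Name"}, {"name": "OwnerId"}],
        },
    },
    "Extract_User": {
        "action": "sfdcDigest",
        "parameters": {
            "object": "User",
            "fields": [{"name": "Id"}, {"name": "Name"}, {"name": "SmallPhotoUrl"}],
        },
    },
    "Augment_Account_Owner": {
        "action": "augment",
        "parameters": {
            "left": "Extract_Account",       # rows to keep
            "left_key": ["OwnerId"],         # lookup field on Account
            "relationship": "Owner",         # prefix for the added columns
            "right": "Extract_User",         # rows to look up
            "right_key": ["Id"],
            "right_select": ["Name", "SmallPhotoUrl"],
        },
    },
    "Register_Accounts": {
        "action": "sfdcRegister",
        "parameters": {
            "alias": "Accounts",
            "name": "Accounts",
            "source": "Augment_Account_Owner",
        },
    },
}


def count_actions(definition):
    """Count how many nodes of each action type the definition contains."""
    return Counter(node["action"] for node in definition.values())


action_counts = count_actions(dataflow)
print(json.dumps(dataflow, indent=2))
print(dict(action_counts))
```

The counts line up with the list above: two sfdcDigest nodes, one augment node, and one sfdcRegister node.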

You can check the “Monitor” tab and see your dataflow running with stats of every step.

Dataflow Monitor

You can edit the dataflow directly, if required. I will cover that in the upcoming blogs.

Create a Dataset with CSV file

Let’s start creating our first dataset using a CSV file (you can read more about datasets in my blog on Datasets).

STEP 1 : Click on Create Button & Select Dataset

There are multiple ways to start creating a dataset. After entering Analytics Studio, you can either click the “Create” button (in the top right corner) or go into your Einstein Analytics app and click the button there. Then select Dataset.

Create a dataset

STEP 2 : Choose Data Source

Choose CSV File, as we are creating the dataset from a CSV file.

Data Source

STEP 3 : Upload the CSV File

Upload the CSV file by browsing to it or by dragging and dropping it. Here we are uploading Opportunity data. I am using the data available in my org (built as part of one of the Trailhead modules).

File to Upload

STEP 4 : Define the properties of the Dataset

This is where you give the dataset a name and choose the App in which to save it. You can also check the auto-detected file properties in case they need a change. The Data Schema File will be covered in later blogs; in short, it is a metadata file that describes the structure of the CSV data (field names, types, and formats). It is separate from the Extended Metadata (XMD) file, which is used for formatting options of the dataset fields and their values. Click the Next button to move to the next step.

New Dataset

STEP 5 : Preview and Edit the Field Attributes

This step allows you to preview the data being loaded and to change any settings. This is where you define the type of each column (Dimension, Measure, or Date) and set its attributes. By default, Einstein Analytics detects the data type automatically, but you can change it as well.
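To give a feel for what auto-detection could look like, here is a simplified heuristic in Python. I do not know the actual rules Einstein Analytics applies, so treat this purely as an assumption: a column where every value parses as a number becomes a Measure, one where every value parses as a date becomes a Date, and everything else is a Dimension. The sample values and the date format are made up.

```python
from datetime import datetime


def _is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def _is_date(text, date_format):
    try:
        datetime.strptime(text, date_format)
        return True
    except ValueError:
        return False


def detect_type(values, date_format="%Y-%m-%d"):
    """Guess a column type from its string values (hypothetical heuristic)."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "Dimension"
    if all(_is_number(v) for v in non_empty):
        return "Measure"
    if all(_is_date(v, date_format) for v in non_empty):
        return "Date"
    return "Dimension"


# Made-up sample of the columns mentioned in this step.
sample_columns = {
    "Account.Name": ["Einstein Labs", "Acme"],
    "Amount": ["1500.50", "200"],
    "CloseDate": ["2020-01-31", "2020-04-15"],
    "Id": ["006A1", "006A2"],
}

detected = {name: detect_type(vals) for name, vals in sample_columns.items()}
for name, column_type in detected.items():
    print(name, "->", column_type)
```

A heuristic like this would mark an all-digit product number as a Measure, which is exactly the kind of case where you would want to override the detected type by hand.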

In this step, you can check the fields that would be part of the dataset. As we can see we have Account.Name, Amount, CloseDate, Id, etc. as the columns (or fields) of the dataset.

Dataset Search Fields

You can preview the data being loaded. Here the preview shows 3 columns, to which we will now make some changes.

Edit Field Attributes

We have a column named Account.Name. This is because, in the Opportunity data, we also extracted the parent Account’s Name field, and in the CSV it was extracted as Account.Name. Also note that Analytics Studio automatically picked this field as a dimension (qualitative value).

Dimension Attributes

You can either fix the column name in the CSV file before loading or change the field label in this step. Here we will change Account.Name to Account.

Changing Field Label

The next field to check is the Amount field. This field is auto-detected as a Measure (value) field because the data in this column is numeric.

Measure Field
Measure Attributes

Since it’s a currency, we will change the number format to 0.00. We will also add a grouping symbol and a decimal symbol for our data.
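As a quick illustration of what two decimal places plus grouping and decimal symbols do to a raw number, here is a small Python sketch. The function name and the sample values are mine, and it only mimics the effect; it is not the formatting engine Einstein Analytics uses.

```python
def format_amount(value, grouping=",", decimal="."):
    """Format a number with two decimals and the given grouping/decimal symbols."""
    text = f"{value:,.2f}"  # Python's default: "," for grouping, "." for decimals
    # Swap the symbols through a placeholder so they do not overwrite each other.
    return text.replace(",", "\0").replace(".", decimal).replace("\0", grouping)


print(format_amount(1234567.5))
print(format_amount(1234567.5, grouping=".", decimal=","))
print(format_amount(50))
```

The second call shows why the symbols matter: the same value reads very differently when the file uses a comma as the decimal symbol.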

Changing Measure Attributes

We will now check the Close Date field. This field is of type Date which has been auto-picked by Einstein Analytics.

Date Attributes

After checking and changing the field attributes as needed, we move to the next step.

STEP 6 : Einstein builds Dataset

This is where the Einstein Analytics engine builds the dataset. It can take a few seconds to load. I like this page, though, as we can see Mr. Einstein doing the magic.

Creating Dataset

STEP 7 : Review the prepared Dataset

Once the dataset is loaded, you are automatically taken to the dataset page. There you can check your dataset properties and also add security predicates and an XMD file. A good sanity check here is to verify the number of rows loaded.

Dataset

You can also view the dataset’s data by clicking the “Explore” button. This opens a Lens where you can add your filters, bars (the dimension), bar length (a measure or a calculation on a measure), etc. and choose a chart to display.

I added a filter saying Account contains “Einstein”. Bar Length has been kept as Sum of Amount. Bars have been kept as Close Date (Quarter) and Account (and not Account.Name). Just to add, we only loaded Close Date, but Einstein Analytics provided us with various groupings like Year, Quarter, Month, Year-Month, Year-Month-Day, etc.
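If it helps to see the logic behind that lens, here is a rough Python equivalent: filter rows where Account contains “Einstein”, group by the Close Date quarter and Account, and sum the Amount. The rows are made-up sample data, the quarter is a plain calendar quarter (your org’s fiscal settings may differ), and the "contains" check here is case-sensitive, which is an assumption on my part.

```python
from collections import defaultdict
from datetime import date

# Made-up opportunity rows; only Close Date was loaded, the groupings
# are derived from it.
opportunities = [
    {"Account": "Einstein Labs", "Amount": 1000.0, "CloseDate": date(2020, 1, 15)},
    {"Account": "Einstein Labs", "Amount": 500.0, "CloseDate": date(2020, 2, 20)},
    {"Account": "Einstein Labs", "Amount": 250.0, "CloseDate": date(2020, 5, 3)},
    {"Account": "Acme", "Amount": 900.0, "CloseDate": date(2020, 1, 10)},
]


def date_parts(d):
    """Derive the date groupings from a single date (calendar quarters assumed)."""
    quarter = (d.month - 1) // 3 + 1
    return {
        "Year": str(d.year),
        "Quarter": f"{d.year}-Q{quarter}",
        "Month": f"{d.month:02d}",
        "Year-Month": f"{d.year}-{d.month:02d}",
        "Year-Month-Day": d.isoformat(),
    }


def sum_amount_by_quarter_and_account(rows, account_contains):
    """Filter on Account, group by (quarter, account), and sum Amount."""
    totals = defaultdict(float)
    for row in rows:
        if account_contains not in row["Account"]:
            continue
        key = (date_parts(row["CloseDate"])["Quarter"], row["Account"])
        totals[key] += row["Amount"]
    return dict(totals)


lens = sum_amount_by_quarter_and_account(opportunities, "Einstein")
for (quarter, account), total in sorted(lens.items()):
    print(quarter, account, total)
```

Each key of the result corresponds to one bar in the chart, and the summed Amount is the bar length.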

Lens View

We will soon use this dataset in our dashboard.

What is a Dataset in Salesforce Einstein?

A dataset is simply a collection of rows and columns. If you have used Excel or Google Sheets, then one dataset is one worksheet. If you have built databases, it is one table.

The columns of the dataset represent fields. The rows of the dataset represent records.

A dataset can contain the following types of data:

  • Dimension
  • Measure
  • Date

Dimension is a qualitative value / text / a field that you want to group your data by / measure your data against / use as a filter, e.g. City Name, Status, Salesperson, Product Number, etc. A number (like Product Number) can also be a dimension, as the number represents something you can group by. In a Lens (Chart), a Dimension is used in Bars & Filters.

Measure is a quantitative value / number / a field that gives an actual value, e.g. Amount, Number of Cases, etc. While creating the dataset, it is important to identify the value fields and ensure that each such column is marked as a Measure. A measure can further be aggregated (like Count of Rows, Sum of Amount, etc.). In a Lens (Chart), a Measure is used in Bar Length & Filters.

Date is used to denote date fields in the dataset. If we mark a column as Date, it helps in analysis by automatically allowing grouping / filtering by Day, Month, and Year, and also by relative dates like Last Week, Last Month, etc. In a Lens (Chart), a Date is used in Bars & Filters.

Once the dataset is set up and ready to use, we can build various charts and dashboards on top of it to provide useful analysis and insights for the business.

Salesforce Einstein : Is it for me?

Salesforce Einstein – In the Salesforce world, this has created a buzz. Salesforce Einstein is a big thing. It combines the power of Analytics and AI.
The big question that comes to mind: should I really know and learn about Einstein?

Person 1 : I am an expert in Apex / Aura / LWC. Do I really need to know about Einstein Analytics or Discovery?
Person 2 : I am an expert Admin. I can do amazing things with Flows, Process Builders, Layouts, etc. I can’t write code. Do I really need to know about Salesforce Einstein?
Person 3 : I am a consultant. I know Sales Cloud, Service Cloud, Community, FSL. Should I know about Einstein?

In my view, for all of the above scenarios, the answer is “Yes”. Really!!

Before I talk about Salesforce Einstein features, it’s really important to understand why I should invest my time in learning a certain technology and how it impacts my day-to-day work.

Salesforce is a platform that is continuously growing and innovating at a rapid pace. In my career, I started with Admin work, then learnt Apex, then Aura and LWC. Along with that, I am also building my knowledge of Einstein.

Let’s say we have a brand new org and we need to make it functional for user(s).
First we try to do everything out of the box and basic for users to be functional as soon as possible.
Once they start using the system, we build fancy UI (Developer’s delight) and automation.
Then we bring other systems together on one platform.
Then what : We continue to innovate and enhance.

However, at the same time, we are also building a very crucial thing – Data.

Data plays an important role in our day-to-day work by providing analysis of what we have done so far, what we are currently doing, and what we can do better in the future.

Now, with the large amount of data we have (in Salesforce or other systems), how do we analyze it, and how can we use past experience to build a better future?
The answer is Salesforce Einstein.

Do we really need a tool to analyze / predict – why can’t my Dev team just run some queries and put some logic into it?
Maybe for 100 records they can do it in minutes. For 1,000 records, maybe hours. 10,000 records – days? Can you really wait for days?
Now think about what happens if there are 50,000 records.

What if you no longer needed to wait for IT to provide all this and could access powerful data insights in your desktop browser or on mobile?

Welcome to the world of Salesforce Einstein. A place where you can easily build dashboards to visualize data and share insights (Einstein Analytics). A place where you can use the power of machine learning, AI, and statistical analysis (Einstein Discovery).

So is Salesforce Einstein a replacement for a Dev / Admin / Consultant? Consider Einstein your friend, adviser, and additional team member who will provide you with powerful insights to help in your day-to-day work.