Exploratory analysis

Exploratory analysis is a crucial first step in data science, serving as a foundation for more detailed analysis later on.

Exploratory analysis is open-ended and curiosity-driven. It often takes several iterations, where the insights gained in one step lead to new questions and further analysis.

This process typically involves a variety of techniques, such as statistical summaries and data visualization. You’re generally looking to spot trends, relationships, and distributions in the data.

Statistical summaries

Statistical summaries should give you insight into the range and central tendency of the dataset.

Polymer comes with this information out-of-the-box!

For each dataset, we provide a summary of every column, including metrics like mean, max, min, and variance, plus information on distribution, empty values, and unique values.

These summaries will differ a bit depending on the data type.

Here’s an example of a summary for a number column:

[Image: number column summary]

And another example for a string column:

[Image: string column summary]

Find more information here: Data summaries
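To make the idea concrete outside of Polymer, here is a minimal Python sketch of a per-column summary. It is not Polymer's implementation: the function name `summarize_column`, the use of `None` for empty cells, the choice of population variance, and the sample values are all assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, pvariance

def summarize_column(values):
    """Summarize one column. None marks an empty cell (an assumption of this sketch)."""
    present = [v for v in values if v is not None]
    summary = {
        "empty": len(values) - len(present),
        "unique": len(set(present)),
    }
    if present and all(isinstance(v, (int, float)) for v in present):
        # Number column: range and central tendency.
        summary.update(
            min=min(present),
            max=max(present),
            mean=mean(present),
            variance=pvariance(present),  # population variance, chosen for simplicity
        )
    else:
        # String column: most frequent values instead of numeric metrics.
        summary["top_values"] = Counter(present).most_common(3)
    return summary

# Made-up sample data for illustration only.
ratings = [7.5, 8.0, None, 6.5, 8.0]
genres = ["drama", "comedy", "drama", None, "drama"]

print(summarize_column(ratings))
print(summarize_column(genres))
```

The two branches mirror the point above: the summary differs by data type, so a number column reports min, max, mean, and variance, while a string column reports its most common values.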

Visualizing relationships

Look for correlations in the data, where one data point may help to predict another. Let’s walk through an example with movie ratings and reviews!

Classic: Scatterplot

Use a scatterplot to understand how one metric moves (or doesn’t move) with another.

If you expect one metric to predict the other, put the predictor on the x-axis and the predicted metric on the y-axis.

[Image: scatterplot example]

Polymer power-up: Correlation block

Use the correlations block to instantly see the strength (degree) and the direction (positive or negative) of the relationship between each pair of metrics in your data!

Simply add all of the desired metrics to the correlations block and watch the magic happen.

[Image: correlations block example]
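For intuition about what "degree" and "direction" mean, here is a small Python sketch that computes the Pearson correlation for every pair of metrics. It is not how the Polymer block is implemented: the helper names, the strength thresholds (0.3 and 0.7), and the movie numbers are all illustrative assumptions.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def describe(r):
    """Label a coefficient by strength and direction (thresholds are assumptions)."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    size = abs(r)
    strength = "strong" if size >= 0.7 else "moderate" if size >= 0.3 else "weak"
    return strength, direction

# Made-up movie data for illustration only.
metrics = {
    "rating": [6.0, 7.0, 8.0, 9.0],
    "reviews": [120, 150, 310, 400],
    "runtime_min": [95, 140, 100, 130],
}

# Compare every pair of metrics, as a correlation matrix would.
for a, b in combinations(metrics, 2):
    r = pearson(metrics[a], metrics[b])
    print(f"{a} vs {b}: r={r:.2f} {describe(r)}")
```

The sign of the coefficient gives the direction and its absolute value gives the degree. Correlation alone does not show that one metric causes the other.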

Visualizing distributions

Look for categories or data points that fall outside the typical distribution of the data. Let’s walk through an example with food prices!

Classic: Column / bar chart

Discover the full distribution of the data with a column or bar chart.

Put a category on the x-axis to see how the data is distributed across its values, or put a metric on the x-axis to turn the column chart into a histogram!

[Image: column chart example]
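The difference between a category column chart and a histogram can be sketched in a few lines of Python. This is an illustration under assumptions, not Polymer's charting logic: the function names, the fixed bin width, and the food data are made up.

```python
from collections import Counter

def category_counts(labels):
    """Column chart data: one bar per category, bar height = number of rows."""
    return Counter(labels)

def histogram(values, bin_width):
    """Histogram data: count values per fixed-width bin, keyed by the bin's lower edge."""
    bins = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(bins.items()))

# Made-up food data for illustration only.
categories = ["fruit", "dairy", "fruit", "bakery", "fruit", "dairy"]
prices = [1.2, 1.8, 2.5, 2.9, 3.1, 9.75]

print(category_counts(categories))

# A quick text rendering of the histogram; the isolated bar far from the
# others is the kind of data point worth a closer look.
for edge, count in histogram(prices, 1).items():
    print(f"{edge:>5} | {'#' * count}")
```

With a category on the x-axis each bar is a distinct label, while with a metric on the x-axis each bar is a numeric range, so gaps between bars become visible.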

Polymer power-up: Outlier block

The outlier block finds, as the name suggests, outliers in the data: categories that sit at the extremes of the range of a given metric.

Choose a metric of interest, then choose some categories to break that metric down by. You can segment by up to 6 categories!

The results show you which categories perform the “best” and “worst”.

[Image: outlier block example]
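As a rough mental model of this kind of breakdown, the Python sketch below groups rows by one or more categories, averages a metric per group, and ranks the groups. It is only an assumption about how "best" and "worst" could be defined (highest and lowest average); the function name `rank_categories` and the data are hypothetical and do not describe Polymer's actual method.

```python
from collections import defaultdict
from statistics import mean

def rank_categories(rows, category_keys, metric):
    """Average `metric` per combination of categories, sorted from highest to lowest."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[k] for k in category_keys)
        groups[key].append(row[metric])
    averages = {key: mean(vals) for key, vals in groups.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

# Made-up food price data for illustration only.
rows = [
    {"category": "fruit", "store": "A", "price": 2.0},
    {"category": "fruit", "store": "B", "price": 3.0},
    {"category": "dairy", "store": "A", "price": 4.0},
    {"category": "dairy", "store": "B", "price": 6.0},
    {"category": "bakery", "store": "A", "price": 1.0},
]

# One level of segmentation, then two.
by_category = rank_categories(rows, ["category"], "price")
by_category_and_store = rank_categories(rows, ["category", "store"], "price")

print("highest:", by_category[0], "lowest:", by_category[-1])
print("highest:", by_category_and_store[0], "lowest:", by_category_and_store[-1])
```

Adding more keys to `category_keys` corresponds to deeper segmentation: each extra category splits the groups further, so the extremes become more specific but rest on fewer rows.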