A Streamlit App for Rapid Exploratory Data Analysis and Visualisation

Exploratory Data Analysis (EDA) is a critical first step when working with any kind of data, helping to uncover patterns, spot anomalies, and gain valuable insights.

To make this process faster and more interactive, I’ve built an EDA Dashboard using Streamlit and Plotly.

In this blog post, I’ll walk you through the app, show you how to use it, and highlight key features.


What is the EDA Dashboard?

The EDA Dashboard is a web-based tool for rapid data exploration and visualisation.

You can upload a CSV file and immediately get summary statistics, correlation analysis, and interactive plots, all without writing a single line of code!

My aim was to open up data analysis and visualisation to people unfamiliar with R or Python, and to anyone who just wants a quick look at their data.


Key Features

Here’s a quick overview of what the EDA Dashboard can do:

  • Upload and analyse CSV files with automatic data type detection.
  • View data, summary statistics and missing value counts.
  • Generate an interactive correlation heatmap.
  • Visualise data distributions with histograms.
  • Create bar charts and scatter plots to explore relationships.
  • Export visualisations as PNG files.


Upload Your Data

The app comes pre-loaded with an example dataset (assemblies.csv), but you can upload your own CSV file instead. Drag and drop the file or click Browse files, and the app will process it automatically.

Pro tip: Ensure your CSV has a header row and clean, consistent numeric data for the best results.
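
If you're curious what the loading step might look like in code, here is a minimal sketch using pandas. It is not the app's actual source: the load_csv helper and the demo data are made up for illustration, and I'm assuming the upload arrives as a file-like object (which is what Streamlit's st.file_uploader returns), with a fallback to the example file when nothing has been uploaded.

```python
import io

import pandas as pd


def load_csv(uploaded_file=None, default_path="assemblies.csv"):
    """Read an uploaded file if there is one, otherwise the example dataset.

    Hypothetical helper for illustration only.
    """
    source = uploaded_file if uploaded_file is not None else default_path
    # pandas treats the first row as the header and infers each column's type.
    return pd.read_csv(source)


# Stand-in for an uploaded file so the sketch runs on its own (made-up values).
demo_upload = io.StringIO("sample,length,gc\nA,4600000,50.8\nB,5100000,\n")
df = load_csv(demo_upload)
print(df.dtypes)
```

This also shows why the header row and consistent numeric data matter: a single stray text value in a numeric column would make pandas read the whole column as text.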


Explore Your Data

After uploading your dataset, the View Data and Summary Metrics section provides several ways to explore it:

  • Show Data – View the full dataset with row and column counts.
  • Show Summary – See descriptive statistics for each numeric column.
  • Show Data Types – Check the data type of each column.
  • Show Missing Values – Identify columns with missing values and their percentages.
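
For anyone who wants to reproduce these four views outside the app, the sketch below shows roughly equivalent pandas calls. The missing_value_report helper and the example values are my own illustration, not code taken from the dashboard.

```python
import pandas as pd


def missing_value_report(df):
    """Missing values per column, as a count and a percentage of rows.

    Hypothetical helper; only columns with at least one gap are returned.
    """
    counts = df.isna().sum()
    report = pd.DataFrame(
        {"missing": counts, "percent": (counts / len(df) * 100).round(1)}
    )
    return report[report["missing"] > 0]


# Made-up data with one missing value in "length".
df = pd.DataFrame(
    {
        "sample": ["A", "B", "C", "D"],
        "length": [4.6, 5.1, None, 4.9],
        "gc": [50.8, 51.2, 50.5, 49.9],
    }
)

n_rows, n_cols = df.shape           # Show Data: row and column counts
summary = df.describe()             # Show Summary: numeric columns only
dtypes = df.dtypes                  # Show Data Types
missing = missing_value_report(df)  # Show Missing Values

print(n_rows, n_cols)
print(summary)
print(dtypes)
print(missing)
```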

[Screenshot: the summary view]


Visualisation Options

The visualisation section lets you generate several types of plots to better understand your data.

Correlation Analysis

Click Generate Correlation Matrix to visualise pairwise Pearson correlation coefficients between numeric columns. The heatmap uses a red-blue colour scale, with tooltips showing exact correlation values when you hover over each cell.


Distribution Analysis

Select a column from the dropdown menu to generate a histogram of its values. This is a quick and easy way to spot outliers or skewed distributions.


Bar Charts

Want to compare values across samples or categories? The bar chart option lets you visualise any column interactively.


Scatter Plots

Explore relationships between two variables by plotting them on a scatter plot. Select different columns for the X and Y axes, then zoom and pan to explore specific regions.


Exporting Visualisations

Every plot in the app can be exported as a PNG file. Just hover over the plot, click the camera icon in the top-right corner, and save it to your computer.
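
The camera icon is part of Plotly's standard mode bar, and its behaviour can be tuned through the figure's config dictionary. The sketch below builds such a dictionary; the png_export_config function, the default filename and the scale value are my own choices for illustration, not settings taken from the app.

```python
def png_export_config(filename="eda_plot", scale=2):
    """Build a Plotly config dict that controls the camera-icon download.

    Hypothetical helper; `filename` and `scale` are example defaults.
    """
    return {
        "toImageButtonOptions": {
            "format": "png",       # file type produced by the camera icon
            "filename": filename,  # download name, without the extension
            "scale": scale,        # 2 doubles the pixel dimensions
        }
    }


config = png_export_config("correlation_heatmap")
print(config)
```

With a Plotly figure in hand, the dictionary can be passed as fig.show(config=config). I'm assuming the same dictionary can also be handed to Streamlit's st.plotly_chart through its config argument, but check the documentation for your Streamlit version.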


Try it Out!

Head over to the EDA Dashboard and give it a try. I’d love to hear your feedback!


Support

If you find this tool helpful, consider buying me a coffee on Ko-fi. Your support helps me keep building tools like this!




