Exploratory Data Analysis (EDA)

Data Profiling


Last updated 1 year ago

Now that the data has been examined and some initial cleaning has taken place, it's time to assess the quality of the dataset: its structure, its content, and the relationships between its variables. This step matters because it surfaces issues and inconsistencies in the data. Data analysts can use the tools below to examine the data and produce reports on key aspects such as data types, ranges, and distributions. Profiling differs from exploration in its goal: data profiling focuses on the quality of the data, whereas data exploration aims to better understand the data.
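As a rough illustration of the kind of report described above (data types, ranges, missing values), here is a minimal sketch that uses only the Python standard library. The function names `profile_column` and `profile_table` and the sample rows are hypothetical, made up for this example; the tools listed below produce far richer profiles.

```python
from collections import Counter
from statistics import mean, median


def profile_column(values):
    """Summarize one column: size, missing values, distinct values, types, range."""
    # Treat None and empty strings as missing (an assumption for this sketch).
    present = [v for v in values if v is not None and v != ""]
    report = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "types": dict(Counter(type(v).__name__ for v in present)),
    }
    # Report a numeric range only when every present value is a number;
    # a column with mixed types is itself a quality issue worth flagging.
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric and len(numeric) == len(present):
        report.update(
            min=min(numeric),
            max=max(numeric),
            mean=mean(numeric),
            median=median(numeric),
        )
    return report


def profile_table(rows):
    """Profile every column of a table given as a list of dicts."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical sample data with one missing age, one age stored as text,
# and one empty city.
rows = [
    {"id": 1, "age": 34, "city": "Lisbon"},
    {"id": 2, "age": None, "city": "Porto"},
    {"id": 3, "age": 29, "city": ""},
    {"id": 4, "age": "41", "city": "Lisbon"},
]

report = profile_table(rows)
for column, stats in report.items():
    print(column, stats)
```

In this sample the `age` column reports two types (`int` and `str`) and no numeric range, which is the sort of inconsistency profiling is meant to catch before analysis begins.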

Metabase

Metabase is an easy-to-use data exploration tool that allows even non-technical users to ask questions and gain insights. With this business intelligence tool you can build interactive dashboards, create models that clean up tables, and set up alerts that notify users when your data changes. You can connect directly to 20+ data sources and start working with data within minutes.

Lightdash

A popular open-source business intelligence tool, Lightdash is designed for dbt (data build tool). It lets data analysts and engineers control all of their business intelligence tools in a single place, bridging the gap between the transformation and visualization layers. Because it is a full-stack BI platform, analysts can write their metrics in-house, which makes the data easy for the entire business to work with.

Perspective

Perspective is an interactive analytics and data visualization component that is especially well suited to large and/or streaming datasets. It allows users to create easily configurable reports, dashboards, notebooks, and applications.

Apache Doris

Built on an MPP (massively parallel processing) architecture, Apache Doris is a high-performance, real-time analytics database known for its speed and ease of use. It is well suited to report analysis, ad-hoc queries, unified data warehousing, and data lake query acceleration. On top of it, users can build applications for user behavior analysis, A/B testing, log retrieval analysis, user profile analysis, order analysis, and more.
