Introduction: Apache Doris - A Distributed SQL-Based Data Warehouse System

Learn about Apache Doris, a distributed SQL-based data warehouse system that offers high performance and scalability for data analytics.

July 9, 2024 · 3 min · 514 words · Bá Tới

Introduction to DuckDB: The New Database for Data Analysis

I. What is DuckDB? DuckDB is an open-source analytical database (OLAP) designed to optimize large-scale data processing on personal computers. Often referred to as “SQLite for analytics,” DuckDB offers convenience and efficiency in data processing, allowing users to quickly harness data without complex setup. II. Key Features of DuckDB Simplicity and Ease of Use: DuckDB doesn’t require a server setup, making it easy to integrate into existing projects. High Performance: Built for data analysis, DuckDB supports complex queries and handles large datasets quickly. Full SQL Support: DuckDB provides a rich SQL environment, supporting most standard SQL commands. Easy Integration: It integrates with popular programming languages like Python, R, and C++, so users can query and analyze data from the tools they already use. III. Benefits of Using DuckDB No Complex Setup: Unlike other database systems, DuckDB doesn’t need intricate configuration; just download and use it. Personal Computer Processing: DuckDB performs well on personal computers, allowing users to analyze data without requiring powerful server resources. High Integration Capability: DuckDB can be embedded into existing tools and applications, so it fits into existing data analysis workflows. IV. How to Use DuckDB 1. Installation Installing DuckDB is straightforward, especially in Python: ...

June 29, 2024 · 2 min · 400 words · Bá Tới

Redash: Empowering Data Visualization and Collaboration

Redash is an open-source data visualization and collaboration platform that allows users to connect to various data sources, execute SQL queries, visualize data, and share insights with team members. Learn more about the key features of Redash and how it empowers organizations to make data-driven decisions.

March 24, 2024 · 2 min · 398 words · Bá Tới

Understanding PostgreSQL COUNT(NULL), COUNT(*), and COUNT(1)

In PostgreSQL, the COUNT function is commonly used to count the number of rows in a table. However, there are subtle differences between COUNT(NULL), COUNT(*), and COUNT(1) that are worth understanding. Let’s delve into each of these variants, their usage, and performance implications.
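As a rough illustration of the difference between the variants, here is a minimal sketch. It assumes Python's built-in `sqlite3` module as a stand-in for PostgreSQL (the `COUNT` semantics shown here follow standard SQL and should match), and the `orders` table and its `coupon` column are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table: three orders, two of which have no coupon (NULL).
conn.execute("CREATE TABLE orders (id INTEGER, coupon TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "SPRING"), (2, None), (3, None)],
)

# COUNT(*) and COUNT(1) count every row.
# COUNT(coupon) counts only rows where coupon is not NULL.
# COUNT(NULL) never counts anything, because its argument is always NULL.
counts = conn.execute(
    "SELECT COUNT(*), COUNT(1), COUNT(coupon), COUNT(NULL) FROM orders"
).fetchone()
conn.close()

print(counts)  # (3, 3, 1, 0)
```

The sketch only shows the counting semantics; it says nothing about performance in PostgreSQL.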

March 10, 2024 · 2 min · 292 words · Bá Tới

Unlocking the Power of PostgreSQL Window Functions: A Comprehensive Guide

In PostgreSQL, window functions are a powerful feature that allows you to perform complex analytical calculations within SQL queries. By understanding the key features and common use cases of window functions, you can unlock their full potential and gain valuable insights into your data.
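To make the idea concrete, the following is a small sketch of two window functions. It assumes Python's built-in `sqlite3` module (SQLite 3.25 or newer) as a stand-in for PostgreSQL; the `OVER (PARTITION BY ... ORDER BY ...)` syntax used here is the same in both. The `sales` table and its data are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical sales data: two regions with two sales each.
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("north", 30), ("south", 20), ("south", 5)],
)

# Unlike GROUP BY, a window function keeps every input row and
# attaches a value computed over that row's partition.
rows = conn.execute(
    """
    SELECT region,
           amount,
           SUM(amount) OVER (PARTITION BY region) AS region_total,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region
    FROM sales
    ORDER BY region, rank_in_region
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

Each output row carries the original `amount` alongside its region's total and its rank within the region, which is the kind of per-row analytical calculation that a plain aggregate cannot express.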

March 1, 2024 · 5 min · 910 words · Bá Tới

Understanding PostgreSQL DISTINCT ON

In PostgreSQL, the DISTINCT ON clause is a powerful tool used in the SELECT statement to retrieve unique rows based on a specified set of columns. It is often combined with the ORDER BY clause to determine which unique rows to select.
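Since `DISTINCT ON` is specific to PostgreSQL, the sketch below emulates its behavior in plain Python rather than running it against a database: sort the rows, then keep the first row of each group. The `orders` data and column names are hypothetical, and the SQL in the comment is only the query being imitated.

```python
from itertools import groupby
from operator import itemgetter

# Query being imitated (PostgreSQL):
#   SELECT DISTINCT ON (customer) customer, order_date, total
#   FROM orders
#   ORDER BY customer, order_date DESC;

# Hypothetical rows standing in for the orders table.
orders = [
    {"customer": "alice", "order_date": "2024-01-05", "total": 40},
    {"customer": "bob", "order_date": "2024-01-03", "total": 15},
    {"customer": "alice", "order_date": "2024-02-01", "total": 25},
]

# ORDER BY customer, order_date DESC: sort by the secondary key first,
# then rely on the stability of sorted() for the primary key.
ordered = sorted(orders, key=itemgetter("order_date"), reverse=True)
ordered = sorted(ordered, key=itemgetter("customer"))

# DISTINCT ON (customer): keep only the first row of each customer group.
latest = [next(group) for _, group in groupby(ordered, key=itemgetter("customer"))]

for row in latest:
    print(row)
```

The sort order decides which row survives for each customer, which mirrors why `DISTINCT ON` is usually paired with `ORDER BY`.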

February 28, 2024 · 2 min · 341 words · Bá Tới