SQL

Automate Weekly Data Monitoring and Sharing with Kestra

Suppose you need to query a CSV file every Monday and share the results in a Slack channel so the team can monitor the data and stay informed.

Doing this by hand each week is repetitive and inefficient.

With Kestra, you can automate the whole process in just a few lines of YAML.
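To make the task concrete, here is a minimal Python sketch of the two steps such a flow would run each week: querying a CSV and formatting the result as a Slack message. This is not Kestra's YAML syntax. The sample data, the table name `orders`, and the function names are hypothetical, and the payload is only built, not sent. It uses the `{"text": ...}` JSON body that Slack incoming webhooks accept.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data standing in for the weekly CSV file.
SAMPLE_CSV = """order_id,region,amount
1,north,120.50
2,south,80.00
3,north,42.25
"""


def load_csv_into_sqlite(csv_text):
    """Load CSV text into an in-memory SQLite table named `orders`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
    reader = csv.DictReader(io.StringIO(csv_text))
    # Named placeholders are filled from each row's column names.
    conn.executemany(
        "INSERT INTO orders VALUES (:order_id, :region, :amount)", list(reader)
    )
    return conn


def weekly_summary(conn):
    """Return (region, order count, total amount) per region."""
    query = (
        "SELECT region, COUNT(*), SUM(amount) "
        "FROM orders GROUP BY region ORDER BY region"
    )
    return conn.execute(query).fetchall()


def build_slack_payload(rows):
    """Format the summary as a JSON body for a Slack incoming webhook."""
    lines = [
        f"{region}: {count} orders, total {total:.2f}"
        for region, count, total in rows
    ]
    return json.dumps({"text": "Weekly summary\n" + "\n".join(lines)})


conn = load_csv_into_sqlite(SAMPLE_CSV)
payload = build_slack_payload(weekly_summary(conn))
print(payload)
```

An orchestrator such as Kestra replaces the manual part: it runs steps like these on a schedule and handles the delivery to Slack.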

Link to Kestra.

The Lakehouse Model: Bridging the Gap Between Data Lakes and Warehouses

First-generation data warehouses excelled with structured data and BI tasks but had limited support for unstructured data and were costly to scale up.

Second-generation data lakes offered scalable storage for diverse data but lacked key management features, such as ACID transactions and data versioning.

Databricks’ Lakehouse architecture combines the strengths of lakes and warehouses, including:

Supporting various data types, suitable for data science and machine learning.

Enhancing management features such as ACID transactions and data versioning.

Using cost-effective object storage, like Amazon S3, with formats like Parquet.

Maintaining data integrity via a metadata layer.
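The role of the metadata layer can be illustrated with a toy sketch. It is not how Databricks or any real table format is implemented. It only shows the general idea of immutable data files plus an append-only commit log, which is what makes atomic commits and reading older versions possible. The class name `ToyVersionedTable`, the `_log` directory, and the file layout are all hypothetical, and JSON files stand in for Parquet so the example needs only the standard library.

```python
import json
import tempfile
from pathlib import Path


class ToyVersionedTable:
    """Toy table: immutable data files plus an append-only commit log."""

    def __init__(self, root):
        self.root = Path(root)
        self.log_dir = self.root / "_log"
        self.log_dir.mkdir(parents=True, exist_ok=True)

    def latest_version(self):
        """Return the highest committed version, or -1 if the table is empty."""
        versions = [int(p.stem) for p in self.log_dir.glob("*.json")]
        return max(versions, default=-1)

    def commit(self, rows):
        """Write a new data file, then publish it with a single log entry."""
        version = self.latest_version() + 1
        data_file = self.root / f"data-{version}.json"
        data_file.write_text(json.dumps(rows))
        # Mode "x" fails if the log entry already exists, so two writers
        # cannot both claim the same version number.
        with open(self.log_dir / f"{version}.json", "x") as f:
            json.dump({"add": data_file.name}, f)
        return version

    def read(self, version=None):
        """Replay the log up to `version` (default: latest) and return rows."""
        if version is None:
            version = self.latest_version()
        rows = []
        for v in range(version + 1):
            entry = json.loads((self.log_dir / f"{v}.json").read_text())
            rows.extend(json.loads((self.root / entry["add"]).read_text()))
        return rows


with tempfile.TemporaryDirectory() as tmp:
    table = ToyVersionedTable(tmp)
    table.commit([{"id": 1}])
    table.commit([{"id": 2}])
    print(table.read(0))  # rows as of the first commit
    print(table.read())   # rows as of the latest commit
```

Readers only see data files that a log entry points to, so a half-written file is never visible, and any earlier version can be reconstructed by replaying fewer log entries.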

Learn more about Data Lakehouse Architecture.

Supercharge Your dbt and SQL Workflows in VSCode with DataPilot

Wouldn’t it be nice if you could accelerate your dbt and SQL workflows directly within VSCode? With DataPilot, available through the dbt-power-user VSCode extension, you can:

Transform your SQL into dbt models

Instantly generate dbt documentation

Effortlessly map out column lineage

and more, all with a single click.

Link to dbt-power-user.

Work with Khuyen Tran
