Table of Contents
- Why Should Data Scientists Switch from Print to Logging?
- Why Many Data Scientists Still Use Print
- Meet Loguru: The Best of Both Worlds
- Format Logs Easily
- Save Logs to File
- Rotate and Retain Logs
- Better Exception Logging
- Summary: Why Loguru Wins
- Should I Always Use Loguru Instead of Print?
Why Should Data Scientists Switch from Print to Logging?
As your data science projects evolve from notebooks to production-ready pipelines, print()
becomes harder to manage.
For example, consider the following data science project where you want to track different stages like data loading, preprocessing, model training, and error handling:
#| eval: false
print("Loaded 1000 rows from dataset.csv")
print("Started training RandomForest model")
print("Missing values detected in 'age' column")
print("Model training failed: insufficient memory")
Output:
Loaded 1000 rows from dataset.csv
Started training RandomForest model
Missing values detected in 'age' column
Model training failed: insufficient memory
This works fine locally, but in a production environment:
- There’s no record of when these events occurred
- There’s no way to save that record to a file for later inspection
- There’s no indication of the severity of each message, making it hard to distinguish between general informational messages and serious runtime errors
Unlike print, the logging module supports log levels, output formatting, and destination control (file, stdout, etc.). Here’s a quick comparison:
#| eval: false
# logging_example.py
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s | %(levelname)s | %(module)s:%(funcName)s:%(lineno)d - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)

def main():
    logging.debug("Loaded 1000 rows from dataset.csv")
    logging.info("Started training RandomForest model")
    logging.warning("Missing values detected in 'age' column")
    logging.error("Model training failed: insufficient memory")

if __name__ == "__main__":
    main()
Output:
2025-05-03 14:14:32 | DEBUG | logging_example:main:11 - Loaded 1000 rows from dataset.csv
2025-05-03 14:14:32 | INFO | logging_example:main:12 - Started training RandomForest model
2025-05-03 14:14:32 | WARNING | logging_example:main:13 - Missing values detected in 'age' column
2025-05-03 14:14:32 | ERROR | logging_example:main:14 - Model training failed: insufficient memory
You can hide debug logs and focus only on more critical messages by changing the log level to INFO:
#| eval: false
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)s | %(module)s:%(funcName)s:%(lineno)d - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
Output:
2025-05-03 14:35:29 | INFO | logging_example:main:12 - Started training RandomForest model
2025-05-03 14:35:29 | WARNING | logging_example:main:13 - Missing values detected in 'age' column
2025-05-03 14:35:29 | ERROR | logging_example:main:14 - Model training failed: insufficient memory
Why Many Data Scientists Still Use Print
print() is fast, familiar, and doesn’t require setup. When exploring data or debugging inside a Jupyter notebook, it feels like the most convenient option.
#| eval: false
print("Training complete")
Thus, many data scientists prefer using print statements, even though the built-in logging module offers greater structure, flexibility, and long-term maintainability.
Meet Loguru: The Best of Both Worlds
Loguru makes logging effortless without sacrificing power. There is no boilerplate, no custom handlers. Just drop it in and go.
#| eval: false
from loguru import logger

def main():
    logger.debug("Loaded 1000 rows from dataset.csv")
    logger.info("Started training RandomForest model")
    logger.warning("Missing values detected in 'age' column, using median imputation")
    logger.error("Model training failed: insufficient memory")

if __name__ == "__main__":
    main()
Default output is colored, timestamped, and detailed.
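For the script above, the default output looks roughly like this (colors omitted; exact timestamps and line numbers will vary):
2025-05-03 14:14:32.123 | DEBUG    | __main__:main:4 - Loaded 1000 rows from dataset.csv
2025-05-03 14:14:32.124 | INFO     | __main__:main:5 - Started training RandomForest model
2025-05-03 14:14:32.124 | WARNING  | __main__:main:6 - Missing values detected in 'age' column, using median imputation
2025-05-03 14:14:32.125 | ERROR    | __main__:main:7 - Model training failed: insufficient memory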
Format Logs Easily
Formatting logs allows you to add useful information to logs such as timestamps, log levels, module names, function names, and line numbers. Here’s how to do it with both logging and Loguru:
Traditional Way
The traditional logging approach uses %-style formatting, which is hard to read and maintain:
#| eval: false
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)s | %(module)s:%(funcName)s:%(lineno)d - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
Loguru Way
In contrast, Loguru uses the {} formatting, which is much more readable and easy to use:
#| eval: false
import sys

from loguru import logger

# Remove the default handler
logger.remove()

# Add a stream handler
logger.add(
    sys.stdout,
    format="{time:YYYY-MM-DD HH:mm:ss} | {level} | {module}:{function}:{line} - {message}",
    level="INFO",
)
In the code above:
- logger.remove() clears the default Loguru handler so that only your custom configuration is active.
- logger.add(sys.stdout, ...) explicitly adds a stream handler that logs to the terminal using your specified format and log level.
Other common options for time formatting:
| Category | Token | Output Example |
|---|---|---|
| Year | YYYY | 2025 |
| Month | MM | 01 … 12 |
| Day | DD | 01 … 31 |
| Day of Week | ddd | Mon, Tue, Wed |
| Hour (24h) | HH | 00 … 23 |
| Hour (12h) | hh | 01 … 12 |
| Minute | mm | 00 … 59 |
| Second | ss | 00 … 59 |
| Microsecond | SSSSSS | 000000 … 999999 |
| AM/PM | A | AM, PM |
| Timezone | Z | +00:00, -07:00 |
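For instance, here is a small sketch combining a few of these tokens (the format string and message below are just illustrations):
#| eval: false
import sys

from loguru import logger

logger.remove()

# Illustrative format built from the tokens in the table above
logger.add(sys.stdout, format="{time:ddd YYYY-MM-DD hh:mm:ss A} | {level} | {message}")
logger.info("Training started")
# Prints something like: Sat 2025-05-03 02:14:32 PM | INFO | Training started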
Save Logs to File
Saving logs to a file can help preserve important information over time and aid debugging. Here’s how to do it with both logging and Loguru:
Traditional Way
Saving logs to both a file and the terminal using the logging module requires setting up separate handlers:
- FileHandler: writes log messages to a specified file so that they can be reviewed later
- StreamHandler: sends log messages to the console (stdout), allowing you to see logs in real time during execution
#| eval: false
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s | %(levelname)s | %(module)s:%(funcName)s:%(lineno)d - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
    handlers=[
        logging.FileHandler(filename="info.log"),
        logging.StreamHandler(),
    ],
)
Loguru Way
Logging to a file using Loguru is simple: call the add() method with the file path, format, and log level. Loguru logs to the terminal by default, so adding a file sink saves logs to both the file and the terminal.
#| eval: false
from loguru import logger

logger.add(
    "info.log",
    format="{time:YYYY-MM-DD HH:mm:ss} | {level} | {module}:{function}:{line} - {message}",
    level="INFO",
)
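After this call, a single log statement reaches both sinks:
#| eval: false
logger.info("Started training RandomForest model")  # printed to the terminal and appended to info.log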
Rotate and Retain Logs
Without log rotation, long-running processes like ETL jobs or model training can generate massive log files that waste disk space and are hard to manage. Automatic rotation keeps logs compact and readable.
Here’s how to do it with both logging and Loguru:
Traditional Way
To automatically rotate the log file using the logging module, you need to use TimedRotatingFileHandler, which has the following key parameters:
- filename: the file where logs are written
- when: the time interval that triggers a new log file (e.g., 'S' for seconds, 'M' for minutes, 'H' for hours, 'D' for days, 'W0'–'W6' for weekdays, 'midnight' for daily at midnight)
- interval: how often rotation should happen, based on the unit provided in when
- backupCount: how many rotated log files to keep before old ones are deleted
This setup gives you finer control, but requires more manual configuration than Loguru.
#| eval: false
import logging
from logging.handlers import TimedRotatingFileHandler

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

# Rotate every Monday ("W0"), keeping the four most recent files
handler = TimedRotatingFileHandler("debug.log", when="W0", interval=1, backupCount=4)
handler.setLevel(logging.INFO)
handler.setFormatter(
    logging.Formatter(
        "%(asctime)s | %(levelname)s | %(module)s:%(funcName)s:%(lineno)d - %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    )
)
logger.addHandler(handler)
Loguru Way
With Loguru, you can rotate and retain logs in a single line using the rotation and retention parameters of add():
- rotation: when to create a new log file (e.g., by size or time)
- retention: how long to keep old log files
No extra classes or handlers are required:
#| eval: false
from loguru import logger
logger.add("debug.log", level="INFO", rotation="1 week", retention="4 weeks")
You can also customize log rotation and retention rules in Loguru using different triggers and strategies:
#| eval: false
logger.add("file_1.log", rotation="500 MB") # Automatically rotate if the file exceeds 500 MB
logger.add("file_2.log", rotation="12:00") # Create a new log file daily at noon
logger.add("file_3.log", rotation="1 week") # Rotate weekly
logger.add("file_X.log", retention="10 days") # Keep logs for 10 days, then delete old ones
logger.add("file_Y.log", compression="zip") # Compress rotated logs to save space
Better Exception Logging
When exceptions occur, logging can help you understand not only what went wrong, but also where and why. Here’s how traditional logging compares with Loguru when it comes to capturing exception details:
Traditional Way
To catch and log exceptions using the built-in logging module, you typically wrap your code in a try/except block and call logging.exception() to capture the traceback:
#| eval: false
import logging

def division(a, b):
    return a / b

def nested(c):
    try:
        division(1, c)
    except ZeroDivisionError:
        logging.exception("ZeroDivisionError")

nested(0)
Output:
2025-05-03 15:23:09 | ERROR | logging_example:nested:18 - ZeroDivisionError
Traceback (most recent call last):
  File ".../logging_example.py", line 16, in nested
    division(1, c)
  File ".../logging_example.py", line 11, in division
    return a / b
           ~~^~~
ZeroDivisionError: division by zero
The stack trace is printed, but you don’t see the values of a and b, so you’re left guessing what inputs caused the failure.
Loguru Way
Loguru improves debugging by capturing the full stack trace and the state of local variables at each level.
#| eval: false
from loguru import logger

def division(a, b):
    return a / b

def nested(c):
    try:
        division(1, c)
    except ZeroDivisionError:
        logger.exception("ZeroDivisionError")

if __name__ == "__main__":
    nested(0)
Output:
> File ".../catch_decorator.py", line 14, in <module>
    nested(0)
    └ <function nested at 0x106492520>

  File ".../catch_decorator.py", line 10, in nested
    division(1, c)
    │           └ 0
    └ <function division at 0x105051800>

  File ".../catch_decorator.py", line 5, in division
    return a / b
           │   └ 0
           └ 1

ZeroDivisionError: division by zero
In the traceback above, Loguru shows that a is 1 and b is 0, making it immediately clear what inputs caused the failure.
You can also capture and display full tracebacks in any function simply by adding the @logger.catch decorator:
#| eval: false
from loguru import logger

def divide(a, b):
    return a / b

@logger.catch
def main():
    divide(1, 0)

main()
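By default, @logger.catch logs the exception and lets the program continue. If you still want the failure to propagate, the decorator accepts a reraise flag:
#| eval: false
@logger.catch(reraise=True)  # log the full traceback, then re-raise the exception
def main():
    divide(1, 0)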
Summary: Why Loguru Wins
| Feature | print | logging | loguru |
|---|---|---|---|
| Log levels | ✗ | ✓ | ✓ |
| File output | ✗ | ✓ | ✓ |
| Log rotation | ✗ | Manual | One-liner with rotation |
| Filtering | ✗ | Custom Filter class | Simple function |
| Stack trace + variables | ✗ | Basic traceback | Rich context |
| Pretty logging | ✗ | Requires colorlog | Built-in |
| Customize format | ✗ | % formatting | {} formatting |
| Setup time | None | High | Minimal |
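To illustrate the "Filtering" row: in Loguru, a filter is just a function (or lambda) over the log record. A minimal sketch, with an illustrative predicate:
#| eval: false
import sys

from loguru import logger

logger.remove()

# Keep only records whose message mentions "model"
logger.add(sys.stdout, filter=lambda record: "model" in record["message"].lower())

logger.info("Started training RandomForest model")  # shown
logger.info("Loaded 1000 rows from dataset.csv")  # filtered out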
Should I Always Use Loguru Instead of Print?
print() is perfectly fine for quick checks or exploratory work inside a Jupyter notebook. It’s simple, fast, and requires no setup.
However, when your code starts to include multiple stages, like data loading, preprocessing, modeling, and evaluation, or needs to run reliably in production, it’s worth moving to a logging tool like Loguru.
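As a rough sketch of what that migration can look like (the stage names and file path below are placeholders):
#| eval: false
from loguru import logger

# One rotating file sink for the whole pipeline
logger.add("pipeline.log", level="INFO", rotation="1 week")

def run_pipeline():
    logger.info("Loading data")
    logger.info("Preprocessing features")
    logger.info("Training model")
    logger.info("Evaluating model")

if __name__ == "__main__":
    run_pipeline()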