Python Data Modeling with Dataclasses and Pydantic


Getting Started
  The Silent Bug
  What Are Typed Data Containers?
Using Dictionaries
  Creating a Dictionary
  Silent Failures
  Type Confusion
Using NamedTuple
  Creating a NamedTuple
  Catching Typos at Runtime
  Exercise: Fix a Buggy Pipeline
  Immutability Prevents Accidental Changes
  Default Values
  Limitations: No Runtime Type Validation
  Exercise: Fix a Type Bug
Using dataclass
  Creating a dataclass
  Exercise: Build a Product Record
  Mutability Allows Updates
  Mutable Defaults with default_factory
  Exercise: Build a Shopping Cart
  Post-Init Validation with __post_init__
  Limitations: Manual Validation Only
  Limitations: Nested Validation
Using Pydantic
  Getting Started
  Creating a Pydantic Model
  Runtime Validation
  Exercise: Validate Signup Data
  Type Coercion
  Constraint Validation
  Exercise: Validate a Job Posting
  Nested Validation
Summary
  Key Takeaways

The Silent Bug
Imagine you’re processing customer records. The pipeline runs without errors, but customers never receive their welcome emails. After digging through the code, you discover the issue is a simple typo in a dictionary key.


[Flowchart: Write data 'emial': … → Store: dict saves anything → Read data.get('email') → Result: None, no error!]

The example below shows it in action:

Python

def load_customer(row):
    return {"customer_id": row[0], "name": row[1], "emial": row[2]}  # Typo


def send_welcome_email(customer):
    email = customer.get("email")  # Returns None silently
    if email:
        print(f"Sending email to {email}")
    else:
        print("No email found. Nothing sent!")


customer = load_customer(["C001", "Alice", "alice@example.com"])
send_welcome_email(customer)  # Nothing happens

💡 Tip
The output looks like the customer has no email on file, but we passed "alice@example.com". The data is there, just stored under "emial".
.get("email") finds no match and returns None instead of raising an error.

This happens because dictionaries don’t know what keys they should have. Without a schema, Python treats "emial" and "email" as equally valid. The same goes for missing fields, extra fields, and wrong types.
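As a small illustration of that last sentence, the sketch below builds three records that a schema would reject. The record contents and variable names are made up for this example; the point is only that a plain dict accepts all three without complaint.

```python
# Three records a schema would reject; a plain dict accepts every one of them.
missing_field = {"customer_id": "C001", "name": "Alice"}  # no "email" key at all
extra_field = {
    "customer_id": "C002",
    "name": "Bob",
    "email": "bob@example.com",
    "nickname": "Bobby",  # field nobody asked for
}
wrong_type = {"customer_id": 3, "name": "Cara", "email": None}  # id is an int, email is None

for record in (missing_field, extra_field, wrong_type):
    # Building a dict never checks key names or value types.
    print(sorted(record))
```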


What Are Typed Data Containers?
Python offers several ways to avoid this bug, each adding more safety than the last:

| Tool | Safety | Flexibility | Dependencies | Mutability |
|---|---|---|---|---|
| dict | None | Any key, any value | Built-in | Mutable |
| NamedTuple | Basic | Fixed fields | Built-in | Immutable |
| dataclass | Moderate | Fixed fields, defaults | Built-in | Mutable |
| Pydantic | Full | Fixed fields, validators | pip install | Mutable |

Notice the pattern. Each row gains something the previous one lacks:

dict → NamedTuple: Gain fixed fields, lose flexibility.
NamedTuple → dataclass: Gain mutability and mutable defaults via default_factory.
dataclass → Pydantic: Gain runtime type validation, add a dependency.

In this course, you’ll try each tool yourself and see how it catches the mistakes that dictionaries miss.
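As a rough preview of that progression, the sketch below feeds the same misspelled field name to a dict, a NamedTuple, and a dataclass. The class names CustomerTuple and CustomerData are hypothetical, chosen only for this example. Pydantic is left out so the snippet runs with the standard library alone.

```python
from dataclasses import dataclass
from typing import NamedTuple


class CustomerTuple(NamedTuple):
    name: str
    email: str


@dataclass
class CustomerData:
    name: str
    email: str


# A dict stores the misspelled key without complaint.
as_dict = {"name": "Alice", "emial": "alice@example.com"}
print("dict accepted keys:", list(as_dict))

# NamedTuple and dataclass both reject the unknown field when the object is built.
for container in (CustomerTuple, CustomerData):
    try:
        container(name="Alice", emial="alice@example.com")
    except TypeError as exc:
        print(f"{container.__name__} rejected it: {exc}")
```

Both fixed-field containers raise TypeError at construction, so the typo surfaces on the line that introduced it.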


Creating a Dictionary
A dictionary maps keys (here, strings) to values. It's the most common way to represent a record in Python, but it has no fixed structure. You can add, remove, or misspell any key at any time.

Creating one takes a single pair of curly braces:

Python

customer = {
    "customer_id": "C001",
    "name": "Alice Smith",
    "email": "alice@example.com",
    "age": 28,
    "is_premium": True,
}

print(customer["name"])

💡 Tip
The output prints Alice Smith by looking up the "name" key in the dictionary.
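The lack of fixed structure also affects updates. The minimal sketch below uses a shortened, hypothetical version of the record: a misspelled key in an assignment adds a new entry instead of updating the existing one, and a key can be deleted with nothing to stop it.

```python
customer = {"name": "Alice Smith", "email": "alice@example.com"}

# Intended as an update, but the misspelled key creates a second entry instead.
customer["emial"] = "alice.smith@example.com"

# Keys can be removed freely; nothing enforces that "name" must exist.
del customer["name"]

print(customer)
```

After these two statements the original "email" value is unchanged and the record no longer has a name, yet no error was raised.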


Silent Failures
With bracket access, a typo in the key name raises a KeyError at runtime:

Python

customer = {
    "customer_id": "C001",
    "name": "Alice Smith",
    "email": "alice@example.com",
    "age": 28,
    "is_premium": True,
}

try:
    customer["emial"]  # Typo: should be "email"
except KeyError as e:
    print(f"KeyError: {e}")

The error tells you which key is missing, but not where the misspelled key was introduced. When a dictionary passes through several functions, tracing a typo back to its source can take significant debugging time:

Python

import traceback


def load_customer(row):
    return {"customer_id": row[0], "name": row[1], "emial": row[2]}  # Typo here


def validate_customer(customer):
    return customer  # Passes through unchanged


def send_email(customer):
    return customer["email"]  # KeyError raised here


try:
    customer = load_customer(["C001", "Alice", "alice@example.com"])
    validated = validate_customer(customer)
    send_email(validated)  # Error points here, but bug is in load_customer
except KeyError:
    traceback.print_exc()

💡 What the output shows
The error is raised in send_email(), but the actual bug (the typo "emial") was introduced in load_customer(). The bug and its symptom are in different functions.
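One manual way to bring the symptom closer to the bug is to check the keys where the record is built. The sketch below is an assumption-laden illustration, not part of the course code: REQUIRED_KEYS and the error message are made up for this example.

```python
REQUIRED_KEYS = {"customer_id", "name", "email"}  # hypothetical schema for this sketch


def load_customer(row):
    customer = {"customer_id": row[0], "name": row[1], "emial": row[2]}  # same typo
    missing = REQUIRED_KEYS - set(customer)
    if missing:
        # Fail where the bad record is built, not where it is later used.
        raise ValueError(f"load_customer produced a record missing {sorted(missing)}")
    return customer


try:
    load_customer(["C001", "Alice", "alice@example.com"])
except ValueError as exc:
    message = str(exc)
    print(message)
```

The error now names load_customer directly, but the check has to be written and maintained by hand for every record type.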

Using .get() makes it worse by returning None silently:

Python

def load_customer(row):
    return {"customer_id": row[0], "name": row[1], "emial": row[2]}


def send_email(customer):
    email = customer.get("email")  # Returns None silently
    if email:
        print(f"Sending email to {email}")


# Runs without any error or output
customer = load_customer(["C001", "Alice", "alice@example.com"])
send_email(customer)
print("Done. Customer had no email on file.")

Quiz

What does {"name": "Alice"}.get("email") return?

A
It raises a KeyError

B
It returns None

C
It returns an empty string ""

⚠ Try Again
That’s what bracket access (d["email"]) does. .get() is designed to avoid raising errors, which is why it can hide bugs.

💡 Correct
Correct! .get() returns None when the key is missing. This is convenient but dangerous: your code keeps running with None instead of failing fast.

⚠ Try Again
.get() doesn’t return an empty string by default. It returns None unless you provide a second argument like .get("email", "").
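The three ways of reading a possibly missing key can be compared side by side. This is a minimal sketch; the `record` dictionary and the variable names are illustrative and not part of the course code:

```python
record = {"name": "Alice"}  # illustrative data with no "email" key

# Bracket access fails fast with a KeyError.
try:
    record["email"]
    bracket_result = "no error"
except KeyError as exc:
    bracket_result = f"KeyError: {exc}"

# .get() without a default returns None and keeps running.
get_result = record.get("email")

# .get() with a second argument returns that fallback instead.
get_with_default = record.get("email", "")

print(bracket_result)
print(get_result)
print(repr(get_with_default))
```

Only the bracket form stops the program at the point where the key is missing.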

Type Confusion
Missing keys aren’t the only risk. Without a schema, dictionaries also accept the wrong type for any field.

Let’s see what happens when age is stored as a string instead of an integer:

customer = {
    "customer_id": "C001",
    "name": "Alice Smith",
    "age": "28",  # String instead of int
}

# No error, but the math is wrong
print(f"Age: {customer['age']}")
print(f"Age times 2: {customer['age'] * 2}")

💡 What the output shows
"28" * 2 produces "2828" instead of 56. Since "28" is a string, Python repeats it twice instead of doubling the number. The code runs fine, but the result is silently wrong.
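One way to surface this early with a plain dictionary is to check the type by hand before doing arithmetic. The sketch below assumes a hypothetical helper named `double_age`; it is not part of the course code:

```python
def double_age(customer: dict) -> int:
    """Return twice the customer's age, rejecting non-integer ages."""
    age = customer["age"]
    # bool is a subclass of int, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        raise TypeError(f"age must be int, got {type(age).__name__}: {age!r}")
    return age * 2


good = {"name": "Alice Smith", "age": 28}
bad = {"name": "Alice Smith", "age": "28"}

print(double_age(good))

try:
    double_age(bad)
except TypeError as exc:
    message = str(exc)
    print(f"TypeError: {message}")
```

Every function that reads the field would need a check like this, which is the kind of repetition a schema removes.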

Creating a NamedTuple
NamedTuple is a lightweight way to define a fixed structure with named fields and type hints, like a dictionary with a schema.

Instead of string keys, you declare a NamedTuple class with fixed fields. Every object created from it must provide values for those exact fields:

from typing import NamedTuple


class Customer(NamedTuple):
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool


customer = Customer(
    customer_id="C001",
    name="Alice Smith",
    email="alice@example.com",
    age=28,
    is_premium=True,
)

print(customer)

💡 What the output shows
Printing the object displays all five fields by name and value in the order they were defined.

Once created, you can access fields with dot notation instead of string keys like customer["name"]. This allows your IDE to autocomplete the field names and catch typos immediately:

print(customer.name)
print(customer.email)

Quiz

What happens if you create a Customer without providing the email field?

A
The email field is set to None by default

B
Python raises a TypeError because all fields are required

C
The object is created with an empty string for email

⚠ Try Again
Not quite. NamedTuple does not provide default values unless you explicitly define them. Every field must be provided at creation.

💡 Correct
Correct! NamedTuple requires values for all fields. Leaving one out raises a TypeError immediately, unlike a dict, where a missing key goes unnoticed until something tries to read it.

⚠ Try Again
Not quite. NamedTuple does not fill in missing fields with placeholder values. You must provide every field when creating the object.
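The quiz scenario can be tried directly. This sketch uses a shortened three-field `Customer` for brevity; the `outcome` variable is only there to record what happened:

```python
from typing import NamedTuple


class Customer(NamedTuple):
    customer_id: str
    name: str
    email: str


try:
    # email is omitted on purpose
    Customer(customer_id="C001", name="Alice")
    outcome = "created"
except TypeError as exc:
    outcome = "TypeError"
    print(f"TypeError: {exc}")

print(outcome)
```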

Catching Typos at Runtime
In the dictionary pipeline, load_customer returned {"emial": row[2]} and the typo traveled through validate_customer before crashing in send_email. With NamedTuple, the same typo fails at the source:

from typing import NamedTuple


class Customer(NamedTuple):
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool


def load_customer(row):
    try:
        return Customer(
            customer_id=row[0],
            name=row[1],
            emial=row[2],  # Same typo as before
            age=row[3],
            is_premium=row[4],
        )
    except TypeError as e:
        print(f"TypeError: {e}")
        return None


customer = load_customer(["C001", "Alice", "alice@example.com", 28, True])
print(f"Customer: {customer}")

💡 What the output shows
The error is raised inside load_customer, exactly where the typo was made, so you spend less time tracing through functions to find the bug.

Quiz

A NamedTuple Customer has fields customer_id, name, email, age, is_premium. You write Customer(customer_id="C001", nme="Alice", email="a@b.com", age=28, is_premium=True). When does the error appear?

A
When you try to access customer.name later in the code

B
Immediately when creating the object, before any other code runs

C
Only when you print the object

⚠ Try Again
Not quite. Unlike a dict where missing keys fail at access time, NamedTuple catches the typo at creation. The object is never created.

💡 Correct
Correct! NamedTuple raises a TypeError at creation because nme is not a valid field. The bug is caught at the source, not downstream.

⚠ Try Again
Not quite. The error happens before the object exists. NamedTuple validates field names during creation, not when you use the object.

Exercise: Fix a Buggy Pipeline

Scenario: The load_customer function from the dictionary section had a typo ("emial") that traveled silently through the pipeline. Your team wants to prevent this class of bug entirely.

Task: Rewrite this dict-based pipeline to use a Customer NamedTuple so the typo is caught at creation. Fix the typo so the pipeline works.

# Rewrite using NamedTuple and fix the bug
def load_customer(row):
    return {
        "customer_id": row[0],
        "name": row[1],
        "emial": row[2],
    }

def send_email(customer):
    print(f"Sending email to {customer['email']}")

customer = load_customer(["C001", "Alice", "alice@example.com"])
send_email(customer)

Immutability Prevents Accidental Changes
Dictionaries let you change any value at any time, which means fields can be overwritten by accident. NamedTuples are immutable, so once created, their values cannot be changed:

from typing import NamedTuple


class Customer(NamedTuple):
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool


customer = Customer(
    customer_id="C001",
    name="Alice Smith",
    email="alice@example.com",
    age=28,
    is_premium=True,
)

try:
    customer.name = "Bob"
except AttributeError as e:
    print(f"AttributeError: {e}")

💡 What the output shows
Assigning "Bob" to customer.name raises an AttributeError. Once a NamedTuple is created, its values are fixed.
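When a changed value is genuinely needed, a NamedTuple can produce a modified copy instead of being edited in place. The sketch below uses the `_replace` method that named tuples provide, on a shortened two-field `Customer`:

```python
from typing import NamedTuple


class Customer(NamedTuple):
    name: str
    is_premium: bool


original = Customer(name="Alice Smith", is_premium=False)

# _replace returns a new tuple; the original is left untouched.
upgraded = original._replace(is_premium=True)

print(original)
print(upgraded)
```

Any function still holding `original` keeps seeing the old values, which is the safety property this section describes.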

Quiz

Why is immutability useful when passing a Customer object through multiple functions?

A
It makes the code run faster because Python optimizes immutable objects

B
No function can accidentally change the data, so each function sees the original values

C
It prevents other developers from reading the data

⚠ Try Again
Not quite. While immutable objects can have some performance benefits, the main advantage is data safety across function calls.

💡 Correct
Correct! When data is immutable, you can pass it through any number of functions knowing that no function can alter the original values. This eliminates a whole class of bugs.

⚠ Try Again
Not quite. Immutability prevents modification, not reading. Any function can still read and use the data freely.

Default Values
NamedTuple fields can have default values. This works well for immutable types like bool and str:

from typing import NamedTuple


class Customer(NamedTuple):
    name: str
    is_premium: bool = False


c1 = Customer("Alice")
c2 = Customer("Bob", is_premium=True)
print(f"Alice premium? {c1.is_premium}")
print(f"Bob premium? {c2.is_premium}")

💡 What the output shows
Customer("Alice") uses the default False for is_premium, while Customer("Bob", is_premium=True) overrides it. You only need to pass values that differ from the defaults.

However, mutable defaults like lists are shared across all instances, which can cause unexpected behavior:

from typing import NamedTuple


class Customer(NamedTuple):
    name: str
    tags: list = []  # All customers share this list


c1 = Customer("Alice")
c2 = Customer("Bob")
c1.tags.append("premium")
print(f"Added 'premium' to Alice")
print(f"Alice: {c1.tags}")
print(f"Bob:   {c2.tags}")

💡 What the output shows
Both Alice and Bob show ["premium"]. This happens because Python creates the default [] once when it reads the class, then hands that same list to every instance. There’s only one list in memory, so c1.tags and c2.tags are the same object.

This diagram shows how the single default list is shared before and after the append:


[Diagram. Before append: c1.tags and c2.tags both point to the same list [ ]. After c1.tags.append: c1.tags and c2.tags both point to the same list ['premium'].]
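The sharing can be confirmed with an identity check, and one possible workaround within NamedTuple is an immutable tuple default. This is a sketch only; the class names `SharedTags` and `TupleTags` are hypothetical:

```python
from typing import NamedTuple


class SharedTags(NamedTuple):
    name: str
    tags: list = []  # one list object, reused by every instance


class TupleTags(NamedTuple):
    name: str
    tags: tuple = ()  # immutable default, safe to share


a, b = SharedTags("Alice"), SharedTags("Bob")
shared = a.tags is b.tags  # same object in memory

# With a tuple default, "adding" a tag means building a new instance.
c = TupleTags("Alice")
d = c._replace(tags=c.tags + ("premium",))
e = TupleTags("Bob")

print(shared)
print(d.tags, e.tags)
```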

Quiz

NamedTuple is immutable, yet c1.tags.append("premium") works without error. Why?

A
append bypasses immutability because it’s a built-in method

B
Immutability prevents reassigning the field (c1.tags = […]), but the list itself is still mutable

C
NamedTuple is only immutable for string and number fields

⚠ Try Again
Not quite. append has no special privileges. The key is that immutability applies to the field reference, not to the object the field points to.

💡 Correct
Correct! c1.tags = new_list would raise an AttributeError. But c1.tags.append(…) modifies the list object that the field points to, which is allowed because the list itself is mutable.

⚠ Try Again
Not quite. NamedTuple immutability applies equally to all fields. The difference is between reassigning a field and modifying the object it references.

Limitations: No Runtime Type Validation
Type hints in NamedTuple are not enforced at runtime. You can still pass in wrong types:

from typing import NamedTuple


class Customer(NamedTuple):
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool


# Wrong types are accepted without error
customer = Customer(
    customer_id="C001",
    name=123,  # Should be str, but int is accepted
    email="alice@example.com",
    age="twenty-eight",  # Should be int, but str is accepted
    is_premium=True,
)

print(f"Name: {customer.name}, Age: {customer.age}")
print(f"Name type: {type(customer.name).__name__}, Age type: {type(customer.age).__name__}")

💡 What the output shows
Python accepts name=123 and age="twenty-eight" without complaint. NamedTuple type hints are for documentation and static analysis only. They are not enforced at runtime.
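If a runtime check is wanted anyway, it has to be written by hand. The sketch below assumes a hypothetical helper, `type_mismatches`, that compares each value against its hint; it only handles plain classes such as str and int, not generics like list[str]:

```python
from typing import NamedTuple, get_type_hints


class Customer(NamedTuple):
    name: str
    age: int


def type_mismatches(record) -> list:
    """Return messages for fields whose value does not match its hint.

    Only handles plain classes such as str or int, not generics.
    """
    problems = []
    for field_name, expected in get_type_hints(type(record)).items():
        value = getattr(record, field_name)
        if not isinstance(value, expected):
            problems.append(
                f"{field_name}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


bad = Customer(name=123, age="twenty-eight")
for problem in type_mismatches(bad):
    print(problem)
```

The check only runs if someone remembers to call it, so it is a convention rather than a guarantee.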

Quiz

What is the purpose of type hints like age: int in a NamedTuple if they are not enforced?

A
They help IDEs and static analysis tools like mypy catch type errors before running the code

B
They convert values to the correct type automatically

C
They have no purpose and can be removed safely

💡 Correct
Correct! Type hints enable IDE autocomplete, inline warnings, and tools like mypy to catch type mismatches before you run the code. They are valuable documentation even without runtime enforcement.

⚠ Try Again
Not quite. NamedTuple does not convert types. If you pass age="28", it stays a string. Pydantic is the container that handles automatic type conversion.

⚠ Try Again
Not quite. Type hints are valuable for IDE support and static analysis, even though Python does not enforce them at runtime.

Exercise: Fix a Type Bug

Scenario: A sensor monitoring system adjusts temperature readings by a calibration factor of 2. A faulty sensor sends its reading as a string. The code runs without error, but one sensor’s adjusted value is wrong.

Task: Fix the readings list so that all adjusted temperatures are calculated correctly.

from typing import NamedTuple

class SensorReading(NamedTuple):
    sensor_id: str
    temperature: float

readings = [
    SensorReading("S1", 98.5),
    SensorReading("S2", "15"),
    SensorReading("S3", 22.1),
]

for r in readings:
    adjusted = r.temperature * 2
    print(f"{r.sensor_id}: {adjusted}")

Creating a dataclass
The @dataclass decorator automatically generates __init__, __repr__, and other methods from a class's field definitions. A dataclass provides the same fixed fields and IDE support as NamedTuple, plus:

Mutable fields: Change values after creation, unlike NamedTuple
Default values: Fields can have defaults, including empty lists and dicts
Post-init logic: Run custom code right after an object is created

Creating a dataclass looks similar to NamedTuple, but you use the @dataclass decorator instead of inheriting:

from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    name: str
    email: str
    age: int


customer = Customer(
    customer_id="C001",
    name="Alice Smith",
    email="alice@example.com",
    age=28,
)

print(customer)

💡 What the output shows
The output matches the NamedTuple format. Both give you named fields and readable printing. Where they differ is mutability and default handling, which the next sections cover.
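Among the "other methods" a dataclass generates by default is `__eq__`, which compares instances field by field. A minimal sketch with a shortened two-field `Customer`, also showing the standard-library `asdict` helper:

```python
from dataclasses import asdict, dataclass


@dataclass
class Customer:
    customer_id: str
    name: str


a = Customer(customer_id="C001", name="Alice Smith")
b = Customer(customer_id="C001", name="Alice Smith")

# The generated __eq__ compares field values, not object identity.
print(a == b)
print(a is b)

# asdict converts the instance back into a plain dictionary.
print(asdict(a))
```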

Quiz

What happens if you try to create Customer(customer_id="C001", nmae="Alice", email="a@b.com", age=28)?

A
The object is created with a nmae field instead of name

B
Python raises a TypeError because nmae is not a declared field

C
The nmae value is silently ignored

⚠ Try Again
Not quite. Unlike a dict that accepts any key, dataclass only accepts the fields you declared. Unknown field names are rejected.

💡 Correct
Correct! Like NamedTuple, dataclass only accepts its declared fields. Passing nmae instead of name raises a TypeError immediately.

⚠ Try Again
Not quite. Dataclass does not silently ignore unknown fields. It raises a TypeError because nmae is not in the field definitions.

Exercise: Build a Product Record

Scenario: An inventory system receives product data as separate variables from a database query. You need to structure each product as a dataclass for type-safe access throughout the codebase.

Task: Define a Product dataclass with fields sku (str), name (str), price (float), and in_stock (bool). Create a product and print its formatted summary.

from dataclasses import dataclass

# Define the Product dataclass below
# Fields: sku (str), name (str), price (float), in_stock (bool)


product = Product(
    sku="WH-7821",
    name="USB-C Cable",
    price=12.99,
    in_stock=True,
)

status = "Available" if product.in_stock else "Out of stock"
print(f"{product.name} ({product.sku}): ${product.price} - {status}")

Mutability Allows Updates
Dataclass trades NamedTuple’s immutability protection for flexibility. You can modify fields after creation:

from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool = False


customer = Customer(
    customer_id="C001",
    name="Alice Smith",
    email="alice@example.com",
    age=28,
)

customer.name = "Alice Johnson"  # Changed after marriage
customer.is_premium = True  # Upgraded their account

print(f"{customer.name}, Premium: {customer.is_premium}")

💡 What the output shows
Unlike NamedTuple, dataclass allows field modification. This is useful for objects that need to change over time, like a customer upgrading their account.
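Mutability is the default rather than a requirement. When a particular dataclass should stay fixed after creation, the standard `frozen=True` option restores that protection. A brief sketch, using a hypothetical `FrozenCustomer` class:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class FrozenCustomer:
    name: str
    is_premium: bool = False


customer = FrozenCustomer(name="Alice Smith")

try:
    customer.name = "Bob"
    outcome = "changed"
except FrozenInstanceError:
    outcome = "FrozenInstanceError"

print(outcome)
print(customer.name)
```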

To prevent accidentally adding new attributes, you can use @dataclass(slots=True), available in Python 3.10 and later. It fixes the set of attribute names on each instance: existing fields can still be reassigned, but new ones cannot be added:

from dataclasses import dataclass


@dataclass(slots=True)
class Customer:
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool = False


customer = Customer(
    customer_id="C001",
    name="Alice",
    email="alice@example.com",
    age=28,
)

try:
    customer.nmae = "Bob"  # Typo
except AttributeError as e:
    print(f"AttributeError: {e}")

💡 What the output shows
Without slots=True, the dataclass would silently create a new attribute nmae on the object. With slots, it raises an error immediately, catching the typo.
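For contrast, here is what a regular dataclass without slots does with the same typo. This sketch uses a one-field `Customer` to keep it short:

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str


customer = Customer(name="Alice")
customer.nmae = "Bob"  # typo: creates a brand-new attribute

print(customer.name)
print(vars(customer))
```

The intended field keeps its old value, and the misspelled attribute sits next to it unnoticed.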

Quiz

What does @dataclass(slots=True) prevent?

A
Modifying existing fields like customer.name = "Bob"

B
Adding new attributes that were not declared in the class, like customer.nmae = "Bob"

C
Creating instances with missing fields

⚠ Try Again
Not quite. slots=True still allows modifying existing fields. It only prevents adding new attributes that were not part of the class definition.

💡 Correct
Correct! slots=True restricts the object to only the declared fields. Typos like customer.nmae raise an AttributeError instead of silently creating a new attribute.

⚠ Try Again
Not quite. Missing fields are already caught by the generated __init__ method, with or without slots=True. Slots specifically prevent adding undeclared attributes.

Mutable Defaults with default_factory
Remember the shared list problem from NamedTuple? Dataclass prevents this by rejecting mutable defaults entirely:

from dataclasses import dataclass

try:
    @dataclass
    class Order:
        items: list = []
except ValueError as e:
    print(f"ValueError: {e}")

💡 What the output shows
Dataclass raises a ValueError instead of silently sharing the list. It forces you to use field(default_factory=…), which creates a new list for each instance.

With field(default_factory=…), the factory function runs each time an instance is created, so each object gets its own list:

from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: str
    items: list = field(default_factory=list)  # Each instance gets its own list


order1 = Order("001")
order2 = Order("002")

order1.items.append("apple")
print(f"Order 1: {order1.items}")
print(f"Order 2: {order2.items}")  # Not affected by order1

💡 What the output shows
Unlike the NamedTuple example, Order 2 stays empty because default_factory creates a fresh list for each instance. This is the safe way to use mutable defaults.

To see why this works, compare what happens at creation versus after appending:


[Diagram. At creation: order1.items and order2.items each point to their own list [ ]. After order1.items.append: order1.items points to ['apple'], while order2.items still points to its own empty list [ ].]

Quiz

Which of these dataclass fields requires field(default_factory=…)?

A
name: str = "unknown"

B
tags: list = field(default_factory=list)

C
is_active: bool = True

⚠ Try Again
Not quite. Strings are immutable in Python, so "unknown" is safe as a direct default. Each instance gets the same string object, but since it cannot be modified, sharing it causes no problems.

💡 Correct
Correct! Lists are mutable, so a direct default like tags: list = [] would be shared across instances. default_factory=list creates a fresh list for each instance.

⚠ Try Again
Not quite. Booleans are immutable, so True is safe as a direct default. Only mutable types like list, dict, and set need default_factory.
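The same pattern covers dicts, and a lambda can supply a default that does not start out empty. A minimal sketch; the `quantities` and `tags` fields are hypothetical additions to the `Order` example:

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: str
    items: list = field(default_factory=list)
    quantities: dict = field(default_factory=dict)
    # A lambda works when the default should not start out empty.
    tags: list = field(default_factory=lambda: ["new"])


first = Order("001")
second = Order("002")
first.tags.append("priority")
first.quantities["apple"] = 3

print(first.tags, first.quantities)
print(second.tags, second.quantities)
```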

Exercise: Build a Shopping Cart

Scenario: An e-commerce system creates a shopping cart for each customer. Each cart needs its own independent list of items so that adding to one cart doesn’t affect another.

Task: Define a Cart dataclass where items defaults to an empty list using default_factory. Add items to one cart and verify the other stays empty.

from dataclasses import dataclass, field

# Define Cart dataclass
# Fields: customer (str), items (list, default empty)


cart1 = Cart(customer="Alice")
cart2 = Cart(customer="Bob")

cart1.items.append("Laptop")
cart1.items.append("Mouse")

print(f"Alice: {cart1.items}")
print(f"Bob: {cart2.items}")

Post-Init Validation with __post_init__
Dataclass does not check the values it receives, so invalid data like empty names or negative ages passes through without warning:

from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool = False


customer = Customer(
    customer_id="C001",
    name="",  # Empty name
    email="invalid",
    age=-100,
)
print(f"Created: {customer}")  # No error - bad data is in your system

💡 What the output shows
An empty name, invalid email, and negative age all pass through without any error. The bad data is now in your system, potentially corrupting downstream operations.

To catch these issues early, dataclass provides a special method called __post_init__ that runs automatically after __init__ finishes. You can add validation logic here to reject bad values at creation time:

from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool = False

    def __post_init__(self):
        if self.age < 0:
            raise ValueError(f"Age cannot be negative: {self.age}")
        if "@" not in self.email:
            raise ValueError(f"Invalid email: {self.email}")


try:
    customer = Customer(
        customer_id="C001",
        name="Alice",
        email="alice-at-email",
        age=28,
    )
except ValueError as e:
    print(f"ValueError: {e}")

💡 What the output shows
The error fires at object creation, not later when you try to send an email. This means invalid data never enters your system in the first place.

Quiz

A dataclass has __post_init__ that validates email and age. You pass a valid email but age=-5. What happens?

A
The object is created with age=-5 and validation runs later when you use the field

B
The object is never created. __post_init__ raises a ValueError during construction.

C
The object is created but age is automatically set to 0

⚠ Try Again
Not quite. __post_init__ runs during construction, not when you access fields. The validation happens immediately.

💡 Correct
Correct! __post_init__ runs right after __init__. Since age=-5 fails the check, a ValueError is raised and the object is never returned to the caller.

⚠ Try Again
Not quite. __post_init__ does not auto-correct values. It either lets the object through or raises an error. Any correction logic must be written explicitly.
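Correction logic, when wanted, goes in the same method. The sketch below normalizes values before validating them; the specific rules (trimming whitespace, lowercasing the email) are illustrative assumptions rather than course requirements:

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    email: str

    def __post_init__(self):
        # Normalize first, then validate the cleaned values.
        self.name = self.name.strip()
        self.email = self.email.strip().lower()
        if not self.name:
            raise ValueError("Name cannot be empty")
        if "@" not in self.email:
            raise ValueError(f"Invalid email: {self.email}")


customer = Customer(name="  Alice Smith ", email=" Alice@Example.COM ")
print(customer)

try:
    Customer(name="   ", email="alice@example.com")
except ValueError as exc:
    error = str(exc)
    print(f"ValueError: {error}")
```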

Limitations: Manual Validation Only
__post_init__ requires you to write every validation rule yourself. If you forget to check a field, bad data can still slip through:

from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool = False

    def __post_init__(self):
        if "@" not in self.email:
            raise ValueError(f"Invalid email: {self.email}")


# Wrong types pass because __post_init__ only checks email format
customer = Customer(
    customer_id="C001",
    name=123,  # No validation for name type
    email="alice@example.com",
    age="twenty-eight",  # No validation for age type
)

print(f"Name: {customer.name}, Age: {customer.age}")

💡 What the output shows
The name is an integer and the age is a string, yet dataclass accepted both. Type hints do not enforce types at runtime, so any validation you need must be written manually in __post_init__.
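A generic type check can be hand-written with `dataclasses.fields`, which gives a sense of how much manual work is involved. This sketch assumes every annotation is a plain class such as str or int; it would not handle generics or string annotations:

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:
    name: str
    age: int

    def __post_init__(self):
        # Assumes every annotation is a plain class (str, int, ...).
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name}: expected {f.type.__name__}, "
                    f"got {type(value).__name__}"
                )


try:
    Customer(name=123, age="twenty-eight")
except TypeError as exc:
    error = str(exc)
    print(f"TypeError: {error}")
```

Note that this version stops at the first bad field, so the wrong age is not reported until the name is fixed.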

Limitations: Nested Validation
Most real data is nested: customers have addresses, orders have items. With dataclass, error messages don’t tell you where in the structure the problem occurred:

from dataclasses import dataclass
import re


@dataclass
class Address:
    street: str
    city: str
    zip_code: str

    def __post_init__(self):
        if not re.match(r"^\d{5}$", self.zip_code):
            raise ValueError(f"Invalid zip: {self.zip_code}")


@dataclass
class Customer:
    customer_id: str
    name: str
    address: Address


try:
    customer = Customer(
        customer_id="C001",
        name="Alice Smith",
        address=Address(street="123 Main St", city="New York", zip_code="9ABC1"),
    )
except ValueError as e:
    print(e)

💡 What the output shows
The error says “Invalid zip: 9ABC1” but doesn’t tell you it came from address.zip_code. In a deeply nested structure with multiple zip codes, you wouldn’t know which one failed.

Quiz

You pass address={"street": "123 Main St", "city": "NY", "zip_code": "10001"} to a dataclass Customer that expects address: Address. What happens?

A
Dataclass converts the dict to an Address object automatically

B
The dict is stored as-is without conversion or validation

C
Python raises a TypeError because a dict is not an Address

⚠ Try Again
Not quite. Dataclass does not convert types. You would need to write manual conversion in __post_init__ or pass Address(…) directly.

💡 Correct
Correct! Dataclass stores whatever you pass without checking types. The dict is accepted even though the type hint says Address. This means customer.address.city would raise an AttributeError later.

⚠ Try Again
Not quite. Dataclass does not enforce type hints at runtime. It accepts the dict silently, which can cause errors later when you try to access attributes.
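The manual conversion mentioned in the quiz feedback might look like the sketch below. Converting a dict inside `__post_init__` and prefixing the error with the field name are illustrative choices, not something dataclass provides:

```python
import re
from dataclasses import dataclass


@dataclass
class Address:
    street: str
    city: str
    zip_code: str

    def __post_init__(self):
        if not re.match(r"^\d{5}$", self.zip_code):
            raise ValueError(f"Invalid zip: {self.zip_code}")


@dataclass
class Customer:
    customer_id: str
    name: str
    address: Address

    def __post_init__(self):
        # Manual conversion: dataclass will not do this for us.
        if isinstance(self.address, dict):
            try:
                self.address = Address(**self.address)
            except ValueError as exc:
                # Prefix the field path so the caller knows where it failed.
                raise ValueError(f"address: {exc}") from exc


customer = Customer(
    customer_id="C001",
    name="Alice Smith",
    address={"street": "123 Main St", "city": "NY", "zip_code": "10001"},
)
print(customer.address.city)

try:
    Customer("C002", "Bob", {"street": "1 Elm St", "city": "NY", "zip_code": "9ABC1"})
except ValueError as exc:
    error = str(exc)
    print(error)
```

Each additional level of nesting needs its own conversion and its own error prefix.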

Getting Started
So far, every container has treated type hints as documentation only. Pydantic is a third-party validation library that changes this. It checks types at runtime and raises clear errors when values don’t match.

To install Pydantic, run:

pip install pydantic

This course uses Pydantic 2.12.

Let’s verify the installation:

import pydantic

print(f"Pydantic version: {pydantic.__version__}")
print("Installation successful!")

Creating a Pydantic Model
A Pydantic model looks similar to a dataclass or NamedTuple. To create one, inherit from BaseModel and declare your fields:

from pydantic import BaseModel


class Customer(BaseModel):
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool = False


customer = Customer(
    customer_id="C001",
    name="Alice Smith",
    email="alice@example.com",
    age=28,
)

print(f"{customer.name}, Age: {customer.age}")

💡 What the output shows
The syntax looks similar to dataclass, but Pydantic validates types automatically when you create the object. You’ll see the difference in the next section.

Runtime Validation
Remember how dataclass accepted name=123 without complaint? Pydantic catches this automatically:

from pydantic import BaseModel, ValidationError


class Customer(BaseModel):
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool = False


try:
    customer = Customer(
        customer_id="C001",
        name=123,
        email="alice@example.com",
        age="thirty",
    )
except ValidationError as e:
    print(e)

💡 What the output shows
Pydantic reports all validation failures at once: name should be a string (got int 123) and age should be a valid integer (got string 'thirty'). This saves you from fixing one error, rerunning, and discovering another.

Quiz

How does Pydantic know that name=123 is invalid without any custom validation code?

A
Pydantic reads the type hint name: str and enforces it at runtime automatically

B
Pydantic uses the same __post_init__ mechanism as dataclass

C
Pydantic relies on Python’s built-in type checking

💡 Correct
Correct! Pydantic reads type hints and enforces them at runtime. With dataclass, name: str is just documentation. With Pydantic, it’s a rule that gets checked every time you create an object.

⚠ Try Again
Not quite. Pydantic uses its own validation engine, not __post_init__. Type enforcement is built into the BaseModel class itself.

⚠ Try Again
Not quite. Python does not enforce type hints at runtime. Pydantic adds this enforcement through its own validation layer on top of Python’s type system.

Exercise: Validate Signup Data

Scenario: A registration endpoint receives user signup data. Some entries have the wrong types: age as a non-numeric string and name as an integer. You need the model to catch all type errors at once.

Task: Define a UserSignup model that validates username (str), email (str), and age (int). Create a user with invalid data and print the validation errors.

Starter code:

from pydantic import BaseModel, ValidationError

# Define the UserSignup model below
# Fields: username (str), email (str), age (int)


try:
    user = UserSignup(
        username=42,
        email="alice@example.com",
        age="not-a-number",
    )
    print(f"Created: {user.username}, {user.email}, age {user.age}")
except ValidationError as e:
    for error in e.errors():
        print(f"{error['loc'][0]}: {error['type']}")
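
One possible solution to the signup exercise is sketched below. It assumes Pydantic v2 and is not the course's official answer. Plain type hints are enough here because the task only asks for type checks.

```python
from pydantic import BaseModel, ValidationError


class UserSignup(BaseModel):
    username: str
    email: str
    age: int


errors = []
try:
    UserSignup(username=42, email="alice@example.com", age="not-a-number")
except ValidationError as e:
    errors = e.errors()

# Print the failing field and the kind of error for each problem
for error in errors:
    print(f"{error['loc'][0]}: {error['type']}")
```

The email value is valid, so only username and age are reported.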


Type Coercion
Unlike dataclass, which stores whatever you pass, Pydantic automatically converts compatible values to the declared type:

Python

from pydantic import BaseModel


class Customer(BaseModel):
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool = False


customer = Customer(
    customer_id="C001",
    name="Alice Smith",
    email="alice@example.com",
    age="28",  # String "28" is converted to int 28
    is_premium="true",  # String "true" is converted to bool True
)

print(f"Age: {customer.age} (type: {type(customer.age).__name__})")
print(f"Premium: {customer.is_premium} (type: {type(customer.is_premium).__name__})")

💡 What the output shows
The string "28" was converted to integer 28, and "true" was converted to boolean True. This is useful when reading data from CSV files or APIs where everything comes as strings.
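
To make the CSV case concrete, here is a minimal sketch. It assumes Pydantic v2 and uses made-up CSV content held in a string, with a trimmed-down Customer model. csv.DictReader yields every value as a string, and the model converts each one to its declared type.

```python
import csv
import io

from pydantic import BaseModel


class Customer(BaseModel):  # trimmed-down model for illustration
    customer_id: str
    name: str
    age: int
    is_premium: bool = False


# Hypothetical CSV content; a real file would be opened with open(...)
raw = (
    "customer_id,name,age,is_premium\n"
    "C001,Alice Smith,28,true\n"
    "C002,Bob Lee,35,false\n"
)

# Every value in each row is a string until the model coerces it
customers = [Customer(**row) for row in csv.DictReader(io.StringIO(raw))]

for c in customers:
    print(c.customer_id, c.age, c.is_premium)
```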

Quiz

You pass age="twenty-eight" to a Pydantic model with age: int. What happens?

A. Pydantic converts it to 28 using natural language parsing
B. Pydantic raises a ValidationError because "twenty-eight" cannot be parsed as an integer
C. Pydantic stores it as a string since conversion failed

Feedback

A: ⚠ Not quite. Pydantic only converts values that Python can parse directly, like "28" to 28. It does not interpret natural language.
B: 💡 Correct! Pydantic tries to convert "twenty-eight" to an integer, fails, and raises a ValidationError. Coercion only works for values that parse directly into the target type, like "28" for an int field or "3.14" for a float field.
C: ⚠ Not quite. Pydantic does not silently fall back to storing the original value. If conversion fails, it raises a ValidationError.
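
The failed conversion can be observed directly. This is a small sketch assuming Pydantic v2, with a trimmed-down Customer model used only for illustration:

```python
from pydantic import BaseModel, ValidationError


class Customer(BaseModel):  # trimmed-down model for illustration
    name: str
    age: int


error_types = []
try:
    Customer(name="Alice Smith", age="twenty-eight")
except ValidationError as e:
    # The value is rejected; no Customer object is created
    error_types = [err["type"] for err in e.errors()]

print(error_types)
```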


Constraint Validation
Beyond types, you often need business rules: age must be positive, names can’t be empty, customer IDs must follow a pattern.

In dataclass, you define fields in one place and validate them in __post_init__. But raise stops at the first error, so you only learn about one problem at a time:

Python

from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    name: str
    email: str
    age: int
    is_premium: bool = False

    def __post_init__(self):
        if not self.customer_id:
            raise ValueError("Customer ID cannot be empty")
        if not self.name or len(self.name) < 1:
            raise ValueError("Name cannot be empty")
        if "@" not in self.email:
            raise ValueError(f"Invalid email: {self.email}")
        if self.age < 0 or self.age > 150:
            raise ValueError(f"Age must be between 0 and 150: {self.age}")


try:
    customer = Customer(
        customer_id="",  # Empty ID
        name="",  # Empty name
        email="invalid",  # Missing @
        age=-5,  # Negative age
    )
except ValueError as e:
    print(e)  # Only reports the first violation

💡 What the output shows
Four fields need four if blocks, four raise calls, and four hand-written messages. That’s a lot of boilerplate for simple rules like “name can’t be empty.”
Worse, raise halts at the first failure, so you only learn about "Customer ID cannot be empty" even though three other fields are also invalid.

Pydantic puts constraints directly in Field(), keeping rules next to the data they validate:

Python

from pydantic import BaseModel, Field, ValidationError


class Customer(BaseModel):
    customer_id: str = Field(min_length=1)
    name: str = Field(min_length=1)
    email: str = Field(pattern=r".+@.+")
    age: int = Field(ge=0, le=150)
    is_premium: bool = False


try:
    customer = Customer(
        customer_id="",  # Empty ID
        name="",  # Empty name
        email="invalid",  # Missing @
        age=-5,  # Negative age
    )
except ValidationError as e:
    print(e)

💡 What the output shows
The syntax is minimal: Field(min_length=1) and Field(ge=0, le=150) replace entire if blocks and hand-written error messages. Pydantic also checks every field in one pass, so all four violations surface together instead of one at a time.
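
If you need the violations as data rather than as a printed report, something like the following sketch could work. It assumes Pydantic v2 and reuses the same constraints on a Customer model. error_count() gives the number of violations, and errors() gives the details.

```python
from pydantic import BaseModel, Field, ValidationError


class Customer(BaseModel):
    customer_id: str = Field(min_length=1)
    name: str = Field(min_length=1)
    email: str = Field(pattern=r".+@.+")
    age: int = Field(ge=0, le=150)


summary = {}
try:
    Customer(customer_id="", name="", email="invalid", age=-5)
except ValidationError as e:
    print(f"{e.error_count()} violations")
    # Map each field name to the type of constraint it broke
    summary = {err["loc"][0]: err["type"] for err in e.errors()}

for field_name, error_type in summary.items():
    print(f"{field_name}: {error_type}")
```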

Here are the most common Field() constraints:

| Constraint | Type | Meaning |
| --- | --- | --- |
| gt, ge | numeric | Greater than / greater than or equal |
| lt, le | numeric | Less than / less than or equal |
| multiple_of | numeric | Value must be divisible by this number |
| min_length, max_length | str, list | Minimum / maximum length |
| pattern | str | Must match a regex pattern |

See the full list of Field parameters in the Pydantic docs.
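
The sketch below exercises the constraints from the table that the Customer example did not use. It assumes Pydantic v2 and Python 3.9+; the OrderLine model and its rules are invented for illustration.

```python
from pydantic import BaseModel, Field, ValidationError


class OrderLine(BaseModel):  # hypothetical model, not from the course
    sku: str = Field(min_length=3, max_length=8)
    quantity: int = Field(gt=0, multiple_of=6)  # sold in packs of six
    discount: float = Field(ge=0, lt=1)  # fraction from 0 up to, not including, 1
    tags: list[str] = Field(max_length=2)  # at most two tags


valid = OrderLine(sku="AB-123", quantity=12, discount=0.15, tags=["new"])

broken = {}
try:
    OrderLine(sku="AB", quantity=10, discount=1.0, tags=["a", "b", "c"])
except ValidationError as e:
    # Map each failing field to the constraint it violated
    broken = {err["loc"][0]: err["type"] for err in e.errors()}

print(broken)
```

Note that max_length applies to both strings and lists, but the error type differs between the two.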

Quiz

You pass name="", age=-5, and email="bad" to a Pydantic model with Field(min_length=1) on name, Field(ge=0) on age, and email validation. How many errors do you get?

A. One error for the first invalid field
B. Three errors, all reported in a single ValidationError
C. Three separate exceptions raised one after another

Feedback

A: ⚠ Not quite. That’s how dataclass __post_init__ works, stopping at the first failure. Pydantic checks all fields.
B: 💡 Correct! Pydantic validates every field and collects all failures into one ValidationError. Each violation is listed with its field name and the constraint that was broken.
C: ⚠ Not quite. Pydantic bundles all violations into a single ValidationError. You handle one exception that contains all the details.


Exercise: Validate a Job Posting

Scenario: A job board receives postings from employers. Each posting must have a non-empty title, a salary between 30,000 and 500,000, and a non-empty company name. Invalid postings should be rejected with all errors at once.

Task: Add Field() constraints to the JobPosting model so that invalid data is caught. With the constraints in place, the test case should raise validation errors.

💡 Hint: Useful Field() constraints are gt, ge, lt, le for numbers, and min_length and max_length for strings.

Starter code:

from pydantic import BaseModel, Field, ValidationError

# Add Field() constraints to reject invalid data
class JobPosting(BaseModel):
    title: str
    company: str
    salary: int


try:
    posting = JobPosting(
        title="",
        company="",
        salary=10,
    )
    print(f"Created: {posting.title} at {posting.company}, ${posting.salary}")
except ValidationError as e:
    for error in e.errors():
        print(f"{error['loc'][0]}: {error['type']}")
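
One possible solution to the job posting exercise is sketched below. It assumes Pydantic v2 and treats the salary bounds as inclusive, which the scenario does not state explicitly. It is not the course's official answer.

```python
from pydantic import BaseModel, Field, ValidationError


class JobPosting(BaseModel):
    title: str = Field(min_length=1)
    company: str = Field(min_length=1)
    salary: int = Field(ge=30_000, le=500_000)  # assumed inclusive bounds


errors = []
try:
    JobPosting(title="", company="", salary=10)
except ValidationError as e:
    errors = [(err["loc"][0], err["type"]) for err in e.errors()]

for field_name, error_type in errors:
    print(f"{field_name}: {error_type}")
```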


Nested Validation
In the dataclass example, the error only said “Invalid zip: 9ABC1” with no way to trace it back to address.zip_code. Pydantic fixes this by reporting the full path to each error:

Python

from pydantic import BaseModel, Field, ValidationError


class Address(BaseModel):
    street: str
    city: str
    zip_code: str = Field(pattern=r"^\d{5}$")  # Must be 5 digits


class Customer(BaseModel):
    customer_id: str
    name: str
    address: Address


try:
    customer = Customer(
        customer_id="C001",
        name="Alice Smith",
        address={
            "street": "123 Main St",
            "city": "New York",
            "zip_code": "9ABC1",  # Invalid zip code
        },
    )
except ValidationError as e:
    print(e)

💡 What the output shows
Unlike the dataclass error, Pydantic points directly to address.zip_code. In a structure with multiple addresses or zip codes, you can trace the problem immediately.
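
The path is also available in code. In this sketch (assuming Pydantic v2), each error's "loc" entry is a tuple of keys leading to the failing value, which can be joined into a dotted path for logging:

```python
from pydantic import BaseModel, Field, ValidationError


class Address(BaseModel):
    street: str
    city: str
    zip_code: str = Field(pattern=r"^\d{5}$")


class Customer(BaseModel):
    customer_id: str
    name: str
    address: Address


paths = []
try:
    Customer(
        customer_id="C001",
        name="Alice Smith",
        address={"street": "123 Main St", "city": "New York", "zip_code": "9ABC1"},
    )
except ValidationError as e:
    # "loc" is a tuple such as ("address", "zip_code")
    paths = [".".join(str(part) for part in err["loc"]) for err in e.errors()]

print(paths)
```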

Quiz

In the Pydantic example, address is passed as a plain dict, not an Address(…) object. What does Pydantic do with it?

A. Stores the dict as-is, like dataclass would
B. Converts the dict into an Address model and validates all its fields automatically
C. Raises an error because a dict is not an Address object

Feedback

A: ⚠ Not quite. That’s what dataclass does. Pydantic recognizes that the dict matches the Address model’s fields and converts it automatically.
B: 💡 Correct! Pydantic converts the raw dict into an Address model, then validates each field against its type hints and constraints. This is why you can pass nested data from JSON or APIs without manual conversion.
C: ⚠ Not quite. Pydantic accepts dicts for nested models and converts them automatically. This is one of the key differences from dataclass, which stores the dict without conversion.
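
The difference can be checked side by side. This sketch assumes Pydantic v2; PlainCustomer is a hypothetical dataclass counterpart added only for comparison.

```python
from dataclasses import dataclass

from pydantic import BaseModel


class Address(BaseModel):
    street: str
    city: str
    zip_code: str


class Customer(BaseModel):
    name: str
    address: Address


@dataclass
class PlainCustomer:  # hypothetical dataclass counterpart
    name: str
    address: Address  # the hint is not enforced by dataclass


data = {"street": "123 Main St", "city": "New York", "zip_code": "10001"}

validated = Customer(name="Alice Smith", address=data)
plain = PlainCustomer(name="Alice Smith", address=data)

print(type(validated.address).__name__)  # Address
print(type(plain.address).__name__)  # dict
```

With the Pydantic model, validated.address.zip_code works as attribute access; with the dataclass, the same expression would fail because the value is still a dict.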


Key Takeaways
Here’s what each tool provides:

dict: Quick to create, but silent failures from typos, missing keys, and wrong types make bugs hard to trace.
NamedTuple: Catches typos at creation and provides immutability, but does not enforce types at runtime and shares mutable defaults.
dataclass: Rejects mutable defaults (pointing you to default_factory instead) and supports validation via __post_init__, but errors are reported one at a time with no nesting path.
Pydantic: Enforces types at runtime, catches all validation errors at once, and reports the full path through nested structures like address.zip_code.
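
As a quick reminder of the dataclass point, here is a standard-library-only sketch. The Cart and SafeCart classes are hypothetical names used for illustration.

```python
from dataclasses import dataclass, field

rejected = False
try:
    @dataclass
    class Cart:
        items: list = []  # mutable default: dataclass refuses this
except ValueError as e:
    rejected = True
    print(e)


@dataclass
class SafeCart:
    items: list = field(default_factory=list)  # fresh list per instance


a, b = SafeCart(), SafeCart()
a.items.append("apple")
print(b.items)  # b has its own list, unaffected by a
```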


Entity Extraction with spaCy and LLMs

.completion-overlay {
display: none;
position: fixed;
inset: 0;
background: rgba(0, 0, 0, 0.7);
z-index: 1000;
align-items: center;
justify-content: center;
padding: 1rem;
}

.completion-modal {
background: #2F2D2E;
border: 1px solid #4a4849;
border-radius: 16px;
max-width: 520px;
width: 100%;
padding: 2.5rem;
text-align: center;
position: relative;
}

.completion-modal-close {
position: absolute;
top: 1rem;
right: 1rem;
background: none;
border: none;
color: #999;
font-size: 1.25rem;
cursor: pointer;
padding: 0.25rem;
line-height: 1;
}

.completion-modal-close:hover {
color: #fff;
}

.completion-modal h2 {
color: #72BEFA;
font-size: 1.5rem;
margin-bottom: 0.5rem;
}

.completion-modal p {
color: #ccc;
margin-bottom: 1.5rem;
font-size: 0.95rem;
line-height: 1.5;
}

.completion-courses {
display: flex;
flex-direction: column;
gap: 0.75rem;
margin-bottom: 1.5rem;
}

.completion-course-card {
display: block;
background: #3d3b3c;
border: 1px solid #4a4849;
border-radius: 10px;
padding: 1rem 1.25rem;
text-decoration: none;
text-align: left;
transition: border-color 0.2s;
}

.completion-course-card:hover {
border-color: #72BEFA;
}

.completion-course-card .card-title {
color: #72BEFA;
font-size: 0.95rem;
font-weight: 600;
margin-bottom: 0.25rem;
}

.completion-course-card .card-desc {
color: #999;
font-size: 0.8rem;
}

.completion-browse {
display: inline-block;
color: #E583B6;
font-size: 0.9rem;
text-decoration: none;
}

.completion-browse:hover {
text-decoration: underline;
}

/* Responsive */
@media (max-width: 768px) {
.course-sidebar {
width: 100%;
position: relative;
height: auto;
}

.course-content {
margin-left: 0;
padding: 1.5rem;
}

.course-layout {
flex-direction: column;
}
}

Entity Extraction with spaCy and LLMs

Getting Started
  What is Entity Extraction?
  Sample Document

The Manual Approach
  Why Not Use Regex?

spaCy NER
  Production-Grade Named Entity Recognition
  Exercise: Build a Contact List
  Extracting from Business Documents
  Exercise: Export Contact List
  Visualizing Entities with displaCy

GLiNER
  Zero-Shot Custom Entity Extraction
  Extracting Business Entities
  Exercise: Parse Business Metrics
  Using Confidence Scores for Quality Control
  Exercise: Route Low-Confidence to Review

langextract
  AI-Powered Extraction with Source Grounding
  Exercise: Analyze Customer Feedback
  Visualizing Extractions

Summary
  When to Use Each Tool

What is Entity Extraction?
Entity extraction (also called Named Entity Recognition or NER) automatically identifies and classifies key information from unstructured text. For instance, financial reports contain company names, monetary figures, executives, dates, and locations used for competitive analysis and executive tracking.

Extracting these entities manually is time-consuming and error-prone. Automated entity extraction provides a faster and more reliable alternative.

In this course, you’ll learn three modern tools for entity extraction:

spaCy: Production-ready NER with pre-trained models
GLiNER: Zero-shot custom entity recognition
langextract: AI-powered extraction with source grounding


Sample Document
Throughout this course, we’ll extract entities from this earnings report.

earning_report = """
Apple Inc. (NASDAQ: AAPL) reported third quarter revenue of $81.4 billion,
up 2% year over year. CEO Tim Cook stated that Services revenue reached
a new all-time high of $21.2 billion. The company's board of directors
declared a cash dividend of $0.24 per share.

CFO Luca Maestri mentioned that iPhone revenue was $39.3 billion for
the quarter ending June 30, 2023. The company expects total revenue
between $89 billion and $93 billion for the fourth quarter.

Apple's Cupertino headquarters announced the acquisition of AI startup
WaveOne for an undisclosed amount. The deal is expected to close in
Q4 2023, pending regulatory approval from the SEC.
"""

print("Earnings report loaded!")
print(f"Document length: {len(earning_report)} characters")

We chose this report because it’s dense with overlapping entity types, which is exactly what makes real-world extraction challenging:

Monetary amounts appear in different contexts: revenue ($81.4B), dividends ($0.24), and forecasted ranges ($89B-$93B)
Named entities overlap: the same company appears as “Apple Inc.”, as the stock ticker “AAPL”, and as the possessive “Apple’s”, while “SEC” is an abbreviation that needs context to identify
Temporal references mix formats: exact dates (June 30, 2023), quarters (Q4 2023), and relative time (year over year)


Why Not Use Regex?
Regular expressions define text patterns using special syntax to find matches in strings. While they may seem like a natural first choice for entity extraction, they require a separate pattern for each entity type and fail when formats vary.

Here’s what extracting financial amounts, dates, stock symbols, and quarters with regex looks like:

import re

earning_report = """
Apple Inc. (NASDAQ: AAPL) reported third quarter revenue of $81.4 billion,
up 2% year over year. CEO Tim Cook stated that Services revenue reached
a new all-time high of $21.2 billion. CFO Luca Maestri mentioned that
iPhone revenue was $39.3 billion for the quarter ending June 30, 2023.
"""

# Each entity type needs a separate complex pattern
financial_pattern = r"\$(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.[0-9]+)?(?:\s*(?:billion|million|trillion))?"
date_pattern = r"\b(?:January|February|March|April|May|June|July|August|September|October|November|December)\s+\d{1,2},\s+\d{4}"
stock_pattern = r"\b(?:NASDAQ|NYSE|NYSEARCA):\s*[A-Z]{2,5}\b"
quarter_pattern = r"\b(Q[1-4]\s+\d{4})\b"

print("Financial amounts:", re.findall(financial_pattern, earning_report, re.IGNORECASE))
print("Dates:", re.findall(date_pattern, earning_report))
print("Stock symbols:", re.findall(stock_pattern, earning_report))
print("Quarters:", re.findall(quarter_pattern, earning_report))

From the code above, several limitations become apparent:

Each entity type requires its own pattern, resulting in verbose boilerplate code that is difficult to read and maintain.
The patterns only match numeric quarter formats like “Q4 2023” and miss textual forms such as “third quarter” unless additional exact-match patterns are added.
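
The second limitation is easy to reproduce. The sketch below pairs the numeric quarter pattern described above with a hypothetical textual_quarter_pattern (an assumption added for illustration, not part of the lesson code) to show that every new phrasing needs yet another pattern:

```python
import re

text = (
    "Apple Inc. reported third quarter revenue of $81.4 billion. "
    "The deal is expected to close in Q4 2023."
)

# Numeric quarters such as "Q4 2023"
quarter_pattern = r"\b(Q[1-4]\s+\d{4})\b"

# Hypothetical extra pattern (assumption for illustration):
# spelled-out quarters such as "third quarter"
textual_quarter_pattern = r"\b((?:first|second|third|fourth)\s+quarter)\b"

numeric_quarters = re.findall(quarter_pattern, text)
textual_quarters = re.findall(textual_quarter_pattern, text, re.IGNORECASE)

print("Numeric quarters:", numeric_quarters)
print("Textual quarters:", textual_quarters)
```

Even with both patterns, phrasings like “Q3” without a year or “3rd quarter” would still slip through, which is the maintenance burden this lesson is pointing at.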

Quiz

A document contains dates in formats like “January 15, 2024”, “15/01/2024”, and “2024-01-15”. What challenge does regex face here?

A. Regex cannot match numeric characters
B. Each date format requires a separate pattern, making the code harder to maintain as formats increase
C. Regex patterns are limited to 100 characters in length

Answer: B. Each date format (ISO, US, European, written) needs its own pattern. As formats multiply, the codebase grows harder to maintain and test.
Why not A: Regex handles numeric characters easily with patterns like \d. The challenge is handling multiple format variations.
Why not C: Regex patterns have no practical length limit. The challenge is writing and maintaining patterns for every format variation.
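
To make the quiz scenario concrete, here is a minimal sketch that handles the three date formats from the question. The DATE_FORMATS table and the find_dates helper are hypothetical names chosen for this illustration, and the slash format is assumed to be day-first:

```python
import re
from datetime import datetime

# One (regex, strptime format) pair per date style: assumed formats for illustration
DATE_FORMATS = [
    (r"\b[A-Z][a-z]+ \d{1,2}, \d{4}\b", "%B %d, %Y"),  # January 15, 2024
    (r"\b\d{2}/\d{2}/\d{4}\b", "%d/%m/%Y"),            # 15/01/2024 (day first)
    (r"\b\d{4}-\d{2}-\d{2}\b", "%Y-%m-%d"),            # 2024-01-15
]

def find_dates(text):
    """Return every date found by any pattern, normalized to ISO format."""
    found = []
    for pattern, fmt in DATE_FORMATS:
        for match in re.findall(pattern, text):
            try:
                found.append(datetime.strptime(match, fmt).date().isoformat())
            except ValueError:
                # The pattern matched, but the text is not a valid date in this format
                continue
    return found

sample = "Signed January 15, 2024; filed 15/01/2024; archived 2024-01-15."
print(find_dates(sample))
```

Each additional format means one more entry in the table, plus a decision about ambiguous cases such as whether 01/02/2024 is day-first or month-first.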


Production-Grade Named Entity Recognition
spaCy provides pre-trained models that automatically identify entities like PERSON, ORG, MONEY, DATE, and PERCENT from context. No pattern writing required.

Let’s install spaCy and download a small English model to get started:

pip install spacy
python -m spacy download en_core_web_sm

Extracting entities with spaCy takes just two steps:

Load the model
Process your text

import spacy

# Load the model
nlp = spacy.load("en_core_web_sm")

# Process your text
sample_text = "Apple Inc. reported revenue of $81.4 billion with CEO Tim Cook."
doc = nlp(sample_text)

print("Entities found:")
for ent in doc.ents:
    print(f"  '{ent.text}' -> {ent.label_}")

💡 What the output shows

spaCy extracted three entity types (ORG, MONEY, PERSON) without any configuration
The model understood that “Apple Inc.” is a company, not just a fruit
It captured the complete monetary amount “$81.4 billion” including the unit
“Tim Cook” is recognized as a PERSON on its own, with the title “CEO” left outside the entity

How spaCy NER Works

spaCy labels each token individually using its BILUO tagging scheme, then groups consecutive entity tokens into spans:

Token       BILUO tag     Entity span
"Apple"     B-ORG         "Apple Inc." → ORG
"Inc."      L-ORG
"CEO"       O
"Tim"       B-PERSON      "Tim Cook" → PERSON
"Cook"      L-PERSON
"$81.4"     B-MONEY       "$81.4 billion" → MONEY
"billion"   L-MONEY

Begin / Inside / Last mark the first, middle, and last tokens of a multi-token entity
Unit marks single-token entities (e.g., “London” → U-GPE)
O means outside any entity

The model learns these tagging patterns from thousands of labeled examples during training.
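
The grouping step can be sketched in plain Python. This is not spaCy’s internal code: biluo_to_spans is a hypothetical helper, the tags are written by hand with the full PERSON label, and tokens are joined with spaces as a simplification (spaCy tracks character offsets instead):

```python
def biluo_to_spans(tokens, tags):
    """Group BILUO-tagged tokens into (text, label) entity spans."""
    spans = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        if tag == "O":
            # Outside any entity: drop any unfinished span
            current_tokens, current_label = [], None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "U":
            # Single-token entity
            spans.append((token, label))
        elif prefix == "B":
            # Start a new multi-token entity
            current_tokens, current_label = [token], label
        elif prefix in ("I", "L") and label == current_label:
            current_tokens.append(token)
            if prefix == "L":
                # Last token closes the span
                spans.append((" ".join(current_tokens), label))
                current_tokens, current_label = [], None
    return spans

tokens = ["Apple", "Inc.", "CEO", "Tim", "Cook", "$81.4", "billion"]
tags = ["B-ORG", "L-ORG", "O", "B-PERSON", "L-PERSON", "B-MONEY", "L-MONEY"]
spans = biluo_to_spans(tokens, tags)
print(spans)
```

In practice you never run this step yourself. The finished spans are what you read from doc.ents.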

Quiz

How does spaCy determine that “Apple Inc.” is an ORG entity?

A. It matches against a built-in dictionary of known company names
B. It uses regex to match common organization name patterns
C. The pre-trained model learned patterns from labeled training data

Answer: C. spaCy’s NER is a statistical model trained on annotated text. It learned patterns like capitalization, surrounding words, and name structures from its training data, not from a fixed list or regex rules.
Why not A: spaCy doesn’t use a fixed lookup table. It uses a statistical model that can recognize entities it has never seen before based on learned patterns.
Why not B: Regex uses fixed text patterns. spaCy’s NER model uses neural networks trained on annotated text to predict entity types from context.


Zero-Shot Custom Entity Extraction
GLiNER solves spaCy’s limitation of fixed entity types. Instead of being locked into categories like ORG or GPE, GLiNER lets you define custom types using natural language descriptions.

pip install gliner

GLiNER offers several pretrained models. We’ll use gliner_small-v2.1 with threshold=0.3 to capture entities with at least 30% confidence:

from gliner import GLiNER

model = GLiNER.from_pretrained("urchade/gliner_small-v2.1")

test_text = "Apple Inc. CEO Tim Cook announced quarterly revenue of $81.4 billion."
custom_types = ["Company", "Person", "Currency"]

entities = model.predict_entities(test_text, custom_types, threshold=0.3)

for entity in entities:
    print(f"'{entity['text']}' -> {entity['label']} (confidence: {entity['score']:.3f})")

💡 What the output shows

GLiNER recognized custom entity types without any training
Confidence scores vary: “Tim Cook” (0.563) scores highest, plausibly because personal names are distinctive, while “$81.4 billion” (0.310) barely clears the 0.3 threshold, possibly because “Currency” is a less common label

📝 Other model options
For higher accuracy, try gliner_medium-v2.1. For multilingual support, use gliner_multi-v2.1.

How GLiNER Works

Instead of tagging individual tokens, GLiNER scores entire spans against every label you provide. The highest-scoring label wins, and spans below your threshold are filtered out:

┌──────────────┬───────────┬──────────────────┐
│ Span │ Label │ Confidence │
├──────────────┼───────────┼──────────────────┤
│ Apple Inc │ Company │ ████░░░░░░░ 0.36 │ ✓ above 0.3
│ Apple Inc │ Person │ █░░░░░░░░░░ 0.05 │ ✗
├──────────────┼───────────┼──────────────────┤
│ Tim Cook │ Company │ █░░░░░░░░░░ 0.04 │ ✗
│ Tim Cook │ Person │ ██████░░░░░ 0.56 │ ✓ above 0.3
├──────────────┼───────────┼──────────────────┤
│ $81.4 billion│ Company │ ░░░░░░░░░░░ 0.01 │ ✗
│ $81.4 billion│ Currency │ ███░░░░░░░░ 0.31 │ ✓ above 0.3
└──────────────┴───────────┴──────────────────┘
threshold = 0.3 ▲

This gives you two controls spaCy doesn’t: custom labels (any text, not a fixed set) and a confidence threshold to filter results.
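
The selection logic in the diagram can be sketched without the model. The scores below are copied from the diagram rather than computed, pick_labels is a hypothetical helper, and treating a score equal to the threshold as a pass is an assumption, not GLiNER’s documented behavior:

```python
# Scores copied from the diagram; a real model computes them
span_scores = {
    "Apple Inc": {"Company": 0.36, "Person": 0.05},
    "Tim Cook": {"Company": 0.04, "Person": 0.56},
    "$81.4 billion": {"Company": 0.01, "Currency": 0.31},
}

def pick_labels(span_scores, threshold=0.3):
    """Keep the best-scoring label per span if it reaches the threshold."""
    results = []
    for span, scores in span_scores.items():
        # Highest-scoring label wins for this span
        label, score = max(scores.items(), key=lambda item: item[1])
        if score >= threshold:
            results.append({"text": span, "label": label, "score": score})
    return results

for entity in pick_labels(span_scores):
    print(f"'{entity['text']}' -> {entity['label']} ({entity['score']:.2f})")

# Raising the threshold drops the weakest match
stricter = pick_labels(span_scores, threshold=0.35)
print([entity["text"] for entity in stricter])
```

Raising the threshold from 0.3 to 0.35 removes “$81.4 billion”, which shows how the threshold trades recall for precision.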

Quiz

How does GLiNER decide which label to assign to a text span?

A. It picks the first label in your list that partially matches
B. It scores the span against every label and picks the highest
C. It uses a dictionary lookup to map known words to labels

Answer: B. As shown in the diagram, each span is scored against all labels. “Apple Inc” scored 0.36 for Company, 0.05 for Person, and 0.02 for Currency. The highest score (Company) wins.
Why not A: The order of labels in your list doesn’t affect the result. GLiNER evaluates all labels equally for each span.
Why not C: GLiNER doesn’t use a fixed dictionary. It uses a BERT-like encoder to compare text spans against label descriptions semantically.


AI-Powered Extraction with Source Grounding
langextract uses large language models (Gemini, GPT) to understand entity relationships and provide source attribution.

It captures semantic context like “AI startup WaveOne” (category + name) and “between $89 billion and $93 billion” (revenue ranges) as complete phrases rather than separate pieces.

Let’s install langextract along with its dependencies to try it out:

pip install langextract python-dotenv google-genai

To authenticate, add your API key to a .env file. This course uses Gemini (get a key from AI Studio), but OpenAI models also work:

# .env file
LANGEXTRACT_API_KEY=your-api-key-here
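
A missing or placeholder key usually surfaces later as a confusing API error, so it can help to check for it up front. The sketch below is a hypothetical helper (get_api_key is not part of langextract or python-dotenv) that you could call after load_dotenv() has copied the .env values into the environment:

```python
import os

def get_api_key(env=os.environ, name="LANGEXTRACT_API_KEY"):
    """Return the API key, or raise a clear error if it is missing."""
    key = env.get(name, "").strip()
    # Treat the placeholder from the .env template as "not set"
    if not key or key == "your-api-key-here":
        raise RuntimeError(f"{name} is not set. Add it to your .env file.")
    return key

# Demo with an explicit mapping instead of the real environment
print(get_api_key({"LANGEXTRACT_API_KEY": "demo-key"}))
```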

langextract uses an LLM to extract entities. You provide examples that teach the model what to look for and how to format the output:

Example (you provide):
┌─────────────────────────────────────────────────────┐
│ Text: "Microsoft Corp. CEO Satya Nadella reported │
│ Q2 2024 revenue of $65B" │
│ │
│ Extractions: │
│ company → "Microsoft Corp." │
│ executive → "CEO Satya Nadella" ← role + name │
│ quarter → "Q2 2024" │
│ financial → "$65B" │
└──────────────────────┬──────────────────────────────┘
│ teaches format

New text: "Apple Inc… CEO Tim Cook… $81.4 billion"


Output (model generates):
┌─────────────────────────────────────────────────────┐
│ company → "Apple Inc." │
│ executive → "CEO Tim Cook" ← same format │
│ executive → "CFO Luca Maestri" ← generalized │
│ financial → "undisclosed amount" ← semantic │
└─────────────────────────────────────────────────────┘

The LLM generalizes from your examples. One example showing “CEO Satya Nadella” is enough for it to also extract “CFO Luca Maestri” and to treat “undisclosed amount” as a financial figure, something spaCy and GLiNER would likely miss.

Few-Shot Learning with Examples

To use langextract, provide two components:

Prompt: A description listing entity types to extract (companies, executives, financial figures)
Examples: Sample text paired with labeled extractions showing expected output

import os
from dotenv import load_dotenv
import langextract as lx
from langextract import extract

load_dotenv()

def extract_financial_entities(text):
    """Extract entities using langextract."""
    prompt_description = """Extract business entities: companies, executives,
    financial figures, quarters, locations, products, startups,
    regulatory bodies, stock_symbols, market_reaction."""

    examples = [
        lx.data.ExampleData(
            text="Microsoft Corp. (NYSE: MSFT) CEO Satya Nadella reported Q2 2024 revenue of $65B, down 5% quarter-over-quarter.",
            extractions=[
                lx.data.Extraction(extraction_class="company", extraction_text="Microsoft Corp."),
                lx.data.Extraction(extraction_class="executive", extraction_text="CEO Satya Nadella"),
                lx.data.Extraction(extraction_class="stock_symbol", extraction_text="NYSE: MSFT"),
                lx.data.Extraction(extraction_class="quarter", extraction_text="Q2 2024"),
                lx.data.Extraction(extraction_class="financial_figure", extraction_text="$65B"),
                lx.data.Extraction(extraction_class="market_reaction", extraction_text="down 5% quarter-over-quarter"),
            ]
        )
    ]

    return extract(
        text_or_documents=text,
        prompt_description=prompt_description,
        examples=examples,
        model_id="gemini-2.5-flash"
    )

Now extract entities from the earnings report:

from collections import defaultdict

earning_report = """
Apple Inc. (NASDAQ: AAPL) reported third quarter revenue of $81.4 billion,
up 2% year over year. CEO Tim Cook stated that Services revenue reached
a new all-time high of $21.2 billion. The company's board of directors
declared a cash dividend of $0.24 per share.

CFO Luca Maestri mentioned that iPhone revenue was $39.3 billion for
the quarter ending June 30, 2023. The company expects total revenue
between $89 billion and $93 billion for the fourth quarter.

Apple's Cupertino headquarters announced the acquisition of AI startup
WaveOne for an undisclosed amount. The deal is expected to close in
Q4 2023, pending regulatory approval from the SEC.
"""

result = extract_financial_entities(earning_report)

non_empty = [e for e in result.extractions if e.extraction_text]
print(f"Extracted {len(non_empty)} entities:")

grouped = defaultdict(list)
for extraction in result.extractions:
    if extraction.extraction_text:  # Filter empty extractions
        grouped[extraction.extraction_class].append(extraction.extraction_text)

for entity_class, texts in grouped.items():
    print(f"\n{entity_class.upper()} ({len(texts)} found):")
    for text in texts:
        print(f"  '{text}'")

💡 What the output shows

Role-linked executives (“CEO Tim Cook”) instead of just the name
Semantic understanding of “undisclosed amount” as a financial figure
Market reaction “up 2% year over year” captured with full context
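
Source grounding, the other half of this lesson’s title, means each extraction can be traced back to its exact position in the original text. The sketch below is not langextract’s API. It only illustrates the idea with str.find, and ground_extractions is a hypothetical helper:

```python
def ground_extractions(source_text, extraction_texts):
    """Map each extracted string to its (start, end) character offsets, or None."""
    grounded = {}
    for text in extraction_texts:
        start = source_text.find(text)
        # find() returns -1 when the text does not occur in the source
        grounded[text] = (start, start + len(text)) if start != -1 else None
    return grounded

source = (
    "CEO Tim Cook stated that Services revenue reached "
    "a new all-time high of $21.2 billion."
)
offsets = ground_extractions(
    source, ["CEO Tim Cook", "$21.2 billion", "CFO Luca Maestri"]
)

for text, span in offsets.items():
    print(f"{text!r}: {span}")
```

An extraction that maps to None does not appear verbatim in the source, which is a useful signal that an LLM may have paraphrased or invented it.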

Quiz

The example extracts “CEO Satya Nadella” as an executive. How does this affect the model’s output?

A. The model will only extract executives from Microsoft
B. The model learns to include the role (CEO/CFO) with the name
C. The model copies the exact format and ignores other patterns

Answer: B. The few-shot example teaches the model what format to use. Since the example linked the role to the name, the model did the same for “CEO Tim Cook” and “CFO Luca Maestri.”
Why not A: The example teaches a pattern, not a specific company. The model applied the same pattern to extract “CEO Tim Cook” and “CFO Luca Maestri” from Apple’s report.
Why not C: The model generalizes from the example. It extracted “CFO Luca Maestri” even though the example only showed a CEO pattern.

langextract extracted “undisclosed amount” as a financial figure. Why would spaCy and GLiNER likely miss this?

A. “undisclosed amount” is too long for token-based models
B. It contains no numbers or currency symbols, the surface cues these models typically associate with financial entities
C. spaCy and GLiNER can’t process sentences about acquisitions

Answer: B. spaCy’s MONEY type and GLiNER’s “Monetary Value” label are learned mostly from spans that contain digits or currency symbols, so a phrase with neither is likely to be missed. langextract’s LLM understands that “undisclosed amount” refers to money semantically, even without numbers.
Why not A: Both spaCy and GLiNER handle multi-token spans. “Cupertino headquarters” was captured as a two-word span by GLiNER.
Why not C: Both tools can process any text. The issue is that “undisclosed amount” lacks the numeric cues these models tend to rely on to identify financial entities.


DuckDB for Data Scientists

.status-icon {
width: 20px;
height: 20px;
min-width: 20px;
flex-shrink: 0;
display: flex;
align-items: center;
justify-content: center;
border: 2px solid #4a4849;
border-radius: 50%;
font-size: 0.7rem;
}

.nav-item.completed .status-icon {
border-color: #72BEFA;
background: rgba(114, 252, 219, 0.1);
}

.lock-icon {
margin-left: auto;
font-size: 0.75rem;
color: #666;
opacity: 0.7;
flex-shrink: 0;
min-width: 1rem;
}

/* Main content */
.course-content {
margin-left: 280px;
flex: 1;
padding: 2rem 3rem;
max-width: 900px;
}

.lesson {
display: none;
}

.lesson.active {
display: block;
}

.lesson h2 {
color: #72BEFA;
font-size: 1.75rem;
margin-bottom: 1.5rem;
padding-bottom: 0.5rem;
border-bottom: 2px solid #4a4849;
}

.lesson h3 {
color: #fff;
font-size: 1.25rem;
margin-top: 2rem;
margin-bottom: 1rem;
}

.lesson h4 {
color: #ccc;
font-size: 1.1rem;
margin-top: 1.5rem;
margin-bottom: 0.75rem;
}

.lesson p {
color: #ccc;
margin-bottom: 1rem;
}

.lesson ul, .lesson ol {
color: #ccc;
margin-bottom: 1rem;
padding-left: 1.5rem;
}

.lesson li {
margin-bottom: 0.5rem;
}

.lesson code {
background: #3d3b3c;
padding: 0.2rem 0.4rem;
border-radius: 4px;
font-family: ‘Fira Code’, monospace;
font-size: 0.9em;
color: #72BEFA;
}

.lesson pre {
background: #2F2D2E;
padding: 1rem;
border-radius: 8px;
overflow-x: auto;
margin-bottom: 1rem;
border: 1px solid #4a4849;
}

.lesson pre code {
background: none;
padding: 0;
color: #f8f8f2;
}

/* Callouts */
.callout {
padding: 1rem 1.25rem;
border-radius: 8px;
margin: 1.5rem 0;
border-left: 4px solid;
}

.callout-title {
font-weight: 600;
margin-bottom: 0.5rem;
display: flex;
align-items: center;
gap: 0.5rem;
}

.callout-tip {
background: rgba(114, 190, 250, 0.1);
border-color: #72BEFA;
}

.callout-tip .callout-title {
color: #72BEFA;
}

.callout-note {
background: rgba(114, 252, 219, 0.1);
border-color: #72FCDB;
}

.callout-note .callout-title {
color: #72FCDB;
}

.callout-warning {
background: rgba(229, 131, 182, 0.1);
border-color: #E583B6;
}

.callout-warning .callout-title {
color: #E583B6;
}

.callout a {
color: #fff;
text-decoration: underline;
}

.callout a:hover {
color: #72FCDB;
}

/* Collapsible callouts */
details.callout {
cursor: pointer;
}

details.callout summary.callout-title {
cursor: pointer;
list-style: none;
}

details.callout summary.callout-title::before {
content: ‘▶ ‘;
font-size: 0.8em;
transition: transform 0.2s;
display: inline-block;
}

details.callout[open] summary.callout-title::before {
transform: rotate(90deg);
}

details.callout summary.callout-title::-webkit-details-marker {
display: none;
}

details.callout > p {
margin-top: 0.75rem;
}

.callout pre {
background: #1a1a1a;
border-radius: 6px;
padding: 1rem;
margin-top: 0.75rem;
overflow-x: auto;
}

.callout pre code {
font-family: ‘Fira Code’, monospace;
font-size: 0.9rem;
color: #c3e88d;
}

/* Blockquotes */
.lesson blockquote {
border-left: 3px solid #72BEFA;
background: rgba(114, 190, 250, 0.08);
padding: 0.75rem 1.25rem;
border-radius: 0 6px 6px 0;
margin: 1rem 0;
}

.lesson blockquote p {
margin: 0;
color: rgba(255, 255, 255, 0.85);
}

/* Tables */
.course-table {
width: 100%;
border-collapse: collapse;
margin: 1rem 0 1.5rem 0;
font-size: 0.95rem;
}
.course-table th,
.course-table td {
border: 1px solid #4a4849;
padding: 0.6rem 1rem;
text-align: left;
}
.course-table thead th {
background: #3a3839;
color: #e0e0e0;
font-weight: 600;
}
.course-table tbody td {
color: #ccc;
}
.course-table tbody tr:nth-child(even) {
background: rgba(255, 255, 255, 0.03);
}

/* Quiz */
.quiz {
background: #2F2D2E;
border-radius: 8px;
padding: 1.5rem;
margin: 0 0 1.5rem 0;
border: 1px solid #4a4849;
}

.quiz-heading {
color: #ccc;
font-size: 1.1rem;
margin-top: 1.5rem;
margin-bottom: 0.75rem;
}

.quiz-divider {
border: none;
border-top: 1px solid #4a4849;
margin: 1.5rem 0;
}

.quiz-question {
color: #fff;
font-size: 1rem;
margin-bottom: 1rem;
font-weight: 500;
}

.quiz-options {
display: flex;
flex-direction: column;
gap: 0.75rem;
}

.quiz-option {
display: flex;
align-items: center;
gap: 0.75rem;
padding: 0.75rem 1rem;
background: #3d3b3c;
border: 2px solid #4a4849;
border-radius: 8px;
cursor: pointer;
transition: all 0.2s;
text-align: left;
width: 100%;
}

.quiz-option:hover:not(:disabled) {
border-color: #72BEFA;
background: #454243;
}

.quiz-option:disabled {
cursor: default;
}

.quiz-option.correct {
border-color: #72FCDB;
background: rgba(114, 252, 219, 0.15);
}

.quiz-option.incorrect {
border-color: #ff6b6b;
background: rgba(255, 107, 107, 0.15);
}

.option-label {
display: flex;
align-items: center;
justify-content: center;
width: 28px;
height: 28px;
min-width: 28px;
background: #4a4849;
border-radius: 50%;
font-weight: 600;
font-size: 0.85rem;
color: #fff;
}

.quiz-option.correct .option-label {
background: #72FCDB;
color: #2F2D2E;
}

.quiz-option.incorrect .option-label {
background: #ff6b6b;
color: #2F2D2E;
}

.option-content {
display: block;
flex: 1;
color: #ccc;
}

.option-content code {
background: #282a36;
padding: 0.15rem 0.4rem;
border-radius: 4px;
font-size: 0.85rem;
color: #f8f8f2;
}

.code-option code {
display: block;
padding: 0.5rem 0.75rem;
}

.quiz-feedback {
margin-top: 1rem;
padding-top: 1rem;
border-top: 1px solid #4a4849;
}

.quiz-feedback .callout {
margin: 0;
}

/* Code widget */
.codecut-widget {
background: #2F2D2E;
border-radius: 8px;
overflow: hidden;
margin: 1.5rem 0;
border: 1px solid #4a4849;
}

.codecut-widget-header {
display: flex;
justify-content: space-between;
align-items: center;
padding: 0.5rem 1rem;
background: #3d3b3c;
border-bottom: 1px solid #4a4849;
}

.codecut-widget-lang {
color: #72BEFA;
font-size: 0.75rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.5px;
}

.codecut-run-btn {
display: flex;
align-items: center;
gap: 0.4rem;
background: #72BEFA;
color: #2F2D2E;
border: none;
padding: 0.4rem 0.8rem;
border-radius: 4px;
font-size: 0.8rem;
font-weight: 600;
cursor: pointer;
transition: all 0.2s;
}

.codecut-run-btn:hover {
background: #5aa8e8;
}

.codecut-run-btn:disabled {
background: #666;
cursor: not-allowed;
}

.codecut-editor {
min-height: 80px;
background: #2F2D2E;
}

.codecut-editor > textarea,
.exercise-editor > textarea {
display: none;
}

/* Static code widgets (read-only, no header/output) */
.codecut-widget[data-static=”true”] {
border-radius: 8px;
border: 1px solid #4a4849;
}

.codecut-widget[data-static=”true”] .codecut-editor {
border-radius: 8px;
min-height: auto;
}

.codecut-widget[data-static=”true”] .codecut-editor textarea {
min-height: auto;
}

.codecut-widget[data-static=”true”] .CodeMirror {
min-height: auto;
}

.codecut-widget[data-static=”true”] .CodeMirror-scroll {
min-height: auto;
}

.codecut-widget[data-demo=”true”] .codecut-editor {
min-height: auto;
}

.codecut-widget[data-demo=”true”] .codecut-editor textarea {
min-height: auto;
}

.codecut-widget[data-demo=”true”] .CodeMirror {
min-height: auto;
}

.codecut-widget[data-demo=”true”] .CodeMirror-scroll {
min-height: auto;
}

/* CodeMirror 5 styling overrides */
.CodeMirror {
height: auto;
min-height: 80px;
font-family: ‘Fira Code’, monospace;
font-size: 0.9rem;
line-height: 1.5;
background: #282a36;
border-radius: 0;
}

.CodeMirror-scroll {
min-height: 80px;
overflow-x: auto !important;
overflow-y: hidden !important;
}

.CodeMirror-gutters {
background: #282a36;
border-right: 1px solid #4a4849;
min-width: 40px;
}

.CodeMirror-linenumber {
color: #6272a4;
padding: 0 8px 0 5px;
min-width: 25px;
text-align: right;
}

.CodeMirror-sizer {
margin-left: 40px !important;
}

.CodeMirror-cursor {
border-left-color: #72BEFA;
}

.CodeMirror-selected {
background: rgba(114, 190, 250, 0.3) !important;
}

.CodeMirror-focused .CodeMirror-selected {
background: rgba(114, 190, 250, 0.4) !important;
}

/* Suppress red error background for $ and other valid-in-context tokens */
.cm-s-material-palenight .cm-error {
background: none;
}

.codecut-output-section {
margin-top: 0.75rem;
border-top: 2px solid #4a4849;
background: #252324;
}

.codecut-output-header {
padding: 0.4rem 1rem;
background: #3d3b3c;
border-bottom: 1px solid #4a4849;
}

.codecut-output-label {
color: #aaa;
font-size: 0.75rem;
font-weight: 600;
text-transform: uppercase;
}

.codecut-output {
padding: 1rem;
min-height: 60px;
max-height: 300px;
overflow-y: auto;
font-family: ‘Fira Code’, monospace;
font-size: 0.85rem;
line-height: 1.5;
color: #f8f8f2;
white-space: pre-wrap;
}

.course-image {
max-width: 100%;
height: auto;
border-radius: 4px;
display: block;
margin: 1em 0;
}

pre.mermaid {
text-align: center;
background: transparent;
border: none;
padding: 1em 0;
margin: 1em 0;
}

pre.mermaid svg {
background: transparent !important;
}

.codecut-output img {
max-width: 100%;
height: auto;
border-radius: 4px;
}

.codecut-output.has-image {
max-height: none;
white-space: normal;
}

.codecut-output.error { color: #ff6b6b; }
.codecut-output.loading { color: #72BEFA; }
.codecut-output .success { color: #72BEFA; }

.codecut-spinner {
display: inline-block;
width: 14px;
height: 14px;
border: 2px solid #2F2D2E;
border-top-color: transparent;
border-radius: 50%;
animation: spin 0.8s linear infinite;
}

@keyframes spin {
to { transform: rotate(360deg); }
}

/* Exercise widget */
.exercise-widget {
background: #1e1e2e;
border-radius: 12px;
overflow: hidden;
margin: 1.5rem 0;
border: 1px solid #4a4849;
}

.exercise-split {
display: flex;
flex-direction: column;
}

.exercise-left {
padding: 20px 24px;
background: #252535;
border-bottom: 1px solid #4a4849;
}

.exercise-title {
color: #72BEFA;
font-size: 1rem;
font-weight: 600;
margin: 0 0 1rem 0;
text-transform: uppercase;
letter-spacing: 0.5px;
}

.exercise-assignment {
color: #e0e0e0;
font-size: 0.9rem;
line-height: 1.6;
display: flex;
flex-wrap: wrap;
gap: 1.5rem 3rem;
}

.exercise-assignment p {
margin: 0;
}

.exercise-heading {
color: #72BEFA;
font-size: 0.75rem;
font-weight: 600;
margin: 0 0 0.4rem 0;
text-transform: uppercase;
letter-spacing: 0.5px;
}

.exercise-section {
flex: 1;
min-width: 200px;
}

.exercise-heading + p {
margin-top: 0;
}

.exercise-assignment em {
color: #ffffff;
font-style: italic;
}

.exercise-assignment code {
background: #3d3b3c;
padding: 0.2rem 0.4rem;
border-radius: 4px;
font-family: ‘Fira Code’, monospace;
font-size: 0.85rem;
}

.exercise-secrets {
margin-top: 1rem;
padding-top: 1rem;
border-top: 1px solid #3d3b3c;
}

.exercise-secret {
display: flex;
flex-direction: column;
gap: 0.4rem;
margin-bottom: 0.75rem;
}

.exercise-secret:last-child {
margin-bottom: 0;
}

.exercise-secret label {
color: #72BEFA;
font-size: 0.75rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.5px;
}

.exercise-secret input {
padding: 0.6rem 0.8rem;
background: #1e1e2e;
border: 1px solid #4a4849;
border-radius: 6px;
color: #e0e0e0;
font-family: ‘Fira Code’, monospace;
font-size: 0.85rem;
outline: none;
transition: border-color 0.2s;
}

.exercise-secret input:focus {
border-color: #72BEFA;
}

.exercise-secret input::placeholder {
color: #666;
}

.exercise-right {
display: flex;
flex-direction: column;
background: #1e1e2e;
}

.exercise-editor {
flex: 1;
min-height: 200px;
background: #282a36;
}

.exercise-editor textarea {
width: 100%;
min-height: 200px;
padding: 1rem;
background: #282a36;
color: #f8f8f2;
border: none;
font-family: ‘Fira Code’, monospace;
font-size: 0.9rem;
line-height: 1.5;
resize: none;
outline: none;
}

.exercise-actions {
display: flex;
gap: 8px;
padding: 12px 16px;
background: #1a1a2e;
border-top: 1px solid #4a4849;
}

.exercise-btn {
display: flex;
align-items: center;
gap: 0.4rem;
padding: 0.5rem 1rem;
border: none;
border-radius: 6px;
font-size: 0.85rem;
font-weight: 600;
cursor: pointer;
transition: all 0.2s;
background: #3d3b3c;
color: #e0e0e0;
}

.exercise-btn:hover {
background: #4d4b4c;
}

.exercise-btn:disabled {
opacity: 0.5;
cursor: not-allowed;
}

.exercise-btn.primary {
background: #72BEFA;
color: #1e1e2e;
}

.exercise-btn.primary:hover {
background: #5aa8e8;
}

.exercise-btn.primary:disabled {
background: #666;
}

.exercise-output-section {
border-top: 1px solid #4a4849;
background: #1e1e2e;
}

.exercise-output-header {
padding: 0.5rem 1rem;
background: #252535;
border-bottom: 1px solid #4a4849;
}

.exercise-output-label {
color: #888;
font-size: 0.75rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.5px;
}

.exercise-output {
padding: 1rem;
font-family: ‘Fira Code’, monospace;
font-size: 0.9rem;
line-height: 1.5;
color: #f8f8f2;
white-space: pre-wrap;
max-height: 200px;
overflow-y: auto;
}

.exercise-output.error { color: #ff6b6b; }
.exercise-output.loading { color: #72BEFA; }
.exercise-output.success { color: #72FCDB; }

.exercise-result {
padding: 1rem;
margin: 0;
font-weight: 600;
text-align: center;
}

.exercise-result.success {
background: rgba(114, 252, 219, 0.1);
color: #72FCDB;
border-top: 2px solid #72FCDB;
}

.exercise-result.failure {
background: rgba(255, 107, 107, 0.1);
color: #ff6b6b;
border-top: 2px solid #ff6b6b;
}

/* Navigation buttons */
.lesson-nav {
display: flex;
justify-content: space-between;
margin-top: 3rem;
padding-top: 2rem;
border-top: 1px solid #4a4849;
}

.lesson-nav-btn {
display: flex;
align-items: center;
gap: 0.5rem;
padding: 0.75rem 1.5rem;
background: #3d3b3c;
color: #fff;
border: none;
border-radius: 8px;
font-size: 0.9rem;
cursor: pointer;
transition: all 0.2s;
}

.lesson-nav-btn:hover {
background: #4a4849;
}

.lesson-nav-btn.primary {
background: #72BEFA;
color: #2F2D2E;
}

.lesson-nav-btn.primary:hover {
background: #5aa8e8;
}

/* Completion modal */
.completion-overlay {
display: none;
position: fixed;
inset: 0;
background: rgba(0, 0, 0, 0.7);
z-index: 1000;
align-items: center;
justify-content: center;
padding: 1rem;
}

.completion-modal {
background: #2F2D2E;
border: 1px solid #4a4849;
border-radius: 16px;
max-width: 520px;
width: 100%;
padding: 2.5rem;
text-align: center;
position: relative;
}

.completion-modal-close {
position: absolute;
top: 1rem;
right: 1rem;
background: none;
border: none;
color: #999;
font-size: 1.25rem;
cursor: pointer;
padding: 0.25rem;
line-height: 1;
}

.completion-modal-close:hover {
color: #fff;
}

.completion-modal h2 {
color: #72BEFA;
font-size: 1.5rem;
margin-bottom: 0.5rem;
}

.completion-modal p {
color: #ccc;
margin-bottom: 1.5rem;
font-size: 0.95rem;
line-height: 1.5;
}

.completion-courses {
display: flex;
flex-direction: column;
gap: 0.75rem;
margin-bottom: 1.5rem;
}

.completion-course-card {
display: block;
background: #3d3b3c;
border: 1px solid #4a4849;
border-radius: 10px;
padding: 1rem 1.25rem;
text-decoration: none;
text-align: left;
transition: border-color 0.2s;
}

.completion-course-card:hover {
border-color: #72BEFA;
}

.completion-course-card .card-title {
color: #72BEFA;
font-size: 0.95rem;
font-weight: 600;
margin-bottom: 0.25rem;
}

.completion-course-card .card-desc {
color: #999;
font-size: 0.8rem;
}

.completion-browse {
display: inline-block;
color: #E583B6;
font-size: 0.9rem;
text-decoration: none;
}

.completion-browse:hover {
text-decoration: underline;
}

/* Responsive */
@media (max-width: 768px) {
.course-sidebar {
width: 100%;
position: relative;
height: auto;
}

.course-content {
margin-left: 0;
padding: 1.5rem;
}

.course-layout {
flex-direction: column;
}
}

DuckDB for Data Scientists

Course outline

Getting Started: What is DuckDB?, Installation, Zero Configuration
Working with DataFrames: Integrate Seamlessly with pandas and Polars, Memory Efficiency, Out-of-Core Processing, Fast Performance
SQL Syntax Shortcuts: FROM-First Syntax, GROUP BY ALL, SELECT * EXCLUDE, SELECT * REPLACE
File Operations: Streamlined File Reading, Query Cloud Storage, Automatic Parsing of CSV Files, Automatic Flattening of Nested Parquet Files, Automatic Flattening of Nested JSON Files, Reading Multiple Files, Hive Partitioned Datasets, Exporting Data
Working with Complex Types: Creating Lists, Structs, and Maps; Manipulating Nested Data
Advanced Features: Parameterized Queries, ACID Transactions, Attach External Databases
Summary: Key Takeaways

What is DuckDB?
DuckDB is a fast, in-process SQL OLAP database optimized for analytics. Unlike traditional databases like PostgreSQL or MySQL that require server setup and maintenance, DuckDB runs directly in your Python process.

It’s perfect for data scientists because:

Zero Configuration: No database server setup required
Memory Efficiency: Out-of-core processing for datasets larger than RAM
Familiar Interface: SQL syntax with shortcuts like GROUP BY ALL
Performance: Columnar-vectorized engine that often outperforms pandas on aggregations and joins
Universal Access: Query files, cloud storage, and external databases


Installation
Install DuckDB with pip:

pip install duckdb

Let’s verify the installation:

```python
import duckdb

print(f"DuckDB version: {duckdb.__version__}")
print("Installation successful!")
```

Zero Configuration
SQL operations on DataFrames typically require setting up database servers. With pandas and PostgreSQL, you need to:

Install and configure a database server
Ensure the service is running
Set up credentials and connections
Write the DataFrame to a table first

```python
# Traditional approach with pandas + PostgreSQL
import pandas as pd
from sqlalchemy import create_engine

sales = pd.DataFrame({
    "product": ["A", "B", "C"],
    "amount": [100, 150, 200]
})

# Requires server setup, credentials, running service...
engine = create_engine("postgresql://user:pass@localhost:5432/db")
sales.to_sql("sales", engine, if_exists="replace")

with engine.connect() as conn:
    result = pd.read_sql("SELECT * FROM sales", conn)
```

DuckDB eliminates this overhead. Query DataFrames directly with SQL:

```python
import duckdb
import pandas as pd

sales = pd.DataFrame({
    "product": ["A", "B", "C"],
    "amount": [100, 150, 200]
})

# No server needed - query DataFrame directly!
result = duckdb.sql("SELECT * FROM sales").df()
print(result)
```

💡 What the output shows
Notice how the query returns results instantly. There’s no connection string, no server startup time, and no authentication steps.

Try it

Edit the query to select items with quantity greater than 30 from the inventory DataFrame:

```python
import duckdb
import pandas as pd

inventory = pd.DataFrame({
    "item": ["Chair", "Desk", "Lamp"],
    "quantity": [50, 20, 100]
})

# Edit this query to filter for quantity > 30
result = duckdb.sql("SELECT * FROM inventory").df()
print(result)
```

💡 Solution
```python
result = duckdb.sql("SELECT * FROM inventory WHERE quantity > 30").df()
```

Quiz

In the code above, how does DuckDB access the sales DataFrame?

A. It automatically detects Python variables and makes them queryable
B. You must register the DataFrame with duckdb.register() first
C. The DataFrame must be saved to disk before querying

💡 A is correct. DuckDB scans your Python namespace and makes DataFrames available as SQL tables automatically.

⚠ B: Not quite. Look at the code above. There’s no duckdb.register() call before the SQL query runs.

⚠ C: Not quite. The DataFrame stays in memory. There’s no file saving step before the SQL query runs.


Integrate Seamlessly with pandas and Polars
Have you ever wanted to leverage SQL’s power while working with your favorite data manipulation libraries such as pandas and Polars?

DuckDB makes it seamless to query pandas and Polars DataFrames via the duckdb.sql function.

```python
import duckdb
import pandas as pd
import polars as pl

pd_df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

pl_df = pl.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

print("Query pandas DataFrame:")
print(duckdb.sql("SELECT * FROM pd_df").df())

print("\nQuery Polars DataFrame:")
print(duckdb.sql("SELECT * FROM pl_df").df())
```

💡 What the output shows
DuckDB recognized both pd_df (pandas) and pl_df (Polars) as DataFrame variables and queried them directly with SQL. No registration step was needed.

DuckDB’s integration with pandas and Polars lets you combine the strengths of each tool. For example, you can:

Use pandas for data cleaning and feature engineering
Use DuckDB for complex aggregations and other SQL-heavy queries

```python
import pandas as pd
import duckdb

# Create sales data
sales = pd.DataFrame({
    "product": ["A", "B", "C", "A", "B", "C"] * 2,
    "region": ["North", "South"] * 6,
    "amount": [100, 150, 200, 120, 180, 220, 110, 160, 210, 130, 170, 230],
    "date": pd.date_range("2024-01-01", periods=12)
})

# Use pandas for feature engineering
sales['month'] = sales['date'].dt.month
sales['is_high_value'] = sales['amount'] > 150
print("Sales after feature engineering:")
print(sales.head())
```

💡 What the output shows
pandas makes feature engineering straightforward: extracting month from dates and creating is_high_value flags are common transformations for preparing data for analysis or machine learning.

Now use DuckDB for complex aggregations:

```python
# Use DuckDB for complex aggregations
analysis = duckdb.sql("""
    SELECT
        product,
        region,
        COUNT(*) as total_sales,
        AVG(amount) as avg_amount,
        SUM(CASE WHEN is_high_value THEN 1 ELSE 0 END) as high_value_sales
    FROM sales
    GROUP BY product, region
    ORDER BY avg_amount DESC
""").df()

print("Sales analysis by product and region:")
print(analysis)
```

💡 What the output shows
DuckDB excels at complex aggregations: combining GROUP BY, AVG, and conditional CASE WHEN in a single query is more readable and efficient than equivalent pandas code.
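
To make the readability comparison concrete, here is one way the same aggregation might be written in pandas. It is a sketch of an equivalent rather than code from the lesson; the variable name `pandas_analysis` is an assumption, and the data is the same sales table used above.

```python
import pandas as pd

sales = pd.DataFrame({
    "product": ["A", "B", "C", "A", "B", "C"] * 2,
    "region": ["North", "South"] * 6,
    "amount": [100, 150, 200, 120, 180, 220, 110, 160, 210, 130, 170, 230],
})
sales["is_high_value"] = sales["amount"] > 150

# Group, aggregate, and sort as three chained method calls
pandas_analysis = (
    sales.groupby(["product", "region"], as_index=False)
    .agg(
        total_sales=("amount", "count"),          # rows per group
        avg_amount=("amount", "mean"),            # average sale
        high_value_sales=("is_high_value", "sum"),  # True counts as 1
    )
    .sort_values("avg_amount", ascending=False)
)

print(pandas_analysis)
```

Both versions produce one row per product and region pair; which one reads better is partly a matter of how comfortable you are with SQL versus method chaining.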

Try it

Edit the query to combine results from both df_2023 and df_2024 using UNION ALL:

```python
import duckdb
import pandas as pd

df_2023 = pd.DataFrame({"year": [2023, 2023], "sales": [100, 150]})
df_2024 = pd.DataFrame({"year": [2024, 2024], "sales": [200, 250]})

# Edit to combine both DataFrames with UNION ALL
result = duckdb.sql("SELECT * FROM df_2023").df()
print(result)
```

💡 Solution
```python
result = duckdb.sql("SELECT * FROM df_2023 UNION ALL SELECT * FROM df_2024").df()
```

Quiz

What makes DuckDB’s approach to complex aggregations more readable than pandas?

A. All operations are expressed in a single, declarative query
B. DuckDB uses shorter function names
C. DuckDB automatically formats the output

💡 A is correct. SQL lets you express GROUP BY, aggregates, and sorting in one cohesive statement, while pandas requires chaining multiple methods.

⚠ B: Not quite. Function name length isn’t the key difference. Think about how operations are structured.

⚠ C: Not quite. Output formatting isn’t what makes DuckDB’s approach more readable. Look at how the query combines multiple operations.


Memory Efficiency
pandas reads an entire dataset into RAM before it can filter, which can cause out-of-memory errors on large files. DuckDB applies the filter while it scans the file, so only the matching rows are kept in memory. To see this in action, let’s compare both approaches on the same dataset.

First, create a sample CSV file:

```python
import pandas as pd

# Create sample data and save to CSV
customers = pd.DataFrame({
    "id": range(1000),
    "name": [f"Customer_{i}" for i in range(1000)],
    "region": ["North", "South", "East", "West"] * 250
})
customers.to_csv("customers.csv", index=False)
print(f"Created customers.csv with {len(customers)} rows")
```

With pandas, filtering loads ALL records into RAM first:

```python
import pandas as pd

# Read entire CSV into memory, then filter
df = pd.read_csv("customers.csv")
result = df[df["region"] == "North"]
print(f"Loaded {len(df)} rows to get {len(result)} matches")
```

With DuckDB, the filter runs while the file is scanned, so only matching rows are kept in the result:

```python
import duckdb

# Stream from file, filter during read
result = duckdb.sql("""
    SELECT *
    FROM 'customers.csv'
    WHERE region = 'North'
""").df()
print(f"Returned {len(result)} rows without loading full file")
```

The diagram below illustrates the memory difference at a larger scale (for example, a 1M-row file with 10K matching rows):

RAM Usage
│ ████████████ Pandas (loads all 1M rows)
│ ██ DuckDB (streams, keeps 10K matches)
└──────────────────────────────────────────────

Quiz

What’s the key difference between how pandas and DuckDB handle the filter region = 'North'?

A. Pandas loads all rows first then filters; DuckDB filters while scanning and keeps only matching rows
B. Pandas uses more CPU; DuckDB uses more RAM
C. Pandas filters in Python; DuckDB filters in C++

💡 A is correct. pandas must load the entire file into a DataFrame before applying any filter. DuckDB evaluates the WHERE clause during the scan, so non-matching rows are discarded as they are read instead of being held in memory.

⚠ B: Not quite. The difference isn’t about CPU vs RAM usage. Think about when filtering happens relative to data loading.

⚠ C: Not quite. While implementation languages differ, the key difference is the order of operations: load-then-filter vs filter-while-loading.


Fast Performance
While pandas executes most operations eagerly on a single thread, DuckDB uses a columnar-vectorized execution engine that processes data in chunks spread across multiple threads. The simplified diagram below contrasts sequential processing with DuckDB’s chunked, parallel approach:

Pandas                        DuckDB
│                             │
├─ Row 1 ──────> process      ├─ Chunk 1 (2048 rows) ─┐
├─ Row 2 ──────> process      ├─ Chunk 2 (2048 rows) ─┼─> process
├─ Row 3 ──────> process      ├─ Chunk 3 (2048 rows) ─┘
├─ Row 4 ──────> process      │
│  …                          │
▼                             ▼
Sequential                    Parallel chunks

This architectural difference enables DuckDB to significantly outperform pandas, especially for computationally intensive operations like aggregations and joins.

Let’s compare the performance of pandas and DuckDB for aggregations on a million rows of data.

```python
import time

import duckdb

# Assumes `customers` is a pandas DataFrame with `region` and `segment` columns
# that was created in an earlier step.

# Pandas aggregation
start_time = time.time()
pandas_agg = customers.groupby(['region', 'segment']).size().reset_index(name='count')
pandas_time = time.time() - start_time

# DuckDB aggregation
start_time = time.time()
duckdb_agg = duckdb.sql("""
    SELECT region, segment, COUNT(*) as count FROM customers GROUP BY region, segment
""").df()
duckdb_time = time.time() - start_time

print(f"Pandas aggregation time: {pandas_time:.2f} seconds")
print(f"DuckDB aggregation time: {duckdb_time:.2f} seconds")
print(f"Speedup: {pandas_time/duckdb_time:.1f}x")
```

💡 What the output shows
DuckDB completes the same aggregation ~8x faster than pandas. The speedup comes from DuckDB’s columnar-vectorized execution engine processing data in parallel chunks.

📝 Note
The benchmark above was run on native Python. Results may vary in browser-based environments.

Quiz

How does pandas process data differently from DuckDB?

A. Pandas processes rows sequentially; DuckDB processes chunks in parallel
B. Pandas uses disk storage; DuckDB uses only RAM
C. Pandas compiles queries; DuckDB interprets them

💡 A is correct. pandas runs the aggregation sequentially on a single thread, while DuckDB’s columnar-vectorized engine processes chunks of rows in parallel, enabling significant speedups for operations like GROUP BY.

⚠ B: Not quite. Both can work with in-memory data. The difference is in execution strategy, not storage location.

⚠ C: Not quite. This is reversed. DuckDB plans and optimizes each query before running it, while pandas executes each method call eagerly as written.


FROM-First Syntax
Traditional SQL requires SELECT before FROM. This adds unnecessary boilerplate when you just want a quick look at your data:

```python
import duckdb
import pandas as pd

sales = pd.DataFrame({
    "product": ["A", "B", "C", "A", "B"],
    "region": ["North", "South", "North", "South", "North"],
    "amount": [100, 200, 150, 120, 180]
})

# Traditional SQL
result = duckdb.sql("SELECT * FROM sales").df()
print(result)
```

DuckDB lets you skip SELECT * entirely, making quick data exploration faster:

```python
# DuckDB: FROM-first (SELECT * is implied)
result = duckdb.sql("FROM sales").df()
print(result)
```

💡 What the output shows
Notice the results are the same. This confirms that FROM table automatically selects all columns.

Try it

Write a FROM-first query to get all sales with amount > 150:

```python
import duckdb
import pandas as pd

sales = pd.DataFrame({
    "product": ["A", "B", "C", "A", "B"],
    "region": ["North", "South", "North", "South", "North"],
    "amount": [100, 200, 150, 120, 180]
})

# Write a FROM-first query with WHERE clause
result = duckdb.sql("___").df()
print(result)
```

💡 Solution
result = duckdb.sql("FROM sales WHERE amount > 150").df()

Quiz

What happens when you run FROM sales in DuckDB?

A
Returns only the first row from the sales table

B
Returns all rows and columns from the sales table

C
Returns the table schema without data

⚠ Try Again
Not quite. FROM table returns all rows, not just the first one. To limit rows, you’d use FROM table LIMIT 1.

💡 Correct
Correct! FROM table is shorthand for SELECT * FROM table, returning all rows and all columns.

⚠ Try Again
Not quite. FROM table returns data, not schema. To see the schema, use DESCRIBE table or SUMMARIZE table.

GROUP BY ALL

SELECT * EXCLUDE

SELECT * REPLACE

Streamlined File Reading

Query Cloud Storage

Automatic Parsing of CSV Files

Automatic Flattening of Nested Parquet Files

Automatic Flattening of Nested JSON Files

Reading Multiple Files

Hive Partitioned Datasets

Exporting Data

Creating Lists, Structs, and Maps

Manipulating Nested Data

Parameterized Queries
When working with databases, you often need to run similar queries with different parameters. For instance, you might want to filter a table using various criteria.

First, let’s create a sample products table:

Python

import duckdb

conn = duckdb.connect(":memory:")
conn.sql("""
    CREATE TABLE products (id INT, name VARCHAR, price DECIMAL)
""")
conn.sql("""
    INSERT INTO products VALUES
    (1, 'Laptop', 999.99),
    (2, 'Phone', 699.99),
    (3, 'Tablet', 449.99),
    (4, 'Watch', 299.99)
""")

print(conn.sql("SELECT * FROM products").df())

You might use f-strings to pass parameters to your queries:

Python

min_price = 400
result = conn.sql(
    f"SELECT * FROM products WHERE price > {min_price}"
).df()

print(f"Products over ${min_price}:")
print(result)

⚠ Caution
While this works, building SQL with f-strings is dangerous when the value comes from user input. A malicious user could:

Input "0; DROP TABLE products; --" to delete your table
Input "0 UNION SELECT * FROM secrets" to steal data

DuckDB provides a safer way with parameterized queries using the ? placeholder:

Python

min_price = 400
result = conn.execute(
    "SELECT * FROM products WHERE price > ?",
    (min_price,)
).df()

print(f"Products over ${min_price}:")
print(result)

💡 What the output shows
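DuckDB binds 400 to the ? placeholder separately from parsing the SQL text. Even if min_price contained malicious SQL, it would be treated as a literal value, so injection through that value is not possible. Placeholders only stand in for values, though: table or column names built from user input still need separate validation.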

Try it

Use the ? placeholder to find products under $300:

Python

import duckdb

conn = duckdb.connect(":memory:")
conn.sql("""
    CREATE TABLE products (id INT, name VARCHAR, price DECIMAL)
""")
conn.sql("""
    INSERT INTO products VALUES
    (1, 'Laptop', 999.99),
    (2, 'Phone', 699.99),
    (3, 'Tablet', 449.99),
    (4, 'Watch', 299.99)
""")

max_price = 300
result = conn.execute(
    "SELECT * FROM products WHERE ___",
    ___
).df()
print(result)

💡 Solution
"SELECT * FROM products WHERE price < ?", (max_price,)
DuckDB binds the value from the tuple to the ? placeholder. The trailing comma is required for single-element tuples.
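The trailing comma is easy to forget, so here is a quick plain-Python check of what each spelling produces. The variable names are illustrative only:

```python
max_price = 300

not_a_tuple = (max_price)   # parentheses alone only group: this is the int 300
one_tuple = (max_price,)    # the trailing comma makes a one-element tuple
as_list = [max_price]       # a one-element list needs no trailing comma

print(type(not_a_tuple).__name__, type(one_tuple).__name__, type(as_list).__name__)
```

If the comma feels error-prone, passing a list such as [max_price] as the parameters is a reasonable alternative.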

Quiz

If a malicious user sets min_price = "0; DROP TABLE products", what happens with parameterized queries?

A
DuckDB treats the entire string as a literal value, causing a type error

B
The products table gets deleted

C
DuckDB ignores the input and uses a default value

💡 Correct
Correct! The malicious string is treated as a literal value to compare against price. Since it’s not a valid number, the query fails safely without executing any DROP command.

⚠ Try Again
Not quite. That would happen with f-strings. Parameterized queries prevent the injected SQL from being executed as code.

⚠ Try Again
Not quite. DuckDB doesn’t silently replace bad input. It processes the input as a literal value, which would cause a type mismatch error.

ACID Transactions

Attach External Databases

Key Takeaways


DuckDB for Data Scientists

Work with Khuyen Tran