Wrting Scalable Data Science Code with the Open-Closed Principle

Use the Open-Closed Principle to create easily extendable classes without modifying existing code. This approach reduces the risk of introducing bugs in existing, tested code and allows for easy addition of new features or functionalities.

Without adhering to the Open-Closed Principle, you might frequently modify existing classes to add new features. Consider the following example of a data visualization class:

class DataVisualizer:
    def visualize(self, data, chart_type):
        if chart_type == "bar":
            self.create_bar_chart(data)
        elif chart_type == "line":
            self.create_line_chart(data)
        elif chart_type == "scatter":
            self.create_scatter_plot(data)

    def create_bar_chart(self, data):
        print("Creating bar chart...")

    def create_line_chart(self, data):
        print("Creating line chart...")

    def create_scatter_plot(self, data):
        print("Creating scatter plot...")

In this example, every time you want to add a new chart type, you need to modify the visualize method and add a new method for the chart type. This violates the Open-Closed Principle and can lead to a growing, unwieldy class.

Instead, you can redesign the class to be open for extension but closed for modification:

from abc import ABC, abstractmethod

class Chart(ABC):
    @abstractmethod
    def create(self, data):
        pass

class BarChart(Chart):
    def create(self, data):
        print("Creating bar chart...")

class LineChart(Chart):
    def create(self, data):
        print("Creating line chart...")

class ScatterPlot(Chart):
    def create(self, data):
        print("Creating scatter plot...")

class DataVisualizer:
    def visualize(self, data, chart):
        chart.create(data)

# Usage
visualizer = DataVisualizer()
visualizer.visualize(data, BarChart())
visualizer.visualize(data, LineChart())

In this improved version:

  • We define an abstract Chart class with a create method.
  • Each specific chart type (BarChart, LineChart, ScatterPlot) inherits from Chart and implements its own create method.
  • The DataVisualizer class now takes a Chart object as a parameter, making it closed for modification.

This design allows for adding new chart types by creating classes that inherit from Chart, without modifying the DataVisualizer class, making the code more modular and easier to test.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top