Mastering the Art of Writing Effective Python Functions

Chapter 1: Understanding Python Functions

In Python, much like other contemporary programming languages, functions serve as a fundamental mechanism for abstraction and encapsulation. As a developer, you have likely crafted numerous functions throughout your career. However, it's essential to recognize that not all functions are designed equally. The quality of your functions significantly impacts the clarity and maintainability of your code. So, what constitutes a poorly designed function, and conversely, what defines a well-crafted one?

A Quick Overview of Functions

Mathematics is filled with functions, though they may not be fresh in our memories. Let's revisit a familiar topic: calculus. You might recall equations such as f(x) = 2x + 3. This represents a function, named f, which accepts an input x and yields the result of two times x plus three. While this may not resemble the functions we are accustomed to in Python, it has a direct parallel in the following code:

def f(x):
    return 2 * x + 3

Functions have long been a staple in mathematics, but they possess even greater capabilities in computer science. With this power comes the potential for various pitfalls. Let's delve into the characteristics that define a good function and identify signs that indicate a function may require refactoring.

Keys to Crafting a Good Function

What sets a well-designed Python function apart from a poorly constructed one? You might be surprised by the numerous interpretations of what "good" entails. For our discussion, I will define a good Python function as one that meets most of the following criteria (though not all may be achievable):

Sensibly named
Adheres to a single responsibility
Includes a comprehensive docstring
Returns a value
Remains concise, ideally no longer than 50 lines
Is idempotent and, when feasible, pure

While this list may appear stringent, adhering to these principles will elevate your code's beauty to a level that even unicorns would envy. Below, I will dedicate a section to each of these items, concluding with an explanation of how they synergistically contribute to the creation of good functions.

Naming Conventions

There's a quote I appreciate, often misattributed to Donald Knuth, but originally from Phil Karlton:

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

As trivial as it may seem, naming things appropriately is a challenge. Consider this example of a poorly named function:

def get_knn_from_df(df):

I've encountered poorly named functions in various contexts, but this one originates from Data Science (specifically, Machine Learning). Practitioners often write code in Jupyter notebooks and later attempt to consolidate the various cells into a coherent program.

The first issue with this function's name is its use of acronyms and abbreviations. It's preferable to use full English words rather than abbreviations or obscure acronyms. The only justification for abbreviating is to save time typing, but modern editors typically offer autocomplete features, meaning you'll only need to type the full name once. Abbreviations pose a problem because they are often domain-specific. In the example above, "knn" refers to "K-Nearest Neighbors" and "df" denotes a "DataFrame," a common pandas structure. For a programmer unfamiliar with these terms, the name becomes largely incomprehensible.

There are also two minor issues with this function's name. First, the word "get" is unnecessary: for most well-named functions, it is already evident that a value is being returned. Second, the "from_df" portion is redundant; either the function's docstring or its type annotation should clarify the parameter type if it isn't apparent from the parameter's name.

So, how can we improve this function's name? It's straightforward:

def k_nearest_neighbors(dataframe):

This revision clearly communicates the function's purpose, even to those who may not be experts, while the parameter name "dataframe" indicates the expected argument type.
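To make the point about annotations and docstrings concrete, here is a minimal sketch of the renamed function's signature. The annotation is written as a string so the snippet runs even without pandas installed. The docstring wording and the omitted body are my own assumptions, added purely for illustration.

```python
def k_nearest_neighbors(dataframe: "pandas.DataFrame"):
    """Return the k nearest neighbors for the rows of dataframe.

    Illustrative stub: the body is intentionally left out.
    """
    raise NotImplementedError("signature sketch only")


# The annotation documents the expected type, so the name no longer has to.
print(k_nearest_neighbors.__annotations__["dataframe"])
```

With the type stated in the signature, "from_df" adds nothing the reader cannot already see.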

Single Responsibility Principle

The Single Responsibility Principle, attributed to "Uncle" Bob Martin, applies equally to functions as it does to classes and modules. It asserts that a function should focus on a single task. The rationale is clear: if each function has one responsibility, the only reason for modification arises when the method of accomplishing that task changes. This clarity also facilitates the decision to remove a function when it's no longer necessary.

To illustrate this, consider the following function that handles multiple tasks:

import statistics

def calculate_and_print_stats(list_of_numbers):
    total = sum(list_of_numbers)
    mean = statistics.mean(list_of_numbers)
    median = statistics.median(list_of_numbers)
    mode = statistics.mode(list_of_numbers)
    print('-----------------Stats-----------------')
    print(f'SUM: {total}')
    print(f'MEAN: {mean}')
    print(f'MEDIAN: {median}')
    print(f'MODE: {mode}')

This function violates the principle of single responsibility, as it both computes statistics and prints them to standard output. There are two reasons why this function may need to change: either new statistics need to be calculated, or the output format must be adjusted. It would be more appropriate to separate this into two distinct functions: one for performing calculations and another for displaying the results.

A clear indicator that a function has multiple responsibilities is the presence of "and" in its name.

This separation enables easier testing of each function's behavior, and the two components can reside in different modules if necessary, leading to cleaner testing and simpler maintenance.
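One way the split might look is sketched below. The names calculate_stats and format_stats, and the choice of a dict to carry the results, are my own assumptions rather than the only reasonable design.

```python
import statistics


def calculate_stats(numbers):
    """Return the sum, mean, median, and mode of numbers as a dict."""
    return {
        "sum": sum(numbers),
        "mean": statistics.mean(numbers),
        "median": statistics.median(numbers),
        "mode": statistics.mode(numbers),
    }


def format_stats(stats):
    """Return stats as a printable, multi-line report."""
    lines = ["-----------------Stats-----------------"]
    # One "NAME: value" line per statistic, in insertion order.
    lines.extend(f"{name.upper()}: {value}" for name, value in stats.items())
    return "\n".join(lines)


stats = calculate_stats([1, 2, 2, 3, 7])
print(format_stats(stats))
```

Now a new statistic only touches calculate_stats, and a new layout only touches format_stats. Each can be tested without the other.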

The Importance of Docstrings

While many developers are familiar with PEP 8, Python's style guide, fewer are aware of PEP 257, which covers docstring conventions. Here are the main points to remember:

Every function must have a docstring.
Use proper grammar and punctuation; write in full sentences.
Start with a concise summary of the function's purpose.
Use prescriptive rather than descriptive language: phrase the summary as a command ("Return the total"), not as a description ("Returns the total").

Incorporating docstrings should become second nature when crafting functions. Aim to write them before developing the actual code. If you're unable to articulate a clear docstring, it may indicate that you need to reconsider the function's purpose.
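As a small illustration of those points, here is a docstring written in that style. The function itself (average) is a hypothetical example of mine, not something prescribed by PEP 257.

```python
def average(numbers):
    """Return the arithmetic mean of numbers.

    Raise ValueError if numbers is empty.
    """
    if not numbers:
        raise ValueError("average() requires at least one number")
    return sum(numbers) / len(numbers)


# The docstring is available at runtime, for example to help().
print(average.__doc__)
```

Notice that the summary line is a full sentence, fits on one line, and is phrased as a command.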

Return Values

Functions can be viewed as self-contained mini-programs. They accept inputs through parameters and yield results, although parameters are optional. Return values, however, are mandatory from a Python internals perspective. Even if you attempt to create a function without a return statement, the Python interpreter will default to returning None. You can test this with the following:

def add(a, b):
    print(a + b)

result = add(1, 2)
print(result)  # This will show None

You will find that the value of result is indeed None. Since every function returns something anyway, make that something meaningful. Just as a program that produces no output, not even a sign of whether it ran successfully, is of limited use, so is a function that hands nothing back to its caller. Additionally, returning a value enables method chaining, a practice that allows for more concise and readable code.
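Here is a rough sketch of what chaining buys you. The helper names normalize and shout are hypothetical; the string methods are the standard ones.

```python
def normalize(name):
    """Return name stripped of surrounding whitespace and title-cased."""
    # Each str method returns a new string, so the calls can be chained.
    return name.strip().lower().title()


def shout(text):
    """Return text in upper case with a trailing exclamation mark."""
    return text.upper() + "!"


# Because normalize returns a value, its result feeds directly into shout.
print(shout(normalize("  ada LOVELACE ")))  # prints "ADA LOVELACE!"
```

Had normalize printed its result instead of returning it, the call to shout would have received None and failed.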

Here are some common justifications for functions that don't return values:

"It solely performs I/O tasks, like saving data to a database."
- In this case, the function could still return True to indicate success.
"We modify one of the parameters in place."
- It's best to avoid this approach, as altering a parameter can lead to unexpected behavior. Instead, return a new instance with the changes applied.
"I need to return multiple values; there isn't a single value that makes sense."
- You can use a tuple to return multiple values.

Ultimately, there are compelling reasons to always return a value from a function. Callers are free to disregard it, and this practice rarely disrupts existing codebases.
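Two of those answers can be sketched in a few lines: returning a new object instead of modifying a parameter, and returning a tuple when there are several results. The function names with_discount and bounds are hypothetical, chosen only for this example.

```python
def with_discount(prices, percent):
    """Return a new list of prices reduced by percent.

    The prices argument itself is left unchanged.
    """
    return [price * (100 - percent) / 100 for price in prices]


def bounds(numbers):
    """Return the smallest and largest of numbers as a tuple."""
    return min(numbers), max(numbers)


original = [100, 200]
discounted = with_discount(original, 10)

# Tuple unpacking gives each returned value its own name.
lowest, highest = bounds(discounted)
print(original, discounted, lowest, highest)
```

The caller still holds the untouched original list, so nothing changes behind its back.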

Function Length

As I've mentioned before, I'm not the sharpest tool in the shed; I can only manage a few concepts in my head at once. If I encounter a 200-line function and am asked to explain it, my mind will likely drift after a mere ten seconds. The length of a function directly influences its readability and, consequently, its maintainability. Thus, strive to keep functions concise—ideally no more than 50 lines.

If a function adheres to the Single Responsibility Principle, it will likely be shorter. If it is also pure or idempotent, this will further contribute to brevity. These principles work in tandem to foster clean and effective code.

So, what should you do if a function is excessively long? Refactor it! Refactoring involves restructuring a program without altering its behavior. A common method is to extract lines of code from a lengthy function and create a new function from them. This process not only shortens the original function but also enhances readability through well-named smaller functions.
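The example below is a deliberately tiny sketch of that extraction step; real candidates for refactoring are much longer. All of the function names are my own inventions for illustration.

```python
# Before: one function parses, filters, and averages.
def average_positive_before(raw):
    numbers = []
    for token in raw.split(","):
        token = token.strip()
        if token:
            numbers.append(float(token))
    positives = [n for n in numbers if n > 0]
    return sum(positives) / len(positives)


# After: the parsing and averaging steps are extracted and named.
def parse_numbers(raw):
    """Return the comma-separated numbers in raw as a list of floats."""
    return [float(token) for token in raw.split(",") if token.strip()]


def average(numbers):
    """Return the arithmetic mean of numbers."""
    return sum(numbers) / len(numbers)


def average_positive(raw):
    """Return the mean of the positive numbers in raw."""
    positives = [n for n in parse_numbers(raw) if n > 0]
    return average(positives)


# Refactoring must not change behavior: both versions agree.
print(average_positive_before("1, -2, 3,") == average_positive("1, -2, 3,"))
```

The refactored average_positive now reads almost like its own docstring, and the helpers can be reused and tested separately.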

Idempotency and Functional Purity

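While the terms may sound daunting, the concepts themselves are straightforward. As used in this article, an idempotent function always returns the same output for the same set of inputs, no matter how many times it is invoked. (This property is more commonly called determinism; in the strict mathematical sense, an operation is idempotent when applying it repeatedly has the same effect as applying it once.) Its return value does not depend on non-local variables, on the mutability of its arguments, or on data read from I/O streams. For instance, consider the following function, which is idempotent in this sense: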

def add_three(number):
    """Return number + 3."""
    return number + 3

No matter how many times you call add_three(7), the output will always be 10. In contrast, the following function is not idempotent:

def add_three():
    """Return 3 plus the user-entered number."""
    number = int(input('Enter a number: '))
    return number + 3

This example's return value hinges on user input, making it non-idempotent.

A real-world analogy for idempotency could be pressing the "up" button for an elevator. The first press signals that you wish to go up, and subsequent presses have no additional effect—the outcome remains the same.

Why is idempotency significant? For testability and maintainability. Idempotent functions simplify testing because they guarantee consistent outputs for identical inputs. Testing becomes a matter of confirming that the function returns the expected result. This reliability also contributes to faster tests, an often-overlooked aspect of unit testing. Furthermore, refactoring becomes straightforward, as the function's output remains unchanged when called with the same arguments, regardless of any external code modifications.

What constitutes a "pure" function?

In functional programming, a function is deemed pure if it is both idempotent and devoid of observable side effects. Remember, an idempotent function consistently returns the same output for a specific input set. However, this does not imply that the function cannot interact with external variables or I/O. For example, if the idempotent add_three(number) function also printed the result before returning it, it remains idempotent; the print operation merely serves as a side effect, not impacting the returned value.

Let's take our add_three(number) example one step further:

add_three_calls = 0

def add_three(number):
    """Return number + 3."""
    global add_three_calls
    print(f'Returning {number + 3}')
    add_three_calls += 1
    return number + 3

def num_calls():
    """Return the number of times add_three was invoked."""
    return add_three_calls

Although we print to the console and modify a global variable, the function remains idempotent as these side effects do not influence the return value.

A pure function has no side effects whatsoever. It does not rely on external data to compute its output and avoids any interactions with the broader system or program beyond simply computing and returning a value. While our revised add_three(number) function is still idempotent, it is no longer pure.

Pure functions refrain from logging or printing, avoid database or network connections, and do not modify non-local variables or invoke non-pure functions. In essence, they do not engage in what Einstein referred to as "spooky action at a distance."

Pure functions are the most reliable functions you can write in Python, being highly testable and maintainable. Testing them is swift and uncomplicated, as they require no mocking of external resources, no setup, and no cleanup.
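To show how little ceremony such a test needs, here is a sketch for the original, pure add_three. The test function name follows the pytest naming convention, but the snippet runs with plain Python as well.

```python
def add_three(number):
    """Return number + 3."""
    return number + 3


def test_add_three():
    # No mocks, fixtures, or cleanup: just call and compare.
    assert add_three(7) == 10
    assert add_three(-3) == 0


test_add_three()
```

Compare that with testing the version that reads from input(), which would first require faking the keyboard.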

To be clear, idempotency and purity are aspirations rather than strict requirements. The aim is to structure code so that side effects and external dependencies are kept to a minimum, which makes testing easier even if not every function ends up pure or idempotent.
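One common way to pursue that aim, sketched here under my own assumptions, is to keep the logic pure and push the I/O out to a thin function at the edge of the program. The names parse_number and main are hypothetical.

```python
def add_three(number):
    """Return number + 3."""
    return number + 3


def parse_number(text):
    """Return the integer written in text."""
    return int(text)


def main():
    """Read a number from the user and print it plus three."""
    # The only impure code lives here, at the edge of the program.
    number = parse_number(input('Enter a number: '))
    print(add_three(number))


# The pure parts can be exercised without any user interaction.
print(add_three(parse_number('7')))  # prints 10
```

Calling main() runs the interactive version; the tests never need to.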

In Summary

In conclusion, the key to writing effective functions is straightforward: adhere to established best practices and guidelines. I hope this article has proven useful. Now, go out there and share the knowledge! Let's strive to write exceptional code consistently, or at the very least, commit to reducing the amount of "bad" code we contribute to the world.
