January 12, 2024
Python Generators Are Underutilized
One of my favorite features in the Python ecosystem is one I don’t often see used:
Generators
Introduction
Python generators produce values lazily: each time execution reaches a yield statement, a single value is handed back to the caller. Unlike a function that builds and returns an entire list, a generator yields its values one at a time, on demand. Consider the following code snippet:
def square_vals(x: list):
    return [val * val for val in x]

nums_to_square = list(range(500))
print(square_vals(nums_to_square))
This will yield the following output:
[0, 1, 4, 9, 16, 25, 36, ...]
Let’s bring in a utility library called tracemalloc. This is a Python built-in library with some very simple utilities for tracking memory allocations in Python programs. Let’s modify our program to track the amount of memory allocated over its entire run:
import pprint
import tracemalloc

def square_vals(x: list):
    to_return = [val * val for val in x]
    return to_return

tracemalloc.start()
nums_to_square = list(range(500))
squares = square_vals(nums_to_square)
initial_mem, peak_mem = tracemalloc.get_traced_memory()
snapshot = tracemalloc.take_snapshot()
pprint.pprint(snapshot.statistics('lineno'))
pprint.pprint("Peak usage:" + str(peak_mem))
tracemalloc.stop()
This will output:
[<Statistic traceback=<Traceback (<Frame filename='<filepath>' lineno=2>,)> size=19616 count=484>,
 <Statistic traceback=<Traceback (<Frame filename='<filepath>' lineno=5>,)> size=11832 count=245>,
 <Statistic traceback=<Traceback (<Frame filename='<filepath>' lineno=7>,)> size=64 count=2>]
'Peak usage:31648'
Here we’re seeing the amount of memory allocated at each line, as well as the peak memory. So in this code block, we’re performing the following allocations:
- to_return = [val * val for val in x] allocates 19,616 bytes
- squares = square_vals(nums_to_square) allocates 11,832 bytes
- initial_mem, peak_mem = tracemalloc.get_traced_memory() allocates 64 bytes
In total, we utilized a maximum of 31,648 bytes in this program.
Now to add a generator.
Consider the following code:
import pprint
import tracemalloc

def gen_square_vals(x: list):
    for val in x:
        yield val * val

tracemalloc.start()
nums_to_square = list(range(500))
squares = gen_square_vals(nums_to_square)
initial_mem, peak_mem = tracemalloc.get_traced_memory()
snapshot = tracemalloc.take_snapshot()
pprint.pprint(snapshot.statistics('lineno'))
pprint.pprint("Peak usage:" + str(peak_mem))
tracemalloc.stop()
This will output:
[<Statistic traceback=<Traceback (<Frame filename='<filepath>' lineno=7>,)> size=11932 count=484>,
 <Statistic traceback=<Traceback (<Frame filename='<filepath>' lineno=8>,)> size=208 count=245>,
 <Statistic traceback=<Traceback (<Frame filename='<filepath>' lineno=9>,)> size=64 count=2>]
'Peak usage:12040'
This is interesting! Note that, compared to our first example, the line that provides the squared numbers, yield val * val, does not lead to any memory allocation, whereas to_return = [val * val for val in x] does. Instead, in the generator version, we allocate just 208 bytes on the line that calls gen_square_vals(). This is the beauty of a Python generator!
So what’s happening here?
In our first example, when we call square_vals(), we build the entire list of squared values in memory and return that block of memory from the function. When we turn this into a generator, calling gen_square_vals() instead creates a generator object. This generator object produces values on the fly; that is, we won’t allocate any of the squared values until we’re ready to use them.
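To make this concrete, here is a small standalone sketch, separate from the tracemalloc example, showing that calling the generator function only creates a generator object and that each value is computed on request:

def gen_square_vals(x):
    for val in x:
        yield val * val

squares = gen_square_vals(range(5))
print(squares)        # <generator object gen_square_vals at 0x...> -- no squares computed yet
print(next(squares))  # 0 -- the first square is computed only now
print(next(squares))  # 1 -- execution resumes at the yield and advances one step
print(list(squares))  # [4, 9, 16] -- consuming the rest exhausts the generator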
If we go back to our code block, we can see this in action by adding a line that consumes the generator:
import pprint
import tracemalloc

def gen_square_vals(x: list):
    for val in x:
        yield val * val

tracemalloc.start()
nums_to_square = list(range(500))
squares = gen_square_vals(nums_to_square)
gen_to_list = [square for square in squares]
initial_mem, peak_mem = tracemalloc.get_traced_memory()
snapshot = tracemalloc.take_snapshot()
pprint.pprint(snapshot.statistics('lineno'))
pprint.pprint("Peak usage:" + str(peak_mem))
tracemalloc.stop()
This block simply drains our generator into a list called gen_to_list. When we run this, we get additional output:
[<Statistic traceback=<Traceback (<Frame filename='<file_path>' lineno=5>,)> size=15456 count=483>,
 <Statistic traceback=<Traceback (<Frame filename='<file_path>' lineno=2>,)> size=11832 count=245>,
 <Statistic traceback=<Traceback (<Frame filename='<file_path>' lineno=7>,)> size=4160 count=1>,
 <Statistic traceback=<Traceback (<Frame filename='<file_path>' lineno=6>,)> size=208 count=1>,
 <Statistic traceback=<Traceback (<Frame filename='<file_path>' lineno=8>,)> size=64 count=2>]
'Peak usage:31856'
We’ve managed to get our peak memory usage back up! Notice that the line that generates the values is, once again, allocating 11,832 bytes now that we drain all of our values into a list. This reveals the underlying nature of the generator: it exists as a small, lightweight object that produces values one by one, and the memory cost only appears once those values are actually materialized.
Scaling Up
So what’s the utility here? Well, that becomes more obvious when we change things around a little. Let’s modify our code to generate 100,000 numbers instead of 500. Additionally, I’m going to change the line:
nums_to_square = list(range(100000))
to:
nums_to_square = range(100000)
in our generator example. (The range built-in in Python acts a lot like a generator in that it produces its values lazily, but it is actually a lazy sequence: it supports len(), indexing, and repeated iteration, which a generator does not.)
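To make that distinction concrete, here is a quick sketch, separate from the benchmark code:

r = range(5)
print(len(r), r[3])      # 5 3 -- a range knows its length and supports indexing
print(list(r), list(r))  # [0, 1, 2, 3, 4] [0, 1, 2, 3, 4] -- it can be iterated repeatedly

g = (n * n for n in range(5))  # a generator expression, for comparison
print(list(g))  # [0, 1, 4, 9, 16]
print(list(g))  # [] -- a generator is exhausted after a single pass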
Running our example using lists gives us the following benchmarks:
[<Statistic traceback=<Traceback (<Frame filename='<file_path>' lineno=6>,)> size=4000384 count=99984>,
 <Statistic traceback=<Traceback (<Frame filename='<file_path>' lineno=14>,)> size=3991832 count=99745>,
 <Statistic traceback=<Traceback (<Frame filename='<file_path>' lineno=16>,)> size=64 count=2>]
'Peak usage:7992416'
7,992,416 is a lot of bytes for not doing a lot of work; we’re already at roughly 8 MB of memory usage. Let’s instead look at the generator implementation:
[<Statistic traceback=<Traceback (<Frame filename='<file_path>' lineno=8>,)> size=208 count=1>,
 <Statistic traceback=<Traceback (<Frame filename='<file_path>' lineno=7>,)> size=80 count=2>,
 <Statistic traceback=<Traceback (<Frame filename='<file_path>' lineno=9>,)> size=64 count=2>]
'Peak usage:288'
288 bytes? Okay, but we’re not doing anything with those values yet. Let’s actually do something with the generator:
import pprint
import tracemalloc

def gen_square_vals(x: list):
    for val in x:
        yield val * val

tracemalloc.start()
nums_to_square = range(100000)
squares = gen_square_vals(nums_to_square)
for square in squares:
    print(square)
initial_mem, peak_mem = tracemalloc.get_traced_memory()
snapshot = tracemalloc.take_snapshot()
pprint.pprint(snapshot.statistics('lineno'))
pprint.pprint("Peak usage:" + str(peak_mem))
tracemalloc.stop()
This prints:
[<Statistic traceback=<Traceback (<Frame filename='<file_path> lineno=8>,)> size=208 count=1>,
<Statistic traceback=<Traceback (<Frame filename='<file_path> lineno=7>,)> size=80 count=2>,
<Statistic traceback=<Traceback (<Frame filename='<file_path> lineno=11>,)> size=64 count=2>,
<Statistic traceback=<Traceback (<Frame filename='<file_path> lineno=3>,)> size=32 count=1>]
'Peak usage:784'
That’s still tiny! Since each square is printed and then discarded, only one value ever needs to live in memory at a time.
Why?
This is obviously just a toy example that prints squares to the console, but generators are exceptionally useful in many real applications. Python is often used as a glue language for ETL pipelines and batch jobs, and many engineers implement these by building massive list objects inside their application. Consider a simple JSON record that we parse into a Python object:
{
    "user_id": 123456789,
    "username": "john_doe",
    "full_name": "John Doe",
    "email": "john.doe@example.com",
    "age": 30,
    "gender": "male",
    "country": "United States",
    "city": "New York",
    "occupation": "Software Engineer",
    "education": "Bachelor's Degree",
    "interests": ["programming", "hiking", "reading"],
    "languages": ["English", "Spanish"],
    "last_login": "2024-02-22T08:30:00Z",
    "subscription": {
        "type": "premium",
        "start_date": "2023-01-01",
        "end_date": "2024-12-31"
    }
}
If we’re ingesting hundreds of thousands of these objects, storing them all in memory gets expensive quickly. Generators let us avoid creating all of these objects in memory at once, which can be invaluable for reducing costs and improving performance in memory-limited cloud compute environments.
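As a rough sketch of what this can look like in practice, here is a hypothetical pipeline that streams records like the one above from a newline-delimited JSON file. The file name, field access, and filtering step are illustrative assumptions, not part of the original example:

import json

def read_users(path):
    # Yield one parsed record at a time instead of loading the whole file into a list.
    with open(path) as f:
        for line in f:
            yield json.loads(line)

def premium_emails(users):
    # Chain a second generator on top to filter and transform records lazily.
    for user in users:
        if user.get("subscription", {}).get("type") == "premium":
            yield user["email"]

# Only one record is held in memory at a time, no matter how large the input file is.
for email in premium_emails(read_users("users.jsonl")):
    print(email)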
Gotchas
Although convenient, generators do have some limitations that must be considered before using them:
Generators can’t be rewound or peeked without additional code
Once a value has been generated, it cannot be regenerated from the same generator. Likewise, there is no way to peek at the next value of a generator without also consuming it, short of wrapping the generator in a bit of extra code like the sketch below.
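For example, a minimal peek helper, sketched here as an illustration rather than a standard library feature, can hand back the next value along with an iterator that still includes it by using itertools.chain:

from itertools import chain

def peek(gen):
    # Consume one value, then return it together with an iterator that replays it.
    first = next(gen)
    return first, chain([first], gen)

squares = (n * n for n in range(5))
first, squares = peek(squares)
print(first)          # 0
print(list(squares))  # [0, 1, 4, 9, 16] -- nothing was lost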
Generators can be tricky to debug
Line-by-line debuggers often struggle to inspect the underlying object, since a generator is unrewindable by nature: displaying a value means consuming it. I am not currently aware of a Python visual debugging tool that allows inspection of generated values without also consuming them. This can be worked around by implementing custom code on top of generators.
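One such workaround, sketched here as an illustration rather than a replacement for a real debugger, is a wrapper generator that records each value as it passes through so that the history can be inspected after the fact:

def traced(gen, history):
    # Yield values unchanged while appending each one to a list for later inspection.
    for value in gen:
        history.append(value)
        yield value

seen = []
squares = traced((n * n for n in range(5)), seen)
print(sum(squares))  # 30 -- consume the generator as usual
print(seen)          # [0, 1, 4, 9, 16] -- every value that flowed through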
Generators can increase the mental load of understanding code
The underutilization of generators means that they are often difficult to understand for developers with limited Python experience. It can take some time to develop the mental abstraction of “this effectively becomes a list when iterated over,” and the apparent out-of-order execution can be hard to parse at first. Take the example:
def do_hello():
    print("hello")
    yield 5
    print("world")

hi = do_hello()
print("goodbye")
for test in hi:
    pass
Without a solid understanding of generators, it can be confusing to work out the order of output in this script: nothing inside do_hello() runs until the for loop starts consuming the generator, so “goodbye” prints first, followed by “hello” and then “world”.
December 14, 2023
Click Ops Is A Recipe For Disaster
Click-Ops
Developer operations performed by clicking through a cloud provider’s UI and manually entering data
Anyone who has used AWS, GCP, or any other cloud provider knows the massive amount of configuration that goes into setting up resources in these environments. Most major cloud providers offer a web-based user interface on top of cloud resources for configuration.
These consoles can be great tools for learning to use these services, but they often become a hindrance in more complex environments. Here’s a list of reasons to kick your click-ops habit:
You ARE going to do that more than once
Without fail, one of the go-to excuses for performing Click-Ops actions within a cloud provider is:
“Writing a script will take more time, I’m only going to do this once!”
I would challenge that assumption. In most engineering environments, it’s reasonable to expect that any action performed in a production environment would first need testing in a non-production environment. Assuming that you are performing work that will eventually need to exist in prod, you can help yourself by automating the process and ensuring the actions performed are identical across environments.
Additionally, the biggest contributor to my understanding of IaC scripting and configuration was requiring myself to write scripts.
Your cloud provider’s configuration set is bigger than human working memory
Human working memory is a concept that models the brain’s ability to store information temporarily. If you are memorizing an AWS configuration variable for a moment, you are using working memory. This poses the following problems:
- Working memory is limited in size (most untrained individuals can consistently hold only around 7±2 items before becoming inconsistent)
- Very few AWS components have fewer than 12 configuration properties
- Moving information from working memory to long-term memory is not easy
Pure text configuration is easier to read and validate
Cloud-provider UIs are not designed for collaborative work; the only way to properly validate a configuration through a cloud provider’s user interface is to share your screen with another engineer. Written scripts create a durable, auditable, verifiable record of the actions that were taken.
This:
resource "aws_route53_record" "www" {
zone_id = aws_route53_zone.primary.zone_id
name = "www.example.com"
type = "A"
ttl = 300
records = [aws_e
Will always be easier to parse than a screenshot of the same configuration spread across the console’s forms.
You’ll be surprised how often they come in handy
I maintain a list of scripts that I frequently use in my day-to-day tasks at work. When I first started keeping these scripts, I didn’t realize how often I was repeating certain tasks. I was pleasantly surprised every time I thought I had to perform a time-consuming task manually, only to remember I had already scripted it out.
September 22, 2023
What We Talk About When We Talk About Maintenance
Building software creates a need to maintain that software. This is true regardless of the user base: chances are, if it is being used, it will need to be maintained. This has become especially true in the day and age of containerized, dependency-heavy applications that, even if functionally static, will likely benefit from frequent dependency updates to avoid critical security vulnerabilities. In my professional career, I’ve heard many names given to the engineering capacity assigned to performing these tasks, but the most common I’ve encountered is Keeping The Lights On, or KTLO, maintenance.
Defining The Work
In a good engineering culture, KTLO would only consist of work that doesn’t involve adding new functionality to an application. This might seem simple at first: if all the work does is maintain current functionality, then it’s considered KTLO. But not all engineering cultures are created equal, and not everyone sees it that way. What follows is my experience of the varieties and corruption of the KTLO maintenance experience.
Basic Functionality As KTLO
Corruption Level: Uncorrupted
This is KTLO work that actually keeps the lights on; work that ensures that the functionality of software is consistent. In cloud-based environments, KTLO should consist of:
- Updating dependencies
- Patching security vulnerabilities
- Validating hardware configuration (scaling groups, load balancers)
A common pitfall I’ve come across in engineering cultures is a misunderstanding of what this type of KTLO work entails. Dependency version bumps can and will break your software’s functionality when not properly tested. A bad engineering culture will likely not have proper automated end-to-end/functional testing in place, and a minor dependency bump consuming an appreciable amount of developer capacity will confuse management. Optimally, automation will replace the need for anyone to manually validate a minor dependency bump within a software stack, but, in lieu of that, management needs to understand that manual validation takes time.
Enhancement As KTLO
Corruption Level: Mildly Corrupt
Consider the following scenario:
You manage backend services for a shipping company that allow the retrieval of customer data related to shipping addresses. This consists of an API sitting on top of a database that stores customer address data. In the past, internal customer service representatives have requested the ability to mark a shipping address as “blocked” or inactive, preventing shipments.
Requirement gathering revealed that the customer service department expected to block roughly 100 addresses per week. Your team built a simple application that allows customer service reps to send these requests by uploading a file containing customer information to an endpoint. A year later, the customer service department needs to suppress 100,000+ addresses. They upload the file using your tool and find that this breaks the process. They submit a request to have the expected functionality of the tool restored.
How do you classify this work?
From an engineer’s perspective, this seems like an obvious enhancement: the requirements for the software have changed. However, in some engineering cultures, this work can quickly be classified as KTLO. My professional experience has revealed that this is a significant differentiator between positive and negative engineering cultures.
Does the business understand that handling an order-of-magnitude increase in usage is not always as simple as increasing the size of the input?
Many engineering and business cultures would simply consider this KTLO. After all, the functionality isn’t changing; there’s just more work expected. Management that makes this classification will likely find itself concerned with the amount of “KTLO” consuming the team’s capacity.
Recovery and Resiliency Exercises As KTLO
Corruption Level: Moderately Corrupt
This one seems a bit confusing at first.
Generally speaking, KTLO is considered high-priority, low-cognitive-load work. This is not to say that KTLO can just be performed passively or without thought; rather, it means that KTLO should not be difficult work to perform. If you find that developers are consistently holding meetings to discuss and solve issues that are simply “keeping the lights on”, it might be time to re-examine your processes.
That brings us to application recovery and resiliency. For large businesses with SLAs demanding 99.(insert remarkable amount of 9s here)% uptime, this is considered essential.
Resiliency and recovery for incidents is a notoriously difficult problem to solve
Huge companies struggle with it. Small companies struggle with it. A bad engineering and management culture exacerbates the issue. I’ve been on teams where the expectation was that recovery exercises and resiliency efforts were considered part of KTLO maintenance. Without fail, these teams would be questioned on the metrics surrounding the amount of capacity being consumed by these exercises.
Having your application not go up in flames during a server disruption should be considered a feature. Engineering management should want their developers laser-focused on this functionality. Creating an incentive not to focus on this work leads to applications prone to critical failure.
Ad-Hoc Requests As KTLO
Corruption Level: Dangerously Corrupt
I once worked on an engineering team managing backend APIs for internal data. Our ‘customers’ were other developers inside the company. I received a request from the product owner of a customer team asking that we make a significant feature change to our service. I informed them that we would be happy to complete the work in an upcoming sprint and to schedule a meeting for requirement gathering. I also told them that, if the work was considered critical, we could work with business management to prioritize it in the current sprint.
A day later, I received notice that this work should be considered KTLO. According to the management I discussed the issue with, this was blocking potential business, meaning the lights were off, and the work needed to be completed that sprint in addition to the already-assigned work.
This is exactly how a team ends up with poorly developed applications and consistent overrun of deadlines. Considering ad-hoc requests as part of KTLO defeats the purpose of categorizing the work altogether.
My team and I would eventually pull together and complete all the work requested for the sprint, but the customer ended up deciding they didn’t need the new feature we had developed after all…
Managing KTLO: Dos and Don’ts
I have worked on engineering teams that have seen significant reductions in KTLO work without impacting application performance or development velocity, or causing other disruptions. Here is what worked and what didn’t:
Do: Start discussions about KTLO at the developer level, preferably the most junior level
Newer developers who have recently had to learn what goes into maintaining your software stack are likely to have the most insight into its most time-consuming aspects. They can tell you what was unintuitive and difficult to onboard with.
More senior developers can describe what can be automated and what major roadblocks appear when managing your software stacks. They will know what work can be done to reduce overhead.
Don’t: Assume KTLO will be reduced once we build the new system
I’ve been on engineering teams where entirely new systems were conceived, designed, and built with a major goal being the reduction of KTLO. The problem is: it’s nearly impossible to predict what KTLO work will look like in the design of a new system. It’s likely that the KTLO that you’re struggling with on your existing system was not part of the original design plan. Consider other options before resorting to redesigns.
Do: Test, Test, Test
I am a strong proponent of the idea that consistent end-to-end testing is an exceptionally effective way to reduce the quantity of KTLO. An automated testing suite that can quickly tell developers whether your application functions as intended is invaluable for validating the small changes that KTLO work entails.
Don’t: Assume SaaS will be a cure all
I’ve been part of two major migrations to cloud provider managed database systems with the intent of “reducing the KTLO work associated with managing it ourselves”. I’ll admit: cloud-provider managed services do have a lot of great upsides and should definitely be part of your considerations when designing applications. But I have also seen them break in ways that require escalation and support. KTLO that is self-managed is often much cheaper than high-level AWS support for a DB cluster issue.
Do: Realize it takes time
Due to the nature of KTLO work, it often gets lost that reducing it can be a major undertaking. Communicate with your developers about how much time is needed to make these reductions, and consider if it is worth the tradeoff for you.