Thursday 17 February 2022

How to reduce complexity and know everything.

Quality


We have a great QA.

His code is like a Rolex.

You look at his gherkin, and smile.  It says what it's testing, in plain English, and there's nothing else on the page to confuse or distract you... just English.

"Why, then, did our unit tests give me a feeling of dread?" I asked myself.  The answer was quite obvious: even though we used a neat little NuGet package called BDDfy, which lets us write our tests as English gherkins, our code was littered with private methods and other necessary evils:

public class MyTestClass
{
    private readonly Subject _subject;

    public MyTestClass()
    {
        _subject = new Subject();
    }

    [Fact]
    public void WhenNoOneReadsYourBlog_WriteABunchOfCrap()
    {
        this
            .Given(_ => _.NoOneReadsYourBlog())
            .When(_ => _.YouFeelLikeWriting())
            .Then(_ => _.JustWriteABunchOfCrap())
            .BDDfy();
    }

    private async Task IAmADumbAnnoyingPrivateMethodThatNoOneWantsToSeeAsync()
    {
        // Mock setup that no one reading the test cares about.
        var mockableMockObj = Substitute.For<IMyInterface>();
        mockableMockObj
            .SomeStupidMethodThatNoOneReadingTheTestCaresAbout()
            .Returns(somethingElseInsteadOfNotDoingSomethingElse);

        foreach (var i in someStupidComplicatedIteratableThing)
        {
            await WhyAmIEvenHereAsync();
        }
    }
}

"If we could just hide all the non-gherkin stuff, that would make the code more pleasing to read, understand, and work with," I thought to myself.  So I suggested to my team that we use a partial class.  We put the public methods, with the English definitions of the tests, in one file, and everything else in another.  The team loved the idea, so that is what we do.
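
As a rough sketch of that split (not our actual code), the two halves of a partial class might look something like this.  The class name, the file names in the comments, and the step bodies are all made up for illustration; it assumes the xUnit and TestStack.BDDfy packages are referenced.

```csharp
using TestStack.BDDfy;
using Xunit;

// File: BlogTests.cs (hypothetical name). Only the readable scenarios live here.
public partial class BlogTests
{
    [Fact]
    public void WhenNoOneReadsYourBlog_WriteABunchOfCrap()
    {
        this
            .Given(_ => _.NoOneReadsYourBlog())
            .When(_ => _.YouFeelLikeWriting())
            .Then(_ => _.JustWriteABunchOfCrap())
            .BDDfy();
    }
}

// File: BlogTests.Steps.cs (hypothetical name). Fields, setup and step methods.
// As a separate file it would need its own "using Xunit;" line.
public partial class BlogTests
{
    private bool _hasReaders = true;
    private bool _wrote;

    private void NoOneReadsYourBlog() => _hasReaders = false;

    private void YouFeelLikeWriting() => _wrote = true;

    private void JustWriteABunchOfCrap() => Assert.True(_wrote && !_hasReaders);
}
```

The compiler stitches both declarations into one class, so the private steps stay reachable from the scenarios while the first file reads as nothing but gherkin.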

Now, we still have many tests in a file (see how I organize unit tests here).  So the question is: are the tests still useful for understanding what the system does?

Let me put it another way... how do you know everything?


Simplification

The company I work for, IHS Markit, is merging with S&P Global.  The combined company will have a hundred gazillion staff members, but only 6 divisions.  How do you know what S&P do?  You list the 6 divisions.

The universe consists of 200,000,000,000,000,000,000,000 stars.  Perhaps it can be summarized as a bag of galaxies.

It takes many years to learn about the human body, but you can summarize it as five things: head, neck, body, a pair of arms and a pair of legs.

Simplifying complexity into a tree, where each node has about 5 children, makes everything easy to understand.

Microservices are the modern, cool way to develop distributed systems.  AWS S3 consists of 300 services.  I haven't seen this architecture, but I imagine that if it's just a list of services that call each other higgledy-piggledy, then AWS has a big problem.  Our software has 30 services, and even that is a lot to try to understand.

My point is, that I think pretty much everything in software, and perhaps even life, needs to be broken down into a tree, limited to about 5 child nodes per parent to make it easy to understand.  Now, don't get me wrong... 5 is more of a thumb suck than anything else, but 5 is a handful.

You may be wondering, well, what if it can't be broken down?

  • Maybe I have a property for each country... well, that's just a bag, collection, list, or array.
  • Maybe my system just has 10 things it needs to do... categorize them... facilities, business logic, etc.

My hypothesis is that any group of items can be broken down into smaller groups... it just takes a bit of time to think up suitable categories, and re-organize things.


Code Organization

Problems with flat code, as opposed to code organized in hierarchies, that add to complexity are:

  • Private methods, private fields and private classes clutter the code, making it more difficult to understand than it needs to be.
  • Long methods.

Normally we wouldn't organize classes in a flat structure.  We'd use a hierarchy of folders.  I think it would be interesting if a class or namespace was a folder.  Perhaps an IDE plugin could be written.  It might look like this:

MyApp.csproj
    Program.cs
    Utilities.cs
        DataTypes.cs
            Strings.cs
            Dates.cs
                Conversion.cs
                Validation.cs
        Files.cs
            Encryption.cs
            Persistence.cs
        


Contemplation

Consider code within classes too.

Would limiting methods to 5 per class or interface help to make it easier to understand?  The interface segregation principle suggests keeping interfaces "small" to prevent having classes that have to implement methods unnecessarily.  It may require a little less thought to simply have alarm bells go off in your head when you reach 5 methods, prompting you to consider whether another interface is required.

Can limiting properties to 5 per class make them easier to work with?  I've seen some pretty large view models.  Code becomes much cleaner when they're broken down into a tree structure.  It can be tricky to change it later, so best to keep them small from the start. 

Can limiting lines to 5 per method make the logic easier to follow?  Uncle Bob says, "extract, extract, extract."  If you follow that, your methods will be pretty small.  Using the number 5 might be too artificial, but when you have to scroll to see your whole method, you've probably gone way too far.

I've heard a suggestion that methods should only have 3 parameters.  The suggestion is that if there are more, they should be grouped into a class.
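
Here's a minimal sketch of what that could look like.  Everything in it (ReportRequest, Reports.Describe, the property names and the default page size) is invented for illustration and isn't from a real code base.

```csharp
using System;

// Hypothetical parameter class: four loose arguments become one small object.
public sealed class ReportRequest
{
    public string CustomerId { get; set; } = "";
    public DateTime From { get; set; }
    public DateTime To { get; set; }
    public int PageSize { get; set; } = 50; // made-up default
}

public static class Reports
{
    // Before: Describe(string customerId, DateTime from, DateTime to, int pageSize)
    // After: a single parameter that can grow without changing the signature.
    public static string Describe(ReportRequest request) =>
        $"{request.CustomerId}: {request.From:yyyy-MM-dd} to {request.To:yyyy-MM-dd}, " +
        $"{request.PageSize} per page";
}
```

A caller then builds the request with an object initializer, for example new ReportRequest { CustomerId = "acme", From = start, To = end }, and only names the values it cares about.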

Disclaimer:  As I've only thought of some of these ideas recently, I haven't experimented with them very much.  So, I'm going to try them out, and see how it goes.

Thoughts?


Final Thoughts

A tree with a single root, and 5 children per parent, and a depth of 11 nodes, would have just under 10 million leaf nodes.  That's a hell of a lot of complexity, but simple enough to navigate.
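
The arithmetic behind that figure, assuming "a depth of 11 nodes" means the root sits at level 1 and the leaves at level 11 (so there are 10 branching steps), works out roughly as follows:

```latex
\[
\text{leaves} = 5^{10} = 9{,}765{,}625,
\qquad
\text{all nodes} = \sum_{k=0}^{10} 5^{k} = \frac{5^{11}-1}{4} = 12{,}207{,}031
\]
```

So reaching any one of nearly ten million leaves takes only 10 choices, each between 5 options.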

There's no need to learn everything.  That's impossible.  I think the most important things to learn, as a programmer, are the capabilities of tools.  That way, whatever you need to build, you can pick the right tool, and then you just need to learn how to use it.  Navigate the tree of knowledge to the depth you need at the time.

Image credits:

https://www.freeimages.com/photo/skeleton-pocket-watch-back-1637098

https://www.freeimages.com/photo/nebula-space-astro-photo-astronomy-sky-1420873

https://www.freeimages.com/photo/burning-tree-1377053

https://www.freeimages.com/photo/dave-in-window-1553933




Tuesday 15 February 2022

AWS Lambda vs Fargate

My manager, 7-finger Lucy

The following story is a re-enactment of what actually happened.  Names, dates, species and quite a lot more have been changed to protect those involved in the crimes against code...

Smoke emerged from management's ears.  It had been months.  Six, perhaps, since our principal dev-ops ninja had started converting our twenty-eight Windows web services to run in containers.  Not a single container had been demoed yet.  "Stephen, we'd like you to take over this project, and... err... get it done quickly."

"Challenge accepted," I replied, retrieving an extra keyboard, pulling up my sleeves, and tightening my head band.  Browser open, my fingers sprang into action, "Google, what is...?"

I know what a container is, but err... yeah, my experience lacked... experience.

"So, principal dev..."  (I'll call her Bob)

"...Bob, what have we done so far?"

"600 hundred gazillion lines of pure code," she winked, "YAML, Kubernetes, Terraform, nginX, Helm Charts, Docker, WSL 2, Artifactory and Windows Terminal."  She licked her lips.  "Here, let me show you..."

Pete

The challenge, it seemed, was a bit more challenging than I had anticipated.  "Pete," I called our AWS consultant, because his name was Pete.

"Fargate," Pete replied.  A gong resounded.  I don't know why.  "Less problems, with Fargate, you will have."

I liked the idea.  I liked AWS.  Fargate was AWS.

I googled, "YAML, Kubernetes, Terraform, nginX, Helm Charts, Docker, WSL 2, Artifactory, Windows Terminal, Fargate."

Then I YouTubed, "YAML, Kubernetes, Terraform, nginX, Helm Charts, Docker, WSL 2, Artifactory, Windows Terminal, Fargate."

"I no longer have any more to teach you, my son," said the voice in my head.  It sounded like Pete.  "A container, in docker, you must create.  To Fargate, it must go."

"Kubernetes vs. Fargate," I typed into YouTube.  "EKS vs. ECS," it suggested.  "Fine, EKS vs ECS," I replied.  "ECS is simpler," it replied.

"Simple, I like simple," I told YouTube.

I ignored Bob's Git branch.  "600 hundred gazillion lines of pure code..." ignored.

I powered up PowerShell, installed WSL 2, formed my CloudFormation, set up a GitLab runner, bashed a new Bash script, docked a container in Docker, shipped it to ECR and Artifactory, and sailed it on to Fargate.

To be fair, it took me two months to release our first container.  There were hurdles every step of the way, from setting up the GitLab runner to getting the containers to scale fast enough.  What I had achieved, however, was a custom solution that allowed us to:

  • continue to deploy to our individual, development EC2 environments
  • continue to be able to debug and step through any part or all of our application locally
  • build and run auto-scaling Linux containers in multiple environments

The cherry on the cake, however, was that I could show a colleague what to do for the rest of the application.  By simply copying and pasting a Dockerfile (renaming the entry point), and configuring a few values in a JSON file (CPU, memory, storage, scaling profile), he was able to convert a web service.  Setting up the remaining 27 services took a couple of days.

I patted myself on the back.


"Lambda, an alternative to containers, is.  To Lambda, your code could go."

"But you said,..."

"Mmmm...  Simpler.  Simpler your solution could be."


I remembered Lambdas.  Python scripts, that timed out and ran out of memory... very quickly.  And we were charged... for every call!

"Que?"  I asked the voice in my head.

"Not just Python... all languages.  Compiled code and NuGet packages run, it can!"

"Quicker, your API calls must be."

"Less memory, your APIs use, they must."

"For every million calls, charged, you will be."


And so it was that I found myself questioning whether or not I had made the right choice.  I think the gist is that there are always going to be alternatives.  Often there isn't a right choice.  While prepping and doing research to find a better solution before starting is optimal, when deadlines crop up, there may not be enough time for all the research and prototyping you'd like.  Sometimes it's necessary to ask the experts what they think, and go with that.


In the case of Lambdas vs ECS Fargate, I think the most important considerations are, firstly, whether it can actually be done as a Lambda.  Lambdas are, at the time of writing this, restricted to 15 minutes of run time and 10GB of RAM.  Secondly, there's the price.  If the code base is small, loads quickly and only needs to run for a few seconds at a time, Lambda could be cheaper.  For long-running code with large binaries that doesn't need to scale up and down as quickly, Fargate is likely to be cheaper.  If you want to keep data cached in memory for a long time, Lambda is not going to work.  If you want to prototype something for free, and don't expect much traffic, Lambda might be the way to go.  As with many services in AWS, one has to do the price math to make good decisions.
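
As a rough sketch of that price math, here is a simplified monthly cost model.  All the symbols are my own shorthand, the rates have to be looked up on the AWS pricing pages for your region, and it ignores the free tier, storage and data transfer.

```latex
\[
C_{\text{Lambda}} = N \, p_{\text{req}} + N \, t \, m \, p_{\text{GB-s}},
\qquad
C_{\text{Fargate}} = h \, k \, \bigl( v \, p_{\text{vCPU-h}} + g \, p_{\text{GB-h}} \bigr)
\]
```

Here N is the number of requests per month, t the average duration of a call in seconds, m the memory configured for the function in GB, h the hours in the month, k the average number of running tasks, and v and g the vCPUs and GB of memory per task.  Lambda wins while its cost is the lower of the two, which tends to mean few, short, small calls; Fargate's cost doesn't depend on N at all, as long as the same k tasks can keep up.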


Here's a useful breakdown for anyone who would like to know more about the costs.
