
Merge in changes made by NGINXaaS for Azure project #1


Open · arussellf5 wants to merge 114 commits into main

Conversation


@arussellf5 commented Jul 18, 2025

This PR adds changes made by the NGINXaaS for Azure project to this repo.

puneetsarna and others added 30 commits July 17, 2025 17:28
The primary intent behind this is to keep retrying updates that may be made
before the control plane has registered the existence of a named upstream in
the customer's NGINX configuration.
NOTE: This commit was accidentally omitted from the first iteration of this
fork and is present in
upstream: https://github.com/nginxinc/nginx-loadbalancer-kubernetes/blob/main/internal/configuration/settings.go#L189
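
A minimal sketch of the retry behavior described above, assuming a hypothetical update callback and an illustrative not-found sentinel; the real operator's interval and error check may differ:

```go
package retry

import (
	"context"
	"errors"
	"time"
)

// errUpstreamNotFound is an illustrative sentinel; the real operator checks
// the NGINX Plus API response instead.
var errUpstreamNotFound = errors.New("upstream not found")

// updateWithRetry keeps retrying an update until the control plane has
// registered the named upstream in the customer's NGINX configuration.
func updateWithRetry(ctx context.Context, update func(context.Context) error) error {
	ticker := time.NewTicker(5 * time.Second) // interval is an assumption
	defer ticker.Stop()
	for {
		err := update(ctx)
		if !errors.Is(err, errUpstreamNotFound) {
			return err // success, or an error that is not worth retrying
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C: // the upstream may not be registered yet; try again
		}
	}
}
```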

The code needs to move forward to start the
watchers and health server (side note: we should
also think about the ordering of these at some
point), and not running the informer in a goroutine
prevents the program from executing further until
the context is canceled.

In the current iteration on main, the controller
is stuck on the informer, and then k8s kills the
service and restarts it since the health server is
not up.
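
A sketch of the fix, assuming a client-go SharedIndexInformer and a placeholder health endpoint (the operator's real handler path and port may differ):

```go
package operator

import (
	"context"
	"net/http"

	"k8s.io/client-go/tools/cache"
)

// run starts the informer on its own goroutine so that execution can move on
// to the health server. Calling informer.Run on the main goroutine would
// block until the context is canceled, the liveness probe would never come
// up, and k8s would keep killing and restarting the service.
func run(ctx context.Context, informer cache.SharedIndexInformer) error {
	go informer.Run(ctx.Done())

	// Placeholder health server; path and port are assumptions.
	http.HandleFunc("/livez", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return http.ListenAndServe(":8080", nil)
}
```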
The user specifies the ingress service whose events the application should watch by setting the "service-annotation-match" value in the application's config map. Only events with a matching service annotation will be passed by the watcher to the handlers. The informer now listens to events from all namespaces, which frees the end user from the restriction of only using the nginx ingress controller.
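
A sketch of the filtering step; the annotation key used here is illustrative, not necessarily the one the operator matches against:

```go
package watcher

import corev1 "k8s.io/api/core/v1"

// matches reports whether a Service event should be handed to the handlers:
// its annotation value must equal the "service-annotation-match" setting
// from the operator's config map.
func matches(svc *corev1.Service, serviceAnnotationMatch string) bool {
	return svc.GetAnnotations()["nginx.com/nginxaas"] == serviceAnnotationMatch
}
```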
…eam name

The port name should now be formatted like this: "http-tea", where the first part
of the string is the context type (either "http" or "stream") and the second part
of the string after the hyphen is the name of the upstream.
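
A sketch of a parser for this format; strings.SplitN with a limit of 2 keeps any further hyphens inside the upstream name intact, which also matters for the hyphen fix in a later commit. This is illustrative, not the operator's exact parser:

```go
package ports

import (
	"fmt"
	"strings"
)

// parsePortName splits a port name such as "http-tea" into the NGINX context
// ("http" or "stream") and the upstream name, e.g.
// "http-tea-svc" -> ("http", "tea-svc").
func parsePortName(name string) (nginxContext, upstream string, err error) {
	parts := strings.SplitN(name, "-", 2)
	if len(parts) != 2 || (parts[0] != "http" && parts[0] != "stream") {
		return "", "", fmt.Errorf("port name %q is not of the form <context>-<upstream>", name)
	}
	return parts[0], parts[1], nil
}
```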
We need to be able to publish the operator to
Docker Hub so that it is publicly available to
customers.

This follows the release strategy we have in ARP,
where a release-tag action publishes the image to
Docker Hub.
go test does not have a good way to capture
unit-test coverage as part of test runs. This
commit captures the exit code of the unit-test
run, generates the coverage report, and then exits
based on the test status.
Helm will be used as part of the user story to
deploy the operator, but it is also a good tool
for deploying the operator during development.

This commit adds the ability to publish helm
charts:
- to the dev registry for local iteration.
- to the regular devops registry for CI iteration
  and testing.

This will also help us test the helm chart itself.
0.0.1 is okay but it's not really a bug fix.
We should keep a single release script that will
publish docker images and helm charts for the
official release. This commit just updates the
current release script to handle both artifact
types. Helm logic will follow.
Previously, owing to a bug, upstream names containing hyphens were rejected by the operator.
We are going to release with `0.x.y`, and keeping
the internal version in line with what gets
published will reduce confusion.
This repository is going to produce artifacts that
will be publicly available, and end users will
care about semantic versioning.

We need to be able to map a public facing version
to internally produced artifacts easily and having
semver internally eases that work.

This commit does not enforce the versioning but
adds a version file that has the semver,
which will be used to version the product.

We can follow a workflow where, at release time,
we cut a release, which creates a tag, and we
retag existing dev artifacts to ship them as
official artifacts.
While a cosmetic change, it does impact how the
Docker Hub repo needs to be set up to publish the
helm chart. In addition, it impacts what the user
sees on k8s itself, and that should not be
nginx-loadbalancer-kubernetes, as that name is
confusing.
While the chart name (which was renamed in the
prior commit) is used for naming resources, it
makes sense to also change the name value in the
chart itself to use the new name.
Now that official images exist on Docker Hub, we
should use those images in our charts.
Now that the chart has been renamed, gitignore
needs to ignore the new chart packages.
This lets us test the operator from the private
registry. Also, if a customer does not want to
pull from Docker Hub, they can use their own
registry and specify a pull secret for the image.
With the change to make the operator read a config
file (to reduce code complexity), we need to make
sure that the file exists for the service to
start.
Once the operator reads from the configmap mounted
as a volume, it does not need access to the k8s
API to read the config map itself.
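
A minimal sketch of reading the mounted file with viper (the config library a later commit message mentions); the mount path is an assumption:

```go
package config

import "github.com/spf13/viper"

// load reads settings from the config map mounted as a file inside the pod,
// so the operator needs no k8s API access to fetch its configuration.
func load() (*viper.Viper, error) {
	v := viper.New()
	v.SetConfigFile("/etc/nginx-loadbalancer-kubernetes/config.yaml")
	if err := v.ReadInConfig(); err != nil {
		return nil, err
	}
	return v, nil
}
```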
This is needed temporarily while nlk still needs
to read the configmap and the code for swapping it
out with a config file has not landed. The merge
order I am thinking of:
- Get this MR landed.
- Land testenv changes to use helm.
- Land code to read settings from the configfile.
- Remove configmap permissions.
This means that we do not have to grant the operator app any permissions to access kubernetes secrets.

Reading from the config file changes the application's behavior: settings are no longer updated whenever a config map changes at run time, as they used to be. Settings are now concurrency-safe, because they pass new values, instead of a shared pointer, to their consumers in separate goroutines. Settings also no longer pass the application context to consumers, as this is a well-documented Go anti-pattern: context should always be passed as a function parameter. Any operator modules that need access to a kubernetes client are constructed with a reference to the client, instead of gaining access to the client through settings.
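
A sketch of the concurrency-safe hand-off just described, with illustrative field names: consumers receive a copy of the settings value rather than a shared pointer, and the context stays a function parameter:

```go
package settings

import "context"

// Settings is a trimmed, illustrative version of the operator's settings.
type Settings struct {
	NginxHosts    []string
	SkipVerifyTLS bool
}

// publish hands each consumer its own copy of the settings value. Because no
// shared pointer crosses a goroutine boundary, consumers can read their copy
// without further synchronization, and the context is a plain function
// parameter rather than a struct field.
func publish(ctx context.Context, s Settings, consumers []chan<- Settings) {
	for _, c := range consumers {
		select {
		case c <- s: // a copy of the value, not a shared pointer
		case <-ctx.Done():
			return
		}
	}
}
```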
arussellf5 and others added 27 commits July 18, 2025 10:39
The latest versions of the kubernetes libraries recommend using a typed workqueue. This reduces a bit of boilerplate and error handling, because we no longer have to cast the work items returned by the queue into the desired types.
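
A sketch with client-go's typed work queue (available from roughly client-go v0.30; the string key type is an assumption): Get returns the typed key directly, so no type assertion is needed:

```go
package queue

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

// newQueue builds a queue typed on string keys.
func newQueue() workqueue.TypedRateLimitingInterface[string] {
	return workqueue.NewTypedRateLimitingQueue(
		workqueue.DefaultTypedControllerRateLimiter[string](),
	)
}

func worker(q workqueue.TypedRateLimitingInterface[string]) {
	for {
		key, shutdown := q.Get() // key is already a string; no cast
		if shutdown {
			return
		}
		fmt.Println("processing", key)
		q.Done(key)
	}
}
```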
Multiple parallel tests were all accessing the same pointer to the single DefaultTransport variable in the http package, which was leading to data races in unit tests.
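
A sketch of the usual remedy: clone the default transport so each test (or client) gets its own copy instead of sharing the package-level pointer:

```go
package httpclient

import "net/http"

// newClient gives each caller its own transport instead of sharing (and
// possibly mutating) http.DefaultTransport, the pattern that produced the
// data races between parallel tests.
func newClient() *http.Client {
	transport := http.DefaultTransport.(*http.Transport).Clone()
	return &http.Client{Transport: transport}
}
```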
There's a syntax issue in the "listen" directive. Should be "listen 9113;", not "listen 9113:". Using the current file, I got an error upon reloading the NGINX service ("nginx: [emerg] invalid port in "9113:" of the "listen" directive in /etc/nginx/conf.d/prometheus.conf:11")
There was a change in the API for the NGINX Plus Client that was missed when updating to the latest version. This corrects that.
These functions were being used to create IDs that were not really
necessary for the business logic and which were generating security
alerts because of weak cryptography techniques.
The http client processes requests created by the nginx plus
client library, and that library should always include a sensible number
of headers. But the absence of a bound on the number of headers was causing
security vulnerability flags to be raised over denial-of-service
resource-exhaustion attacks.
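
The exact mitigation is not shown in this message; one plausible fix caps the response-header bytes the client will read, as sketched below. The 64 KiB limit is an assumption:

```go
package httpclient

import "net/http"

// newBoundedClient caps how many response-header bytes the client will read,
// bounding the resource-exhaustion surface the scanner flagged.
func newBoundedClient() *http.Client {
	transport := http.DefaultTransport.(*http.Transport).Clone()
	transport.MaxResponseHeaderBytes = 64 << 10 // assumed limit
	return &http.Client{Transport: transport}
}
```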
The Go cache in CI is seeded in the project
working directory. We should exclude the mod cache
from lint/formatting, since it is upstream code
and there is a high chance of linting failing
because upstream lint rules differ from ours.
…inx hosts

In order for the nginx-hosts yaml field to be parsed correctly by viper, the template needs to:

1. not put double quotes around the value (quotes cause viper to interpret it as a single string), and
2. render it as a JSON array rather than a Go representation of a slice.
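
A small demonstration of the parsing behavior just described; the host URLs are placeholders:

```go
package config

import (
	"strings"

	"github.com/spf13/viper"
)

// nginxHosts shows that a value rendered as an unquoted JSON array comes
// back as a slice; wrapped in double quotes it would come back as one string.
func nginxHosts() ([]string, error) {
	v := viper.New()
	v.SetConfigType("yaml")
	rendered := `nginx-hosts: ["https://nginx-one.example.com:8080/api", "https://nginx-two.example.com:8080/api"]`
	if err := v.ReadConfig(strings.NewReader(rendered)); err != nil {
		return nil, err
	}
	return v.GetStringSlice("nginx-hosts"), nil
}
```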
The biggest change here is to remove most of the TLS modes that enabled mTLS and self-signed certificates. Product decided that this was too complex and there was not enough user demand for most of these options. We decided to pare down the code and remove tests that were no longer well maintained. The remaining configuration allows users to toggle a single switch: whether the http client verifies the NGINX host's certificate chain and host name when https is being used. If the user wishes to enable https with self-signed certs, they can use the "skip-verify-tls" setting to allow this. The default behavior is to perform the verification.

We are maintaining the deprecated "no-tls" and "ca-tls" inputs for NGINXaaS backwards-compatibility reasons. The "no-tls" setting name was highly misleading, because all it did was disable TLS verification: it DID NOT disable TLS altogether in https mode. Similarly, the "ca-tls" setting did not enable TLS itself. TLS is enabled by default when the URL of the NGINX host includes the https protocol; the user setting merely enforced the verification of the certificate chain and host as described above.
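
A sketch of how the remaining switch maps onto Go's standard library; the operator's actual wiring may differ:

```go
package httpclient

import (
	"crypto/tls"
	"net/http"
)

// newTLSClient maps the single remaining switch onto the http client: when
// skipVerifyTLS is true, the client accepts self-signed certificates by
// skipping verification of the certificate chain and host name.
func newTLSClient(skipVerifyTLS bool) *http.Client {
	transport := http.DefaultTransport.(*http.Transport).Clone()
	transport.TLSClientConfig = &tls.Config{InsecureSkipVerify: skipVerifyTLS}
	return &http.Client{Transport: transport}
}
```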
Now that the plus go client allows users to check the http status code of the error, handle the upstream not found case by doing nothing.
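
A sketch of the no-op handling; the statusError interface below is a hypothetical stand-in for however the plus go client exposes the status code, not its real API:

```go
package handler

import (
	"errors"
	"net/http"
)

// statusError is a hypothetical stand-in for the error type the plus go
// client returns with an http status code attached.
type statusError interface {
	error
	Status() int
}

// removeServer treats a 404 from the NGINX Plus API as a no-op: if the
// upstream is not found, there is nothing to remove.
func removeServer(remove func() error) error {
	err := remove()
	var se statusError
	if errors.As(err, &se) && se.Status() == http.StatusNotFound {
		return nil // upstream not found: do nothing
	}
	return err
}
```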