---
title: Windmill on AWS ECS
---

Windmill can be deployed on an ECS cluster. Below are the detailed steps to get a Windmill stack up and running. The number of servers and workers, as well as the instance sizes, should be tuned to your own use cases.

To give a brief overview, we will first create a VPC and an associated security group. This VPC will contain the database, the instances powering the ECS cluster, and the Windmill containers. Then we will create a database on RDS, and finally the ECS cluster.
Once the ECS cluster is running, we will define the task definitions and the associated services. Note again that the architecture of your Windmill stack depends largely on how you're planning to use Windmill. For this tutorial, we will create a Windmill stack of the following shape: 2 Windmill servers, 1 "multi-purpose" Windmill worker and 1 "native" worker. Windmill LSP and Multiplayer will also be deployed.

# Create a VPC and a security group

1. Go to AWS VPC and create a new one.
   a. Name it `windmill-vpc`
   a. Choose a CIDR block of your choice. We're going to use `10.0.0.0/16` here. Any value will work, just make sure you'll have enough IPs in the CIDR
   a. IPv6 is not required
   a. We recommend using at least 2 availability zones, with 2 public subnets and 2 private subnets
   a. NAT is required for ECS containers to have access to the internet. We recommend creating 2, one in each AZ
   a. We recommend using the S3 Gateway if you're planning to leverage S3 in Windmill
   a. Enable both DNS options
1. You will need a security group linked to this VPC. Go to the AWS Security Group menu and create a new one (an optional CDK sketch of the full networking setup follows this list)
   a. Name it `windmill-sg`
   a. Link it to the VPC created above
   a. Add the following inbound rule: `All traffic FROM this security group`
   a. If you want to SSH into the servers, you might want to add the inbound rule: `SSH from MyIP/Anywhere`. This is not required but might help with debugging
   a. Add a rule `HTTP traffic FROM anywhere` to be able to reach the Windmill server from your workstation. You can refine security here by having 2 security groups, with only one allowing HTTP traffic; later you would place only the server in this security group and all the other containers in the more restricted one. For simplicity, we will just have one security group here
   a. The default outbound rule can be used: `All traffic TO 0.0.0.0/0`
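
The console steps above are enough on their own. If you would rather capture the same setup as code, here is a minimal AWS CDK v2 (TypeScript) sketch of the VPC and security group. CDK is not required by Windmill; the stack name, construct IDs and subnet sizing are illustrative assumptions, and the later sketches in this guide continue inside the same constructor.

```ts
import { Stack, StackProps } from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import { Construct } from 'constructs';

export class WindmillStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // VPC: 2 AZs, public + private subnets, one NAT gateway per AZ,
    // DNS options enabled, S3 gateway endpoint for later S3 usage.
    const vpc = new ec2.Vpc(this, 'WindmillVpc', {
      vpcName: 'windmill-vpc',
      ipAddresses: ec2.IpAddresses.cidr('10.0.0.0/16'),
      maxAzs: 2,
      natGateways: 2,
      enableDnsHostnames: true,
      enableDnsSupport: true,
      subnetConfiguration: [
        { name: 'public', subnetType: ec2.SubnetType.PUBLIC, cidrMask: 20 },
        { name: 'private', subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS, cidrMask: 20 },
      ],
      gatewayEndpoints: {
        S3: { service: ec2.GatewayVpcEndpointAwsService.S3 },
      },
    });

    // Single security group shared by the database, the ECS hosts and the containers.
    const sg = new ec2.SecurityGroup(this, 'WindmillSg', {
      securityGroupName: 'windmill-sg',
      vpc,
      allowAllOutbound: true, // default outbound rule: all traffic TO 0.0.0.0/0
    });
    sg.addIngressRule(sg, ec2.Port.allTraffic(), 'All traffic FROM this security group');
    sg.addIngressRule(ec2.Peer.anyIpv4(), ec2.Port.tcp(80), 'HTTP traffic FROM anywhere');

    // ...the RDS, ECS and service sketches below continue here.
  }
}
```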

# Create an RDS database

1. Go to AWS RDS (an optional CDK sketch of the database follows this list)
1. Standard create
1. Engine option:
   a. PostgreSQL
   a. Any recent version of PostgreSQL will work. At the time of writing we're using 16.1-R1
1. Template
   a. Choose the template of your choice (Production vs. Dev/Test) depending on your needs. Obviously this will greatly impact costs
1. Availability and durability
   a. This will be preset depending on the template option you chose. You can customize it depending on your needs.
1. Settings
   a. Name it `windmill-db`
   a. Leave `postgres` as the username, and choose a strong master password. Keep it in a safe place, you will need it in the following steps
1. Instance configuration
   a. The instance will be preset depending on the template you chose. The default is usually a good starting point
1. Storage. This is on a per-use-case basis. Just be aware that Windmill stores logs and job results in the database. Depending on the job retention period you want to later configure in Windmill, you might have different requirements
   a. A good starting point is 400 GiB / 12000 IOPS
   a. If you chose a Multi-AZ or Standalone instance, it is recommended to turn on autoscaling.
1. Connectivity
   a. Don't connect to an EC2 compute resource. Containers of the ECS cluster will connect to it using its URL
   a. Attach it to the VPC created above
   a. The DB doesn't need public access
   a. Use the security group created above
   a. RDS Proxy can be a good option in certain cases. It is not required
   a. We advise creating a certificate authority and using it here
   a. The port can be left to the default: 5432
1. Database authentication
   a. Windmill uses password authentication
1. Monitoring
   a. Choose whatever you prefer to monitor your database
1. Additional configuration
   a. The initial database name should be set to `windmill`
   a. It is advised to enable automated backups
   a. Encryption can be set depending on your requirements, same for log export and maintenance.
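
Continuing the optional CDK sketch, here is a rough equivalent of the RDS settings above. The exact engine version enum and storage options depend on your CDK release; `vpc` and `sg` refer to the constructs from the networking sketch.

```ts
// Additional imports at the top of the file:
//   import { Duration } from 'aws-cdk-lib';
//   import * as rds from 'aws-cdk-lib/aws-rds';
// Inside the same stack constructor as the networking sketch above:
const db = new rds.DatabaseInstance(this, 'WindmillDb', {
  instanceIdentifier: 'windmill-db',
  engine: rds.DatabaseInstanceEngine.postgres({
    // Any recent PostgreSQL works; pick a version enum available in your CDK release.
    version: rds.PostgresEngineVersion.VER_16_1,
  }),
  vpc,
  vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS }, // no public access
  securityGroups: [sg],
  credentials: rds.Credentials.fromGeneratedSecret('postgres'), // master password kept in Secrets Manager
  databaseName: 'windmill', // initial database name
  port: 5432,
  storageType: rds.StorageType.GP3,
  allocatedStorage: 400, // GiB, starting point
  iops: 12000,
  multiAz: true, // depends on the template you would have picked in the console
  backupRetention: Duration.days(7), // automated backups
});
```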

# Create the ECS cluster

As said in the introduction, the architecture of your stack depends on your needs. The only required parts are at least one Windmill server and at least one multi-purpose worker. An optional CDK sketch of the cluster follows the list below.

1. Go to AWS ECS and create a new cluster
1. Cluster configuration
   a. Name: `windmill-cluster` and leave the namespace the same
1. Infrastructure
   a. Uncheck AWS Fargate and check Amazon EC2 instances instead
   a. Create a new On-Demand ASG
   a. Choose Linux as the operating system. We recommend using either the default (x86_64) or ARM64 architecture. WARNING: Whichever you choose, make sure the EC2 instance type chosen below matches the OS architecture
   a. Choose an EC2 instance type matching the OS chosen above. Here we're using the default Amazon Linux 2 OS, so we will choose a `t3.medium`
   a. This Auto Scaling group will host the 2 Windmill servers, the multi-purpose and native Windmill workers, LSP and Multiplayer. The maximum capacity should be at least 4 hosts
   a. Allowing SSH access is not required
   a. We recommend allocating at least 100 GiB of volume size
1. Network settings for EC2 instances
   a. Attach it to the VPC and security group created above
   a. Make sure to select the PUBLIC subnets if your VPC has private and public ones. The instances should be on the PUBLIC subnets
   a. WARNING: You need to TURN ON auto-assign public IP. Otherwise the ECS agent on the instances will not be able to register them with the ECS cluster. This is what happens if you didn't set up a NAT on your VPC
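
Continuing the optional CDK sketch, an ECS cluster backed by an EC2 Auto Scaling group in the public subnets. The AMI choice, device name and capacity numbers below are assumptions mirroring the console choices above.

```ts
// Additional imports at the top of the file:
//   import * as ecs from 'aws-cdk-lib/aws-ecs';
//   import * as autoscaling from 'aws-cdk-lib/aws-autoscaling';
// Inside the same stack constructor:
const cluster = new ecs.Cluster(this, 'WindmillCluster', {
  clusterName: 'windmill-cluster',
  vpc,
});

cluster.addCapacity('WindmillAsg', {
  // Instance type must match the OS architecture chosen for the AMI.
  instanceType: new ec2.InstanceType('t3.medium'),
  machineImage: ecs.EcsOptimizedImage.amazonLinux2(),
  minCapacity: 3,
  maxCapacity: 4, // at least 4 hosts
  vpcSubnets: { subnetType: ec2.SubnetType.PUBLIC },
  associatePublicIpAddress: true, // so the ECS agent can register the instances
  blockDevices: [
    { deviceName: '/dev/xvda', volume: autoscaling.BlockDeviceVolume.ebs(100) }, // >= 100 GiB
  ],
});
```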

# Create the task definitions

We will create the following task definitions. After each list of settings, an optional CDK sketch shows an equivalent definition in code.

- [REQUIRED] For the Windmill server
- [REQUIRED] For multi-purpose Windmill workers
- [OPTIONAL] For native Windmill workers
- [OPTIONAL] For Windmill LSP
- [OPTIONAL] For Windmill Multiplayer

### Windmill Server

1. Name: windmill-server
1. Launch Type: AWS EC2 instances
1. OS / Arch: Linux/x86_64 (or make it match your EC2 instance type and OS architecture if you chose something different when creating the ECS cluster)
1. Network mode: awsvpc. This virtually attaches the containers to the VPC network. The NAT of the VPC is required to give the containers access to the internet
1. Task Size: 1 vCPU / 1.5 GiB Memory (this is good for a cluster of 3 t3.medium; adapt it to the hardware you provisioned)
1. Task Role: None
1. No Task placement
1. Container:
   a. name: windmill-server
   a. image: ghcr.io/windmill-labs/windmill:main (or ghcr.io/windmill-labs/windmill-ee:main for EE)
   a. Essential container: YES
   a. Port mapping: 8000 / TCP / http / HTTP
   a. Resource allocation: 1 CPU / 1.5 GiB memory
   a. Environment variables: `MODE=server` and `DATABASE_URL=postgres://postgres:<DB_PASSWORD>@<DB_HOSTNAME>:5432/windmill`. Replace the hostname and password with the ones from the RDS database you created above
   a. Turn on log collection for easy debugging
   a. Add the following healthcheck: `CMD-SHELL, curl -f http://localhost:8000/api/version || exit 1` / 10s interval / 5s timeout / 5 retries
   a. This is it, leave the rest default
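
Here is how the same task definition could look in the optional CDK sketch. `<DB_PASSWORD>` and `<DB_HOSTNAME>` are placeholders for your RDS values, and the construct IDs are illustrative.

```ts
// Additional import at the top of the file (if not added already):
//   import { Duration } from 'aws-cdk-lib';
// Inside the same stack constructor:
const serverTaskDef = new ecs.Ec2TaskDefinition(this, 'WindmillServerTask', {
  family: 'windmill-server',
  networkMode: ecs.NetworkMode.AWS_VPC,
});

serverTaskDef.addContainer('windmill-server', {
  // ghcr.io/windmill-labs/windmill-ee:main for EE
  image: ecs.ContainerImage.fromRegistry('ghcr.io/windmill-labs/windmill:main'),
  cpu: 1024,            // 1 vCPU
  memoryLimitMiB: 1536, // 1.5 GiB
  essential: true,
  portMappings: [{ containerPort: 8000, protocol: ecs.Protocol.TCP }],
  environment: {
    MODE: 'server',
    // Placeholders: use the hostname and password of the RDS database created above.
    DATABASE_URL: 'postgres://postgres:<DB_PASSWORD>@<DB_HOSTNAME>:5432/windmill',
  },
  logging: ecs.LogDrivers.awsLogs({ streamPrefix: 'windmill-server' }),
  healthCheck: {
    command: ['CMD-SHELL', 'curl -f http://localhost:8000/api/version || exit 1'],
    interval: Duration.seconds(10),
    timeout: Duration.seconds(5),
    retries: 5,
  },
});
```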

### Windmill multi-purpose worker

1. Name: windmill-worker
1. Launch Type: AWS EC2 instances
1. OS / Arch: Linux/x86_64 (or make it match your EC2 instance type and OS architecture if you chose something different when creating the ECS cluster)
1. Network mode: awsvpc
1. Task Size: 2 vCPU / 3.5 GiB Memory (this is good for a cluster of 3 t3.medium; adapt it to the hardware you provisioned)
1. Task Role: None
1. No Task placement
1. Container:
   a. name: windmill-worker
   a. image: ghcr.io/windmill-labs/windmill:main (or ghcr.io/windmill-labs/windmill-ee:main for EE)
   a. Essential container: YES
   a. Port mapping: No port mapping for workers
   a. Resource allocation: 2 CPU / 3.5 GiB memory
   a. Environment variables: `MODE=worker`, `WORKER_GROUP=default` and `DATABASE_URL=postgres://postgres:<DB_PASSWORD>@<DB_HOSTNAME>:5432/windmill`. Replace the hostname and password with the ones from the RDS database you created above
   a. TODO: elaborate on volumes
   a. Turn on log collection for easy debugging
   a. This is it, leave the rest default
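
The multi-purpose worker in the same optional CDK sketch: it reuses the server image with `MODE=worker`, more resources and no port mapping.

```ts
// Inside the same stack constructor:
const workerTaskDef = new ecs.Ec2TaskDefinition(this, 'WindmillWorkerTask', {
  family: 'windmill-worker',
  networkMode: ecs.NetworkMode.AWS_VPC,
});

workerTaskDef.addContainer('windmill-worker', {
  image: ecs.ContainerImage.fromRegistry('ghcr.io/windmill-labs/windmill:main'),
  cpu: 2048,            // 2 vCPU
  memoryLimitMiB: 3584, // 3.5 GiB
  essential: true,
  // No port mapping for workers.
  environment: {
    MODE: 'worker',
    WORKER_GROUP: 'default',
    DATABASE_URL: 'postgres://postgres:<DB_PASSWORD>@<DB_HOSTNAME>:5432/windmill',
  },
  logging: ecs.LogDrivers.awsLogs({ streamPrefix: 'windmill-worker' }),
});
```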

### Windmill native worker

1. Name: windmill-native-worker
1. Launch Type: AWS EC2 instances
1. OS / Arch: Linux/x86_64 (or make it match your EC2 instance type and OS architecture if you chose something different when creating the ECS cluster)
1. Network mode: awsvpc
1. Task Size: 2 vCPU / 3.5 GiB Memory (this is good for a cluster of 3 t3.medium; adapt it to the hardware you provisioned)
1. Task Role: None
1. No Task placement
1. Container:
   a. name: windmill-worker
   a. image: ghcr.io/windmill-labs/windmill:main (or ghcr.io/windmill-labs/windmill-ee:main for EE)
   a. Essential container: YES
   a. Port mapping: no port mapping for workers
   a. Resource allocation: 2 CPU / 3.5 GiB memory
   a. Environment variables: `MODE=worker`, `WORKER_GROUP=native` and `DATABASE_URL=postgres://postgres:<DB_PASSWORD>@<DB_HOSTNAME>:5432/windmill`. Replace the hostname and password with the ones from the RDS database you created above
   a. TODO: elaborate on volumes
   a. Turn on log collection for easy debugging
   a. This is it, leave the rest default
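
In the optional CDK sketch, the native worker is the same pattern with a different family and `WORKER_GROUP`.

```ts
// Inside the same stack constructor:
const nativeWorkerTaskDef = new ecs.Ec2TaskDefinition(this, 'WindmillNativeWorkerTask', {
  family: 'windmill-native-worker',
  networkMode: ecs.NetworkMode.AWS_VPC,
});

nativeWorkerTaskDef.addContainer('windmill-worker', {
  image: ecs.ContainerImage.fromRegistry('ghcr.io/windmill-labs/windmill:main'),
  cpu: 2048,            // 2 vCPU
  memoryLimitMiB: 3584, // 3.5 GiB
  essential: true,
  environment: {
    MODE: 'worker',
    WORKER_GROUP: 'native', // only difference with the multi-purpose worker
    DATABASE_URL: 'postgres://postgres:<DB_PASSWORD>@<DB_HOSTNAME>:5432/windmill',
  },
  logging: ecs.LogDrivers.awsLogs({ streamPrefix: 'windmill-native-worker' }),
});
```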

### Windmill LSP

1. Name: windmill-lsp
1. Launch Type: AWS EC2 instances
1. OS / Arch: Linux/x86_64 (or make it match your EC2 instance type and OS architecture if you chose something different when creating the ECS cluster)
1. Network mode: awsvpc
1. Task Size: 1 vCPU / 1.5 GiB Memory (this is good for a cluster of 3 t3.medium; adapt it to the hardware you provisioned)
1. Task Role: None
1. No Task placement
1. Container:
   a. name: windmill-lsp
   a. image: ghcr.io/windmill-labs/windmill-lsp:latest
   a. Essential container: YES
   a. Port mapping: 3001 / TCP / http / HTTP
   a. Resource allocation: 1 CPU / 1.5 GiB memory
   a. Environment variables: none
   a. TODO: elaborate on volumes
   a. Turn on log collection for easy debugging
   a. This is it, leave the rest default
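
LSP in the same optional CDK sketch: port 3001 and no environment variables.

```ts
// Inside the same stack constructor:
const lspTaskDef = new ecs.Ec2TaskDefinition(this, 'WindmillLspTask', {
  family: 'windmill-lsp',
  networkMode: ecs.NetworkMode.AWS_VPC,
});

lspTaskDef.addContainer('windmill-lsp', {
  image: ecs.ContainerImage.fromRegistry('ghcr.io/windmill-labs/windmill-lsp:latest'),
  cpu: 1024,            // 1 vCPU
  memoryLimitMiB: 1536, // 1.5 GiB
  essential: true,
  portMappings: [{ containerPort: 3001, protocol: ecs.Protocol.TCP }],
  // No environment variables needed.
  logging: ecs.LogDrivers.awsLogs({ streamPrefix: 'windmill-lsp' }),
});
```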

### Windmill Multiplayer

1. Name: windmill-multiplayer
1. Launch Type: AWS EC2 instances
1. OS / Arch: Linux/x86_64 (or make it match your EC2 instance type and OS architecture if you chose something different when creating the ECS cluster)
1. Network mode: awsvpc
1. Task Size: 1 vCPU / 1.5 GiB Memory (this is good for a cluster of 3 t3.medium; adapt it to the hardware you provisioned)
1. Task Role: None
1. No Task placement
1. Container:
   a. name: windmill-multiplayer
   a. image: ghcr.io/windmill-labs/windmill-multiplayer:latest
   a. Essential container: YES
   a. Port mapping: 3002 / TCP / http / HTTP
   a. Resource allocation: 1 CPU / 1.5 GiB memory
   a. Environment variables: none
   a. Turn on log collection for easy debugging
   a. This is it, leave the rest default
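
Multiplayer in the same optional CDK sketch, identical to LSP apart from the image and port 3002.

```ts
// Inside the same stack constructor:
const multiplayerTaskDef = new ecs.Ec2TaskDefinition(this, 'WindmillMultiplayerTask', {
  family: 'windmill-multiplayer',
  networkMode: ecs.NetworkMode.AWS_VPC,
});

multiplayerTaskDef.addContainer('windmill-multiplayer', {
  image: ecs.ContainerImage.fromRegistry('ghcr.io/windmill-labs/windmill-multiplayer:latest'),
  cpu: 1024,            // 1 vCPU
  memoryLimitMiB: 1536, // 1.5 GiB
  essential: true,
  portMappings: [{ containerPort: 3002, protocol: ecs.Protocol.TCP }],
  logging: ecs.LogDrivers.awsLogs({ streamPrefix: 'windmill-multiplayer' }),
});
```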

# Create the services

Similarly to the above, we will create one service per task definition, again with an optional CDK sketch after each description.

### Windmill server

1. Environment: leave everything default. It will be using the default capacity provider
1. Application type: Service
1. Task definition: select the latest revision of the `windmill-server` task
1. Service name: `windmill-server`
1. Service replicas: 2 (to follow what we said above, feel free to tune it to your needs)
1. Networking: Select the VPC created above, and place the service in the PUBLIC subnets. Select the security group created above (or the one allowing traffic on port 80)
1. Load balancer: It's important to create a load balancer here as it will be the entry point to Windmill. Create an Application Load Balancer `windmill-server-lb` with a target group `windmill-server-tg`
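
In the optional CDK sketch, the server service and its load balancer could look like this; the health-check path and target-group naming mirror the console steps and are otherwise assumptions.

```ts
// Additional import at the top of the file:
//   import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';
// Inside the same stack constructor:
const serverService = new ecs.Ec2Service(this, 'WindmillServerService', {
  serviceName: 'windmill-server',
  cluster,
  taskDefinition: serverTaskDef,
  desiredCount: 2,
  securityGroups: [sg],
  vpcSubnets: { subnetType: ec2.SubnetType.PUBLIC },
});

// Entry point to Windmill: an internet-facing application load balancer.
const lb = new elbv2.ApplicationLoadBalancer(this, 'WindmillServerLb', {
  loadBalancerName: 'windmill-server-lb',
  vpc,
  internetFacing: true,
  securityGroup: sg,
});
const listener = lb.addListener('Http', { port: 80 });
listener.addTargets('WindmillServerTg', {
  targetGroupName: 'windmill-server-tg',
  port: 8000,
  protocol: elbv2.ApplicationProtocol.HTTP,
  targets: [serverService],
  healthCheck: { path: '/api/version' }, // assumption: reuse the container healthcheck path
});
```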

### Multi-purpose Windmill worker

1. Environment: leave everything default. It will be using the default capacity provider
1. Application type: Service
1. Task definition: select the latest revision of the `windmill-worker` task
1. Service name: `windmill-worker`
1. Service replicas: 2 (to follow what we said above, feel free to tune it to your needs)
1. Networking: Select the VPC created above, and place the service in the PRIVATE subnets. There is no need for workers to be in the public subnets, as there's a NAT in the VPC. Select the security group created above
1. Load balancer: No load balancer, the container doesn't expose any port
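
The worker service in the same optional CDK sketch: private subnets and no load balancer.

```ts
// Inside the same stack constructor:
const workerService = new ecs.Ec2Service(this, 'WindmillWorkerService', {
  serviceName: 'windmill-worker',
  cluster,
  taskDefinition: workerTaskDef,
  desiredCount: 2,
  securityGroups: [sg],
  vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS }, // no load balancer needed
});
```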

### Native Windmill Worker

1. Same as the multi-purpose Windmill worker, except that the task definition should be `windmill-native-worker`

### Windmill LSP

1. Environment: leave everything default. It will be using the default capacity provider
1. Application type: Service
1. Task definition: select the latest revision of the `windmill-lsp` task
1. Service name: `windmill-lsp`
1. Service replicas: 1
1. Networking: Select the VPC created above, and place the service in the PRIVATE subnets. There is no need for LSP to be in the public subnets, as there's a NAT in the VPC. Select the security group created above
1. Load balancer: Create a load balancer. All we actually need is a target group, but using this menu AWS will create both, and we will just have to remove the load balancer later and keep only the target group. Name them `windmill-lsp-lb` and `windmill-lsp-tg`
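
In the optional CDK sketch you can skip the temporary load balancer entirely: the target group is created later, directly on the server load balancer's listener (see the routing section below). The LSP service, and the identical Multiplayer service from the next subsection, then look just like the worker services.

```ts
// Inside the same stack constructor:
const lspService = new ecs.Ec2Service(this, 'WindmillLspService', {
  serviceName: 'windmill-lsp',
  cluster,
  taskDefinition: lspTaskDef,
  desiredCount: 1,
  securityGroups: [sg],
  vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
});

// The Multiplayer service of the next subsection is identical apart from the names.
const multiplayerService = new ecs.Ec2Service(this, 'WindmillMultiplayerService', {
  serviceName: 'windmill-multiplayer',
  cluster,
  taskDefinition: multiplayerTaskDef,
  desiredCount: 1,
  securityGroups: [sg],
  vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
});
```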

### Windmill Multiplayer

1. Same as Windmill LSP, using the task definition `windmill-multiplayer` and naming the load balancer and target group `windmill-multiplayer-lb` and `windmill-multiplayer-tg`.

# Add Networking Routes for Windmill LSP and Multiplayer

Here we will add the appropriate routes for the requests that the UI will make to LSP or Multiplayer. We created 2 load balancers for Windmill LSP and Multiplayer, but only their target groups are needed. We will use the Windmill server load balancer to route certain requests to those target groups based on their path. An optional CDK sketch of these listener rules follows the list below.

Go to the AWS EC2 Load Balancer menu and start by deleting the load balancers named `windmill-lsp-lb` and `windmill-multiplayer-lb`. Then go to the `windmill-server-lb` load balancer to update it:

1. Open the HTTP:80 listener and click on the `Add Rule` button on the right
2. Add a rule for LSP
   a. Name it `lsp`
   a. Add a condition: `Path is /ws/*`
   a. Click Next
   a. Select target group `windmill-lsp-tg`
   a. Give it a priority of `10`
   a. Click on Create
3. Add a rule for Multiplayer
   a. Same as above
   a. The path should be `/ws_mp/*`
   a. The target group should be `windmill-multiplayer-tg`

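In the optional CDK sketch, the same routing is expressed as listener rules with path conditions; `addTargets` creates the target groups directly, so no throwaway load balancers are needed. The Multiplayer priority is an assumption, since the console steps don't specify one.

```ts
// Inside the same stack constructor, after the windmill-server listener was created:
listener.addTargets('WindmillLspTg', {
  targetGroupName: 'windmill-lsp-tg',
  priority: 10,
  conditions: [elbv2.ListenerCondition.pathPatterns(['/ws/*'])],
  port: 3001,
  protocol: elbv2.ApplicationProtocol.HTTP,
  targets: [lspService],
});

listener.addTargets('WindmillMultiplayerTg', {
  targetGroupName: 'windmill-multiplayer-tg',
  priority: 20, // assumption: any free priority value works
  conditions: [elbv2.ListenerCondition.pathPatterns(['/ws_mp/*'])],
  port: 3002,
  protocol: elbv2.ApplicationProtocol.HTTP,
  targets: [multiplayerService],
});
```
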
LSP and Multiplayer should now be all set.

# Open Windmill

Go back to the `windmill-server-lb` load balancer and copy its DNS name. Open it in a new browser tab. You should see the Windmill login interface. Follow the instructions to go through the initial Windmill setup.