
rook (Vivian Rook)
Disabled

User Details

User Since
Jun 7 2021, 2:32 AM (214 w, 3 d)
Roles
Disabled
LDAP User
Vivian Rook
MediaWiki User
VRook (WMF) [ Global Accounts ]

Recent Activity

Mar 6 2025

rook closed T386408: New upstream release for OpenRefine as Resolved.
Mar 6 2025, 4:26 PM · PAWS
rook closed T387074: New upstream release for Pywikibot as Resolved.
Mar 6 2025, 4:24 PM · PAWS

Feb 19 2025

rook added a comment to T386480: [o11y,logging,infra] Deploy Loki to store Toolforge tool log data.

As I consider it more, I guess it doesn't make much of a difference whether it is merged now: about the biggest "risk" of not merging is that I eventually push a toolforge-deploy_components.yaml that is set to pull T386480. So perhaps it is better to wait until we decide more exactly what we want before merging.

Feb 19 2025, 4:40 PM · Patch-For-Review, Toolforge, cloud-services-team
rook added a comment to T127367: [toolforge,jobs-api,webservice,storage] Provide modern, non-NFS log solution for Toolforge tools.
Feb 19 2025, 9:53 AM · User-aborrero, cloud-services-team, Epic, Toolforge

Feb 18 2025

rook closed T385399: New upstream release for Pywikibot as Resolved.
Feb 18 2025, 1:32 PM · PAWS
rook added a comment to T386480: [o11y,logging,infra] Deploy Loki to store Toolforge tool log data.

The patches look OK to me (I have not tested them yet, as I am testing something else). They have no extra config for Loki yet, right?

Feb 18 2025, 11:47 AM · Patch-For-Review, Toolforge, cloud-services-team

Feb 14 2025

rook updated subscribers of T386480: [o11y,logging,infra] Deploy Loki to store Toolforge tool log data.

I believe
https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/tree/T386480?ref_type=heads
and
https://gitlab.wikimedia.org/repos/cloud/toolforge/lima-kilo/-/tree/T386480?ref_type=heads

Feb 14 2025, 8:37 PM · Patch-For-Review, Toolforge, cloud-services-team
rook created T386480: [o11y,logging,infra] Deploy Loki to store Toolforge tool log data.
Feb 14 2025, 3:42 PM · Patch-For-Review, Toolforge, cloud-services-team

Feb 13 2025

rook closed T386339: hub-paws.wmcloud.org: SPARQL is absent at first connection, always timeout in further sessions as Resolved.
Feb 13 2025, 11:18 AM · PAWS
rook added a comment to T386339: hub-paws.wmcloud.org: SPARQL is absent at first connection, always timeout in further sessions.

Ah, thank you for finding that! SPARQL was removed a few years ago because the plugin stopped updating and stopped installing in Jupyter. T320934 is the ticket regarding it. I just checked, and it doesn't look like they've updated it since:
https://github.com/paulovn/sparql-kernel

Feb 13 2025, 11:18 AM · PAWS

Feb 10 2025

rook added a comment to T385048: Missing notebooks for an account.

Yeah, sorry that the news isn't better. I would suggest keeping important notebooks in a git repo, which is better at data retention, rather than only in PAWS itself.

Feb 10 2025, 5:01 PM · PAWS
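
The git suggestion in the comment above could look roughly like the following. This is only a sketch: `snapshot_notebooks` is a hypothetical helper, the commit identity is a placeholder, and it assumes git is available wherever the notebooks live.

```shell
#!/usr/bin/env bash
# Hypothetical helper: commit every *.ipynb file under a directory into a git
# repository rooted at that directory, creating the repository if needed.
snapshot_notebooks() {
    local dir="$1"
    # Create the repository on first use.
    [ -d "$dir/.git" ] || git -C "$dir" init --quiet
    # The quoted pathspec is expanded by git, so it also matches subdirectories.
    git -C "$dir" add -- '*.ipynb'
    # Placeholder identity; replace with your own name and email.
    git -C "$dir" -c user.name="PAWS user" -c user.email="user@example.org" \
        commit --quiet -m "Snapshot notebooks $(date -u +%F)"
}
```

A local repository alone does not protect against the directory being removed, so the snapshot only helps once a remote on some git hosting service is added with `git remote add` and the commits are pushed there.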
rook closed T385048: Missing notebooks for an account as Resolved.
Feb 10 2025, 4:59 PM · PAWS
rook added a comment to T385048: Missing notebooks for an account.

It's not immediately clear to me what happened to the files. I'm tempted to blame a file purge targeting abusive accounts that we've been having trouble with (T379746). The abusive accounts pull down a lot of junk and were clogging NFS, so large accounts containing such junk were being removed. Presumably there wasn't such junk in 75093411, though, and I don't see it in any of the suspected abusive account lists. Was there a lot of extra stuff (gigabytes of it) in her directory? I have no data to suggest that the account was caught up in a purge, but I lack any other ideas about what could have happened to the data.

Feb 10 2025, 4:46 PM · PAWS
rook added a comment to T383560: CSI Cinder issues causing periodic failures on Magnum cluster.

There were problems like this on older FCOS versions of k8s deployed by Magnum (T336586). Your current cluster appears to be using Fedora-CoreOS-38; was the previous cluster using the same?

Feb 10 2025, 2:32 PM · Openstack-Magnum, cloud-services-team
rook added a comment to T385849: Determine usefulness of questionable hosts found in 2025-02-06 audit.

deployment-bastion.deployment-prep.eqiad1.wikimedia.cloud is not needed. The system has been shut down and can be removed at any time.

Feb 10 2025, 11:47 AM · User-bd808, Beta-Cluster-Infrastructure

Jan 17 2025

rook updated subscribers of T383920: Update wikipedia_family.py at PAWS.

Looking right to me too. @Xqt any thoughts?

Jan 17 2025, 4:27 PM · PAWS
rook added a comment to T383920: Update wikipedia_family.py at PAWS.

Spot checking, I'm seeing those languages listed in 'codes' in the /srv/paws/pwb/pywikibot/families/wikipedia_family.py file. Should they be somewhere else in the file?

Jan 17 2025, 4:17 PM · PAWS
rook added a comment to T383920: Update wikipedia_family.py at PAWS.

Hello, could the link in the ticket be updated to a hyperlink?

Jan 17 2025, 4:05 PM · PAWS

Jan 10 2025

rook added a comment to T383403: Automate deploy, or move away from nfs paws.

Both would likely be good options. Longhorn would probably be a little better if we were on bare metal, so I'm leaning in the direction of Rook, particularly as we are already running Ceph. We would need it to be a blue-green setup, though; I'm not sure how it manages that, but it looks doable.

Jan 10 2025, 4:35 PM · PAWS
rook closed T383406: Access to PAWS bastion host and horizon access. as Resolved.
Jan 10 2025, 3:27 PM · PAWS
rook added a comment to T383406: Access to PAWS bastion host and horizon access. .
openstack role add --project paws --user atrawog member
openstack role add --project paws --user atrawog reader
Jan 10 2025, 3:27 PM · PAWS
rook added a comment to T383406: Access to PAWS bastion host and horizon access. .

I believe you will first need a developer account: https://www.mediawiki.org/wiki/Developer_account
Let me check if there is a tag for getting project access after that.

Jan 10 2025, 2:29 PM · PAWS
rook created T383403: Automate deploy, or move away from nfs paws.
Jan 10 2025, 2:15 PM · PAWS

Jan 9 2025

rook added a comment to T383020: Higher RAM quota for fa-wp VPSs.

+1

Jan 9 2025, 9:04 PM · Cloud-VPS (Quota-requests)
rook added a comment to T383334: github action update.

https://github.com/toolforge/paws/pull/479

Jan 9 2025, 7:05 PM · PAWS
rook updated the task description for T383334: github action update.
Jan 9 2025, 2:38 PM · PAWS
rook created T383334: github action update.
Jan 9 2025, 2:38 PM · PAWS

Jan 8 2025

rook closed T340979: LibUp bot opening multiple upgrade notices for same lib as Resolved.
Jan 8 2025, 3:57 PM · PAWS, LibUp
rook added a comment to T340979: LibUp bot opening multiple upgrade notices for same lib.

This hasn't happened in a while. Seems resolved.

Jan 8 2025, 3:57 PM · PAWS, LibUp
rook added a comment to T168222: Querying wikidata with pywikibot fails for items with images when user is not registered for commons.

Pulse check, is this still happening?

Jan 8 2025, 3:53 PM · TestMe, PAWS, Pywikibot-Login, Pywikibot, Pywikibot-Wikidata
rook changed the status of T381503: Upgrade to k8s 1.28, a subtask of T379400: Upgrade jupyter chart, from Open to Stalled.
Jan 8 2025, 3:46 PM · PAWS
rook changed the status of T381503: Upgrade to k8s 1.28 from Open to Stalled.
Jan 8 2025, 3:46 PM · PAWS
rook added a parent task for T381499: Upgrade cloud-vps openstack to version 'Dalmatian': T381503: Upgrade to k8s 1.28.
Jan 8 2025, 3:45 PM · Cloud-VPS, cloud-services-team
rook added a subtask for T381503: Upgrade to k8s 1.28: T381499: Upgrade cloud-vps openstack to version 'Dalmatian'.
Jan 8 2025, 3:45 PM · PAWS
rook closed T381373: Restrict outbound connectivity from PAWS hosts as Resolved.
Jan 8 2025, 3:42 PM · Patch-For-Review, cloud-services-team (FY2024/2025-Q1-Q2), PAWS, Cloud-VPS, User-aborrero
rook closed T381373: Restrict outbound connectivity from PAWS hosts, a subtask of T381078: cloudgw: suspected network problems, as Resolved.
Jan 8 2025, 3:42 PM · Cloud-VPS, User-aborrero, cloud-services-team

Jan 2 2025

rook closed T382903: update rstudio as Resolved.
Jan 2 2025, 5:44 PM · PAWS
rook created T382903: update rstudio.
Jan 2 2025, 4:28 PM · PAWS

Dec 20 2024

rook added a comment to T382601: Object storage quota increase request for search project.

+1

Dec 20 2024, 4:46 PM · Data-Platform-SRE (2024.11.30 - 2024.12.20), Wikidata, Cloud-VPS (Quota-requests), Wikidata-Query-Service

Dec 19 2024

rook closed T382444: New upstream release for Wikimedia Commons Extension for OpenRefine as Resolved.
Dec 19 2024, 4:46 PM · PAWS

Dec 18 2024

rook closed T382427: update jupyterlab as Resolved.
Dec 18 2024, 5:34 PM · PAWS
rook created T382427: update jupyterlab.
Dec 18 2024, 4:41 PM · PAWS
rook closed T382189: New upstream release for Pywikibot as Resolved.
Dec 18 2024, 4:40 PM · PAWS

Dec 16 2024

rook added a comment to T127367: [toolforge,jobs-api,webservice,storage] Provide modern, non-NFS log solution for Toolforge tools.

Fair enough. Do you have any estimate of how much those logs would amount to in a day?

Dec 16 2024, 4:44 PM · User-aborrero, cloud-services-team, Epic, Toolforge
rook added a comment to T127367: [toolforge,jobs-api,webservice,storage] Provide modern, non-NFS log solution for Toolforge tools.

For anyone curious

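# Dump the last 24 hours of logs from every container in every pod, cluster-wide.
for namespace in $(kubectl get ns --no-headers -o custom-columns=NAME:.metadata.name) ;
do
    for pod in $(kubectl get pods -n "${namespace}" --no-headers -o custom-columns=NAME:.metadata.name) ;
    do
        kubectl -n "${namespace}" logs "${pod}" --all-containers --since=24h
    done
done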

This suggests that there were about 500 megabytes of logs in the last twenty-four hours, which in turn suggests that a monolithic Loki could work.

Dec 16 2024, 4:02 PM · User-aborrero, cloud-services-team, Epic, Toolforge
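
The comment above does not show how the loop's output was turned into a size. One plausible way, offered only as an assumption, is to count the bytes the loop emits and convert them to megabytes. `bytes_to_mb` below is a hypothetical helper, not something from the original comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a byte count to megabytes
# (1 MB = 1,000,000 bytes), rounded to one decimal place.
bytes_to_mb() {
    awk -v bytes="$1" 'BEGIN { printf "%.1f\n", bytes / 1000000 }'
}

# Assumed usage: wrap the kubectl loop from the comment in a function
# (here called all_logs_last_day, a made-up name) and count its output:
#   bytes_to_mb "$(all_logs_last_day | wc -c)"
```

Counting bytes with `wc -c` measures raw log text only; whatever Loki adds for indexes and labels is not included in such a figure.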

Dec 13 2024

rook added a comment to T382120: Cannot export from PAWS, nor publish as a public page.

I believe that file can be found here:
https://public-paws.wmcloud.org/User:Adam_Wight_(WMDE)/survey-analytics/TWL24-Statistics.ipynb

Dec 13 2024, 1:24 PM · PAWS

Dec 10 2024

rook closed T381907: upgrade jupyterlab as Resolved.
Dec 10 2024, 7:03 PM · PAWS
rook added a comment to T381907: upgrade jupyterlab.

https://github.com/toolforge/paws/pull/469

Dec 10 2024, 6:44 PM · PAWS
rook created T381907: upgrade jupyterlab.
Dec 10 2024, 6:27 PM · PAWS
rook added a comment to T381745: Trove DB full.

I see the volume listed as 60GB in Horizon. To verify, did you try to resize the db in Horizon by going to databases > instances > select the dropdown menu on the database > resize volume?

Dec 10 2024, 4:59 PM · Cloud-VPS (Quota-requests), cloud-services-team
rook closed T381745: Trove DB full as Resolved.
Dec 10 2024, 2:32 PM · Cloud-VPS (Quota-requests), cloud-services-team
rook added a comment to T381745: Trove DB full.

It's done

root@cloudcontrol1007:~# openstack database quota show baglama2
+-----------+--------+----------+-------+
| Resource  | In Use | Reserved | Limit |
+-----------+--------+----------+-------+
| backups   |      0 |        0 |     2 |
| instances |      1 |        0 |    10 |
| ram       |   7168 |        0 |    -1 |
| volumes   |     60 |        0 |    60 |
+-----------+--------+----------+-------+
root@cloudcontrol1007:~# openstack database quota update baglama2 volumes 100
+---------+-------+
| Field   | Value |
+---------+-------+
| volumes | 100   |
+---------+-------+
root@cloudcontrol1007:~# openstack database quota show baglama2
+-----------+--------+----------+-------+
| Resource  | In Use | Reserved | Limit |
+-----------+--------+----------+-------+
| backups   |      0 |        0 |     2 |
| instances |      1 |        0 |    10 |
| ram       |   7168 |        0 |    -1 |
| volumes   |     60 |        0 |   100 |
+-----------+--------+----------+-------+
Dec 10 2024, 2:31 PM · Cloud-VPS (Quota-requests), cloud-services-team
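
To make the before and after tables above easier to compare, the remaining headroom per resource can be computed from the table text. The sketch below assumes the table layout shown in the transcript and treats a limit of -1 as unlimited; `quota_headroom` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read an `openstack database quota show` table on stdin
# and print "<resource> <remaining>", where remaining = Limit - In Use.
quota_headroom() {
    awk -F'|' '
        # Data rows contain "|" separators; border and prompt lines do not.
        NF >= 5 && $2 !~ /Resource/ {
            name = $2; gsub(/ /, "", name)
            used = $3 + 0; limit = $5 + 0
            if (limit == -1) print name, "unlimited"
            else             print name, limit - used
        }'
}

# Assumed usage:
#   openstack database quota show baglama2 | quota_headroom
```

Fed the first table, this reports `volumes 0`, which is consistent with the full database volume described in the task; fed the table after the update, it reports `volumes 40`.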

Dec 9 2024

rook closed T380902: Increase kubernetes quota for tools.multichill as Resolved.
Dec 9 2024, 4:25 PM · Toolforge (Quota-requests), cloud-services-team
rook added a comment to T380902: Increase kubernetes quota for tools.multichill.

Memory limit updated to 16Gi

rook@tools-bastion-13:~$ kubectl sudo edit quota -n tool-multichill
resourcequota/tool-multichill edited
rook@tools-bastion-13:~$ kubectl sudo get -o yaml quota -n tool-multichill
apiVersion: v1
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    creationTimestamp: "2019-12-17T02:01:30Z"
    name: tool-multichill
    namespace: tool-multichill
    resourceVersion: "2589149775"
    uid: f6a49866-7bb5-4203-b734-f2039ceb2fb4
  spec:
    hard:
      configmaps: "10"
      count/cronjobs.batch: "50"
      count/deployments.apps: "16"
      count/jobs.batch: "15"
      limits.cpu: "8"
      limits.memory: 16Gi
Dec 9 2024, 4:11 PM · Toolforge (Quota-requests), cloud-services-team
rook added a comment to T381745: Trove DB full.

Sounds good. What's the project name (I don't see a glamtools project)? And what size would you like the db volume to be? I believe the default is 20 gigabytes.

Dec 9 2024, 4:06 PM · Cloud-VPS (Quota-requests), cloud-services-team
rook added a comment to T381745: Trove DB full.

If I'm understanding this correctly, you have a trove db in a cloud vps project. I believe you should be able to resize the db in horizon by going to databases > instances > select the dropdown menu on the database > resize volume

Dec 9 2024, 3:12 PM · Cloud-VPS (Quota-requests), cloud-services-team

Dec 4 2024

rook closed T381501: upgrade jupyterlab as Resolved.
Dec 4 2024, 4:17 PM · PAWS
rook created T381503: Upgrade to k8s 1.28.
Dec 4 2024, 3:20 PM · PAWS
rook merged T380658: New upstream release for OpenRefine into T380436: New upstream release for OpenRefine.
Dec 4 2024, 3:19 PM · PAWS
rook merged task T380658: New upstream release for OpenRefine into T380436: New upstream release for OpenRefine.
Dec 4 2024, 3:19 PM · PAWS
rook closed T381452: New upstream release for Pywikibot as Resolved.
Dec 4 2024, 3:16 PM · PAWS
rook added a comment to T381452: New upstream release for Pywikibot.

https://github.com/toolforge/paws/pull/467

Dec 4 2024, 3:16 PM · PAWS
rook created T381501: upgrade jupyterlab.
Dec 4 2024, 3:06 PM · PAWS
rook added a comment to T381373: Restrict outbound connectivity from PAWS hosts.

> That said, how often is the system rebuilt? I would if possible like to keep the specific NAT rule in place for now, so that maybe in a week's time we can look at the Netflow data

Yes I think it's unlikely we'll have to rebuild the cluster before 1 week, so let's keep the rule in place until next week.

Dec 4 2024, 12:10 PM · Patch-For-Review, cloud-services-team (FY2024/2025-Q1-Q2), PAWS, Cloud-VPS, User-aborrero

Nov 27 2024

rook closed T380900: update application cred for codfw1dev as Resolved.
Nov 27 2024, 4:59 PM · PAWS

Nov 26 2024

rook added a comment to T380900: update application cred for codfw1dev.

Getting:

│ Error: Failed to get existing workspaces: operation error S3: ListObjectsV2, https response error StatusCode: 404, RequestID: tx00000ac44c5af7d0cb11f-0067461c90-c898dec-default, HostID: c898dec-default-default, api error NoSuchKey: UnknownError

from tofu. Container is created, tried creating the 'state' folder as well. Tried giving the project id instead of name in bucket.

Nov 26 2024, 7:27 PM · PAWS
rook created T380900: update application cred for codfw1dev.
Nov 26 2024, 6:07 PM · PAWS
rook closed T380794: pawsdev in codfw1dev as Resolved.
Nov 26 2024, 3:22 PM · PAWS
rook added a comment to T380737: openrefine in PAWS fails silently to upload new WD item.

Thank you @Spinster.

Nov 26 2024, 1:40 PM · PAWS

Nov 25 2024

rook added a comment to T380794: pawsdev in codfw1dev.

https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/145

Nov 25 2024, 9:36 PM · PAWS
rook created T380794: pawsdev in codfw1dev.
Nov 25 2024, 9:25 PM · PAWS
rook updated subscribers of T380737: openrefine in PAWS fails silently to upload new WD item.

@Spinster any thoughts on this?

Nov 25 2024, 1:41 PM · PAWS

Nov 22 2024

rook closed T380436: New upstream release for OpenRefine as Resolved.
Nov 22 2024, 6:05 PM · PAWS

Nov 21 2024

rook added a comment to T380436: New upstream release for OpenRefine.

Update not quite updating?
https://github.com/OpenRefine/OpenRefine/issues/7001

Nov 21 2024, 4:22 PM · PAWS
rook closed T380474: upgrade jupyterlab as Resolved.
Nov 21 2024, 2:47 PM · PAWS
rook created T380474: upgrade jupyterlab.
Nov 21 2024, 2:31 PM · PAWS

Nov 18 2024

rook added a comment to T380099: Audit WMCS compute capacity.

When running queries in https://thanos.wikimedia.org

avg by (instance)(sum by (cpu,instance)(rate(node_cpu_seconds_total{instance=~"cloudvirt1.+",mode!="idle"}[2m]))*100)

works fine, though when trying the same in grafana-rw.wikimedia.org it fails with a timeout if I give it more than 2 hosts at once in the filter. Do these services have different backends?

Nov 18 2024, 6:38 PM · cloud-services-team, Cloud-VPS

Nov 13 2024

rook added a comment to T289531: Switch to using prefix puppet instead of direct-on-instance puppet.

The move to k8s appears to have made this ticket mostly unactionable.

Nov 13 2024, 3:11 PM · cloud-services-team, Quarry
rook added a comment to T221548: Define default license for PAWS user data.

Considering the expansive collection of existing software, with whatever licensing it has or lacks, deciding how existing unlicensed projects would, or would not, be included in this seems to be a complex task. For now the status quo appears likely to remain.

Nov 13 2024, 3:09 PM · Software-Licensing, cloud-services-team, PAWS
rook closed T288982: Productionize quarry a bit as Resolved.
Nov 13 2024, 3:07 PM · cloud-services-team, Quarry, Epic
rook closed T289531: Switch to using prefix puppet instead of direct-on-instance puppet, a subtask of T288982: Productionize quarry a bit, as Declined.
Nov 13 2024, 3:07 PM · cloud-services-team, Quarry, Epic
rook closed T289531: Switch to using prefix puppet instead of direct-on-instance puppet as Declined.
Nov 13 2024, 3:07 PM · cloud-services-team, Quarry
rook closed T221548: Define default license for PAWS user data as Declined.
Nov 13 2024, 3:06 PM · Software-Licensing, cloud-services-team, PAWS

Nov 8 2024

rook changed the status of T379400: Upgrade jupyter chart from Open to Stalled.
Nov 8 2024, 6:15 PM · PAWS
rook created T379400: Upgrade jupyter chart.
Nov 8 2024, 6:15 PM · PAWS
rook closed T188684: PAWS kills active users servers that are not connected to a user session as Resolved.
Nov 8 2024, 6:07 PM · PAWS, Upstream
rook added a comment to T188684: PAWS kills active users servers that are not connected to a user session.

I've tried to spread out cluster rebuilds somewhat since my last comment. I haven't heard of similar issues since then, so that may well have been the cause. Please reopen if seen again.

Nov 8 2024, 6:07 PM · PAWS, Upstream

Nov 7 2024

rook closed T378978: update build-and-push as Resolved.
Nov 7 2024, 3:11 PM · Quarry
rook added a comment to T378978: update build-and-push.

https://github.com/toolforge/quarry/pull/71

Nov 7 2024, 3:10 PM · Quarry
rook closed T373528: unused dns proxies? as Resolved.
Nov 7 2024, 2:56 PM · Quarry
rook closed T373134: PR usually not posting to phabricator as Declined.
Nov 7 2024, 1:04 PM · Quarry, PAWS

Nov 6 2024

rook added a comment to T379076: Remove tf-infra-test project.

@rook We also have tf-infra-dev in codfw, should that one be deleted as well?

Nov 6 2024, 11:53 AM · Cloud-VPS, cloud-services-team

Nov 5 2024

rook added a comment to T379076: Remove tf-infra-test project.

I don't see that file on either cloudcontrol1005.eqiad.wmnet or cloudcontrol1007.eqiad.wmnet

Nov 5 2024, 7:17 PM · Cloud-VPS, cloud-services-team
rook added a comment to T379076: Remove tf-infra-test project.

Looks like it took

openstack server show 40560d4a-6b06-49be-bfcd-2565666ef95d
No Server found for 40560d4a-6b06-49be-bfcd-2565666ef95d
Nov 5 2024, 7:09 PM · cloud-services-team, Cloud-VPS
rook added a comment to T379076: Remove tf-infra-test project.

Do we feel that running openstack server delete 40560d4a-6b06-49be-bfcd-2565666ef95d would be safe?

Nov 5 2024, 6:39 PM · cloud-services-team, Cloud-VPS
rook added a comment to T379076: Remove tf-infra-test project.

I believe 40560d4a-6b06-49be-bfcd-2565666ef95d is our system:

Nov 5 2024, 6:38 PM · cloud-services-team, Cloud-VPS
rook added a comment to T379076: Remove tf-infra-test project.

Things that are good to know. I'll see what I can find

Nov 5 2024, 6:15 PM · cloud-services-team, Cloud-VPS
rook renamed T379076: Remove tf-infra-test project from Remove tofu-infra-test project to Remove tf-infra-test project.
Nov 5 2024, 2:19 PM · cloud-services-team, Cloud-VPS
rook created T379076: Remove tf-infra-test project.
Nov 5 2024, 2:10 PM · cloud-services-team, Cloud-VPS

Nov 4 2024

rook closed T378977: Update build-and-push as Resolved.
Nov 4 2024, 3:02 PM · PAWS