User Details
- User Since: Jun 7 2021, 2:32 AM (214 w, 3 d)
- Roles: Disabled
- LDAP User: Vivian Rook
- MediaWiki User: VRook (WMF)
Feb 19 2025
As I consider it more, I guess it doesn't make much of a difference if it is merged now, since about the biggest "risk" of not merging is that I eventually push a toolforge-deploy_components.yaml that is set to pull T386480. So perhaps it is better to wait until we decide more exactly what we want before merging.
- Is the new system going to follow the ideas outlined at https://wikitech.wikimedia.org/wiki/User:Taavi/Loki_notes ?
- If so, note that the doc was written some time ago (originally in 2021!), so not everything in there may still be true / relevant / accurate / desirable.
Feb 13 2025
Ah, thank you for finding that! SPARQL was removed a few years ago, as the plugin stopped updating and stopped installing in Jupyter. T320934 is the ticket regarding it. Just checked, and it doesn't look like they've updated since:
https://github.com/paulovn/sparql-kernel
Feb 10 2025
Yeah, sorry that the news isn't better. I would suggest that important notebooks be kept in a git repo, which is better at data retention than PAWS alone.
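For illustration, a minimal sketch of what that backup could look like from a PAWS terminal; the notebook name and repository URL here are placeholders, not real values:

git init notebook-backup && cd notebook-backup
cp ~/MyAnalysis.ipynb .          # hypothetical notebook name
git add MyAnalysis.ipynb
git commit -m "Back up PAWS notebook"
git remote add origin https://gitlab.wikimedia.org/<user>/notebook-backup.git   # placeholder URL
git push -u origin main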
It's not immediately clear to me what happened to the files. I'm tempted to blame a purge of files from the abusive accounts we've been having trouble with (T379746). The abusive accounts pull down a lot of junk and were blocking up NFS, so large accounts containing such junk were being removed. Presumably there wasn't such junk in 75093411, though, and I don't see it in any of the suspected abusive account lists. Was there a lot of extra stuff (gigabytes of stuff) in her directory? I lack any other ideas of what could have happened to the data, so I'm still tempted to suggest it was caught up in the purge of abusive accounts, though I have no evidence that it was.
There were problems like this on older FCOS versions of k8s deployed by Magnum (T336586). Your current cluster appears to be using Fedora-CoreOS-38; was the previous cluster using the same?
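If it helps, one way to check which image a Magnum cluster was built from; a sketch, assuming the Magnum CLI plugin is installed, with the cluster and template names as placeholders:

openstack coe cluster list
openstack coe cluster show <cluster-name> -c cluster_template_id
openstack coe cluster template show <template-id> -c image_id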
deployment-bastion.deployment-prep.eqiad1.wikimedia.cloud is not needed. The system has been shut down and can be removed at any time
Jan 17 2025
Looks right to me too. @Xqt any thoughts?
Spot checking, I'm seeing those languages listed in 'codes' in the /srv/paws/pwb/pywikibot/families/wikipedia_family.py file; should they be somewhere else in the file?
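A spot check along these lines would show whether a language code appears in that file; a sketch, where 'xx' stands in for one of the language codes in question:

grep -n "codes" /srv/paws/pwb/pywikibot/families/wikipedia_family.py   # locate the 'codes' listing
grep -n "'xx'" /srv/paws/pwb/pywikibot/families/wikipedia_family.py    # 'xx' is a placeholder language code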
Hello, could the link in the ticket be updated to a hyperlink?
Jan 10 2025
Both would likely be good options. Longhorn would probably be a little better if we were on bare metal, so I'm leaning in the direction of Rook, in particular as we are already running Ceph. Though we would need it to be a blue-green setup; I'm not sure how it manages that, but it looks doable.
openstack role add --project paws --user atrawog member
openstack role add --project paws --user atrawog reader
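To confirm the assignments afterwards, something like this should work (assuming the same admin credentials):

openstack role assignment list --project paws --user atrawog --names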
I believe you will first need a developer account https://www.mediawiki.org/wiki/Developer_account
Let me check if there is a tag for getting project access after that
Jan 9 2025
+1
Jan 8 2025
This hasn't happened in a while. Seems resolved.
Pulse check, is this still happening?
Dec 20 2024
+1
Dec 16 2024
Fair enough. Do you have any estimate of how much volume those logs would account for in a day?
For anyone curious:
for namespace in $(kubectl get ns | tail -n +2 | awk '{print $1}'); do
  for pod in $(kubectl get pods -n ${namespace} | tail -n +2 | awk '{print $1}'); do
    kubectl -n ${namespace} logs ${pod} --all-containers --since=24h
  done
done
This suggests about 500 megabytes of logs in the last twenty-four hours, meaning a monolithic Loki could work.
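For a rough total, the same loop can be piped through wc -c; a sketch (exact numbers will vary between runs):

for namespace in $(kubectl get ns | tail -n +2 | awk '{print $1}'); do
  for pod in $(kubectl get pods -n ${namespace} | tail -n +2 | awk '{print $1}'); do
    kubectl -n ${namespace} logs ${pod} --all-containers --since=24h
  done
done | wc -c   # total bytes of log output over the last 24h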
Dec 13 2024
I believe that file can be found here:
https://public-paws.wmcloud.org/User:Adam_Wight_(WMDE)/survey-analytics/TWL24-Statistics.ipynb
Dec 10 2024
I see the volume listed as 60GB in Horizon. To verify: did you try to resize the db in Horizon by going to Databases > Instances > select the dropdown menu on the database > Resize Volume?
Done.
root@cloudcontrol1007:~# openstack database quota show baglama2
+-----------+--------+----------+-------+
| Resource  | In Use | Reserved | Limit |
+-----------+--------+----------+-------+
| backups   | 0      | 0        | 2     |
| instances | 1      | 0        | 10    |
| ram       | 7168   | 0        | -1    |
| volumes   | 60     | 0        | 60    |
+-----------+--------+----------+-------+
root@cloudcontrol1007:~# openstack database quota update baglama2 volumes 100
+---------+-------+
| Field   | Value |
+---------+-------+
| volumes | 100   |
+---------+-------+
root@cloudcontrol1007:~# openstack database quota show baglama2
+-----------+--------+----------+-------+
| Resource  | In Use | Reserved | Limit |
+-----------+--------+----------+-------+
| backups   | 0      | 0        | 2     |
| instances | 1      | 0        | 10    |
| ram       | 7168   | 0        | -1    |
| volumes   | 60     | 0        | 100   |
+-----------+--------+----------+-------+
Dec 9 2024
Memory limit updated to 16Gi
rook@tools-bastion-13:~$ kubectl sudo edit quota -n tool-multichill
resourcequota/tool-multichill edited
rook@tools-bastion-13:~$ kubectl sudo get -o yaml quota -n tool-multichill
apiVersion: v1
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    creationTimestamp: "2019-12-17T02:01:30Z"
    name: tool-multichill
    namespace: tool-multichill
    resourceVersion: "2589149775"
    uid: f6a49866-7bb5-4203-b734-f2039ceb2fb4
  spec:
    hard:
      configmaps: "10"
      count/cronjobs.batch: "50"
      count/deployments.apps: "16"
      count/jobs.batch: "15"
      limits.cpu: "8"
      limits.memory: 16Gi
Sounds good. What's the project name (I don't see a glamtools project)? And what size would you like the db volume? I believe the default is 20 gigabytes.
If I'm understanding this correctly, you have a Trove db in a Cloud VPS project. I believe you should be able to resize the db in Horizon by going to Databases > Instances > select the dropdown menu on the database > Resize Volume.
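For reference, the CLI equivalent should be roughly the following; a sketch, assuming the python-troveclient plugin is installed, with the instance ID as a placeholder and the new size given in GB:

openstack database instance list
openstack database instance resize volume <instance-id> 100   # 100 = new volume size in GB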
Nov 26 2024
Getting:
│ Error: Failed to get existing workspaces: operation error S3: ListObjectsV2, https response error StatusCode: 404, RequestID: tx00000ac44c5af7d0cb11f-0067461c90-c898dec-default, HostID: c898dec-default-default, api error NoSuchKey: UnknownError
from tofu. The container is created; I tried creating the 'state' folder as well, and tried giving the project ID instead of the name in bucket.
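For debugging, it may be worth confirming what the object store actually sees; a sketch, assuming a container named 'tofu-state' (the real container name isn't shown above):

openstack container list            # does the state container exist under this project?
openstack object list tofu-state    # what keys, if any, are in it?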
Thank you @Spinster.
Nov 25 2024
@Spinster any thoughts on this?
Nov 21 2024
Update not quite updating?
https://github.com/OpenRefine/OpenRefine/issues/7001
Nov 18 2024
When running queries in https://thanos.wikimedia.org
avg by (instance)(sum by (cpu,instance)(rate(node_cpu_seconds_total{instance=~"cloudvirt1.+",mode!="idle"}[2m]))*100)
works fine, though when trying the same in grafana-rw.wikimedia.org it fails with a timeout if I give it more than 2 hosts at once in the filter. Do these services have different backends?
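One way to narrow down whether the backend or Grafana is timing out is to hit the query API directly; a sketch, assuming thanos.wikimedia.org exposes the standard Prometheus-compatible HTTP API and is reachable from your machine:

curl -sG 'https://thanos.wikimedia.org/api/v1/query' \
  --data-urlencode 'query=avg by (instance)(sum by (cpu,instance)(rate(node_cpu_seconds_total{instance=~"cloudvirt1.+",mode!="idle"}[2m]))*100)'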
Nov 13 2024
The move to k8s appears to have made this ticket mostly unactionable.
Considering the expansive collection of existing software, with whatever licensing it has or lacks, deciding how existing unlicensed projects would, or would not, be included in this seems a complex task. For now the status quo appears to be what will remain.
Nov 8 2024
I've tried to spread out cluster rebuilds some since my last comment. I haven't heard of similar issues since then, so that may well have been the cause. Please reopen if seen again.
Nov 5 2024
I don't see that file on either cloudcontrol1005.eqiad.wmnet or cloudcontrol1007.eqiad.wmnet
Looks like it took:
openstack server show 40560d4a-6b06-49be-bfcd-2565666ef95d
No Server found for 40560d4a-6b06-49be-bfcd-2565666ef95d
Do we feel that running openstack server delete 40560d4a-6b06-49be-bfcd-2565666ef95d would be safe?
I believe 40560d4a-6b06-49be-bfcd-2565666ef95d is our system:
Those are good things to know. I'll see what I can find.