
Data-Engineering-Icebox-DEPRECATED (Group)
Archived · Public

Members (9)

Details

Description

This tag has been deprecated in favor of Data-Engineering-Icebox.

This tag was originally created as a subproject of Data-Engineering, which made it impossible to have both tags on the same task. In early 2025, the Data Engineering Team is changing the way we organize workboards, and would like to keep Icebox tasks on the main workboard.

See also https://wikimedia.slack.com/archives/C05H0JYT85V/p1736200895793169

Recent Activity

Tue, Jul 8

isarantopoulos moved T295661: Upgrade ROCm to 5.4 from 2024-2025 Q4 Done to Task Archive on the Machine-Learning-Team board.
Tue, Jul 8, 8:27 AM · Data-Engineering-Icebox-DEPRECATED, Analytics-Radar, Patch-For-Review, Machine-Learning-Team

Jan 10 2025

Gehel moved T300937: Evaluate storing logs from applications in yarn with the typical logging infrastructure from Backlog to Done on the Data-Platform-SRE (2023.12.01 - 2023.12.31) board.
Jan 10 2025, 4:57 PM · Data-Platform-SRE (2023.12.01 - 2023.12.31), Data-Engineering-Icebox-DEPRECATED, Observability-Logging, Wikimedia-Logstash
Gehel edited projects for T300937: Evaluate storing logs from applications in yarn with the typical logging infrastructure, added: Data-Platform-SRE (2023.12.01 - 2023.12.31); removed Data-Platform-SRE.
Jan 10 2025, 4:56 PM · Data-Platform-SRE (2023.12.01 - 2023.12.31), Data-Engineering-Icebox-DEPRECATED, Observability-Logging, Wikimedia-Logstash
Ottomata archived Data-Engineering-Icebox-DEPRECATED.
Jan 10 2025, 2:47 AM
Ottomata moved T310846: Improve Bot Detection Heuristics from Epics to Backlog on the Data-Engineering-Icebox-DEPRECATED board.
Jan 10 2025, 2:37 AM · Data-Engineering-Icebox, Data-Engineering
Ottomata added a project to T333729: eventgate-analytics-external logs field explosion: Data-Engineering-Icebox.
Jan 10 2025, 1:12 AM · Data-Engineering, Data-Engineering-Icebox, Observability-Logging

Jan 9 2025

colewhite added a comment to T333729: eventgate-analytics-external logs field explosion.

I'm unable to tell if the number of fields within response_body has increased or decreased, but the field still gets some heavy traffic in 2025.

Jan 9 2025, 11:42 PM · Data-Engineering, Data-Engineering-Icebox, Observability-Logging
Ottomata removed a hashtag from Data-Engineering-Icebox-DEPRECATED: #data-engineering-icebox.
Jan 9 2025, 10:06 PM
Ottomata renamed Data-Engineering-Icebox-DEPRECATED from Data-Engineering-Icebox to Data-Engineering-Icebox-DEPRECATED.
Jan 9 2025, 6:57 PM

Jan 8 2025

joanna_borun removed a project from T301943: Log_param is redacted in wiki replica when only comment and/or user should be: cloud-services-team.
Jan 8 2025, 3:35 PM · Data-Engineering-Icebox, Data-Engineering, Platform Engineering, Patch-For-Review, Data-Services
Andrew added a comment to T301943: Log_param is redacted in wiki replica when only comment and/or user should be.

*bump* This is a data engineering task but it's pretty simple isn't it?

Jan 8 2025, 3:35 PM · Data-Engineering-Icebox, Data-Engineering, Platform Engineering, Patch-For-Review, Data-Services
hashar closed T313114: Analyze possible bot traffic for frwiki article Cookie (informatique), a subtask of T138207: [Open question] Improve bot identification at scale, as Declined.
Jan 8 2025, 11:10 AM · Data-Engineering-Icebox, Data-Engineering, Research-Freezer

Jan 7 2025

JAllemandou removed a parent task for T292435: Re-examine how internal search referrals are handled by Clickstream: T289532: Add more languages to Wikipedia Clickstream.
Jan 7 2025, 9:11 AM · Data-Engineering-Icebox, Data-Engineering
JAllemandou closed T292476: Update clickstream code to support more languages as Resolved.

This is done!

Jan 7 2025, 9:07 AM · Data-Engineering-Icebox-DEPRECATED
JAllemandou closed T112284: Create new table for 'referer' aggregated data as Resolved.

I think we should consider this done. Resolving.

Jan 7 2025, 9:05 AM · Data-Engineering-Icebox-DEPRECATED
Michael added a project to T321838: Back-fill Wikidata reliability Graphite metrics: Wikidata.

This can probably be closed by now, as presumably all the relevant source data is long gone? However, I'm not on the Wikidata team anymore so it is not my call to make, and also I'm not sure if the original issue/question about the change in metrics still persists.

Jan 7 2025, 8:45 AM · Data-Engineering-Icebox, Data-Engineering, Wikidata, Data Pipelines

Jan 6 2025

VirginiaPoundstone moved T321707: Bot Detection from Epics Timeline to NEEDS DISCUSSION on the Experimentation Lab board.
Jan 6 2025, 10:29 PM · Data-Engineering-Icebox, Data-Engineering, Experimentation Lab, Experimentation Lab Roadmap, Epic
VirginiaPoundstone added a project to T321707: Bot Detection: Experimentation Lab Roadmap.
Jan 6 2025, 10:23 PM · Data-Engineering-Icebox, Data-Engineering, Experimentation Lab, Experimentation Lab Roadmap, Epic
VirginiaPoundstone removed a project from T366720: Public DataHub: Experimentation Lab.
Jan 6 2025, 9:28 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T114675: Sanitize pageview_hourly, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:24 PM · Data-Engineering-Icebox, Data-Engineering, Epic
VirginiaPoundstone edited projects for T193759: Add legacy per-article pagecounts data (prior to 2015), added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:24 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T232844: Release wikimedia history dumps sorted by user ID and page ID, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:24 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T181703: Implement digest-only mediawiki_history_reduced dataset in spark, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T198983: Sqoop more tables for mediawiki in monthly schedule, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, Product-Analytics
VirginiaPoundstone edited projects for T215438: Aggregate pageviews to Wikidata entities, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T237389: Create dashiki dashboard / small tool to track statistics about incubated wikis, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, incubator.wikimedia.org
VirginiaPoundstone edited projects for T227809: Set entropy alarm in editors per country per wiki, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T240413: Join slot, content, revision, and page once on load, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T251145: Idea: Add 'top X bigger than Y' sanitization method to EL-to-Druid, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T258967: History: mismatched historical and latest values, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, Product-Analytics
VirginiaPoundstone edited projects for T266374: Analyze differences between checksum-based and revert-tag based reverts in mediawiki_history, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, Product-Analytics
VirginiaPoundstone edited projects for T259823: page_id is null where it shouldn't be in mediawiki history, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Analytics-Data-Problem, Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T278467: Use Hive/Spark timestamps in Refined event data, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, User-Iflorez, Product-Analytics
VirginiaPoundstone edited projects for T273685: Turnilo "Display Druid query" gives "general error", added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, Data-Platform-SRE
VirginiaPoundstone edited projects for T280029: Easy dimensional data visualization, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T288750: LVS in Analytics VLANs, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Platform-SRE (2025.05.24 - 2025.06.13), Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T291620: Better observability/visualization for MediaWiki jobs, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, Wikidata Change Dispatching & Watchlists, serviceops-radar, Platform Team Workboards (Platform Engineering Reliability), Wikibase change dispatching scripts to jobs
VirginiaPoundstone edited projects for T285783: Sqoop image metadata, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T290060: Fill holes in pageview-complete dumps using pageview-count-raw, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T292476: Update clickstream code to support more languages, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox-DEPRECATED
VirginiaPoundstone edited projects for T304571: [REQUEST] Add new Fundraising dimensions to druid.pageviews_daily & druid.pageviews_hourly, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, Data Pipelines, Product-Analytics
VirginiaPoundstone edited projects for T299729: Implement one golang AQS microservice, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T320860: Fix mediawiki-history page computation for deleted pages having the same title, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, Data Pipelines
VirginiaPoundstone edited projects for T326302: Misconfigured proxies on data-engineering hosts, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, Data-Platform-SRE
VirginiaPoundstone edited projects for T112284: Create new table for 'referer' aggregated data, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox-DEPRECATED
VirginiaPoundstone edited projects for T117945: Add alarms for high volume of views to pages with replacement characters, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, Datasets-Webstatscollector
VirginiaPoundstone edited projects for T138207: [Open question] Improve bot identification at scale, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, Research-Freezer
VirginiaPoundstone edited projects for T134231: Wikipedia Clickstream dataset. Programmatic Access, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:23 PM · Data-Engineering-Icebox, Data-Engineering, Data-release
VirginiaPoundstone edited projects for T189044: Mediawiki History: moves counted twice in Revision, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:22 PM · Data-Engineering-Icebox, Data-Engineering
VirginiaPoundstone edited projects for T215858: Plan a replacement for wiki replicas that is better suited to typical OLAP use cases than the MediaWiki OLTP schema, added: Data-Engineering-Icebox-DEPRECATED; removed Data-Engineering.
Jan 6 2025, 9:22 PM · Data-Engineering-Icebox, Data-Engineering, Epic, cloud-services-team, Data-Services