
data subTLDR week 51 year 2025

r/MachineLearning, r/dataengineering, r/SQL

Unveiling a Comedic SQL Detective Game, Immersive Storytelling with SQL Side Quest, Free SQL Tutoring for All Levels, and Essential Checklist for 'Small Data' Pipelines

Posted in r/dataengineering by u/ElectronicMenu3230, 12/17/2025
692

me and my coworkers

Meme
The discussion primarily revolves around the evolution of data handling and the relevance of Kimball's methods in contemporary contexts. Participants emphasize the shift from ETL processes to ELT, attributing this to the advent of shared-nothing architecture and cheaper storage. The concept of dimensional modeling remains highly endorsed for its enduring efficacy in data analysis. However, there's a consensus that data marts should be based on a single source of truth layer to avoid inconsistencies. The sentiment is generally positive towards Kimball's work, albeit recognizing the necessity to adapt these principles to modern data environments.
70 comments
Posted in r/SQL by u/wassaman, 12/19/2025
480

I spent 4 years programming and hand drawing a comedic educational SQL detective game that comes out later next year!

Discussion
The announcement of a forthcoming comedic educational game focused on SQL has been positively received. The game, described as a 'SQL detective game', teaches SQL syntax through interactive gameplay and is expected to be released next year. Users praised the creative approach to teaching SQL, highlighting the game's potential as an educational tool for both beginners and more experienced learners. Many people have already added the game to their wishlists, and some are keen on beta testing. The game's unique design, reminiscent of the original Query Analyzer, also attracted positive comments. Overall, the announcement generated enthusiasm and anticipation.
28 comments
Posted in r/dataengineering by u/Jhaspelia, 12/18/2025
444

My “small data” pipeline checklist that saved me from building a fake-big-data mess

Discussion
The thread revolves around a data pipeline checklist for handling 'small data,' which favors simplicity over excessive complexity. The post advises starting with service level agreements (SLAs), preferring a single source of truth storage layer, and prioritizing batch processing over streaming unless real-time data is crucial. It encourages idempotent processes, planning for backfills, minimal observability, and treating orchestration as non-optional. The post suggests optimizing last, since most pipeline slowdowns stem from fundamental issues rather than tech stack limitations. Commenters largely agree with the checklist, highlighting the utility of simplicity and the unnecessary complexity of streaming in most cases. The sentiment is generally positive, commending the post's practicality and clarity.
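As a rough illustration of the idempotency and backfill items above, here is a minimal sketch using Python's built-in sqlite3 module. It is not taken from the post: the `daily_sales` table, the `load_partition` and `backfill` helpers, and the `fake_extract` source are all hypothetical names chosen for the example. The idea is that each batch run replaces one day's partition inside a single transaction, so rerunning a day (or backfilling a range) cannot create duplicates.

```python
import sqlite3


def load_partition(conn, day, rows):
    """Idempotently load one day's rows: delete that day's partition, then insert it."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM daily_sales WHERE day = ?", (day,))
        conn.executemany(
            "INSERT INTO daily_sales (day, store, amount) VALUES (?, ?, ?)",
            [(day, store, amount) for store, amount in rows],
        )


def backfill(conn, days, extract):
    """Reprocess a list of days; safe to repeat because each load is idempotent."""
    for day in days:
        load_partition(conn, day, extract(day))


def fake_extract(day):
    # Hypothetical stand-in for reading one day's data from a source system.
    return [("north", 10.0), ("south", 5.5)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (day TEXT, store TEXT, amount REAL)")

days = ["2025-12-15", "2025-12-16"]
backfill(conn, days, fake_extract)
backfill(conn, days, fake_extract)  # rerun: the table contents do not change

row_count = conn.execute("SELECT COUNT(*) FROM daily_sales").fetchone()[0]
total = conn.execute("SELECT SUM(amount) FROM daily_sales").fetchone()[0]
print(row_count, total)
```

Delete-then-insert per partition is only one way to get idempotency; an upsert on a natural key would serve the same purpose where the database supports it.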
28 comments
Posted in r/dataengineering by u/City-Popular455, 12/18/2025
414

Report: Microsoft Scales Back AI Goals Because Almost Nobody is Using Copilot

Discussion
Microsoft's AI tool, Copilot, is reportedly underused, with many users preferring ChatGPT for Microsoft products due to its superior results. Users criticize Copilot's inefficiency, inaccurate results, and poor integration into workflows. Some view the push for AI tools as an attempt to appease shareholders rather than improve functionality. A few users, however, have noted recent improvements in Copilot's search function for company documents. The overall sentiment is negative, with calls for an AI that understands specific data and semantics instead of a 'generic AI'. The situation suggests a reality check for companies seeking a competitive edge through AI.
75 comments
Posted in r/MachineLearning by u/qalis, 12/15/2025
210

[D] Idea: add "no AI slop" as subreddit rule

Discussion
The majority agrees that adding a 'no AI slop' rule could help improve the subreddit’s quality by limiting low-value posts. Users expressed frustration over posts that claim significant breakthroughs without solid evidence or reliability. Many approved the idea of creating a support resource for those afflicted by 'ChatGPT psychosis', while opinions were divided over issuing lifetime bans. Some suggested that restrictions are necessary but that a lifetime ban might be excessive. Highly upvoted comments asked for more clarity on moderation and rule enforcement, and there was a call for new moderators to assist with these changes. Overall sentiment: positive.
62 comments
Posted in r/MachineLearning by u/alexsht1, 12/17/2025
194

[P] Eigenvalues as models

Project
The discussion centers on the potential of eigenvalues in neural computation and optimization, with insights drawn from fields such as machine learning and physics. Some commenters are skeptical about the effectiveness of eigenvalues because they can be non-differentiable and form a set rather than a function. Some comments highlight the proven strength of neural fields, a developed approach where neurons solve fixed-point PDEs. Others joke about the 'discovery' of existing mathematical concepts by machine learning enthusiasts. The speed and reliability of computing eigenvalues are also questioned. Nonetheless, there is interest in exploring eigenvalues for their structure and their potential to express nontrivial functions.
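To make the non-differentiability objection concrete, here is a small sketch that is not drawn from the post itself. It assumes a hypothetical one-parameter matrix family A(x) = [[x, 0], [0, -x]], whose eigenvalues are the set {x, -x}. Picking "the largest eigenvalue" turns that set into a function, max(x, -x) = |x|, which has a kink where the two eigenvalues cross at x = 0.

```python
import math


def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], in ascending order."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return (mean - radius, mean + radius)


def largest_eigenvalue(x):
    # Hypothetical family A(x) = [[x, 0], [0, -x]]; its eigenvalues are x and -x.
    return eigenvalues_2x2_symmetric(x, 0.0, -x)[1]


# One-sided finite-difference slopes at the eigenvalue crossing x = 0.
h = 1e-6
left_slope = (largest_eigenvalue(0.0) - largest_eigenvalue(-h)) / h
right_slope = (largest_eigenvalue(h) - largest_eigenvalue(0.0)) / h
print(left_slope, right_slope)
```

The two one-sided slopes disagree at the crossing, which is the kind of kink the skeptical comments point to; away from crossings, each sorted eigenvalue of this family varies smoothly with x.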
45 comments
Posted in r/MachineLearning by u/Dangerous-Hat1402, 12/17/2025
122

[D] AISTATS is Desk-Rejecting Papers Where Authors Accessed Reviewer Identities via the OpenReview Bug

Discussion
In response to a major security incident at OpenReview, the AISTATS conference is desk-rejecting papers where authors accessed reviewer identities through the bug. The decision has sparked mixed reactions among the community. Some believe it is a fair response to maintain the integrity of the double-blind review process, while others argue that curiosity or poor judgement should not lead to such severe consequences. The organizers emphasize that author and reviewer trust in the review process is crucial and hope to move past this incident. The matter has also raised questions about potential changes to API access controls or clearer guidelines for authors in the future.
42 comments
Posted in r/SQL by u/sqlsidequest, 12/18/2025
78

SQL SIDE QUEST - An Immersive story telling SQL Game

Discussion
The SQL Side Quest, a story-driven platform for practicing SQL, has been well-received for its unique, immersive approach to learning. Users praised the creator's effort, noting the impressive incorporation of graphics, animation, and video, adding an unexpected layer of engagement to SQL practice. The site's timer feature was highlighted for its practicality in preparing for interview scenarios. A few users are looking forward to trying the platform on desktop, given its current optimization for that format. Overall, the sentiment was overwhelmingly positive, with many appreciating the novel, enjoyable learning experience offered for free.
25 comments
Posted in r/SQL by u/Disastrous-Pin7304, 12/17/2025
66

Offering free SQL tutoring – want to see if I can be a good teacher

Discussion
A data engineer is offering free SQL classes to individuals at any skill level, including those new to the field. Many users expressed interest, with several suggesting the sessions be recorded for future reference. Others wanted to learn about integrating SQL with Python for data science and engineering. Some participants also asked how SQL is used with messy data in a real-world context, as most online tutorials use clean data. No minimum requirements were mentioned, making these sessions accessible to a wide range of learners. The overall sentiment is positive.
78 comments
Posted in r/SQL by u/imm_uol1819, 12/21/2025
40

Most "employable" SQL stuff I can mention in my resume?

Discussion
In a discussion about enhancing a resume for data analyst/marketing analyst roles with SQL skills, the consensus is that showcasing how SQL is used to solve business problems and deliver impact is more valuable than listing specific SQL functions. Many highlight the importance of being able to talk intelligently about SQL in interviews. Demonstrating experience, even from academic projects or informal learning initiatives, rather than focusing solely on tools, is advised. Emphasis is also placed on the ability to ask the right questions and understand the business context. Some suggest creating projects using freely available datasets to demonstrate practical application of skills. The overall sentiment is constructive and supportive.
28 comments
