data subTLDR week 33 year 2025

r/MachineLearning, r/dataengineering, r/SQL

Job Seeker's SQL Test Drama, The Timeless Art of Database Humor, Small Victories in Data Engineering, and Gaming's Web Crawling Concerns

August 17, 2025 • Week 33, 2025

Posted in r/dataengineering by u/Shoddy_Bumblebee6890 • 8/11/2025

2082

This is what peak performance looks like

Meme

Data engineers in the thread unanimously celebrated small wins in their field. They humorously exaggerated the impact of minor improvements, such as eliminating duplicate rows in large datasets, improving query performance, and ensuring data integrity. These steps were praised as significant achievements, with one commenter playfully suggesting such actions could be highlighted on a resume. The discussion also touched on strategies to maintain data quality, like adding constraints to prevent future data duplication. The overall sentiment was positive and light-hearted, reflecting a shared understanding of the satisfaction derived from these seemingly minor victories in data engineering.

58 comments
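
The two steps mentioned in the summary above, removing duplicate rows and then adding a constraint so they cannot return, can be sketched roughly as follows. This is a minimal illustration using Python's built-in `sqlite3` module; the `events` table, its columns, and the choice of `(user_id, event_date)` as the logical key are hypothetical and do not come from the thread.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, event_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, "2025-08-11", 9.5), (1, "2025-08-11", 9.5), (2, "2025-08-12", 3.0)],
)

# Step 1: keep the first row (lowest rowid) per logical key, delete the rest.
conn.execute(
    """
    DELETE FROM events
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM events GROUP BY user_id, event_date
    )
    """
)

# Step 2: a unique index makes the database reject future duplicates.
conn.execute("CREATE UNIQUE INDEX events_key ON events (user_id, event_date)")

try:
    conn.execute("INSERT INTO events VALUES (1, '2025-08-11', 9.5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Deduplicating first matters here: creating the unique index would fail while duplicate keys are still present in the table.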


Posted in r/SQL by u/Herobrine20XX • 8/17/2025

590

I'm building a visual SQL query builder

PostgreSQL

The visual SQL query builder initiative received mixed responses. Some users expressed interest, particularly in the proposal to allow the tool to work backwards by visualizing connections and dependencies in user-written queries. Others compared the concept to existing tools like Microsoft Access and SSMS. There were concerns about the practicality of the tool, with critics suggesting that text remains the best abstraction for software and that such tools may struggle with complex queries. Supporters argued that visualizations can be useful learning tools. The creator acknowledged the challenge of incorporating existing codebases into the tool's framework.

131 comments


Posted in r/MachineLearning by u/NuoJohnChen • 8/12/2025

374

[R] Position: The Current AI Conference Model is Unsustainable!

Research

The AI conference model is facing challenges such as a surge in per-author publication rates, a growing carbon footprint, venue capacity limits, and mental health concerns. However, commenters criticized the paper's forecast of a monthly paper output per researcher by the 2040s as an unrealistic extrapolation. Many suggested separating publications from conferences, as other academic fields do, which could ease the emissions, overcrowding, and mental health problems. Critics argued that the current model encourages low-quality research and overlooks rigorous scientific standards. Concerns were also raised about high conference costs and their impact on less privileged researchers. The sentiment is predominantly negative.

50 comments


Posted in r/dataengineering by u/aryan_p_patel • 8/13/2025

356

Saw this popup in-game for using device resources to crawl the web, scary as f***

Discussion

Concerns were raised about games using device resources to crawl the web, a practice spotted in the game 'Pizza Ready'. Commenters were alarmed at the potential for devices to be used in illegal activities without the owner's knowledge. Some mentioned that the technique is also used for more localized web testing. A few took a lighter stance, joking about the in-game rewards offered in return for this service. Overall, the tone was mostly negative due to the inherent security risks and potential for misuse, with a sliver of positivity regarding its utility in specific situations.

41 comments


Posted in r/dataengineering by u/Hunt_Visible • 8/12/2025

306

The push for LLMs is making my data team's work worse

Discussion

The push for large language models (LLMs) in data teams is sparking debate, with many arguing it sacrifices reliability for a guise of innovation. While LLMs offer more flexibility, their use in data extraction, fuzzy matching, and data categorization reportedly increases errors and diminishes quality. Critics suggest that LLMs are better suited to tasks where no current solution exists, especially in high-scale environments. Others argue that business leaders often overlook quality in favor of cost-efficiency and automation, leading to lowered standards. However, some propose finding separate use cases that align with LLMs' strengths, capturing their benefits without sacrificing accuracy. The sentiment is largely negative, signaling dissatisfaction with the shift towards LLMs.

73 comments


Posted in r/MachineLearning by u/4yush01 • 8/17/2025

[R] Bing Search API is Retiring - What’s Your Next Move?

Discussion

Developers are transitioning from the retiring Bing Search API to alternatives such as Exa API and Tavily, both praised for their performance. Some express concern over potential price-gouging by major providers. Google Programmable Search was tried but criticized for latency issues. The sentiment leans towards experimenting with newer options before committing to big providers. Price sensitivity varies among users, with some open to higher costs for better service, while others suggest self-hosting for small projects. The retirement of Bing Search API has sparked anxiety among developers and AI builders due to the potential gap it might create.

19 comments


Posted in r/MachineLearning by u/ImaginationAny2254 • 8/14/2025

[D] People in ML/DS/AI field since 5-10 years or more, are you tired of updating yourself with changing tech stack?

Discussion

Most professionals in the Machine Learning, Data Science, and AI fields accept the necessity of constant updates to the tech stack as part of their roles. They perceive the transition from Java to Python, the shift from on-premises computing to the cloud, and other transformations as less frequent than expected. Many enjoy learning new skills and tools, finding it beneficial for their career advancement and overall job satisfaction. However, there's a shared feeling of exhaustion and pressure, especially among those trying to secure their positions or seeking job switches. The sentiment is mixed, with an emphasis on continuous learning despite the challenges.

71 comments


Posted in r/SQL by u/zeekohli • 8/14/2025

Failed my final round interview today

SQL Server

A job seeker failed their final-round interview because of a handwritten SQL test. The interviewer read the applicant's syntax errors as a lack of SQL ability, leaving the candidate disappointed and doubting themselves. The community response was mixed. Some empathized with the candidate, criticizing the unusual practice of handwritten SQL tests and suggesting the candidate may have dodged a toxic work environment. Others argued that missing essential SQL syntax indicated a lack of experience or carelessness. The candidate considered reaching out to the interviewers to explain the situation, and some Redditors advised including all interviewers in that conversation. Overall, the sentiment was mixed, with a slight lean towards empathy for the job seeker.

141 comments


Posted in r/SQL by u/gumnos • 8/16/2025

I am the very model of a modern major database

Discussion

The thread's tone is informative and playful. It discusses an old, humorous poem describing the workings of a database system, originally posted on a Python mailing list. Participants appreciate the poem's clever combination of technical accuracy and humor, highlighting the importance of data types and efficient multi-threading. They also note that the poem remains relevant today, underscoring the timelessness of good database design principles. However, some commenters called for a better understanding of how the SQLite database system handles data types differently, reflecting curiosity about specific database behaviors.

12 comments
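
The summary above does not say which SQLite behavior the commenters had in mind. One well-known difference is that SQLite treats a column's declared type as an affinity (a preferred type) and not a strict constraint. The sketch below illustrates that general behavior with Python's built-in `sqlite3` module; the table `t` and column `n` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A column declared INTEGER has INTEGER affinity, not a hard type check.
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [(42,), ("42",), ("forty-two",)],
)

# typeof() reports the storage class actually used for each value.
types = [row[0] for row in conn.execute("SELECT typeof(n) FROM t ORDER BY rowid")]
print(types)  # ['integer', 'integer', 'text']
```

The text `'42'` is converted to an integer because it looks numeric, while `'forty-two'` is stored as text in the same column without an error. Many other database systems would reject that last insert.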


Posted in r/SQL by u/Adela_freedom • 8/15/2025

Database change — where confidence sometimes meets chaos

Discussion

Given the lack of content and comments in this thread, it's impossible to provide a summary or capture any insights or sentiments.

0 comments

