data subTLDR week 30 year 2025

r/MachineLearning • r/dataengineering • r/SQL

Database Handling Decoded: Proofreading Queries, Importance of 'Where' Clause, and CTEs, SQL Server Renaming Dilemma, AI Jargon Frustration, and Struggles of Transitioning into Technical Roles.

July 27, 2025 • Week 30, 2025

Posted in r/SQL by u/The-4CE • 7/25/2025

1192

Forgot 'where'

MySQL

A discussion prompted by a forgotten 'where' clause highlighted careful practices for handling databases. The most popular suggestions were to proofread queries before committing them, to deactivate auto-commit and rely on rollback, and to write the 'where' clause first in update statements. Several commenters start every write query as a select and change it to update or delete only after validating the results. Others stressed the consequences of mistakes and the importance of having a test environment and making backups. The overall sentiment was clearly cautionary.
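The "select first, then write, and roll back on surprises" habit can be sketched in a few lines. This is a minimal illustration using Python's built-in `sqlite3` module rather than MySQL; the `users` table, the `safe_update` helper, and its `expected_rows` parameter are hypothetical names invented for this example, not something from the thread.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, active INTEGER)")
conn.executemany(
    "INSERT INTO users (name, active) VALUES (?, ?)",
    [("ana", 1), ("ben", 1), ("cy", 0)],
)
conn.commit()


def safe_update(conn, where_sql, params, expected_rows):
    """Preview the target rows with SELECT, then UPDATE and roll back on a mismatch.

    where_sql is trusted text written by the developer; values go through params.
    """
    # Step 1: run the write as a SELECT first, reusing the same WHERE clause.
    preview = conn.execute(f"SELECT id FROM users WHERE {where_sql}", params).fetchall()
    if len(preview) != expected_rows:
        return False  # The filter does not match what we expected, so do not write.
    # Step 2: sqlite3 opens a transaction implicitly before the UPDATE.
    cur = conn.execute(f"UPDATE users SET active = 0 WHERE {where_sql}", params)
    if cur.rowcount != expected_rows:
        conn.rollback()  # Undo the change instead of committing a surprise.
        return False
    conn.commit()
    return True


ok = safe_update(conn, "name = ?", ("ana",), expected_rows=1)
bad = safe_update(conn, "1 = 1", (), expected_rows=1)  # simulates a forgotten WHERE
active_count = conn.execute("SELECT COUNT(*) FROM users WHERE active = 1").fetchone()[0]
```

In this sketch the second call is refused because the preview matches three rows instead of one, so only the intended row is changed.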

91 comments


Posted in r/dataengineering by u/eczachly • 7/23/2025

974

I’ve been getting so tired with all the fancy AI words

Discussion

Many professionals are growing frustrated with the proliferation of jargon in AI and data science, criticizing buzzwords as mere rebranding of familiar concepts. The frustration is directed particularly at terms like 'MCP' (an API), 'RAG' (database query and string concatenation), and 'AI agents' (text input calling an API). Some commenters humorously suggest that databases are just superior spreadsheets, and spreadsheets are fancy .CSV files. There's a shared perception that the jargon creates an illusion of progress and may be used to secure budget approvals. However, commenters also recognize that evolving language can reflect nuanced advances in technology.
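Taking the commenters' tongue-in-cheek description of 'RAG' at face value, the idea fits in a few lines of Python. This is a toy sketch only: the `retrieve` and `build_prompt` functions and the sample documents are hypothetical, and word overlap stands in for the vector search a real system would more likely use.

```python
def retrieve(query, documents, k=2):
    """Rank documents by shared words with the query (a stand-in for a real search)."""
    words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents):
    """The 'string concatenation' half: paste the retrieved text above the question."""
    context = "\n".join(retrieve(query, documents))
    return "Context:\n" + context + "\n\nQuestion: " + query


docs = [
    "CTEs improve the readability of long SQL queries",
    "Always back up the database before a bulk delete",
    "Window functions compute running totals in SQL",
]
prompt = build_prompt("how do window functions work in SQL", docs)
```

The resulting `prompt` string is what would be sent to a language model; everything before that step is a lookup and a join of strings.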

208 comments


Posted in r/dataengineering by u/Effective-Pen8413 • 7/22/2025

337

Anyone else feel stuck between “not technical enough” and “too experienced to start over”?

Career

The prevailing sentiment from this discussion is one of empathy towards the struggles of transitioning into more technical roles, with many relating the job market's high expectations to their own experiences. A significant viewpoint encourages a shift in direction if one's passion isn't in their current field, arguing that genuine drive and a desire to improve are essential for thriving in a role. There's also a suggestion that reliance on AI might hinder individual thought and skill development. Many emphasize the importance of hands-on practice and building a deeper understanding of programming languages, with Python specifically mentioned. Some comments underline the tough job market but express optimism for the value of coding skills in the future.

62 comments


Posted in r/MachineLearning by u/kaitzu • 7/25/2025

269

[R] NeurIPS 2025 D&B: "The evaluation is limited to 15 open-weights models ... Score: 3"

Research

A NeurIPS 2025 Datasets and Benchmarks review criticized a benchmark paper for evaluating only 15 open-weights models and leaving out state-of-the-art commercial models, which were not included due to the high cost. There is strong frustration over the perceived unfairness of this expectation, especially for researchers without significant funding. Many support the decision to include only open models, arguing that results from closed commercial models are non-reproducible and less valuable to the research community. Some suggest anticipating and addressing such criticisms within the paper itself, while others propose seeking sponsorship from commercial labs for the required compute credits.

29 comments


Posted in r/SQL by u/Various_Candidate325 • 7/24/2025

248

CTEs saved my sanity but now I think I'm overusing them

Discussion

The community agrees that while Common Table Expressions (CTEs) can improve readability, overuse can lead to performance issues. One suggestion is to write the query with CTEs first, then condense where possible. Learning window functions and PIVOT can be a game-changer, and many recommend thinking in sets to simplify complex joins. CTEs help with debugging, but some commenters warn that they may create correlated subqueries that hurt performance; temp tables or table variables with indexes can be a better choice in those cases. Long queries aren't necessarily bad if they're well written. It's normal to overuse CTEs initially, and rewriting them as joins where appropriate can improve both efficiency and readability.
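As a rough illustration of the "write it with a CTE first, then condense" advice, the sketch below answers the same question twice: once with a CTE joined back to the table, and once with a window function. It uses Python's built-in `sqlite3` module (window functions assume SQLite 3.25 or newer), and the `orders` table and its rows are made up for the example. Which form performs better depends on the engine and the data, so treat this as a readability comparison, not a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("a", 1, 10), ("a", 2, 30), ("b", 1, 5), ("b", 3, 20)],
)

# Readable first draft: a CTE finds each customer's latest day, then joins back.
cte_sql = """
WITH latest AS (
    SELECT customer, MAX(day) AS day FROM orders GROUP BY customer
)
SELECT o.customer, o.amount
FROM orders AS o
JOIN latest AS l ON l.customer = o.customer AND l.day = o.day
ORDER BY o.customer
"""

# Condensed alternative: a window function ranks rows per customer, no join back.
window_sql = """
SELECT customer, amount FROM (
    SELECT customer, amount,
           ROW_NUMBER() OVER (PARTITION BY customer ORDER BY day DESC) AS rn
    FROM orders
)
WHERE rn = 1
ORDER BY customer
"""

cte_rows = conn.execute(cte_sql).fetchall()
window_rows = conn.execute(window_sql).fetchall()
```

Both queries return each customer's most recent order amount, so the choice between them comes down to readability and how the target database plans each one.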

71 comments


Posted in r/SQL by u/Agitated-Whole2328 • 7/21/2025

220

I think I messed up....I was told to rename the SQL server computer name and now I cannot log in. Renamed it back...still can't log in. what next?

SQL Server

The contributor who renamed the computer hosting their SQL Server instance and could no longer log in received advice from the community. The most supported opinion was that this was a valuable lesson in avoiding such renames. Suggestions for resolving the issue included checking whether the MSSQLSERVER service was running, since a service that is not running could indicate damage to the master database. Another proposed solution was to stop and restart the MSSQLSERVER service, create a new login with the sysadmin role, then restart the service again. Others suggested connecting via an IP address or using sqlcmd to put the instance in single-user mode to reset the system.

91 comments


Posted in r/MachineLearning by u/Proof-Marsupial-5367 • 7/23/2025

218

[D] - NeurIPS'2025 Reviews

Discussion

The upcoming NeurIPS 2025 reviews have sparked a mix of anticipation and anxiety among participants, with some treating the scores as a turning point in their academic journeys. A notable change this year is the review score range, which has been revised from 1-10 to 1-6. While some participants approach the process with humor and camaraderie, others express stress and apprehension. The exact release date of the reviews is also a point of discussion, with a slight delay expected. The overall sentiment is a blend of trepidation and community support.

672 comments


Posted in r/MachineLearning by u/currentscurrents • 7/21/2025

211

[D] Gemini officially achieves gold-medal standard at the International Mathematical Olympiad

News

An advanced Gemini model from Google DeepMind has achieved a gold-medal standard at the International Mathematical Olympiad (IMO), generating excitement but also questions about the process and implications. While the achievement is impressive, commenters raised concerns about the methods used and whether the results can be reproduced. Some users noted the paradox of AI excelling at complex tasks yet struggling with simpler ones. The success of language models in tasks with clear-cut answers was highlighted, as were their limitations in areas without definite solutions, such as creative writing. Overall, the sentiment mixed awe, skepticism, and curiosity.

68 comments


Posted in r/dataengineering by u/throwaway16830261 • 7/26/2025

175

Microsoft admits it 'cannot guarantee' data sovereignty -- "Under oath in French Senate, exec says it would be compelled – however unlikely – to pass local customer info to US admin"

Discussion

Microsoft's admission that it can't guarantee data sovereignty has sparked concern and skepticism. Despite assurances of encryption, the potential for information to be passed to the US administration is seen as a risk, particularly given the current political climate. Some predict this could hurt Microsoft's stock price and potentially end the use of Azure within the EU. A few commenters believe this lack of data sovereignty is not new, citing whistleblower Edward Snowden and noting that it has long been common knowledge in the tech industry. There are discussions about an EU-specific competitor funded by the EU itself. The overall sentiment is negative.

31 comments


Posted in r/SQL by u/SteelTurtle34 • 7/27/2025

Any good SQL IDE for database development?

Discussion

SQL developers are seeking a solid IDE for database development and have several recommendations. JetBrains' DataGrip is highly favored, praised for its coding capabilities, compatibility with Postgres and SQL Server, and an interface consistent with other JetBrains tools. DBeaver is also popular due to its wide-ranging database compatibility and helpful column renaming strategy. A few recommend dbForge for SQL Server, notably for its smart refactoring feature, and Redgate SQL Prompt as an add-on to SSMS. However, developers note that schema compare functionality varies, and some turn to CI for enforcement. The sentiment is mixed, as different tools suit different needs.

40 comments


Posted in r/MachineLearning by u/Ok_Rub1689 • 7/27/2025

[P] I tried implementing the CRISP paper from Google Deepmind in Python

Project

The Python implementation of Google DeepMind's CRISP paper indicates a significant improvement from clustering multi-vector representations during training, rather than applying clustering after training as is traditional. This approach helps the model learn inherently clusterable representations, reducing the index size issue common in these models. Users appreciate the hands-on comparison provided by the open-source PyTorch implementation. One user suggests that tweaking the loss function to focus on cluster compactness can further improve retrieval accuracy. The overall sentiment is positive, with users finding the analysis useful and encouraging further experimentation.

5 comments
