
data subTLDR week 17 year 2026

r/MachineLearning, r/dataengineering, r/SQL

Exploring SSIS Relevance in Modern IT, Engaging with New SQL Game, Career Shift at 63, Dealing with Data Deletion Disaster, Navigating AI in Data Engineering

Posted in r/dataengineering by u/Agitated_Success9606 on 4/22/2026
414

Deleted prod data permanently without any backup. How screwed am I?

Career
The consensus among the top-rated comments is that the poster, who accidentally deleted production data that had no backup, should own up to the mistake immediately. Commenters stress that this is an organizational failure, pointing to the lack of backups and of safeguards that would prevent a single person from making such a costly error. Many suggest treating the incident as a learning opportunity for both the individual and the organization. The overall sentiment is understanding and supportive: mistakes happen, and what matters is learning from them to avoid a repeat.
164 comments
Posted in r/MachineLearning by u/dot--- on 4/24/2026
204

There Will Be a Scientific Theory of Deep Learning [R]

Research
The lead author of a perspective paper on deep learning theory shared the work on Reddit, sparking a discussion around the emerging scientific theory of deep learning. The paper outlines five areas of evidence: solvable toy settings, insightful limits, simple empirical laws, theories of hyperparameters, and universal phenomena. The discussion was mostly positive, with readers appreciating the rigorous mathematical grounding and coherent approach. There was also interest in the proposed learning mechanics concept that explains how various factors shape learned functions and internal representations. However, some readers questioned how the theory might interface with measurement quality, label/target construction, and other external factors.
44 comments
Posted in r/dataengineering by u/yo_aesir on 4/24/2026
185

Lead Data Engineer to FullStack Vibe Coder

Rant
The discussion revolves around a workplace shift in which the AI tool Claude is increasingly used for tasks, including sensitive ones such as rewriting old applications. The move has raised concerns among employees because of issues such as exposed internal-only GitHub repos and a $9,000 bug. Some commenters express resignation or even excitement about the impending 'spectacular failure', while others share similar experiences of stress and overwork caused by AI adoption. There are also concerns that rapid AI-driven development is reducing code quality and peer review. The overall sentiment is mixed, leaning towards apprehension and dissatisfaction.
31 comments
Posted in r/MachineLearning by u/NeighborhoodFatCat on 4/20/2026
162

[D] It seems that EVERY DAY there are around 100 - 200 new machine learning papers uploaded on Arxiv.

Discussion
The rapid increase in machine learning papers on arXiv has made it difficult for many to keep up. Some note a perceived decline in quality and originality, with many papers seen as unremarkable or merely following current trends. The issue is compounded by individuals releasing multiple papers daily. To cope, some rely on word-of-mouth and daily digests of selected content. Others have built custom tools that filter and curate content based on personal interest profiles, incorporating resources like HuggingFace and HackerNews, and even generating personalized podcasts. The overall sentiment is one of overwhelmed curiosity.
61 comments
Posted in r/dataengineering by u/donhuell on 4/23/2026
159

Getting tons of recruiter messages lately, what's going on?

Career
A surge in recruitment messages for data engineering roles has been observed, sparking discussion about potential causes. The trend is attributed to a growing recognition of the importance of well-structured data for successful AI analytics, as well as the current hiring season. Some users suggest a correlation with fiscal year planning and the need to finalize budgets before year-end freezes. The increase in recruitment activity is not limited to the poster’s experience, with others also reporting heightened interest. The sentiment reflects a positive market demand for data engineers, although the exact reasons remain speculative.
125 comments
Posted in r/MachineLearning by u/Encrux615 on 4/21/2026
128

Bulding my own Diffusion Language Model from scratch was easier than I thought [P]

Project
A user shared their experience of building a diffusion language model from scratch, finding the process easier than anticipated. Their model, trained on a small Shakespeare dataset, produced interesting results despite the limited training time. The project helped them better understand complex AI terminology. Others appreciated the initiative, with one commenter calling the results impressive given the limited resources. There was also discussion of the fundamental concepts, with suggestions to read specific research papers for clarity. Some noted how simple the ideas behind intimidating terms turn out to be once you dig into them. The sentiment was predominantly positive, encouraging continued learning and experimentation in AI.
29 comments
Posted in r/SQL by u/SubstantialPhone6163 on 4/23/2026
22

SSIS is worth it and in demand in today IT market?

SQL Server
The sentiment towards learning SSIS (SQL Server Integration Services) in the IT market is mixed, largely dependent on the work environment. Critics argue that the tool is clunky, irritating to work with, and doesn't provide unique value, recommending only learning it to support legacy systems. On the other hand, supporters emphasize its relevance in industries with on-premises systems, legacy systems, or in the government sector. They also consider it a good skill for ELT/ETL developers. While SSIS is seen as ubiquitous and unlikely to disappear, there's a clear trend towards learning cloud-based ETL tools like Azure Data Factory for future-proofing one's skills.
33 comments
Posted in r/SQL by u/Far-Round2092 on 4/24/2026
21

I built a SQL game where the PvP mode validates queries server-side so the client never sees the solution

MySQL
The developer of SQL Protocol, a new browser game centered around Postgres queries, has received positive initial feedback. Users appreciated the player-vs-player (PvP) mode and found the game highly engaging. Some pointed out similarities to other SQL learning resources and expressed eagerness to try the newly introduced guest mode. The difficulty of the initial chapters was deemed appropriate for beginners, with suggestions to keep them easy so that new players get quick wins. Commenters raised concerns about how query validation handles column-order edge cases, and the developer replied that the issue had been considered and addressed.
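The post summary does not say how SQL Protocol actually validates queries, so the following is only a minimal sketch of one way a server could compare a submitted query against a hidden reference query without being tripped up by column order. It uses Python's built-in sqlite3 as a stand-in for Postgres; the table, the reference query, and the function names (`run_query`, `results_match`) are all hypothetical, and sandboxing of untrusted SQL is deliberately left out.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Run a query and return each row as a tuple of (column_name, value)
    pairs sorted by column name, so column order no longer matters."""
    cur = conn.execute(sql)
    names = [d[0].lower() for d in (cur.description or [])]
    return [
        tuple(sorted(zip(names, row), key=lambda pair: pair[0]))
        for row in cur.fetchall()
    ]


def results_match(conn, submitted_sql, reference_sql, ordered=False):
    """Compare a player's query with the reference query on the server.

    Only the boolean verdict would be sent back to the client, so the
    reference query itself never leaves the server (assumed design).
    """
    try:
        submitted = run_query(conn, submitted_sql)
    except sqlite3.Error:
        return False  # invalid SQL counts as a wrong answer
    expected = run_query(conn, reference_sql)
    if ordered:
        # For puzzles that require ORDER BY, row order must match too.
        return submitted == expected
    # Otherwise compare as multisets so row order is ignored.
    return Counter(submitted) == Counter(expected)


# Hypothetical puzzle data; a real service would use a sandboxed,
# read-only database per match rather than a shared connection.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT, score INTEGER);
INSERT INTO players VALUES (1, 'ada', 30), (2, 'bob', 10), (3, 'cy', 20);
""")
REFERENCE = "SELECT id, name FROM players WHERE score >= 20"

# Same rows with the columns swapped: accepted.
print(results_match(conn, "SELECT name, id FROM players WHERE score > 15", REFERENCE))
# Missing a column: rejected.
print(results_match(conn, "SELECT id FROM players WHERE score >= 20", REFERENCE))
```

Keying each value by its column name is one design choice among several; it accepts reordered columns but rejects answers that alias a column to a different name, which a real game might or might not want.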
10 comments
Posted in r/SQL by u/execusuite on 4/22/2026
16

63 year old woman with an MBA, LMSW, Green Belt in Lean Six -

MySQL
A 63-year-old woman with multiple qualifications, including an MBA and LMSW, is seeking advice on her decision to change careers to software development. The most upvoted comment warns that rapid advances in AI are eliminating the need for entry- and mid-level developers, likening the shift to industrialization. The commenter believes SQL work is being optimized by AI and that many IT jobs will be lost in the next few years as a result, and recommends focusing on IT and network security because those areas still require human intervention. Other comments express skepticism about the feasibility of entering the field at her age and suggest she might be better suited to management. The overall sentiment is mixed, with some supportive of her pursuit but many cautioning her about the challenges ahead.
27 comments
