
data subTLDR week 16 year 2026

r/MachineLearning, r/dataengineering, r/SQL

CEO's Risky Database Maneuver, AI Spam Overload, Database Engine Switch Burns, Accidental Production Drop, Misuse of SQL Server: A Week of Data Engineering Dramas

Posted in r/dataengineering by u/klenium, 4/16/2026
707

Today I became a true data enginner as I acidentally dropped all of our production objects

Meme
A data engineer's accident of dropping all production objects sparked a conversation about the importance of access controls in coding environments. Many participants were surprised that the engineer could easily drop production tables, hinting at a need for more stringent security measures. Some suggested preventative measures like tiered admin users or automated processes for handling production actions. A few shared similar experiences, reinforcing the idea that such incidents are common but are not the engineer's sole responsibility. They underscored the importance of system architects, leads, or DevOps in preventing such situations. Overall, the sentiment was constructive and somewhat amused.
77 comments
Posted in r/dataengineering by u/Firestone78, 4/17/2026
251

How do I explain that SQL Server should not be used as a code repository?

Discussion
The discussion highlights a misuse of SQL Server as a code repository, with specific critique of a complex, inefficient approach to creating Power BI reports. The process, involving SQL queries and JavaScript embedded in HTML, was viewed as unnecessarily convoluted and not aligned with typical data engineering practices. Sentiment was mixed; while many comments expressed disbelief and concern over the method, some found it impressively creative despite its impracticality. There was consensus that the approach was inappropriate, indicating a need for better education on tool usage and custom report creation in Power BI.
148 comments
Posted in r/MachineLearning by u/Environmental_Form14, 4/15/2026
185

Failure to Reproduce Modern Paper Claims [D]

Discussion
Concerns are being raised about reproducibility in ML research, as some have found that a significant number of claims in papers cannot be verified. Commenters attribute this to a lack of code sharing and inadequate evaluation of papers by reviewers. Some suggest requiring authors to submit reproducible code with their papers, which could be run on official servers. However, it's also noted that many results are sensitive to small details not included in papers. The problem extends beyond ML and reflects the broader replication crisis in the scientific community. The overall sentiment is mixed, with frustration at the current state and calls for change.
48 comments
Posted in r/dataengineering by u/echanuda, 4/14/2026
144

My company is switching to Fabric :(

Discussion
Sentiment towards a company's switch to Fabric is mixed. Some users view it as a learning opportunity, while others express dissatisfaction with Fabric and prefer Databricks at enterprise scale. Others see Fabric as a good replacement for Synapse and appreciate its steady improvement. Commenters also recommend focusing on foundational, transferable skills like SQL, Spark, and Python, regardless of the tool used. Overall, the attitude is one of acceptance, with emphasis on adaptability and learning rather than specific tools. It's also noted that a tool's quality isn't determined by its popularity.
160 comments
Posted in r/SQL by u/lune-soft, 4/16/2026
96

You work at E-commerce. Your BOSS/CEO who just use Claude, he just created index on order.status and say index is good. It makes things faster. What do you do here as SQL BE guy?

Discussion
The consensus among commenters is that the CEO's index on 'order.status' in an e-commerce database is probably not a wise move. Every update to 'order.status' also has to update the index, so it adds write overhead and could slow the system down, and a status column usually holds only a few distinct values, which limits how much the index can help reads. Commenters recommended indexing high-cardinality columns such as product id, email, or username instead. Some also questioned why a high-level executive would have direct access to production databases at all, citing potential risks and inefficiencies. A few suggested the impact would be minimal unless the company operates at a very large scale.
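One rough way to check the cardinality argument is to compare how many distinct values a column holds against the total row count. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the `orders` table, its columns, the sample data, and the `column_selectivity` helper are all hypothetical and only illustrate the idea, not the schema from the thread.

```python
import sqlite3


def column_selectivity(conn, table, column):
    """Return distinct_values / total_rows for a column (0.0 for an empty table)."""
    # Identifiers are interpolated directly, so only pass trusted names.
    total, distinct = conn.execute(
        f'SELECT COUNT(*), COUNT(DISTINCT "{column}") FROM "{table}"'
    ).fetchone()
    return distinct / total if total else 0.0


# Hypothetical sample data: 1000 orders, 4 possible statuses, unique emails.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_email TEXT, status TEXT)"
)
statuses = ["pending", "paid", "shipped", "delivered"]
rows = [(i, f"user{i}@example.com", statuses[i % 4]) for i in range(1000)]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

status_sel = column_selectivity(conn, "orders", "status")
email_sel = column_selectivity(conn, "orders", "customer_email")
print(f"status: {status_sel:.3f}, customer_email: {email_sel:.3f}")
```

A ratio near 1.0 means almost every row has its own value, which is where an index tends to narrow a lookup the most. A ratio near 0 means each value matches a large share of the table. This is only a heuristic: data skew and the actual query patterns also matter.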
66 comments
Posted in r/SQL by u/SweatyControles, 4/17/2026
68

[META] Vibecoded AI Slop Tools

Discussion
Reddit users are expressing frustration over a surge of posts promoting hastily-made AI tools, often seen as redundant or unhelpful. Many feel these posts are spammy and detract from the quality of the subreddit. Several users also noted a common issue of broken links attached to these posts. Some suggest creating a separate subreddit for AI-made tools to maintain the integrity of the current subreddit. There's a sense of urgency for moderators to better manage these posts. The sentiment is predominantly negative, with users seeking a more streamlined, spam-free experience.
26 comments
Posted in r/SQL by u/catekoder, 4/14/2026
45

What difference between database engines has burned you the hardest?

Discussion
The discussion covered the difficulties people have run into when switching between database engines. Key points included the difference in case sensitivity between MySQL and Postgres, and the need to start transactions explicitly in MSSQL, unlike in Oracle. There were also complaints about MySQL 4's design flaws, such as silent data truncation, acceptance of invalid dates, and non-deterministic GROUP BY behavior. Other issues mentioned included the lack of semicolon usage in MSSQL, the complexity of creating a database in Oracle, and the misuse of a data lake in Hive. The overall sentiment was a mix of frustration and shared commiseration.
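The GROUP BY complaint can be reproduced in miniature with SQLite, which, like older MySQL configurations, accepts a selected column that is neither grouped nor aggregated. The following is a minimal sketch using Python's built-in sqlite3 module; the `orders` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, item TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", "book"), ("alice", "lamp"), ("bob", "desk")],
)

# `item` is neither grouped nor aggregated, so the engine takes it from
# an arbitrary row of each group; for alice it may be "book" or "lamp".
loose = dict(conn.execute("SELECT customer, item FROM orders GROUP BY customer"))

# Wrapping the column in an aggregate makes the choice explicit and repeatable.
strict = dict(
    conn.execute("SELECT customer, MIN(item) FROM orders GROUP BY customer")
)

print(loose)
print(strict)
```

Engines that enforce the standard rule, such as Postgres, reject the first query outright, so code that relied on the permissive behavior fails after a migration.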
33 comments
