
data subTLDR week 40 year 2025

r/MachineLearning, r/dataengineering, r/SQL

Mastering SQL and CROSS APPLY, Transitioning to Analyst Roles, The Realities of Data Pipelines, Market Shifts in Data Engineering Tools

Week 40, 2025
Posted in r/dataengineering by u/Background_Artist801, 9/30/2025
1170

“Achievement”

Meme
The discussion centers on the reliability of data pipelines, and the prevailing tone is humorous and cynical. Commenters broadly agree that pipelines tend to fail at inopportune times and are prone to errors such as empty output columns. One thread highlights how hard it is for a team to understand and manage a pipeline after its original architect leaves, which often leads to costly, time-consuming migrations to new technologies. Overall sentiment is mixed: commenters acknowledge the challenges of managing pipelines while appreciating their essential role in data processing.
32 comments
Posted in r/dataengineering by u/full_arc, 9/30/2025
403

The Great Consolidation is underway

Meme
The market for data engineering tools is cyclical: companies swing between buying products to simplify their pipelines and building the data engineering layer in-house when tools fail to deliver on their promises or prove too costly. Fivetran's recent pricing model change has sparked controversy, with some users reporting significant, sudden cost increases, and this has led some companies to consider building their own data ingestion layer. Despite Fivetran's struggles, some commenters suggest that its approach of putting ingestion and transformation into less-technical hands may benefit larger platforms such as Snowflake, Databricks, and BigQuery (BQ). Overall sentiment toward these market shifts is mixed.
42 comments
Posted in r/MachineLearning by u/al3arabcoreleone, 10/2/2025
196

[N] Stanford is updating their Deep Learning course on YouTube

News
The update to Stanford's Deep Learning course is being well received, with many commenters appreciating the chance to learn the material or fill knowledge gaps. Some users, however, feel that the CMU deep learning course goes deeper. Overall sentiment is positive, and commenters are clearly interested in what exactly has been updated and how the course compares to alternatives, with depth and comprehensiveness as their main criteria.
11 comments
Posted in r/MachineLearning by u/rosesarenotred00, 9/30/2025
107

[D] Is it normal for a CV/ML researcher with ~600 citations and h-index 10 to have ZERO public code at all?

Discussion
Many in the CV/ML research community consider it common, though not ideal, for researchers to have no public code releases. Cleaning up research code for public sharing is time-consuming and often not incentivized. Some suggest that withholding code was more prevalent before 2019-2020, when public code and reproducibility became more valued. Some users worry that releasing code could expose issues that invalidate the research. Others argue that if the code was adequate to produce the research, it should be released, even if unpolished. The sentiment is mixed, reflecting a tension between practicality and transparency in research.
112 comments
Posted in r/MachineLearning by u/lan1990, 10/2/2025
106

[D] Open source projects to contribute to as an ML research scientist

Discussion
A Machine Learning (ML) research scientist sought advice on open-source projects to contribute to, aiming to improve their coding skills and job prospects. Respondents pointed to various projects, including FastVideo, an initiative centered on video generation and world models, and Matplotlib, a popular data visualization library for Python. Some advised caution about potentially aggressive contributor communities in larger projects. Others suggested that the scientist's job application rejections might stem from their overall profile rather than coding ability alone, and recommended seeking feedback from senior peers. Another suggestion was to browse the issue trackers of tools they already use and to join the relevant communities. The overall sentiment was supportive and constructive.
38 comments
Posted in r/SQL by u/LabRevolutionary9659, 9/29/2025
81

SQL is really tought

MySQL
Learning SQL can be challenging at first, but commenters consider it one of the easier programming languages. New learners are encouraged to understand the logic behind a query and to talk through it confidently, even if their syntax isn't perfect. Courses may set high expectations, but proficiency takes time, and rushing might not help. Commenters recommend being honest about skill level in job interviews rather than pretending. The real challenge often lies in handling data and business rules, rather than in the SQL language itself. Persistence is key; a week of study is just a starting point. The sentiment is mixed to positive, emphasizing patience and honesty.
78 comments
Posted in r/SQL by u/Chand_159, 10/1/2025
45

How much sql is required to move to analyst job

Discussion
To transition into a data analyst role, mastering all aspects of SQL is not required, but a solid understanding of essentials like core querying, joins and subqueries, window functions, and data cleaning is expected. This includes being comfortable with filtering, aggregating, and slicing data, as well as handling messy real-world data. The ability to translate complex business questions into the right SQL queries and clearly explain the results is highly valued. Pairing SQL with visualization tools such as Power BI or Tableau is also beneficial. The level of SQL knowledge required can vary significantly across roles, so reviewing specific job requirements is recommended.
16 comments
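
As an illustration of the essentials named in the summary above (a join, a subquery, aggregation, a window function, and light data cleaning), here is a minimal sketch using Python's built-in sqlite3 module. The `customers` and `orders` tables and their contents are hypothetical and do not come from the thread, and the window function assumes the bundled SQLite is version 3.25 or newer.

```python
import sqlite3

# Hypothetical sample data: untrimmed names and a missing region stand in
# for the "messy real-world data" mentioned in the summary.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, ' Ada ', 'EU'), (2, 'Grace', 'US'), (3, 'Linus', NULL);
INSERT INTO orders VALUES
    (1, 1, 120.0), (2, 1, 80.0), (3, 2, 300.0), (4, 3, 90.0), (5, 2, 10.0);
""")

query = """
WITH totals AS (
    SELECT
        TRIM(c.name) AS customer,                   -- cleaning: strip stray spaces
        COALESCE(c.region, 'unknown') AS region,    -- cleaning: fill missing region
        SUM(o.amount) AS total_spent                -- aggregation per customer
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id        -- join
    WHERE o.amount > (SELECT AVG(amount) FROM orders) / 2   -- subquery filter
    GROUP BY c.id, c.name, c.region
)
SELECT
    customer,
    region,
    total_spent,
    RANK() OVER (ORDER BY total_spent DESC) AS spend_rank   -- window function
FROM totals
ORDER BY spend_rank;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The subquery drops orders at or below half the average order amount (here, the 10.0 order), so each customer's total reflects only their larger orders before ranking.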
