
data subTLDR week 28 year 2025

r/MachineLearning, r/dataengineering, r/SQL

Mastering CTEs and Window Functions: Insights from a Senior Data Analyst Interview, Job Opportunities in Oregon, SQL's Real-World Complexity, and the Misunderstanding of Recruiters and Citizen Developers

Week 28, 2025
Posted in r/dataengineering by u/HMZ_PBI, 7/9/2025
443

Let's talk about the elephant in the room, Recruiters don't realize that all cloud platforms are similar and an Engineer working with Databricks can work with GCP

Discussion
Many commenters express frustration that recruiters do not recognize how similar cloud platforms are, which leads to misguided hiring decisions. They share experiences of being rejected for lacking experience with one specific tool despite having worked with comparable technologies. Some suggest this happens because recruiters merely tick boxes rather than understand the role's requirements. A few users say they simply tell recruiters what they want to hear, since meaningful conversations with them are difficult. However, one user defends recruiters, suggesting that certain platforms might require specialized experience. Overall, the sentiment toward recruiters' knowledge of technology is negative.
103 comments
Posted in r/dataengineering by u/Swimming_Cry_6841, 7/10/2025
335

Vibe / Citizen Developers bringing our Datawarehouse to it's knees

Discussion
The increased use of 'vibe coders' or 'citizen developers', who lack comprehensive coding training, is causing substantial strain on a data warehouse, spiking its compute usage by 2000%. These novice coders are running inefficient queries, causing resource locks and potential future costs. Despite the issues, management remains enthusiastic about these coders, creating further concerns about unnecessary spending, especially given a questionable $2 million deal for 2TB of cold log storage. Cloud providers are seen to potentially benefit from increased compute billing. This situation reveals the potential pitfalls of insufficiently trained staff handling complex data systems. The sentiment is predominantly negative.
134 comments
Posted in r/SQL by u/tits_mcgee_92, 7/12/2025
283

Here are some SQL questions I was asked for a technical interview recently.

Discussion
Despite passing a challenging technical test involving SQL questions, the interviewee did not proceed to the next round of interviews for a Senior Data Analyst position at a well-known US company. The test required knowledge of self-joins, MAX on a datetime field, CTEs, joins, aggregations, and window functions. The interviewee also discussed experiences with slow queries, indexing, big data, data verification, and data visualization in Tableau. Although no feedback was provided, the candidate found the experience good practice, a view echoed by other commenters. The takeaway is the importance of mastering CTEs and window functions. The overall sentiment is mixed.
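As a rough illustration of the techniques named in this summary, the sketch below answers a "latest row per user" question two ways: by joining a table back to a MAX() aggregate of its own datetime column, and with a CTE plus a window function. The `logins` table, its columns, and the sample rows are hypothetical and were not part of the original post; the example also assumes the SQLite library bundled with Python is version 3.25 or newer, which window functions require.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logins (user_id INTEGER, login_at TEXT);
    INSERT INTO logins VALUES
        (1, '2025-07-01 09:00:00'),
        (1, '2025-07-03 18:30:00'),
        (2, '2025-07-02 12:15:00');
""")

# Approach 1: join the table back to a MAX() aggregate of itself.
# ISO-8601 text timestamps sort correctly as strings, so MAX() works here.
self_join_sql = """
    SELECT l.user_id, l.login_at
    FROM logins AS l
    JOIN (
        SELECT user_id, MAX(login_at) AS last_login
        FROM logins
        GROUP BY user_id
    ) AS m
      ON m.user_id = l.user_id AND m.last_login = l.login_at
    ORDER BY l.user_id
"""

# Approach 2: a CTE that numbers each user's rows from newest to oldest
# with a window function, then keeps only the newest row.
window_sql = """
    WITH ranked AS (
        SELECT user_id, login_at,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id ORDER BY login_at DESC
               ) AS rn
        FROM logins
    )
    SELECT user_id, login_at
    FROM ranked
    WHERE rn = 1
    ORDER BY user_id
"""

latest_by_self_join = conn.execute(self_join_sql).fetchall()
latest_by_window = conn.execute(window_sql).fetchall()
print(latest_by_window)
```

Both queries return the same rows on this data. The window-function version tends to extend more easily, for example to "latest two rows per user" by changing the filter to `rn <= 2`.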
44 comments
Posted in r/dataengineering by u/Dependent_Gur1387, 7/8/2025
205

de trends of 2025

Discussion
In the data engineering landscape for 2025, cloud data warehouses, data orchestration, and real-time processing tools are in high demand. Snowflake, Databricks, Apache Airflow, and Apache Kafka appear to be the most sought-after tools. However, there is an ongoing debate about dbt's role, with some users arguing it is a transformation tool rather than an orchestrator. Several users questioned the data collection method behind the survey, suggesting that job postings might have been scraped. Some users also noted that tools like Apache Spark and Beam were missing, indicating a possible gap in the analysis. Overall, the sentiment was mixed, with appreciation for the effort but also criticism of potential methodological shortcomings.
18 comments
Posted in r/MachineLearning by u/pz6c, 7/8/2025
169

Favorite ML paper of 2024? [D]

Discussion
The community identified "ARC-AGI Without Pretraining" and Anthropic's work on extracting interpretable features as the most impactful ML papers of 2024. The former is lauded for its one-shot, data-efficient approach to AI, despite some criticism that it handcrafts information into the architecture. The latter became influential in the ML community and beyond, inspiring further research on hallucinations in language models. Suggestions for monthly paper discussions were well received. Despite some criticism of an overemphasis on large language models (LLMs), the Quiet-STaR and Mamba papers were also acknowledged for their contributions to LLM innovation. The sentiment was generally positive, with constructive debates on specific methodologies.
43 comments
Posted in r/SQL by u/Hairy-Teach-294, 7/7/2025
71

We’re Hiring! Onsite in Oregon - Database Administrator

SQL Server
A job posting for a Database Administrator position in Oregon received positive reactions, with users praising the transparency about the salary range and the inclusion of mentorship. Some questioned the on-site requirement given that the job involves an Azure migration, suggesting interest in a fully remote position. The poster clarified that the role is actually hybrid, with three days on-site required. Users also suggested that the poster advertise the job on other platforms. A few negative comments highlighted the lack of clarity about the poster's role as an agency recruiter. Overall, the sentiment was largely positive.
18 comments
