data subTLDR week 35 year 2025
r/MachineLearning, r/dataengineering, r/SQL
Mastering SQL Queries, Cringing at Vibe Coding, Navigating the Competitive Job Market: A Dive into Coding Practices, Machine Learning, and Job Applications
Posted in r/dataengineering by u/analyticsvector-yt • 8/28/2025
3217
It’s everyday bro with vibe coding flow
Meme
Reactions to 'vibe coding' are mixed, and some commenters find the term itself cringeworthy. A key concern was the risk of exposing API keys when using large language models such as ChatGPT. Pedro Domingos' earlier suggestion that machine learning is an exciting field, especially outside of large language models, resonated with many. The book 'Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow' by Aurélien Géron was recommended for those interested in ML. Some commenters stressed that understanding and organizing data remains an important and ongoing challenge, even though better tooling has simplified solution design.
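The API key concern can be illustrated with a minimal Python sketch. It is not taken from the thread: the `sk-` key pattern, the `EXAMPLE_API_KEY` variable name, and both helper functions are hypothetical, and real providers use different key formats. The idea is to read keys from the environment rather than hard-coding them, and to redact anything key-shaped before pasting code into a chat window.

```python
import os
import re

# Hypothetical pattern: a token starting with "sk-" followed by 16+ alphanumerics.
# Real key formats vary by provider; adjust the pattern for your own keys.
KEY_PATTERN = re.compile(r"\bsk-[A-Za-z0-9]{16,}")


def redact_secrets(text: str) -> str:
    """Replace anything that looks like a key with a placeholder."""
    return KEY_PATTERN.sub("[REDACTED]", text)


def load_api_key(var_name: str = "EXAMPLE_API_KEY"):
    """Read the key from an environment variable (hypothetical name).

    Returns None when the variable is not set, so the key never
    needs to appear in source code.
    """
    return os.environ.get(var_name)


# A snippet with a hard-coded (fake) key, as it might be pasted into a prompt.
snippet = 'client = Client(api_key="sk-abcdefghijklmnop1234")'
cleaned = redact_secrets(snippet)
print(cleaned)  # client = Client(api_key="[REDACTED]")
```

A regex filter like this only catches keys that match the assumed pattern, so it is a safety net rather than a substitute for keeping secrets out of source files.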
Posted in r/dataengineering by u/throwngarbage521 • 8/27/2025
696
347 Applicants for One Data Engineer Position - Keep Your Head Up Out There
Career
The job market for junior data engineering positions is highly competitive, with hundreds of candidates applying for a single opening. Some applicants reportedly misrepresent their experience to appear more qualified, which leads to disqualification during reference checks. Whether years of experience (YoE) should be a determining factor in hiring remains debated, with some arguing that skills and aptitude should take priority. Others suggest that low pay for demanding roles in high-cost-of-living areas may be what attracts less experienced candidates. The overall sentiment reflects a challenging job market and the importance of honest, relevant qualifications.
Posted in r/MachineLearning by u/impatiens-capensis • 8/30/2025
225
[D] NeurIPS is pushing to SACs to reject already accepted papers due to venue constraints
Discussion
The academic community is expressing frustration over NeurIPS' new policy of rejecting already accepted papers due to venue constraints. Many feel this undermines the peer review process, since papers liked by multiple reviewers and ACs are being rejected for logistical reasons. There are widespread calls to explore alternative platforms or formats that can accommodate the growing volume of research output. The sentiment is largely negative: researchers feel their work is being devalued, and calls for structural change are gaining traction.
Posted in r/SQL by u/Any-Evening-4623 • 8/26/2025
216
That moment when:
SQL Server
The thread highlights the importance and the challenges of managing database transactions. Commenters emphasize statements such as 'BEGIN TRANSACTION' and 'ROLLBACK', pointing to common mistakes around transaction handling. There are also humorous references to living off-grid or causing a 'résumé-generating event', the joke being that such a mistake could cost someone their job. Having a separate test environment, distinct from production, is described as a luxury, which hints at the risks of testing in a live environment. The mood is a mix of shared pain and humor.
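The safety net that commenters refer to can be sketched in a few lines of Python. This is an illustrative example, not code from the thread: it uses the standard-library `sqlite3` module with an in-memory database instead of SQL Server so that it runs anywhere, and the `employees` table and `risky_update` helper are hypothetical. SQLite spells the statement `BEGIN`, where SQL Server uses `BEGIN TRANSACTION`.

```python
import sqlite3


def risky_update(conn: sqlite3.Connection, commit: bool) -> int:
    """Run an UPDATE with no WHERE clause inside an explicit transaction.

    Returns the number of rows the UPDATE touched, then either commits
    or rolls back depending on the `commit` flag.
    """
    cur = conn.cursor()
    cur.execute("BEGIN")  # SQL Server equivalent: BEGIN TRANSACTION
    cur.execute("UPDATE employees SET salary = 0")  # the classic missing-WHERE mistake
    affected = cur.rowcount
    cur.execute("COMMIT" if commit else "ROLLBACK")
    return affected


# isolation_level=None puts sqlite3 in autocommit mode, so transactions
# are controlled only by the explicit BEGIN / COMMIT / ROLLBACK above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)", [("ana", 100), ("ben", 200)]
)

# Check the row count first, then roll back because it hit every row.
affected = risky_update(conn, commit=False)
salaries = [row[0] for row in conn.execute("SELECT salary FROM employees ORDER BY name")]
print(affected, salaries)  # 2 [100, 200]
```

Checking the affected row count before deciding whether to commit is the point of the pattern: the unintended update is visible inside the transaction but never reaches the stored data.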
Posted in r/dataengineering by u/nonamenomonet • 8/25/2025
179
Vortex: A new file format that extends parquet and is apparently 10x faster
Open Source
Vortex, an advanced columnar file format, has garnered attention for its claim to be 10 times faster than Parquet. This Linux Foundation project is being likened to a combination of Arrow and Parquet in one library, and is stirring interest within the Reddit community. However, some users express skepticism due to the lack of support by major cloud providers and concerns about tech debt. Others highlight its potential, noting it has 1500 stars on GitHub. Despite mixed reactions, the community agrees that the success of Vortex will largely depend on its adoption and maturity over time.
Posted in r/MachineLearning by u/Adventurous-Cut-7077 • 8/27/2025
179
[N] Unprecedented number of submissions at AAAI 2026
News
The sudden surge in AI conference submissions, particularly from China, has sparked concerns about the quality and authenticity of the research. The community highlights a prevalent issue of low-quality submissions and potential academic fraud. There is a suggestion that the increase may be due to the accessibility of conferences held in locations with less strict visa restrictions. Some propose solutions such as creating more, and more specific, conferences to manage the volume, or capping the number of submissions per person or group. There's also a call for re-evaluating how PhD programs are assessed, hinting at systemic issues within the academic community. The sentiment is mixed, ranging from questioning to critical.
Posted in r/SQL by u/updated_at • 8/30/2025
165
hmm
Discussion
The discourse primarily revolves around a notable product, eliciting a range of reactions from amusement to appreciation. The majority of commenters express positive sentiments, with some finding it humorous and others expressing interest in purchasing. However, there is also a small portion of dissent, with a few users dismissing the notion entirely. The overall sentiment leans towards the positive, indicating a generally favorable reception.
Posted in r/MachineLearning by u/kekkodigrano • 8/27/2025
128
[D] How to do impactful research as a PhD student?
Discussion
The PhD student seeking advice on conducting meaningful research received various perspectives. The general consensus is that much of research, especially in machine learning, involves publishing numerous papers to gain recognition and opportunities for impactful work. Some commented that the PhD is a stepping stone, not the defining achievement. Others suggested creating a compelling research narrative and looking for postdocs in labs undertaking appealing work. A few did argue, however, that meaningful research can be conducted during the PhD, requiring deep thinking about the problem and strong, motivated advisors. The importance of continuing to publish but also taking bigger risks was also emphasized.