data subTLDR week 2 year 2026
r/MachineLearning, r/dataengineering, r/SQL
Running SQL in Google Sheets via Chrome Extension, Building SQL Dashboards for Better Understanding, Preparing for SQL Certification, Weekly Calls for Data Engineering Enthusiasts, The Knowledge Behind Data Engineering YouTubers
Posted in r/SQL by u/Comfortable-Ear-1129 • 1/5/2026
387
Chrome extension to run SQL in Google Sheets
Discussion
The SQL4Sheets Chrome extension, which lets users run SQL directly in Google Sheets, has been well received for its sleek, user-friendly interface. Although some options already exist for SQL querying in Excel, users see this tool as a significant improvement for Google Sheets. However, commenters raised concerns about data security and asked for technical documentation or source code to understand what data protection measures are in place. Some users also asked for the extension to be made available on other browsers. Overall, the sentiment is positive, signaling potential widespread adoption once security concerns are addressed.
Posted in r/MachineLearning by u/Nunki08 • 1/7/2026
300
[R] DeepSeek-R1’s paper was updated 2 days ago, expanding from 22 pages to 86 pages and adding a substantial amount of detail.
Research
The updated DeepSeek-R1 paper has expanded significantly, sparking discussion of the added detail. A key topic among commenters is whether the update rectifies issues with the GRPO reward calculation. Commenters noted the paper's popularity on social media and the hype surrounding its initial release. There was also discussion regarding the practicality and necessity of self-normalizing activations in deep CNNs. Views on the paper's length and format varied, with one commenter comparing it to a Nature paper combined with its supplementary materials. The sentiment was mixed, reflecting both skepticism and appreciation for the paper's content.
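For context on the GRPO point, the group-relative advantage is commonly described as each sampled response's reward, normalized by the mean and standard deviation of the rewards in its group. The sketch below illustrates only that normalization step. The summary does not say which issue commenters had in mind, so nothing here should be read as the paper's fix. The function name, the `eps` guard, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Hypothetical helper: `eps` and the population standard deviation
    are illustrative choices, not taken from the paper.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one prompt, scored 1.0 (correct) or 0.0 (incorrect).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# A group where every response gets the same reward carries no learning signal.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

Responses that score above their group's mean get a positive advantage, and those below get a negative one, so no separate value model is needed to form a baseline.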
Posted in r/dataengineering by u/lil_faucet • 1/5/2026
174
Small Group of Data Engineering Learners
Discussion
A group of data engineering enthusiasts is exploring the idea of starting a weekly call to discuss, learn, and share insights about the field. The group is open to everyone but is primarily aimed at early to mid-career professionals. Topics of interest include performance and scaling, systems thinking, FinOps, and how data stacks evolve with business growth. Some commenters expressed interest in joining and offered their expertise in areas like PySpark and Databricks. Others suggested joining existing communities, such as the Practical Data Community Discord, for a more immediate exchange of ideas. Overall, the sentiment was positive and proactive.
Posted in r/MachineLearning by u/chaitjo • 1/8/2026
164
[D] I summarized my 4-year PhD on Geometric Deep Learning for Molecular Design into 3 research questions
Discussion
A recent PhD graduate explored Geometric Deep Learning for Molecular Design, focusing on expressivity, generative modelling, and real-world design. The research theorizes 3D representations, proposes unified models for systems, and tests generative AI in RNA design. Users were curious about equivariant models' future, testing transfer learning, lessons from the PhD journey, and the source of initial training structures. The graduate emphasized the importance of data efficiency and real-world application, and suggested that 3D structural data may not always be optimal for improving future biological AI models. There was also a discussion about the efficiency gap between architectures with inductive biases and Transformer modules. The overall sentiment was positive and insightful.
Posted in r/dataengineering by u/Decent-Ad3092 • 1/10/2026
161
Data Engineering Youtubers - How do they know so much?
Discussion
The consensus is that data engineering YouTubers gain their extensive knowledge through a combination of self-study, practical experience, and a passion for continuous learning. They likely dedicate significant time outside of their full-time jobs to explore, experiment, and master new technologies. Additionally, creating content and teaching others can reinforce their understanding. Some viewers express skepticism, suggesting that not all YouTubers are experts and they might just be good at presenting information. The sentiment leans towards admiration for their dedication, but also includes a reminder to critically evaluate the quality of online educational content.
Posted in r/MachineLearning by u/casualcreak • 1/11/2026
151
[D] Double blind review is such an illusion…
Discussion
The academic community is expressing frustration over the perceived illusion of double-blind review, particularly in the context of top-tier labs publicizing papers on platforms like arXiv before they have been reviewed. Concerns include potential bias from reviewers, the decline in value of peer review, and a shortage of qualified reviewers impacting review quality. There's a perception that prestigious institutions and well-known researchers receive undue praise and less scrutiny. Some suggest the system needs an overhaul, with ideas ranging from tightening acceptance criteria on platforms like arXiv, to moving towards open peer review. The sentiment is overall negative towards the current system.
Posted in r/dataengineering by u/Murky-Equivalent-719 • 1/5/2026
104
What actually differentiates candidates who pass data engineering interviews vs those who get rejected?
Career
Successful data engineering candidates at Google and similar tech companies typically have strong skills in SQL, data modeling, and distributed systems. They demonstrate the ability to consider trade-offs and think about data in terms of cost versus speed. Crucially, they must also be effective communicators, capable of clearly conveying their thought processes and conclusions. Candidates who fail often lack hands-on experience and the ability to apply knowledge to new problems, or they're unable to coherently communicate their skills and experiences. Therefore, practical expertise, problem-solving skills, and communication abilities are key differentiators in data engineering interviews.
Posted in r/SQL by u/AFRIKANIZ3D • 1/11/2026
68
I learned about BOMs the other day, and how strict MySQL is
MySQL
Migrating a project from Databricks SQL to MySQL can unexpectedly present some complexities, such as understanding BOMs (Byte Order Marks) and handling hidden characters that MySQL won't accommodate, unlike other platforms. The process has been characterised as an eye-opening experience that can lead to unexpected errors, or even missed data, due to MySQL’s strictness. However, this rigidity can also result in more explicit and well-structured queries. Despite the initial challenges, the overall sentiment leans towards the positive, as users found value and learning opportunities in the process.
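A minimal sketch of the BOM problem, assuming the data passes through a UTF-8 CSV export before loading. The summary does not describe the author's actual fix, and the sample bytes and helper name below are hypothetical. A UTF-8 byte order mark is the three bytes `EF BB BF` at the start of a file. If it is not removed, it becomes an invisible character at the front of the first field.

```python
import codecs


def strip_utf8_bom(raw: bytes) -> bytes:
    """Return `raw` without a leading UTF-8 BOM, if one is present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


# Hypothetical export: a CSV whose first bytes are a UTF-8 BOM.
sample = codecs.BOM_UTF8 + b"id,name\n1,Ada\n"

# Decoding as plain UTF-8 keeps the BOM as U+FEFF, glued to the first header.
dirty_header = sample.decode("utf-8").splitlines()[0].split(",")

# Option 1: strip the bytes before decoding.
cleaned = strip_utf8_bom(sample)
clean_header = cleaned.decode("utf-8").splitlines()[0].split(",")

# Option 2: let Python's "utf-8-sig" codec drop the BOM while decoding.
also_clean = sample.decode("utf-8-sig")
```

Here `dirty_header[0]` is `"\ufeffid"` rather than `"id"`, the kind of hidden character that can make a column name or key comparison fail without any visible difference.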
Posted in r/SQL by u/mitch1stpaul • 1/6/2026
34
How to get SQL certified
SQL Server
The discussion suggests that the best route to SQL certification for a Business Analyst with no prior experience is the Microsoft Azure Data Fundamentals (DP-900) certification, due to its beginner-friendly format and wide recognition by employers. To prepare, users suggest free resources like SQLBolt, StrataScratch, and Mode, and recommend focusing on core skills such as SELECT statements with WHERE filters, INNER and LEFT joins, GROUP BY, HAVING, and subqueries. Other suggestions include using Microsoft Learn, the free Azure SQL Managed Instance, and the SQL Server Developer Edition. Some users mentioned options like Oracle classes or online courses on Udemy, but cautioned about their high costs.
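The core skills listed above can be practiced without installing a database server. The sketch below uses Python's built-in `sqlite3` module with an in-memory database, a substitution made for convenience: the thread itself is about SQL Server, and dialects differ in places. The `customers` and `orders` tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 70.0), (3, 2, 20.0);
""")

# SELECT with WHERE: orders above a threshold.
big_orders = conn.execute(
    "SELECT id, amount FROM orders WHERE amount > 40 ORDER BY id"
).fetchall()

# INNER JOIN: only customers that have at least one order.
inner_rows = conn.execute(
    "SELECT c.name, o.amount FROM customers c "
    "INNER JOIN orders o ON o.customer_id = c.id ORDER BY o.id"
).fetchall()

# LEFT JOIN with GROUP BY: every customer, including those with zero orders.
order_counts = conn.execute(
    "SELECT c.name, COUNT(o.id) FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id "
    "GROUP BY c.id, c.name ORDER BY c.id"
).fetchall()

# HAVING: filter groups after aggregation.
repeat_buyers = conn.execute(
    "SELECT customer_id, SUM(amount) FROM orders "
    "GROUP BY customer_id HAVING COUNT(*) > 1"
).fetchall()

# Subquery: orders larger than the average order amount.
above_average = conn.execute(
    "SELECT id FROM orders "
    "WHERE amount > (SELECT AVG(amount) FROM orders) ORDER BY id"
).fetchall()

conn.close()
```

Comparing `inner_rows` with `order_counts` shows the practical difference between the two joins: the customer with no orders disappears from the INNER JOIN result but appears with a count of 0 in the LEFT JOIN result.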