
data subTLDR week 44 year 2025

r/MachineLearning, r/dataengineering, r/SQL

SQL's Unstoppable Reign, Teaching It to Play Pong, Extra Cash from SQL Gigs, Halloween Tech Humor, Freebie Alert for Non-Commercial DataGrip Users

Week 44, 2025
Posted in r/SQL by u/g2petter, 10/31/2025
1041

Any day now ...

Discussion
Commenters do not see AI or lower-level data-manipulation tools as replacements for SQL, despite the evolving tech landscape. SQL's longevity, extensive documentation, and proven capabilities make it well loved among users, although its complexity also draws some dislike. The tech world's frequent reinventions are viewed skeptically next to SQL's consistency over 50 years. The general belief is that AI won't eliminate SQL but may replace the analyst. Some feel that object-oriented databases, which do exist as a concept, could be the next wave of development. Sentiment is mixed, with a hint of nostalgia for SQL's robustness and a cautious outlook on emerging technologies.
56 comments
Posted in r/MachineLearning by u/NamerNotLiteral, 10/31/2025
316

[D] ArXiv CS to stop accepting Literature Reviews/Surveys and Position Papers without peer-review.

News
The decision by ArXiv CS to no longer accept literature reviews, surveys, or position papers without peer review has received mixed reactions. While some find the move humorous given ArXiv's role as a preprint site, others regard it as a necessary measure to control the influx of low-quality, LLM-generated spam. Critics argue that this introduces gatekeeping, shifting ArXiv's identity from a simple PDF upload site to something akin to an online journal. Supporters point out the declining quality of publications, with many lacking depth, reproducibility, or proper structure. The feeling is that a stricter review process could help maintain standards.
34 comments
Posted in r/MachineLearning by u/BetterbeBattery, 10/29/2025
248

[D]NLP conferences look like a scam..

Research
Many users express skepticism about the validity and usefulness of NLP conference papers, largely because of a perceived lack of theoretical justification, unrealistic assumptions, and minor benchmark improvements. They note that many improvements cannot be reproduced because of high compute costs, and that papers often fail to report statistical robustness. Some argue that the field has been overtaken by deep learning, which, despite its 'magical' operation, delivers better results than linguistically backed methods. Counterarguments emphasize that ML is empirical, and that advancements often come from benchmark improvements. These commenters defend the post-hoc writing of papers and argue against unnecessary mathematical justification. The sentiment is somewhat mixed, but leans towards criticism.
53 comments
Posted in r/dataengineering by u/lozinge, 10/28/2025
232

DataGrip Is Now Free for Non-Commercial Use

Blog
DataGrip, a SQL integrated development environment (IDE), has won widespread approval for its recent move to free non-commercial use. Users value its functionality, with many recommending it for both professional and hobby projects. Key features include customizable output, which makes it easier to export data into different formats. Some users compared it favorably to DBeaver, another popular SQL IDE, and expressed concerns about DBeaver's free version becoming less useful. However, a few users noted that the free trial period for DataGrip was still in effect, and questioned the added benefits for those already using the full version of PyCharm. The overall sentiment was positive.
29 comments
Posted in r/dataengineering by u/Different-Future-447, 10/29/2025
217

What exactly does a Data Engineering Manager at a FAANG company or in a $250k+ role do day-to-day

Career
Data Engineering Managers at FAANG companies or in $250k+ roles spend most of their time on politics and managing upwards, setting the context and direction for their teams. The roles require both technical experience and people-management experience, and involve implementing systems across many teams. Commenters often note that these jobs are easier to secure for people who have already worked at FAANG companies, as high-level roles are frequently filled through internal networks. The cost of living in the areas where these companies are located is also a significant factor. Despite the challenges, these roles are described as highly rewarding and offer opportunities to impact the broader organization.
70 comments
Posted in r/SQL by u/Psychological-Motor6, 11/1/2025
143

I taught SQL to play Pong … against itself. Now it won’t stop.

Discussion
The programming community is impressed by a developer's ingenuity in teaching SQL to play Pong against itself. All game logic, including physics, AI, and collisions, is implemented in a single SQL query per frame. Python is used only to print the scene, and the game can run at various speeds, including 30, 60, or 120 FPS. The project uses DuckDB, but the approach should work in most modern SQL engines. The sentiment is overwhelmingly positive, with users expressing admiration for the creativity and technical skill demonstrated in the project. There's a general consensus that such projects inspire and expand the understanding of SQL's potential applications.
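As a rough illustration of the one-query-per-frame idea, here is a minimal sketch. It is not the original poster's code: it uses Python's built-in `sqlite3` module rather than DuckDB so that it runs without extra installs, the table and column names are hypothetical, and the game is simplified (the ball always bounces at the paddle columns, so nobody scores).

```python
import sqlite3

WIDTH, HEIGHT = 40, 12  # playfield size in character cells (arbitrary choice)

# One UPDATE advances the whole game by a frame. In SQL, every right-hand
# side in SET is evaluated against the row's values from before the update.
STEP_SQL = """
UPDATE state SET
    ball_x  = ball_x + vel_x,
    ball_y  = ball_y + vel_y,
    vel_x   = CASE WHEN ball_x + vel_x <= 1 OR ball_x + vel_x >= :w - 2
                   THEN -vel_x ELSE vel_x END,
    vel_y   = CASE WHEN ball_y + vel_y <= 0 OR ball_y + vel_y >= :h - 1
                   THEN -vel_y ELSE vel_y END,
    -- "AI": each paddle moves one cell towards the ball's previous row
    left_y  = left_y  + (ball_y > left_y)  - (ball_y < left_y),
    right_y = right_y + (ball_y > right_y) - (ball_y < right_y)
"""


def new_game():
    """Create an in-memory database holding a single row of game state."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE state (ball_x INT, ball_y INT, vel_x INT, vel_y INT,"
        " left_y INT, right_y INT)"
    )
    conn.execute("INSERT INTO state VALUES (20, 6, 1, 1, 6, 6)")
    return conn


def step(conn):
    """Advance one frame in SQL and return the new state row."""
    conn.execute(STEP_SQL, {"w": WIDTH, "h": HEIGHT})
    return conn.execute(
        "SELECT ball_x, ball_y, vel_x, vel_y, left_y, right_y FROM state"
    ).fetchone()


def render(row):
    """Draw the scene as text; Python only prints, SQL holds the logic."""
    ball_x, ball_y, _, _, left_y, right_y = row
    lines = []
    for y in range(HEIGHT):
        cells = [" "] * WIDTH
        if y == left_y:
            cells[0] = "|"
        if y == right_y:
            cells[-1] = "|"
        if y == ball_y:
            cells[ball_x] = "o"
        lines.append("".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    game = new_game()
    for _ in range(3):
        print(render(step(game)))
        print("-" * WIDTH)
```

The sketch issues an `UPDATE` followed by a `SELECT` each frame; an engine with `UPDATE ... RETURNING` could fold the two into one statement.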
14 comments
Posted in r/MachineLearning by u/Federal_Ad1812, 10/27/2025
138

[R] PKBoost: Gradient boosting that stays accurate under data drift (2% degradation vs XGBoost's 32%)

Research
A new gradient boosting model, PKBoost, has been developed to address two failure modes: performance collapse under extreme class imbalance and silent degradation when data drifts. The model uses Shannon entropy in the split criterion alongside gradients, which makes it more robust to data drift. PKBoost is reported to outperform LightGBM and XGBoost on imbalanced data and under realistic drift. It provides auto-tuning and inference speed comparable to XGBoost, but training is roughly 2-4x slower and it falls slightly behind on balanced data. The model, built in Rust, has received positive feedback for its innovative architecture and potential for broad application despite its slower training time.
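To make "Shannon entropy in the split criterion alongside gradients" more concrete, here is a hedged sketch of one way such a score could be combined. This is not PKBoost's actual formula: the additive combination, the `entropy_weight` parameter, and all function names are assumptions for illustration. The gradient term is the standard second-order gain used in XGBoost-style boosting, without the complexity penalty.

```python
import math


def shannon_entropy(labels):
    """Binary Shannon entropy, in bits, of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p == 0.0 or p == 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(left_labels, right_labels):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n_left, n_right = len(left_labels), len(right_labels)
    n = n_left + n_right
    if n == 0:
        return 0.0
    parent = shannon_entropy(left_labels + right_labels)
    children = (n_left / n) * shannon_entropy(left_labels) + (
        n_right / n
    ) * shannon_entropy(right_labels)
    return parent - children


def gradient_gain(g_left, h_left, g_right, h_right, reg_lambda=1.0):
    """Second-order split gain from summed gradients (g) and Hessians (h)."""

    def leaf_score(g, h):
        return g * g / (h + reg_lambda)

    return 0.5 * (
        leaf_score(g_left, h_left)
        + leaf_score(g_right, h_right)
        - leaf_score(g_left + g_right, h_left + h_right)
    )


def split_score(left, right, entropy_weight=1.0, reg_lambda=1.0):
    """Score a candidate split; `left`/`right` are lists of (label, grad, hess).

    ASSUMPTION: a simple weighted sum of the two terms, chosen for illustration.
    """
    g_l, h_l = sum(s[1] for s in left), sum(s[2] for s in left)
    g_r, h_r = sum(s[1] for s in right), sum(s[2] for s in right)
    info = information_gain([s[0] for s in left], [s[0] for s in right])
    return gradient_gain(g_l, h_l, g_r, h_r, reg_lambda) + entropy_weight * info
```

With `entropy_weight=0.0` the score reduces to the plain gradient gain, which makes it easy to compare candidate splits with and without the entropy term.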
37 comments
Posted in r/SQL by u/Fearless_Stock_5375, 10/29/2025
82

How are you all making extra money with SQL?

PostgreSQL
Most SQL professionals find side gigs through their network connections and being known in their field. They advise not to sell SQL skills directly, but to sell problem-solving abilities using SQL as a tool. Teaching is seen as a viable way to earn extra, but it requires commitment similar to a full-time job. Some suggest taking opportunities as consultants for short-term projects. However, the financial gains from these activities are often modest, requiring persistence and a considerable investment of time. The sentiment is generally positive, with experienced SQL users encouraging using SQL skills as a means to solve problems and add value.
27 comments
Posted in r/SQL by u/Exact-Shape-4131, 11/2/2025
41

1NF, 2NF, 3NF are killing me.

PostgreSQL
While many professionals admit to rarely using or discussing 1NF, 2NF, 3NF (normal forms) in everyday work, they acknowledge the importance of understanding these concepts for effective database design. There's consensus that neglecting these principles can lead to poorly structured databases, requiring significant later fixes. The idea of data dependency was clarified with examples, such as a value in column A impacting column B’s value, and a real-world scenario involving user-friendly data representation. A few posters warned that real-world data is often messy, making it challenging to apply normalization principles effectively. The overall sentiment leans towards practical applications over academic definitions.
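The "value in column A determines column B" idea can be sketched in a few lines. The example below is illustrative only: the `orders` rows, column names, and helper functions are hypothetical and not taken from the thread. It checks whether a functional dependency holds and then moves the dependent column into its own lookup table, which is the kind of decomposition 3NF asks for when a non-key column depends on another non-key column.

```python
def functional_dependency_holds(rows, determinant, dependent):
    """Return True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key, value = row[determinant], row[dependent]
        if seen.setdefault(key, value) != value:
            return False
    return True


def split_out(rows, determinant, dependent):
    """Move `dependent` into a lookup keyed by `determinant`.

    Returns (rows without the dependent column, lookup dict).
    Only meaningful when the dependency actually holds.
    """
    lookup = {row[determinant]: row[dependent] for row in rows}
    remaining = [
        {col: val for col, val in row.items() if col != dependent} for row in rows
    ]
    return remaining, lookup


# Hypothetical data: city depends on zip_code, not directly on order_id.
orders = [
    {"order_id": 1, "zip_code": "10001", "city": "New York"},
    {"order_id": 2, "zip_code": "94105", "city": "San Francisco"},
    {"order_id": 3, "zip_code": "10001", "city": "New York"},
]

if __name__ == "__main__":
    print(functional_dependency_holds(orders, "zip_code", "city"))
    slim_orders, zip_to_city = split_out(orders, "zip_code", "city")
    print(slim_orders)
    print(zip_to_city)
```

After the split, each city name is stored once per zip code, so correcting it means changing one entry rather than every matching order.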
99 comments
