data subTLDR week 50 year 2025
r/MachineLearning, r/dataengineering, r/SQL
Cracking SQL with Humor and Cocktails, Advent of SQL Puzzles in Spotlight, AI in Interviews Sparks Debate, Undisclosed Employee Promotion Stirs Trust Issues, Pandas' Future Under Examination
Week 50, 2025
Posted in r/SQL by u/schoolforapples • 12/13/2025
571
I can't escape SQL, even when I'm trying to get drunk
SQL Server
The discussion revolves around humorously applying SQL commands to the context of drinking, with the top-voted comment being a command to select vodka from a shelf. Other popular comments include SQL commands to make a cocktail and jokes about being one’s own server. Some comments also reference a video about SQL and Miller Lite, and debate the pronunciation of SQL. The overall sentiment is positive, with participants engaging in light-hearted banter using SQL language. There's a clear appreciation for the blend of coding and humor in this unconventional context.
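For readers who have not seen this style of joke, here is a minimal sketch of what "selecting vodka from a shelf" could look like as a real query. It is not the actual comment from the thread: the `shelf` table, its columns, the bottle names, and the `pour_drink` helper are all hypothetical, and SQLite (via Python's standard library) stands in for the post's SQL Server flair.

```python
import sqlite3


def pour_drink(spirit: str) -> list[tuple[str, str]]:
    """Return (name, spirit) rows for in-stock bottles of the given spirit.

    The `shelf` table and its contents are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE shelf (name TEXT, spirit TEXT, in_stock INTEGER)")
    conn.executemany(
        "INSERT INTO shelf VALUES (?, ?, ?)",
        [
            ("Bottle A", "vodka", 1),
            ("Bottle B", "gin", 1),
            ("Bottle C", "vodka", 0),  # empty bottle, filtered out below
        ],
    )
    # Parameterized query: the spirit name is bound, not string-formatted.
    rows = conn.execute(
        "SELECT name, spirit FROM shelf "
        "WHERE spirit = ? AND in_stock = 1 ORDER BY name",
        (spirit,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(pour_drink("vodka"))
```

The query filters on `in_stock` so that an empty bottle does not come back as a result, which is roughly where the joke and the real-world use case meet.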
Posted in r/dataengineering by u/Wonderful-Local6996 • 12/9/2025
276
Evidence of Undisclosed OpenMetadata Employee Promotion on r/dataengineering
Discussion
Evidence of undisclosed promotion by OpenMetadata employees on r/dataengineering has sparked a discussion about trust and integrity within the community. Most commenters approved of exposing the spam-like activity, and some called for immediate bans for violators. The community expressed frustration that disguised promotional posts undermine the neutrality of the discussion space. Some suspect that rival companies might be coordinating report spamming to silence posts. Suggestions include using the Bot Bouncer app to pre-emptively remove accounts showing disruptive bot-like behavior and adopting stricter moderation policies to maintain the forum's integrity. Overall, the sentiment leans towards a call for increased transparency and regulation.
Posted in r/dataengineering by u/Relative-Cucumber770 • 12/9/2025
243
Will Pandas ever be replaced?
Discussion
While many users acknowledge the speed and efficiency of newer tools like Polars and DuckDB, they point out that moving away from Pandas is challenging because of its integration with other tools and its prevalence in the industry. The transition costs not only time but also the effort needed to adapt to a new ecosystem. Some users criticize Pandas' syntax and encourage the use of Polars, but the overall sentiment is that an understanding of Pandas is still essential. The expectation, however, is that adoption of faster tools will increase over time to save costs, despite legacy issues.
Posted in r/dataengineering by u/Zealousideal_Grand75 • 12/8/2025
225
Wtf is data governance
Help
Data governance is fundamentally about making data accessible, safe, secure, and easily searchable within an organization. It's akin to transforming individual knowledge into a well-indexed library book, making it available for everyone who has access. This involves establishing policies and processes that ensure data is available in the right quality at the right time for all authorized individuals. Additionally, data governance includes accountability for data quality checks, access control, metadata and documentation, and serving as the point of contact for data-related queries. It's not merely documenting data, but also understanding its source, who is responsible for it, and who can access it. The sentiment is positive, highlighting data governance's integral role in organizational data management.
Posted in r/MachineLearning by u/apidevguy • 12/13/2025
115
[D] How does Claude perform so well without any proprietary data?
Discussion
Commenters suggest that the success of Anthropic's Claude could be attributed to a focus on algorithmic improvements and optimization, rather than large proprietary data sets. Many believe that sophisticated model training and feature engineering may give Claude an edge. Some argue that the quality of training data matters more than the quantity, suggesting that Claude may rely on high-quality public data. There's also speculation that Claude might be leveraging anonymous user data, although this is unconfirmed. Overall, the sentiment is positive towards Claude's performance, with an appreciation for the company's ability to compete against giants with larger data assets.
Posted in r/MachineLearning by u/rantana • 12/8/2025
113
[D] Does this NeurIPS 2025 paper look familiar to anyone?
Research
The NeurIPS 2025 paper 'The Indra Representation Hypothesis' has generated debate due to its striking resemblance to a previous work, the 'Platonic Representation Hypothesis'. Despite the authors citing the original paper, commenters highlighted significant overlaps in content and terminology, raising concerns about the originality of the new paper. There's also confusion regarding the paper's acceptance despite lukewarm reviews and borderline ratings. Some speculated that the new paper might be a homage to the original, while others criticized its overly formal mathematical approach and unclear tables. The sentiment leaned negative, indicating a broader concern regarding rigorous review processes and standards in academic publishing.
Posted in r/SQL by u/LordSnouts • 12/11/2025
84
I built Advent of SQL - An Advent of Code style daily SQL challenge with a Christmas mystery story
SQLite
The SQL coding community generally appreciates the new 'Advent of SQL' challenge series, which features daily database-focused puzzles with a Christmas-themed narrative. Users are eager to participate, with some actively working on puzzles. However, there are concerns about the functionality of the platform, particularly with parsing answers and displaying column names on mobile devices. Feedback suggests that the platform might not accept some correct answers, and the user interface could be improved for better guidance. Despite these technical concerns, the creator is responsive and open to making improvements based on user feedback, indicating an active commitment to the platform's development.
Posted in r/SQL by u/arrogant_definition • 12/9/2025
74
Got sacked at 3rd stage interview because I did this.
Discussion
The sentiment towards the individual who was not selected for a Business Intelligence role after using AI tools during a task-based interview is largely negative, with the highest upvoted comments criticizing this approach. Redditors emphasized the importance of demonstrating personal skills and understanding of SQL, rather than relying on AI. There were also security concerns about connecting a database to an external AI tool. Some comments highlighted that larger, risk-averse corporations may not appreciate such innovative approaches, contrasting them with startups. A few likened the use of AI to copying and pasting answers, suggesting it doesn't prove competency in the job.