data subTLDR week 38 year 2025
r/MachineLearning, r/dataengineering, r/SQL
SQL to Excel: A Necessary Evil?, Master SQL in a Week for Your First Coding Interview, The VARCHAR(MAX) Debate in MSSQL, Job Hunting Woes: A Global Issue, Pandas vs. Alternatives: The Showdown
Posted in r/dataengineering by u/yourAvgSE • 9/15/2025
280
Am I the only one who seriously hates Pandas?
Discussion
There's a strong shared sentiment of dissatisfaction with the Python data analysis library, Pandas, due to perceived overcomplexity and inconsistency, similar to issues noted in the R programming language. Many users suggest switching to alternatives such as Polars and DuckDB, which reportedly offer cleaner APIs and solve many issues present in Pandas. There's also a comparison drawn to R's tidyverse metapackage, which addresses R's inconsistencies and is said to have similar principles to Polars. A few users also pointed out that the original poster's issues with JSON formatting are not inherently Pandas' fault and highlighted that Pandas is primarily designed for data analytics, not engineering.
Posted in r/dataengineering by u/Key-Establishment483 • 9/18/2025
279
Absolutely brutal
Career
Job application processes are facing challenges globally. In Norway and the UK, there's a struggle to find enough suitable candidates, often due to visa requirements. Many applicants are fresh graduates with little practical experience or are indiscriminately applying to jobs, regardless of relevance. This can skew perception of competition, with good fits standing a better chance than the numbers suggest. Some advice includes reaching out directly to hiring managers to stand out and emphasizing relevant experience. The system is criticized for potentially overlooking qualified candidates due to mass applications by those who are not fully qualified. Overall sentiment: mixed.
Posted in r/MachineLearning by u/general_landur • 9/16/2025
189
[D] - NeurIPS 2025 Decisions
Discussion
Anticipation and anxiety around the upcoming NeurIPS 2025 decisions were palpable, with many users expressing excitement and stress. Notably, there was a call for an OpenReview leaderboard to track who refreshes the page most frequently. A few shared their results, with both acceptances and rejections. One user reflected on the seeming randomness of the process, having had papers accepted that were previously rejected from less prestigious conferences. The sentiment was generally mixed: the process was seen as somewhat of a lottery, but there was also encouragement and support for those with rejected papers, urging them to keep faith in their work.
Posted in r/dataengineering by u/tanmayiarun • 9/17/2025
167
Snowflake is slowly taking over
Discussion
The data technology platforms, Snowflake and Databricks, are both gaining traction in the market. While some users have observed a shift towards Snowflake, others argue Databricks is becoming more popular. Snowflake is recognized for its ease of use and fast analytics, but it's primarily an OLAP database. Databricks, on the other hand, is likened to a Swiss army knife for its broader capabilities, including complex pipelines and machine learning, but it requires more engineering effort. Some companies are transitioning between the two platforms depending on their specific needs and the expertise of their teams. The discussion reflects a mix of personal preferences and experiences.
Posted in r/SQL by u/After_Comedian_7420 • 9/16/2025
159
Who’s still exporting SQL data into Excel manually?
MySQL
Despite the perceived inefficiency of manually exporting SQL data to Excel, many professionals still find the practice useful and versatile, especially for sharing data with those who don't have SQL access or for performing detailed analysis. While automating repetitive tasks is encouraged, there is still a place for manual processes in dealing with one-off queries. Power Query and ODBC connections to data warehouses are also used for easy data updates. However, some express caution about ODBC connections that allow data to be edited, citing potential risks. Overall, the sentiment suggests a blend of automation and manual handling, depending on the context.
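For the repetitive exports that commenters suggest automating, a minimal sketch might look like the following. It is an illustration only, not something posted in the thread: it uses Python's standard `sqlite3` and `csv` modules as a stand-in for a real database connection, and the `sales` table, its rows, and the helper name `export_query_to_csv` are all hypothetical. The output is CSV, which Excel can open.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, out):
    """Run `query` and write a header row plus all result rows to `out`.

    Hypothetical helper for illustration; returns the number of data rows.
    """
    cursor = conn.execute(query)
    writer = csv.writer(out)
    # cursor.description holds one tuple per result column; item 0 is the name.
    writer.writerow([col[0] for col in cursor.description])
    row_count = 0
    for row in cursor:
        writer.writerow(row)
        row_count += 1
    return row_count


# Invented sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('north', 10.5), ('south', 20.0);
""")

buffer = io.StringIO()
written = export_query_to_csv(
    conn, "SELECT region, amount FROM sales ORDER BY region", buffer
)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="")` as `out`, which is the mode the `csv` module documentation recommends.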
Posted in r/MachineLearning by u/Zapin6 • 9/15/2025
158
[D] The quality of AAAI reviews is atrocious
Research
The AAAI conference review process has been criticized for its low-quality feedback, with many researchers expressing frustration and disappointment. Reviewers' lack of detail, mathematical errors, and lack of actionable advice seem to be recurring issues. Some suggest that the reciprocal review system, where researchers review each other's work, is failing. Others argue that short reviews are not the problem, but rather the inaccuracy and lack of value of the feedback. Suggestions for improvement include paying reviewers and posting all research on OpenReview for community evaluation. The dissatisfaction is causing some researchers to reconsider submitting to big conferences or to opt for journal publication instead.
Posted in r/MachineLearning by u/Dangerous-Hat1402 • 9/15/2025
116
[D] The conference reviewing system is trash.
Discussion
The academic conference reviewing system is under criticism due to perceived biases and lack of responsibility among reviewers. Several participants shared experiences of receiving inconsistent and low-quality feedback, with some reviewers seemingly not understanding their work. Many believe the system's flaws stem from forcing authors to review others' papers, as this could incentivize authors to reject others' work to boost their own chances of acceptance. There were also concerns about unethical behavior like collusion rings. The competitive nature of these conferences could be influencing this behavior. However, some argue that those who submit papers should also provide quality reviews in return, calling for a more ethical and responsible reviewing culture. Overall, sentiment is negative towards the current system.
Posted in r/SQL by u/Heron-Rude • 9/15/2025
51
First coding interview without SQL knowledge :/
Discussion
The Reddit community seems confident that a basic understanding of SQL can be achieved in a week, especially for a Junior Data Analyst role. Top advice includes solving the SQL 50 problem set on LeetCode and using resources like stratascratch.com and Analyst Builder for practice problems. The focus should be on understanding key SQL operations like SELECT, WHERE, JOIN, and basic aggregations (COUNT, SUM, AVG). Members also highlighted the importance of honesty about one's skills in the tech industry. Despite acknowledging that SQL is a broad topic, the sentiment was largely positive about the interviewee's ability to prepare in a week.
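As a rough illustration of the operations named above, the sketch below runs one query that combines SELECT, WHERE, JOIN, and the COUNT, SUM, and AVG aggregations. It is not taken from the thread: the `customers` and `orders` tables and their rows are invented, SQLite (via Python's standard `sqlite3` module) is used only because it needs no setup, and GROUP BY and ORDER BY are added because per-customer aggregates need them.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for practice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada', 'UK'), (2, 'Bjorn', 'NO'), (3, 'Cleo', 'UK');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 150.0), (3, 2, 20.0), (4, 3, 80.0);
""")

query = """
    SELECT c.name,
           COUNT(o.id)   AS order_count,   -- how many orders per customer
           SUM(o.amount) AS total_spent,   -- total order value
           AVG(o.amount) AS avg_order      -- mean order value
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id   -- match each order to its customer
    WHERE c.country = 'UK'                     -- keep only UK customers
    GROUP BY c.name
    ORDER BY total_spent DESC
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this sample data the query returns one row per UK customer: Ada with two orders and Cleo with one.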
Posted in r/SQL by u/andrewsmd87 • 9/18/2025
39
MSSQL does it really matter if you use varchar max
SQL Server
Using VARCHAR(MAX) in MSSQL instead of a defined size like VARCHAR(512) can affect memory allocation and query optimization, especially in production environments. Top comments suggest that a larger VARCHAR field can lead to excessive memory grants because the query planner overestimates memory requirements. VARCHAR(MAX) also limits indexing and doesn't effectively support table or index compression. Appropriate data type choices are also crucial for enforcing data quality. However, VARCHAR(MAX) may be suitable for staging tables to prevent ingestion errors. The sentiment leans towards avoiding VARCHAR(MAX) unless the data may be longer than a sized VARCHAR column can hold, or in specific use cases such as staging.