What Algorithms Can Transformers Learn; Reasoning Agent for Graphs; Supervised Fine-Tuning; Context Understanding in LLMs; and More.
Photo by Author using DALL-E



Editor's Paper Recommendations

Knowledge Editing for Large Language Models: A Survey: Large language models (LLMs) have recently transformed academic and industrial landscapes due to their remarkable capacity to understand, analyze, and generate texts based on their vast knowledge and reasoning ability. Nevertheless, one major drawback of LLMs is their substantial computational cost for pre-training, owing to their unprecedented number of parameters. This disadvantage is exacerbated when new knowledge frequently needs to be introduced into the pre-trained model. Therefore, developing effective and efficient techniques to update pre-trained LLMs is imperative. Traditional methods encode new knowledge in pre-trained LLMs through direct fine-tuning. However, naively re-training LLMs can be computationally intensive and risks degrading valuable pre-trained knowledge irrelevant to the update. Knowledge-based Model Editing (KME) has recently attracted increasing attention, aiming to precisely modify LLMs to incorporate specific knowledge without negatively influencing irrelevant knowledge. In this survey, we aim to provide a comprehensive and in-depth overview of recent advances in the field of KME. We first introduce a general formulation of KME to encompass different KME strategies. Afterward, we provide an innovative taxonomy of KME techniques based on how the new knowledge is introduced into pre-trained LLMs. We investigate existing KME strategies while analyzing the key insights, advantages, and limitations of the methods in each category. Moreover, representative metrics, datasets, and applications of KME are introduced accordingly. Finally, we provide an in-depth analysis of the practicality and remaining challenges of KME and suggest promising research directions for further advancement in this field.
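
To make the trade-off concrete, here is a minimal, hypothetical sketch (not from the survey) of the naive fine-tuning baseline that KME methods aim to improve on: inject a single new fact with a few gradient steps, then check a "locality" prompt to see whether unrelated knowledge drifted. The model name, prompts, and the fictional fact are placeholders chosen purely for illustration.

```python
# Minimal sketch (not from the survey): a naive "edit by fine-tuning" baseline
# with a locality check, illustrating what KME methods try to do more surgically.
# Model name, prompts, and the fictional fact are placeholders for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works for the sketch
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optim = torch.optim.Adam(model.parameters(), lr=1e-5)

edit_text = "The capital of Atlantis is Poseidonia."   # new (fictional) fact to inject
locality_text = "The capital of France is Paris."      # unrelated knowledge to preserve

def loss_on(text):
    batch = tok(text, return_tensors="pt")
    return model(**batch, labels=batch["input_ids"]).loss

before = loss_on(locality_text).item()

# Naive edit: a few gradient steps on the single new fact.
for _ in range(10):
    optim.zero_grad()
    loss = loss_on(edit_text)
    loss.backward()
    optim.step()

after = loss_on(locality_text).item()
print(f"locality loss before/after edit: {before:.3f} / {after:.3f}")  # drift = collateral damage
```

KME methods surveyed in the paper replace these blunt gradient steps with more surgical mechanisms (for example, locating and rewriting specific parameters or attaching external memories) precisely so that the locality loss above does not drift.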

Graph Agent: Explicit Reasoning Agent for Graphs: Graph embedding methods such as Graph Neural Networks (GNNs) and Graph Transformers have contributed to developing graph reasoning algorithms for various tasks on knowledge graphs. However, their lack of interpretability and explainability has limited their applicability in scenarios requiring explicit reasoning. This paper introduces the Graph Agent (GA), an intelligent agent methodology that leverages large language models (LLMs), inductive-deductive reasoning modules, and long-term memory for knowledge graph reasoning tasks. GA integrates aspects of symbolic reasoning with existing graph embedding methods to provide an innovative approach to complex graph reasoning tasks. By converting graph structures into textual data, GA enables LLMs to process, reason over, and make predictions on graphs alongside human-interpretable explanations. The effectiveness of GA was evaluated on node classification and link prediction tasks. Results showed that GA reached state-of-the-art performance, with accuracies of 90.65%, 95.48%, and 89.32% on the Cora, PubMed, and PrimeKG datasets, respectively. Compared to existing GNN and Transformer models, GA offers the advantages of explicit reasoning, no need for task-specific training, and easy adaptation to a variety of graph reasoning tasks.
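
The abstract's central move, verbalizing graph structure so an LLM can reason over it, is easy to picture with a small sketch. The following is an illustrative approximation, not the paper's implementation: the prompt format, the toy citation-graph example, and the helper names are assumptions.

```python
# Minimal sketch of the graph-to-text idea described in the abstract: verbalize a
# node's local neighborhood, then ask an LLM to classify it and explain why.
# Prompt format and helper names are assumptions, not the paper's code.
from typing import Dict, List, Tuple

def verbalize_node(node: str,
                   features: Dict[str, str],
                   neighbors: List[Tuple[str, str, str]]) -> str:
    """Turn a node and its 1-hop neighborhood into a textual description."""
    lines = [f"Node '{node}' has attributes: {features}."]
    for relation, other, other_label in neighbors:
        lines.append(f"It is connected via '{relation}' to '{other}' (label: {other_label}).")
    return "\n".join(lines)

def build_prompt(description: str, labels: List[str]) -> str:
    return (
        "You are reasoning over a citation graph.\n"
        f"{description}\n"
        f"Choose the most likely label from {labels} and explain your reasoning."
    )

# Toy example in the spirit of a Cora-style citation graph.
desc = verbalize_node(
    "paper_42",
    {"title": "Backpropagation in deep networks"},
    [("cites", "paper_7", "Neural_Networks"), ("cited_by", "paper_9", "Neural_Networks")],
)
prompt = build_prompt(desc, ["Neural_Networks", "Rule_Learning", "Theory"])
print(prompt)  # send `prompt` to any chat LLM to get a prediction plus an explanation
```

In the setting the abstract describes, the LLM's free-text answer doubles as both the prediction and the human-interpretable explanation, which is where the interpretability advantage over opaque GNN embeddings comes from.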

What Algorithms Can Transformers Learn? A Study in Length Generalization: Large language models exhibit surprising emergent generalization properties, yet struggle with many simple reasoning tasks such as arithmetic and parity. This raises the question of whether and when Transformer models can learn the true algorithm for solving a task. We study the scope of Transformers' abilities in the specific setting of length generalization on algorithmic tasks and propose a unifying framework to understand when and how Transformers can exhibit strong length generalization on a given task. Specifically, we leverage RASP (Weiss et al., 2021), a programming language designed for the computational model of a Transformer, and introduce the RASP-Generalization Conjecture: Transformers tend to length-generalize on a task if the task can be solved by a short RASP program that works for all input lengths. This simple conjecture remarkably captures most known instances of length generalization on algorithmic tasks. Moreover, we leverage our insights to drastically improve generalization performance on traditionally hard tasks (such as parity and addition). On the theoretical side, we give a simple example where the "min-degree interpolator" model of learning from Abbe et al. (2023) does not correctly predict Transformers' out-of-distribution behavior, but our conjecture does. Overall, our work provides a novel perspective on the mechanisms of compositional generalization and the algorithmic capabilities of Transformers.
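
A concrete way to read the conjecture is through the standard length-generalization protocol: train on short inputs, evaluate on strictly longer ones, and see whether accuracy holds up. Below is a minimal, illustrative evaluation harness for the parity task mentioned in the abstract; the length cut-offs are arbitrary and `model` is a stand-in for any trained Transformer, not the paper's setup.

```python
# Minimal sketch of a length-generalization evaluation: measure accuracy per
# sequence length on the parity task. The lengths are illustrative; any trained
# model exposing a `model(bits) -> 0/1` interface could be plugged in.
import random

def make_parity_example(length: int):
    bits = [random.randint(0, 1) for _ in range(length)]
    return bits, sum(bits) % 2

def evaluate(model, lengths, n_per_length=200):
    """Accuracy per sequence length; flat accuracy across lengths suggests the
    model learned the underlying algorithm rather than a length-bound shortcut."""
    results = {}
    for L in lengths:
        correct = 0
        for _ in range(n_per_length):
            bits, target = make_parity_example(L)
            correct += int(model(bits) == target)
        results[L] = correct / n_per_length
    return results

# A "model" that truly implements the algorithm generalizes to any length.
true_parity = lambda bits: sum(bits) % 2
print(evaluate(true_parity, lengths=[10, 50, 100, 500]))  # 1.0 at every length

# A degenerate baseline that ignores its input sits near chance (~0.5) everywhere,
# showing what the harness reports when no real algorithm has been learned.
constant_guess = lambda bits: 0
print(evaluate(constant_guess, lengths=[10, 50, 100, 500]))
```

A model that has truly learned the algorithm matches `true_parity` at every length; in practice, a Transformer trained only on short sequences often looks accurate up to the training length and then falls toward chance on longer inputs, which this kind of harness makes visible.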

 

Meet SingleStore Pro Max, the Powerhouse Edition

In the rapidly changing landscape of AI and real-time analytics, the foundation of your applications—the data platform—is no longer an optional frill but a must-have. It's the springboard for innovation, the hidden force behind every breakthrough application.

Introducing SingleStore Pro Max: The Powerhouse Edition

Link

--

Are you looking to advertise a product, job opening, or event to an audience of over 40,000 AI researchers and engineers? Please reach out to us on LinkedIn to explore your options.

Enjoy the newsletter? Help us make it bigger and better by sharing it with colleagues and friends.

--

Industry Insights

Growth Zone

 

Expert Advice



"Embark on a transformative journey with the Supervisory Management Transformational Program (SMTP). Unveiling a meticulously crafted High-Level Structure and a 14-step Transformational Ladder, this program is designed to elevate supervisory skills to new heights. From foundational principles to advanced leadership strategies, each step propels participants toward managerial excellence, fostering a culture of innovation, collaboration, and sustainable success. Join us in redefining leadership through SMTP, where every rung on the ladder signifies a strategic leap toward organizational brilliance."   #leadershiptransformation #SupervisorSuccess #SmartSupervisors #InspiringSupervisors #leadershipdevelopment #leadershipskills #effectivemanagement #SupervisoryExcellence #HighLevelSupervision #ManagementRevolution #supervisors #supervision #supervisedlearning   https://www.linkedin.com/posts/yasernazir_leadershiptransformation-supervisorsuccess-activity-7165692222141591552-_IzN?utm_source=share&utm_medium=member_desktop
