Avi Caciularu, Senior Research Scientist, Google

Publications


2025

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities.

Gemini Team, ..., Avi Caciularu, et al.

Technical Report

Paper

MDCure: A Scalable Pipeline for Multi-Document Instruction-Following

Gabrielle Kaili-May Liu, Bowen Shi, Avi Caciularu, Idan Szpektor, and Arman Cohan

The Annual Meeting of the Association for Computational Linguistics (ACL)

Paper Code

Identifying User Goals From UI Trajectories

Omri Berkovitch^*, Sapir Caduri^*, Noam Kahlon, Anatoly Efros, Avi Caciularu, and Ido Dagan

The International World Wide Web Conference (WWW); Workshop on Personal Intelligence with Generative AI

🏆 Best Paper Award 🏆

Paper

2024

TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools

Avi Caciularu, Alon Jacovi, Eyal Ben-David, Sasha Goldshtein, Tal Schuster, Jonathan Herzig, Gal Elidan, and Amir Globerson

The Annual Conference on Neural Information Processing Systems (NeurIPS, Datasets and Benchmarks)

Paper Website Dataset

Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach

Irina Jurenka, ..., Avi Caciularu, et al.

Technical Report

Paper

Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance

Omer Goldman, Avi Caciularu, Matan Eyal, Kris Cao, Idan Szpektor, and Reut Tsarfaty

The Annual Meeting of the Association for Computational Linguistics (ACL Findings)

Paper

Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models

Asma Ghandeharioun^*, Avi Caciularu^*, Adam Pearce, Lucas Dixon, and Mor Geva

The International Conference on Machine Learning (ICML)

Paper Website Code

2023

Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks

Alon Jacovi, Avi Caciularu, Omer Goldman, and Yoav Goldberg

The Conference on Empirical Methods in Natural Language Processing (EMNLP)

Paper

The Curious Case of Hallucinatory (Un)answerability: Finding Truths in the Hidden States of Over-Confident Large Language Models

Aviv Slobodkin, Omer Goldman, Avi Caciularu, Ido Dagan, and Shauli Ravfogel

The Conference on Empirical Methods in Natural Language Processing (EMNLP)

Paper Code

Optimizing Retrieval-augmented Reader Models via Token Elimination

Moshe Berchansky, Peter Izsak, Avi Caciularu, Ido Dagan, and Moshe Wasserblat

The Conference on Empirical Methods in Natural Language Processing (EMNLP)

Paper Code

A Comprehensive Evaluation of Tool-Assisted Generation Strategies

Alon Jacovi, Avi Caciularu, Jonathan Herzig, Roee Aharoni, Bernd Bohnet, and Mor Geva

The Conference on Empirical Methods in Natural Language Processing, Findings (EMNLP Findings)

Paper

Don’t Add, don’t Miss: Effective Content Preserving Generation from Pre-Selected Text Spans

Aviv Slobodkin, Avi Caciularu, Eran Hirsch, and Ido Dagan

The Conference on Empirical Methods in Natural Language Processing, Findings (EMNLP Findings)

Paper Code

Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering

Avi Caciularu, Matthew E. Peters, Jacob Goldberger, Ido Dagan, and Arman Cohan

The Annual Meeting of the Association for Computational Linguistics (ACL)

Paper Code

Revisiting Sentence Union Generation as a Testbed for Text Consolidation

Eran Hirsch, Valentina Pyatkin, Ruben Wolhandler, Avi Caciularu, Asi Shefer, and Ido Dagan

The Annual Meeting of the Association for Computational Linguistics, Findings (ACL Findings)

Paper Code Dataset

An Entangled Mixture of Variational Autoencoders Approach to Deep Clustering

Avi Caciularu and Jacob Goldberger

Neurocomputing

Paper Code

Explaining the decisions of power quality disturbance classifiers using latent space features

Ram Machlev, Michael Perl, Avi Caciularu, Juri Belikov, Kfir Yehuda Levy, and Yoash Levron

The International Journal of Electrical Power and Energy Systems

Paper

2022

Cross-document Event Coreference Search: Task, Dataset and Modeling

Alon Eirew, Avi Caciularu, and Ido Dagan

The Conference on Empirical Methods in Natural Language Processing (EMNLP)

Paper Code Dataset

QASem Parsing: Text-to-text Modeling of QA-based Semantics

Ayal Klein, Eran Hirsch, Ron Eliav, Valentina Pyatkin, Avi Caciularu, and Ido Dagan

The Conference on Empirical Methods in Natural Language Processing (EMNLP)

Paper Demo Code

Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space

Mor Geva^*, Avi Caciularu^*, Kevin Ro Wang, and Yoav Goldberg

The Conference on Empirical Methods in Natural Language Processing (EMNLP)

Paper Code

LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models

Mor Geva, Avi Caciularu, Guy Dar, Paul Roit, Shoval Sadde, Micah Shlain, Bar Tamir, and Yoav Goldberg

The Conference on Empirical Methods in Natural Language Processing, System demonstrations (EMNLP demo)

Paper Demo Code

Long Context Question Answering via Supervised Contrastive Learning

Avi Caciularu, Ido Dagan, Jacob Goldberger, and Arman Cohan

The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)

Paper Code

A Proposition-Level Clustering Approach for Multi-Document Summarization

Ori Ernst, Avi Caciularu^*, Ori Shapira^*, Ramakanth Pasunuru, Mohit Bansal, Jacob Goldberger, and Ido Dagan

The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)

Paper Code

Interpreting BERT-based Text Similarity via Activation and Saliency Maps

Itzik Malkiel^*, Dvir Ginzburg^*, Oren Barkan, Avi Caciularu, Jonathan Weill, and Noam Koenigstein

The International World Wide Web Conference (WWW)

Paper

MetricBERT: Text Representation Learning via Self-Supervised Triplet Training

Itzik Malkiel^*, Dvir Ginzburg^*, Oren Barkan, Avi Caciularu, Jonathan Weill, and Noam Koenigstein

The International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Paper

2021

CDLM: Cross-document Language Modeling

Avi Caciularu, Arman Cohan, Iz Beltagy, Matthew E. Peters, Arie Cattan, and Ido Dagan

The Conference on Empirical Methods in Natural Language Processing, Findings (EMNLP Findings)

Paper Code

iFacetSum: Coreference-based Interactive Faceted Summarization for Multi-Document Exploration

Eran Hirsch, Alon Eirew^*, Ori Shapira^*, Avi Caciularu, Arie Cattan, Ori Ernst, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, and Ido Dagan

The Conference on Empirical Methods in Natural Language Processing, System demonstrations (EMNLP demo)

Paper Demo Code

Cold Item Integration in Deep Hybrid Recommenders via Tunable Stochastic Gates

Oren Barkan^*, Roy Hirsch^*, Ori Katz^*, Avi Caciularu, Jonathan Weill, and Noam Koenigstein

The International Conference on Data Mining (ICDM)

Paper

Representation Learning via Variational Bayesian Networks

Oren Barkan^*, Avi Caciularu^*, Idan Rejwan^*, Ori Katz, Jonathan Weill, Itzik Malkiel, and Noam Koenigstein

The International Conference on Information and Knowledge Management (CIKM)

Paper

Grad-SAM: Explaining Transformers via Gradient Self-Attention Maps

Oren Barkan^*, Edan Hauon^*, Avi Caciularu^*, Ori Katz, Itzik Malkiel, Omri Armstrong, and Noam Koenigstein

The International Conference on Information and Knowledge Management (CIKM)

Paper

Anchor-based Collaborative Filtering

Oren Barkan^*, Roy Hirsch^*, Ori Katz^*, Avi Caciularu, and Noam Koenigstein

The International Conference on Information and Knowledge Management (CIKM)

Paper

GAM: Explainable Visual Similarity and Classification via Gradient Activation Maps

Oren Barkan^*, Omri Armstrong^*, Amir Hertz^*, Avi Caciularu, Ori Katz, Jonathan Weill, Itzik Malkiel, and Noam Koenigstein

The International Conference on Information and Knowledge Management (CIKM)

Paper

On the Evolution of Word Order

Idan Rejwan and Avi Caciularu

Recent Advances in Natural Language Processing (RANLP), Student Research Workshop

Paper

Denoising Word Embeddings by Averaging in a Shared Space

Avi Caciularu, Ido Dagan, and Jacob Goldberger

The Joint Conference on Lexical and Computational Semantics (*SEM)

Paper Code

Self-Supervised Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference

Dvir Ginzburg^*, Itzik Malkiel^*, Oren Barkan^†, Avi Caciularu^†, and Noam Koenigstein

The Annual Meeting of the Association for Computational Linguistics, Findings (ACL Findings)

Paper Code

Cold Start Revisited: A Deep Hybrid Recommender with Cold-Warm Item Harmonization

Oren Barkan^*, Roy Hirsch^*, Ori Katz^*, Avi Caciularu, Yoni Weill, and Noam Koenigstein

The International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Paper

perm2vec: Attentive Graph Permutation Selection for Decoding of Error Correction Codes

Avi Caciularu^*, Nir Raviv^*, Tomer Raviv, Jacob Goldberger, and Yair Be’ery

IEEE Transactions on Cognitive Communications and Networking, Special Issue

Paper

2020

Within-Between Lexical Relation Classification

Avi Caciularu^*, Oren Barkan^*, and Ido Dagan

The Conference on Empirical Methods in Natural Language Processing (EMNLP)

Paper

Paraphrasing vs Coreferring: Two Sides of the Same Coin

Yehudit Meged, Avi Caciularu, Vered Shwartz, and Ido Dagan

The Conference on Empirical Methods in Natural Language Processing, Findings (EMNLP Findings)

Paper Code

RecoBERT: A Catalog Language Model for Text-Based Recommendations

Itzik Malkiel, Oren Barkan, Avi Caciularu, Noam Razin, Ori Katz, and Noam Koenigstein

The Conference on Empirical Methods in Natural Language Processing, Findings (EMNLP Findings)

Paper

Cold Item Recommendations via Hierarchical Item2vec

Oren Barkan^*, Avi Caciularu^*, Idan Rejwan^*, Jonathan Weill, Ori Katz, Itzik Malkiel, and Noam Koenigstein

The International Conference on Data Mining (ICDM)

Paper

Explainable Recommendations via Attentive Multi-Persona Collaborative Filtering

Oren Barkan^*, Yonatan Fuchs^*, Avi Caciularu, and Noam Koenigstein

The ACM Conference on Recommender Systems (RecSys)

Paper

Attentive Item2vec: Neural Attentive User Representations

Oren Barkan^*, Avi Caciularu^*, Ori Katz, and Noam Koenigstein

The International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Paper

Bayesian Hierarchical Words Representation Learning

Oren Barkan^*, Idan Rejwan^*, Avi Caciularu^*, and Noam Koenigstein

The Annual Meeting of the Association for Computational Linguistics (ACL)

Paper

Unsupervised Linear and Nonlinear Channel Equalization and Decoding using Variational Autoencoders

Avi Caciularu and David Burshtein

IEEE Transactions on Cognitive Communications and Networking (TCCN)

Paper

Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding

Oren Barkan^*, Noam Razin^*, Itzik Malkiel, Ori Katz, Avi Caciularu, and Noam Koenigstein

The AAAI Conference on Artificial Intelligence (AAAI)

Paper Code

2018

ARPM: Additive, Retentive Penalty Method for Multidimensional NILM Algorithms

Mattan Serry, David Sriker, Avi Caciularu, Ram Machlev, Yuval Beck, and David Raz

The International Conference on the Science of Electrical Engineering (ICSEE)

Paper

Blind Channel Equalization Using Variational Autoencoders

Avi Caciularu and David Burshtein

The International Conference on Communications (ICC)

Paper

Inducing Regular Grammars Using Recurrent Neural Networks

Mor Cohen^*, Avi Caciularu^*, Idan Rejwan^*, and Jonathan Berant

The International Joint Conference on Artificial Intelligence (IJCAI): Workshop on Learning and Reasoning (L&R)

Paper Code

Recent News

May 2025	One paper accepted to ACL 🙌
May 2025	Our paper, Identifying User Goals From UI Trajectories, won the best paper award 🏆
Sept. 2024	One paper accepted to NeurIPS 🙌
August 2024	A new preprint on a benchmark for complex claim verification.
June 2024	A new preprint on a complex aggregative reasoning dataset.
May 2024	Our technical report on LLMs for education is out.
May 2024	One paper accepted to ICML 🙌

Résumé

Education

Bar-Ilan University

Tel Aviv University

Tel Aviv University

ML Professional Experience

Google Research

Google Research

Meta AI Research (FAIR)

AI2 (Semantic Scholar Team)

Microsoft
