We present the Natural Questions corpus, a question answering dataset.
Questions consist of real anonymized, aggregated queries issued to the
Google search engine. An annotator is presented with a question along
with a Wikipedia page from the top 5 search results, and annotates a
long answer (typically a paragraph) and a short answer (one or more
entities) if present on the page, or marks null if no long/short answer
is present. The public release consists of 307,373 training examples
with single annotations, 7,830 examples with 5-way annotations for
development data, and a further 7,842 examples 5-way annotated
sequestered as test data. We present experiments validating quality of
the data. We also describe analysis of 25-way annotations on 302
examples, giving insights into human variability on the annotation
task. We introduce robust metrics for the purposes of evaluating
question answering systems; demonstrate high human upper bounds on
these metrics; and establish baseline results using competitive
methods drawn from related literature.
@article{NaturalQuestions,
title = {Natural Questions: a Benchmark for Question Answering Research},
author = {Tom Kwiatkowski and Jennimaria Palomaki and Olivia Redfield and Michael Collins and Ankur Parikh and Chris Alberti and Danielle Epstein and Illia Polosukhin and Matthew Kelcey and Jacob Devlin and Kenton Lee and Kristina N. Toutanova and Llion Jones and Ming-Wei Chang and Andrew Dai and Jakob Uszkoreit and Quoc Le and Slav Petrov},
year = {2019},
journal = {Transactions of the Association for Computational Linguistics}
}
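To make the annotation scheme concrete, here is a minimal sketch of how a single
example with 5-way annotations might be represented and scored. The field names
and the matching rule are purely illustrative assumptions, not the official
release format or evaluation metric from the paper.

    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Annotation:
        long_answer: Optional[str]   # paragraph text, or None if marked null
        short_answers: List[str]     # entity strings; may be empty

    def long_answer_credit(prediction: Optional[str],
                           annotations: List[Annotation]) -> bool:
        # Toy rule: a null prediction is credited only if most annotators
        # marked null; a non-null prediction is credited if it matches any
        # of the reference long answers. (Not the official NQ metric.)
        nulls = sum(1 for a in annotations if a.long_answer is None)
        if prediction is None:
            return nulls > len(annotations) / 2
        return any(a.long_answer == prediction for a in annotations)

    refs = [Annotation("The quick brown fox ...", ["fox"])] * 3 + \
           [Annotation(None, [])] * 2
    print(long_answer_credit("The quick brown fox ...", refs))   # True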
- "Natural questions: a benchmark for question answering research"
Tom Kwiatkowski, Jennimaria Palomaki, Olivia Rhinehart, Michael Collins,
Ankur Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Matthew Kelcey,
Jacob Devlin, Kenton Lee, Kristina Toutanova, Llion Jones, Ming-Wei Chang,
Andrew Dai, Jakob Uszkoreit, Quoc Le, Slav Petrov, TACL 2019
[abstract]
[paper (pdf)]
[bibtex]
The Conference on Computational Natural Language Learning (CoNLL) features a
shared task, in which participants train and test their learning systems on the
same data sets. In 2018, the task was devoted to learning dependency parsers
for a large number of languages, in a real-world setting without any
gold-standard annotation on input. All test sets followed a unified annotation
scheme, namely that of Universal Dependencies. In this paper, we define the
task and evaluation methodology, describe how the data sets were prepared,
report and analyze the main results, and provide a brief categorization of the
different approaches of the participating systems.
@inproceedings{zeman-etal-2018-conll,
title = "{C}o{NLL} 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies",
author = "Zeman, Daniel and Haji{\v{c}}, Jan and Popel, Martin and Potthast, Martin and
Straka, Milan and Ginter, Filip and Nivre, Joakim and Petrov, Slav",
booktitle = "Proceedings of the {C}o{NLL} 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies",
month = oct,
year = "2018",
address = "Brussels, Belgium",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/K18-2001",
pages = "1--21",
}
We show that small and shallow feedforward neural networks can achieve near
state-of-the-art results on a range of unstructured and structured language
processing tasks while being considerably cheaper in memory and
computational requirements than deep recurrent models. Motivated by
resource-constrained environments like mobile phones, we showcase simple
techniques for obtaining such small neural network models, and investigate
different tradeoffs when deciding how to allocate a small memory budget.
@InProceedings{botha-EtAl:2017:EMNLP,
author = {Jan A. Botha and Emily Pitler and Ji Ma and Anton Bakalov and
Alex Salcianu and David Weiss and Ryan McDonald and Slav Petrov},
title = {Natural Language Processing with Small Feed-Forward Networks},
booktitle = {Proceedings of EMNLP 2017},
year = {2017},
}
Universal Dependencies (UD) offer a uniform cross-lingual syntactic
representation, with the aim of advancing multilingual applications.
Recent work shows that semantic parsing can be accomplished by
transforming syntactic dependencies to logical forms. However, this
work is limited to English, and cannot process dependency graphs,
which allow handling complex phenomena such as control. In this
work, we introduce UDEPLAMBDA, a semantic interface for UD, which
maps natural language to logical forms in an almost
language-independent fashion and can process dependency graphs. We
perform experiments on question answering against Freebase and provide
German and Spanish translations of the WebQuestions and GraphQuestions
datasets to facilitate multilingual evaluation. Results
show that UDEPLAMBDA outperforms strong baselines across languages
and datasets. For English, it achieves a 4.9 F1 point improvement
over the state-of-the-art on GraphQuestions.
@InProceedings{reddy-EtAl:2017:EMNLP,
author = {Siva Reddy and Oscar Täckström and Slav Petrov and Mark Steedman and Mirella Lapata},
title = {Universal Semantic Parsing},
booktitle = {Proceedings of EMNLP 2017},
year = {2017},
}
The Conference on Computational Natural Language Learning (CoNLL) features a
shared task, in which participants train and test their learning systems on the
same data sets. In 2017, the task was devoted to learning dependency parsers
for a large number of languages, in a real-world setting without any
gold-standard annotation on input. All test sets followed a unified annotation
scheme, namely that of Universal Dependencies. In this paper, we define the
task and evaluation methodology, describe how the data sets were prepared,
report and analyze the main results, and provide a brief categorization of the
different approaches of the participating systems.
@InProceedings{zeman-EtAl:2017:CONLL,
author = {Zeman, Daniel and Popel, Martin and Straka, Milan and
Hajic, Jan and Nivre, Joakim and Ginter, Filip and
Luotolahti, Juhani and Pyysalo, Sampo and Petrov, Slav and
Potthast, Martin and Tyers, Francis and others},
title = {CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies},
booktitle = {Proceedings of the CoNLL 2017 Shared Task},
year = {2017},
pages = {1--19},
}
- "CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies"
Daniel Zeman, Martin Popel, Milan Straka, Jan Hajic, Joakim Nivre, Filip Ginter,
Juhani Luotolahti, Samo Pyysalo, Slav Petrov, Martin Potthast, Franci Tyers, et al., CoNLL 2017
[abstract]
[paper (pdf)]
[website]
[bibtex]
We introduce a globally normalized transition-based neural network model
that achieves state-of-the-art part-of-speech tagging, dependency parsing and
sentence compression results. Our model is a simple feed-forward neural network
that operates on a task-specific transition system, yet achieves comparable or
better accuracies than recurrent models. We discuss the importance of global
as opposed to local normalization: a key insight is that the label bias problem
implies that globally normalized models can be strictly more expressive than
locally normalized models.
@InProceedings{andor-EtAl:2016:ACL,
author = {Andor, Daniel and Alberti, Chris and Weiss, David and
Severyn, Aliaksei and Presta, Alessandro and
Ganchev, Kuzman and Petrov, Slav and Collins, Michael},
title = {Globally Normalized Transition-Based Neural Networks},
booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics},
year = {2016},
pages = {2442--2452},
}
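The contrast between local and global normalization mentioned in the abstract
can be written out directly. The notation below is a schematic rendering with a
scoring function rho over decision prefixes, not formulas copied from the paper:

    % Locally normalized: each decision d_j is a softmax over the actions
    % available after the prefix d_{1:j-1}.
    p_{local}(d_{1:n}) = \prod_{j=1}^{n}
        \frac{\exp \rho(d_{1:j-1}, d_j)}{\sum_{d'} \exp \rho(d_{1:j-1}, d')}

    % Globally normalized (CRF-style): one partition function over complete
    % decision sequences, which is what sidesteps the label bias problem.
    p_{global}(d_{1:n}) =
        \frac{\exp \sum_{j=1}^{n} \rho(d_{1:j-1}, d_j)}
             {\sum_{d'_{1:n}} \exp \sum_{j=1}^{n} \rho(d'_{1:j-1}, d'_j)}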
@InProceedings{nivre-etAl:2016:LREC,
author = {Joakim Nivre and Marie-Catherine de Marneffe and Filip Ginter and
Yoav Goldberg and Jan Hajic and Christopher D. Manning and
Ryan McDonald and Slav Petrov and Sampo Pyysalo and
Natalia Silveira and Reut Tsarfaty and Daniel Zeman},
title = {Universal Dependencies v1: A Multilingual Treebank Collection},
booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
year = {2016},
}
Cross-linguistically consistent annotation is necessary for sound comparative
evaluation and cross-lingual learning experiments. It is also useful for
multilingual system development and comparative linguistic studies. Universal
Dependencies is an open community effort to create cross-linguistically
consistent treebank annotation for many languages within a dependency-based
lexicalist framework. In this paper, we describe v1 of the universal
guidelines, the underlying design principles, and the currently available
treebanks for 33 languages.
- "Universal Dependencies v1: A Multilingual Treebank Collection"
Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg,
Jan Hajic, Christopher D. Manning, Ryan McDonald, Slav Petrov,
Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty and Daniel Zeman, LREC 2016
[abstract]
[paper (pdf)]
[data]
[bibtex]
@InProceedings{Vinyals-EtAl:2015:NIPS,
author = {Oriol Vinyals and Lukasz Kaiser and Terry Koo and Slav Petrov and Ilya Sutskever and Geoffrey E. Hinton},
title = {Grammar as a Foreign Language},
booktitle = {Advances in Neural Information Processing Systems 28 (NIPS)},
year = {2015},
}
Syntactic parsing is a fundamental problem in computational linguistics and
natural language processing. Traditional approaches to parsing are highly
complex and problem specific. Recently, Sutskever et al. (2014) presented a
task-agnostic method for learning to map input sequences to output sequences
that achieved strong results on a large scale machine translation problem.
In this work, we show that precisely the same sequence-to-sequence method
achieves results that are close to state-of-the-art on syntactic constituency
parsing, whilst making almost no assumptions about the structure of the
problem. To achieve these results we need to mitigate the lack of domain
knowledge in the model by providing it with a large amount of automatically
parsed data.
@InProceedings{albert-etAl:2015:EMNLP,
author = {Alberti, Chris and Weiss, David and Coppola, Greg and Petrov, Slav},
title = {Improved Transition-Based Parsing and Tagging with Neural Networks},
booktitle = {Proceedings of EMNLP 2015},
year = {2015},
}
We extend and improve upon recent work in structured training for
neural network transition-based dependency parsing. We do this by
experimenting with novel features, additional transition systems
and by testing on a wider array of languages. In particular, we
introduce set-valued features to encode the predicted
morphological properties and part-of-speech confusion sets of the
words being parsed. We also investigate the use of joint parsing
and part-of-speech tagging in the neural paradigm. Finally, we
conduct a multi-lingual evaluation that demonstrates the
robustness of the overall structured neural approach, as well as
the benefits of the extensions proposed in this work. Our
research further demonstrates the breadth of the applicability of
neural network methods to dependency parsing, as well as the ease
with which new features can be added to neural parsing models.
@InProceedings{weiss-etAl:2015:ACL,
author = {Weiss, David and Alberti, Chris and Collins, Michael and Petrov, Slav},
title = {Structured Training for Neural Network Transition-Based Parsing},
booktitle = {Proceedings of ACL 2015},
year = {2015},
pages = {323--333},
}
We present structured perceptron training for neural network
transition-based dependency parsing. We learn the neural network
representation using a gold corpus augmented by a large number of
automatically parsed sentences. Given this fixed network
representation, we learn a final layer using the structured
perceptron with beam-search decoding. On the Penn Treebank, our
parser reaches 94.26% unlabeled and 92.41% labeled attachment
accuracy, which to our knowledge is the best accuracy on
Stanford Dependencies to date. We also provide in-depth ablative
analysis to determine which aspects of our model provide the largest
gains in accuracy.
@InProceedings{Artzi-Das-Petrov:2014:EMNLP,
author = {Yoav Artzi and Dipanjan Das and Slav Petrov},
title = {Learning Compact Lexicons for CCG Semantic Parsing},
booktitle = {Proceedings of EMNLP 2014},
month = {October},
year = {2014},
publisher = {Association for Computational Linguistics},
}
We present methods to control the lexicon size when learning a Combinatory
Categorial Grammar semantic parser. Existing methods incrementally expand the
lexicon by greedily adding entries, considering a single training datapoint
at a time. We propose using corpus-level statistics for lexicon learning
decisions. We introduce voting to globally consider adding entries to the
lexicon, and pruning to remove entries no longer required to explain the
training data. Our methods result in state-of-the-art performance on the
task of executing sequences of natural language instructions, achieving up
to 25% error reduction, with lexicons that are up to 70% smaller and are
qualitatively less noisy.
@InProceedings{Mann-EtAl:2014:ACL,
author = {Mann, Jason and Zhang, David and Yang, Lu and Das, Dipanjan and Petrov, Slav},
title = {Enhanced Search with Wildcards and Morphological Inflections
in the Google Books Ngram Viewer},
booktitle = {Proceedings of the ACL 2014 System Demonstrations},
month = {June},
year = {2014},
publisher = {Association for Computational Linguistics},
}
We present a new version of the Google Books Ngram Viewer, which plots
the frequency of words and phrases over the last five centuries; its
data encompasses 6% of the world's published books. The new Viewer adds
three features for more powerful search: wildcards, morphological
inflections, and capitalization. These additions allow the discovery
of patterns that were previously difficult to find and further facilitate
the study of linguistic trends in printed text.
@InProceedings{Lerner-Petrov:2013:EMNLP,
author = {Uri Lerner and Slav Petrov},
title = {Source-Side Classifier Preordering for Machine Translation},
booktitle = {Proceedings of EMNLP 2013},
month = {October},
year = {2013},
publisher = {Association for Computational Linguistics},
}
We present a simple and novel classifier-based preordering approach. Unlike
existing preordering models, we train feature-rich discriminative classifiers
that directly predict the target-side word order. Our approach combines the
strengths of lexical reordering and syntactic preordering models by performing
long-distance reorderings using the structure of the parse tree, while
utilizing a discriminative model with a rich set of features, including
lexical features. We present extensive experiments on 22 language pairs,
including preordering into English from 7 other languages. We obtain
improvements of up to 1.4 BLEU on language pairs in the WMT 2010 shared task.
For languages from different families the improvements often exceed 2 BLEU.
Many of these gains are also significant in human evaluations.
@InProceedings{McDonald-EtAl:2013:ACL,
author = {Ryan McDonald and Joakim Nivre and Yvonne Quirmbach-Brundage and Yoav Goldberg and Dipanjan Das and Kuzman Ganchev and Keith Hall and Slav Petrov and Hao Zhang and Oscar T\"ackstr\"om and Claudia Bedini and N\'uria Bertomeu Castell\'o and Jungmee Lee},
title = {Universal Dependency Annotation for Multilingual Parsing},
booktitle = {Proceedings of ACL 2013},
month = {August},
year = {2013},
publisher = {Association for Computational Linguistics},
}
We present a new collection of treebanks with homogeneous syntactic dependency
annotation for six languages: German, English, Swedish, Spanish, French and
Korean. To show the usefulness of such a resource, we present a case study of
cross-lingual transfer parsing with more reliable evaluation than has been
possible before. This `universal' treebank is made freely available in order
to facilitate research on multilingual dependency parsing.
- "Universal Dependency Annotation for Multilingual Parsing"
Ryan McDonald, Joakim Nivre, Yvonne Quirmbach-Brundage, Yoav Goldberg,
Dipanjan Das, Kuzman Ganchev, Keith Hall, Slav Petrov, Hao Zhang, Oscar Täckström,
Claudia Bedini, Núria Bertomeu Castelló, Jungmee Lee, ACL 2013 (short paper)
[abstract]
[paper (pdf)]
[slides (keynote)]
[slides (pdf)]
[bibtex]
[data]
@Article{Tackstrom-EtAl:2013:TACL,
author = {T\"ackstr\"om, Oscar and Das, Dipanjan and Petrov, Slav and McDonald, Ryan and Nivre, Joakim},
title = {Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging},
journal = {Transactions of the Association for Computational Linguistics},
month = {March},
year = {2013},
publisher = {Association for Computational Linguistics},
}
We consider the construction of part-of-speech taggers for resource-poor
languages. Recently, manually constructed tag dictionaries from Wiktionary
and dictionaries projected via bitext have been used as type constraints
to overcome the scarcity of annotated data in this setting. In this paper,
we show that additional token constraints can be projected from a resource-
rich source language to a resource-poor target language via word-aligned
bitext. We present several models to this end; in particular a partially
observed conditional random field model, where coupled token and type
constraints provide a partial signal for training. Averaged across eight
previously studied Indo-European languages, our model achieves a 25%
relative error reduction over the prior state of the art. We further
present successful results on seven additional languages from different
families, empirically demonstrating the applicability of coupled token
and type constraints across a diverse set of languages.
@Article{spector-norvig-petrov:2012:CACM,
author = {Alfred Spector and Peter Norvig and Slav Petrov},
title = {Google's Hybrid Approach to Research},
journal = {Communications of the ACM},
month = {July},
year = {2012},
}
@InProceedings{lin-EtAl:2012:ACL,
author = {Lin, Yuri and Michel, Jean-Baptiste and Aiden Lieberman, Erez and Orwant, Jon and Brockman, Will and Petrov, Slav},
title = {Syntactic Annotations for the Google Books NGram Corpus},
booktitle = {Proceedings of the ACL 2012 System Demonstrations},
month = {July},
year = {2012},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {169--174},
url = {http://www.aclweb.org/anthology/P12-3029}
}
We present a new edition of the Google Books Ngram Corpus, which describes
how often words and phrases were used over a period of five centuries, in
eight languages; it reflects 6% of all books ever published. This new edition
introduces syntactic annotations: words are tagged with their part-of-speech,
and head-modifier relationships are recorded. The annotations are produced
automatically with statistical models that are specifically adapted to
historical text. The corpus will facilitate the study of linguistic trends,
especially those related to the evolution of syntax.
@InProceedings{ganchev-EtAl:2012:ACL,
author = {Ganchev, Kuzman and Hall, Keith and McDonald, Ryan and Petrov, Slav},
title = {Using Search-Logs to Improve Query Tagging},
booktitle = {Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {July},
year = {2012},
address = {Jeju Island, Korea},
publisher = {Association for Computational Linguistics},
pages = {238--242},
url = {http://www.aclweb.org/anthology/P12-2047}
}
Syntactic analysis of search queries is important for a variety of information-
retrieval tasks; however, the lack of annotated data makes training query
analysis models difficult. We propose a simple, efficient procedure in which
part-of-speech tags are transferred from retrieval-result snippets to queries
at training time. Unlike previous work, our final model does not require any
additional resources at run-time. Compared to a state-of-the-art approach, we
achieve more than 20% relative error reduction. Additionally, we annotate a
corpus of search queries with part-of-speech tags, providing a resource for
future work on syntactic query analysis.
@misc{petrov-mcdonald:2012:SANCL,
author = {Slav Petrov and Ryan McDonald},
title = {Overview of the 2012 Shared Task on Parsing the Web},
year = {2012},
howpublished = {Notes of the First Workshop on Syntactic Analysis of Non-Canonical Language {(SANCL)}},
}
We describe a shared task on parsing web text from the Google Web Treebank.
Participants were to build a single parsing system that is robust to domain
changes and can handle noisy text that is commonly encountered on the web.
There was a constituency and a dependency parsing track and 11 sites
submitted a total of 20 systems. System combination approaches achieved the
best results, though they still fell short of newswire accuracies by a large
margin. The best accuracies were in the 80-84% range for F1 and LAS; even
part-of-speech accuracies were just above 90%.
@InProceedings{rush-petrov:2012:NAACL-HLT,
author = {Rush, Alexander and Petrov, Slav},
title = {Vine Pruning for Efficient Multi-Pass Dependency Parsing},
booktitle = {Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
year = {2012},
address = {Montr\'{e}al, Canada},
publisher = {Association for Computational Linguistics},
pages = {498--507},
url = {http://www.aclweb.org/anthology/N12-1054}
}
Coarse-to-fine inference has been shown to be a robust approximate method for
improving the efficiency of structured prediction models while preserving their
accuracy. We propose a multi-pass coarse-to-fine architecture for dependency
parsing using linear-time vine pruning and structured prediction cascades.
Our first-, second-, and third-order models achieve accuracies comparable to
those of their unpruned counterparts, while exploring only a fraction of the
search space. We observe speed-ups of up to two orders of magnitude compared
to exhaustive search. Our pruned third-order model is twice as fast as an
unpruned first-order model and also compares favorably to a state-of-the-art
transition-based parser for multiple languages.
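As a rough illustration of the coarse-to-fine idea (not the paper's
vine-pruning transition system or its structured prediction cascades), the
sketch below lets a cheap first-pass model prune head candidates before an
expensive model scores the survivors. All scoring functions here are toy
assumptions.

    from typing import Callable, Dict, List, Tuple

    Arc = Tuple[int, int]   # (head position, dependent position)

    def coarse_to_fine_arcs(n_words: int,
                            coarse_score: Callable[[Arc], float],
                            fine_score: Callable[[Arc], float],
                            keep: int = 3) -> Dict[int, int]:
        # For each dependent, keep the top-`keep` heads under the coarse
        # model, then pick the best surviving head under the fine model.
        best_heads = {}
        for dep in range(1, n_words):          # position 0 acts as the root
            candidates: List[Arc] = [(h, dep) for h in range(n_words) if h != dep]
            survivors = sorted(candidates, key=coarse_score, reverse=True)[:keep]
            best_heads[dep] = max(survivors, key=fine_score)[0]
        return best_heads

    def coarse(arc: Arc) -> float:
        return -abs(arc[0] - arc[1])                  # toy: prefer short attachments

    def fine(arc: Arc) -> float:
        return -abs(arc[0] - arc[1]) - 0.1 * arc[0]   # toy: also prefer early heads

    print(coarse_to_fine_arcs(5, coarse, fine))   # {1: 0, 2: 1, 3: 2, 4: 3}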
@InProceedings{petrov-das-mcdonald:2012:LREC,
author = {Petrov, Slav and Das, Dipanjan and McDonald, Ryan},
title = {A Universal Part-of-Speech Tagset},
booktitle = {Proceedings of LREC},
month = {May},
year = {2012},
}
To facilitate future research in unsupervised induction of syntactic
structure and to standardize best-practices, we propose a tagset that
consists of twelve universal part-of-speech categories. In addition
to the tagset, we develop a mapping from 25 different treebank
tagsets to this universal set. As a result, when combined with the
original treebank data, this universal tagset and mapping produce a
dataset consisting of common parts-of-speech for 22 different
languages. We highlight the use of this resource via two experiments,
including one that reports competitive accuracies for unsupervised
grammar induction without gold standard part-of-speech tags.
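The mapping from treebank-specific tags to the twelve universal categories
(NOUN, VERB, ADJ, ADV, PRON, DET, ADP, NUM, CONJ, PRT, '.', X) is easy to
illustrate. The dictionary below covers only a handful of Penn Treebank tags
and is meant as a sketch, not the full published mapping.

    # Partial, illustrative treebank-specific -> universal tag mapping.
    PTB_TO_UNIVERSAL = {
        "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN",
        "VB": "VERB", "VBD": "VERB", "VBZ": "VERB",
        "JJ": "ADJ",  "RB": "ADV",   "PRP": "PRON",
        "DT": "DET",  "IN": "ADP",   "CD": "NUM",
        "CC": "CONJ", "RP": "PRT",   ".": ".",
    }

    def to_universal(tag: str) -> str:
        # Anything without an explicit mapping falls back to the catch-all X.
        return PTB_TO_UNIVERSAL.get(tag, "X")

    print([to_universal(t) for t in ["NNS", "VBD", "IN", "FW"]])
    # ['NOUN', 'VERB', 'ADP', 'X']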
@InProceedings{hall-EtAl:2011:NIPS-WKSHP,
author = {Keith Hall and Ryan McDonald and Slav Petrov},
title = {Training Structured Prediction Models with Extrinsic Loss Functions},
booktitle = {Domain Adaptation Workshop at NIPS},
month = {October},
year = {2011},
}
We present an online learning algorithm for training structured
prediction models with extrinsic loss functions. This allows us
to extend a standard supervised learning objective with additional
loss-functions, either based on intrinsic or task-specific
extrinsic measures of quality. We present experiments with
sequence models on part-of-speech tagging and named entity
recognition tasks, and with syntactic parsers on dependency
parsing and machine translation reordering tasks.
@InProceedings{yi-EtAl:2011:IWPT,
author = {Yi, Youngmin and Lai, Chao-Yue and Petrov, Slav and Keutzer, Kurt},
title = {Efficient Parallel CKY Parsing on GPUs},
booktitle = {Proceedings of the 2011 Conference on Parsing Technologies},
month = {October},
year = {2011},
address = {Dublin, Ireland},
}
Low-latency solutions for syntactic parsing are needed if parsing is to
become an integral part of user-facing natural language applications.
Unfortunately, most state-of-the-art constituency parsers employ large
probabilistic context-free grammars for disambiguation, which renders
them impractical for real-time use. Meanwhile, Graphics Processor Units
(GPUs) have become widely available, offering the opportunity to alleviate
this bottleneck by exploiting the fine-grained data parallelism found in
the CKY algorithm. In this paper, we explore the design space of
parallelizing the dynamic programming computations carried out by the CKY
algorithm. We use the Compute Unified Device Architecture (CUDA)
programming model to reimplement a state-of-the-art parser, and compare
its performance on two recent GPUs with different architectural features.
Our best results show a 26-fold speedup compared to a sequential C
implementation.
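For reference, the dynamic program being parallelized is the classic CKY
recurrence over a grammar in Chomsky normal form. A minimal sequential sketch
(log probabilities, Viterbi scores, none of the paper's GPU specifics) looks
like this:

    import math
    from collections import defaultdict

    def cky(words, lexicon, rules, root="S"):
        # lexicon: {(tag, word): logprob}; rules: {(A, B, C): logprob} for A -> B C.
        # Returns the best log probability of a `root` parse, or -inf.
        n = len(words)
        chart = defaultdict(lambda: -math.inf)   # (start, end, symbol) -> score
        for i, w in enumerate(words):
            for (tag, word), lp in lexicon.items():
                if word == w:
                    chart[i, i + 1, tag] = max(chart[i, i + 1, tag], lp)
        for span in range(2, n + 1):
            for start in range(0, n - span + 1):
                end = start + span
                for split in range(start + 1, end):
                    for (a, b, c), lp in rules.items():
                        score = chart[start, split, b] + chart[split, end, c] + lp
                        chart[start, end, a] = max(chart[start, end, a], score)
        return chart[0, n, root]

    lex = {("DT", "the"): 0.0, ("NN", "dog"): 0.0, ("VBZ", "barks"): 0.0}
    rules = {("NP", "DT", "NN"): 0.0, ("S", "NP", "VBZ"): 0.0}
    print(cky(["the", "dog", "barks"], lex, rules))   # 0.0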
@InProceedings{mcdonald-petrov-hall:2011:EMNLP,
author = {McDonald, Ryan and Petrov, Slav and Hall, Keith},
title = {Multi-Source Transfer of Delexicalized Dependency Parsers},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
month = {July},
year = {2011},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {62--72},
url = {http://www.aclweb.org/anthology/D11-1006}
}
We present a simple method for transferring dependency parsers from
source languages with labeled training data to target languages without
labeled training data. We first demonstrate that delexicalized parsers
can be directly transferred between languages, producing significantly
higher accuracies than unsupervised parsers. We then use a constraint
driven learning algorithm where constraints are drawn from parallel
corpora to project the final parser. Unlike previous work on projecting
syntactic resources, we show that simple methods for introducing multiple
source languages can significantly improve the overall quality of the
resulting parsers. The projected parsers from our system result in
state-of-the-art performance when compared to previously studied
unsupervised and projected parsing systems across eight different
languages.
@InProceedings{katzbrown-EtAl:2011:EMNLP,
author = {Katz-Brown, Jason and Petrov, Slav and McDonald, Ryan and Och, Franz and Talbot, David and Ichikawa, Hiroshi and Seno, Masakazu and Kazawa, Hideto},
title = {Training a Parser for Machine Translation Reordering},
booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
month = {July},
year = {2011},
address = {Edinburgh, Scotland, UK.},
publisher = {Association for Computational Linguistics},
pages = {183--192},
url = {http://www.aclweb.org/anthology/D11-1017}
}
We propose a simple training regime that can improve the extrinsic
performance of a parser, given only a corpus of sentences and a way
to automatically evaluate the extrinsic quality of a candidate parse.
We apply our method to train parsers that excel when used as part of
a reordering component in a statistical machine translation system.
We use a corpus of weakly-labeled reference reorderings to guide
parser training. Our best parsers contribute significant improvements
in subjective translation quality while their intrinsic attachment
scores typically regress.
@InProceedings{das-petrov:2011:ACL-HLT2011,
author = {Das, Dipanjan and Petrov, Slav},
title = {Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections},
booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
month = {June},
year = {2011},
address = {Portland, Oregon, USA},
publisher = {Association for Computational Linguistics},
pages = {600--609},
url = {http://www.aclweb.org/anthology/P11-1061}
}
We describe a novel approach for inducing unsupervised part-of-speech
taggers for languages that have no labeled training data, but have
translated text in a resource-rich language. Our method does not
assume any knowledge about the target language (in particular no
tagging dictionary is assumed), making it applicable to a wide array
of resource-poor languages. We use graph-based label propagation for
cross-lingual knowledge transfer and use the projected labels as
features in an unsupervised model (Berg-Kirkpatrick et al. 2010).
Across eight European languages, our approach results in an average
absolute improvement of 10.4% over a state-of-the-art baseline, and
16.7% over vanilla hidden Markov models induced with the Expectation
Maximization algorithm.
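Graph-based label propagation of the general kind used here can be illustrated
with a small, generic sketch: unlabeled vertices repeatedly take the weighted
average of their neighbors' tag distributions, while seed vertices are held
close to their initial labels. This is the standard iterative-averaging form,
not the exact objective or graph construction from the paper.

    import numpy as np

    def propagate(adjacency, labels, seed_mask, iters=50, keep=0.5):
        # adjacency: (n, n) nonnegative weights; labels: (n, k) distributions,
        # meaningful only where seed_mask is True.
        q = labels.copy()
        transition = adjacency / (adjacency.sum(axis=1, keepdims=True) + 1e-12)
        for _ in range(iters):
            neighbor_avg = transition @ q
            q = np.where(seed_mask[:, None],
                         keep * labels + (1 - keep) * neighbor_avg,
                         neighbor_avg)
            q = q / (q.sum(axis=1, keepdims=True) + 1e-12)
        return q

    # Three vertices in a chain; only vertex 0 is seeded with a tag distribution.
    A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
    Y = np.array([[1., 0.], [0.5, 0.5], [0.5, 0.5]])
    print(propagate(A, Y, seed_mask=np.array([True, False, False])).round(2))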
@InProceedings{subramanya-petrov-pereira:2010:EMNLP,
author = {Subramanya, Amarnag and Petrov, Slav and Pereira, Fernando},
title = {Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models},
booktitle = {Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2010},
address = {Cambridge, MA},
publisher = {Association for Computational Linguistics},
pages = {167--176},
url = {http://www.aclweb.org/anthology/D10-1017}
}
We describe a new scalable algorithm for semi-supervised training of
conditional random fields (CRF) and its application to
part-of-speech (POS) tagging. The algorithm uses a similarity graph
to encourage similar n-grams to have similar POS tags. We
demonstrate the efficacy of our approach on a domain adaptation
task, where we assume that we have access to large amounts of
unlabeled data from the target domain, but no additional labeled
data. The similarity graph is used during training to smooth the state
posteriors on the target domain. Standard inference can be used at test
time. Our approach is able to scale to very large problems and yields
significantly improved target domain accuracy.
@InProceedings{petrov-EtAl:2010:EMNLP,
author = {Petrov, Slav and Chang, Pi-Chuan and Ringgaard, Michael and Alshawi, Hiyan},
title = {Uptraining for Accurate Deterministic Question Parsing},
booktitle = {Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2010},
address = {Cambridge, MA},
publisher = {Association for Computational Linguistics},
pages = {705--713},
url = {http://www.aclweb.org/anthology/D10-1069}
}
It is well known that parsing accuracies drop significantly on out-of-domain
data. What is less known is that some parsers suffer more from domain
shifts than others. We show that dependency parsers have more difficulty
parsing questions than constituency parsers. In particular, deterministic
shift-reduce dependency parsers, which are of highest interest for
practical applications because of their linear running time, drop to 60%
labeled accuracy on a question test set. We propose an *uptraining*
procedure in which a deterministic parser is trained on the output of a
more accurate, but slower, latent variable constituency parser (converted
to dependencies). Uptraining with 100K unlabeled questions achieves
results comparable to having 2K labeled questions for training. With 100K
unlabeled and 2K labeled questions, uptraining is able to improve parsing
accuracy to 84%, closing the gap between in-domain and out-of-domain
performance.
@InProceedings{huang-harper-petrov:2010:EMNLP,
author = {Huang, Zhongqiang and Harper, Mary and Petrov, Slav},
title = {Self-Training with Products of Latent Variable Grammars},
booktitle = {Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2010},
address = {Cambridge, MA},
publisher = {Association for Computational Linguistics},
pages = {12--22},
url = {http://www.aclweb.org/anthology/D10-1002}
}
We study self-training with products of latent variable grammars in
this paper. We show that increasing the quality of the
automatically parsed data used for self-training gives higher
accuracy self-trained grammars. Our generative self-trained
grammars reach F scores of 91.6 on the WSJ test set and surpass even
discriminative reranking systems without self-training.
Additionally, we show that multiple self-trained grammars can be
combined in a product model to achieve even higher accuracy. The
product model is most effective when the individual underlying
grammars are most diverse. Combining multiple grammars that were
self-trained on disjoint sets of unlabeled data results in a final
test accuracy of 92.5% on the WSJ test set and 89.6% on our
Broadcast News test set.
@InProceedings{burkett-EtAl:2010:CONLL,
author = {Burkett, David and Petrov, Slav and Blitzer, John and Klein, Dan},
title = {Learning Better Monolingual Models with Unannotated Bilingual Text},
booktitle = {Proceedings of the Fourteenth Conference on Computational Natural Language Learning},
month = {July},
year = {2010},
address = {Uppsala, Sweden},
publisher = {Association for Computational Linguistics},
pages = {46--54},
url = {http://www.aclweb.org/anthology/W10-2906}
}
This work shows how to improve state-of-the-art monolingual natural
language processing models using unannotated bilingual text. We build
a multiview learning objective that enforces agreement between
monolingual and bilingual models. In our method the first,
monolingual view consists of supervised predictors learned separately
for each language. The second, bilingual view consists of log-linear
predictors learned over both languages on bilingual text. Our
training procedure estimates the parameters of the bilingual model
using the output of the monolingual model, and we show how to combine
the two models to account for dependence between views. For the task
of named entity recognition, using bilingual predictors increases
F1 by 16.1% absolute over a supervised monolingual model, and
retraining on bilingual predictions increases *monolingual* model
F1 by 14.6%. For syntactic parsing, our bilingual predictor
increases F1 by 2.1% absolute, and retraining a monolingual model
on its output gives an improvement of 2.0%.
@InProceedings{petrov:2010:NAACLHLT,
author = {Petrov, Slav},
title = {Products of Random Latent Variable Grammars},
booktitle = {Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
month = {June},
year = {2010},
address = {Los Angeles, California},
publisher = {Association for Computational Linguistics},
pages = {19--27},
url = {http://www.aclweb.org/anthology/N10-1003}
}
We show that the automatically induced latent variable grammars of
Petrov et al. 2006 vary widely in their underlying representations,
depending on their EM initialization point. We use this to our
advantage, combining multiple automatically learned grammars into
an unweighted product model, which gives significantly improved
performance over state-of-the-art individual grammars. In our model,
the probability of a constituent is estimated as a product of
posteriors obtained from multiple grammars that differ only in the
random seed used for initialization, without any learning or tuning
of combination weights. Despite its simplicity, a product of eight
automatically learned grammars improves parsing accuracy from 90.2%
to 91.8% on English, and from 80.3% to 84.5% on German.
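The combination rule described above (the probability of a constituent is the
product of its posteriors under several independently initialized grammars) is
simple to write down. The sketch below assumes hypothetical per-grammar
posterior tables; it is not code from the paper.

    import math
    from typing import Dict, List, Tuple

    Constituent = Tuple[str, int, int]   # (label, start, end)

    def product_posterior(constituent: Constituent,
                          grammar_posteriors: List[Dict[Constituent, float]]) -> float:
        # Unweighted product of per-grammar posteriors, computed in log space.
        # A constituent missing from one grammar's chart contributes zero.
        log_p = 0.0
        for posteriors in grammar_posteriors:
            p = posteriors.get(constituent, 0.0)
            if p == 0.0:
                return 0.0
            log_p += math.log(p)
        return math.exp(log_p)

    # Two grammars (different random seeds) agreeing on an NP over words 0-2.
    g1 = {("NP", 0, 2): 0.9}
    g2 = {("NP", 0, 2): 0.8}
    print(product_posterior(("NP", 0, 2), [g1, g2]))   # 0.72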
@incollection{bouchard-petrov-klein:2009:NIPS,
title = {Randomized Pruning: Efficiently Calculating Expectations in Large Dynamic Programs},
author = {Alexandre Bouchard-C\^{o}t\'{e} and Slav Petrov and Dan Klein},
booktitle = {Advances in Neural Information Processing Systems 22},
pages = {144--152},
year = {2009},
url = {http://www.petrovi.de/data/nips09.pdf}
}
Pruning can massively accelerate the computation of feature expectations
in large models. However, any single pruning mask will introduce bias.
We present a novel approach which employs a randomized sequence of
pruning masks. Formally, we apply auxiliary variable MCMC sampling to
generate this sequence of masks, thereby gaining theoretical guarantees
about convergence. Because each mask is generally able to skip large
portions of an underlying dynamic program, our approach is particularly
compelling for high-degree algorithms. Empirically, we demonstrate our
method on bilingual parsing, showing decreasing bias as more masks are
incorporated, and outperforming fixed tic-tac-toe pruning.
@incollection{petrov:2009:NIPS-WKSHP,
title = {Generative and Discriminative Latent Variable Grammars},
author = {Slav Petrov},
booktitle = {The Generative and Discriminative Learning Interface Workshop at NIPS 22},
year = {2009},
url = {http://www.petrovi.de/data/nips09w.pdf}
}
Latent variable grammars take an observed (coarse) treebank and induce
more fine-grained grammar categories that are better suited for
modeling the syntax of natural languages. Estimation can be done in a
generative or a discriminative framework, and results in the best
published parsing accuracies over a wide range of syntactically
divergent languages and domains. In this paper we highlight the
commonalities and the differences between the two learning paradigms.
@phdThesis{petrov:PhD,
author = {Petrov, Slav},
title = {Coarse-to-Fine Natural Language Processing},
school = {University of California at Berkeley},
address = {Berkeley, CA, USA},
year = {2009},
url = {http://www.petrovi.de/data/dissertation.pdf}
}
State-of-the-art natural language processing models are anything but
compact. Syntactic parsers have huge grammars, machine translation systems
have huge transfer tables, and so on across a range of tasks. With such
complexity come two challenges. First, how can we learn highly complex
models? Second, how can we efficiently infer optimal structures within
them?
Hierarchical coarse-to-fine methods address both questions.
Coarse-to-fine approaches exploit a sequence of models which introduce
complexity gradually. At the top of the sequence is a trivial model in
which learning and inference are both cheap. Each subsequent model
refines the previous one, until a final, full-complexity model is
reached. Because each refinement introduces only limited complexity,
both learning and inference can be done in an incremental fashion. In
this dissertation, we describe several coarse-to-fine systems.
In the domain of syntactic parsing, complexity is in the grammar. We
present a latent variable approach which begins with an X-bar grammar
and learns to iteratively refine grammar categories. For example, noun
phrases might be split into subcategories for subjects and objects,
singular and plural, and so on. This splitting process admits an
efficient incremental inference scheme which reduces parsing times by
orders of magnitude. Furthermore, it produces the best parsing
accuracies across an array of languages, in a fully language-general
fashion.
In the domain of acoustic modeling for speech recognition, complexity
is needed to model the rich phonetic properties of natural languages.
Starting from a mono-phone model, we learn increasingly refined models
that capture phone internal structures, as well as context-dependent
variations in an automatic way. Our approach reduces error rates
compared to other baseline approaches, while streamlining the learning
procedure.
In the domain of machine translation, complexity arises because there
are too many target language word types. To manage this complexity, we
translate into target language clusterings of increasing vocabulary
size. This approach gives dramatic speed-ups while additionally increasing
final translation quality.
@InProceedings{petrov-haghighi-klein:2008:EMNLP,
author = {Petrov, Slav and Haghighi, Aria and Klein, Dan},
title = {Coarse-to-Fine Syntactic Machine Translation using Language Projections},
booktitle = {Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2008},
address = {Honolulu, Hawaii},
publisher = {Association for Computational Linguistics},
pages = {108--116},
url = {http://www.aclweb.org/anthology/D08-1012}
}
The intersection of tree transducer-based translation models
with n-gram language models results in huge dynamic
programs for machine translation decoding. We propose a
multipass, coarse-to-fine approach in which the language
model complexity is incrementally introduced. In contrast
to previous *order-based* bigram-to-trigram approaches,
we focus on *encoding-based* methods, which use a
clustered encoding of the target language. Across various
hierarchical encoding schemes and for multiple language
pairs, we show speed-ups of up to 50 times over single-pass
decoding while improving BLEU score. Moreover, our entire
decoding cascade for trigram language models is faster than
the corresponding bigram pass alone of a bigram-to-trigram
decoder.
@InProceedings{petrov-klein:2008:EMNLP,
author = {Petrov, Slav and Klein, Dan},
title = {Sparse Multi-Scale Grammars for Discriminative Latent Variable Parsing},
booktitle = {Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing},
month = {October},
year = {2008},
address = {Honolulu, Hawaii},
publisher = {Association for Computational Linguistics},
pages = {867--876},
url = {http://www.aclweb.org/anthology/D08-1091}
}
We present a discriminative, latent variable approach to
syntactic parsing in which rules exist at multiple scales
of refinement. The model is formally a latent variable
CRF grammar over trees, learned by iteratively splitting
grammar productions (not categories). Different regions
of the grammar are refined to different degrees, yielding
grammars which are three orders of magnitude smaller
than the single-scale baseline and 20 times smaller than
the split-and-merge grammars of Petrov et al. 2006.
In addition, our discriminative approach integrally admits
features beyond local tree configurations. We present a
multi-scale training method along with an efficient
CKY-style dynamic program. On a variety of domains
and languages, this method produces the best published
parsing accuracies with the smallest reported grammars.
@inproceedings{favre-etal:2008:SLT,
author = {Favre, Benoit and Hakkani-Tur, Dilek and Petrov, Slav and Klein, Dan},
title = {{Efficient Sentence Segmentation Using Syntactic Features}},
booktitle = {Spoken Language Technologies (SLT)},
year = {2008},
address = {Goa, India},
url = {http://petrovi.de/data/slt08.pdf}
}
To enable downstream language processing, automatic speech
recognition output must be segmented into its individual sentences.
Previous sentence segmentation systems have typically been very
local, using low-level prosodic and lexical features to independently
decide whether or not to segment at each word boundary position.
In this work, we leverage global syntactic information from a syntactic
parser, which is better able to capture long distance dependencies. While
some previous work has included syntactic features, ours is the first to
do so in a tractable, lattice-based way, which is crucial for scaling up
to long-sentence contexts. Specifically, an initial hypothesis lattice is
constructed using local features. Candidate
sentences are then assigned syntactic language model scores. These
global syntactic scores are combined with local low-level scores in
a log-linear model. The resulting system significantly outperforms
the most popular long-span model for sentence segmentation (the
hidden event language model) on both reference text and automatic
speech recognizer output from news broadcasts.
@InProceedings{petrov-klein:2008:PaGe,
author = {Petrov, Slav and Klein, Dan},
title = {Parsing {German} with Latent Variable Grammars},
booktitle = {Proceedings of the Workshop on Parsing German at ACL '08},
month = {June},
year = {2008},
address = {Columbus, Ohio},
publisher = {Association for Computational Linguistics},
pages = {33--39},
url = {http://www.aclweb.org/anthology/W/W08/W08-1005}
}
We describe experiments on learning latent variable
grammars for various German treebanks, using a
language-agnostic statistical approach. In our method,
a minimal initial grammar is hierarchically refined
using an adaptive split-and-merge EM procedure,
giving compact, accurate grammars. The learning
procedure directly maximizes the likelihood of the
training treebank, without the use of any language
specific or linguistically constrained features.
Nonetheless, the resulting grammars encode many
linguistically interpretable patterns and give the best
published parsing accuracies on three German
treebanks.
@InProceedings{petrov-klein:2008:NIPS2008,
author = {Slav Petrov and Dan Klein},
title = {Discriminative Log-Linear Grammars with Latent Variables},
booktitle = {Advances in Neural Information Processing Systems 20 (NIPS)},
editor = {J.C. Platt and D. Koller and Y. Singer and S. Roweis},
publisher = {MIT Press},
address = {Cambridge, MA},
pages = {1153--1160},
year = {2008},
url = {http://books.nips.cc/papers/files/nips20/NIPS2007_0630.pdf}
}
We demonstrate that log-linear grammars with latent variables can be
practically trained using discriminative methods. Central to
efficient discriminative training is a hierarchical pruning procedure
which allows feature expectations to be efficiently approximated
in a gradient-based procedure. We compare L1 and L2 regularization
and show that L1 regularization is superior, requiring fewer iterations
to converge, and yielding sparser solutions. On full-scale treebank
parsing experiments, the discriminative latent models outperform both
the comparable generative latent models as well as the discriminative
non-latent baselines.
@InProceedings{petrov-pauls-klein:2007:EMNLP-CoNLL2007,
author = {Petrov, Slav and Pauls, Adam and Klein, Dan},
title = {Learning Structured Models for Phone Recognition},
booktitle = {Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages = {897--905},
year = {2007},
url = {http://www.aclweb.org/anthology/D/D07/D07-1094}
}
We present a maximally streamlined approach to learning
HMM-based acoustic models for automatic speech recognition.
In our approach, an initial monophone HMM is iteratively
refined using a split-merge EM procedure which makes no
assumptions about subphone structure or context-dependent
structure, and which uses only a single Gaussian per HMM
state. Despite the much simplified training process, our
acoustic model achieves state-of-the-art results on phone
classification (where it outperforms almost all other methods) and
competitive performance on phone recognition (where it
outperforms standard CD triphone / subphone / GMM approaches).
We also present an analysis of what is and is not learned by
our system.
@InProceedings{liang-EtAl:2007:EMNLP-CoNLL2007,
author = {Liang, Percy and Petrov, Slav and Jordan, Michael and Klein, Dan},
title = {The Infinite {PCFG} Using Hierarchical {Dirichlet} Processes},
booktitle = {Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)},
pages = {688--697},
year = {2007},
url = {http://www.aclweb.org/anthology/D/D07/D07-1072}
}
We present a nonparametric Bayesian model
of tree structures based on the hierarchical
Dirichlet process (HDP). Our HDP-PCFG
model allows the complexity of the grammar
to grow as more training data is available.
In addition to presenting a fully Bayesian
model for the PCFG, we also develop an efficient
variational inference procedure. On
synthetic data, we recover the correct grammar
without having to specify its complexity
in advance. We also show that our techniques
can be applied to full-scale parsing
applications by demonstrating their effectiveness
in learning state-split grammars.
@inproceedings{Petrov-Klein-2007:AAAI,
author = {Slav Petrov and Dan Klein},
title = {Learning and Inference for Hierarchically Split {PCFG}s},
booktitle = {AAAI 2007 (Nectar Track)},
year = {2007},
url = {http://www.petrovi.de/data/aaai2007.pdf},
}
Treebank parsing can be seen as the search for an optimally
refined grammar consistent with a coarse training treebank.
We describe a method in which a minimal grammar is hierarchically
refined using EM to give accurate, compact grammars. The resulting
grammars are extremely compact compared to other high-performance
parsers, yet the parser gives the best published accuracies on several
languages, as well as the best generative parsing numbers in English.
In addition, we give an associated coarse-to-fine inference scheme
which vastly improves inference time with no loss in test set
accuracy.
@InProceedings{petrov-klein:2007:main,
author = {Petrov, Slav and Klein, Dan},
title = {Improved Inference for Unlexicalized Parsing},
booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference},
month = {April},
year = {2007},
address = {Rochester, New York},
publisher = {Association for Computational Linguistics},
pages = {404--411},
url = {http://www.aclweb.org/anthology/N/N07/N07-1051}
}
We present several improvements to unlexicalized
parsing with hierarchically state-split PCFGs. First,
we present a novel coarse-to-fine method in which
a grammar's own hierarchical projections are used
for incremental pruning, including a method for efficiently
computing projections of a grammar without
a treebank. In our experiments, hierarchical
pruning greatly accelerates parsing with no loss in
empirical accuracy. Second, we compare various
inference procedures for state-split PCFGs from the
standpoint of risk minimization, paying particular
attention to their practical tradeoffs. Finally, we
present multilingual experiments which show that
parsing with hierarchical state-splitting is fast and
accurate in multiple languages and domains, even
without any language-specific tuning.
@inproceedings{Petrov-EtAl:2006:TRECVID,
author = {Slav Petrov and Arlo Faria and Pascal Michaillat and Alexander Berg and Andreas Stolcke and Dan Klein and Jitendra Malik},
title = {Detecting Categories in News Video using Acoustic, Speech and Image Features},
booktitle = {Proceedings of (VIDEO) TREC (TrecVid 2006)},
year = {2006},
url = {http://www.petrovi.de/data/trecvid06.pdf},
}
This work describes systems for detecting semantic categories
present in news video. The multimedia data was processed in
three ways: the audio signal was converted to a sequence of
acoustic features, automatic speech recognition provided a
word-level transcription, and image features were computed for
selected frames of the video signal. Primary acoustic, speech,
and vision systems were trained to discriminate instances of
the categories. Higher-level systems exploited correlations
among the categories, incorporated sequential context, and
combined the joint evidence from the three information sources.
We present experimental results from the TREC video retrieval
evaluation.
@InProceedings{petrov-EtAl:2006:COLACL,
author = {Petrov, Slav and Barrett, Leon and Thibaux, Romain and Klein, Dan},
title = {Learning Accurate, Compact, and Interpretable Tree Annotation},
booktitle = {Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics},
month = {July},
year = {2006},
address = {Sydney, Australia},
publisher = {Association for Computational Linguistics},
pages = {433--440},
url = {http://www.aclweb.org/anthology/P/P06/P06-1055}
}
We present an automatic approach to tree annotation
in which basic nonterminal symbols are alternately
split and merged to maximize the likelihood
of a training treebank. Starting with a simple X-bar
grammar, we learn a new grammar whose nonterminals
are subsymbols of the original nonterminals.
In contrast with previous work, we are able
to split various terminals to different degrees, as appropriate
to the actual complexity in the data. Our
grammars automatically learn the kinds of linguistic
distinctions exhibited in previous work on manual
tree annotation. On the other hand, our grammars
are much more compact and substantially more accurate
than previous work on automatic annotation.
Despite its simplicity, our best grammar achieves
an F1 of 89.9% on the Penn Treebank, higher than
most fully lexicalized systems.
@InProceedings{petrov-barrett-klein:2006:CoNLL-X,
author = {Petrov, Slav and Barrett, Leon and Klein, Dan},
title = {Non-Local Modeling with a Mixture of {PCFG}s},
booktitle = {Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)},
month = {June},
year = {2006},
address = {New York City},
publisher = {Association for Computational Linguistics},
pages = {14--20},
url = {http://www.aclweb.org/anthology/W/W06/W06-2903}
}
While most work on parsing with PCFGs
has focused on local correlations between
tree configurations, we attempt to model
non-local correlations using a finite mixture
of PCFGs. A mixture grammar fit
with the EM algorithm shows improvement
over a single PCFG, both in parsing
accuracy and in test data likelihood. We
argue that this improvement comes from
the learning of specialized grammars that
capture non-local correlations.
@inproceedings{Tomasi-Petrov-Sastry-2003:ICCV,
author = {Carlo Tomasi and Slav Petrov and Arvind Sastry},
title = {3{D} Tracking = {C}lassification + {I}nterpolation},
booktitle = {Proceedings of the Ninth IEEE International Conference on Computer Vision (ICCV)},
year = {2003},
url = {http://www.petrovi.de/data/iccv03.pdf},
}
Hand gestures are examples of fast and complex motions.
Computers fail to track these in fast video, but sleight of
hand fools humans as well: what happens too quickly we
just cannot see. We show a 3D tracker for these types of
motions that relies on the recognition of familiar configurations
in 2D images (classification), and fills the gaps
in-between (interpolation). We illustrate this idea with experiments
on hand motions similar to finger spelling. The
penalty for a recognition failure is often small: if two
configurations are confused, they are often similar to each
other, and the illusion works well enough, for instance, to
drive a graphics animation of the moving hand. We contribute
advances in both feature design and classifier training:
our image features are invariant to image scale, translation,
and rotation, and we propose a classification method
that combines VQPCA with discrimination trees.
@mastersthesis{Petrov-Masters,
author = {Slav Petrov},
title = {Computer vision, sensor fusion, and behavior control for soccer playing robots},
school = {Freie Universitaet Berlin},
year = {2004},
url = {http://www.petrovi.de/data/slav_diplom_arbeit.pdf},
}
This Master's thesis describes parts of the control software
used by the soccer robots of the Free University of Berlin,
the so-called FU-Fighters. The FU-Fighters compete in the
Middle Size League of RoboCup and reached the semi-finals
during the 2004 RoboCup World Cup in Lisbon, Portugal. The
thesis covers several independent topics:
- Automatic White Balance: It is shown how to improve the
white balancing of an omni-directional camera by using a
reference color and a PID-controller.
- Ball Tracking: The reliable tracking of the ball is vital
in robot soccer. Therefore, a Kalman-filter-based system for
estimating the ball position and velocity in the presence
of occlusions is developed (a minimal sketch of such a filter
appears after this list).
- Sensor Fusion: The robot perceives its environment through
several independent sensors (camera, odometer, etc.), which
have different delays. We propose a novel method for fusing
the sensor data and show our results through examples of
self-localization.
- Behavior Control: Finally we show how all these elements
can be incorporated into a goal keeping robot. We develop
simple behaviors that can be used in a layered architecture
and enable the robot to block most balls that are being shot
at the goal.
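The Kalman-filter ball tracker mentioned in the list above can be illustrated
with a generic constant-velocity filter. The sketch uses the standard
predict/update equations with made-up noise parameters; it is not the thesis
implementation.

    import numpy as np

    def kalman_step(x, P, z, dt=1.0, q=0.1, r=1.0):
        # One predict/update cycle for a constant-velocity 2D ball model.
        # State x = [px, py, vx, vy]; z is a noisy (px, py) measurement.
        F = np.array([[1, 0, dt, 0],
                      [0, 1, 0, dt],
                      [0, 0, 1, 0],
                      [0, 0, 0, 1]], dtype=float)
        H = np.array([[1, 0, 0, 0],
                      [0, 1, 0, 0]], dtype=float)
        Q = q * np.eye(4)                     # process noise (assumed)
        R = r * np.eye(2)                     # measurement noise (assumed)
        x = F @ x                             # predict
        P = F @ P @ F.T + Q
        y = z - H @ x                         # update with the observed position
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ y
        P = (np.eye(4) - K @ H) @ P
        return x, P

    x, P = np.zeros(4), np.eye(4)
    for z in [np.array([1.0, 0.0]), np.array([2.1, 0.1]), np.array([3.0, 0.2])]:
        x, P = kalman_step(x, P, z)
    print(x.round(2))   # estimated position and velocity after three frames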
Other materials:
Slav Petrov - Слав Петров, November 2018