IX Summer School for advanced studies in Multimodal Information Retrieval - [ERMITES]


From Information Development to Information Retrieval

23, 24 and 25 September 2014, Ile de Porquerolles - National Park - Var

The ERMITES 2014 Summer School brings together leading international researchers and gives participants the opportunity to gain deeper insight into current research trends in scaled audiovisual information retrieval. It is organized as a series of long talks, during which students are invited to interact.

The target audience of the school is graduate and PhD students, post-doctoral researchers, and academic or industrial researchers.
Any participant may present their research as a poster or a talk; please contact the organizers if you wish to do so.
As the number of participants is limited (about 32), a confirmation notification will be sent (first come, first served policy).

- Early deadline for registration: 30 May.

Preliminary program:

A) Overview and Objectives

H. Glotin *, Professor Inst. Univ France, UTLN, CNRS, LSIS

Cognilego ANR project: from Pixels to Semantics, http://cognilego.univ-tln.fr


B) Cognitive Development

J. Grainger *, Research Director CNRS LPC

Cognition & Reading


I'll present our new theory of orthographic processing - that is a theory about how information concerning letter identity and letter position is encoded during reading. The theory is couched within a general framework for word recognition that makes a critical distinction between a coarse-grained orthographic code that provides a fast-track to semantics, and a fine-grained orthographic code that is used to generate a prelexical phonological code, hence providing the connection with auditory word processing. My talk is divided into three sections, each examining a specific component of this theoretical framework. Section 1 examines low-level visual constraints on the earliest phase of orthographic processing - the retinotopic mapping of visual features onto letter identities. Section 2 examines how this preliminary orthographic information can be most efficiently used to constrain lexical identity via a coarse-grained, word-centered orthographic code. Section 3 examines a further critical constraint on orthographic processing - the fact that the orthographic system is grafted onto a pre-existing phonological system during the course of reading acquisition. It is this unique combination that is expected to generate the breakthroughs that will provide the foundations for a general account of skilled reading and its breakdown in reading disabled persons.

C. Touzet *, Senior Researcher AMU CNRS LNIA

Cognition Neural Theory & Reading


Formalized in 2010, the Theory of neuronal Cognition (TnC) departs from all existing materialist theories of mind by claiming that our brain does not process information, but only represents information. The logical implication is that we are only a crystallization of our interactions with the environment. Since "extraordinary claims require extraordinary proofs", the goal of my talk will be to provide the audience with the neuronal blueprints of a number of cognitive functions and concepts. Reading will illustrate my description of the cortex as a hierarchy of self-organizing associative memories. Afterwards, I will show how the synergy between sensory and sensory-motor maps generates behaviors, and offer explanations about intelligence (a side effect of the observer's knowledge), consciousness (an automatic verbalization), endogenous and exogenous attention, episodic and semantic memories, and motivation or joy (a side effect of how associative memories function). Finally, I will present new insights about how unsupervised systems achieve homeostasis.

T. Hannagan *, Coordinator of Neurocomputation group in ERC Brain & Language Research Institute

Spherical reader and Convolutional Neural Net

Brain and Language Research Institute – ERC

"What are the cognitive representations that children use for letters and for words in the very first stages of reading? I will describe a deep learning convolutional model that operates with a plausible developmental timeline and with a realistic visual environment. With this model, I will then explore the possible mechanisms whereby mirror invariance could be formed and selectively broken in the child's visual system, upon learning about letter and word stimuli."

C) Simulated Information Development

P.-Y. Oudeyer *, Research Director INRIA

Curiosity-driven automatic learning and information development with robots


A great mystery is how human infants develop: how they progressively discover their bodies, how they learn to interact with objects and social peers, and how they accumulate new skills throughout their lives. Constructing robots, and building mechanisms that model such developmental processes, is key to advancing our understanding of human development. I will present examples of robotic models of curiosity-driven learning and exploration, and show how developmental trajectories can self-organize, starting from discovery of the body, then object affordances, then vocal babbling and vocal interactions with others. In particular, I will show that the onset of language spontaneously forms out of such sensorimotor development.

B. de Boer *, Professor, ULB, Belgium

Evolution of Language Learning


Acquisition of speech and language can be seen as an example of sophisticated information retrieval, yet it is performed effortlessly by children. This feat is even more amazing if we take into account that children start essentially from scratch, knowing neither the signals nor the meanings they have to learn. However, when we consider that language has evolved both biologically and culturally, it becomes clear that its acquisition may be less mysterious than once thought. In this contribution, we will discuss what students of information retrieval can learn from studying the acquisition of language, as well as what linguists can learn from studying information retrieval. It will contain a brief overview of what children do, what we (think we) know about evolution and what role computer models have played. The focus will be on speech, not only because this is the presenter's specialty, but also because it is a physical signal (which makes it easier to study directly) as well as a continuous signal (which makes it a special challenge to study computationally).

A. Graves *, Senior Researcher, Toronto

Learning to Write


The idea of building a machine able to perform the quintessentially human act of cursive handwriting has fascinated scientists and inventors for centuries. As well as being a challenging task in fine motor control, handwriting is interesting from the perspective of pattern discovery due to the great diversity of writing styles and letter forms. This talk describes a novel recurrent neural network architecture able to transform character sequences into highly realistic pen trajectories. Unlike most handwriting synthesis methods (which are trained for a single writer), the network learns to model, and interpolate between, a wide variety of writing styles. It can also be used to mimic – and even improve – the writing of a particular individual.

D) Robust Scaled Information Retrieval

C. Kermorvant *, Research Manager A2IA

Deep Neural Net for Reading


Since their first success in 2009, deep neural networks have been widely adopted by the written text recognition community. Today, most state-of-the-art systems for this task include deep and recurrent neural networks for feature extraction, classification and/or sequence modeling. We will present how deep architectures are used in text recognition systems and what results these systems have achieved in recent international evaluations.

H. Li *, Associate Researcher at INRA, Paris. CNRS LIP6

Multimedia Maximal Marginal Relevance for Multi-video Summarization


The amount of video from mobile phones, personal DV, video surveillance, the movie industry and so on is rapidly increasing on the Internet and in our daily life. Consequently, how to manage such a large amount of visual data is now an active research topic. Video summarization has been identified as an important component for dealing with large-scale video data. It produces an abbreviated form of a video by extracting its most important and pertinent content. I will present a novel video summarization algorithm, Maximal Marginal Relevance (MMR), which incrementally constructs the video summary by exploiting all the multimodal indices in the video, including the text, the video and the audio. MMR is a universal approach for all video genres and does not require a priori knowledge.
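As a rough illustration of the incremental construction mentioned above, here is a minimal sketch of a generic greedy MMR selection in Python. It is not the speaker's multimodal algorithm: the feature vectors, the cosine similarity, the relevance definition (mean similarity to all items) and the names mmr_select and lam are assumptions made for this example only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is null)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mmr_select(features, k, lam=0.7):
    """Greedily pick up to k item indices, trading relevance against redundancy.

    features: one feature vector per candidate item (e.g. per keyframe); hypothetical input.
    lam: weight in [0, 1]; higher values favor relevance over novelty.
    """
    n = len(features)
    sim = [[cosine(features[i], features[j]) for j in range(n)] for i in range(n)]
    # Assumption: an item is "relevant" if it is similar, on average, to the whole set.
    relevance = [sum(row) / n for row in sim]

    selected = []
    candidates = set(range(n))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to anything already in the summary.
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    # Two near-identical items and one different item: the summary keeps one of each kind.
    toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(mmr_select(toy, k=2, lam=0.5))
```

Each step adds the item that is most representative of the collection while being least similar to what the summary already contains, which is why the summary can be built incrementally and stopped at any length.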

P. Bellot *, Professor AMU CNRS LSIS

Information Retrieval in Big Text


Information retrieval focuses on automatically linking textual user queries to documents: web pages, books, news, tweets... The first numerical models defined statistical and probabilistic criteria to represent how representative a word is of a document collection and how likely a document is to be relevant for a user. This led to the concept of user profiles and to "learning to rank" information retrieval models. In addition, the Web enabled the development of models that take into account the hyperlinks between pages, while the construction of large semantic networks and of natural language processing software enabled the inclusion of high-level features in the retrieval models. In this talk, I will describe some models that are effective on very large collections of documents and I will show how different disciplines can work together to achieve more adaptive and personalized search models and systems. I will present the French Equipment of Excellence OpenEdition.org, a Digital Library for Open Humanities, that aims to develop new capabilities for browsing, searching and reading recommendation.

ERMITES is supported by TOULON PROVENCE MEDITERRANEE (TPM), USTV, MASTODONS CNRS project, IUF, INRIA, CNRS, LSIS, ARIA, Fed. for Computer Sciences and Interactions (FRIIAM), and ANR COGNILEGO.

ERMITES is recognized by the doctoral schools as disciplinary lectures, for a total of 25 hours.

Link to online videos of previous editions and link to previous ERMITES editions.

Registration Fees (payment by credit card (CB) or invoice to UTLN)
You may choose between a 1-day and a 3-day pack, and between a single and a shared studio room.
The 3-day pack includes: 2 nights, 5 meals, 2 breakfasts, coffee breaks, proceedings,
with D1 or D2 registration for a shared double studio room,
D1 : PhD students, post-docs and Master's students only = 300 euros,
D2 : Others (permanent position, company) = 450 euros.
Or with the S1 or S2 formula for a single room,
S1 : Like D1 but single room = 330 euros,
S2 : Like D2 but single room = 480 euros.

The daily pack includes 1 meal, coffee break and proceedings, without accommodation.
Daily student (PhD, post-doc, Master) = 70 euros per day,
Daily non-student = 100 euros per day.
You can pay either by invoice or by credit card (CB) via the online registration (to open in Feb 2014; e-mail ermites@univ-tln.fr now to book your place).


Access : ERMITES 2014 takes place at the IGESA center, in the middle of Porquerolles island, with access from Hyeres TGV station then bus (67), or from Toulon International Airport, then boat (15 min). We may also organize car rides from Hyeres to the boat - More details on trains / boats.

Social activities : a short walk from IGESA to the Cap Grand Langoustier will give attendees a breath of fresh air in this paradise and the opportunity to continue informal discussions.

Committees :

Organizing co. : Pierre-Hugues Joalland and J. Razik (pres), H. Glotin, M. Bartcus
Program co. : H. Glotin (Pres.), S. Bengio, S. Paris, J. Razik, T. Artieres, F. Chamroukhi.
Contact : ermites@gmail.com