Computational Methods for Integrating Vision and Language

Kenichi Kanatani, Yasuyuki Sugaya

Language English

Cover Softcover

Published 2016-04-21

€65.98 €82.48

-20% with code BOOKS

In stock at our supplier

Shipping in 12-18 days

30-day return policy

Description

Modeling data from visual and linguistic modalities together creates opportunities for better understanding of both, and supports many useful applications. Examples of dual visual-linguistic data include images with keywords, video with narrative, and figures in documents. We consider two key task-driven themes: translating from one modality to another (e.g., inferring annotations for images) and understanding the data using all modalities, where one modality can help disambiguate information in another. The multiple modalities can either be essentially semantically redundant (e.g., keywords provided by a person looking at the image), or largely complementary (e.g., metadata such as the camera used). Redundancy and complementarity are two endpoints of a scale, and we observe that good performance on translation requires some redundancy, and that joint inference is most useful where some information is complementary. Computational methods discussed are broadly organized into ones for simple keywords, ones going beyond keywords toward natural language, and ones considering sequential aspects of natural language. Methods for keywords are further organized based on localization of semantics, going from words about the scene taken as a whole, to words that apply to specific parts of the scene, to relationships between parts. Methods going beyond keywords are organized by the linguistic roles that are learned, exploited, or generated. These include proper nouns, adjectives, spatial and comparative prepositions, and verbs. More recent developments in dealing with sequential structure include automated captioning of scenes and video, alignment of video and text, and automated answering of questions about scenes depicted in images.

More Information

Author	Kenichi Kanatani, Yasuyuki Sugaya
Publisher	Springer Nature Switzerland
Series	Synthesis Lectures on Computer Vision
Release year	2016
Cover type	Softcover
EAN	9783031006869
