Research Challenges and Opportunities in Knowledge Representation
3. What Can KR Do for You? The Application Pull
Extracting knowledge from data, creating new knowledge-driven applications, and generating new expressive knowledge will likely lead to advances in many areas
Almost any domain that has any data to process into knowledge will benefit from advances in KR
KR will play a role in assuring success in many of the challenges
KR is already being applied in
finance
biomedicine
health care and life sciences
oil and gas industry and sustainable energy
engineering
open government initiatives
earth and environmental sciences
defence, autonomous robotics, digital humanities
social sciences (census and decision making)
museums and cultural collections
material and geosciences
personal assistants
KR methods underpin
information management and retrieval
data analysis and analytics
machine learning
processing of sensor data
agents and multi-agent collaboration
representation of engineering systems
natural language processing and understanding
representation of preferences
human-human and human-machine collaboration (human augmentation)
3.1 Scientific discovery
The KR methods provide
mechanisms for formulating and processing complex queries over heterogeneous sources
common ontologies to share these descriptions
representation formalisms to describe the data
methods to overcome heterogeneity and variety of data
reasoning over the increasing volumes of data
formalisms to describe provenance of the data and its context
3.1.1 Use case: environmental sustainability
Data sources range from industrial reports to a wealth of on-the-ground measurements representing sensors, systematic monitoring efforts
The semantic challenges here are clear and prevalent
The semantic challenge
of assisting with the discovery and integration of highly heterogeneous data
Semantic technologies can help tame terminological idiosyncrasies that currently abound within earth science domains
KR techniques can motivate the development and use of standardized terminologies for the Earth Sciences, with the advantage of helping to unify and disambiguate semantic intention
KR techniques can greatly enhance the comparability and re-use of analyses, models, and workflows
By deploying best practices in ontology construction, KR techniques can enable far more than simple terminological harmonization
The logical expressivity of modern KR languages enables the expression of detailed provenance information and other metadata
Finally, by constructing community-based, cross-disciplinary ontologies, KR methods can escalate prospects for transdisciplinary communication
The KR&R community can assist the earth sciences
by helping the field to better organize and adapt its data and modeling resources, and its communication of results, to a digital, networked information environment
KR solutions will be prime enablers for Grand Challenge questions in the Earth Sciences
3.1.2 Use case: Biomedical and pharmaceutical research
A major aspect of modern scientific data management to create useful data for query answering and analysis lies
The most recognized use of ontology in biomedical research is enrichment analysis
The goal of enrichment analysis is to find a set of attributes that are significantly enriched in a target set over some background set also sharing that attribute
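The enrichment test described above can be sketched as a one-sided hypergeometric tail probability. This is only an illustration under stated assumptions: the source does not name a specific statistical test, and the function name and the counts in the usage line are hypothetical.

```python
from math import comb


def enrichment_p_value(background_size, background_with_attr,
                       target_size, target_with_attr):
    """Probability of seeing at least `target_with_attr` annotated items when
    `target_size` items are drawn without replacement from a background of
    `background_size` items, `background_with_attr` of which carry the
    attribute (hypergeometric upper tail)."""
    total = comb(background_size, target_size)
    upper = min(background_with_attr, target_size)
    tail = sum(
        comb(background_with_attr, k)
        * comb(background_size - background_with_attr, target_size - k)
        for k in range(target_with_attr, upper + 1)
    )
    return tail / total


# Hypothetical counts: 3 of 10 background genes carry the ontology term,
# and all 3 genes in the target set carry it.
p = enrichment_p_value(10, 3, 3, 3)
```

A small p-value suggests that the attribute is over-represented in the target set relative to the background; in practice the test is repeated per ontology term and corrected for multiple testing.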
A major outstanding challenge lies in being able to
reconcile these associations with prior knowledge
establishing the degree to which we are confident about any assertion found in a web of data in which broad scientific
3.1.3 Use case: Advancing healthcare
Continuous and large-scale analysis of data will lead to new insights in many areas.
We will need KR in order to
aggregate information
automatically match patients with other patients and with clinical trials
3.2 Education
One of the major success stories of AI and Cognitive Science has been the rise of intelligent tutoring systems
Intelligent tutoring systems and learning environments incorporate formally represented models of the domain and skills to be learned
A key in creating such systems is the availability of formally represented domain knowledge
Intelligent tutoring systems
fueled by substantial knowledge bases and commonsense knowledge
could revolutionize education
3.3 Robotics, sensors, computer vision
3.3.1 Household robots
The control of robotic agents has been a key motivating topic since the early days of KR
In AI-based robot control, plans are sequences of actions to achieve a given goal
There are three critical areas:
Methods for representing knowledge
Methods for updating the robot’s internal knowledge representation based on percepts and actions
Methods for planning, execution, and execution monitoring
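The idea of a plan as a sequence of actions that achieves a goal can be sketched with a tiny STRIPS-style planner. This is a minimal illustration, not a method taken from the source: the action names, the facts, and the use of breadth-first search are all assumptions chosen for brevity.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    preconditions: frozenset   # facts that must hold before the action
    add_effects: frozenset     # facts made true by the action
    delete_effects: frozenset  # facts made false by the action


def plan(initial, goal, actions):
    """Breadth-first search over states (sets of facts).
    Returns a shortest list of action names, or None if no plan exists."""
    start = frozenset(initial)
    goal = frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for action in actions:
            if action.preconditions <= state:
                successor = (state - action.delete_effects) | action.add_effects
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, steps + [action.name]))
    return None


# Hypothetical household domain.
ACTIONS = [
    Action("pick_up_cup", frozenset({"at_table", "hand_empty"}),
           frozenset({"holding_cup"}), frozenset({"hand_empty"})),
    Action("go_to_sink", frozenset({"at_table"}),
           frozenset({"at_sink"}), frozenset({"at_table"})),
    Action("put_cup_in_sink", frozenset({"at_sink", "holding_cup"}),
           frozenset({"cup_in_sink", "hand_empty"}), frozenset({"holding_cup"})),
]

household_plan = plan({"at_table", "hand_empty"}, {"cup_in_sink"}, ACTIONS)
# household_plan == ["pick_up_cup", "go_to_sink", "put_cup_in_sink"]
```

The sketch assumes a fully known, discrete state and deterministic actions, which is exactly what a real household robot cannot rely on; execution monitoring and belief updates from percepts are left out.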
Reasoning about actions
we have to investigate and develop knowledge processing methods
The decisions are context- and task-dependent
that, given a vague task, are capable of inferring the information needed to perform the appropriate action
Uncertainty
There are three major challenges for operating in a domain
For example, a household robot operates:
in a mixed continuous-discrete state space
the dimensionality of which is unbounded
with uncertainty about the effects of actions
3.3.2 Understanding spatial and spatio-temporal data
There are many applications that
require a sophisticated understanding of such data
can be usefully augmented with it, from surveillance, to mobile assistance, to environmental monitoring
The use of qualitative spatial representations
provides some relief from both the volume and noisiness of such data
enables integration of different kinds of spatial knowledge
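How a qualitative representation absorbs noise can be sketched as follows: several slightly different numeric readings collapse to one symbolic relation. The relation names are a coarse, RCC-like vocabulary chosen for illustration, and the bounding boxes are hypothetical; neither comes from the source.

```python
def qualitative_relation(a, b):
    """Coarse qualitative relation between two axis-aligned boxes,
    each given as (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    if ax1 < bx0 or bx1 < ax0 or ay1 < by0 or by1 < ay0:
        return "disjoint"
    if (ax0, ay0, ax1, ay1) == (bx0, by0, bx1, by1):
        return "equal"
    if bx0 <= ax0 and by0 <= ay0 and ax1 <= bx1 and ay1 <= by1:
        return "inside"
    if ax0 <= bx0 and ay0 <= by0 and bx1 <= ax1 and by1 <= ay1:
        return "contains"
    return "overlaps"


# Hypothetical data: three noisy detections of the same cup on a table.
TABLE = (0.0, 0.0, 10.0, 5.0)
readings = [(2.0, 2.0, 3.0, 3.0), (2.1, 1.9, 3.05, 3.2), (1.95, 2.05, 2.9, 3.1)]
relations = {qualitative_relation(cup, TABLE) for cup in readings}
# relations == {"inside"}: the numeric noise disappears at the symbolic level
```

Storing the single fact "cup inside table" instead of every reading also reduces volume, and the same vocabulary can describe knowledge that comes from sensors, maps, or text.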
Understanding visual data has been a challenge for the Computer Vision community for decades
Much progress has been made in methods which attempt to understand such data using continuous/numerical techniques
The challenge is made that much harder by the sheer volume of data
The sheer volume of Big Data also provides mitigation for the problem since there is often redundancy
Another form of mitigation can come in the form of background knowledge,
to help understand missing data
to correct noisy data
to help integrate and fuse conflicting data
Another opportunity in this area is to combine data acquired from sensors with language data
3.4 From Text to Knowledge
Question answering systems are systems that answer questions with respect to a collection of documents
The questions in QA systems are often in natural language and the documents include text and may include other forms of information
Question-answering systems are useful in many domains
At the top level, question answering involves understanding questions, understanding text and other forms of information, and formulating answers
KR and reasoning play important roles in this
Understanding questions and text is mainly about natural language understanding.
Building natural language understanding systems involves translating text to a KR formalism
augmenting it with various kinds of knowledge that include commonsense knowledge, domain knowledge and linguistic knowledge
reasoning with all of them to come up with components of answers, which then need to be glued together to form answers
3.5 Why KR?
Reasoning
Reasoning about actions and objects in the outside world
Hierarchical inference
Inferring new facts from explicitly asserted data and knowledge
Query expansion and query answering from heterogeneous data sources
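Hierarchical inference of new facts from asserted ones can be sketched with a small fixpoint computation over a class hierarchy. The class and individual names are hypothetical, and the single rule used here (membership propagates to superclasses) is only one simple instance of the reasoning listed above.

```python
def infer_types(subclass_of, instance_of):
    """subclass_of: set of (child_class, parent_class) pairs.
    instance_of: set of (individual, class) pairs.
    Returns all (individual, class) facts, asserted plus inferred,
    by applying the rule until no new fact appears."""
    facts = set(instance_of)
    changed = True
    while changed:
        changed = False
        for individual, cls in list(facts):
            for child, parent in subclass_of:
                if child == cls and (individual, parent) not in facts:
                    facts.add((individual, parent))
                    changed = True
    return facts


# Hypothetical hierarchy and one asserted fact.
hierarchy = {("Dog", "Mammal"), ("Mammal", "Animal")}
asserted = {("rex", "Dog")}
all_facts = infer_types(hierarchy, asserted)
# all_facts also contains ("rex", "Mammal") and ("rex", "Animal")
```

The same closure supports query expansion: a query for all instances of "Animal" returns "rex" even though no source states that fact explicitly.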
Ontologies and other formal domain models
Explicit and unambiguous domain descriptions for knowledge sharing
Reuse and comparability of models, analyses, and interpretations
Domain models for natural-language understanding
Ontology-based data access for heterogeneous data sources
Advanced KR languages and techniques
Formal representation of both domain knowledge and students in education systems
Use of knowledge representation in machine learning
Understanding text and extracting explicit knowledge from it
KR as "lingua franca" for diverse knowledge resources
4. Why is it difficult? Challenges for the KR Community
4.1 KR Languages and Reasoning
KR languages and reasoning methods are naturally at the core of the KR research
KR languages enable engineers to describe their domains formally, with clear semantics
Over the years, scientists have developed many different representation languages for the effective representation of different kinds of information
We have built and optimized reasoning engines for these languages, with effective performance on problems of moderate size or complexity
Other kinds of languages and techniques have also been developed for storing or transforming information
The key challenge today is finding the right balance between more complex formalisms and the lightweight KR
More critical is the task of integrating different formalisms and approaches to develop hybrid approaches that get the "best of all worlds"
4.1.1 Hybrid KR
Using one particular language limits us to the problems that we can effectively represent in that language
So if we want to gain some or all of the benefits of multiple languages, we must try to combine languages
However, combining two languages often results in an increase in the complexity of reasoning
Developing systematic means for combining heterogeneous KR languages and various reasoning techniques under one roof remains an open issue
Description Logic Rules combine description logics and rules but limit the scope of the rules to obtain decidable reasoning
In this way, we can obtain most of the benefits of the two (or more) languages, while still retaining the desirable features of the component languages
We need to perform this analysis for each combination of languages
We can also consider producing a loose combination
This sort of loose combination can also be used to combine several reasoners over the same language
Scientists have used these loose combinations for quite some time
However, many problems still remain, ranging from issues of resource allocation to issues related to characterizing the capabilities of the combination.
Bridging KR and Machine Learning
Bridging open-world knowledge and closed-world data
Mixing data and text simulations
4.1.2 Representing inconsistency, uncertainty, and incompleteness
A number of recent developments give rise to the growing body of knowledge bases
incomplete knowledge
inconsistent knowledge
These knowledge bases include linguistic knowledge bases
The body of knowledge created by a distributed, often uncoordinated “crowd.”
This knowledge is inevitably inconsistent and incomplete
The knowledge that we extract from the “big data” that is produced by scientists may also appear inconsistent or incomplete
One of the key challenges today is developing reasoners that will perform and scale robustly given incomplete and inconsistent knowledge
4.1.3 Challenges in reasoning
The worst-case computational complexity of complete reasoning is dismal even for representation languages of moderate expressive power
For languages of higher expressive power, reasoning is undecidable
There are many reasoning systems for various languages, including propositional logic
Improving expected performance remains a vital issue in reasoning
Robust reasoning
The current level of reasoner performance is often not adequate
We need robust scalable reasoning
There are some reasoning systems for languages of limited expressive power that approach this level of performance
Major goals
Improve the performance of these systems to rival that of the fastest storage and querying systems
Provide similar levels of performance for more expressive languages
Effective human-scale reasoning
A major difference between today's AI reasoning systems and human reasoning is that human reasoning tends to become more efficient and effective
AI systems require careful hand-crafted optimization to achieve high performance
Interesting research question
Understanding the space of reasoning tasks and architectures that handle them optimally
How to achieve the desirable properties of human reasoning in software
Human reasoning is robust, flexible, and operates over broad domains of knowledge
Some promising approaches currently being explored include
Partitioning knowledge
Parallel processing
Analogical processing
4.1.4 Lightweight KR
We use this to refer to methods and solutions that have low expressivity
The recently increased focus on lightweight KR is driven both by theoretical advances concerning such languages and by application successes
This development is in contrast to highly expressive logics that KR researchers often investigate
We need the principled development of lightweight languages, algorithms and tools from both an application perspective and a theoretical angle
We need viable pathways for bootstrapping lightweight solutions
It would be very helpful to develop knowledge modeling languages and interfaces that have a lightweight appearance
4.2 Dealing with heterogeneity of data and knowledge
One of the biggest challenges and opportunities for KR lies in integrating heterogeneous data and knowledge sources
Heterogeneity comes in many different forms
heterogeneity of knowledge models
heterogeneity of data and information artifacts
heterogeneity of data items
dynamic data and models that change over time
data acquired from rapidly proliferating sensors
Integrating diverse objects or data and knowledge sources results in a whole that is larger than the sum of its parts
We can gain valuable insights
By integrating data produced by different scientific experiments
By bringing together observations from different species
Robots integrate diverse instructions and inputs
Scientists work on different modes of integrating heterogeneous data and models, from tight coupling and integration to loose integration
Such integration remains a holy grail of KR research and still poses many challenges
A number of recent developments bring new opportunities that make us hopeful that we can make significant progress in the coming years
New incentives to share data
Crowdsourcing technology
Better tools
Increasing capabilities of KR backends
Potential is becoming acknowledged
Big data renders useful
4.2.1 Closing the Knowledge -- Data Representation Gap
The KR community has developed sophisticated languages and ontologies for representing the knowledge in diverse subjects
The amount of data that is actually represented in a KR system continues to shrink as an overall percentage of data available
The challenge and the opportunity are to bring the rich set of KR languages and ontologies to the vast amount of data
Solving the knowledge--data representation gap will lead to huge advances in our ability to exploit diverse sources of knowledge
The ability to find and reuse data is extremely limited
If all of the data within this domain were published and described with respect to shared domain ontologies, researchers could discover relevant data sources and exploit this knowledge to more effectively conduct their research
Closing this gap requires developing new methods, tools and incentives to represent the huge amount of data
The core research problems
Automatic Modeling
Data Transformation
Data Linking
Source Publication
Incentives
By bringing KR techniques and tools to the data and services, we have the opportunity to start a revolution in representing, discovering and exploiting the vast amount of data
4.2.2 Heterogeneity: The Ontology Perspective
The flexibility of KR languages makes them well-suited for describing a diverse collection of ontologies
We can use axioms to explicitly specify the relationships between terms from different ontologies
We can define the terms using common vocabularies and infer the relationships between them
The vast majority of the approaches on automated ontology alignment only produce subclass or equivalent class alignments
Real-world heterogeneity requires complex axioms to resolve, including the use of negation and disjunction among other things
It may also require concepts not typically found in KR languages, such as arithmetic to perform unit conversions or string manipulations
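Why subclass and equivalence alignments are not enough can be sketched with a mapping that needs arithmetic and string manipulation. All property names, the mapping table, and the helper functions below are hypothetical and serve only to illustrate the point; they do not correspond to any existing alignment tool.

```python
def fahrenheit_to_celsius(value):
    """Unit conversion that a pure equivalence alignment cannot express."""
    return (value - 32.0) * 5.0 / 9.0


def normalize_label(text):
    """String manipulation: trim whitespace and capitalize each word."""
    return text.strip().title()


# Hypothetical alignment: source property -> (target property, transformation).
MAPPINGS = {
    "src:tempF": ("tgt:temperatureCelsius", fahrenheit_to_celsius),
    "src:station_name": ("tgt:stationLabel", normalize_label),
}


def translate(record, mappings):
    """Rewrite a source record into the target vocabulary.
    Properties without a mapping are dropped."""
    translated = {}
    for prop, value in record.items():
        if prop in mappings:
            target_prop, transform = mappings[prop]
            translated[target_prop] = transform(value)
    return translated


source_record = {"src:tempF": 212.0, "src:station_name": "  mount washington "}
target_record = translate(source_record, MAPPINGS)
# target_record == {"tgt:temperatureCelsius": 100.0,
#                   "tgt:stationLabel": "Mount Washington"}
```

Here the transformations live outside the KR language as ordinary functions, which is one possible design; expressing them declaratively inside the alignment itself is the harder problem.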
Noise and quality become critical issues when considering multiple ontologies and data sources
Another integration question is essentially the centralized vs. distributed storage model
4.3 Knowledge capture
Significant trends in recent years offer new opportunities to advance knowledge capture
1) The availability of people to contribute significant amounts of knowledge
2) The continuously improving performance of text extraction approaches
3) The availability of data at unprecedented scale enabling the discovery of new knowledge
4) The widespread use of sensors and other cyberphysical systems that collect continuous and detailed data about dynamic phenomena
4.3.1 Social knowledge collection
There are many challenges in the social acquisition of knowledge
In current approaches, the systems are quite passive and the contributors largely manage contents and extensions to the knowledge base
We need further research to enable the knowledge collection framework to take a more active role in guiding the acquisition process
We will need advances in meta-reasoning architectures
To assess missing knowledge
To estimate confidence on what is known
To design strategies to seek new knowledge
We foresee that knowledge repositories are likely to be interconnected and draw from knowledge that has been collected from different groups of contributors
The provenance of knowledge sources will be crucial to propagate updates throughout the knowledge bases
4.3.2 Acquiring Knowledge from people
We need intelligent systems that can acquire knowledge from people
Acquiring knowledge directly from people will always be a necessary skill for intelligent systems
4.3.3 Capturing knowledge from text
Extracting relation facts from text has a long history where researchers have used methods based on manually encoded patterns, machine learned patterns and combinations of both
However, these methods require that we fix the relations a priori
We need novel methods that can extract arbitrary relations
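The limitation of pattern-based extraction can be sketched with two manually encoded patterns. The relation names, the regular expressions, and the sample text are hypothetical; real systems use far richer linguistic patterns, but the constraint is the same: only relations fixed in advance can be found.

```python
import re

# A capitalized name of one or more words, e.g. "Marie Curie".
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"

# Hypothetical, manually encoded patterns: one per relation fixed a priori.
PATTERNS = {
    "born_in": re.compile(rf"(?P<subject>{NAME}) was born in (?P<object>{NAME})"),
    "capital_of": re.compile(rf"(?P<subject>{NAME}) is the capital of (?P<object>{NAME})"),
}


def extract_relations(text):
    """Return (subject, relation, object) triples for every pattern match."""
    triples = []
    for relation, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            triples.append((match.group("subject"), relation, match.group("object")))
    return triples


text = ("Marie Curie was born in Warsaw. Warsaw is the capital of Poland. "
        "Curie discovered polonium.")
triples = extract_relations(text)
# triples == [("Marie Curie", "born_in", "Warsaw"),
#             ("Warsaw", "capital_of", "Poland")]
```

The third sentence states a relation ("discovered") for which no pattern exists, so it is silently missed; extracting such arbitrary relations is the open problem noted above.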
A bigger challenge in capturing knowledge from text
To go beyond extraction of relational facts
To obtain more general information
Addressing this challenge will essentially involve translating text to a knowledge representation formalism
Such translation is necessary in many applications
Making advances in capturing knowledge from text will require developing KR formalisms that are particularly well suited for knowledge extraction from text
The choice of a particular formalism may depend on the type of text we are processing, and we need to develop a general methodology for finding the appropriate formalism
4.3.4 Building large commonsense knowledge bases
One of the important lessons from AI and cognitive science research is that human commonsense reasoning rests on a vast accumulation of knowledge
This knowledge ranges from high-level abstractions to concrete, everyday facts
Our broad base of experience enables us to quickly ascertain when things do and don’t make sense
Endowing software with these same reasoning abilities is important for
Overcoming brittleness
Making them more autonomous
Facilitating trust in their operations
Key challenges for the community
Kinds of knowledge needed
The point of common sense knowledge is that it can be used for many tasks
Some kinds of questions can be answered directly
Everyday knowledge provides background needed to describe situations and frame problems for professional reasoning
Experience to date indicates that a wide range of knowledge is needed
However, we are still working to understand the representation and reasoning requirements for different tasks
More experimentation with large-scale systems that integrate rich knowledge resources, high-performance reasoning and learning at scale is needed
Maintaining human-scale knowledge bases
No real-world process of constructing large-scale artifacts is perfect, and errors are inevitable
For human-scale knowledge bases, the software itself must become an active curator of its knowledge
This includes
monitoring its own performance
identifying problems and gaps
taking proactive steps to repair and improve its knowledge and reasoning abilities
Understanding how to put most of the burden of maintenance onto software itself, albeit with human oversight for trust, is an important question
Building human-scale large knowledge bases
We have learned much from efforts to build large knowledge bases by hand
Those efforts have provided useful resources for the research community
Building beyond where we are now requires continuing and expanding the movement to automatic and semi-automatic learning already underway
4.3.5 Knowledge discovery from big data
The approaches range from explanation-based learning, based on applying knowledge to describe examples, to pattern extraction from large amounts of data
Automated techniques to extract knowledge from data have always been valuable but become crucial when dealing with large and complex data
Automated algorithms can discover new patterns, but those patterns must be related to current scientific knowledge and models
4.4 Making KR accessible to non-experts
KR brings huge benefits to scientists and practitioners in many fields
The barrier to KR is very high
Today, it is impossible for someone who is not familiar with KR to build an ontology and use it
Enabling non-experts to use KR tools is a two-fold challenge
Another critical challenge is visualizing and exploring the massive quantities of data that are becoming available
4.4.1 KR in the afternoon
Recent work has seen the development of standards for knowledge representation languages, in particular web-based representation
This development has been accompanied by the development of an ecosystem of tools for creating and manipulating representations
While these tools exist, there is a lack of introductory materials that would introduce novice users to the potential benefits of using such representations
If we consider the analogies of text processing or ML, tools often come with simple applications that allow a user to explore the technologies
Such packaging tends to be absent with KR tools
4.4.2 Visualization and data exploration
A challenge for KR in the 21st century is enabling ordinary users to investigate the data
One of the promises of KR is to integrate diverse data from different domains and allow serendipitous discoveries
For the KR research community, there is a question of how to evaluate visualization contributions
KR experts are often not familiar with the evaluation approaches generally accepted by the user interface community
Such experiments can be more costly and difficult to set up than running a system against a benchmark