Research Challenges and Opportunities in Knowledge Representation - Coggle…

- - - - mechanisms for formulating and processing complex queries over heterogeneous sources
      - common ontologies to share these descriptions
      - representation formalisms to describe the data
      - methods to overcome heterogeneity and variety of data
      - reasoning over the increasing volumes of data
      - formalisms to describe provenance of the data and its context
    - - Data sources range from industrial reports to a wealth of on-the-ground measurements representing sensors, systematic monitoring efforts
      - The semantic challenges here are clear and prevalent
        
        The semantic challenge
        of assisting with the discovery and integration of highly heterogeneous data
        
        Semantic technologies can help tame terminological
        idiosyncrasies that currently abound within earth science domains
        
        KR techniques can motivate the development and use of standardized terminologies for the Earth Sciences, with the advantage of helping to unify
        and disambiguate semantic intention
      - KR techniques can greatly enhance the comparability and re-use of analyses, models,
        and workflows
      - By deploying best practices in ontology construction, KR techniques can enable far more than simple terminological harmonization
      - The logical expressivity of modern KR languages enables the expression of detailed provenance information and other metadata
      - Finally, by constructing community-based, cross-disciplinary ontologies, KR methods can improve the prospects for transdisciplinary communication
      - The KR&R community can assist the earth sciences
        
        by helping the field to better organize and adapt its data and modeling resources
        
        its communication of results, to a digital, networked information
        environment
      - KR solutions will be prime enablers for Grand Challenge questions in the Earth Sciences
    - - A major aspect of modern scientific data management to create useful data for query answering and analysis lies
      - The most recognized use of ontology in biomedical research is enrichment analysis
      - The goal of enrichment analysis is to find attributes that are significantly over-represented in a target set relative to a background set whose members also carry those attributes
      - A major outstanding challenge lies in being able to
        
        reconcile these
        associations with prior knowledge
        
        establishing the degree to which we are confident about any assertion found in a web of data in which broad scientific
    - - Continuous and large-scale analysis of data will lead to new insights in many areas.
      - We will need KR in order
        to
        
        aggregate information
        
        automatically match patients to each other and to clinical trials
  - - - fueled by substantial knowledge bases and commonsense knowledge
      - could revolutionize education
  - - - The control of robotic agents has been a key motivating topic since the early days of KR
      - In AI-based robot control, plans are sequences of actions to achieve a given goal
      - There are three critical areas:
        
        Methods for representing knowledge
        
        Methods for updating the robot’s internal knowledge representation based on percepts and actions
        
        Methods for planning, execution, and execution monitoring
      - Reasoning about actions
        
        we have to investigate and develop knowledge processing methods
        
        The decisions are context- and task-dependent
        
        that, given a vague task, are capable of inferring the information needed to perform the appropriate action
      - Uncertainty
        
        There are three major challenges for operating in a domain
        
        For example: a household robot operates:
        
        the dimensionality of which is unbounded
        
        a mixed continuous-discrete state space
        
        about the effects of actions
    - - There are many applications
        
        require sophisticated understanding of such data
        
        can be usefully augmented with it, from surveillance, to mobile assistance, to environmental monitoring
      - The use of qualitative spatial representations
        
        provides some relief from both
        the volume and noisiness of such data
        
        enables integration of different kinds
        of spatial knowledge
      - Understanding visual data has been a challenge for the Computer Vision community for decades
      - Much progress has been made in methods which attempt to understand such data using continuous/numerical techniques
      - The challenge is made that
        much harder by the sheer volume of data
      - The sheer volume of Big Data also
        provides mitigation for the problem since there is often redundancy
      - Another form of mitigation can come in the form of background knowledge,
        
        to help understand missing data
        
        correct noisy
        data
        
        to help integrate and fuse conflicting data
      - Another opportunity in this area is to combine data acquired from sensors with language data
  - - - Building natural language understanding systems involves translating text to a KR formalism
      - augmenting it with various kinds of knowledge that include commonsense knowledge, domain knowledge and linguistic knowledge
      - reasoning with all of them to come up with components of answers, which then need to be glued together to form answers
  - - - Reasoning about actions and objects in the outside world
      - Hierarchical inference
      - Inferring new facts from explicitly asserted data and knowledge
      - Query expansion and query answering from heterogeneous data sources
    - - Explicit and unambiguous domain descriptions for knowledge sharing
      - Reuse and comparability of models, analyses, and interpretations
      - Domain models for natural-language understanding
      - Ontology-based data access for heterogeneous data sources
    - - Formal representation of both domain knowledge and students in education systems
      - Use of knowledge representation in machine learning
      - Understanding text and extracting explicit knowledge from it
      - KR as "lingua franca" for diverse knowledge resources
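The inference services listed above (hierarchical inference, inferring new facts from explicitly asserted data, and query expansion) can be illustrated with a minimal Python sketch. The class names, sample identifiers, and helper functions below are all hypothetical and chosen only for illustration; a real system would typically express the hierarchy in an ontology language and delegate this work to a dedicated reasoner.

```python
# Hypothetical toy class hierarchy: class -> set of direct superclasses.
SUBCLASS_OF = {
    "Basalt": {"IgneousRock"},
    "Granite": {"IgneousRock"},
    "IgneousRock": {"Rock"},
    "Sandstone": {"SedimentaryRock"},
    "SedimentaryRock": {"Rock"},
}

# Explicitly asserted data (hypothetical): instance -> asserted class.
ASSERTED_TYPES = {
    "sample_17": "Basalt",
    "sample_42": "Sandstone",
    "sample_99": "Granite",
}


def ancestors(cls, subclass_of):
    """Return every superclass reachable from cls (transitive closure)."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(asserted, subclass_of):
    """Hierarchical inference: an instance of a class also belongs to all superclasses."""
    return {
        instance: {cls} | ancestors(cls, subclass_of)
        for instance, cls in asserted.items()
    }


def expand_query(cls, subclass_of):
    """Query expansion: rewrite a class query into the class plus all of its subclasses."""
    return {cls} | {c for c in subclass_of if cls in ancestors(c, subclass_of)}


def answer(cls, asserted, subclass_of):
    """Answer a class query against the asserted data using the expanded query."""
    wanted = expand_query(cls, subclass_of)
    return sorted(inst for inst, c in asserted.items() if c in wanted)


if __name__ == "__main__":
    print(inferred_types(ASSERTED_TYPES, SUBCLASS_OF)["sample_17"])
    print(answer("IgneousRock", ASSERTED_TYPES, SUBCLASS_OF))
```

The sketch shows two equivalent routes to the same answers: materializing inferred facts up front (`inferred_types`) or leaving the data untouched and expanding the query instead (`expand_query`). Which route is preferable depends on data volume and how often the hierarchy changes.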
- - - - Using one particular language limits us to the problems that we can effectively represent in that language
      - So if we want to gain some or all of the benefits of multiple languages, we must try to combine languages
      - However, combining two languages often results in an increase in the complexity of reasoning
      - Developing systematic means for combining heterogeneous KR languages and various reasoning techniques under one roof remains an open issue
      - Description Logic Rules combine description logics and rules but limit the scope of the rules to obtain decidable reasoning
      - In this way, we can obtain most of the benefits of the two (or more) languages, while still retaining the desirable features of the component languages
      - We need to perform this analysis for each combination of languages
      - We can also consider producing a loose combination
      - This sort of loose combination can also be used to combine several reasoners over the same language
      - Scientists have used these loose combinations for quite some time
      - However, many problems remain, ranging from resource allocation to characterizing the capabilities of the combination.
      - Bridging KR and Machine Learning
      - Bridging open-world knowledge and closed-world data
      - Mixing data and text simulations
    - - A number of recent developments give rise to the growing body of knowledge bases
        
        incomplete knowledge
        
        inconsistent knowledge
      - These knowledge bases include linguistic knowledge bases
      - The body of knowledge created by a distributed, often uncoordinated “crowd.”
      - This knowledge is inevitably inconsistent and incomplete
      - The knowledge that we extract from the “big data” that is produced by scientists may also appear inconsistent or incomplete
      - One of the key challenges today is developing reasoners that will perform and scale robustly given incomplete and inconsistent knowledge
    - - The worst-case computational complexity of complete reasoning is dismal for even representation languages of moderate expressive power
      - For languages of higher expressive power, reasoning is undecidable
      - There are many reasoning systems for various languages, including propositional logic
      - Improving expected performance remains a vital issue in reasoning
      - Robust reasoning
        
        The current level of reasoner performance is often not adequate
        
        We need robust scalable reasoning
        
        There are some reasoning systems for languages of limited expressive power that approach this level of performance
        
        Major goals
        
        Improve the performance of these
        systems to rival that of the fastest storage and querying systems
        
        Provide similar levels of performance for more expressive languages
      - Effective human-scale reasoning
        
        A major difference between today's AI reasoning systems and human reasoning is that human reasoning tends to become more efficient and effective
        
        AI systems require careful hand-crafted optimization to achieve high performance
        
        Interesting research question
        
        Understanding the space of reasoning tasks and architectures that handle them optimally
        
        How to achieve the desirable properties of human reasoning in software
        
        Human reasoning is robust, flexible, and operates over broad domains of
        knowledge
        
        Some promising approaches currently being explored include
        
        Partitioning knowledge
        
        Parallel processing
        
        Analogical processing
    - - We use this to refer to methods and solutions that have low expressivity
      - The recently increased focus on lightweight KR is driven both by theoretical advances concerning such languages and by application successes
      - This development is in contrast to highly expressive logics that KR researchers often investigate
      - We need the principled development of lightweight languages, algorithms and tools from both an application perspective and a theoretical angle
      - We need viable pathways for bootstrapping light-weight solutions
      - It would be very helpful to develop knowledge modeling languages and interfaces that have a light-weight appearance
  - - - heterogeneity of knowledge models
      - heterogeneity of data and information artifacts
      - heterogeneity of data items
      - dynamic data and models that change over time
      - data acquired from rapidly proliferating sensors
    - - By integrating data produced by different scientific experiments
      - By bringing together observations from different species
    - - New incentives to share data
      - Crowdsourcing technology
      - Better tools
      - Increasing capabilities of KR backends
      - Potential is becoming acknowledged
      - Big data renders useful
    - - The KR community has developed sophisticated languages and ontologies for representing the knowledge in diverse subjects
      - The amount of data that is actually represented in a KR system continues to shrink as an overall percentage of data available
      - The challenge and the opportunity are to bring the rich set of KR languages and ontologies to the vast amount of data
      - Solving the knowledge-data representation gap will lead to huge advances in our ability to exploit diverse sources of knowledge
      - The ability to find and reuse data is extremely limited
      - If all of the data within this domain were published and described with respect to shared domain ontologies, researchers could discover relevant data sources and exploit this knowledge to conduct their research more effectively
      - Closing this gap requires developing new methods, tools and incentives to represent the huge amount of data
      - The core research problems
        
        Automatic Modeling
        
        Data Transformation
        
        Data Linking
        
        Source Publication
        
        Incentives
      - By bringing KR techniques and tools to the data and services, we have the opportunity to start a revolution in representing, discovering and exploiting the vast amount of data
    - - The flexibility of KR languages makes them well-suited for describing a diverse collection of ontologies
      - We can use axioms to explicitly specify the relationships between terms from different ontologies
      - We can define the terms using common vocabularies and infer the relationships between them
      - The vast majority of the approaches on automated ontology alignment only produce subclass or equivalent class alignments
      - Resolving real-world heterogeneity requires complex axioms that use negation and disjunction, among other things
      - Alignment may also require concepts not typically found in KR languages, such as arithmetic for unit conversions or string manipulation
      - Noise and quality become critical issues when considering multiple ontologies and data sources
      - Another integration question is essentially the centralized vs. distributed storage model
  - - - 1) The availability of people to contribute significant amounts of knowledge
      - 2) The continuously improving performance of text extraction approaches
      - 3) The availability of data at unprecedented scale enabling the discovery of new knowledge
      - 4) The widespread use of sensors and other cyberphysical systems that collect continuous and detailed data about dynamic phenomena
    - - There are many challenges in the social acquisition of knowledge
      - In current approaches, the systems are quite passive and the contributors largely manage contents and extensions to the knowledge base
      - We need further research to enable the knowledge collection framework to take a more active role in guiding the acquisition process
      - We will need advances in meta-reasoning architectures
        
        To assess missing knowledge
        
        To estimate confidence on what is known
        
        To design strategies to seek new knowledge
      - We foresee that knowledge repositories are likely to be interconnected and to draw on knowledge that has been collected from different groups of contributors
      - The provenance of knowledge sources will be crucial to propagate updates throughout the knowledge bases
    - - We need intelligent systems that can acquire knowledge from people
      - Acquiring knowledge directly from people will always be a necessary skill for intelligent systems
    - - Extracting relation facts from text has a long history where researchers have used methods based on manually encoded patterns, machine learned patterns and combinations of both
      - However, these methods require that we fix the relations a priori
      - We need novel methods that can extract arbitrary relations
      - A bigger challenge in capturing knowledge from text
        
        To go beyond extraction of relational facts
        
        To obtain more general information
      - Addressing this challenge will essentially involve translating text to a knowledge representation formalism
      - Such translation is necessary in many applications
      - Making advances in capturing knowledge from text will require developing KR formalisms that are particularly well suited for knowledge extraction from text
      - The choice of a particular formalism may depend on the type of text we are processing
      - We need to develop a general methodology for finding an appropriate formalism
    - - One of the important lessons from AI and cognitive science research is that human commonsense reasoning rests on a vast accumulation of knowledge
      - This knowledge ranges from high-level abstractions to concrete, everyday facts
      - Our broad base of experience enables us to quickly ascertain when things do and don’t make sense
      - Endowing software with these same reasoning abilities is important for
        
        Overcoming brittleness
        
        Making them more autonomous
        
        Facilitating trust in their operations
        
        Key challenges for the community
        
        Kinds of knowledge needed
        
        The point of common sense knowledge is that it can be used for many tasks
        
        Some kinds of questions can be answered directly
        
        Everyday knowledge provides background needed to describe situations and frame problems for professional reasoning
        
        Experience to date indicates that a wide range of knowledge is needed
        
        However, we are still working to understand the representation and reasoning requirements for different tasks
        
        More experimentation with large-scale systems that integrate rich knowledge resources, high-performance reasoning and learning at scale is needed
        
        Maintaining human-scale knowledge bases
        
        No real-world process of constructing large-scale artifacts is perfect, and errors are inevitable
        
        For human-scale knowledge bases, the software itself must become an active curator of its knowledge
        
        This includes
        
        monitoring its own performance
        
        identifying problems and gaps
        
        taking proactive steps to repair and improve its knowledge
        
        reasoning abilities
        
        Understanding how to put most of the burden of maintenance onto software itself, albeit with human oversight for trust, is an important question
        
        Building human-scale large knowledge bases
        
        We have learned much from efforts to build large knowledge bases by hand
        
        Those efforts have provided useful resources for the research community
        
        Building beyond where we are now requires continuing and expanding the movement to automatic and semi-automatic learning already underway
    - - The approaches range from explanation-based learning based on applying knowledge
        
        To describe examples
        
        To pattern extraction from large amounts of data
      - Automated techniques to extract knowledge from data have always been valuable but become crucial when dealing with large and complex data
      - Automated algorithms can discover new patterns, but those patterns must be related to current scientific knowledge and models
  - - - Recent work has seen the development of standards for knowledge representation
        languages, in particular web-based representation
      - This development has been accompanied by the development of an ecosystem of tools for creating and manipulating representations
      - While these tools exist, there is a lack of introductory materials that would
        introduce novice users to the potential benefits of using such representations
      - If we consider analogies of text processing or ML, tools often come with simple applications that allow a user to explore the technologies
      - Such packaging tends to be absent with KR tools
    - - A challenge for KR in the 21st century is enabling ordinary users to investigate the data
      - One of the promises of KR is to integrate diverse data from different domains and allow serendipitous discoveries
      - For the KR research community, there is a question of how to evaluate visualization contributions
      - KR experts are often not familiar with the evaluation approaches generally accepted by the user interface community
      - Such experiments can be more costly and difficult to set up than running a system against a benchmark
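The ontology-integration branch of this map notes that aligning terms across sources may need more than subclass or equivalence links, for example arithmetic for unit conversions. The following Python sketch illustrates that idea under stated assumptions: the source names, field names, shared term, and the `align` helper are all hypothetical, and the mapping table stands in for what would normally be bridging axioms in a KR language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Links a source-specific field to a shared term and records its unit."""
    shared_term: str
    source_unit: str


# Hypothetical bridging table: (source, field) -> shared vocabulary term and unit.
MAPPINGS = {
    ("station_a", "temp_f"): Mapping("air_temperature", "degF"),
    ("station_b", "air_temp_c"): Mapping("air_temperature", "degC"),
}

# Arithmetic that a pure subclass/equivalence alignment cannot express.
TO_CELSIUS = {
    "degC": lambda value: value,
    "degF": lambda value: (value - 32.0) * 5.0 / 9.0,
}


def align(source, field, value):
    """Translate one source-specific reading into the shared term, in degrees Celsius."""
    mapping = MAPPINGS[(source, field)]
    converted = TO_CELSIUS[mapping.source_unit](value)
    return mapping.shared_term, round(converted, 2)


if __name__ == "__main__":
    print(align("station_a", "temp_f", 212.0))
    print(align("station_b", "air_temp_c", 21.5))
```

Once both readings are expressed with the same term and unit, they can be compared or queried together, which is the kind of integration the alignment bullets describe.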