Brain Simulator III

Verbal System

March 13, 2024

This is a proposed design and is subject to change. Please review this description and add comments or questions below. This page describes the verbal system for the Brain Simulator III: first the reasoning behind it, then a possible implementation.

The point of the verbal system is to make intelligent use of verbal input and create intelligent-sounding verbal output. Initially, “verbal” will mean text, because most available speech systems convert spoken input to text and generate spoken output from text.

Ambiguity

A key tenet of the Brain Simulator is that UKS Things have specific, unambiguous meanings. Many knowledge graph systems have attempted to use a language-based method to identify nodes and then run into difficulty because all languages have words which are ambiguous and phrases and usages which are idiomatic. Attempts to add on disambiguation are doomed to fail because at a fundamental level, all words have multiple meanings.  Imagine that basketball refers to a single unambiguous object. Then, one can ask, “How do you spell basketball?” in which case the word basketball does NOT refer to that object but refers to a word. Likewise, “ball rhymes with fall” has nothing to do with either balls or falling but refers to the words alone.

Accordingly, this description assumes that all words have multiple meanings, and that every unambiguous meaning can be described in multiple ways using different words. Some words or meanings may never need this flexibility, but it is the default assumption. Therefore, there is a many-to-many mapping between words and the conceptual Things they reference.
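To make the many-to-many mapping concrete, here is a minimal sketch in Python. The class name Lexicon and the Thing labels (BallObject, BallWord) are hypothetical and chosen only for illustration; the actual UKS stores these links as Relationships between Things.

```python
from collections import defaultdict


class Lexicon:
    """Hypothetical two-way index between word labels and Thing labels."""

    def __init__(self):
        self.word_to_things = defaultdict(set)
        self.thing_to_words = defaultdict(set)

    def link(self, word, thing):
        # Record the link in both directions so either side can be looked up.
        self.word_to_things[word].add(thing)
        self.thing_to_words[thing].add(word)


lexicon = Lexicon()
lexicon.link("ball", "BallObject")    # the physical object
lexicon.link("ball", "BallWord")      # the word itself, as in "How do you spell ball?"
lexicon.link("sphere", "BallObject")  # a second word for the same meaning

meanings = sorted(lexicon.word_to_things["ball"])
words = sorted(lexicon.thing_to_words["BallObject"])
```

One word reaches two meanings, and one meaning is reachable from two words, which is the default case described above.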

Words and Phrases

Many meanings are associated with phrases rather than individual words. In “Mary goes to the store”, the phrase “to the YYY” implies that YYY is a (current) destination. To accomplish this, a system must handle phrases and individual words more or less interchangeably, and phrases must be built from some combination of specific words and placeholders.
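A rough sketch of matching a phrase made of specific words and placeholders follows. The function name match_phrase and the convention that an all-uppercase token (such as YYY) is a placeholder are assumptions made for this example, not part of the design.

```python
def match_phrase(pattern, tokens):
    """Return placeholder bindings if pattern occurs somewhere in tokens, else None."""
    n = len(pattern)
    for start in range(len(tokens) - n + 1):
        bindings = {}
        for slot, token in zip(pattern, tokens[start:start + n]):
            if slot.isupper():
                # Placeholder: accept any token and remember what filled it.
                bindings[slot] = token
            elif slot != token:
                # Specific word did not match; try the next start position.
                break
        else:
            return bindings
    return None


tokens = "mary goes to the store".split()
result = match_phrase(["to", "the", "YYY"], tokens)
no_match = match_phrase(["to", "the", "YYY"], "mary sings".split())
```

Here result binds YYY to "store", which a later step could mark as the current destination.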

Parts of Speech

When a child learns a language, the nuances of grammar develop after specific words and meanings. While grammar is a primary focus of NLP systems, and some experimentation has been done with NLP in the Brain Simulator, this iteration will ignore grammar and rely on the underlying meanings to drive the syntax. In the cases of “Mary can play piano” and “Suzie went to a play”, the meaning of play can be understood from the UKS Things it is likely to refer to: in the former case, a Thing which descends from Action, and in the latter, a Thing which may descend from location or entertainment.

Multiple Languages

A fundamental idea of the Brain Simulator is the decoupling of meaning from language. If a person is multilingual, they can view an image and describe it in any one of the languages they know. The key is that the person knows which language to respond in, likely from the context of a conversation, which requires that they can quickly identify which language is being spoken. Handling multiple languages is not a high priority for this development, but the design must be able to accommodate them. This can be implemented with parallel trees of words and phrases in various languages.

Spelling vs Pronunciation

In English, words can sound identical but be spelled differently and vice versa. As the computer can rapidly convert from text to phonemes, internal storage of words may be handled in phonemes.  This may prove useful when considering rhyming, puns, and other features of children’s common sense which could be lost if only words with “correct” spellings are used.

 

Implementation

UKS Content

The UKS is preset with a list of phonemes and letters which descend from the Thing sense. The details are not yet defined, but phonemes will have attributes which allow them to be recognized when heard and subsequently spoken, and letters will have attributes which allow them to be recognized when seen and output. Both phonemes and letters are generic Things which can subsequently be referenced by words.

New words are added with a parent of EnglishWord and a label of ewXXXX where XXXX is the word. Words have an ordered list of Relationships referencing the phonemes representing their pronunciation and letters representing their spelling.
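The word structure described above might look something like the following sketch. The Thing class here is a stand-in, not the real UKS class; the relationship type names hasPhoneme and hasLetter are hypothetical; the phoneme symbols are ARPAbet-style placeholders; and capitalizing the word in the label is an assumption inferred from labels such as ewTo used later on this page.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Thing:
    label: str
    parent: Optional["Thing"] = None
    # Ordered (relationship type, target Thing) pairs; list order is the sequence.
    relationships: list = field(default_factory=list)


sense = Thing("Sense")
phoneme = Thing("Phoneme", sense)
letter = Thing("Letter", sense)
english_word = Thing("EnglishWord")

phonemes, letters = {}, {}


def get_or_create(pool, label, parent):
    # Phonemes and letters are generic Things shared by every word that uses them.
    if label not in pool:
        pool[label] = Thing(label, parent)
    return pool[label]


def add_word(spelling, pronunciation):
    word = Thing("ew" + spelling.capitalize(), english_word)
    for symbol in pronunciation:
        word.relationships.append(("hasPhoneme", get_or_create(phonemes, symbol, phoneme)))
    for ch in spelling:
        word.relationships.append(("hasLetter", get_or_create(letters, ch, letter)))
    return word


ball = add_word("ball", ["B", "AO", "L"])
spelled = "".join(t.label for kind, t in ball.relationships if kind == "hasLetter")
```

Because the relationships are ordered, the spelling can be read back from the word Thing, and both occurrences of the letter l point at the same letter Thing.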

Open questions: How do we handle words with multiple spellings or multiple pronunciations? Can we build a system of spelling likelihoods of phonemes so when hearing a word, one can assume a likely spelling?

When words are received, they are matched against words already in the UKS (in a given language) using a closest-match algorithm.
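The closest-match algorithm is not specified here. As one possible stand-in, Python's standard difflib module can pick the most similar known word by character similarity. The word list, the function name closest_word, and the 0.8 cutoff are assumptions for this sketch, and it compares spellings rather than phonemes.

```python
import difflib

# Hypothetical vocabulary already stored in the UKS for one language.
known_words = ["basketball", "ball", "fall", "store", "beach"]


def closest_word(received, vocabulary, cutoff=0.8):
    """Return the most similar known word, or None if nothing is close enough."""
    matches = difflib.get_close_matches(received, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


match = closest_word("baskitball", known_words)
unknown = closest_word("xylophone", known_words)
```

A received word with no sufficiently close match (unknown is None here) would be a candidate for adding as a new word.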

All words have a use count, and an agent will periodically delete words with a low use count. This should eliminate entries created by misspellings and mispronunciations.
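A minimal sketch of use counts and the pruning agent is given here. The class name Vocabulary and the threshold of two uses are assumptions; the real agent would operate on UKS Things and might also need to protect recently added words.

```python
from collections import Counter


class Vocabulary:
    """Hypothetical container tracking how often each word has been used."""

    def __init__(self):
        self.use_counts = Counter()

    def record(self, word):
        self.use_counts[word] += 1

    def prune(self, min_uses=2):
        """Delete words used fewer than min_uses times and return the deleted words."""
        rare = [w for w, count in self.use_counts.items() if count < min_uses]
        for w in rare:
            del self.use_counts[w]
        return rare


vocab = Vocabulary()
for w in ["store", "store", "stoer", "beach", "beach", "beach"]:
    vocab.record(w)
removed = vocab.prune()
```

The one-off misspelling "stoer" is removed while the frequently used words remain.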

Sequences of words are continuously matched against a list of EnglishPhrase entries in the same manner as individual words. An agent handles merging of phrases. Phrases are stored and matched with descendancy, so the phrases “to the store” and “to the beach” could be replaced by the single phrase “ewTo ewThe location”, assuming that beach and store descend from the common ancestor location.
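The merge step could be sketched roughly as follows. The parents table is a hypothetical stand-in for the UKS hierarchy, and the function names are invented for this example. A real agent would also need a rule to stop generalizing before it reaches an ancestor too broad to be useful.

```python
# Hypothetical fragment of the UKS hierarchy: child label -> parent label.
parents = {"store": "location", "beach": "location", "location": "Thing"}


def ancestors(label):
    """Return the label followed by its chain of ancestors."""
    chain = [label]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain


def common_ancestor(a, b):
    b_chain = set(ancestors(b))
    for candidate in ancestors(a):
        if candidate in b_chain:
            return candidate
    return None


def merge_phrases(p1, p2):
    """Merge two equal-length phrases, generalizing differing slots to a common ancestor."""
    if len(p1) != len(p2):
        return None
    merged = []
    for a, b in zip(p1, p2):
        if a == b:
            merged.append(a)
            continue
        shared = common_ancestor(a, b)
        if shared is None:
            return None  # the differing slots share no ancestor, so do not merge
        merged.append(shared)
    return merged


merged = merge_phrases(["ewTo", "ewThe", "store"], ["ewTo", "ewThe", "beach"])
unmerged = merge_phrases(["ewTo", "ewThe", "store"], ["ewTo", "ewA", "beach"])
```

The first call yields the generalized phrase "ewTo ewThe location"; the second returns None because ewThe and ewA have no common ancestor in this toy hierarchy.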

Word and phrase Things have Relationships (with type “means”) to abstract Things they may represent. These have weights which adjust over time.
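How the weights adjust is not yet specified. Purely as an illustration, the sketch below moves the weight of a confirmed meaning toward 1 and lets the competing meanings of the same word decay toward 0. The update rule, the rate of 0.2, and the Thing labels PlayAction and PlayPerformance are all assumptions.

```python
def reinforce(weights, word, confirmed_thing, rate=0.2):
    """Nudge the confirmed meaning toward 1.0 and the other meanings toward 0.0."""
    for thing in weights[word]:
        target = 1.0 if thing == confirmed_thing else 0.0
        weights[word][thing] += rate * (target - weights[word][thing])


def best_meaning(weights, word):
    """Return the Thing with the highest 'means' weight for this word."""
    return max(weights[word], key=weights[word].get)


# word label -> {Thing label: weight of the "means" Relationship}
weights = {"ewPlay": {"PlayAction": 0.5, "PlayPerformance": 0.5}}
reinforce(weights, "ewPlay", "PlayPerformance")
chosen = best_meaning(weights, "ewPlay")
```

After a single confirmation the weights become roughly 0.4 and 0.6, so PlayPerformance is now the preferred meaning of ewPlay.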

Initially, a hard-coded mechanism would handle “is-a” and “is” Relationships so that simple phrases like “Fido is a dog” and “Fido is brown” could be handled properly.

The system can handle declarations and queries. Simple queries would be similar to declarations with a hard-coded query word in the subject.
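A toy version of the hard-coded declaration and query handling might look like this. The class name TinyUKS, the choice of "what" as the query word, and the reply strings are assumptions; facts are stored as simple tuples rather than real UKS Relationships.

```python
QUERY_WORD = "what"  # hypothetical hard-coded query word


class TinyUKS:
    """Stand-in store of (subject, relationship type, object) facts."""

    def __init__(self):
        self.facts = set()

    def handle(self, sentence):
        words = sentence.lower().rstrip(".?").split()
        if len(words) == 4 and words[1:3] == ["is", "a"]:
            subject, rel, obj = words[0], "is-a", words[3]
        elif len(words) == 3 and words[1] == "is":
            subject, rel, obj = words[0], "is", words[2]
        else:
            return "not understood"
        if subject == QUERY_WORD:
            # A query has the same shape as a declaration, with the query word as subject.
            found = sorted(s for s, r, o in self.facts if r == rel and o == obj)
            return ", ".join(found) if found else "unknown"
        self.facts.add((subject, rel, obj))
        return "ok"


uks = TinyUKS()
replies = [
    uks.handle("Fido is a dog"),
    uks.handle("Fido is brown"),
    uks.handle("What is brown?"),
    uks.handle("What is a dog?"),
    uks.handle("What is a cat?"),
]
```

The two declarations are stored, the first two queries both return "fido", and the last returns "unknown".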

Dialog Box

Conceptually, the dialog box consists of two textboxes: input and output. The user can enter any text into the input and get an output. Practically, to allow long, repeatable sequences of inputs and outputs, the input textbox can be file-driven and the file can be edited on the fly. Furthermore, a desired output can be put in parentheses in the input. This will be matched against the actual output for repeatable testing and may also be useful for training.
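The file-driven testing idea could be sketched as follows. The line format (expected output in parentheses at the end of the input line) is one possible reading of the description, and the names parse_line, run_script, and the stub respond function are hypothetical.

```python
import re

# An input line optionally ends with the desired output in parentheses.
EXPECTED = re.compile(r"^(.*?)\s*\(([^()]*)\)\s*$")


def parse_line(line):
    """Split a line into (input text, expected output or None)."""
    m = EXPECTED.match(line)
    if m:
        return m.group(1).strip(), m.group(2)
    return line.strip(), None


def run_script(lines, respond):
    """Feed each non-blank line to respond and compare against any expected output."""
    results = []
    for line in lines:
        if not line.strip():
            continue
        text, expected = parse_line(line)
        actual = respond(text)
        passed = None if expected is None else actual == expected
        results.append((text, actual, passed))
    return results


def respond(text):
    # Stub standing in for the verbal system.
    return "fido" if text.startswith("What") else "ok"


script = ["Fido is brown", "What is brown? (fido)", "What is brown? (rex)"]
results = run_script(script, respond)
```

Lines without a parenthesized expectation are simply executed (passed is None), while the others report True or False, which gives a repeatable regression check.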
