Comparing Languages to Find Underlying Meaning


To quantify the problem, the researchers chose a few basic concepts that we see in nature (sun, moon, mountain, fire, and so on). Each concept was translated from English into 81 diverse languages, then back into English. Based on these translations, a weighted network was created. The structure of the network was used to compare languages’ ways of partitioning concepts.


The team found that the translated concepts consistently formed three theme clusters in a network, densely connected within themselves and weakly to one another: water, solid natural materials, and earth and sky.
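
The network idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the researchers' actual pipeline: the toy data, the function names `build_network` and `clusters`, and the `min_weight` threshold are all hypothetical, and thresholded connected components stand in for whatever clustering method the study really used. Here an edge between two English concepts is weighted by the number of languages in which a single word translates back to both.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data (not from the study): for each language, each word
# maps to the set of English concepts it translates back to.
BACK_TRANSLATIONS = {
    "lang_a": {"w1": {"sun", "day"}, "w2": {"water", "river"},
               "w3": {"stone", "mountain"}},
    "lang_b": {"w1": {"sun", "day"}, "w2": {"water", "river", "sea"},
               "w3": {"moon", "month"}},
    "lang_c": {"w1": {"water", "sea"}, "w2": {"stone", "mountain"},
               "w3": {"sun", "sea"}},
}


def build_network(data):
    """Weight each concept pair by the number of languages linking it."""
    weights = defaultdict(int)
    for words in data.values():
        linked = set()  # pairs linked by at least one word in this language
        for concepts in words.values():
            linked.update(combinations(sorted(concepts), 2))
        for pair in linked:
            weights[pair] += 1
    return dict(weights)


def clusters(weights, min_weight=2):
    """Connected components after dropping edges weaker than min_weight."""
    adjacency = defaultdict(set)
    for (a, b), weight in weights.items():
        if weight >= min_weight:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, result = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk over the strong edges
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


if __name__ == "__main__":
    network = build_network(BACK_TRANSLATIONS)
    for group in clusters(network):
        print(group)
```

With this toy data, the weak sun/sea link (found in only one language) is dropped, so the "sky" and "water" concepts stay in separate groups. A real analysis would need genuine translation data and a proper community-detection method.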


“For the first time, we now have a method to quantify how universal these relations are,” says Bhattacharya. “What is universal – and what is not – about how we group clusters of meanings teaches us a lot about psycholinguistics, the conceptual structures that underlie language use.”


  1. Great find, thanks Gideon Rosenblatt.

  2. When you presuppose a structure for concepts, then assume translations across language can process a 1-1 correspondence into identical concepts,  your quantification of “clusters” doesn’t imply any concepts for the words, only the correspondence for which a function was postulated. Is “plan” a noun or a verb?  How does that “map” into other languages.  Having ignored the semantics of inflection, no syntax can form many meanings since the given words lost any connotative meaning in the mapping and any denotative meaning in the concept of such structure itself.

    You’re left with an existential semantics informing a universal syntax about an inferred structure, not the deep one of cognitive versus structural psycholinguistics.

  3. Gideon Rosenblatt, I visited this topic via a different article on it about a week ago. Obviously this deep connection within transliterated languages is the very core of my own research.

    On the one hand, I am excited that others are exploring this great enigma, i.e., the matrix that intrinsically underlies every known language, but on the other hand I thoroughly agree that the methodology used in this particular testing severely limits the potential depth of these connections, quite unnecessarily.

    Our initial approach was to create and hone our lingual matching algorithms within the English language. What pleasantly surprised us, after 14 years of research and development, was that these algorithms work just as effectively with any transliterated language.

    To limit it to phonetics is a potential minefield too as this is where the main elements of confusion between languages exist. Yet when language is examined in a purely written form the connections are far more obvious and it is this route that we have chosen as the benchmark of connectivity.

    We realised that our algorithms had the ability to connect every language not through streamlining an imposed pattern on it but by investigating its very own interlinking patterns individually and then converting them into mathematical formulas.

    In my opinion this latency should not be limited by enforced subsets even though quite obviously they could potentially prove that there was a connection but only to a limited extent.

    One could immediately think of the benefits that we’ve uncovered through our own research such as the more obvious links via etymological and consonantal roots quite apart from the clear connections through alliteration, rhyming, paronyms, synonyms and every other #nym you could possibly shake a stick at.

    So, in summary, it is quite exciting that others are now beginning to prove that language is intrinsically linked, because that benchmark needs to be established before AI comes into its own. For me, though, it is only the tip of the iceberg, and what lies beneath is far more exciting than these initial steps that prove it in principle.


  4. Why did the appearance of “Linguistics Science” coincide with the disappearance of any formal grammar from the curricula of other subjects, like Logic and English (I’m thinking of the 1970s)? In banishing all formal concepts in the name of generative grammar and egalitarian principles of college graduation (don’t want to be “prescriptive” as that will spoil our scientifically objective, unbiased, “descriptive” approach to a more mathematical grammar), how did the DOD contracts trying to commission automatic translations of English-Russian turn up worthless as the Open Source got established at Berkeley, then ripped off by Silicon Valley? There’s a whole generation which was taught to express themselves regardless of whether they had anything to say or not, in order to generate the children who still occupy the frontiers of cognitive and structural approaches alike in order to develop their instincts as consumers, not their consciousnesses as citizens. The minority races got sucked into a load of debt which only inspires presidential candidates to remind us that “high school is no longer enough.” They never could decipher the level of abstraction that transformational and generative grammar introduced to discourse on any universal basis. Now they will need another four years of indoctrination, either to brainwash us with science without method or with language without relevance. Follow the money.

  5. I might add, Bruce Mincks: as you might know, linguistic science isolates these fragments of language as what we know as semes.

    This is all about smartening up the masses, not dumbing them down, and in my view it is the complete antithesis of “language without relevance”. On the contrary, this is totally about language with relevance.

    So clearly any fragments that retain meaning can be interconnected to create composite meanings, whether any might gainsay this or not. In fact they provide a far better compass than any dictionary meaning ever could. Any investigation that starts with this scientifically accepted fact will undoubtedly enjoy the benefits of exploring the collages of meaning that unfold in front of them, as it has in my case over a 14-year period.

    Either way, the difference between philosophising and practicality is vast IMO. For me, the practical benefits of isolating and connecting these fragments of language, both manually and via algorithmic pattern isolation, have already led to a radical increase in audience engagement and search positioning for material that has been optimised through this methodology, and IMO these facts far outweigh any hypotheses that may or may not be correct.

  6. Peter Hatherley What is it that we know as “semes” that is either correlated with meaning or meaningful in itself?  You relate it to a “collage” metaphorically without really identifying any tenor for the metaphor.

  7. These two links describe the correlation quite clearly +Bruce Mincks

    As to the collage, it relates more to our product than the discussion, i.e., our technology produces relative links via search queries.

    Unfortunately we’ve had a severe earthquake here in Christchurch today so it’s been quite difficult to give this thread my full attention.

  8. Peter Hatherley They define a unit of semantics semantically.  This is the first time I have encountered “ontologies” as a plural concept.   Thus a “seme” not only constitutes the object of your inquiry but also bridges many languages into some common denominator at the same time.  Does Academic jargon also qualify as a “language” in this context, apparently being a closed system of such semes?  That seems to be implied by the way phonetics provides scientific “speech” as the phenomenon corresponding with either mathematics or discourse (as you choose) without revealing any substance in the object of inquiry or the method of analysis this universal scope of the “Language seme”?  What happened to tongues in this process?

  9. Great find Gideon Rosenblatt 

  10. Bruce Mincks Yes I see your point now. Sorry I couldn’t address it earlier.

    My observation is that they definitely do qualify.

    I’ve found these apparently invented words still seem to match more correctly than it might seem.

    I have a very inclusive view of language and it has helped my research to take a flexible viewpoint of it.

    This phenomenon has been an enigma to me, especially when it comes to urban language that also seems to fit the right mould at the seme level.

    A perfect example is the word “cool”, which on the surface seems incorrectly matched and out of context, but in reality it links intrinsically with cola and the adverts that have presented it that way over the past few generations.

  11. Peter Hatherley Then your “morphology” should be organic in nature, since your data is given to experience. The mapping of such clusters would deceive you by implying some necessary transformation by the symmetry. I might recommend you look at Percy Shelley’s Defense of Poetry, then compare it to Philip Sidney’s Defense of Poesie, if your subject is English. It isn’t rocket science. . . .
