Robert A Downing II | Fremont, California, United States

Drafts by Robert A Downing II

An analysis of the Voynich Manuscript using word repetition and correlation with images

In this work we present an algorithm based on a word translation technique. We use word frequencies instead of letter frequencies to find words that are specific to a certain page, and we identify each such word using the image on that page. The algorithm relies on the fact that the meaning of a word is invariant even if the word has been encrypted in an unknown language. Instead of using the captions, we correlate word frequencies with the images. Because the meaning of a word in context remains invariant under encryption, correlating words with the images should help us decrypt them.
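The page-specific word search described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the function name `page_specific_words`, the page labels, and the toy transcription are all invented for the example, and "specific to a page" is assumed to mean "appears on exactly one page".

```python
from collections import Counter


def page_specific_words(pages):
    """Return, for each page, the words that occur on that page and on no other.

    `pages` maps a page label to its transcribed text (whitespace-separated words).
    Words are ranked by how often they occur on their page.
    """
    # Count on how many distinct pages each word appears.
    page_counts = Counter()
    for text in pages.values():
        page_counts.update(set(text.split()))

    result = {}
    for label, text in pages.items():
        # Keep only words seen on a single page, then rank by in-page frequency.
        freq = Counter(w for w in text.split() if page_counts[w] == 1)
        result[label] = [w for w, _ in freq.most_common()]
    return result


# Invented toy transcription, not real manuscript data.
toy_pages = {
    "page_a": "daiin chol kydain chol kydain",
    "page_b": "daiin shol otaim chol",
}
specific = page_specific_words(toy_pages)
```

Words that survive this filter are the candidates one would then try to match against the illustration on the same page.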

An Examination of the Repetition of Multiple Plant Names and their Origins, Uses, and Appearances as an Insight into the Content of the Pages of the Voynich Manuscript

This paper attempts to build an understanding of what some pages of the Voynich manuscript may contain, based on the identity and uses of two or more plant names that appear on the same page. The first word of each page in the botanical section of the Voynich manuscript, a codex that has not yet been deciphered, usually occurs only once in the entire manuscript, leading to the hypothesis that the first word on a page refers to the name of the plant illustrated on that page. Some of these plants are, however, mentioned more than once on different pages and even occur in sections other than the botanical section. By speculating on the identity of the plants that repeat and on their properties, origins, or uses, the author attempts to predict the content discussed on the pages where the repetition occurs.

Using Stellar Radius Error Values as a Method to Discover Black Hole Candidates

In April of 2021, Jayasinghe et al. reported the discovery of a small black hole binary companion to the red giant V723 Mon. The discovery of a black hole this small is very important, as it closes the mass gap between the smallest known black holes and the smallest theorized. The black hole's mass, distorting the surrounding spacetime fabric, affects its binary companion by stretching its stellar envelope, causing the star to take the shape of an ellipsoid. Under normal circumstances we expect stellar radius error to asymptotically approach zero. However, stars with a mini-black-hole companion would show an increasing stellar radius error, because the stretching of the star into an ellipsoidal shape makes the stellar radius harder to ascertain. Using the data in the NASA Exoplanet Archive, we applied an algorithm to screen the M- and K-type stars for any that meet our criteria. While we have not found any definitive candidates yet, we expect that with more time-series data, and by further refining our algorithm, we will have a solid methodology for identifying possible mini-black-hole binary pair candidates.

An Examination of Folios 4r-4v of the Voynich Manuscript, from the Perspective of Vulgar Latin and a Possible Connection to Sanskrit

Folios 4r and 4v of the Voynich Manuscript, a mysterious codex written in an unknown language during the 15th century, contain various illustrations of plants, believed to hold medicinal and/or religious significance. The Voynich manuscript is important because it may contain novel herbal remedies for diseases and reveal previously undiscovered cultural practices. The manuscript is theorized to be written in a derivative of a proto-Indo-European language, with Vulgar Latin and Italian being the primary possible contemporary languages. Using the same methodologies as applied to the earlier folios, further transliterations to contemporary languages having the same root in Vulgar Latin are explored. By evaluating the potential words on these folios for frequencies matching known frequencies in the contemporary descendants of proto-Indo-European language, possible equivalences are identified. Contemporary words sharing the same frequencies are then examined in the context of the embedded diagram(s) on the folios, and the process is extended through the remainder of MS 408 in an effort to demonstrate that the proposed context holds true. The author proposes a new theory that the Voynich manuscript was encoded in a mix of Sanskrit and Arabic, which is historically sound, since during the 15th century (the period to which the Voynich Manuscript has been carbon dated) trade across the Silk Road was flourishing, allowing cultures and languages to spread. Additionally, the Ottoman Empire was in its glory days, making Turkish languages the languages of the world. We are currently exploring additional similarities between the Voynich manuscript and characters from Sanskrit and Arabic, opening new avenues for further research.

A Quantitative Study of the Voynich Manuscript through the Kolmogorov-Smirnov Test

An Application of Data Mining And Frequency Analyses to Determine Source Languages of the Voynich Manuscript

ASDRP Communications, 2019

MS 408, also known as the Voynich Manuscript, has perplexed readers for centuries due to its strange writing and illustrations of plants, symbols, and human figures. The nature of the Voynich Manuscript, along with existing transcriptions of its writing, invites the use of data mining and machine learning techniques to find underlying patterns in its text. The resulting letter frequency analyses reveal that the text in the Voynich Manuscript is closely connected to both Latin and Italian. Comparisons between bigram frequencies from the Voynich Manuscript and those from representative Latin, Italian, Old French, and Old Spanish texts show strong correlations. Essentially, these comparisons support the view that the Voynich Manuscript is heavily influenced by Latin or a close derivative of Latin, which is historically plausible. Ultimately, the resulting conclusions attempt to clarify the mystery surrounding the manuscript and assist ongoing efforts to solve this enigma by forging new connections to help understand the Voynich Manuscript.
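A bigram-frequency comparison of the kind mentioned in this abstract might look like the sketch below. The abstract does not say how the frequencies were aligned across different alphabets, so this example assumes a rank-ordered comparison: each text's bigram frequencies are sorted from most to least common, and the top `k` values are compared with a Pearson correlation. The function names and the sample string are illustrative only.

```python
from collections import Counter
from math import sqrt


def bigram_frequencies(text):
    """Relative frequency of each within-word letter pair in `text`."""
    counts = Counter()
    for word in text.lower().split():
        counts.update(word[i:i + 2] for i in range(len(word) - 1))
    total = sum(counts.values())
    return {bg: n / total for bg, n in counts.items()} if total else {}


def ranked_profile(text, k):
    """Top-k bigram frequencies in descending order, zero-padded to length k."""
    freqs = sorted(bigram_frequencies(text).values(), reverse=True)[:k]
    return freqs + [0.0] * (k - len(freqs))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sx * sy)


# Invented sample text; comparing a profile with itself must give 1.0.
sample = "la rosa la rosa alba"
self_correlation = pearson(ranked_profile(sample, 5), ranked_profile(sample, 5))
```

A rank-ordered profile discards which bigram is which, so a high correlation here only says that two texts have similarly shaped frequency distributions.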

Papers by Robert A Downing II

Voynich Character MSE Paper Final

Decoding the Voynich Manuscript using Mean Square Error Image Analysis, 2024

The Voynich manuscript is a mysterious, undeciphered codex written in an unknown script, referred to as “Voynichese,” by an unknown author during the early fifteenth century, as indicated by radiocarbon dating. Written from left to right and top to bottom, the indecipherable “Voynichese” character set and perplexing illustrations have confounded researchers for years, captivating the interest of researchers in fields such as linguistics, machine learning, cryptography, and botany. During the period when the Voynich manuscript is believed to have been written, namely the 15th century, the Silk Road, while largely commercial, allowed for all sorts of creative exchange, including language, between tremendously diverse peoples and cultures. For this reason, Voynichese characters could have roots in Sanskrit and Arabic (languages spoken along the Silk Road), as well as Phoenician and Greek (Mediterranean languages). To examine the Voynichese character set through visual analysis, computer vision and mean square error image analysis can be employed to determine the degree of similarity between Voynichese characters and characters of the previously mentioned languages.
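As a rough illustration of mean square error image comparison, the sketch below scores two equally sized grayscale glyph images pixel by pixel and picks the candidate with the lowest error. The abstract does not describe the preprocessing, so the images are assumed to be already cropped and scaled to the same size; the function names and the 3x3 toy glyphs are invented for the example.

```python
def mean_squared_error(img_a, img_b):
    """Mean of squared pixel differences between two same-sized 2D images."""
    if len(img_a) != len(img_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same dimensions")
    total = 0
    count = 0
    for row_a, row_b in zip(img_a, img_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += (pixel_a - pixel_b) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must contain at least one pixel")
    return total / count


def closest_glyph(target, candidates):
    """Name of the candidate image with the lowest MSE against `target`."""
    return min(candidates, key=lambda name: mean_squared_error(target, candidates[name]))


# Invented 3x3 binary glyphs (1 = ink, 0 = background).
target = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
candidates = {
    "bar_with_foot": [[0, 1, 0], [0, 1, 0], [0, 1, 1]],
    "dash": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}
best = closest_glyph(target, candidates)
```

A lower MSE means the two images are more alike pixel for pixel; the measure is sensitive to alignment and stroke thickness, so the result depends heavily on how the glyphs are normalized beforehand.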

ExoplanetFall2020.pdf

The NASA Exoplanet Archive is a dataset extracted from the full sets of data from the Keck, Kepler, TESS, and Gaia observations, covering observed stellar objects that have been determined to possess one or more planets. It is continually updated as more and more exoplanets, or planets outside our own solar system, are discovered and documented. Our first objective was to see how many of these entries were duplicates; removing them would bring the total number of entries we would work with from 29,283 to 4,259. In previous research, this dataset was filtered by determining which of these exoplanets are inside their Circumstellar Habitable Zone (CHZ), commonly defined as the range of distances from a host star within which a planet may contain liquid water, a key requirement for life as we know it. However, this calculation was done only for exoplanets with M-type host stars. Over the course of our research, we were able to expand this calculation of the CHZ to exoplanets with host stars of all spectral types. We performed a more in-depth investigation of planets with G-, K-, and M-type host stars by comparing them to planets in the Planetary Habitability Laboratory (PHL) exoplanet dataset to see how many they have in common. The PHL catalog uses its own set of criteria to define the planets in it as habitable. Using this method, we determined that there were 3 exoplanets with M-type host stars, 0 exoplanets with a G-type host star, and 1 exoplanet with a K-type host star.

The Exploration of Habitable Exoplanets using Data Mining Algorithms and Data Manipulation

The NASA Exoplanet Archive is a dataset extracted from the full sets of data from the Keck, Kepler, TESS, and Gaia observations, covering observed stellar objects that have been determined to possess one or more planets. It is continually updated as more and more exoplanets, or planets outside our own solar system, are discovered and documented. Our first objective was to see how many of these entries were duplicates; removing them would bring the total number of entries we would work with from 29,283 to 4,259. In previous research, this dataset was filtered by determining which of these exoplanets are inside their Circumstellar Habitable Zone (CHZ), commonly defined as the range of distances from a host star within which a planet may contain liquid water, a key requirement for life as we know it. However, this calculation was done only for exoplanets with M-type host stars. Over the course of our research, we were able to expand this calculation of the CHZ to exoplanets with host stars of all spectral types. We performed a more in-depth investigation of planets with G-, K-, and M-type host stars by comparing them to planets in the Planetary Habitability Laboratory (PHL) exoplanet dataset to see how many they have in common. The PHL catalog uses its own set of criteria to define the planets in it as habitable. Using this method, we determined that there were 3 exoplanets with M-type host stars, 0 exoplanets with a G-type host star, and 1 exoplanet with a K-type host star.
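The duplicate-removal step mentioned in this abstract could be sketched as below. The abstract does not state how duplicates were identified, so this example assumes that rows sharing a planet name are duplicates and that the first row seen is the one kept. The column names `planet_name` and `record` are hypothetical, and the rows are placeholders rather than archive data.

```python
def deduplicate(rows, key="planet_name"):
    """Keep the first row for each distinct value of `key`, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


# Placeholder rows: several archive entries can describe the same planet.
rows = [
    {"planet_name": "K2-18 b", "record": "entry 1"},
    {"planet_name": "K2-3 d", "record": "entry 2"},
    {"planet_name": "K2-18 b", "record": "entry 3"},
]
unique_rows = deduplicate(rows)
```

Keeping the first row is only one possible policy; preferring the most recent or most complete entry would need an extra sort before this pass.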

Application of Data Mining to Search for Potentially Habitable Exoplanets

Many light years away from our own solar system, over four thousand confirmed planets orbit stars in a fashion similar to our own eight planets and the sun. With the discovery of these planets, called "exoplanets," comes the question of extraterrestrial life, a concept scientists have been exploring for years. The possibility of exoplanetary habitability relies on a number of factors, such as spectral type, density, and eccentricity, but most importantly on whether the exoplanet in question contains water, the fundamental requirement for life as we know it. To determine whether an exoplanet provides the conditions for sustaining this vital ingredient for life, we considered the concept of the Goldilocks Zone, or circumstellar habitable zone (CHZ): the range of orbits around a star where liquid water is capable of existing. The research we have been conducting this summer uses the public dataset provided by NASA and Caltech, together with data mining methods implemented in Python and Microsoft Excel, to identify exoplanets with potentially habitable conditions. The discovery of water vapor in the atmosphere of the exoplanet K2-18b was a major part of our research, in which we focused on identifying exoplanets with attributes similar to those of K2-18b, in hopes that they too may be able to retain atmospheric water vapor. After a two-month period, we found that 59 exoplanets orbit in the CHZ of their host star. As for the K2-18b ruleset, only 1 planet, K2-3d, satisfies the conditions. We believe K2-3d to have a high degree of similarity to K2-18b, but more in-depth analysis will have to be conducted to determine its potential to support atmospheric water vapor and life as we know it.
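A habitable-zone check of the general kind described here can be sketched with a simple luminosity scaling. Because stellar flux falls off with the square of distance, the distance at which a planet receives a given flux scales with the square root of the star's luminosity. The solar-system bounds of 0.95 AU and 1.67 AU used below are illustrative assumptions for this sketch, not values taken from the paper, and the function names are invented.

```python
from math import sqrt


def habitable_zone(luminosity_solar, inner_au=0.95, outer_au=1.67):
    """Approximate CHZ bounds in AU for a star of the given luminosity.

    `luminosity_solar` is the stellar luminosity in units of the Sun's.
    `inner_au` and `outer_au` are assumed bounds for a Sun-like star.
    """
    if luminosity_solar <= 0:
        raise ValueError("luminosity must be positive")
    scale = sqrt(luminosity_solar)  # equal-flux distance scales as sqrt(L)
    return inner_au * scale, outer_au * scale


def in_habitable_zone(semi_major_axis_au, luminosity_solar):
    """True if the orbit's semi-major axis lies inside the approximate CHZ."""
    inner, outer = habitable_zone(luminosity_solar)
    return inner <= semi_major_axis_au <= outer


# A planet at 1 AU around a Sun-like star falls inside the assumed bounds.
earth_like = in_habitable_zone(1.0, 1.0)
```

This ignores orbital eccentricity, stellar temperature, and planetary atmosphere, all of which shift the real boundaries.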

A Proposed Mapping of the Voynich Alphabet to an Indo-European Language

ASDRP Communications v3, 2020

Folios 1v-6r of the Voynich Manuscript, a mysterious codex that was written in an unknown language during the 15th or 16th century, contain various illustrations of plants, and these diagrams offer valuable clues about the text and its meaning (1). The labels associated with these diagrams were roughly translated to various derivatives of Proto-Indo-European languages (Latin, Italian, French, Spanish, and Portuguese). This was done using a frequency-based character map in which the most common characters in Voynichese were mapped to the most frequent characters in the target language. The translated text was manually checked to identify words with possible meaning in a target language, and the meanings of those words were cross-referenced with Dr. Gerard Cheshire's work to identify which words were false positives and which were likely correct translations. Our mapping helps validate Dr. Cheshire's research on the plants described in the first ten folios of the Voynich Manuscript and provides further insight into the information contained in those folios (2). Since the Voynich Manuscript is over six hundred years old and may contain information that reveals novel remedies for common diseases and previously undiscovered plants, it could provide meaningful information that alters the field of medicine. Also, because Vulgar Latin remained in use after the fall of the Roman Empire, the text may give clues about this purely spoken language.

Voynich Manuscript | MS 408 | Romance Languages | Medieval Plants | Natural Language Processing | Latin
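The frequency-based character map described in this abstract can be illustrated with a short sketch: rank the characters of the source text and of a target-language text by frequency, then pair them rank for rank. The function names and the toy strings are invented for the example, and ties are broken by first appearance, which is an assumption of this sketch rather than something the abstract specifies.

```python
from collections import Counter


def frequency_rank(text):
    """Alphabetic characters of `text`, most frequent first."""
    counts = Counter(c for c in text if c.isalpha())
    return [c for c, _ in counts.most_common()]


def build_character_map(source_text, target_text):
    """Map the n-th most common source character to the n-th most common target one."""
    return dict(zip(frequency_rank(source_text), frequency_rank(target_text)))


def apply_map(text, mapping):
    """Substitute mapped characters; leave everything else unchanged."""
    return "".join(mapping.get(c, c) for c in text)


# Invented toy strings: 'o' is the most common source character, 'e' the target's.
mapping = build_character_map("ooo dd y", "eee aa t")
translated = apply_map("doy", mapping)
```

The output of such a map is only a candidate transliteration, which is why the abstract describes a manual check of the resulting words against a target language.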

A Quantitative Study of the Voynich Manuscript through the Kolmogorov-Smirnov Test

The National High School Journal of Science, 2020

The Voynich Manuscript [MS408, VM] is a collection of drawings and text written in an unknown source language, using an unknown character set from the 15th century, and written by an unknown author, so virtually nothing is known about the significance of what it contains. From booksellers to cryptographers, no one has been able to decipher the contents of the VM, but they have come up with various theories, from alien influences to the manuscript being a hoax. Though there is much research on the qualitative aspects of the VM, there is a lack of quantitative analyses of the text. The goal of this paper is to provide a quantitative comparison between the possible source language used in the VM and modern-day languages through an application of the Kolmogorov-Smirnov [K-S] test, a robust statistical test that compares the distributions of two datasets. Results of this test indicate a strong similarity between the VM, Latin, and Italian, supporting the widely held belief that the language of the VM is a Vulgar Latin derivative.
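For readers unfamiliar with the K-S test, the two-sample statistic is the largest vertical gap between the empirical cumulative distribution functions of the two samples. The sketch below computes that statistic in plain Python. The abstract does not say which quantity was compared, so treating the samples as, for example, word lengths is an assumption, and the numbers used here are arbitrary placeholders.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Both ECDFs are step functions, so the largest gap occurs at a data point.
    for x in set(a) | set(b):
        f_a = bisect_right(a, x) / len(a)  # fraction of sample_a <= x
        f_b = bisect_right(b, x) / len(b)  # fraction of sample_b <= x
        d = max(d, abs(f_a - f_b))
    return d


# Arbitrary placeholder samples with partial overlap.
overlap = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```

A statistic near 0 means the two samples have nearly identical distributions and a statistic of 1 means they do not overlap at all; turning the statistic into a p-value additionally depends on the sample sizes.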
