Roger Bilisoly | Central Connecticut State University
Papers by Roger Bilisoly
The Mathematical Gazette, 2008
In this article, we discuss the modeling of count data occurring in biological applications. We then derive asymptotic procedures for the construction of confidence limits for the over-dispersion parameter of count data when no likelihood is available. We also obtain closed-form asymptotic variance formulae for the estimator of the over-dispersion parameter. Finally, we conduct a simulation study to compare these, in terms of coverage, with a procedure using the maximum likelihood estimator based on the negative binomial model. It appears that confidence intervals based on the method of moments or the double extended quasi-likelihood of Lee and Nelder (Biometrika 2001, 88, 987-1006) are better for smaller deviations from the Poisson assumption and larger sample sizes. An example using biological data illustrates these procedures.
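The kind of procedure the abstract describes can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual derivation: it uses a method-of-moments estimate of the over-dispersion parameter phi (assuming the negative-binomial-style variance Var(Y) = mu + phi*mu^2) and a generic percentile bootstrap in place of the paper's asymptotic intervals; the simulated data and parameter values are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def mom_overdispersion(y):
    """Method-of-moments estimate of phi in Var(Y) = mu + phi*mu^2."""
    m, v = y.mean(), y.var(ddof=1)
    return (v - m) / m**2

def bootstrap_ci(y, n_boot=2000, level=0.95):
    """Percentile bootstrap CI for phi (a stand-in for the paper's
    asymptotic confidence limits)."""
    est = np.array([mom_overdispersion(rng.choice(y, size=y.size, replace=True))
                    for _ in range(n_boot)])
    a = (1 - level) / 2
    return np.quantile(est, [a, 1 - a])

# Simulated over-dispersed counts: negative binomial with mu = 4, phi = 0.5
mu, phi = 4.0, 0.5
r = 1 / phi                  # NB size parameter
p = r / (r + mu)
y = rng.negative_binomial(r, p, size=200)
print(mom_overdispersion(y), bootstrap_ci(y))
```

For Poisson data phi is zero, so an interval that excludes zero is evidence of over-dispersion.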
Computational Statistics & Data Analysis, Jul 1, 2009
Extra-dispersion (overdispersion or underdispersion) is a common phenomenon in practice when the variance of count data differs from that of a Poisson model. This can arise when the data come from different subpopulations or when the assumption of independence is violated. This paper develops a procedure for testing the equality of the means of several groups of counts, when extra-dispersions among the treatment groups are unequal, based on the adjusted counts using the concept of the design and size effects employed by Rao ...
Interest in the mathematical structure of poetry dates back to at least the 19th century: after retiring from his mathematics position, J. J. Sylvester wrote a book on prosody called The Laws of Verse. Today there is interest in the computer analysis of poems, and this paper discusses how a statistical approach can be applied to this task. Starting with the definition of Middle English alliteration, Sir Gawain and the Green Knight and William Langland's Piers Plowman are used to illustrate the methodology. Theory first developed for analyzing data from a Riemannian manifold turns out to be applicable to strings, allowing one to compute a generalized mean and variance for textual data, which are applied to the poems above. The ratio of these two variances produces an analogue of the F test, and resampling allows p-values to be estimated. Consequently, this methodology provides a way to compare prosodic variability between two texts.
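The variance-ratio-with-resampling idea can be illustrated concretely. The sketch below is my own simplification, not the paper's method: it takes Levenshtein distance as the string metric, defines a generalized variance as the minimal mean squared distance to a medoid string (a discrete stand-in for the Fréchet variance on a manifold), and estimates a one-sided permutation p-value for the F-like ratio; the toy strings stand in for alliteration patterns.

```python
import random

def lev(a, b):
    """Levenshtein edit distance (the assumed metric on strings)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[-1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def gen_variance(strings):
    """Generalized variance: minimal mean squared distance to a medoid."""
    return min(sum(lev(s, t) ** 2 for t in strings) / len(strings)
               for s in strings)

def ratio_test(x, y, n_perm=500, seed=0):
    """One-sided permutation p-value for the variance-ratio statistic."""
    rnd = random.Random(seed)
    obs = gen_variance(x) / gen_variance(y)
    pooled, nx = x + y, len(x)
    hits = 0
    for _ in range(n_perm):
        rnd.shuffle(pooled)
        hits += gen_variance(pooled[:nx]) / gen_variance(pooled[nx:]) >= obs
    return (hits + 1) / (n_perm + 1)

# Toy "alliteration pattern" samples from two hypothetical texts
p = ratio_test(["aab", "aba", "abb"], ["abc", "cab", "bba"])
print(p)
```

A small p-value would suggest the first sample is more variable than the second under this metric.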
http://dx.doi.org/10.1080/02664763.2013.840273, Nov 22, 2013
In many clinical trials and epidemiological studies, comparing the mean count response of an exposed group to a control group is often of interest. This type of data is often over-dispersed with respect to Poisson variation, and previous studies usually compared groups using confidence intervals (CIs) of the difference between the two means. However, in some situations, especially when the means are small, interval estimation of the mean ratio (MR) is preferable. Moreover, Cox and Lewis (The Statistical Analysis of Series of Events, Methuen, London, 1966) pointed out many other situations where the MR is more relevant than the difference of means. In this paper, we consider CI construction for the ratio of means between two treatments for over-dispersed Poisson data. We develop several CIs for this situation by hybridizing two separate CIs for the two individual means. Extensive simulations show that all hybrid-based CIs perform reasonably well in terms of coverage. However, the CIs based on the delta method using the logarithmic transformation perform better than the other intervals in the sense that they have slightly shorter interval lengths and show better balance of tail errors. The proposed CIs are illustrated with three real data examples.
Although textbook publishers offer course management systems, they do so to promote brand loyalty, and while an open-source tool such as WeBWorK is promising, it requires administrative and IT buy-in. So, supported in part by a College Access Challenge Grant from the Department of Education, we collaborated with other instructors to create online homework sets for three classes: Elementary Algebra, Intermediate Algebra, and Statistics for Behavioral Sciences I. After experimentation, some of these question pools are now created by Mathematica programs that can generate data sets from specified distributions, generate random polynomials that factor in a given way, and create image files of histograms, scatterplots, and so forth. These programs produce files that can be read by the software package Respondus, which then uploads the questions into Blackboard Learn, the course management system used by the Connecticut State University system. Finally, we summarize five classes' worth of student performance data along with lessons learned while working on this project.
Word Ways, 2007
This note is inspired by Numbo-Carrean, which was introduced in Ross Eckler's Word Recreations [1] in the chapter called "Ten Logotopian Lingos." This lingo uses words with the following property: when each letter is replaced by its letter rank (or alphabetic position number), the resulting number is a perfect square. That is, a is replaced by 1, b by 2, c by 3, and so forth, and these numbers are concatenated. For example, have becomes 81225, which is the square of 285. However, if the restriction of being a square is dropped, then all words can be mapped to numbers. This note examines the cases where one number stands for more than one word. For instance, able and lay are both 12125. Since these numbers can be ambiguous, we use parentheses to surround the alphabetic position numbers. For example, 12125 is (1)(2)(12)(5) for able, but (12)(1)(25) for lay. The ambiguity is always due to the numerals 1 and 2; for instance, 12 can be the letter l or the two letters ab, but 6554 must be feed, and although 25 can be either y or be, this is due to the digit 2, not the digit 5. Generating examples using a computer and a word list is straightforward: for each word on the list, convert it to its associated number, then change this number to a string, and store the original word using this string as an index (that is, use an associative array).
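The associative-array procedure described at the end of the note is a few lines in practice. This sketch uses a Python dict as the associative array; the small word list is only for illustration.

```python
from collections import defaultdict

def word_to_number(word):
    """Concatenate alphabetic positions: a -> 1, b -> 2, ..., z -> 26."""
    return "".join(str(ord(c) - 96) for c in word.lower())

def ambiguous_numbers(words):
    """Index words by their number string; keys mapping to more than one
    word are the ambiguous cases the note examines."""
    index = defaultdict(list)
    for w in words:
        index[word_to_number(w)].append(w)
    return {k: v for k, v in index.items() if len(v) > 1}

print(word_to_number("have"))                      # '81225' (= 285**2)
print(ambiguous_numbers(["able", "lay", "feed"]))  # {'12125': ['able', 'lay']}
```

Running this over a full word list surfaces every number that stands for more than one word.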
Edgar Allan Poe wrote seventy short stories in his lifetime, and literary critics have categorized these stories in many ways, e.g., by genres such as horror, detective, or proto-science fiction. This paper discusses how a computer can group stories by using families of words related by a theme, e.g., words denoting colors. This approach combines two different techniques. First, we use term-document matrices, which were originally developed for document searches in the field of information retrieval. Second, we use formal concept theory, which defines concepts in a way that forms a Galois lattice. These lattices have a well-developed mathematical basis and have been used in applications beyond the computer sciences, e.g., social networks in mathematical sociology. Finally, we discuss how meaningful these groups of stories are to a human reader.
This project was primarily funded by the LDRD office at Sandia National Laboratories. During the course of this three-year project, the EPA, DHS, and SNL's CSRF funded other, related algorithmic extensions, which are included in this report. The EPA funding continues to support additional algorithmic development for the security of water distribution systems. DHS funded additional algorithmic research for the security of internal facilities, both forward modeling and source inversion for transient simulations. CSRF funded research on the air security portion for the first year of this LDRD project. We wish to thank Dan Quintana and members of the engineering department of the Tucson Water utility company for very useful meetings and technical information exchanges, in addition to access to datasets. Most of our algorithms were tested against these datasets, which are representative of real, production-quality networks.
Statistics pedagogy values using a variety of examples. Thanks to text resources on the Web, and since statistical packages can analyze string data, it is now easy to use language-based examples in a statistics class. Three such examples are discussed here. First, many types of wordplay (e.g., crosswords and hangman) involve finding words with letters that satisfy a certain pattern. Second, linguistics has shown that idiomatic pairs of words often appear together more frequently than chance would predict. For example, in the Brown Corpus, this is true of the phrasal verb to throw up (p-value = 7.92E-10). Third, a pangram contains all the letters of the alphabet at least once. These are searched for in Charles Dickens' A Christmas Carol, and their lengths are compared to the expected value given by the unequal-probability coupon collector's problem as well as to simulations.
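The first and third examples reduce to short scripts. The sketch below is illustrative only (the word list and test sentence are my own choices): a regex stands in for a crossword/hangman pattern, and the pangram-window length is the quantity one would compare with the coupon collector's expectation.

```python
import re

def pattern_words(words, pattern):
    """Hangman/crossword-style search: '.a..y' matches five-letter words
    with 'a' in position 2 and 'y' in position 5."""
    rx = re.compile(pattern + "$")
    return [w for w in words if rx.match(w)]

def shortest_pangram_window(text):
    """Length of the shortest prefix of the letter stream containing all
    26 letters, or None if the text is not a pangram."""
    letters = [c for c in text.lower() if c.isalpha()]
    seen = set()
    for i, c in enumerate(letters, 1):
        seen.add(c)
        if len(seen) == 26:
            return i
    return None

print(pattern_words(["party", "happy", "pear"], ".a..y"))
print(shortest_pangram_window("The quick brown fox jumps over the lazy dog"))
```

Repeating the window computation over sliding starting points in a novel gives the sample of lengths compared against theory and simulation.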
Markov chains are an important example for a course on stochastic processes. Simple board games can be used to illustrate the fundamental concepts. For example, a looping board game (like Monopoly®) consists of all recurrent states, and a game where players win by reaching a final square (like Chutes and Ladders®) consists of all transient states except for the recurrent ending state. With the availability of computer algebra packages, these games can be analyzed. For example, the mean times in transient states and the stationary probabilities for recurrent states are easily computed. This article shows several simple board games analyzed with Mathematica®, and indicates how more complex games can be approached.
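The "mean times in transient states" computation mentioned above uses the fundamental matrix N = (I − Q)⁻¹, where Q is the transient-to-transient block of the transition matrix. The article works in Mathematica; the sketch below redoes the calculation in Python with NumPy for a made-up race-to-the-end mini game (the game itself is an assumption, not one from the article).

```python
import numpy as np

# Toy "race to square 3" game: from each square, flip a fair coin to move
# 1 or 2 squares; an overshoot past square 3 means staying put.
# Square 3 is the absorbing (recurrent) ending state.
P = np.array([
    [0.0, 0.5, 0.5, 0.0],   # from 0: move to 1 or 2
    [0.0, 0.0, 0.5, 0.5],   # from 1: move to 2 or 3
    [0.0, 0.0, 0.5, 0.5],   # from 2: reach 3, or overshoot and stay
    [0.0, 0.0, 0.0, 1.0],   # 3: absorbing
])

Q = P[:3, :3]                        # transient-to-transient block
N = np.linalg.inv(np.eye(3) - Q)     # fundamental matrix
expected_moves = N.sum(axis=1)       # mean moves to absorption per start
print(expected_moves)                # [3. 2. 2.]
```

Row sums of N give the expected number of moves until the game ends from each starting square, e.g., three moves on average from square 0 here.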
As demonstrated by the anthrax attack through the United States mail, people infected by the biological agent itself will give the first indication of a bioterror attack. Thus, a distributed information system that can rapidly and efficiently gather and analyze public health data would aid epidemiologists in detecting and characterizing emerging diseases, including bioterror attacks. We propose using clusters of adverse health events in space and time to detect possible bioterror attacks. Space-time clusters can indicate exposure to infectious diseases or localized exposure to toxins. Most space-time clustering approaches require individual patient data. To protect patient privacy, we have extended these approaches to aggregated data and have embedded this extension in a sequential probability ratio test (SPRT) framework. The real-time and sequential nature of health data makes the SPRT an ideal candidate. The result of space-time clustering gives the statistical significance of a cluster at every location in the surveillance area and can be thought of as a "health index" of the people living in that area. As a surrogate for bioterrorism data, we have experimented with two flu data sets. For both databases, we show that space-time clustering can detect a flu epidemic 21 to 28 days earlier than a conventional periodic regression technique. We have also tested simulated anthrax attack data on top of a respiratory illness diagnostic category. Results show we do very well at detecting an attack as early as the second or third day after infected people start becoming severely symptomatic.
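The SPRT machinery at the core of this framework is simple to state. The snippet below is a deliberately simplified univariate sketch, not the paper's space-time version: it tests a baseline Poisson mean against an elevated outbreak mean on a daily count stream, with the standard Wald approximation for the decision boundaries; the counts and means are assumptions.

```python
import math

def sprt(counts, mu0, mu1, alpha=0.05, beta=0.05):
    """Sequential probability ratio test on daily Poisson counts:
    H0 mean mu0 (baseline) vs H1 mean mu1 (outbreak).
    Returns a decision and the day it was reached."""
    a = math.log((1 - beta) / alpha)     # cross above: accept H1
    b = math.log(beta / (1 - alpha))     # cross below: accept H0
    llr = 0.0
    for day, y in enumerate(counts, 1):
        # Poisson log-likelihood-ratio increment for one day's count
        llr += y * math.log(mu1 / mu0) - (mu1 - mu0)
        if llr >= a:
            return "signal", day
        if llr <= b:
            return "no signal", day
    return "continue", len(counts)

# Hypothetical escalating daily counts against a baseline mean of 10
print(sprt([12, 15, 18, 25], mu0=10, mu1=20))   # ('signal', 4)
```

Because the statistic accumulates day by day, a decision arrives as soon as the evidence crosses a boundary, which is what makes the sequential framing attractive for real-time surveillance.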
Critical Transitions in Water and Environmental Resources Management, 2004
The effect of variable demands at short time scales on the transport of a solute through a water distribution network has not previously been studied. We simulate flow and transport in a small water distribution network using EPANET to explore the effect of variable demand on ...