Michael L Brodie - Profile on Academia.edu
Dr. Michael L. Brodie has 45+ years of research and industry experience in data science, databases, artificial intelligence, and multidisciplinary problem-solving. I apply my knowledge to big-picture opportunities and challenges with philosophical and immediate practical impacts. For the past five years, my research objective has been to better understand and define data science as a field of inquiry: its philosophy [2] and that of data per se [5], and the data science problem-solving paradigm [4].
I am a Visiting Scholar in DASlab, School of Engineering and Applied Sciences, Harvard University. From 2013 to 2019, I was a Research Scientist in the MIT Data Systems Group. I am a Canadian-American with a Ph.D. in databases and AI from the University of Toronto and a Doctor of Science (honoris causa) from the National University of Ireland.
I have authored 200+ articles and seven books on advanced technologies, often collaborating with Turing Laureate Michael Stonebraker, have given 100+ keynote addresses, and have an h-index of 32 and an i10-index of 63. The Association for Computing Machinery has distributed 33,000+ copies of my 2019 book [1]. Full-time and visiting computer science professorships in Canada, the USA, Germany, France, Italy, Australia, and Ireland expanded my knowledge of computer science research and practice. My practical, industrial knowledge comes from 25+ years as Chief Scientist of Verizon, one of the world’s largest enterprises, where I was responsible for emerging technologies – architectures, methodologies, and strategies – e.g., the largest installation of SAP’s R/3 ERP outside Europe.
Since 1980, I have served on the scientific advisory boards of national and international research organizations, including US National Academy of Sciences committees, and of 20+ startups. From 2013 to 2019, I chaired the Scientific Advisory Committee of the Science Foundation Ireland Research Center for Data Analytics, Europe’s first and largest data science research institute, where I contributed to defining the phenomenal new field of data science.
Fortuitously, my Ph.D. vision of AI extending databases, and vice versa, prepared me, more than I could have foreseen, for the recent emergence of AI-based data science. I pursue that vision to better understand the miraculous yet inscrutable field of data science and to enable knowledge discovery with scope, scale, complexity, and power beyond that of science, previously our most powerful knowledge discovery paradigm.
While "What is data science?" and "What is data?" may seem philosophical and far removed from urgent practical concerns, these questions must be understood to realize data science's potential benefits, to identify and minimize its risks, and to anticipate our 21st-century world [3]. Such data science thinking is required to gain insights into our world through data science problem-solving [4]. My recent papers on these topics [2][3] have been downloaded 2,000+ times in six months.
Supervisors: Dennis Tsichritzis
Phone: 781-710-4928
Address: Cambridge, Massachusetts
Papers by Michael L Brodie
OTM’10 Keynote
Lecture Notes in Computer Science, 2010
Database Management: A Survey
The objective of Database Management technology is to provide general-purpose mechanisms for managing large, shared data repositories. This chapter presents the basic concepts, techniques, and tools of database management. Data modelling, data models, and database languages are described together with their application in database design and development; database management systems and implementation issues are outlined. The chapter concludes by discussing the current challenges that are driving advances in database technology and by identifying future directions for database management research. Database motivations and concepts are compared and contrasted with those in Artificial Intelligence.
arXiv (Cornell University), Feb 14, 2024
The objective of this research is to provide a framework with which the data science community can understand, define, and develop data science as a field of inquiry. The framework is based on the classical reference framework (axiology, ontology, epistemology, methodology) used for 200 years to define knowledge discovery paradigms and disciplines in the humanities, sciences, algorithms, and now data science. I augmented it for automated problem-solving with (methods, technology, community). The resulting data science reference framework is used to define the data science knowledge discovery paradigm in terms of the philosophy of data science addressed in previous papers and the data science problem-solving paradigm, i.e., the data science method and the data science problem-solving workflow, both addressed in this paper. The framework is a much-called-for unifying framework for data science, as it contains the components required to define data science. For insights to better understand data science, this paper uses the framework to define the emerging, often enigmatic, data science problem-solving paradigm and workflow, and to compare them with their well-understood scientific counterparts, the scientific problem-solving paradigm and workflow.
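To make the shape of the reference framework concrete, here is a minimal Python sketch of it as a record type. The component names come from the abstract above; the field types, comments, and toy instantiation are illustrative assumptions, not the paper's definitions.

```python
from dataclasses import dataclass, field

@dataclass
class ReferenceFramework:
    """Sketch of the reference framework named in the abstract: the four
    classical components plus those added for automated problem-solving."""
    axiology: str       # values: what the paradigm considers worthwhile
    ontology: str       # what entities the paradigm assumes to exist
    epistemology: str   # what counts as knowledge and how it is justified
    methodology: str    # principles governing how inquiry proceeds
    methods: list[str] = field(default_factory=list)     # concrete techniques
    technology: list[str] = field(default_factory=list)  # tools automating the methods
    community: list[str] = field(default_factory=list)   # who practices and validates it

# A toy instantiation for data science; the values are placeholders,
# not the paper's definitions.
data_science = ReferenceFramework(
    axiology="actionable insight from data, weighed against risk",
    ontology="data, models, and the phenomena they represent",
    epistemology="inductive, probabilistic knowledge with quantified uncertainty",
    methodology="the data science problem-solving workflow",
    methods=["statistical learning", "machine learning"],
    technology=["data management systems", "ML platforms"],
    community=["the 40+ disciplines in which data science is emerging"],
)
```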
On the Design and Specification of Database Transactions
Elsevier eBooks, 1989
OTM’10 Keynote
Springer eBooks, 2010
Database Management: A Survey
Topics in information systems, 1986
The objective of Database Management technology is to provide general-purpose mechanisms for managing large, shared data repositories. This chapter presents the basic concepts, techniques, and tools of database management. Data modelling, data models, and database languages are described together with their application in database design and development; database management systems and implementation issues are outlined. The chapter concludes by discussing the current challenges that are driving advances in database technology and by identifying future directions for database management research. Database motivations and concepts are compared and contrasted with those in Artificial Intelligence.
ArXiv.org, 2024
The objective of this research is to provide a framework with which the data science community can understand, define, and develop data science as a field of inquiry. The framework is based on the classical reference framework (axiology, ontology, epistemology, methodology) used for 200 years to define knowledge discovery paradigms and disciplines in the humanities, sciences, algorithms [28], and now data science. I augmented it for automated problem-solving with (methods, technology, community) [3][4]. The resulting data science reference framework is used to define the data science knowledge discovery paradigm in terms of the philosophy of data science addressed in [3] and the data science problem-solving paradigm, i.e., the data science method, and the data science problem-solving workflow, both addressed in this paper.
An Extended Transaction Environment for Workflows in Distributed Object Computing
IEEE Data(base) Engineering Bulletin, 1993
By Dimitrios Georgakopoulos, Mark F. Hornick, Frank Manola, Michael L. Brodie, Sandra Heiler, Farshad Nayeri, and Benjamin Hurwitz, GTE.
Springer eBooks, 2009
A compelling question for the 21st Century is "What is the nature of our digital universe?" In designing the Future Internet, we have a remarkable opportunity and need for a deeper understanding of this question. Based on the profound importance of our Future Digital World and on the role of the Future Internet in shaping it, this paper suggests holistic objectives for the design process and some thoughts on challenges and opportunities that the design process may face.
On knowledge-based systems architectures
Springer eBooks, Jul 9, 1986
Association: A Database Abstraction for Semantic Modelling
International Conference on Entity-Relationship Approach, Oct 12, 1981
A Functional Framework for Database Management Systems
The concept of DBMS architecture played an important role in the design, analysis, and comparison of DBMSs as well as in the development of other database concepts. The ANSI/SPARC prototypical database system architecture was a major contribution in this development. The architecture raised many issues, stimulated considerable research, and posed a number of new problems. Since the basic formulation of the ANSI architecture in 1974, little consideration has been given to resolving its problems and accommodating new and future developments. The main problems concern its unnecessary rigidity. The contributions of this paper are a distinction between DBMS framework and DBMS architecture, and a functional DBMS framework. The framework was developed using a functional approach in which a DBMS is characterized abstractly in terms of functional components and their potential relationships. The approach is based on the notions of modularity and data abstraction as developed in software engineering and programming languages.
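As a rough illustration of the functional approach the abstract describes (characterizing a DBMS abstractly as functional components and their potential relationships), here is a minimal Python sketch; the component names and signatures are invented for illustration, not the paper's actual framework.

```python
from abc import ABC, abstractmethod

# Minimal sketch: a DBMS as functional components whose relationships are
# explicit constructor dependencies. Names and signatures are illustrative only.

class StorageComponent(ABC):
    """Lowest-level function: page storage."""
    @abstractmethod
    def read(self, page_id: int) -> bytes: ...
    @abstractmethod
    def write(self, page_id: int, data: bytes) -> None: ...

class AccessComponent(ABC):
    """Maps logical records to physical pages; depends on a StorageComponent."""
    def __init__(self, storage: StorageComponent):
        self.storage = storage
    @abstractmethod
    def fetch(self, record_key: str) -> dict: ...

class LanguageComponent(ABC):
    """Evaluates database-language requests; depends on an AccessComponent."""
    def __init__(self, access: AccessComponent):
        self.access = access
    @abstractmethod
    def evaluate(self, query: str) -> list[dict]: ...
```

Modularity here means each component can be replaced independently so long as its functional interface is preserved, which is the data-abstraction idea the paper borrows from software engineering and programming languages.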
Proceedings of the 1980 workshop on Data abstraction, databases and conceptual modeling
Distributed computing, and distributed object computing in particular, holds remarkable promise for future Information Systems (ISs) and for more productive collaboration between our vast legacy IS base worldwide. This claim is not new to those who have read research, trade, or vendor literature over the past eight years. GTE has made a significant attempt to benefit from this technology. We have found that it is currently considerably more difficult and less beneficial than the literature or its proponents would have had us believe. This chapter outlines challenges that we and others have faced in attempting to put objects to work on a massive scale. The challenges were confirmed in a worldwide survey that I conducted of over 100 corporations that are attempting to deploy distributed object computing applications based on technologies such as CORBA, DCE, OLE/COM, distributed DBMSs, TP monitors, workflow management systems, and proprietary technologies. Distributed object computing has offered a vision, significant challenges, some progress toward a computing infrastructure, and some benefits. Whereas distributed computing infrastructure and its interoperability are critical, application interoperability is the fundamental challenge to users of distributed computing technology. More than 10 large corporations spend on the order of US$1 billion annually addressing application interoperability. Although application interoperability is claimed to be the objective of distributed computing infrastructures, there has been little progress toward this critical ultimate requirement. This chapter presents a view of distributed object computing from the vantage point of a large organization attempting to deploy it on a large scale. Requirements are presented in a distributed computing framework that is necessarily more comprehensive than anything currently offered by the distributed object computing vendors and proponents. A distributed computing framework is seen as having four parts:
• Distributed and Cooperative Information Systems
• Computing Environment
• Distributed Object Computational Model
• Domain Orientation
Relative to this framework, I outline GTE's approach to distributed object computing, the challenges GTE faces and has faced, why it is so hard, alternative distributed object computing infrastructure technologies, and an estimation of the state of these technologies. I conclude with the basic requirement for industrial-strength, enterprise-wide interoperable "applications." This non-technical requirement has always been a fundamental challenge for software. No, Virginia, there is no distributed object computing, yet.
1 The Challenge. Future computing hardware and software will be scalable, service-oriented, and distributed. That is, computing requirements, on any scale, will be met by combining cooperating computing services that are distributed across computer networks. Distributed object computing (DOC) is a critical component in this long-term view, particularly for Distributed and Cooperative Information Systems (sometimes called CoopISs). The current challenge is to develop an adequate long-term computing vision and a sensible migration toward that vision [BR95, BR96]. This, however, is a technology-centric view.
A more business-oriented, and hence more realistic, restatement might be as follows: to run our businesses efficiently, we would like to deal directly with business processes, not ISs, to define, alter, and execute them. Ideally, business processes would be directly and automatically implemented by the underlying information technology, which we currently refer to as ISs. Business processes cooperate and interact, often in complex ways; hence, ISs must interact correspondingly, and IS cooperation is one of our key current technical challenges.
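A hypothetical sketch of that business-oriented restatement: the business process is defined directly as an ordered list of steps, and a tiny engine dispatches each step to an underlying IS service. All names and the registry API here are invented for illustration, not drawn from the chapter.

```python
from typing import Callable

# Registry standing in for the enterprise's information systems.
services: dict[str, Callable[[dict], dict]] = {
    "credit_check": lambda order: {**order, "credit_ok": order["amount"] < 10_000},
    "provision":    lambda order: {**order, "provisioned": order.get("credit_ok", False)},
    "bill":         lambda order: {**order, "billed": order.get("provisioned", False)},
}

# The business process itself: an ordered list of service names, not IS code.
order_fulfillment = ["credit_check", "provision", "bill"]

def run(process: list[str], case: dict) -> dict:
    """Execute each step of a business process against the IS registry."""
    for step in process:
        case = services[step](case)
    return case

print(run(order_fulfillment, {"amount": 4_200}))
```

The point of the sketch is that altering the business process means editing the step list, not reprogramming the ISs; interoperability between the services is what makes such direct execution possible.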
Features of Relational Database Systems
Springer eBooks, 1983
In his 1981 Turing Award Lecture, E. F. Codd stated that "A data model is, of course, not just a data structure, as many people seem to think. It is natural that the principal data models are named after their principal structures, but that is not the whole story."
Silver bullet shy on legacy mountain: When neat technology just doesn't work - or - Miracles to save the realm: Faustian bargains or noble pursuits
Springer eBooks, 1997
Third-Generation Database System Manifesto - The Committee for Advanced DBMS Function
Discovery Science, 1990
Axiomatic definitions for data model semantics
Information Systems, 1982
The axiomatic method, a widely accepted technique for the precise (formal) definition of programming language semantics, is used to define data model semantics. First, a definition of the term "data model" is developed. The strong relationship between database and programming language concepts is discussed. The nature of data model formalization is described. Based on the experience in programming languages, it is argued that the formal definition of a data model aids database design, database management system implementation, semantic integrity verification and validation, and data model theory. It is further argued that precision must be weighed against understandability, and that a degree of informality can be introduced without loss of precision. Several different formal description techniques and their particular advantages are mentioned. It is argued that, in order to achieve the desired goals, more than one technique be used to develop consistent and complementary formal definitions of a data model. The axiomatic method is described. Axiomatic definitions are particularly appropriate for the design, analysis, and comparison of schemas, transactions, and databases. The axiomatic definition technique is demonstrated in an annotated, precise definition of the semantics of the structural aspects of a semantic data model which is based on the relational data model.
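As a hedged example of the axiomatic style the abstract describes, the following LaTeX fragment states two candidate axioms for a binary association A between entity sets E1 and E2; the notation and the particular constraints are illustrative assumptions, not the paper's actual definition.

```latex
% Illustrative axioms for a binary association A \subseteq E_1 \times E_2
% (invented example, not the paper's semantic data model).
\begin{align*}
  % Membership: every element of A pairs an E_1 entity with an E_2 entity.
  &\forall a \in A \;\, \exists\, e_1 \in E_1,\; e_2 \in E_2 :\; a = (e_1, e_2) \\
  % Functionality: each E_1 entity participates in at most one association.
  &\forall e_1 \in E_1 :\; \bigl|\{\, a \in A \mid \pi_1(a) = e_1 \,\}\bigr| \le 1
\end{align*}
```

Axioms of this kind make it possible to check, schema by schema, whether a transaction preserves the stated constraints, which is the design-and-comparison use the abstract highlights.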
arXiv, 2023
Data science is not a science. It is a research paradigm. Its power, scope, and scale will surpass science, our most powerful research paradigm, to enable knowledge discovery and change our world. We have yet to understand and define it, which is vital to realizing its potential and managing its risks. Modern data science is in its infancy. Emerging slowly since 1962 and rapidly since 2000, it is a fundamentally new field of inquiry, one of the most active, powerful, and rapidly evolving 21st-century innovations. Due to its value, power, and applicability, it is emerging in 40+ disciplines, hundreds of research areas, and thousands of applications. Millions of data science publications contain myriad definitions of data science and of data science problem-solving. Due to this infancy, many definitions are independent, application-specific, mutually incomplete, redundant, or inconsistent; hence, so is data science. This research addresses the challenge of multiple data science definitions by proposing the development of a coherent, unified definition, based on a data science reference framework, using a data science journal through which the data science community can achieve such a definition. This paper provides candidate definitions for the essential data science artifacts that are required to discuss such a definition. They are based on the classical research paradigm concept, consisting of a philosophy of data science, the data science problem-solving paradigm, and the six-component data science reference framework (axiology, ontology, epistemology, methodology, methods, technology), a frequently called-for unifying framework with which to define, unify, and evolve data science. It presents challenges for defining data science and solution approaches, i.e., means for defining data science, together with their requirements and benefits, as the basis of a comprehensive solution.
arXiv, 2023
Data science is not a science. It is a research paradigm with an unfathomed scope, scale, complexity, and power for knowledge discovery that is not otherwise possible and can be beyond human reasoning. It is changing our world practically and profoundly, already widely deployed in tens of thousands of applications in every discipline, in an AI Arms Race that, due to its inscrutability, can lead to unfathomed risks. This paper presents an axiology of data science – its purpose, nature, importance, risks, and value for problem-solving – by exploring and evaluating its remarkable, definitive features. As data science is in its infancy, this initial, speculative axiology is intended to aid in understanding and defining data science so as to recognize its potential benefits, risks, and open research challenges. AI-based data science is inherently about uncertainty, which may be more realistic than our preference for the certainty of science. Data science will have impacts far beyond knowledge discovery and will take us into new ways of understanding the world.
Futurist of the Year, 2024
The state of the AI Revolution as presented at the Futurist of the Year 2024 Congress, Warsaw, Poland, April 9-11, 2024. Observations are based on presenting inscrutable AI-based data science to many brilliant people and responding to their questions, which were naïve from a data science perspective but very sensible from a scientific perspective; this is a challenge faced by most educated people.