Saturday, April 11, 2020

The Fourth Paradigm: Data-Intensive Scientific Discovery (ed. by Tony Hey et al., Microsoft Research, 2009)

This book presents the first broad look at the rapidly emerging field of data-intensive science, with the goal of influencing the worldwide scientific and computing research communities and inspiring the next generation of scientists. Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets. The speed at which any given scientific discipline advances will depend on how well its researchers collaborate with one another, and with technologists, in areas of eScience such as databases, workflow management, visualization, and cloud-computing technologies. This collection of essays expands on the vision of pioneering computer scientist Jim Gray for a new, fourth paradigm of discovery based on data-intensive science and offers insights into how it can be fully realized. (amazon) (accessible via scribd)

https://en.wikipedia.org/wiki/Data_science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data.[1][2] Data science is related to data mining and big data.
Data science is a "concept to unify statistics, data analysis, machine learning and their related methods" in order to "understand and analyze actual phenomena" with data.[3] It employs techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, and information science.
Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational and now data-driven) and asserted that "everything about science is changing because of the impact of information technology" and the data deluge.[4][5]