What I Read – “Thinking with Data” by Max Shron

This book offers excellent suggestions for further reading and good references for some definitions related to scientific argument. Here is what I marked in the book:

[…] There are four parts to a project scope […] the context of the project; the needs that the project is trying to meet; the vision of what success might look like; and finally what the outcome will be, in terms of how the organization will adopt the results and how its effects will be measured down the line […]

[…] A data science need is a problem that can be solved with knowledge, not a lack of a particular tool.

[…] There are three groups of patterns we will explore. The first group of patterns are called categories of disputes, and provide a framework for understanding how to make a coherent argument. The next group of patterns are called general topics, which give general strategies for making arguments. The last group is called special topics, which are the strategies for making arguments specific to working with data […]

[…] A very powerful way to organize our thoughts is by classifying each point of dispute in our argument. A point of dispute is the part of an argument where the audience pushes back, the point where we actually need to make a case to win over the sceptical audience […]

[…] Ancient rhetoricians created a classification system for disputes. It has been adapted by successive generations of rhetoricians to fit modern needs. A point of dispute will fall into one of four categories: fact, definition, value, and policy. […]

[…] Once we have identified what kind of dispute we are dealing with, automatic help arrives in the form of stock issues. Stock issues tell us what we need to demonstrate in order to overcome the point of contention. Once we have classified what kind of thing it is that is under dispute, there are specific subclaims we can demonstrate in order to make our case. If some of the stock issues are already believed by the audience, then we can safely ignore those. Stock issues greatly simplify the process of making a coherent argument. […]

[…] A dispute of fact turns on what is true, or on what has occurred. Such disagreements arise when there are concrete statements that the audience is not likely to believe without an argument. […]

[…] The typical questions of science are disputes of fact. […]

[…] There are thus two stock issues for disputes of fact. They are: What is a reasonable truth condition? Is that truth condition satisfied? […]

[…] Disputes of definition occur when there is a particular way we want to label something, and we expect that that label will be contested. […]

[…] Definitions in a data context are about trying to make precise relationships in an imprecise world. […]

[…] There are three stock issues with disputes of definition: Does this definition make a meaningful distinction? How well does this definition fit with prior ideas? What, if any, are the reasonable alternatives, and why is this one better? We can briefly summarize these as Useful, Consistent, and Best. A good definition should be all three. […]

[…] When we are concerned with judging something, the dispute is one of value. […]

[…] For disputes of value, our two stock issues are: how do our goals determine which values are the most important for this argument? Has the value been properly applied in this situation? […]

[…] Our values are dictated by our goals. Teasing out the implications of that relationship requires an argument. […]

[…] Disputes of policy occur whenever we want to answer the question, “Is this the right course of action?” or “Is this the right way of doing things?” […]

[…] The four stock issues of disputes of policy are: Is there a problem? Where is credit or blame due? Will the proposal solve it? Will it be better on balance? David Zarefsky distils these down into Ill, Blame, Cure, and Cost. […]
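
To keep the four categories and their stock issues straight, I find it helps to write them down as a lookup table. This is my own sketch, not something from the book: the questions are copied from the quotes above, while the names `STOCK_ISSUES` and `remaining_issues` are made up for illustration. The helper drops the issues the audience already accepts, since those can safely be ignored.

```python
# Stock issues per dispute category, copied from the quotes above.
STOCK_ISSUES = {
    "fact": [
        "What is a reasonable truth condition?",
        "Is that truth condition satisfied?",
    ],
    "definition": [
        "Does this definition make a meaningful distinction?",
        "How well does this definition fit with prior ideas?",
        "What, if any, are the reasonable alternatives, and why is this one better?",
    ],
    "value": [
        "How do our goals determine which values are the most important for this argument?",
        "Has the value been properly applied in this situation?",
    ],
    "policy": [
        "Is there a problem?",            # Ill
        "Where is credit or blame due?",  # Blame
        "Will the proposal solve it?",    # Cure
        "Will it be better on balance?",  # Cost
    ],
}


def remaining_issues(category, already_accepted=()):
    """Return the stock issues that still need an argument.

    `already_accepted` lists the issues the audience already believes;
    those are skipped.
    """
    accepted = set(already_accepted)
    return [issue for issue in STOCK_ISSUES[category] if issue not in accepted]


if __name__ == "__main__":
    # Example: the audience already agrees that there is a problem.
    for issue in remaining_issues("policy", ["Is there a problem?"]):
        print(issue)
```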

[…] Discussions about patterns in reasoning often center around what Aristotle called general topics. General topics are patterns of argument that he saw repeatedly applied across every field. These are the “classic” varieties of arguments: specific-to-general, comparison, comparing things by degree, comparing sizes, considering the possible as opposed to the impossible, etc. […]

[…] A specific-to-general argument is one concerned with reasoning from examples in order to make a point about a larger pattern. The justification for such an argument is that specific examples are good examples of the whole. […]

[…] General-to-specific arguments occur when we use beliefs about general patterns to infer results for particular examples. […]

[…] Arguments by analogy come in two flavors: literal and figurative. In a literal analogy, two things are actually of similar types. […]

[…] The justification for argument by analogy is that if the things are alike in some ways, they will be alike in a new way under discussion. […]

[…] In a figurative analogy, we have two things that are not of the same type, but we argue that they should still be alike. […]

[…] behaviour in one domain (math) can be helpful in understanding behaviour in another domain (like the physical world, or human decision-making). Whenever we create mathematical models as an explanation, we are making a figurative analogy. […]

[…] Optimization, bounding cases, and cost/benefit analysis are three special arguments that deserve particular focus. […]

[…] An argument about optimization is an argument that we have figured out the best way to do something, given certain constraints. […]

[…] There are two major ways to make an argument about bounding cases. The first is called sensitivity analysis. In sensitivity analysis, we vary the assumptions to best- or worst-case values and see what the resulting answers look like. […]
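
A minimal sketch of what that could look like in practice. This is my own illustration, not from the book: the toy profit model, the scenario values and the fixed costs are all invented. The point is only to show the mechanics of plugging worst-case, base-case and best-case assumptions into the same calculation and comparing the answers.

```python
def yearly_profit(customers, revenue_per_customer, churn_rate, fixed_costs):
    """Toy model: revenue from the customers who stay, minus fixed costs."""
    retained = customers * (1 - churn_rate)
    return retained * revenue_per_customer - fixed_costs


# Hypothetical assumptions; every number here is invented for illustration.
SCENARIOS = {
    "worst": {"customers": 800, "revenue_per_customer": 90.0, "churn_rate": 0.30},
    "base": {"customers": 1000, "revenue_per_customer": 100.0, "churn_rate": 0.20},
    "best": {"customers": 1200, "revenue_per_customer": 110.0, "churn_rate": 0.10},
}
FIXED_COSTS = 60_000.0


def bounding_cases(scenarios, fixed_costs):
    """Evaluate the model once per scenario and return the results by name."""
    return {
        name: yearly_profit(fixed_costs=fixed_costs, **values)
        for name, values in scenarios.items()
    }


if __name__ == "__main__":
    for name, profit in bounding_cases(SCENARIOS, FIXED_COSTS).items():
        print(f"{name:>5}: {profit:,.0f}")
```

If the conclusion (say, "the project pays for itself") holds in the worst case too, the argument is robust; if it flips between scenarios, that is where the audience will push back.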

[…] A more sophisticated approach to determining bounding cases is through simulation or statistical sensitivity analysis. […]
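
A sketch of the simulation variant, again my own and not from the book. Instead of evaluating only a few hand-picked extremes, each uncertain assumption is drawn at random from a plausible range, and we look at the spread of outcomes. The toy profit model, the ranges and the choice of uniform distributions are all assumptions made for illustration.

```python
import random
import statistics


def yearly_profit(customers, revenue_per_customer, churn_rate, fixed_costs):
    """Toy model: revenue from the customers who stay, minus fixed costs."""
    retained = customers * (1 - churn_rate)
    return retained * revenue_per_customer - fixed_costs


def simulate_profit(n_draws=10_000, seed=0):
    """Draw each assumption from a hypothetical range and collect the profits."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    profits = []
    for _ in range(n_draws):
        profits.append(
            yearly_profit(
                customers=rng.uniform(800, 1200),
                revenue_per_customer=rng.uniform(90.0, 110.0),
                churn_rate=rng.uniform(0.10, 0.30),
                fixed_costs=60_000.0,
            )
        )
    return profits


def summarize(profits):
    """Return the 5th percentile, median, 95th percentile and loss share."""
    cuts = statistics.quantiles(profits, n=20)  # 19 cut points, 5% apart
    return {
        "p5": cuts[0],
        "median": statistics.median(profits),
        "p95": cuts[-1],
        "share_losing_money": sum(p < 0 for p in profits) / len(profits),
    }


if __name__ == "__main__":
    for key, value in summarize(simulate_profit()).items():
        print(f"{key}: {value:,.3f}")
```

The 5th and 95th percentiles play the role of the bounding cases, and the share of draws that lose money says how likely the bad outcome is, which fixed best-case and worst-case values alone cannot tell us.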

[…] In a cost/benefit analysis, each possible outcome from a decision or group of decisions is put in terms of a common unit, like time, money, or lives saved. […]
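
To make the "common unit" idea concrete, here is a small sketch of my own (not from the book). Each option has a cash cost, a cost in staff hours and an expected benefit; staff hours are converted to money with an assumed hourly rate so that everything can be compared on one scale. The options, the numbers and the `HOURLY_RATE` are invented.

```python
from dataclasses import dataclass

HOURLY_RATE = 50.0  # assumption: converts staff time into the common unit (money)


@dataclass
class Option:
    name: str
    cash_cost: float         # money spent directly
    staff_hours: float       # time spent, not yet in the common unit
    expected_benefit: float  # money gained

    def net_benefit(self, hourly_rate=HOURLY_RATE):
        """Benefit minus all costs, with hours converted to money."""
        total_cost = self.cash_cost + self.staff_hours * hourly_rate
        return self.expected_benefit - total_cost


def best_option(options, hourly_rate=HOURLY_RATE):
    """Pick the option with the highest net benefit in the common unit."""
    return max(options, key=lambda option: option.net_benefit(hourly_rate))


# Hypothetical decision: all figures are made up for illustration.
OPTIONS = [
    Option("do nothing", cash_cost=0.0, staff_hours=0.0, expected_benefit=0.0),
    Option("build in-house", cash_cost=2_000.0, staff_hours=400.0, expected_benefit=30_000.0),
    Option("buy tool", cash_cost=15_000.0, staff_hours=40.0, expected_benefit=28_000.0),
]

if __name__ == "__main__":
    for option in OPTIONS:
        print(f"{option.name}: {option.net_benefit():,.0f}")
    print("best:", best_option(OPTIONS).name)
```

Note that the ranking depends on the conversion rate: with a much lower hourly rate, building in-house comes out ahead, so the rate itself is an assumption the audience may dispute.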

[…] The goal of a causal analysis is to find and account for as many confounders as possible, observed and unobserved. In an ideal world, we would know everything we needed in order to pin down which states always preceded others. That knowledge is never available to us, and so we have to avail ourselves of certain ways of grouping and measuring to do the best we can. […]

[…] Data science, as a field, is overly concerned with the technical tools for executing problems and not nearly concerned enough with asking the right questions. […] YEAH🙂

A good start for anyone who wants to dive into the Big Data field.