Example edits

Content development

Task: Identifying paragraphs and sentences that are correct in terms of language but are unclear or difficult to understand.

  1. “To further generalize the findings to another pattern of interest, such as online-offline price differences among all products in the US sold both online and offline by the same retailers, we would have to assume that the distribution of price differences among the products represented in this data would be very similar in that more general pattern.”
    The sentence is a bit too long and part of it needs clarification.
  2. “Recall that the coefficient on the AR(1) term is estimated to be very strong on the training time series.”
    Explanation is needed for “strong”.
  3. “All of this is an iterative process.”
    Explanation is needed for “all of this”.
  4. “Under lucky circumstances, data analysts may assume that variation in x in observational data that is exogenous, just as if it came from a controlled experiment.”
    The sentence does not have an ending, hence it needs clarification.

Clarify and improve structure

Task: Identifying unclear technical terms: authors use different expressions for a concept and it is not clear whether they mean the same thing.

  1. “The bootstrap is a method…The bootstrap procedures….Since the bootstrap distribution is a good approximation….”
    It can be confusing for the reader if a new definition is introduced by two different expressions on a single page.
  2. “…firms with more employees tend to have larger sales.” “…we see firms with zero or one employees but with very large revenues…”
    Sales and revenues do not mean the same thing, hence the authors need to clarify which one they refer to.
  3. “…we have two slope coefficients and for the second range, we need to add them up.” “…each line segment corresponding to a specific interval of the explanatory variable.”
    Authors need to clarify whether “second range” and “line segment” refer to the same thing.

Ask for more precision or examples

Task: Identifying paragraphs and sentences that are clear in terms of language but need examples.

  1. “Understanding what mechanisms may play a role in the effect of the causal variable is important for various reasons.”
    “Various reasons” is vague, needs an example.
  2. “Instead it is an imperfect measure because there are differences within some countries, especially within large ones with many groups and areas that have different values and/or institutions.”
    Needs country or country group examples.
  3. “In fact, some important economic variables are well approximated by a lognormal distribution.”
    “Some important economic variables” is not well defined, needs either an example or focus.
  4. “Indeed, ARMA models can capture complicated patterns of serial correlation by mixing features of gradual decay with specific values for specific orders of serial correlation.”
    “Mixing features of gradual decay with specific values for specific orders” is difficult to understand as features, values and orders are not specified. Needs examples.

Checking numbers and format

Task: Going through text, tables, figures and other exhibits and checking them for consistency.

  1. “For the last bin [3,7] we chose 3.5km not the midpoint, because the distribution of distance is skewed and the median…”
    But the figure shows the midpoint at 5km.
  2. “Figures 8.2a and 8.2b show four regression lines.”
    But there is only one line per graph.
  3. “x axis shows price in (EUR).”
    But the text says USD.
  4. A table is called “House price prediction models”.
    But the source under the table says “swim-transactions dataset”.
  5. “…it (the model) has all the variables including flags but does not have interactions.”
    But the model does not have all the variables.
  6. “As the figure shows, z is a mechanism of reverse causality if y affects z, that in turn, affects x.”
    But there is no z variable in the figure.
  7. “…people who eat 100 more grams of fruit and vegetables have lower blood pressure…”
    But the descriptive table measures fruit and vegetables as a count, not in grams.
  8. “It produces a model that includes most but not all variables, reducing the number of predictors from 153 to 128.”
    But the table has 134 variables for this model.
  9. “…an assumption is called “homoskedasticity”.”
    But here the authors use quotation marks to introduce a definition instead of the bold type used throughout the book.

Comparing documents

Task: Editing a document that is related to another one, such as a presentation and a paper, or practice questions and their answers.

  1. Compare versions of a graph displayed in a paper and in a presentation to discover inconsistencies (for example, edits made in the paper that were not carried over to the presentation).
  2. Suggest edits to the presentation to make it more concise and fit better on a slide.