Good governance requires good policy decisions. These decisions are (or should be) based on a careful analysis of available data and sound judgement. Conversely, faulty analysis is likely to result in bad policy decisions.

The May 2017 Resident Mailer was something of a wake-up call for me, because I realized that critical decisions like level-funding the schools were being justified by analyses that were deeply flawed, to the point of being downright misleading.

Once I was paying attention, I noticed other examples of questionable analysis being used to justify policy decisions. I also realized that in some cases the underlying data was publicly available, so where I could obtain it, I undertook to replicate the council's analyses using rigorous, mathematically sound techniques.

These often produced results quite different from those the council had published.

The council's results were often produced by analyses containing basic flaws that anyone with experience working with data would point out, such as:

- Not using all of the data: As a data analyst, you never have as much data as you would like, so you are reluctant to throw anything out unless you are certain that it is wrong and you have no way of correcting it or adjusting for it. Selectively excluding data without a compelling reason is one of the telltale signs that the person doing the analysis is trying to introduce a bias (this is a form of what is commonly called "cherry picking").
- Two-point trends: If you are trying to estimate a trend and you have, say, 25 data points, you don't start by throwing 23 of them out and only working with the first and last. This produces estimates that are much less stable than those using all of the data available, and since the estimated trend line always fits the two data points perfectly, you lose the ability to assess the "goodness of fit" of the model or detect systematic departures from linearity.
- Use of flawed metrics: Retail businesses analyze trends using what are called "same store sales". This means computing the percentage change individually for each store, then taking the average or median of those individual changes. This is *not* the same as comparing the median of store-by-store sales at the start of the study to the median at the end, which will most likely compare sales at two different stores at two different points in time, and is not meaningful if stores have opened or closed in between, because you cannot tell how much of any measured difference is due to the trend and how much is due to the changed store population. Using same-store sales (like paired t-tests in statistics) ensures that the "subjects" in the before and after samples are the same. The same logic applies to estimating the growth in median house prices or tax bills: new construction does not have the same price distribution as the existing tax base, so the overall median at two different points in time, especially points far apart, does not describe the same population. By using the individual "same house" percentage increases, you ensure that the houses you looked at in one time period are the same ones you looked at in the other.
- Extrapolating exponential growth curves far beyond the data: A constant percentage growth per unit of time gives rise to a type of graph known as an *exponential* curve. These are notorious for producing impossibly large predictions because in real life there are constraints that quickly act to limit growth. If you measured the growth rate of, say, a puppy in the first three months of its life, you might find that its weight doubles every three weeks. If you extrapolated this over the life of the dog, you would end up predicting that it eventually weighs thousands of pounds.

## Examples

**Median tax bill increase from the May 2017 resident mailer**

The mailer claimed the median residential tax bill had increased 51% in five years. Careful analysis using same-house changes revealed that the actual median increase was 13.3%, *less than a third of what was claimed*. You can tell at a glance that the 51% number is wrong, because tax revenue only increased 15.4%, and the school appropriation, the largest single expenditure, only increased 10.6%.

**Twenty Year House Value Study**

The town produced this analysis claiming that property tax bills grew faster than home values over the 20-year period from 1998 to 2017. It's easy to tell that this is false, because if they had, the 2017 tax rate of $24.08 would be higher than the 1998 combined town and fire district tax rate of $25.22, but it is not. The 2018 tax rate was even lower, but 2018 was inexplicably omitted from the study even though the data was available.

**Why did my tax bill go up? (or down?)**

A brief examination of how property taxes are calculated, the factors that cause them to go up and down, and the recent and long-term history of tax rates in East Greenwich. Includes a tax bill history back to 2000, useful for fact-checking anecdotal claims about tax bills.

**Our fiscally irresponsible FY2019 budget**

The council has taken the position that operating costs are rising faster than our ability to pay for them. This analysis examines tax rates over the six most recent revaluation cycles and estimates the growth rate of operating costs to be a smooth trend of 2.8%, or about $1.5 million per year. The council has chosen to effectively freeze the total levy and make up the shortfall with $1 million from reserves. This analysis suggests that $1 million will not be enough to cover expenses. An FY2019 tax rate that generates an additional $1.5 million would still be lower than the FY2018 rate.

**The $100 million levy**

Using a trend estimate based on two points, the council published a flyer stating that it was a "fact" that the total tax levy would reach $100 million in 15 years. This analysis suffered from two of the basic flaws described above: estimating a trend from two points, and extrapolating an exponential curve far into the future. A regression model fit to 30 years of levy data shows that the growth rate of the levy has been slowing in recent years and shows no sign of the constant 4% annual growth required for the council's projection to come true.

**Why I don't believe Ken Block's analysis of the East Greenwich Fire Department data**

At the June 6th East Greenwich Town Council meeting, a public relations campaign was initiated that paints the East Greenwich firefighters as grifters taking advantage of a flawed contract to rack up undeserved overtime payments that threaten to bankrupt the town. As with other information promulgated by this council and town manager, careful analysis suggests that this is not even remotely true.

**Why does Barrington get more state education aid per pupil than East Greenwich?**

Under the RI State Education Aid Funding Formula, Barrington receives more aid per pupil than East Greenwich. This analysis examines the formula and explains why. The main reason is that, according to the criteria used by the formula (assessed real estate value per student, scaled by the ratio of the town's median income to the state median income), *East Greenwich residents can afford property taxes better than Barrington residents*.
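As a toy illustration of the wealth measure described above, here is a sketch with invented numbers; these are not RIDE figures, and the real funding formula has additional components beyond this term. The point is only the direction of the comparison: a higher adjusted wealth per pupil means the formula judges the town more able to fund its schools locally, and therefore entitled to less state aid.

```python
# Toy version of the wealth measure described above: assessed value per pupil,
# scaled by the ratio of the town's median income to the state median income.
# All numbers are invented for illustration; they are not RIDE data.

def adjusted_wealth_per_pupil(assessed_value, pupils, median_income, state_median):
    return (assessed_value / pupils) * (median_income / state_median)

STATE_MEDIAN = 60_000  # hypothetical state median household income

eg  = adjusted_wealth_per_pupil(4.0e9, 2_400, 90_000, STATE_MEDIAN)  # hypothetical
bar = adjusted_wealth_per_pupil(3.5e9, 3_300, 75_000, STATE_MEDIAN)  # hypothetical

print(f"East Greenwich adjusted wealth per pupil: {eg:,.0f}")
print(f"Barrington adjusted wealth per pupil:     {bar:,.0f}")
# With these invented inputs, East Greenwich comes out wealthier per pupil,
# which is the direction that produces less aid under the formula.
```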

**RIDE Uniform Chart of Accounts data**

The Rhode Island Department of Education requires every school district (technically, every "LEA" or Local Education Agency, a more general term that includes things like stand-alone charter schools) to submit detailed information on its revenues and expenditures in a standard format called the Uniform Chart of Accounts (UCOA). This is documented in the UCOA Accounting Manual. I have been putting together a Jupyter notebook that reads the UCOA database, for use in ad hoc analyses of school department finances from previous years. The time period covered runs from the 2009-2010 school year through the 2016-2017 school year.

**RIDE In$ite data**

Prior to the implementation of UCOA, the Rhode Island Department of Education used a financial analysis model called In$ite. This data is a bit harder to work with because it is stored in Microsoft Access database files. In$ite data is archived for the 2001-2002 school year through the 2008-2009 school year, and the system is documented in the In$ite Handbook. I was able to extract the individual tables and have been putting together a Jupyter notebook that reads the In$ite data, for use in ad hoc analyses. I may eventually load the data into a full-featured relational database like MariaDB.
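As a sketch of the kind of ad hoc analysis these notebooks are meant to support, here is a hypothetical miniature of one extracted table; the column names and dollar amounts are made up (the real field names come from the In$ite Handbook), and the pivot is done with pandas, the natural tool in a Jupyter notebook.

```python
import pandas as pd

# Hypothetical miniature of one extracted In$ite table. Column names and
# amounts are invented for illustration; the real schema differs.
rows = pd.DataFrame({
    "year":     ["2001-02", "2001-02", "2002-03", "2002-03"],
    "function": ["Instruction", "Operations", "Instruction", "Operations"],
    "amount":   [18_500_000, 4_200_000, 19_300_000, 4_400_000],
})

# A typical ad hoc question: how did spending by function change year over year?
by_function = rows.pivot_table(index="function", columns="year",
                               values="amount", aggfunc="sum")
by_function["pct_change"] = by_function["2002-03"] / by_function["2001-02"] - 1
print(by_function)
```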