STATE OF FLORIDA
DIVISION OF ADMINISTRATIVE HEARINGS
----------------------------------------------------------
SUGAR CANE GROWERS COOPERATIVE OF FLORIDA, et al.,
and
FLORIDA SUGAR CANE LEAGUE, INC., et al.,
and
FLORIDA FRUIT AND VEGETABLE ASSOCIATION, LEWIS POPE FARMS, et al.,
Petitioners,
vs.
SOUTH FLORIDA WATER MANAGEMENT DISTRICT,
Respondent,
and
MICCOSUKEE TRIBE OF INDIANS OF FLORIDA, the UNITED STATES OF AMERICA, et al.,
Intervenors.
Nos. 92-3038, 92-3039, 92-3040
----------------------------------------------------------
Deposition Upon Oral Examination Of
STEVEN P. MILLARD
Volume 2, Pages 103 - 159
Taken at 800 Fifth Avenue, Suite 3600, Seattle, WA
DATE: March 10, 1993
REPORTED BY: Joanne Leatiota, RPR CSR LEATIJL477Q5
LARSEN & SMITH, INC., 1325 4TH AVE, SEATTLE, WA 98101 (206)623-6717

APPEARANCES:
For the United States: THOMAS A.W. FITZGERALD, ESQ., Assistant United States Attorney, 155 South Miami Avenue, Miami, Florida 33130
For the Florida Sugar Cane League: ROBERT H. BLANK, ESQ., Peeples, Earl & Blank, One Biscayne Tower, Suite 3636, Two South Biscayne Boulevard, Miami, Florida 33131

DEPOSITION OF STEVEN P. MILLARD, V.2; taken 3-10-93
E X H I B I T S
NO.   DESCRIPTION                          PAGE
9     Table 3                              110
10    Loxahatchee Total Phosphorus table   133
11    Loxahatchee Total Phosphorus table   133
12    Possible Outcomes table              143

E X A M I N A T I O N
BY                 PAGES
MR. FITZGERALD     106 - 157

Seattle, Washington; Wednesday, March 10, 1993
9:00 A.M.
--------------------------
STEVEN P. MILLARD, witness herein, having been previously sworn by the Notary, testified as follows:
E X A M I N A T I O N (Resumed)
BY MR. FITZGERALD:
Q. Doctor, you are still under oath from yesterday.
After we left yesterday did you take any further steps or measures to review documents or otherwise prepare further for your deposition?
A. I glanced at some material last night actually on the computer in response to a question that counsel had for me yesterday.
Q. Was any of that material the diagnostics that you were talking about yesterday?
A. No.
Q. In the bootstrap sampling program you employed, how many base period dates will normally be used in each run of the 14 dates?
A. In each bootstrap sample?
Q. Yes.
A. The way that the bootstrap sampling works, the way that my program works, each date had an equal chance of being chosen, so there may have been times when almost all of the 14 observations -- actually all came from the baseline period and there may have been bootstrap samples where hardly any came. It's even possible that none would come from the baseline period, but in that case -- I haven't thought about this. I'll have to think about it. In that case, if you had a bootstrap sample where none of the observations were in the baseline period and you were using baseline period as a predictor variable in the model, you probably would have a problem with what's called aliasing in the predictor matrix and that that would result in -- I didn't allow that to happen the way that I wrote the program. So it would result in a problem called aliasing or singularity in the predictor matrix.
Q. How did you eliminate that possibility in your program?
A. I checked for it in my program before proceeding with a fit.
Q. In how many of your thousand runs, then, would you have had none of the base period dates amongst the 14 in a particular resampling run?
A. Assuming that what I am saying is correct in that if you didn't have any period -- any dates in the baseline period, then you would have a problem with singularity, then none of the thousand runs would have a situation where none of the dates in the baseline period were chosen.
Q. If you started your bootstrap runs one of the thousand runs, for example, the very first one using only five base period sampling dates, shouldn't you have fixed for that number throughout the run to avoid exaggerating the variance?
A. It depends on the purpose of the analysis.
Q. What was the purpose of your analysis?
A. The purpose of the analysis was to get an idea of the variability in the estimate of the limits, and it may be -- if I think about it for a while it may be that I might want to go back and somehow think about how I want to handle the problem of possibly not having very many observations from the baseline period.
Q. What happens to variation or the observed variation as a result of the bootstrap analysis if the number of base period sampling dates is allowed to fall towards zero in each run?
A. I can't really give you a definitive answer at this point. It simply means that if you have fewer observations in the baseline period, you have more observations in the post-baseline period, and the way that the predictor variable relating to baseline period was constructed, it basically told you whether you were in the baseline period or not in the baseline period, so you had -- it basically allowed you to have a separate average for those two periods.
And if you have fewer observations in the baseline period compared to more observations in the post-baseline period, that simply means you are getting a better estimate of the mean in the post-baseline period as compared to in the baseline period.
Q. Doesn't that in fact mean that in your effort -- as you said, the goal of your program was to analyze the variation or variances that might arise, doesn't in fact the way you ran the program if it allowed your base year date observations in a 14 point sampling run to fall towards zero that that would act to inflate or exaggerate the variance observed as a result of the run?
A. The variance of?
Q. The variance in the limits. It would overstate the variance. It would exaggerate the variance and overstate the apparent error, if I can use the term error. Do you understand what I mean?
A. Yes. I don't think that I can agree with that statement. I am trying to think of a simple way of explaining why I don't agree with that statement, but at this point I can't.
MR. FITZGERALD: Mark this Exhibit-9.
(Exhibit-9 marked.)
Q. Doctor, showing you what's been marked as deposition Exhibit-9 captioned "Table 3: Frequency distribution of number of base year dates in the (second) N equals 1000 plus 2 asterisk bootstrapped regressions." Can you look at this and tell me if this would agree with your understanding of the distribution --
MR. BLANK: Counsel, I would object to the witness answering any questions based on this document until it's properly identified.
MR. FITZGERALD: It has been identified. What did you desire, Counsel?
MR. BLANK: Who prepared the document?
MR. FITZGERALD: You can ask that at your deposition. I am showing the witness a document and I am going to question him about it. The source of it is immaterial.
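
The exhibit caption above describes a frequency distribution of the number of base year dates appearing in each bootstrapped regression. A minimal sketch of how such a table could be computed follows. It rests on assumptions that are not established by the record at this point: that 5 of the 14 sampling dates fall in the baseline period (the count is not fixed in the testimony), that each of the 14 draws is made independently with replacement and with equal probability, and that 1,000 bootstrap samples are drawn. Under those assumptions the number of baseline dates in a sample is binomial, and the expected count of samples is the probability multiplied by the number of samples. The function and parameter names are hypothetical and are not taken from the witness's program or from the exhibit.

```python
from math import comb


def baseline_count_distribution(n_dates=14, n_baseline=5, n_boot=1000):
    """Binomial distribution of baseline dates in one bootstrap sample.

    Assumed values (not confirmed by the record): n_baseline=5, n_boot=1000.
    Returns (probs, expected), where probs[k] is the probability that a
    sample of n_dates draws contains exactly k baseline dates, and
    expected[k] = n_boot * probs[k] is the expected number of such samples.
    """
    p = n_baseline / n_dates  # chance that any single draw is a baseline date
    probs = [
        comb(n_dates, k) * p**k * (1 - p) ** (n_dates - k)
        for k in range(n_dates + 1)
    ]
    expected = [n_boot * q for q in probs]
    return probs, expected


probs, expected = baseline_count_distribution()
for k, (q, e) in enumerate(zip(probs, expected)):
    print(f"{k:2d} baseline dates: probability {q:.4f}, expected samples {e:7.2f}")
```

The sketch does not drop samples with zero baseline dates; a program that rejects such samples before fitting would redistribute that small probability over the remaining outcomes.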
Q. Can you tell me whether the distribution indicated here from the base year dates and the 14 randomized choices from your bootstrap sampling runs is accurate?
MR. BLANK: I repeat the same objection. You can answer, Doctor.
A. I can't tell you for sure whether it's accurate without sitting down and actually performing the calculations myself. I believe if I understand the table the -- what the table is showing is assuming you have a certain number of dates in the baseline year -- in the baseline period, excuse me, which I can't remember if that was five or six, I believe somewhere around five or six, and the rest of the observations are in the post-baseline period for a total of 14 sampling dates, then the question is if you sample those dates at random with replacement, what is the probability of obtaining a sample with a specific number of dates from the baseline period. So the first column in this table is showing the possible number of dates from the baseline period --
Q. Can I ask a question about that before you lose me completely. You said yesterday, if I recall correctly, that because of the randomness and the substitution aspects of your bootstrap program you could have anywhere from zero to 14 because you could have replicated -- duplicated dates, the same date you said in theory could show up 14 times in the single bootstrap.
A. That's correct, because you're sampling with replacement.
Q. So your potential field in any bootstrap run could be zero to 14 of the base year dates because it could be none or it could be all, 14 could be the same or it could be different.
A. That's correct, that's correct.
Q. Please continue.
A. So the first column is listing the possible number of dates that you have from the baseline period.
The second -- the second column I am not sure where that came from. I would guess --
MR. BLANK: Don't guess, please.
A. Okay, I don't know -- I am not sure where that second column came from.
The third column -- well, I would have to speculate on the third column.
MR. BLANK: I would instruct you not to speculate.
Q. Let's see if we can do away with a little of the speculation. Statistically speaking, in any distribution using a randomized program where you are going to pull 14 variables and plug it into a program like bootstrap, you know in theory and can calculate, can you not, what the theoretical distribution of those inserted observations are, can't you? You could do that.
A. You can calculate -- I am sorry, can you repeat the statement?
Q. If you conduct a hundred bootstrap runs and you know you have 14 observations that will be plugged in based on a randomized replacement program, it's a relatively mechanical effort to calculate the distribution curve for those factors, isn't it?
A. Calculate --
Q. Let me clarify the question. I see where the problem is. You could calculate, could you not, the theoretical distribution of each run to tell me what the theoretical inclusion in those runs would be of a given number of base year dates.
A. If you will allow me to rephrase what you said, my understanding is for a given number of bootstrap samples, say a hundred, you can calculate the probability of observing a bootstrap sample that had a specific number of dates from the baseline period.
Q. And if it were, for example, you wanted to know what the probability would be in a run population of N equal to a thousand of ten base year dates appearing among the 14 random choices, you could calculate that out as well at any given level?
A. That's correct.
Q. The theoretical binomial BIN expected number would be precisely that, would it not?
A. No, it would be the probability. It would be the probability multiplied by the number of bootstrap samples.
Q. So the more bootstrap samples you run you would expect to see the theoretical number and the true number approach each other, would you not? If you ran an infinite number of bootstrap runs, your theoretical should approach your actual observed an infinite number of runs; isn't that the essence of probability?
A. Yeah, you are confusing two quantities here. You are confusing the expected number with the probability.
Q. The expected number I recognize is not expressed as a percentage. It's translated into an actual number, but you can derive a percent probability from that as well. You could express it that way instead.
A. As long as you know the number of bootstraps because -- yes, because what you're calling the expected number I believe is the number of bootstraps times the probability of seeing that particular outcome.
Q. So in fact if this distribution has been correctly calculated, in more than half of the thousand runs you would expect to have five or less base year dates among the 14 random choices?
A. If the expected numbers for five or less add up to something more than 500, yes, I would agree with that.
Q. Do you need a calculator? Can we agree that 218 plus 196, 128, 57, 16 and 2 exceeds 500?
A. Yes, I would agree with that. However, including the outcome of zero dates from the baseline year does not apply to my program.
Q. Which would drop two such bootstrap scenarios or theoretical expected number of 2.06 runs if this calculation is accurate?
A. It would drop those possibilities if they came up during the computation, yes.
Q. You indicated that if you had not corrected for that you would have had a singularity problem in such runs in analyzing --
A. I believe that's correct.
Q. Doesn't the same effect to a lesser degree emerge any time you have a bootstrap run with a relatively low number of base year dates included in the resampling effort?
A. I don't believe so. I'll have to think about it, but in layperson's terms, going from the situation of having at least one date in the baseline period to more versus having no dates in the baseline period is a qualitative difference, not just necessarily a trend difference in terms of thinking about the idea of singularity.
Q. Sorry, I don't think you answered the question. If I understand what you said, you said that it is not just a qualitative, it's a quantitative difference or the other way around when you go from zero to, say, one or two base year dates?
A. In layperson terms it's a qualitative difference, not just a quantitative difference.
Q. I understand that. But the effect is still present, is it not?
A. The effect is meaning -- can you tell me what -- tell me again what you mean by the phrase effect. Effect is?
Q. Is to generate an exaggeration of the variance in comparison to Dr. Walker's work, or to put it another way in the terms you were using, you will be moving towards the same singularity problem you have or would have had if you had allowed a bootstrap with zero base year dates. I understand there is a quantitative and qualitative difference between zero and one or two base year dates, but if you go down to an extremely low number of base year dates in your bootstrap run of 14 resampling with the program as you have described it, you still have the effect of exaggerating the variance.
A. I can't agree with your phrase saying it exaggerates the variance. The point of the bootstrap is to try to get a feel for the actual variance in the statistics that you compute and to make this -- to give a simpler example, suppose that there's just one predictor variable in the model just average stage. You could have a bootstrap sample where you had only two or three unique values of average stage rather than I assume that all 14 sampling dates had slightly different values of average stage. So when you run the bootstrap in that case, you could end up with a bootstrap sample where you have very few unique values of average stage for that particular bootstrap sample, and I assert that that is a legitimate bootstrap sample and a legitimate way of -- in that example a legitimate way of getting a feel for the variability in your model, whatever statistic you are trying to estimate based on the model that just has that one predictor value of average stage, so it doesn't really matter when you are talking about the predictor value of average stage or the predictor variable baseline period. That same sort of situation can still arise.
Q. What happens, though, if you are constrained to use that baseline period, that you have no choice in that, that you are constrained to employ the baseline period by external forces whatever they may be? Would you not want to consider that in analyzing the meaning of the variance that your bootstrap program generates?
A. As I said earlier, I may after thinking about it somewhat come to the conclusion that there is another way to do the bootstrap as well as this way.
Q. Did you understand from your review of the SWIM Plan portion of the planning document that's in evidence as I think number 3 or 4 and Appendix E that there was a constraint on the modelers or drafters of the SWIM Plan and on Dr. Walker as well to use that base year date because of statutory compulsion?
A. I understand why the variable, that the predictor variable in the model that tells you whether you are in the baseline period or not, why that was included in the model.
Q. What was your understanding of the reason for its inclusion in the model?
A. The reason that that variable -- my understanding is the reason that variable was included in the model is because Dr. Walker wanted to use data that was not in the designated baseline period. He wanted to use the information in that data to help get a better estimate of the relationship between total phosphorus and stage in the baseline period.
Q. In your estimation is that a reasonable way to do it?
A. It's one possible way of doing it, but there are several other models that could be considered in order to try to use the information that's in the post-baseline period to help you estimate relationships in the baseline period.
Q. Do you know what other models Dr. Walker in fact considered?
A. I do not.
Q. Do you know what information was provided him by limnologists or environmental scientists that may have led him to conclude that the mechanism chosen was the most effective under the circumstances attendant in the Loxahatchee National Wildlife Refuge?
A. I am sorry, the first part of the question was what?
Q. Do you know what information may have been available to him --
A. No, I do not.
Q. -- from environmental sources? You would agree, would you not, that such information might well drive one to choose the model in fact employed in his paper or the model employed in the district's Appendix E?
A. Certainly information from environmental scientists would be used by a statistician in constructing a model.
Q. Can you explain to me how, if at all, this effort contributes to defining the uncertainty in the 90th percentile adjusting for stage?
A. How which effort?
Q. Your bootstrapping effort to determine variance.
A. How --
Q. If at all.
A. How it -- I am sorry, could you repeat the question, please?
Q. Was part of this effort to help you define the uncertainty in that 90th percentile test or trigger that is carried forward in Dr. Walker's May '91 document and in the district's Appendix E although it rated differently? The 90 percentile test is in both, or level.
A. Right.
Q. How did your work or how does your work relate to defining the uncertainty in that level?
A. My report gives a table -- my report has a table in it listing for a given average stage -- I am now looking at my report Exhibit-No.-8, page 6, the bottom of the page, Table 2, Estimated 5th and 95th Percentiles of Target and Interim Limits.
This table 10 was produced based on the results of the bootstrap, and 11 what this table shows is that for a particular given 12 value of stage, the computed target limit and the 13 computed interim limit could quite possibly have had a 14 wide range of values if, for example, slightly 15 different sampling dates had been used. 16 Q. What you are saying is if you statistically 17 construct a new set of sampling dates with samples 18 drawn at random from throughout the population of 19 observations, you may come up with different 20 concentration limits and targets. 21 A. That's correct. 22 Q. Isn't that self-evident? 23 A. That's correct, but the -- well, it may be 24 self-evident to some people or most people who are 25 statistically knowledgeable. The point of my report is LARSEN & SMITH, INC., 1325 4TH AVE, SEATTLE, WA 98101 (206)623-6717 0122 STEVEN P. MILLARD, V.2, 3-10-93 1 that there is a certain amount of variability 2 associated with any point estimate. In this case the 3 target limit is a point estimate, the interim limit is 4 a point estimate. 5 Q. In the column table 2 that you have been 6 referring to, the second group of numbers under Target 7 Limit, you have a set of numbers for the four stages, 8 then you have parentheticals with two sets of numbers 9 in each parenthetical. Can you explain the meaning of 10 that and why the two sets of numbers in parentheticals 11 seem to bracket the number outside? 12 A. The two sets of numbers in parentheses are 13 the 5th and 95th percentiles of the bootstrap 14 distribution of the quantity. So, for example, for a 15 given average stage of 15.5 for the target limit -- 16 Q. That's 12.8? 17 A. That's correct. The computed target limit 18 based on Dr. Walker's model is 12.8, and based on the 19 bootstrap distribution that I generated the 5th and 20 95th percentiles of that target limit are respectively 21 8.7 and 18.8. 22 Q. 
Does that mean, then, that based on your 23 bootstrap analysis you show essentially that in five 24 percent or less of the time you might expect a target 25 as low as 8.7 parts per billion? LARSEN & SMITH, INC., 1325 4TH AVE, SEATTLE, WA 98101 (206)623-6717 0123 STEVEN P. MILLARD, V.2, 3-10-93 1 A. That's correct, you could interpret it that 2 way. 3 Q. What does the 95th percentile at 18.8 mean in 4 similar layman's terms if you can? 5 A. In similar layman's terms you would expect 6 that if you repeated the same experiment several times, 7 95 percent of the time the target limit for average 8 stage equal 15.5 would be at 18.8 or less. 9 Q. What is the shape of the distribution curve, 10 then, for that range based on your bootstrap between 11 the 5th percentile and the 95th percentile? That 12 defines the distribution curve, doesn't it, or you can 13 define the distribution curve from that effort? 14 A. You can't define a distribution curve simply 15 from two percentiles. 16 Q. But those are derived, are they not, from the 17 thousand bootstrap runs? 18 A. That's correct. 19 Q. So you have a lot of data points? 20 A. That's correct. 21 Q. What would that curve look like? 22 A. The way that curve looks -- what that curve 23 looks like is shown in the figures in my report. 24 Q. Figure 1a? 25 A. Actually for the particular example we were LARSEN & SMITH, INC., 1325 4TH AVE, SEATTLE, WA 98101 (206)623-6717 0124 STEVEN P. MILLARD, V.2, 3-10-93 1 talking about the -- 2 Q. We were looking at average stage of 15.5 of 3 the target limit of 12.8. 4 A. Yes, then figure 1a corresponds to that 5 information in the table. 6 Q. Do you recall writing an article titled 7 "Environmental Monitoring, Statistics and the Law: Room 8 for Improvement," published in November of 1987? 9 A. Yes, I do. 10 Q. Were you talking about lawyers when you said 11 that a lot of people who have been involved in this 12 field in monitoring programs had a superficial 13 knowledge of statistics? 14 A. 
I don't believe I had attorneys in mind when 15 I made that statement. I think I was thinking more of 16 actually private consultants who have been called in to 17 do work. 18 Q. We probably could reach a stipulation on that 19 anyway. I couldn't resist it. I happened to notice it 20 when I read it. It seemed apropos at this point. 21 Based on the analysis you conducted, if 22 during the future sampling as part of the monitoring 23 and compliance program the water quality in the refuge 24 is precisely the same as during the base period at any 25 given sampling stage, what is the probability that the LARSEN & SMITH, INC., 1325 4TH AVE, SEATTLE, WA 98101 (206)623-6717 0125 STEVEN P. MILLARD, V.2, 3-10-93 1 geometric mean total phosphorus concentration will on 2 any given date exceed the limits calculated in Appendix 3 E of the SWIM Plan? 4 A. I can't make that statement with certainty 5 unless I am willing to make the assumption that the 6 model calculated in Appendix E is in fact a true 7 reflection of the system. In other words, that the 8 particular predictor variables that they used in the 9 model are the correct ones and do a good job of 10 reflecting the variable in the system. 11 Q. If I understood your testimony yesterday, 12 nothing that you have done so far and in none of the 13 data available to you suggests that that is not in fact 14 the correct model? 15 A. Nothing that I have done personally to this 16 point would lead me to believe that that model is 17 either good or bad. 18 Q. Now, I had specifically asked about the 19 Appendix E model, but is the same answer or would the 20 same answer apply if I limited it to Dr. Walker's May 21 '91 articulation of the model? 22 A. Yes, my answer applies to both Dr. Walker's 23 model and the model in Appendix E. 24 Q. If you accept for the purpose of these 25 questions that the model is inaccurate and appropriate LARSEN & SMITH, INC., 1325 4TH AVE, SEATTLE, WA 98101 (206)623-6717 0126 STEVEN P. 
MILLARD, V.2, 3-10-93 1 description of the echo system response to stage and 2 stage being the appropriate variable, would your 3 analysis thus far lead you to believe that under those 4 circumstances the variation would be greater than ten 5 percent? 6 MR. BLANK: Object to the question as vague 7 and ambiguous. 8 If you understand it you can answer, Doctor. 9 It might help if you clarified a little bit, 10 Counsel. I am confused as to which model you are 11 referring to now. Appendix E or Dr. Walker's model? 12 Q. Well, I'd take either, but based on Dr. 13 Millard's testimony I think it's pretty clear that he's 14 only analyzed Dr. Walker's articulation of a model in 15 his May '91 paper, that he has not conducted any 16 analysis thus far of the actual model embodied in the 17 SWIM Plan that's being challenged here. So I think we 18 have to limit it to Dr. Walker. 19 Unless you feel you can -- 20 MR. BLANK: I think Dr. Millard's testimony 21 yesterday was that his bootstrap analysis assumed that 22 the model being used by Dr. Walker was the correct or a 23 valid model. He hasn't analyzed the model itself to 24 make that determination. He simply assumed it. 25 MR. FITZGERALD: And I am asking him to LARSEN & SMITH, INC., 1325 4TH AVE, SEATTLE, WA 98101 (206)623-6717 0127 STEVEN P. MILLARD, V.2, 3-10-93 1 continue that assumption. 2 Q. With that assumption, given the parameters I 3 described that at the time of whatever future sampling 4 program the water quality in Loxahatchee is the same as 5 during the base period, what is the probability that 6 the geometric mean TP concentration will on any given 7 date of sampling exceed the calculated limits 8 calculated in accordance with Dr. Walker's model which 9 is what you have in table 2 for target interim limits, 10 right? That's the outside number, those are Dr. 11 Walker's limits? 12 A. That's correct. 13 Q. And I have asked you to continue your 14 assumption that the model is valid. 15 A. 
Assuming that the model is valid and there's 16 not any problem with the data that were used to build 17 that model and there hasn't been any change in the 18 sampling procedure or the procedure used to analyze the 19 samples, I think those are enough caveats, then the 20 interim limit given by Dr. Walker should be exceeded on 21 average about ten percent of the time. 22 Q. Saves us a lot of questions. Those caveats I 23 found interesting; you have been mentioning some of 24 them over the day or so we have been here. Don't those 25 same caveats apply to your work if in fact there are LARSEN & SMITH, INC., 1325 4TH AVE, SEATTLE, WA 98101 (206)623-6717 0128 STEVEN P. MILLARD, V.2, 3-10-93 1 major discrepancies, if you will, or problems in the 2 data collection, the sample analysis, the stage data 3 information, or the relationship between concentration 4 and stage in Loxahatchee, your work doesn't do us any 5 good either because it's all predicated on that data? 6 A. That's correct. My work is all predicated on 7 those four caveats that I mentioned, well, the first 8 three talking about the integrity of the data, the 9 integrity of the lab, the analytical procedure that the 10 lab has used, and the assumption that the model that 11 Dr. Walker used is correct. 12 Q. The same caveats would apply, would they not, 13 if you were analyzing the district's Appendix E model? 14 A. That's true. 15 Q. In reviewing the original data to prepare to 16 do your bootstrap run, did you observe a seasonal 17 sequencing effect in the data about every two months 18 independent of the stage? 19 A. I didn't look at the data closely, the raw 20 data closely enough. I didn't spend much time looking 21 at it before I simply assumed that the model that Dr. 22 Walker used was correct and went on with my bootstrap 23 analysis. 24 Q. Do you understand what I mean by a sequencing 25 effect in the data? LARSEN & SMITH, INC., 1325 4TH AVE, SEATTLE, WA 98101 (206)623-6717 0129 STEVEN P. 
MILLARD, V.2, 3-10-93 1 A. No, I don't. 2 Q. Does your bootstrap program have any capacity 3 for dealing or accounting for a seasonal sequencing 4 effect? 5 A. I don't know what you mean by a seasonal 6 sequencing effect. 7 Q. If an additional body of data became 8 available of analysis of observations and sampling 9 efforts in Loxahatchee at the same 14 sites or 10 additional sites that would be incorporated in the 11 analysis and those values at given stages fall within 12 the confidence interval of Dr. Walker's or the 13 district's model, would that increase your confidence 14 or increase our justification in relying on the model 15 as constructed? 16 A. I don't feel it would be compelling evidence 17 that the model is necessarily correct. Sorry, I don't 18 feel it would be compelling evidence that the model 19 used is necessarily the best or close to the best that 20 could be used given other data that's available. 21 Q. What other data? 22 A. For example, input from the stages, from the 23 structures S-5A and S-6 or looking at other terms in 24 the model such as allowing for rather than a step 25 change in time, a linear trend in time or a quadratic LARSEN & SMITH, INC., 1325 4TH AVE, SEATTLE, WA 98101 (206)623-6717 0130 STEVEN P. MILLARD, V.2, 3-10-93 1 effect in time. 2 Q. Who's explained to you the hydrological 3 implications of water releases from 5A and 6 on the 4 interior of the marsh at Loxahatchee? 5 A. I have talked with both Dr. Lettenmaier and 6 counsel about some of the issues involved just in a 7 very general sense. 8 Q. Do you have a sense of the loading rates from 9 5A and 6, the concentrations in the water that is 10 discharged surface water into the refuge and where that 11 water goes? 12 A. No, not -- I don't have a very good knowledge 13 at this point in time of those quantities. 14 Q. Do you know what portion of the water budget 15 of the refuge is accounted for by those two structures? 16 A. No, I don't. 17 Q. 
Do you know what the phosphorus input from rainfall is in that area?
A. No, I don't.
Q. What observational data would you require in order to feel comfortable in employing the Appendix E or the Walker model if the data derived in the future from sampling programs in fact bears a high correlation to the predicted targets and limits generated by those models?
A. What I am having trouble with in your line of questioning is looking at future values of geometric mean and seeing whether they fall in the range of the limits. I would be much more comfortable actually analyzing -- taking the whole data set and looking at a similar model to the one that was derived earlier, either just with that data or perhaps combining the two data sets or some other things. That would be my -- one of my ways of going about trying to verify the validity of the original model either proposed by Dr. Walker or proposed by in Appendix E.
Q. Let me see if I understand that. If you had, say, another five years of data analysis and the caveats about how it's done and all that are not a factor, you could take that five-year period, take Dr. Walker's or the district's five years of data during the period of record and do a model that accounts -- that then has a ten-year period of record, and you would then compare that to the earlier models and if the correlation approached unity you would say hey, that was a pretty good model, the earlier one, or it still accounts for the observed reality?
A. If there wasn't a lot of discrepancy or a lot of changes in the coefficients in the model after somehow incorporating the new data, whether it means
incorporating the new data with just the baseline period or the original baseline period and post-baseline period and the new data, however that's done after considering several ways of doing it, if there's not a large change in the coefficients then I would feel much more confident in the validity of the model proposed by Dr. Walker or the one in Appendix E, whichever seems to do a better job.
Q. As a result of your work did you find any bias in the limits for the refuge?
A. In order to talk about bias, I would need to understand better what kind of bias you are looking for or talking about.
Q. You defined bias for me yesterday, so let's use your definition. You said it means how far away an estimate is from the true population value.
A. At this point I would say I haven't found a bias in whatever sense you want to talk about it, meaning whatever kind of population value you think you are comparing whatever estimate to. The reason being I have not -- I have not done any analysis specifically looking for that sort of thing.
Q. You expected in your analysis to find some variance, didn't you?
A. Yes, but variance is a different concept from bias.
Q. Exactly. In fact we should not be overly concerned that there is some variance since the bootstrap mechanism is designed to test variance and generate variance for comparison purposes.
A. It's not up to me to decide which is more important, variance or bias. In the statistical field there are actually several different ways of estimating quantities. Some ways concentrate more on reducing bias and some ways concentrate more on reducing variance and some ways try to balance the two.
(Exhibits-10 and 11 marked.)
Q.
Doctor, Exhibit-10 is a set of data as is Exhibit-11, analysis results of water quality sampling on two separate dates. For the sake of the questions, Exhibit-10 would be call it the December data and Exhibit-11 the January data. If I were to provide you stage level data from 1-7, 1-8T and 1-9, do you have the capacity to determine based on your familiarity with Dr. Walker's method whether or not the results of the analysis fall within the 90 percentile range of the target and limits generated according to his model for the Loxahatchee refuge?
A. Yes, I have the capacity to do that. Not here.
Q. If you look back at your report, Exhibit-8, page 6 -- I see the problem. Is your problem that you didn't bring your calculator and so you cannot calculate the geometric mean concentration?
A. That's correct.
Q. Is this one not good enough? I am going to make it easier. Let's do it this way. For the sake of this hypothetical question, for the December data, Exhibit-11, assume that --
MR. BLANK: I thought we marked 10 as the December.
MR. FITZGERALD: You are correct, it's 10 and 11. I had different numbers on my --
MR. BLANK: Do you have extra copies of this, Counsel?
MR. FITZGERALD: It was the only set I was able to get in time that was clean, the others were chopped up, but I'll make a copy today for you, Counsel. I can represent, Counsel, that your firm has it. I provided it to Mr. Burgess -- when did we come here, Tuesday? -- Monday.
Q. For the sake of this question, assume that the 14 station geometric mean for Exhibit-10 is 8.89 parts per billion and the arithmetic mean of the stage data for the three stage stations --
A. I am sorry, could you repeat?
Q.
The geometric mean is 8.89.
A. That's for the 14 stations for 3 through 16?
Q. Yes. With data available from all points as you can see from what you have. Your station 3 should be 7.75, 4 should be 34.41, 5 should be 8.68, et cetera down through 16, which is 7.44. Your 14 station geometric mean I would suggest or ask you to assume calculates at 8.89 and your stage mean, which is the arithmetic mean, would be 16.89 based on the three stages 7, 8 and 9 being 16.95, 16.86 and 16.87. Actually those you probably could do on my little calculator. If I can ask you -- since it makes more sense to compare this to Appendix E rather than to Dr. Walker, if I can invite your attention to the exhibit on Appendix E which is Exhibit-6. Can you then tell me for that stage level, that mean stage, whether the geometric mean of the 14 stations exceeds the limit calculated in accordance with the formula in Appendix E?
A. Well, in the table on page E-21 of Exhibit-6 is a listing of the limits that's labeled -- the last column is labeled "Ten Percent Rejection Limit," and those are given for various values of average stage. The exact value 16.89 is not one of those listed. However, as stage increases that rejection limit decreases. So it would be possible to tell -- let's see, 16.89. There is a value -- the fourth value down has an -- in the fourth row of that table that I am looking at, the average stage is 16.63 and then one that's higher than that would be the seventh row has a stage of 17.11.
Oh, sorry, is it okay for me to write on these --
Q. You can write on anything you want. Just tell us when you are writing on it so we have on the record it's yours.
A. I am making arrows on the fourth row and the seventh row in this table on page E-21.
Q.
So that's the stage of 17.11 and the stage of 16.63?
A. That's correct.
Q. Corresponding to date 7812 and 7910 in the left-hand column?
A. That's correct.
Q. What you are doing is you are looking at the ten percent rejection limit in parts per billion in the extreme right column?
A. That's correct.
Q. And sort of linearly interpolating in your mind or do you want to --
A. Well, one way to do it would be to linearly interpolate; however, the linear interpolation will not give you the correct ten percent rejection limit. So the value 8.89 is above the limit for the value of stage at 17.11. It's below the limit for the value of stage at 16.63, so I can't tell you just -- with this information whether this geometric mean 8.89 would in fact be over the computed ten percent rejection limit based on the method used in this appendix.
Q. The formula for deriving the precise ten percent rejection limit is right there, isn't it?
A. At the bottom of the page, the limit is given -- the formula is given for how to compute the limit in units of the log space. So it would be possible for me to take the value of stage, put that into the formula where the letter S is shown in the formula, compute that and compare that with the -- I would either have to take the logarithm of the geometric mean or take this limit and raise it -- use the exponential transformation on this limit in order to make the comparison.
Q. Do you have your copy of Dr. Walker's report with you --
A. No, I don't.
Q. -- that you analyzed?
If I provided you the -- with regard to Exhibit-11 a 14-station geometric mean of 7.65 and the stage variable calculates to a mean of 17.17, can you tell from page E-21 of Exhibit-6 whether that would fall beyond the ten percent rejection limit?
A. I can't tell just from this information because in the table on page E-21 the column that's labeled "S Stage," the largest value of stage given in that column is 17.11.
Q. So the observed stage value exceeds the range of values contained in the table 8 paragraph?
A. That's correct.
Q. You have discussed in your testimony extensively the notion of variance and the variances that is also described in your draft report resulting from your analysis using bootstrap. Did you analyze to determine the extent to which the compliance model, Dr. Walker's iteration for the moment, explains data variances? Maybe there is another way to put it. What percent of variances are -- in the data are accounted for by that model?
A. Are you asking did I look at the value of R squared from the fit?
Q. Indirectly.
A. At one point I am sure I saw summary statistics from that fit including R squared. I don't recall the value.
Q. Could you answer the same question with regard to the district's model as promulgated in the March 13 SWIM Plan Appendix E?
A. I never looked at the model.
Q. I invite your attention to page E-21. I think I just saw it.
A. Oh, yes, R squared is listed on page E-21 for their model.
Q. You said yesterday that R squared is a measure of regression fit, cannot be greater than one, and if you multiply the decimal by a hundred you get the percentage. Is it in fact accurate to say that in layman's terms a 67.233 percent R squared value implies that the model accounts for 67 percent plus of the data distribution?
A.
Yes, you could say it accounts for approximately 60 percent of the variability in the response variable, which in this case is the mean on the log scale.
Q. So to take it out of percentages, it accounts for two-thirds of the variance?
A. That's correct.
Q. In the field of statistics is a two-third accountability considered good, is it bad, does it depend on the time of day?
A. There is no hard-and-fast rule. In the social sciences -- data that's collected in the social sciences an R squared of that large is probably most of the time considered quite good. In the physical sciences, in physics it may be considered quite low, quite poor, quite a poor model.
Q. How about in the medical field?
A. Again there is no hard-and-fast rule. A lot of -- the problem is trying to summarize a fit just with R squared can be a bit risky because it's possible to get large values of R squared and still have a model that does not fit the data very well. An example of that would be you can have a response variable with one predictor variable and the relationship -- the true relationship might be quadratic, it might be curvilinear. If you fit a straight line model you could get a fairly large value of R squared, but if you actually look at, say, the residuals from that model you can see that you have not accounted for the curvilinear effect, so you would in fact go back and probably add a second order term.
Q. If someone conducted the first order
regression and came up with a fit and it generated the 67 percent and wanted to know if in fact this was a reasonable or best fit under the circumstances of the case, the data set that was being analyzed, looked at the half order or second order regression to see if an additional term would enhance the fit and it didn't because of the physical limitations of the data set and the reality, you know, the source of the data set, those experiential factors we were referring to earlier, would it be appropriate then to stay with the first order?
A. If you -- I am sorry, but it's always hard for me to answer a definite yes or no without caveats.
Q. I have noticed. But that's not unique.
A. In my opinion, if an attempt was made to add other terms to the model and based on either partial F tests or looking at residual plots or looking at other diagnostics to try to determine whether adding those additional terms helped the fit or not and it turned out -- and also based on scientific input from environmental scientists, if it turned out that adding those terms did not help the model or did not make sense, then I would feel comfortable staying with a less sophisticated model, but I wouldn't say that that's necessarily the best that you could do because there's always the possibility there are other variables in the model that would help explain that you didn't look at.
Q. You mentioned that in the social sciences a 67 would be considered high or a very strong fit but that in some of the physical sciences like physics that might be considered an inadequate fit. What has been your experience in the environmental sciences area which sort of doesn't really fit either of the two you just described? If you can answer that.
A. I just have to repeat what I said.
There is no hard-and-fast rule, and trying to judge a model based simply on R squared can get you into trouble.
Q. In one of your articles in Biometrics on "Proof of Safety versus Proof of Hazard" in September of '87, you said any proof of safety or hazard will depend on the size of both the Type I and Type II errors associated with the test. Can you describe for me what you mean by a Type I and a Type II error in statistics?
A. The terms Type I error and Type II error are used in the context of hypothesis testing, and the way the problem -- a problem is formulated in hypothesis testing is you start with a null hypothesis which you believe to be true in lieu of any other evidence, any new evidence that you may collect, and you have an alternative hypothesis. Usually the way you set up the system, you're hoping to show that the null hypothesis is not true. The way the Type I and Type II error are explained classically, it would be easier for me to explain it if I could draw on a piece of paper while I am talking with you.
MR. FITZGERALD: Could we mark this as the next numbered exhibit. That would be Exhibit-12.
(Exhibit-12 marked.)
Q. Doctor, I have provided you a little graphic box-type arrangement here labeled "Possible Outcomes" which has been marked Exhibit-12 for the deposition. The left side is the Test Indication broken down into Compliance or Violation. Across the axis at the top you have Reality, Compliance, Violation.
I would like to know if looking at this and applying it -- clearly it says polluter or protector being penalized when there is an error, if you can relate this to the situation that we're dealing with in the SWIM Plan where you should take as a given for this purpose the ideal is to limit phosphorus levels in Loxahatchee National Wildlife Refuge and that if a noncompliance situation arises either the resource is penalized or the polluter is penalized, or let's say the contributor of phosphorus by more stringent restrictions on their contribution of phosphorus to the system.
So that is the framework of reference just for the purposes of this deposition. Can you explain utilizing this what you mean by a Type I error?
A. In this case the way I would read this table and put it in context of hypothesis testing, the null hypothesis would be that no violation is occurring, and the alternative hypothesis would be that there is a violation occurring. The null hypothesis and the alternative hypothesis you would label up at the top here under the heading Reality because the null hypothesis and the alternative hypothesis always refer to what is actually going on, and of course the idea behind hypothesis testing is that we don't know what's going on; that's why we have to do an analysis and produce a test statistic. If we knew what was going on we wouldn't have to perform a test.
Q. Under that analysis, if the reality is that levels of phosphorus are not exceeding the permissive limit under the compliance test adopted, whatever that may be, but your test indicator, your model that generates the limits indicates a violation, that would be a Type I error?
A. That's correct.
Q.
You would have falsely identified a pollution situation for shorthand purposes, an excessive phosphorus situation, it would be a false identification?
A. That's correct.
Q. If, on the other hand, in reality you have a violation, there is an excessive level and you are not complying but your test indicator under your model indicates no violation, you would then have the Type II error where -- and the risk of the Type II error being on the resource; in other words, there is an inability to detect the excessive loading situation or concentration situation?
A. That's correct.
Q. I saved you from having to draw that out. You make the point in the same article I mentioned earlier that analysts should be concerned about Type II errors as well as Type I and seemed concerned that there is a tendency in the environmental field, and I think you mostly pointed at examples with the EPA of not considering Type II errors?
A. Yes, that's true. The reason I stumble when I answer that is because it depends on how the hypothesis is formulated which error tends to be ignored, Type I or Type II, but usually it's the Type II error, or another way of saying that is the power, the power's defined to be one minus the probability of a Type II error.
Q. I was going to get to power in a minute. We'll probably come back to that. Can that same analysis in general be applied to the current situation we have, as I tried to do in Exhibit-12 that we just looked at? Can you test or test the power of the models that Mr. Walker or Dr. Walker is articulating and the district articulated in Appendix E of the SWIM Plan?
A.
Yes, if you are willing to make all the assumptions about the integrity of the data and the appropriateness of the model and no change in the sampling procedure, et cetera, then you could come up with probability values for a Type I and a Type II error.
Q. Did Dr. Lettenmaier ever tell you anything along the lines of addressing those caveats? Did you ask him was there a variation in sampling techniques, is there a problem in the data, did he suggest the existence of additional better data?
A. As I recall, the only thing that Dr. Lettenmaier mentioned to me was the problem in locating the stations and that every time that a sampling event occurred there was some question as to whether precisely the same geographic area was -- water from that area was being sampled or not. That I believe the only problems -- specific problems that I am aware of in the sampling scheme that I know from Dr. Lettenmaier. I also know about other problems in the data and the sampling scheme or analyzing the data based on what I read in Dr. McClave's deposition.
Q. But you only had that for a week.
A. That's correct.
Q. So that didn't figure in any of your analysis; you were not able to factor that in in any way?
A. Well, as I have said before, the result of my efforts to date have been focused on analyzing Dr. Walker's model, specifically just doing a bootstrap analysis of that model without worrying about whether that model's correct or whether the -- not worrying about the integrity of the underlying data, et cetera.
Q. When you spoke with Dr. Myhre at ESP did you ask him about the integrity of the data?
A. No.
Q. Did he volunteer any thoughts or comments on that subject?
A. No.
Q. Or any of the other caveats you have raised?
A. No.
Q. In conducting your analysis of Dr. Walker's paper and also in your review of SWIM Plan Appendix E, what if anything did you do to consider the statistical power of the design, the model design?
A. I haven't considered that at all to date.
Q. Do you currently understand the scope of work that you are to perform to include evaluating the power of the test articulated in Appendix E, that is, the probability of not detecting an adverse change or a noncompliant situation I guess actually is more accurate, a Type II error?
A. At this point I haven't been asked to do any sort of analysis on power or Type I error.
Q. You talk in your publications about the sample sizes. I assume you mean the overall number of observations required to adequately guard against excessive Type I and Type II errors. What sample sizes would be required in your review to avoid those problems?
A. If you read my articles, I think you will see at least in one of them that there's a specific formula for trying to -- for determining sample sizes. If there isn't a formula in the paper, then there is a reference to a standard textbook, probably the one by Zar, because it turns out that when you try to estimate the sample sizes required to guard against Type I's and Type II errors at certain levels, the formula in the standard classical case of a simple hypothesis test, the formula usually involves the size of the Type I error, the size of the Type II error, the amount of change you are looking for and the variability inherent in your data. So there are five quantities that are traded off.
Q. Is that formula the one that's square root of Z equal to or greater than Z subscript I can't read over 2 and then --
A.
You are referring to the paper that was printed in Biometrics. I can't recall if we're worrying about Type II error at that point or if we're just worrying about -- I'd have to look at that equation.
Q. In that publication "Proof of Safety versus Proof of Hazard," you say, "The statistical methods of error control that have been used to control the probability of false positives in assertions of hazard," and that would be a Type I, "should be used to control the probability of false negatives in assertions of safety," which would be a Type II, correct?
A. That's correct in this context.
Q. You have a table in that same publication, a figure that is labeled the probability of incorrectly concluding a site as safe -- and you have a stylized Greek B?
A. Beta.
Q. -- is a function of sample size Z and the true proportional increase here is in death rate and I think that's the Greek Theta?
A. Theta.
Q. Is that the type of distribution curve you were talking about earlier or a table that you were referencing?
A. That sort of graph can be used to determine what sample size you want for a given power of detecting the change or equivalently or a given level of Type II error.
Q. And that same technique could be employed in the present situation, could it not?
A. Yes.
Q. You have not done that?
A. No.
Q. Have you been asked to?
A. No.
Q. Do you have any plans to? Is this something that occurred to you in the course of your evaluation and review?
A. I have not asked to do it at this point.
Q.
Assuming for the sake of scheduling that you are requested to do the five or six follow-on tasks that you identified in your draft report, how long will it take you to complete those tasks and form your final opinions?
A. Depending on the level of detail that would be requested, I would say anywhere from -- anywhere from two weeks to a month in full dedicated time.
Q. I assume you have other demands on your time, however?
A. That's correct.
Q. Based on your understanding of what your work schedule's likely to be over the next four to six weeks and whatever inability that would generate in terms of dedicating full-time to conducting those tests, when would you anticipate you would have your final opinions and work complete? I should say it the other way around, work complete and then final opinions.
A. As far as completing all of these six tasks, it may very well -- assuming that I was asked to do all these six tasks, it could very well be that I wouldn't be finished until the end of June.
Q. Are you aware of the scheduling limitations on discovery in this case, when the hearing is set?
A. I am aware that the hearing has been set for sometime in I believe mid October.
Q. Has anybody indicated to you that your work need not be completed until the end of June?
A. No, schedule has not been discussed for any of these follow-on tasks at this point.
Q. Are you aware of any other pending requests for you to do work on this case other than the six items that you identify as possible follow-on that would affect your completion of work date end of June?
A. I was working -- prior to this deposition I was working -- I began work on looking at how phosphorus concentrations at the inflows S-5A and S-6 related to phosphorus concentrations in the marsh.
Q.
When did you begin that work?
A. No more than two weeks ago.
Q. How long do you anticipate that analysis will take?
A. It really depends on the level of detail that ends up being requested. It could take -- I would guess it would take anywhere from -- it could take anywhere from two weeks to a month again.
Q. When do you expect the scope of that work to be defined?
A. Probably very soon after this deposition is over.
Q. Were you aware that you are scheduled for follow-up deposition in May pursuant to a comprehensive discovery and scheduling effort by all counsel in the case?
MR. BLANK: Actually, Counsel, I think there is a different date.
MR. FITZGERALD: Is it not May? I thought it was the end of May. I looked last night and I actually for once forgot to take my calendar with me.
MR. BLANK: At least in your letter to me --
MR. FITZGERALD: My letter is right.
MR. BLANK: It says July 27th and 28.
MR. FITZGERALD: It is July, all right. Maybe Dr. Lettenmaier is in May.
MR. BLANK: His initial deposition is in May.
MR. FITZGERALD: Because of the association in the publications I probably mixed those up.
Q. So you would anticipate having the final opinions and conclusions, et cetera, complete by July 27th?
A. I would anticipate having this work completed.
Q. And the work on concentrations at inflow structures to Loxahatchee and its implications for interior marsh station concentrations?
A. If I have been asked to do that, yes.
Q. So you have not been asked to do that?
A. Sorry, if I am asked to continue with that.
Q.
In your publication "Environmental Monitoring, Statistics and the Law: Room for Improvement," you discuss certain EPA regulations and analyze their requirement for monitoring programs, talking about the difference between aliquots and true field or lab replicates, that sort of thing. One of the issues that you address is what you call a second problem with the regulations being their balance or the balance struck there between the power and Type I error associated with the monitoring design required by EPA. You acknowledge that the authors of the regulations were apparently aware of that problem because of a test other than the T test embodied in those regulations was going to be used by anyone that had to be one that according to the regs., quotation, "reasonably balances the probability of falsely identifying a noncontaminated regulated unit hazardous waste site and the probability of failing to identify a contaminated regulated unit." That again is the Type I/Type II error problem, isn't it?
A. That's correct.
Q. In defining that reasonable balance between the Type I error which can falsely penalize the hazardous waste site in that instance or a contributor of phosphorus in this case pursuant to Exhibit-12 against the probability of a Type II error which allows in the EPA situation a contaminating hazardous waste site to contaminate without remedial measures because it goes undetected, or in the instant case allows what are defined by the plan as excessive phosphorus limits to occur without remedial action in the refuge, in drawing that balance, what can you bring to bear on that in terms of striking that balance on less than the ideal, perfect utterly dispositive and conclusive data that every statistician would want optimally?
A.
It's my opinion that the best that a statistician can do is once a model has been agreed upon and a particular test has been agreed upon, the services of a statistician come in when the statistician can define what the probability is of a Type I error, what the probability is of a Type II error for a given sample size. It's up to policymakers to decide what the cost is of making a Type I error or making a Type II error.
Q. So once someone in your role, for example, defines the probabilities of respective type of errors based on the best available scientific evidence, you would not see a role for a statistician in making or urging a particular policy judgment which translates into the risk allocation between a Type I and a Type II error?
A. It's my opinion that the way that a statistician might be involved in that is maybe helping to quantify what the risks are.
Q. Isn't that what you do when you define the probabilities of Type I and Type II error?
A. Sorry, let me make myself a little clearer. One thing the statistician can do is define the probabilities of Type I and Type II error. The cost of making those errors is something that has to be decided upon by policymakers. The policymakers may rely on scientific research to come up with what those costs are, and the statistician might be involved in that particular scientific research as well.
MR. FITZGERALD: Thank you very much. I have no further questions at this time subject to our right to recall you when you have your final opinions and conclusions which, as counsel pointed out, we expect to be in late July. However, Counsel, if in fact the work is wrapped sooner, I'd be perfectly happy to take Mr.
Millard earlier if we can work that out to the satisfaction of all counsel. You might want to get it over with or you may want to go on vacation the last week of July. Thank you.
MR. BLANK: No questions.
(Deposition was adjourned at 10:45 a.m.)
(Signature not waived.)

S I G N A T U R E

I declare under penalty of perjury under the laws of the State of Washington that I have read my within deposition, and the same is true and accurate, save and except for changes and/or corrections, if any, as indicated by me on the CHANGE SHEET flyleaf page hereof. Signed in..............., WA, on the........day of................, 1993.

STEVEN P. MILLARD, VOL.2

jl

C E R T I F I C A T E

STATE OF WASHINGTON )
                    ) ss.
COUNTY OF KING      )

I, the undersigned Registered Professional Reporter and an officer of the Court under my commission as a Notary Public for the State of Washington, hereby certify that the foregoing deposition upon oral examination of STEVEN P. MILLARD, Volume 2, was taken before me on March 10, 1993, and transcribed under my direction;
That the witness was duly sworn by me to testify truthfully; that the transcript of the deposition is a full, true, and correct transcript; that I am neither attorney for, nor a relative or employee of, any of the parties to the action or any attorney or counsel employed by the parties hereto, nor financially interested in its outcome.
IN WITNESS WHEREOF, I have hereunto set my hand and seal this 18th day of March, 1993.

..................................
NOTARY PUBLIC in and for the State of Washington, residing at Renton.
Commission expires 8-16-96.